Peter's Blog

Redefining the Impossible

Items filed under tagging


Based on comments left on this site I decided to look at TiddlyWiki, a personal Wiki system. I'm quite amazed by it as it is based on a single html file that does it all: it contains the code in javascript and also the CSS. It gives you a wiki editor plus storage for your wiki entries, all in the same file.

The entries in the Wiki are stored in 'Tiddlers', most of which are hidden when the html file is displayed. When you click on a Wikilink in the file the corresponding tiddler is displayed in the same page (not in a different page as is the case in a regular wiki). You can double-click on any open tiddler to open it in an editor. If you edit the file or add new tiddlers you can save it to your local hard disk, giving you a new version of the html file, complete with your edited tiddlers. The TiddlyWiki html file can be stored where you like, even on a USB flash key, giving you a pretty portable notebook with no software to install wherever you are using it. If you were so moved you could put the file on a web server, unmodified, where folk could explore it wiki-fashion or download the thing to their own computer and edit it themselves. Cool. The TiddlyWiki site, http://www.tiddlywiki.com, is itself a TiddlyWiki.

There are variations on the TiddlyWiki code that work with a server backend to store the tiddlers but the mainstream implementation is designed to work as a personal wiki system stored on your own computer (or flash key).

In a way the single file implementation bothers me because it means the wiki wouldn't scale very well: if I put 1000 entries in a wiki these are all embedded in one html file. This will take time to load into the computer's memory, where the browser will create a huge DOM for it. Firefox is pretty memory hungry.

The ServerSide implementations that exist appear to load the tiddlers on demand so may not have this problem. However, in this case I am not sure how the search facility is implemented, whether it is delegated to a database engine. I have the code, I should look. There is a closed source Ruby-On-Rails version and in theory I could download the client side code and reverse engineer the server side code but that would be an objectionable way of doing things by my value system. I would only do that with an open-source system, maybe based on PHPTiddlyWiki creating PythonTiddlyWiki.

I think I could live with multiple TiddlyWiki files, one per subject, all bookmarked in Firefox: not such a big problem to context switch. In terms of Quick Blogging there is an ALT-J keypress that starts a new 'journal entry', leaving me to type what I want, press TAB, enter the tags, then CTRL-Enter to finish. That's nice.

The way entries are stored in the html file is pretty clean: DIVs store the entries, with attributes holding the metadata: title/wikiname, modification date, modifier and tags. Easy to get the data out if need be.
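
A minimal sketch of getting the data out, using Python's standard html.parser. The attribute names used here (tiddler, modified, modifier, tags) are assumptions for illustration and should be checked against the actual file; tags are split on whitespace, which would mishandle any tag containing spaces.

```python
from html.parser import HTMLParser


class TiddlerExtractor(HTMLParser):
    """Collect DIVs that carry a 'tiddler' attribute (assumed attribute name)."""

    def __init__(self):
        super().__init__()
        self.tiddlers = []
        self._current = None  # tiddler being read, if any
        self._depth = 0       # nesting depth of DIVs inside the current tiddler

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attrs = dict(attrs)
        if self._current is None and "tiddler" in attrs:
            self._current = {
                "title": attrs["tiddler"],
                "modified": attrs.get("modified"),
                "modifier": attrs.get("modifier"),
                "tags": (attrs.get("tags") or "").split(),
                "text": "",
            }
            self._depth = 1
        elif self._current is not None:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "div" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:
                self.tiddlers.append(self._current)
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data


def extract_tiddlers(html_text):
    """Return a list of dicts, one per tiddler found in the html text."""
    parser = TiddlerExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.tiddlers
```

Reading the saved file and passing its contents to extract_tiddlers() gives one dictionary per entry, which could then be written out in whatever format is needed.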

I'll give it a try, see if I am still using it after a week.



Latest new toy is EverNote, an interesting note-taking tool. It has the potential to be a useful thing for making stream-of-consciousness type notes. Its paradigm is that of an infinitely long piece of paper (toilet roll?) that you type your notes into. It has a level of formatting roughly equivalent to html and you can insert pictures. It is very easy to use, with just enough features. Each new note is tagged with the date and time of creation and you can assign it to multiple categories (aka tagging, a concept I like).

It supports XML export and it automatically backs its database up to an XML file every day. The XML is very comprehensive and includes any graphics you put in the notes. It is entirely feasible I could knock up some python to chuck these notes into a blog.

I am using the free version of EverNote. There is a paid version, but that seems to be targeted at tablet users or people who want handwriting recognition (like OneNote but better designed).

If I had one criticism of it (looking this gift horse in the mouth) it is that it is a little too fiddly to assign notes to categories: you have to drag/drop or select them from a list. For real rapid note-taking I'd rather type keywords into a little box, with maybe some autocompletion thrown in. Instead I can just type the categories into my notes myself.

My latest note-taking paradigm is this:

.blah notes

This is a note about going blah. Don't do it: it's a waste of time.

The first line is a list of tags. The leading . denotes this. The rest of it is the body of the note. What's missing? A title. Who reads titles? The principle is, don't waste time thinking of titles, let the tags and the content do the talking. Dave Winer, illustrious inventor of blogging, rss et al is famed for his title-free posts. If you want a title, start the post with a headline (<h1>).
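
The convention is simple enough to parse in a few lines. A minimal sketch, assuming tags are separated by whitespace and that a note whose first line does not start with a dot simply has no tags (the function name parse_note is made up for illustration):

```python
def parse_note(text):
    """Split a note into (tags, body).

    The first line is treated as a tag list only if it starts with a dot.
    """
    lines = text.splitlines()
    tags = []
    if lines and lines[0].startswith("."):
        tags = lines[0][1:].split()
        lines = lines[1:]
    body = "\n".join(lines).strip()
    return tags, body


tags, body = parse_note(
    ".blah notes\n\nThis is a note about going blah. Don't do it: it's a waste of time."
)
```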


Filed under: tagging

4 Comments

I've discovered Thunderbird saved searches. They allow me to create items in the folder tree on the left that act like folders full of messages that meet certain search criteria. For example, I can have an item called 'fred' containing all messages from 'fred'. Previously to do this I would create a folder and get my filtering rules to try to shuffle incoming messages into the right folder. In Outlook this was always flaky; in Thunderbird I've never bothered. Saved searches are much easier to set up and have the advantage that if you change the search terms then the changes are applied immediately: you don't have to run all your messages through filtering rules again. Also messages can meet more than one search pattern: I could have a search for everyone in project X and I could also have searches for specific people in project X, and the same messages could appear in both.

It's working nicely with Microsoft Exchange via IMAP: the searches occur instantly, with no real overhead. This may be because IMAP is offloading the search to the server. If the search were done locally I doubt that it would be so transparent.

Hum, as a concept these stored searches could be considered an alternative to tagging. Instead of manually having to mark an article as being about, say, ubuntu, the stored search would automatically search for the word ubuntu in the article and list the matches. It is more fiddly to create a search than a tag but they would require less maintenance. Tagging does give the possibility of structuring articles. Something else to think about.
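
To make the comparison concrete, here is a rough sketch of stored searches over a list of articles. The data layout (dictionaries with 'title' and 'body') and the function name are assumptions for illustration; matching is a plain case-insensitive substring test, which is cruder than a real search engine.

```python
def run_saved_searches(articles, searches):
    """Return {search name: [titles of articles containing every search word]}."""
    results = {}
    for name, words in searches.items():
        results[name] = [
            article["title"]
            for article in articles
            if all(word.lower() in article["body"].lower() for word in words)
        ]
    return results


articles = [
    {"title": "Installing ubuntu", "body": "Notes on Ubuntu and drupal"},
    {"title": "Drupal cache", "body": "Flushing the drupal cache"},
]
searches = {"ubuntu": ["ubuntu"], "drupal": ["drupal"]}
matches = run_saved_searches(articles, searches)
```

As with tags, one article can turn up under several searches, and changing a search's words changes its results the next time it is run, with nothing to re-file.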


2 Comments

I realised that a lot of my old postings were not tagged by awtags because I hadn't been through them to categorise them and, worse still, I couldn't be bothered. This meant that the postings weren't indexed unless someone went way back through the blog history.

I decided to create a new tag for them called untagged. I wrote a python script to look for awtags with no tags assigned and to assign them to this new tag. I used python because I am far more confident in it than I am in php. There is not much point in making this a module or anything because it only needs doing once if I am methodical about giving tags to new postings. I could also have done this in raw SQL but the version of MySQL on Site5 does not support nested selects.

Once the 'untagged' tag is in place the only chore is to remove this tag from postings that I generate new tags for using my search facility. This can be done through the awtags administration interface (e.g. search for tag 'whatever' and remove tag 'untagged'). Also, now I can easily list the untagged articles, it is much easier to see what tags need adding.

#
# Assign a tag to nodes with no awtags
#
import MySQLdb
import DBTable

o = MySQLdb.Connect( 'localhost', '<mysql user name>', '<password>', '<mysql database name>')

oAwNodeDB = DBTable.DBTable( o, 'awtags_node')

# Collect the nid of every node that already has at least one tag.
oAwNodeDB.Select()

oTaggedNodes = {}

while True:
    oRow = oAwNodeDB.FetchOne()
    if not oRow:
        break
    oTaggedNodes[oRow['nid']] = 1

oNodeDB = DBTable.DBTable( o, 'node')

# Collect the nid of every blog posting.
oNodeDB.Select( "SELECT * FROM node WHERE node.type = 'blog'")

oNodes = {}

while True:
    oRow = oNodeDB.FetchOne()
    if not oRow:
        break
    oNodes[oRow['nid']] = 1

# Give the 'untagged' tag (tid 84) to every blog posting that has no tags.
for nNid in oNodes.keys():
    if nNid not in oTaggedNodes:
        oDict = { 'nid': nNid,
                  'tid': 84}
        oAwNodeDB.Insert( oDict)

This assumes the 'untagged' tag has a tid of 84: you should create your own tag and see what number it is.
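
For comparison, the same selection can be written as a LEFT JOIN, which needs no nested select. The sketch below uses Python's built-in sqlite3 as a stand-in so it can be run anywhere; whether the MySQL version on a given host accepts the same query has not been checked here, and only the nid, type and tid columns mentioned above are assumed.

```python
import sqlite3

UNTAGGED_TID = 84  # assumed tid of the 'untagged' tag, as in the script above


def tag_untagged(conn, tid=UNTAGGED_TID):
    """Assign tid to every blog node with no row in awtags_node.

    Returns the list of nids that were tagged.
    """
    rows = conn.execute(
        "SELECT n.nid FROM node n "
        "LEFT JOIN awtags_node tn ON tn.nid = n.nid "
        "WHERE n.type = 'blog' AND tn.nid IS NULL "
        "ORDER BY n.nid"
    ).fetchall()
    nids = [nid for (nid,) in rows]
    # Insert in a second step so the table is not read and written at once.
    conn.executemany(
        "INSERT INTO awtags_node (nid, tid) VALUES (?, ?)",
        [(nid, tid) for nid in nids],
    )
    conn.commit()
    return nids
```

The join keeps every blog node and pairs it with its tag rows if any exist; nodes with no tags come back with a NULL on the awtags_node side, which is what the WHERE clause picks out.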

This uses the DBTable module I wrote a while back. I discovered to my delight that the python odbc and MySQLdb modules had virtually identical interfaces so this module worked largely unchanged. I had to tweak it a bit because the field types were recorded as numbers instead of strings. Here is the modified version. It should work with odbc as well.

#
# Database wrapper class.
#
class DBTable:
    """
    Wrapper for database table
    """
    # Index of the type code within a stored field description
    # (the cursor description entry with the leading field name removed).
    FIELD_TYPE = 0

    def __init__( self, oConnection, strTable):
        self.oConnection = oConnection
        self.strTable = strTable

        # Run a query against the table purely to read the column descriptions.
        oCursor = oConnection.cursor()
        oCursor.execute( "SELECT * FROM %s" % strTable)

        self.oFields = [ oField[0] for oField in oCursor.description]
        self.oFieldDescription = dict( [ (oField[0],
                                 oField[1:]) for oField in oCursor.description])

    def Select( self, strQuery = None):
        """
        Select records from query

        Takes either SQL of select statement or a dictionary containing
        field names and values to find.
        """
        self.oCursor = self.oConnection.cursor()
        if strQuery is None:
            self.oCursor.execute( "SELECT * FROM %s WHERE 1" % (self.strTable))
        elif isinstance( strQuery, str):
            self.oCursor.execute( strQuery)
        else:
            #
            # assume query is a dict
            #
            self.oCursor.execute( "SELECT * FROM %s WHERE %s" % (self.strTable,
                                                     self.DictToWhere( strQuery)))

    def FetchOne( self):
        """
        Get next row of results
        Returns a dictionary holding field names and values.
        """
        oRow = self.oCursor.fetchone()
        if oRow:
            #
            # Build a dictionary to map field name->value
            #
            return dict([(self.oFields[i], oRow[i]) for i in range(len(oRow))])
        else:
            return None

    def Insert( self, oDict):
        """
        Insert a row in the database
        Takes a dictionary holding field names and values.
        """
        strFields = list( oDict.keys())
        strValues = []
        for strField in strFields:
            strValues.append( self.FormatField( strField, oDict[strField]))

        strSQL = "INSERT INTO %s ( %s) VALUES(%s);" % ( self.strTable,
                                                        ", ".join( strFields),
                                                        ", ".join( strValues))
        print( strSQL)
        oCursor = self.oConnection.cursor()
        oCursor.execute( strSQL)
        self.oConnection.commit()

    def Update( self, oDictWhere, oDictNew):
        """
        Update a row in the database
        Takes a dictionary holding field names and values to find
        and dictionary to replace it with.
        """
        strFields = list( oDictNew.keys())
        strValues = []
        for strField in strFields:
            strValue = oDictNew[strField]
            strValues.append( "%s = %s" % (strField, self.FormatField( strField, strValue)))

        strSQL = "UPDATE %s SET %s WHERE %s;" % ( self.strTable,
                                                  ", ".join( strValues),
                                                  self.DictToWhere( oDictWhere))
        print( strSQL)
        oCursor = self.oConnection.cursor()
        oCursor.execute( strSQL)
        self.oConnection.commit()

    def InsertOrUpdate( self, oDictWhere, oDictNew):
        """
        Seek record in database, add it if not found, update it if found.
        """
        self.Select( oDictWhere)
        if self.FetchOne():
            self.Update( oDictWhere, oDictNew)
        else:
            self.Insert( oDictNew)

    def Delete( self, oDict):
        """
        Delete row based on dictionary contents
        Takes a dictionary holding field names and values.
        """
        strSQL = "DELETE FROM %s WHERE %s;" % ( self.strTable, self.DictToWhere( oDict))
#        print( strSQL)
        oCursor = self.oConnection.cursor()
        oCursor.execute( strSQL)
        self.oConnection.commit()

    def DictToWhere( self, oDict):
        """
        Convert dictionary to WHERE clause.
        """
        strExpressions = []

        for strField in oDict.keys():
            strValue = self.FormatField( strField, oDict[strField])
            strExpressions.append( '%s = %s' % (strField, strValue))

        return " AND ".join( strExpressions)

    def FormatField( self, strField, strValue):
        """
        Format a field for an sql statement.
        """
        strType = self.oFieldDescription[strField][DBTable.FIELD_TYPE]
        if strType == 'STRING':
            return "'%s'" % str(strValue).replace( "'", "''")
        elif strType == 'NUMBER' or strType == 3:
            # 'NUMBER' is the string type name; 3 is the numeric type code.
            return '%d' % int( strValue)
        else:
            #
            # Treat as a string.
            #
            return "'%s'" % str(strValue).replace( "'", "''")
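
The class above builds SQL by quoting values itself. A different technique, not what DBTable does, is to let the database driver substitute the values through placeholders. The sketch below shows the idea with Python's built-in sqlite3, whose placeholder is '?'; MySQLdb uses '%s' instead. The function names are made up for illustration, and table and field names are still pasted into the SQL, so they must come from trusted code.

```python
import sqlite3


def dict_to_where(conditions):
    """Return (WHERE clause with placeholders, list of values) for a dict."""
    fields = list(conditions)
    clause = " AND ".join("%s = ?" % field for field in fields)
    return clause, [conditions[field] for field in fields]


def select_where(conn, table, conditions):
    """Fetch all rows of table matching the field/value pairs in conditions."""
    clause, params = dict_to_where(conditions)
    sql = "SELECT * FROM %s WHERE %s" % (table, clause)
    return conn.execute(sql, params).fetchall()
```

With this approach a value containing a quote character needs no special handling, because it never becomes part of the SQL text.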

Filed under: awtags drupal mysql tagging


Tagging is the concept of assigning tags to items to organise them into categories. Compared to the traditional way of storing items in a strict hierarchy of folders tagging is more flexible as items can be stored under multiple categories. awtags provides tagging on this site.


Filed under: tagging


awTags was adding an entry to the navigation menu called 'My Tags'. This was irritating me because it was presented to anonymous users and was the only reason for the navigation menu to appear. Looking in 'awtags.module', there are no options to control it so I changed the source so it will only appear for logged-in users:

/*
 * Implementation of hook_menu
 */
function awTags_menu($may_cache) {
  global $user;

  $items = array();

  if ($may_cache) {

    // pcw: only logged in users can have 'my tags'
    if ($user->uid) {
      // /usertags/tags (my tags)
      $items[] = array(
        'path' => "usertags/$user->uid",
        'title' => t('my tags'),
        'access' => user_access('access tags'),
        'callback' => '_awtags_page',
        'callback arguments' => $user->uid,
        'type' => MENU_DYNAMIC_ITEM);
    }

    // ... rest of function unchanged.
}

I tested this in IE, where I am anonymous (like all IE users), and saw no change. I forgot to flush the damn Drupal cache for the umpteenth time: the menus are cached. I took the time to knock up a php script to flush the cache for me so I don't have to fiddle with the mysql command line:

<?php
include_once 'includes/bootstrap.inc';
include_once 'includes/common.inc' ;

db_query('DELETE FROM {cache}');

echo( "Done");

?>

Save the above in a file called FlushCache.php on your server and just open it in a browser to flush the cache. It may be advisable to set up your .htaccess so that only you can access the file:

<Files "FlushCache.php">
  order deny,allow
  deny from all
  allow from [my ip address]
</Files>


More awtags notes:

  • I am getting more traffic now I have awtags installed. The tag listings get google matches, maybe because lots of related keywords are on the same pages. Ideally google would only index individual postings but I am not sure how to persuade robots.txt to do this without accidentally driving all traffic away. The slow testing cycle is the problem: I may not know I've broken it all for a few weeks.
  • I thought it would be nice to have a sticky article at the top of each tag listing to give some general notes about the tag. Otherwise the most descriptive article is most likely to be the last in the list, as the entries are in reverse chronological order. I have created a test page, not promoted to the front page, tagged it with 'awtags' and made it sticky. This posting you are reading is partly for blogging reasons and partly to produce a later entry that should still appear below the sticky article. Anyway, test it out here. UPDATE: the test failed, so I changed the sql in the function awTagsAPI_GetNodesForTag in awTags.inc to make it consider the sticky bit, as follows:
      $sql = "SELECT DISTINCT n.nid, n.title, tn.tid FROM {node} n " .
          "INNER JOIN {awtags_node} tn ON n.nid = tn.nid WHERE n.status = 1 AND " .
          "tn.tid = '" .
          $tag . "' ORDER BY sticky DESC, created DESC";
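
The effect of that ORDER BY clause can be sketched in Python. The node dictionaries and the function name below are made up for illustration; 'sticky' is 0 or 1 and 'created' is a timestamp, as in the query.

```python
def order_for_tag_listing(nodes):
    """Sort nodes as ORDER BY sticky DESC, created DESC would."""
    return sorted(nodes, key=lambda n: (n["sticky"], n["created"]), reverse=True)


nodes = [
    {"title": "About this tag", "sticky": 1, "created": 100},
    {"title": "Old posting", "sticky": 0, "created": 200},
    {"title": "New posting", "sticky": 0, "created": 300},
]
listing = [n["title"] for n in order_for_tag_listing(nodes)]
```

The sticky article comes first even though it is the oldest, and the rest stay in reverse chronological order.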
    

Filed under: awtags drupal google tagging


I use bloglines for my rss aggregation. On it I subscribe to my own rss feeds to reassure myself that they are reaching the outside world. Since I installed the awtags module my postings have all contained links to their tag entries. This doesn't really bother me, it may encourage people to visit my site if they see useful links in the rss feed.

Bloglines tries to deliver articles that have changed and it appears to do this by comparing the contents of the rss file with its previous contents. If there are any small changes bloglines displays the article in the same way as it does for new articles.

I don't like to hassle people when I edit articles to correct spelling mistakes or whatever so I have altered the drupal ping module to only ping (i.e. tell the outside world) if a new posting is created, not if it is modified. However, bloglines appears to poll my feed and so the slightest change will result in articles appearing as if they are new. Today I assigned some tags to some older articles using awtags and this was sufficient for them to be displayed by bloglines as new articles, as they were still in the rss feed.
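
How bloglines actually detects changes is not known here. As a guess at the general idea, the sketch below remembers a hash of each feed item between polls and reports anything whose hash differs. The item layout ('guid', 'title', 'body') and the function names are assumptions for illustration.

```python
import hashlib


def fingerprint(item):
    """Hash the visible content of a feed item."""
    text = item["title"] + "\n" + item["body"]
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def poll(seen, feed_items):
    """Compare a feed against the fingerprints remembered in seen.

    Returns (new guids, changed guids) and updates seen in place.
    """
    new, changed = [], []
    for item in feed_items:
        digest = fingerprint(item)
        previous = seen.get(item["guid"])
        if previous is None:
            new.append(item["guid"])
        elif previous != digest:
            changed.append(item["guid"])
        seen[item["guid"]] = digest
    return new, changed
```

Under a scheme like this, adding a tag link to an old posting alters its body, so the posting is reported as changed on the next poll even though the text the author wrote is the same.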

This is the long way of apologising to anyone who thinks I am winding them up by republishing old articles with no noticeable changes.

I see I have a new subscriber on bloglines. Welcome, hope you find my whitterings interesting. I am getting more visitors since I started using awtags, especially from technorati.



Two concepts I have been contemplating recently are starting to blur together: outliners and tags.

I have read of people who have moved from Outliners to Wikis as a means of organising their notes. I have found Wikis rather simplistic and unattractive, especially as web-based editors are sluggish compared to desktop applications.

However, now I have changed my site to use awTags I see how tags can be used to implement a wiki. But this is better than a wiki: a link leads to a list of related articles rather than just a single page. In the manner of an outliner, an article leads on to a list of child articles according to which tag you follow. The article itself can be tagged with a number of different tags representing different concepts that the article itself can be filed under and these links can be regarded as parent relationships. This is just what TheBrain is trying to do: Mind Mapping but without the fancy graphics.
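
The parent and child relationships described above fall out of a simple inverted index. A minimal sketch, with made-up article titles and a hypothetical function name: an article's tags are its parents, and the articles listed under a tag are that tag's children.

```python
from collections import defaultdict


def build_tag_index(articles):
    """Turn {article title: [tags]} into {tag: [article titles]}."""
    index = defaultdict(list)
    for title, tags in articles.items():
        for tag in tags:
            index[tag].append(title)
    return dict(index)


articles = {
    "Introducing wilki": ["wilki", "drupal"],
    "Flushing the cache": ["drupal"],
}
index = build_tag_index(articles)
parents = articles["Introducing wilki"]  # tags the article is filed under
children = index["drupal"]               # articles reached by following a tag
```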

I have altered my wilki module accordingly and have used it right there. By typing in something as simple as

[wilki|tags/wilki]

I created a link to all my wilki module related postings filed under the wilki tag. Alternatively, I could have linked to the wilki introductory article and from there a reader, once they know what the wilki module is, can follow the wilki tag if they so desire.
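
The wilki module's own code is not shown here, so the following is only a guess at how the [label|path] syntax could be expanded, written in Python for illustration. The regular expression, the function name and the shape of the generated html are all assumptions.

```python
import re
from html import escape

# Matches [label|path] where neither part contains brackets or a bar.
LINK_RE = re.compile(r"\[([^\[\]|]+)\|([^\[\]|]+)\]")


def expand_links(text, base_url="/"):
    """Replace every [label|path] in text with an html link to base_url + path."""

    def replace(match):
        label = match.group(1).strip()
        path = match.group(2).strip()
        return '<a href="%s">%s</a>' % (
            escape(base_url + path, quote=True),
            escape(label),
        )

    return LINK_RE.sub(replace, text)


html_text = expand_links("See [wilki|tags/wilki] for more.")
```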

This is all pretty abstract and I will have to see if it is actually of any use in the real world. On a practical level it will be easier for me to reference other articles by linking to the tag name, rather than trying to find the drupal node number of a specific article I am thinking of. The downside is that going to the introductory article is probably a better pattern to follow, especially as articles are listed in reverse chronological order, putting the most informative introduction at the very end.


2 Comments

Another awTags feature is pinging technorati with tags (or something like that). From my statcounter logs I see I am already getting more traffic from there.


3 Comments