Peter's Blog

Redefining the Impossible

Items filed under rss


I'm looking at AideRSS, an 'intelligent' RSS service. It has bang whizzo technology to rate articles in RSS feeds so you can choose to look at only the best. I just ran it on this blog (see here) and it thinks that of the last fifteen posts (i.e. an RSS bucketful) 26% are good, 26% are great and 26% are the best. I think these are the same 26% (i.e. 4 posts) in each category, and the ratings are a bit polarised: either 10/10 or 0/10 according to whether or not they have been cited on Bloglines (whatever that means). The results seem arbitrary: some wow posts are rated and others are not.

Now I have more reading time on my hands I could do with a nice intelligent rss filtering service. For example, on the whole I like Engadget except for the endless new MP3 players and mobile phones they reel off. Likewise on the whole I like Slashdot which is where I discovered AideRSS but only maybe one in five stories there warrant my attention (e.g. software patents: not a must read for me).

Anyway, I'll try setting it up and seeing whether it filters on what it thinks I like or what everyone else seems to like. There is a difference.

Todo: look at bloglines, see how to cite my own posts.


Filed under: aiderss rss


Google personalised home page has got me back into Slashdot. I got bored with slashdot after using an RSS feed that only showed article headlines as teasers to the content. Ironically Google home page only shows headlines.

On a whim I tried logging into my old slashdot account for the first time in what must be four or five years and it remembered me!

I did some metamoderating (moderating the moderators) as it helps while away the time.


Filed under: google rss

4 Comments

I have tried a few rss reading options for my ipaq. All these allow articles to be downloaded to the pda for reading offline when out of wifi range or with the wifi turned off to save the battery.

  • avantgo: an online service that generates feeds suitable for display offline on a pda. It's free if you use less than 2M of data a day. You can choose from specially formatted content such as the guardian newspaper or any rss feed.
  • feederreader: an rss aggregator that supports enclosures, i.e. it can download podcasts. I found this over complicated and fiddly. It is free although donations are encouraged.
  • egress: a commercial rss reader, but I have found it much better than the others and it isn't that expensive. It also supports enclosures.

I wanted an rss reader that supported enclosures so I could download podcasts directly to the pda where I can listen to them anywhere, no messing with synchronisation. I have even plugged the pda into my hifi with good results.

I first used feederreader for this but I finally abandoned it because it gave the files it downloaded meaningless names containing just numbers. Egress gives them the names the author gave them, which better describe the contents. Ok, I could launch the playback from within feederreader but I don't want to: I don't want to have too many apps open at once or the pda gets flaky.

I then found Egress to be a pretty good way to read rss in itself: the buttons on the pda step through the articles nicely. I haven't bothered with avantgo since I started using egress. I'm still in the trial period but I'm sure I will buy it.


Filed under: pocketpc rss wifi

4 Comments

Bought myself a new PDA, an HP Ipaq rx1950 Pocketpc. It was a bit of an impulse buy, a bit of retail therapy I felt in need of a few weeks back. The model I bought was essentially the cheapest I could get locally that had integrated wifi.

I used to use a palm tungsten T2 but its digitiser does not work properly: it takes about five goes to get past the initial calibration and it's downhill from there. I still had a hankering for a pda, a portable notetaking device to serve as a backup for my failing memory.

Following my experiences with the palm and bluetooth I decided that wifi was pretty much essential: bluetooth was slow and the bluetooth stacks in windows were complex and buggy.

The rx1950 runs Windows Mobile 2005, the latest name for Windows CE/pocketpc. It is a stylus driven thing, not my first preference but keyboard models are more expensive and have smaller screens.

Bullet point review:

  • first impression when I took it out the box was how light it is, much lighter than my old palm. It's fairly thin as well, maybe half an inch thick. It would fit better in a trouser pocket if not for the carry case that comes with it which is almost as thick again.
  • nice solid build quality: HP after all.
  • stylus has a tendency to fall out: I put it in the carrying case the wrong way round to restrain it or I would certainly lose it.
  • no charging cradle: comes with a USB cable and a mains adapter that plugs into the USB cable: rather awkward and fiddly and annoying that it cannot charge from the USB port. There are aftermarket USB cables available that will charge it.
  • synchronises with the pc using activesync and I was amazed to discover that it won't sync over the wifi. Apparently microsoft decided it was a security hole and they couldn't think of a way to plug it so they just removed the feature. This leaves USB or infra-red so I'm using USB when I feel the need. Many pocketpc apps are available as cab files that can be downloaded directly to the device over the wifi and installed. Some producers are not enlightened to this and ship apps with windows installers that use activesync and hence can only be installed at home base.
  • the device has 32M of 'program memory' for running programs and their data, and this isn't really enough. The symptoms of this seem to be the o/s terminating apps that are not in the foreground. If you have, say, windows media player playing something then you can only run maybe one more program before something gets randomly zapped. Microsoft have tried to create a paradigm where you don't have separate applications but flip between different modes without worrying about having to close apps down. Apps are closed automatically by the OS as memory runs low and should be designed to save their state such that when they are reopened they are in the same state as when they were shut down. Unfortunately it looks like developers use standard development techniques, so an application being suddenly terminated by the OS leaves the user high and dry.
  • it has another 32M for storing programs but I bought a 1G SD card and I put everything in that.
  • the screen is just about big enough. It is ok for reading without scrolling too much. For most web browsing it is awful unless I use http://skweezercom or google mobile to strip out any fancy stuff.
  • the character recognisers that come with it are not much good: the handwriting recognition (transcriber) is slow and inaccurate. I am using something called Tengo which is a bit like predictive text on a phone so involves hitting only six big buttons. It has some clever design features and there is something about it that is kinda fun. I am writing this review with it so you may spot predictive-text style wrong word errors.
  • a frequently used feature is the reset button: windows mobile is a typical windows o/s and isn't sophisticated enough to offer robust task management. Applications can lock it hard and banging on an unresponsive plastic screen is particularly fruitless.
  • it has speaker and microphone. The speaker is just about loud enough to listen to podcasts. I haven't tried skype on it, the skype site doesn't list the rx1950 as supported and the cpu may not be fast enough.
  • has 3.5mm audio jack and the sound to my ears was pretty good. It is a good mp3 player.
  • battery life is very good, one charge gives a good day of use.

As a device for browsing rss feeds while watching tv, taking notes, listening to music, watching videos or whatever it is just fine. It is more convenient than a laptop and I can carry it in my pocket, so it is far more mobile. I can see that it is a quirky platform and it will probably have been killed off by mobile phones and mp3 players within two years' time.

I'll write about the applications I have installed in it some other time.


7 Comments

I like using a web based aggregator so that I can access it from wherever I may be and it will remember my subscriptions and what articles I have already read. For some time I've been toying with the idea of installing an RSS aggregator on my dedicated server. This would be something to replace bloglines but which I could tweak to my liking. I haven't been in a big hurry to do this as bloglines has been ok thus far. However, the other day I was using it and I realised how slow it can be, waiting for it to download new articles. I thought this was an excellent reason to seek alternatives.

I had a look around for web based aggregators and found the wikipedia list of news aggregators and from this I found Gregarius. Installation was very straightforward:

  • Create mysql database, user and password for it
  • untar Gregarius into domain directory
  • edit config file to tell it details of the database
  • access it through web server.

It's pretty standard LAMP stuff.

It's nice to use, very configurable, much more so than bloglines. You can control when the feeds get aggregated (for now I plan to press the refresh button to manually trigger aggregation), how many articles per page, and you can read as a 'river of news' (articles from different sources mixed together by creation date) or read them grouped by feed. Other aggregators only support the 'river of news' format, which I don't really like as I don't like to mentally context switch between unrelated topics with each posting: for example, first post is a dilbert cartoon, followed by bbc news about untimely deaths, followed by an engadget piece about a new mp3 player; it's all over the place. I prefer to go through my feeds in the order that is interesting to me at the time (cherry pick, Dilbert first normally).
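
To make the difference between the two reading modes concrete, here is a minimal sketch in Python 3. It has nothing to do with how Gregarius actually stores or sorts articles; the sample articles, the field names and both function names are made up for illustration.

```python
from operator import itemgetter

# Made-up sample data; ISO timestamps sort correctly as plain strings.
ARTICLES = [
    {"feed": "BBC News", "title": "Headline 1", "created": "2006-01-03T09:00"},
    {"feed": "Dilbert", "title": "Strip 1", "created": "2006-01-03T08:00"},
    {"feed": "Engadget", "title": "Gadget 1", "created": "2006-01-03T10:00"},
    {"feed": "Dilbert", "title": "Strip 2", "created": "2006-01-04T08:00"},
    {"feed": "BBC News", "title": "Headline 2", "created": "2006-01-03T11:00"},
]


def river_of_news(articles):
    """All feeds mixed together, ordered purely by creation date."""
    return sorted(articles, key=itemgetter("created"))


def grouped_by_feed(articles, feed_order):
    """Return [(feed, articles oldest first)] in the reader's preferred feed order."""
    groups = []
    for feed in feed_order:
        items = sorted(
            (a for a in articles if a["feed"] == feed), key=itemgetter("created")
        )
        if items:
            groups.append((feed, items))
    return groups


if __name__ == "__main__":
    print([a["title"] for a in river_of_news(ARTICLES)])
    for feed, items in grouped_by_feed(ARTICLES, ["Dilbert", "BBC News", "Engadget"]):
        print(feed, [a["title"] for a in items])
```

The river jumps between feeds on every article, while the grouped view lets the reader pick the feed order (Dilbert first) and stay on one topic at a time.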

Gregarius supports a few themes and plugins and so far these have fixed the minor quibbles I had. For example, I wanted to read postings in order of creation rather than newest first, which means reading down the page. However, the 'mark all as read and get some more' button was at the top of the page, so once I got to the bottom I had to scroll to the top to get more articles. Someone had already had this problem and there was a plugin available to put an extra button at the bottom of the page.

One little hack I had to do was with the 'lilina' theme, which supports a button to collapse the left sidebar. This will be useful on my Dell d410 with a 12" screen. However, this theme also collapses the articles down to titles by default, which I did not want. I fixed this by editing the themes/lilina/item.php file as follows:

    <!--pcw <div class="content" id="c<?php echo rss_item_id(); ?>" style="display:none">-->
    <div class="content" id="c<?php echo rss_item_id(); ?>" style="display:block">

i.e. by default the article contents is display:block rather than display:none.

What I most miss from bloglines is being able to press 'j' to go to the next article.

What I dislike about Gregarius most is that it isn't spelt Gregarious.

ToDo: figure out how to get cron to prune old articles from the database. It can be done explicitly through the admin pages but I don't want more chores.


Filed under: bloglines gregarius rss

3 Comments

I had a play with Google's online rss reader. It is very fancy but I don't think it would tempt me away from bloglines:

  • You only see one article at a time. Bloglines shows you all the articles from a feed in one page which I think is better for spotting the interesting stuff. You can step through the articles with the j button in both bloglines and the google reader (coincidence? Group homage to vi?)
  • I don't want to go through a mix of articles from all feeds in chronological order. I like to go through feed by feed. If I open the list of feeds in the google reader it occupies half the screen on my d410.
  • It is a bit buggy: at one point the 'loading' splash thing got stuck on. But like most google software it is beta.
  • It carefully strips the colour coding and indentation out of syntax highlighted code. Bloglines strips the indentation. I am not sure why they do this, is it the presentation police?

What I am looking for is the mythical rss aggregator with bayesian filtering that will learn what interests you and only show that. I'd write one if there were only nine days in every week.
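
For what it's worth, the core of such a filter is not much code. Below is a rough sketch in Python 3 of a two-class naive Bayes classifier trained on article titles. It is only an illustration of the idea, not an existing aggregator feature: the class name, the 'like'/'skip' labels and the training titles are all made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class InterestFilter:
    """Tiny naive Bayes classifier with two classes: 'like' and 'skip'."""

    def __init__(self):
        self.word_counts = {"like": Counter(), "skip": Counter()}
        self.doc_counts = {"like": 0, "skip": 0}

    def train(self, text, label):
        """Record one article the reader either liked or skipped."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text):
        """Return log-odds of 'like' over 'skip'; positive means probably interesting."""
        vocab = set(self.word_counts["like"]) | set(self.word_counts["skip"])
        total_docs = sum(self.doc_counts.values())
        log_odds = 0.0
        for label, sign in (("like", 1), ("skip", -1)):
            # Add-one smoothing on both the class prior and the word counts.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_odds += sign * log_p
        return log_odds


if __name__ == "__main__":
    f = InterestFilter()
    f.train("Python script to download podcasts", "like")
    f.train("Drupal module hack", "like")
    f.train("New MP3 player announced", "skip")
    f.train("Another mobile phone launched", "skip")
    for title in ("Python podcast script", "New mobile phone"):
        print(title, round(f.score(title), 2))
```

An aggregator would call train() whenever an article is read or dismissed and hide anything whose score falls below some threshold. The hard part is collecting enough honest feedback, not the arithmetic.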

Conclusion: google see rss as the future and want to slap adverts all over my blatherings.


Filed under: bloglines google rss

1 Comment

If my rss subscribers are getting a flood of seemingly duplicate postings it is because I decided to reformat my posts a tad to emphasise the awtags links below the node bodies. I edited the awtags source to change the word 'tags:' to the more informative 'Related Topics:' and I edited the awTags_TagLinks css style to delineate the links from the node body:

    .awTags_TagLinks {
        padding: 5px;
        margin: 20px 10px 10px 10px;
        border-top: 1px solid black;
    }

    .sticky .awTags_TagLinks {
        visibility: hidden;
        display: inline;
    }

This also hides awtags from sticky nodes which I use at the top of tag descriptions.


Filed under: awtags blogging drupal rss


I decided to update my python podcast script ipydder.py. Based on the comments for the old version I ditched the horrible sax parser and used the lovely Beautiful Soup. This has simplified it a great deal, 49 lines of code if I delete all the comments.

    #!/usr/bin/python
    #
    # ipydder mark 2
    # Download podcasts
    #

    import BeautifulSoup
    import os

    def GetPodcast( strFeed, strTargetDir):
        #
        # Get the rss file.
        #
        strRSS = os.popen4( 'wget -q -O - "%s"' % strFeed)[1].read()

        #
        # Parsing it.
        #
        oSoup = BeautifulSoup.BeautifulStoneSoup()
        oSoup.feed( strRSS)

        #
        # Go through the items in the feed.
        #
        for oItem in oSoup( 'item'):
            #
            # Look for enclosures in the item.
            #
            strUrl = oItem.enclosure[ 'url']
            if not strUrl:
                #
                # If no enclosures, see if there is a link
                #
                strUrl = oItem.link.string

            if strUrl:
                #
                # Remove whitespace.
                #
                strUrl = strUrl.strip()

                #
                # Look for mp3 enclosures
                #
                if strUrl[-4:].lower() == '.mp3':
                    #
                    # Ok there is an mp3 file to be had
                    #
                    # Determine unique id for this item
                    #
                    strGuid = oItem.guid.string
                    if not strGuid:
                        #
                        # No guid, lets use the file url.
                        # Presumably each file name is unique.
                        #
                        strGuid = strUrl

                    #
                    # See if guid has already been downloaded by searching the database
                    #
                    strDBFile = strTargetDir + '/.ipydder.db'

                    try:
                        oDB = [ strLine.strip() for strLine in open( strDBFile).readlines()]
                    except(IOError):
                        oDB = []

                    if not strGuid in oDB:
                        #
                        # try to download the file
                        # Use wget as a more robust way to download big mp3 files
                        #
                        print 'Downloading %s' % strUrl

                        os.chdir( strTargetDir)
                        strResults = os.popen4( 'wget -nv "%s"' % strUrl)[1].read()

                        #
                        # Join with a path separator: strTargetDir has no trailing slash here.
                        #
                        strFileName = os.path.join( strTargetDir, os.path.basename( strUrl))
                        print 'Downloaded file %s' % strFileName
                        print strResults

                        #
                        # Remember that the file has been processed, don't download it again.
                        #
                        oDB.append( strGuid)
                        open( strDBFile, 'wt').write( "\n".join( oDB))

    strHomeDir = os.environ['HOME'] + '/'

    GetPodcast( "http://radio.weblogs.com/0001014/categories/dailySourceCode/rss.xml",
                    strHomeDir + "Podcasts/DailySourceCode")
    GetPodcast( "http://www.morningcoffeenotes.com/rss.xml",
                    strHomeDir + "Podcasts/CoffeeNotes")

Filed under: mp3 python rss

2 Comments

rss is a standard for publishing information such that it can be read using a news aggregator. Instead of you visiting a list of sites every day, the aggregator reads the articles from the sites and puts them in one place in a single stream of news.


Filed under: rss


I use bloglines for my rss aggregation. On it I subscribe to my own rss feeds to reassure myself that they are reaching the outside world. Since I installed the awtags module my postings have all contained links to their tag entries. This doesn't really bother me, it may encourage people to visit my site if they see useful links in the rss feed.

Bloglines tries to deliver articles that have changed and it appears to do this by comparing the contents of the rss file with its previous contents. If there are any small changes bloglines displays the article in the same way as it does for new articles.
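
I have no idea how Bloglines really does it, but the behaviour I am seeing is what you would get from something like the Python 3 sketch below: remember a fingerprint of each item from the last poll and treat any difference as news. The function names and the choice of fields to hash are my own assumptions.

```python
import hashlib


def fingerprint(item):
    """Hash the fields a reader actually sees; `item` is a dict of strings."""
    text = "\n".join(item.get(key, "") for key in ("title", "link", "description"))
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def classify_items(items, seen):
    """Split items into new, changed and unchanged, updating `seen` in place.

    `seen` maps an item's guid to the fingerprint recorded on the previous poll.
    """
    result = {"new": [], "changed": [], "unchanged": []}
    for item in items:
        guid = item["guid"]
        digest = fingerprint(item)
        if guid not in seen:
            result["new"].append(guid)
        elif seen[guid] != digest:
            result["changed"].append(guid)
        else:
            result["unchanged"].append(guid)
        seen[guid] = digest
    return result


if __name__ == "__main__":
    seen = {}
    post = {"guid": "node/1", "title": "A post", "description": "Body text"}
    print(classify_items([post], seen))   # first poll: the item is new
    post["description"] += " Related Topics: rss"
    print(classify_items([post], seen))   # tag links added: the item shows as changed
```

With a scheme like this, adding a tag link or fixing one spelling mistake is enough to change the fingerprint, so an aggregator that shows changed items the same way as new ones will redisplay the whole article.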

I don't like to hassle people when I edit articles to correct spelling mistakes or whatever, so I have altered the drupal ping module to only ping (i.e. tell the outside world) if a new posting is created, not if it is modified. However, bloglines appears to poll my feed and so the slightest change will result in articles appearing as if they are new. Today I assigned some tags to some older articles using awtags and this was sufficient for them to be displayed by bloglines as new articles, as they were still in the rss feed.

This is the long way of apologising to anyone who thinks I am winding them up by republishing old articles with no noticeable changes.

I see I have a new subscriber on bloglines. Welcome, hope you find my whitterings interesting. I am getting more visitors since I started using awtags, especially from technorati.



I like to listen to the Daily Source Code podcast while rowing. I've always liked talk radio and this is like talk radio with F words.

The podcast files are about 20Megs and they take a few minutes to download on my 750k broadband connection. I wanted to have my Ubuntu box download them automatically so they would be ready for me (as was the original vision of podcasting).

I decided to knock up a python script to do it for me as I didn't want a gui tool and the only other script I found was written in ruby :sick:. This script downloads the rss feed from the podcasting site, looking for mp3 files. Any that it finds it will download. It remembers which files it has downloaded so you can listen to them and delete them and they won't be downloaded again. It doesn't play the files, that's done by Totem.

Other podcasts can be added easily enough.

    #
    # Download podcasts
    #
    import xml.parsers.expat
    import re
    import os
    import traceback
    import sys

    class FeedParser:
      #
      # The three expat handler functions are start_element, end_element and char_data.
      #
      def __init__( self):
        self.oElementStack = []
        self.bItem = False
        self.oItem = None

      def Parse( self, strFeed, strRETitle, strTargetDir):
        #
        # Parse feed, given url and regular expression describing podcast title.
        #
        self.oRETitle = re.compile( strRETitle)
        self.strTargetDir = strTargetDir

        #
        # Open database to remember what files have been dealt with
        #
        try:
          self.oDB = open( strTargetDir + '.pypodder.db').read().split( '\n')
        except IOError:
          self.oDB = []

        p = xml.parsers.expat.ParserCreate()

        p.StartElementHandler = self.start_element
        p.EndElementHandler = self.end_element
        p.CharacterDataHandler = self.char_data

        #
        # Read feed using wget as it is robust.
        #
        strRSS = os.popen4( 'wget -q -O - "%s"' % strFeed)[1].read()
    #    print strRSS
        p.Parse( strRSS)

      def start_element(self, name, attrs):
        #
        # Put element on element stack along with empty data array
        #
        self.oElementStack.append( [name, []])

        #
        # If this is the start of an item then reset the item contents
        #
        if name == 'item':
          self.bItem = True
          self.oItem = {}
        elif name == 'enclosure':
          #
          # If element is an enclosure then get the url
          #
          strUrl = attrs.get( 'url')
          if strUrl:
            if self.bItem:
              self.oItem['enclosure']=strUrl

      def end_element(self, name):
        #
        # Pop complete element from the element stack
        #
        strElement, strData = self.oElementStack.pop()

        #
        # Check for sillies
        #
        if strElement != name:
          raise ValueError( "Element mismatch: %s != %s" % (name, strElement))

        if strElement != 'item':
          #
          # Get data associated with element and store in item
          #
          if self.bItem:
            strData = "".join( strData).strip()

            self.oItem[strElement] = strData
        else:
          #
          # Item is complete.
          # See if item title matches the re provided
          #
          if self.oRETitle.match( self.oItem.get( 'title', '').encode()):
            #
            # Try to get url of mp3 file
            #
            strUrl = self.oItem.get( 'enclosure')
            if not strUrl:
              #
              # No enclosure, try the 'link' field.
              #
              strUrl = self.oItem.get( 'link', '').encode()

            if strUrl and strUrl[-4:].lower() == '.mp3':
              #
              # See if item has a guid (default to '' so a missing guid does not crash)
              #
              strGuid = self.oItem.get( 'guid', '').encode()
              if not strGuid:
                #
                # If no guid then use the link url as a guid
                #
                strGuid = strUrl

              #
              # See if guid has already been processed in the database
              #
              if not strGuid in self.oDB:
                #
                # try to download the file
                # Use wget as a more robust way to download big mp3 files
                #
                os.chdir( self.strTargetDir)
                strResults = os.popen4( 'wget -q "%s"' % strUrl)[1].read()

                strFileName = self.strTargetDir + os.path.basename( strUrl)
                print 'Downloaded file %s' % strFileName
                print strResults

                #
                # Remember that the file has been processed, don't download it again.
                #
                self.oDB.append( strGuid)
                open( self.strTargetDir + '.pypodder.db', 'wt').write( "\n".join( self.oDB))

          self.oItem = None
          self.bItem = False

      def char_data(self, data):
        #
        # Append data to element.
        #
        self.oElementStack[-1][1].append( data)

    FeedParser().Parse( "http://radio.weblogs.com/0001014/categories/dailySourceCode/rss.xml",
                           "Daily Source Code for.*", "/home/peter/DailySourceCode/")

I've set up cron to do this for me at 5:11pm every day, just before I get home from work for a row before eating (I don't recommend a half hour rowing with a full stomach).

crontab -e

11 17 * * * /usr/bin/python /home/peter/pypodder.py

Update: I have altered the script above. There are three main changes:

  • It now uses wget to do the downloading as it is more robust than using urllib2, which had a tendency to time out.
  • It is now using the proper Daily Source Code RSS feed, rather than Adam Curry's Weblog as the latter sometimes got the file names wrong.
  • The history of what has been downloaded is now a simple text file, making it easy to delete lines if necessary.

Filed under: mp3 python rss ubuntu

5 Comments

Interesting to see that the new MSN Search can generate results in RSS format which is essentially XML and easy enough to crunch in python. You just submit a query and add '&format=rss' to the end, e.g.:

http://search.msn.com/results.aspx?q=peter's+blog&format=rss

gives this. What's useful about this?

  • can interrogate a search engine without screen scraping (hassle, hoping web page format does not change) or using the google api (limited to 1000 searches/day). There may well be terms and conditions that will stop bots doing 10,000,000 searches a day.
  • could set up aggregator to repeat search terms.
  • could be a use for firefox's live bookmarks.
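
As a rough idea of what 'crunching in python' might look like, here is a Python 3 sketch that pulls titles and links out of an RSS 2.0 document using the standard library. The sample feed is made up and only assumes the usual item/title/link structure; I have not checked what extra elements MSN actually puts in its results, and fetching the real URL is left out.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for a search result feed, in plain RSS 2.0 shape.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Search results (made-up sample)</title>
    <item>
      <title>Peter's Blog</title>
      <link>http://example.org/</link>
      <description>First made-up result.</description>
    </item>
    <item>
      <title>Another result</title>
      <link>http://example.org/other</link>
      <description>Second made-up result.</description>
    </item>
  </channel>
</rss>
"""


def parse_results(rss_text):
    """Return a list of (title, link) tuples, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        results.append((title, link))
    return results


if __name__ == "__main__":
    for rank, (title, link) in enumerate(parse_results(SAMPLE_RSS), start=1):
        print(rank, title, link)
```

Since the results arrive in ranked order, the position of a given site in the list is just its index, which is handy for keeping an eye on where a page sits for a search term.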

Searching MSN for 'peter's blog' puts this site first :-) Google reserves this for Tom Peter's Blog.


Filed under: blog firefox google python rss

1 Comment

Ok, 1&1 have pulled the bisiand.me.uk domain without giving me any chance to transfer it away. I strongly recommend you keep away from that company. I won't link to them in case it is considered as some form of endorsement.

Luckily I foresaw this and already had a new domain name in place. At this moment in time:

  • bisiand.me.uk does not resolve
  • I've lost all page rank I ever had
  • my rss subscribers (and there are at least 2) are lost, unless they read the above post and bother to resubscribe.
  • the various sites disseminating my rss feed will need to be informed.
  • my email addresses are dead. I have to make sure email gets to petersblog.org and I have to tell everyone my email address has changed.

I've got some work to do.

I asked 1&1 how to transfer my old domain as a technical query and they did reply with a standard response but it does not look like a straightforward matter. I have to sort something out soon.

Not sure there is a lot of point in writing this as few will ever know it exists. However, one day google will index it and I hope my opinion of 1&1 dissuades at least one potential customer.


Filed under: google rss


Looking through my access logs for 30 Nov to 16th Dec, some thoughts:

  • Something called Rojo has downloaded my rss feed 54 times. Going to their site they are cagey about what they do and give accounts by invite only. Hey, rip my data and don't tell me why.
  • my RSS feed was downloaded 24972 times. At least 2041 of these were parasites trying to flood my referrer logs but it still may be a sign that a human being somewhere is reading this, not just bots.
  • 470 visitors via google. I often wonder if they find what they are looking for. Rarely get feedback :-( Then again, I never give feedback when I go to someone else's site via google.
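
Counts like these are easy to pull out of a log with a few lines of Python. The sketch below (Python 3) assumes the log is in Apache's 'combined' format and that the feed lives at /rss.xml; both are assumptions, and the sample log lines and user agent names are made up.

```python
import re
from collections import Counter

# Apache "combined" format:
# host ident user [time] "request" status size "referrer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '1.2.3.4 - - [30/Nov/2004:10:00:00 +0000] "GET /rss.xml HTTP/1.1" 200 5120 "-" "ExampleReader/1.0"',
    '1.2.3.4 - - [30/Nov/2004:11:00:00 +0000] "GET /rss.xml HTTP/1.1" 304 - "-" "ExampleReader/1.0"',
    '5.6.7.8 - - [30/Nov/2004:11:05:00 +0000] "GET /rss.xml HTTP/1.0" 200 5120 "http://spam.example/" "ExampleBot/2.0"',
    '9.9.9.9 - - [30/Nov/2004:11:10:00 +0000] "GET /node/42 HTTP/1.1" 200 9000 "-" "Mozilla/5.0"',
    'this line is not in the expected format',
]


def count_feed_hits(lines, feed_path="/rss.xml"):
    """Count requests for feed_path per user agent, skipping unparseable lines."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path") == feed_path:
            hits[match.group("agent")] += 1
    return hits


if __name__ == "__main__":
    for agent, count in count_feed_hits(SAMPLE_LOG).most_common():
        print(count, agent)
```

Grouping on the referrer field instead of the agent would be one way to spot the referrer-log spammers, since they tend to send the same bogus referrer over and over.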

If you read this, email me. Just say 'I read it'.


Filed under: google rss

2 Comments

I noticed in my Drupal logs the following error:

Weblogs replied flerror 1 message Can't accept the ping because the URL must begin with http://.

I had previously hacked ping.module to log the server response as I was suspicious that it was not working. I changed all the ping code to log both success and error messages (before it just logged errors without giving error details):

  $result = $client->send($message);

  if (!$result || $result->faultCode()) {
    watchdog("error", "failed to notify 'weblogs.com' (RSS)");
  }
  if ($result && !$result->faultCode()) {
    watchdog("regular", "Weblogs replied " . $result->serialize());
  }

Anyway, I hacked in a fix to the problem by specifying the full url to my site:

  // PCW: complains about no http:// $message = new xmlrpcmsg("rssUpdate", array(new xmlrpcval($name), new xmlrpcval($feed)));

  $message = new xmlrpcmsg("rssUpdate", array(new xmlrpcval("Peter's Blog"), new xmlrpcval('http://bisiand.me.uk/rss.xml')));

I did a manual cron run and the error message has gone away. Obviously this is a hack just for me, some Drupal god may have already fixed this (I am still on 4.4.0) or come up with a proper fix.


Filed under: blog drupal rss

2 Comments

I was thinking today how cool and hip I was using Bloglines as an RSS aggregator compared to people who just go through a list of favourite web sites every day. So 1995. Go RSS, join the 21st century.

Then I thought about it some more and this is essentially what I am doing with bloglines: I have a list of feeds that I go through one by one, looking through the articles. No real difference to going through a sidebar full of bookmarks, looking at websites.

If I am going to evangelise RSS aggregation, what am I going to say are the advantages? Why bother?

  • Uniform presentation
  • No adverts (for now: I cannot imagine this will forever be true)
  • Um

I'm going to continue with bloglines as the reasons above are enough for me, I just won't try to evangelise. As it happens I am not a good evangelist: I have yet to convert anyone to FireFox for example which should be pretty easy. Even after I removed the adware from my boss's PC he showed no interest in FireFox.

Maybe I should examine my RSSing habits and figure out if I am missing the point in some way.

The only people I know who are at all interested in gmail are those of a geeky persuasion.


4 Comments

Probing around I found a total of 6 comments waiting in my approval queue that I didn't know were there. I must sort out a way of being notified when comments are made. Python Community Server had a cool facility where it generated an RSS feed of new comments.

I found a slight usability flaw in the Drupal comment approval: it does not show you the context of the comment i.e. the posting that it refers to. I will have to spend some time going through the postings and figuring out what they were commenting upon.

Some folk want me to feed back some of my comments to the Drupal site. Will do. I am such a lurker.


Filed under: drupal python rss


Added an atom feed to keep whatever keeps trying to find one happy. I did this as follows:

  • Installed atom module. Zero documentation.
  • Enabled module in drupal
  • Added this to .htaccess file:
    RewriteRule atom.xml atom/feed/1
    

The link is also here.

Regarding any Atom v RSS wars, I'm probably on the RSS side as the tools I have used (Python Desktop Server, Drupal) have supported it by default and they work for me. Atom is for those Blogger.com instant boilerplate blog folk.


1 Comment

My new Site5 blog is up and running. It still needs theming but I've uploaded all my old blog postings and they are now all searchable. Google is finding the rss.xml file to do whatever it wants but google still lists a broken old version of bisiand.me.uk: I await a proper googlebot scan. Drupal is pinging http://blo.gs but I'm not sure if http://weblogs.com is functional as I can't find a search function on it (what is it's point anyway? I'm only pinging it because it is hardwired into Drupal).

The Site5 netadmin thing has a search engine submission function to submit one's site to various engines. I tried it and got an assortment of 'Site not found' errors; only google appeared to work.

If I appear obsessed with page ranking it is merely because I want folk to find this site and maybe get something from it. Otherwise why should I bother?

I've avoided putting one of those archive calendar things on this site. This is for a few reasons:

  • I think they shout out 'boilerplate blog'.
  • If I cannot remember the date I made a particular post, who else is going to be interested in what I was writing on that date?
  • If I want to find an old post I will search for it.
  • They are as naff as hit counters and dancing hamsters.

ToDo:

  • Sort out a new RSS aggregator, preferably web based so I can use it from home laptop, home desktop and work desktop.
  • Turn off home server. Site5 should make it redundant and switching it off will save me about £3 a month in electricity which almost pays for the Site5 hosting.


I noticed in my Drupal logs that google is looking for a file called rss.xml on my site:

09/10/2004 - 13:23  404 error: 'rss.xml' not found	Anonymous

I am eager to keep google happy but how to create such a file? I had a brainwave and added a mod_rewrite rule to my .htaccess file:

RewriteRule rss.xml blog/feed/1

So any attempts to access rss.xml trigger the link that my RSS buttons point to. Note that I have 'Clean Urls' turned on.
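
One thing worth knowing about that rule: the RewriteRule pattern is a regular expression and it is not anchored, so it matches more than the literal file name. The rough sketch below uses Python's `re.search` as a stand-in for mod_rewrite's matching. The request paths are made up, and the anchored pattern is only a suggested tightening, not the rule actually in the .htaccess file.

```python
import re

# The pattern as written in the .htaccess rule: unanchored, and "." matches any character.
LOOSE = re.compile(r"rss.xml")
# A stricter variant (an assumption for illustration, not the rule in use).
STRICT = re.compile(r"^rss\.xml$")

# Made-up request paths, written without a leading slash as mod_rewrite
# sees them in a per-directory (.htaccess) context.
paths = ["rss.xml", "old/rss.xml.bak", "rssXxml", "blog/feed/1"]


def matches(pattern, candidates):
    """Return the candidate paths that the pattern would rewrite."""
    return [p for p in candidates if pattern.search(p)]


loose_hits = matches(LOOSE, paths)
strict_hits = matches(STRICT, paths)

print("loose :", loose_hits)
print("strict:", strict_hits)
```

On a small site the loose pattern is unlikely to do any harm, but the anchored form only ever fires for a request for rss.xml itself.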

Come back google, I'm waiting for you!



Latest discovery is this awkwardly named site. It's a stream of interesting links recently submitted by people. I find it interesting to scan through when I run out of RSS feeds. In fact it can generate an RSS feed but that would probably swamp my aggregator.


Filed under: rss


I was settling down to my lunchtime RSS trawl when to my horror I found no entries. Going into the PyDS Aggregator configuration I found that all my feeds had failed to download. I got a stack trace out of it which said:

There was an error when downloading the feed
Exception "exceptions.AttributeError: 'NoneType' object has no attribute 'lower'"
/usr/lib/python2.3/site-packages/PyDS/AggregatorTool.py[1560] in updateFeed
/usr/lib/python2.3/site-packages/PyDS/DownstreamTool.py[551] in download
/usr/lib/python2.3/site-packages/PyDS/DownstreamTool.py[364] in _download
/usr/lib/python2.3/urllib.py[181] in open
/usr/lib/python2.3/site-packages/PyDS/DownstreamTool.py[80] in open_http
/usr/lib/python2.3/urllib.py[269] in open_http

I think this one is caused by an incompatibility between PyDS 0.7.2 and the version of urllib.py in Python 2.3.3. The function urllib.URLOpener.open_http seems to have changed a lot and it has broken the derived version in DownstreamTool.UrlOpener.open_http.

The quick fix for me was to rename DownstreamTool.UrlOpener.open_http to hide_open_http so that the version in urllib would be used. This worked and I now have 125 items to read.

This fix is obviously a hack: it'll only work for urls that start with 'http' (i.e. all mine), and 'file' may still be broken. The proper fix would be to review the differences between the pyds version of UrlOpener and the new urllib version but that is beyond me as I don't even know why pyds has its own urlopener.
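
Why the rename works can be shown with a toy example. The classes below are hypothetical stand-ins, not the real urllib or PyDS code: the base class picks a handler by building the name "open_" plus the URL scheme, so an override under any other name is simply never found and the inherited method runs instead.

```python
class BaseOpener:
    """Simplified, hypothetical stand-in for the opener class in urllib."""

    def open(self, url):
        scheme = url.split(":", 1)[0]
        # Dispatch by name: "http" -> open_http, "file" -> open_file, ...
        handler = getattr(self, "open_" + scheme)
        return handler(url)

    def open_http(self, url):
        return "base opened " + url


class BrokenOpener(BaseOpener):
    """Stand-in for a subclass whose override no longer fits the base class."""

    def open_http(self, url):
        value = None  # pretend the newer base class stopped supplying this
        return value.lower()  # raises AttributeError, like the trace above


class PatchedOpener(BaseOpener):
    """The same subclass after the rename: open() can no longer find the override."""

    def hide_open_http(self, url):
        value = None
        return value.lower()


broken = None
try:
    BrokenOpener().open("http://example.org/rss.xml")
except AttributeError as exc:
    broken = str(exc)

result = PatchedOpener().open("http://example.org/rss.xml")
print(broken)
print(result)
```

The cost is the same as in the real fix: whatever the override was doing on top of the base behaviour is lost along with the bug.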

Again it's my fault for using Gentoo and not using PyDS with the 'approved' versions of other libraries.


Filed under: gentoo linux pyds python rss


I am in the process of installing Gentoo linux on a PC at work and also an old pc at home (the joys of ssh). Over the years I have installed Slackware, Redhat, Suse and Debian but none of those was as complex as installing Gentoo. The handbook is absolutely essential and it is probably best to read it through before starting (tip: install stage 3).

Installation is not just running a setup program, it is down to the basics of mounting file systems, chrooting, untarring, building kernels etc. I have learnt a lot from it and I now think I am competent enough to fix a broken linux system from a boot floppy. I had to install Gentoo this way as the Gentoo CD would not boot and I had to find some linux boot floppies to get things rolling.

Setting it up takes ages as it is building most things from source. The USE preferences variable allowed me to tell it I want perl and python but not ruby. I ran an emerge vim to get vim installed and, to my surprise, I got VIM 6.3 (which I didn't know was out) with python and perl support built in!

I haven't set up X yet. I may do it at home as an exercise but I dread to think how long it will take. You are supposed to be able to install precompiled packages but at my one attempt (lilo) it still spent 20 minutes compiling it from source.

Gentoo in a nutshell:

  • Don't think about it unless you have broadband
  • Don't think about it unless you are happy at a command prompt
  • Don't think about it unless you are patient

    Gentoo: something to do in the background while reading RSS feeds.


Filed under: gentoo linux python rss ssh vim


I feel the need for a status report on various stuff I've mentioned in this blog.

Palm Tungsten T2

I haven't used this so much recently, I only use it as a diary. This is partly because it is summer and I don't wear a coat with pockets to carry it around. It's too big for trouser pockets. I do my blogging with Python Desktop Server, I don't use DayNotez any more.

Dell Inspiron 500m

I love my notebook, I'm using it now, I'd say it was my primary PC. I sit on the sofa in front of the TV and go through RSS feeds. My main gripe with it is that sometimes when it comes out of hibernate it does not see the wireless network and I have to hibernate it and unhibernate it again to kick it into life. Oh, also the SVideo output is only black and white. The laptop is just nice, no noisy fans and it doesn't make my lap overheat. About 2 hours of battery life.

Desktop PC

Hasn't crashed recently but that may be because I don't use it very often. The only time I used it this week was as a print server. The drivers with the PC TV card might have fixed the PCI latency issues. There are a number of PCs at work, including the firewall PC, that use VIA chipsets and they randomly hang as well. I have no love for VIA.

Python Desktop Server

Use it most days. I use it at work for my engineering logs which are behind a firewall. I haven't got around to adding tools or anything, I mainly use it for RSS aggregation. Having the aggregation in the web browser makes it so convenient for following links: in firefox I middle-click and read in a new tab. As a blogging tool my main gripe is the lack of a preview facility: checking links and formatting before uploading. I have to set it to offline mode before I start composing.

Debian

My debian server is still whirring away (noisy fans this summer but it's in a room I don't go in much). It handles email and Python Desktop Server and is also useful as a squid proxy that I can access from work through an SSH tunnel. I can use this to check the work firewall, to make sure it is possible to get in through the firewall. I might change server to a desktop pc as the laptop is a bit slow (166MHz pentium). That would allow me to make it a headless X server.

Object Desktop

I got fed up with animated fish using my CPU time in DesktopX. I use WindowBlinds on the laptop to make it a bit more interesting but I don't think it was worth buying.

Intellimail

Still using it at home but I am tempted to move to IMAP + thunderbird like I use at work. Awaiting a home server decision.

Thunderbird

It's ok if a bit utilitarian when compared to Intellimail. However it handles IMAP, if a little flakily (it sometimes displays Inbox(3) but doesn't show the new messages).

Firefox

Love it. I only use IE for broken websites.

iTunes

May register for it today. If I can buy just the tracks I want and blow them to an audio CD then I see no need to buy CDs that are 75% filler material.

Furl

I'm beginning to see Furl as a place to look for websites that other people find interesting. When I run out of RSS articles I now try, e.g. this.

Motorbike

Sold for the asking price to a dealer who was advertising for CBR600s.



I've set up a horribly complex email system:

  • fetchmail polls four pop email accounts and forwards them all to a single account under exim.

  • qpopper provides pop access to the mail account

  • SpamBayes pop3 proxy provides spam checking

  • IncrediMail accesses the single pop account

  • IncrediMail message rules split the messages into various folders and also filter out the spam based on SpamBayes markup.

Seems like a lot of complexity, even more considering the URL redirection going on to give the email nice names. I could have exim pass the mail through Spambayes but the pop3 proxy includes a nice web-based review, training and configuration server.
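
The last step, filing mail according to the SpamBayes markup, can be sketched in a few lines of Python. This assumes the proxy adds a header called X-Spambayes-Classification with a value of spam, ham or unsure, optionally followed by a score. The folder names and the sample messages are made up for illustration.

```python
from email import message_from_string

# Assumed name of the header added by the SpamBayes proxy.
HEADER = "X-Spambayes-Classification"


def folder_for(raw_message):
    """Pick a (hypothetical) folder name from the SpamBayes verdict."""
    msg = message_from_string(raw_message)
    # Drop an optional "; score" suffix and normalise the case.
    verdict = (msg.get(HEADER) or "").split(";")[0].strip().lower()
    if verdict == "spam":
        return "Spam"
    if verdict == "unsure":
        return "Review"
    # Ham, or no markup at all, goes to the inbox.
    return "Inbox"


sample_spam = (
    "From: a@example.com\nSubject: Cheap pills\n"
    "X-Spambayes-Classification: spam; 0.99\n\nBuy now\n"
)
sample_unsure = (
    "From: b@example.com\nSubject: Hello\n"
    "X-Spambayes-Classification: unsure\n\nHi there\n"
)
sample_plain = "From: c@example.com\nSubject: No markup\n\nHello\n"

for raw in (sample_spam, sample_unsure, sample_plain):
    print(folder_for(raw))
```

Sending unmarked mail to the inbox is the cautious default here: if the proxy is down, nothing gets filed as spam by mistake.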

I use IncrediMail because I get a stupid kick out of having little animated men in my email. It's also a nice robust email client. I used to use Mozilla email but it had trouble cancelling hourglasses, which annoyed me. I was talking to someone recently who says Thunderbird is still doing that.

I find the Spambayes technology interesting. I'd like to run it on RSS feeds and the like to sort wheat from chaff.


Filed under: email rss thunderbird


I found this in the Python Desktop Server User Manual :

"Is there any way, to automatically subscribe to comments for my entries on the cloud?

Comments are a server-side feature. If you subscribe to the Python Community Server, use the following url:

 http://pycs.net/system/comments.py?u=&full=2&format=rss 

"

This sounded like a great idea: I get notified of comments about my blog by adding an RSS feed to my aggregator. However, looking at this, I could not see how it would know to download my comments. A quick look through the Python Community Server source code revealed the trick. The following URL returns just my comments:

http://pycs.net/system/comments.py?u=0000348&full=2&format=rss

0000348 is my user number.
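
For anyone wanting the same feed for their own account, the URL can be put together like this. It is only a sketch: the helper name is made up, and the zero padding to seven digits is an assumption based on the one user number shown above.

```python
from urllib.parse import urlencode

BASE = "http://pycs.net/system/comments.py"


def comments_feed_url(user_number):
    """Build the comments feed URL for one Python Community Server user."""
    query = urlencode({
        # Assumption: user numbers are zero padded to 7 digits, as in 0000348.
        "u": str(user_number).zfill(7),
        "full": 2,
        "format": "rss",
    })
    return BASE + "?" + query


url = comments_feed_url(348)
print(url)
```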


Filed under: blog python rss


I got my first comments on my blog from none other than the author of Python Desktop Server, Georg Bauer. He put me right on a few of my comments. Georg, if you are reading this, thanks for writing a great tool. I am only just realising its potential and I am already an RSS aggregator addict. If there is any way I can help you with this then let me know.


Filed under: blog pyds python rss


Created a Freshmeat (http://www.freshmeat.net) user account in the hope that it would allow me to create a customised RSS feed of just the things I am interested in (Python.*). The only option along these lines that Freshmeat offers appears to be to send me email and that has got me thinking about a mail robot to receive stuff like this and shove it on my Python Desktop Server desktop. I don't want to have to parse it and build my own RSS feed but I could put an artificial entry into the news aggregation.
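
If a full feed were available, the kind of filtering meant here is not much code. The sketch below is a guess at what it could look like, not something Freshmeat provides: it keeps only the items whose title matches a pattern such as Python.*, using a made-up sample feed and the standard library XML parser.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample feed standing in for a real release feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Releases</title>
    <item><title>Python Imaging Toolkit 1.0</title><link>http://example.org/1</link></item>
    <item><title>Some Perl Thing 0.3</title><link>http://example.org/2</link></item>
    <item><title>python-widgets 2.1</title><link>http://example.org/3</link></item>
  </channel>
</rss>"""


def filter_feed(feed_xml, pattern):
    """Return the parsed feed with only the items whose title matches pattern."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    wanted = re.compile(pattern, re.IGNORECASE)
    # Copy the list first so removing items does not upset the iteration.
    for item in list(channel.findall("item")):
        title = item.findtext("title", default="")
        if not wanted.match(title):
            channel.remove(item)
    return root


filtered = filter_feed(SAMPLE_FEED, r"Python.*")
kept_titles = [item.findtext("title") for item in filtered.iter("item")]
print(kept_titles)
```

The filtered tree could be written back out with `ET.tostring(filtered)` and served to the aggregator as an ordinary feed.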

One of my long term projects is to set up a way to organise projects and to be able to combine project blogging with email and Wiki: organise emails into threads, add comments etc. I recently discovered the .filter mechanism in emix4 and I am only starting to see the possibilities. I normally use Outlook and anything related to Mail Rules and scripting has always been too flaky to be of any use.

So many cool things I could do if I just had the time.


Filed under: email outlook pyds python rss


Been playing with Radio Userland. It is a mixed bag of things:

  • a local web server, driving the main application through a web browser

  • a blogging tool. Type a post in a box and publish. It gets uploaded to Userland's servers and there it is.

  • a news aggregator, it aggregates RSS feeds for perusal through the browser

  • Content Management System: put files in directories and they get rendered and uploaded automatically.

  • Scripting tool

  • Outliner

Good points:

  • Easy blogging for newbies

  • Can email it to post blogs

  • Easy to knock up a web site

  • The outliner has neat features like linking to web pages or other outlines which can be opened in place.

  • It is heavily scripted.

Bad points:

  • I had to reinstall it because the upload got broken

  • The scripting language is clean but it's no Python

  • The outliner is not nice to use. It is a fairly crude app by modern windows standards. Bonsai is far nicer to write in.

  • The outliner is used to edit the system scripts: no syntax highlighting and all nodes collapsed by default. I find this hard to read. I miss Vim.

  • Their web servers can be very slow: uploads can time out.

  • Uploading is all done automatically in the background so you have to keep looking at an events page to see if it has worked.

My head says dump it but something in my heart wants me to keep it. If it ran on linux I probably would.



I've decided to give the Python Desktop Server (http://pyds.muensterland.org/) a try. It seems to be almost a clone of Radio UserLand and from the website screenshots it looks ok. I'm having to build it from source: there is a debian package for it but it did not work (missing modules). There are steps on the website to build it which are almost bash scripts. Something called MetaKit has gone up a point release which caused a slight glitch. The build instructions want python 2.2 so I've used that, even though I would prefer to use version 2.3, mainly because I like to use True and False.

I'm trying pyds because:

  • It's python

  • I was tempted by Zope again because I found debian packages to set it up but they did not work so I retreated. This would have been Zope+CMF+Plone which is about as big a learning curve as conversational Chinese. Also I could not find a nice Zope RSS Aggregator.

  • It uses DocUtils and Cheetah, technology I like.

I downloaded gnoppix for a play. I downloaded it to the server as it was a 700M iso image and I didn't want to restart if my pc hung (which it didn't). It downloaded to the server at 100kb/s which was neat. When I copied it to the pc it only transferred at 200kb/s which I put down to the PCMCIA networking on the server. It's a 100MHz network (RJ45, whatever it's called) but that does not seem to matter. I was doing some ripping with RealOne which hung twice and the second time it locked out the CD so I didn't get round to blowing the CD.


Filed under: cheetah pyds python rss