Peter's Blog

Redefining the Impossible

Items filed under hosting


I gave Rackspace Cloud Server a try today. It looks good on paper: you pay for a virtual server by the hour, with prices starting at 1.5 US cents per hour. If you run it constantly the prices seem much cheaper than Slicehost (who are involved in this), but you have to add the cost of bandwidth per gigabyte (through a mere 10 Mbps pipe).

The cheapest server had 10G of disk space, so I had visions of only running it for an hour a day to store backups on. However, this doesn't work: once you create a server you pay for every hour of its existence, even if it has been 'powered down'.
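As a rough check on that billing model, here is a minimal sketch of the arithmetic. The 1.5 cent hourly rate is the one quoted above; the 30-day month and the function name are my own assumptions, and bandwidth charges are ignored.

```python
# Hourly billing sketch: you pay for every hour the server exists,
# whether or not it is powered up. Bandwidth is not included.
RATE_CENTS_PER_HOUR = 1.5  # cheapest server, as quoted in the post


def server_cost_cents(hours_in_existence, rate=RATE_CENTS_PER_HOUR):
    """Cost in US cents for a server that has existed this many hours."""
    return hours_in_existence * rate


hoped_for = server_cost_cents(30)        # one hour a day, if only run time counted
actual = server_cost_cents(24 * 30)      # what a month of existence is billed as
print(hoped_for, actual)
```

Under these assumptions the hour-a-day backup box is billed exactly like a server that never sleeps.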

Rackspace are building an API and theoretically one could script the creation of a machine every day to run a backup and store it to Rackspace's Cloud Files, but I don't see any advantage in this compared to pushing backups directly to Amazon S3 (apart from Rackspace's Cloud Files being slightly cheaper).

The API is lacking bindings, and in their GitHub account under the cloud server API all it says is

Nothing to see here yet, please move on.

This was good advice and I took it. My account lasted less than two hours and cost me three cents :(

Until the API is there this is just a cheap VPS vendor for low-bandwidth sites. It may be interesting in future but not yet.

Oh and their website is slow. Not something I would expect from vendors of technology that is all about dynamically scaling.


Filed under: hosting


I was finding Rails development on my Site5 account slightly problematic. There were these issues:

  • because of the way the shared hosting was set up you generally needed to trawl the Site5 forums to find the special Site5-specific hacks to do what you wanted. For example, many seem to have tried and failed to install Trac, so ultimately one has to compromise over what one can do.
  • sometimes it was very slow: I'd find it sluggish and, checking the load average on the server, it would be in double figures, with over 300 processes fighting for time and certain people hogging the CPU.

I now have a number of web projects on the go (five including this blog) and I fancied going back to dedicated hosting for the total control it would give me. However, there was a problem here: dedicated hosting is expensive and I didn't fancy shelling out that much. After much agonising I decided to return to using a Virtual Private Server (VPS). I chose to go with VPSLink as their prices were reasonable and it seems suited to what I want. They seem to specialise in VPSs and their site has a blog/wiki/forum, which gives them a face compared to oneandone, GoDaddy or any other big name that only provides VPSs to cover the market. I now share a server box with fifteen other guys, each with 512 MB of memory to call our own. Each of us has a 'virtual' server with root access and hence total control over what we do.

On my virtual server I chose to install a preconfigured Ubuntu 7 Linux/Ruby on Rails/Lighttpd setup. It is very nice: logging into it with ssh, it is hard to tell that the server is shared and is running on a different continent. My Rails applications run nice and snappy (once they have done their caching).

I haven't used lighttpd before, although I was aware of it as a new, clean Apache wannabe. While Apache is getting bloaty and its configuration files have to me always been obtuse, requiring endless try-this-and-see-what-happens attrition, lighttpd's configuration file was immediately crystal clear: here's the URL, there's the directory that serves it. Nuff said. The config file in my installation had a commented-out Rails setup which worked on my first tweak.

The only problem I have had thus far with lighttpd is that

/etc/init.d/lighttpd stop

doesn't stop all the lighttpd processes and I have to kill the last one explicitly. I haven't looked into a cause and could even script my way around it if I needed to reboot the server that often.
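Scripting around it could look something like the sketch below. It assumes a Linux `ps` that accepts `-eo pid,comm` and that any leftover process is named exactly `lighttpd`; the function names are mine, and I have not tracked down why the init script misses a process in the first place.

```python
import os
import signal
import subprocess


def lingering_pids(ps_output, name="lighttpd"):
    """Pick out PIDs whose command name is exactly `name`.

    ps_output is the text printed by `ps -eo pid,comm`.
    """
    pids = []
    for line in ps_output.splitlines()[1:]:  # first line is the header
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[1].strip() == name:
            pids.append(int(fields[0]))
    return pids


def stop_lighttpd():
    """Run the init script, then SIGTERM anything it left behind."""
    subprocess.call(["/etc/init.d/lighttpd", "stop"])
    ps_output = subprocess.check_output(["ps", "-eo", "pid,comm"]).decode()
    for pid in lingering_pids(ps_output):
        try:
            os.kill(pid, signal.SIGTERM)
        except OSError:
            pass  # the process exited on its own in the meantime
```

Calling `stop_lighttpd()` as root would replace the manual kill; the parsing is kept in its own function so it can be checked without touching a live server.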

One possible problem with VPSLink is that their systems are set up such that there is effectively no swap space: if the processes on your VPS use all the allocated memory then tough, they crash. Hence I chose a plan with hopefully sufficient memory (512M) and a light web server. I'm also hoping I don't get too many fcgi processes spawning, but I'm not going to lose sleep over that. Should any of my sites get so successful that the server is continually crashing, I'll just upgrade to a dedicated server. (I think I just invited everyone to a DDoS party.)

At the time of writing this blog is still on Site5. If I ever get around to porting it to Mephisto then I may move it to the new server. The performance of Drupal on Site5 is still acceptable.


Filed under: hosting site5 vps vpslink

2 Comments

Articles related to hosting a web site.


Filed under: hosting


Got a Linode and I've installed Ubuntu on it. So far it seems really quick: ssh login is fast and responsive, better than Site5. This may be because the server does not have many users yet or maybe because it's 8:30 on a Sunday morning.

It is just like my Ubuntu box at work but 10x faster.

I'm paying monthly for the linode, just trying it for now. If I'm still using it in August I may ditch site5 as linode is far cooler.

Damn, have to go out :(


5 Comments

Linode looks interesting. It is a hosting service whereby you get a virtual linux box all of your own. It is on a server and it is shared with 40 other people but you get 64M of ram to yourself, 3G of disk space that you partition yourself as you see fit and a selection of linux distributions to choose from. You install linux, have root access and basically can install whatever software you like on it (even painful gentoo compilation). It is like having your own linux box out there on the web.

It can be used not just as a web server but for ftp, mail, proxy, DNS, backups, you name it. It sounds more interesting than Site5, which gives plenty of power except that there is no root access, you cannot use wget to get packages, and there is no compiler, a two-year-old version of Python, no fastcgi or mod_python (just slow cgi), etc. Linode costs $20 or £10.50 a month, which is more expensive, and the support would not extend to patching your kernel like Site5 would do. Then again, unlike a shared host, if some other tosser you share with uses up all the MySQL connections with a flaky script, your account does not suffer (this happened for the second time to my knowledge this Sunday).

I'd be tempted to go with this rather than renew my site5 deal.


2 Comments

This python code generates syntax highlighted python code in html format. I know about SilverCity but I want this for my Site5 account where I cannot install executable code. The code below was highlighted using the code itself: spooky.

It is a simplistic solution but it should not be confused by multiline strings, comment characters in strings etc. I started off by trying to use the ply python lex as a tokeniser and processing the tokens but that persisted in confusing multiline string characters with normal strings and while thinking about it I realised that I could live without it. I don't know how slow this is: if using it on a website with heavy traffic you will want to cache the output.
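The caching mentioned above might look something like this minimal sketch. It is an in-memory cache keyed on a hash of the source text; `highlight` stands for whatever function turns source into HTML, and both names are hypothetical rather than part of the code that follows.

```python
import hashlib

_cache = {}  # maps a hash of the source text to its highlighted HTML


def cached_highlight(source, highlight):
    """Return highlight(source), computing it only once per distinct source."""
    key = hashlib.sha1(source.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = highlight(source)
    return _cache[key]
```

On a shared host where each CGI request is a new process, the dictionary would have to be swapped for files on disk, but the idea is the same.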

#
# Syntax Highlighting
#

import re
import cgi

# Regular expression rules for simple tokens
strStyles = (
    ('PUNC', re.compile( r'<<|>>|<=|>=|!=|==|[-+*|^~/%=<>\[\]{}(),.:]'), None),
    ('NUMBER', re.compile( r'0x[0-9a-fA-F]+|[+-]?\d+(\.\d+)?([eE][+-]?\d+)?|\d+'),
                            'color: red'),
    # \b stops keywords matching the start of longer names (e.g. 'for' in 'format')
    ('KEYWORD', re.compile( r'\b(?:def|class|break|continue|del|exec|finally|pass|' +
                            r'print|raise|return|try|except|global|assert|lambda|' +
                            r'yield|for|while|if|elif|else|and|in|is|not|or|import|' +
                            r'from|True|False)\b'), 'font-weight: bold'),
    ('MULTILINE', re.compile( r'r?u?(\'\'\'|""")'), 'color: darkred'),
    ('STRING', re.compile( r'r?u?\'(.*?)(?<!\\)\'|"(.*?)(?<!\\)"'), 'color: red'),
    ('IDENTIFIER', re.compile( r'[a-zA-Z_][a-zA-Z0-9_]*'), None),
    ('COMMENT', re.compile( r'\#.*\r?\n'), 'color: green; font-style: italic'),
    ('WHITESPACE', re.compile( r'[ \t\r\n]+'), None),

# if all else fails...
    ('UNKNOWN', re.compile( r'.'), None)
)

class Highlight:
    """
    Syntax highlight some python code.
    """
    def __init__( self):
        self.strOutput = []
        self.strSpanStyle = None

    def Highlight( self, data):
        """
        Syntax highlight some python code.
        Returns html version of code.
        """

        i = 0
        strMultiline = ''

        #
        # While input is not exhausted...
        #
        while i < len(data):
            #
            # Compare current position with all possible display types.
            #
            for strTok, oRE, strStyle in strStyles:
                oMatch = oRE.match( data, i)
                if oMatch:
                    #
                    # Input matches this type.
                    #
                    strValue = cgi.escape( oMatch.group())
                    if strTok == 'MULTILINE':
                        #
                        # Multiline string token
                        #
                        if strMultiline == '':
                            #
                            # If not inside a multiline string then start one now.
                            #
                            self.ChangeStyle( strStyle)
                            self.strOutput.append( strValue)
                            #
                            # Remember you are in a string and remember how it was
                            # started (""" vs ''')
                            #
                            strMultiline = oMatch.group(1)
                        else:
                            #
                            # Multiline Token found within a multiline string
                            #
                            if oMatch.group() == strMultiline:
                                #
                                # Token is end of multiline so stop here.
                                #
                                self.strOutput.append( strMultiline)
                                strMultiline = ''

                            else:
                                #
                                # Not the same multiline token as started so just output it
                                #
                                self.strOutput.append( strValue)
                    else:
                        #
                        # Other token, not multiline
                        #
                        if strMultiline != '':
                            #
                            # In multiline mode so output the raw text of the token
                            #
                            self.strOutput.append( strValue)
                        else:
                            #
                            # Not in multiline mode so change display style as appropriate
                            # and output the text.
                            #
                            self.ChangeStyle( strStyle)
                            self.strOutput.append( strValue)
                    i += len( oMatch.group())
                    break
            else:
                #
                # Token not found so dump out raw text. This doesn't have to be bullet proof.
                #
                self.ChangeStyle( None)
                self.strOutput.append( data[i])
                i += 1

        #
        # Terminate any styles in use.
        #
        self.ChangeStyle( None)

        return "".join( self.strOutput)

    def ChangeStyle( self, strStyle):
        """
        Generate output to change from existing style to another style only.
        """

        #
        # Output minimal formatting code: only output anything if the style has
        # actually changed.
        #
        if self.strSpanStyle != strStyle:
            if self.strSpanStyle != None:
                self.strOutput.append( '</span>')
            if strStyle != None:
                self.strOutput.append( '<span style="%s">' % strStyle)
            self.strSpanStyle = strStyle

Used like this:

import sys
data = open( sys.argv[0]).read()
strHighlighted = Highlight().Highlight( data)

print """<html>

<head>
<title>It works</title>
</head>
<body>
<pre>
%s
</pre>
</body>

</html>
""" % strHighlighted

Filed under: hosting php python site5

3 Comments

Those nice Site5 people appear to have increased the storage on my hosting account from 1.5G to 3G without telling me or charging me.

This is the same as their latest hosting packages.

How very nice of them.


Filed under: hosting php site5

4 Comments

How to get Python CGI running on a Site5 hosting account:

  • Example code, stored in a file in the ~/www/cgi-bin directory:
    #!/usr/bin/python
    # CGI test
    #
    import cgi
    import cgitb; cgitb.enable()

    strUser = 'Peter'

    #
    # Template of html output
    #
    strHtml = """Content-Type: text/html\n

    <html>
    <head>
        <title>
            This is so cool
        </title>
    </head>
    <body>
    <h1>
        Careful with that axe Eugene.
    </h1>
    Hello %s
    </body>
    </html>
    """ % strUser

    #
    # Send the headers, blank line and page to the web server
    #
    print(strHtml)
  • chmod the file 700
    chmod 700 FileName
    

The file name does not need the .py extension as it is running as a straight executable.
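What the web server expects back from a CGI script is just text on standard output: header lines, a blank line, then the page. Here is a minimal sketch of that shape, written for Python 3 (the script above is for the older Python on the host) and using a hypothetical `cgi_response` helper that also escapes the user name.

```python
import html


def cgi_response(user):
    """Build a complete CGI response: headers, a blank line, then the body."""
    body = "<html><body>Hello %s</body></html>" % html.escape(user)
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    print(cgi_response("Peter"))
```

Forgetting the blank line after the header is the usual cause of a 500 error from an otherwise working script.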


Filed under: hosting php python site5


Putty is a simply great ssh client and works nicely with open-ssh, which is found in Ubuntu Linux, Site5 and just about everywhere.

A nice feature of ssh is the ability to generate a key pair whose public half can be used to log into a server without having to give a password, or as extra security in addition to the password.

Here is a procedure for creating ssh keys that can be used in both open-ssh and putty:

  • On windows, install the open-ssh package with Cygwin
  • execute the command
    ssh-keygen -t ssh-dss
    
    to generate the dss key. You may need to create the directory ~/.ssh in Cygwin bash for this to work. This will create a file in this directory called id_dsa.pub
  • use sftp/ssh to copy the id_dsa.pub file to your ssh server box. Put the contents of this file (which is one big long line) at the end of a file called ~/.ssh/authorized_keys2, adding it to any other keys that are already there.
  • back on windows, execute the command 'puttygen', from the putty site.
  • In putty gen, use file/load private key to load in the file ~/.ssh/id_dsa
  • Choose 'save private key' and store it somewhere handy where putty can find it. You may be prompted to enter a passphrase. This is a password used in addition to the key when connecting to the server. If the passphrase is blank then you don't have to enter it, the connection will be automatic.
  • Open putty and enter the details of the server you want to connect to (address etc)
  • In the 'connection' settings, enter your login name in 'Auto-login username'.
  • In Connection/SSH/Auth, in the box 'Private key file for authentication' load the putty private key file.
  • Save this configuration so you don't have to do it again.
  • Click 'open'

Your life won't be the same again.
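The step of adding the public key to the keys file on the server is easy to get wrong by hand (a stray line break in the middle of the key, or the same key pasted twice). A minimal sketch of doing it with a script, assuming you run it on the server and pass in the path of the keys file yourself; the function name is hypothetical.

```python
import os


def add_public_key(pub_key_line, keys_path):
    """Append pub_key_line to keys_path unless it is already there.

    Returns True if the key was added, False if it was already present.
    """
    key = pub_key_line.strip()
    content = ""
    if os.path.exists(keys_path):
        with open(keys_path) as f:
            content = f.read()
    if key in [line.strip() for line in content.splitlines()]:
        return False
    with open(keys_path, "a") as f:
        # Make sure the new key starts on a line of its own.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(key + "\n")
    return True
```

Used as `add_public_key(open("id_dsa.pub").read(), keys_path)`, it leaves any existing keys untouched.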


2 Comments

Since I mentioned awstats on this blog I've been getting attempts to access the awstats.pl script on this site. awstats.pl is not accessible through this domain: it is provided by Site5, but I have to log in to netadmin to get to the stats.

Anyway, I had a quick search to see if there was a way to hack in via awstats and sure enough there is. The trick mentioned in this article is the one they are trying to get in with:

200.223.55.134 - - [11/Feb/2005:14:44:54 -0500] "GET
/stats/cgi-bin/awstats.pl?configdir=|echo%20;echo%20;id;echo%20;echo%20|
HTTP/1.0" 404 6186 "-" "Mozilla/4.0 (compatible; MSIE 6.0b; Windows NT 5.0)"

this is trying to execute the command id which shows the uid, gid and groups of the account it runs in. I guess this is probing for this vulnerability and seeing whether it gives root access.

The break-in attempts are coming from a variety of IPs. As usual they are using proxies, so there is no point trying to block them. They are getting 403s anyway, so they aren't consuming much bandwidth.

Moral: keep an eye on your access logs, see what folk are up to.
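Spotting this particular probe in an access log can be sketched as below. The pattern only looks for a pipe character (raw or URL-encoded) straight after `configdir=`, which is what the request above does; the function name is hypothetical and other variants of the attack would need their own patterns.

```python
import re

# A "|" (or its URL-encoded form %7C) immediately after configdir=
PROBE_RE = re.compile(r'awstats\.pl\?[^" ]*configdir=(\||%7[cC])')


def is_awstats_probe(log_line):
    """True if the access log line looks like the configdir pipe probe."""
    return PROBE_RE.search(log_line) is not None
```

Running every line of the log through `is_awstats_probe` gives a quick count of how often they are trying.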


1 Comment

Since Gmail have given me 100 invites I have decided to give my site its own Gmail account: [image: images/mail.png]. I don't quite have enough faith in Gmail's spam filters to put the raw email address here yet. prattboy@gmail.com would agree.

I realised today that I can just set up the auto-forwarding in gmail to forward this email to my main gmail account so I don't have to go through the tedious process of logging out of one gmail account and logging into another.

I haven't tried Gmail's new POP service yet. I only see that as a way to create my own email archive. Gmail's user interface is good enough; its main shortcoming for me is not being able to simply paste pictures into emails: you have to mess around attaching them.

My Site5 account gives me unlimited email accounts or something but this is a simpler option. If I do decide to put up the raw mailto then it's gmail that will have to handle the spam.

If anyone reading this wants a gmail invite then just ask. I think they are so common these days that I doubt I'll get any takers.

A nod to this site for the email icon generator.


Filed under: email gmail hosting php site5

2 Comments

I never did get around to trying to install awstats. I've been using Statcounter but I fancied trying awstats with reverse DNS turned on. I can't do this on my Site5 host as they don't like reverse DNS. I didn't install it on Gentoo as that looked like big time hassle.

I realised today that installing awstats under Ubuntu should be as simple as installing the awstats package and it almost is. I can install it on my home server, download my Site5 access logs there and let awstats format them up.

Here are the steps I had to take to install it:

  • Install awstats package
  • Edit a file called /etc/awstats/awstats.hostname.conf where hostname is the hostname. Put something like this in it:
    LogFile="/var/log/apache/access.log"
    LogFormat=1
    DNSLookup=1
    DirData="/var/cache/awstats/"
    DirCgi="/cgi-bin"
    DirIcons="/icon"
    SiteDomain="hostname"
    AllowToUpdateStatsFromBrowser=1
    AllowFullYearView=3
    
  • Make a directory called /var/cache/awstats (the DirData setting above) and chmod it 777 so the web server can write to it
  • Copy icons to web directory:
    cp -r /usr/share/awstats/icon /var/www/icon
    
  • Run this to update databases:
    /usr/lib/cgi-bin/awstats.pl -config=hostname -update
    
  • In your web browser, go to the url:
    http://hostname/cgi-bin/awstats.pl?config=hostname
    
  • Study the stats in quiet awe
  • Edit crontab to update stats automatically every night:
    crontab -e
    0 1 * * * /usr/lib/cgi-bin/awstats.pl -config=hostname -update
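
Since the hostname appears in both the config file name and its contents, generating the file is less error prone than editing it by hand. A minimal sketch, using only the settings listed above; the function name and the default log path are assumptions, and it returns the text rather than writing to /etc so you can inspect it first.

```python
CONF_TEMPLATE = """LogFile="%(logfile)s"
LogFormat=1
DNSLookup=1
DirData="/var/cache/awstats/"
DirCgi="/cgi-bin"
DirIcons="/icon"
SiteDomain="%(hostname)s"
AllowToUpdateStatsFromBrowser=1
AllowFullYearView=3
"""


def awstats_conf(hostname, logfile="/var/log/apache/access.log"):
    """Return (path, text) for the per-host awstats config file."""
    path = "/etc/awstats/awstats.%s.conf" % hostname
    text = CONF_TEMPLATE % {"hostname": hostname, "logfile": logfile}
    return path, text
```

Pointing `logfile` at a downloaded Site5 log is the only change needed to process someone else's logs on the home server.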
    

5 Comments

Bisiand.me.uk is mine again! 123-reg got around to reading the fax I sent on Saturday and today I've been able to set up the nameservers to point to Site5.

Leave things to brew for a day or two and my old bisiand.me.uk visitors will be back to join the new crazy frog crowd.

This has made me happy.


Filed under: hosting php site5

1 Comment

I've been keeping an eye on my visitor logs to see how much my domain name problems have affected my traffic. According to Statcounter the numbers had been climbing, but yesterday there was a sudden dip. The Awstats logs provided by Site5 show no such dip.

I've seen a number of such dips in the Statcounter logs: their servers do not appear to be the most reliable. This is not a big complaint, I use them for free, more of a lamentation. Their professional service is too expensive for my simple ego brushing needs, $9 a month, but if I was paying that I would not want drop-outs approximately once a week.

The main advantage of Statcounter for me is that it counts visitors who have javascript enabled so it is essentially counting human beings rather than crawlers and referrer spam bots. It is also easy to set it up to ignore my own IP address. The Drupal statistics module does not have this feature but I could simply use phpmyadmin or another generic mysql database report generation tool to filter the drupal logs in any way I desire. The statistics module does list external referrers in reverse chronological order so it is useful for updating .htaccess referrer exclusion lists.


1 Comment

Whatever email address you use to administer your domain name, don't base it on the domain name! If your domain name registrars drop you it means an extra day or two and a fax to confirm your new email address before you can sort things out :(

Now, how to send a fax in the year 2005? Can I upload a scan anywhere?

I need to mention oneandone here. I've mentioned 1&1 and one and one, but I have to make sure Google finds all its spellings and lets it be known that 1&1 caused me hassle and expense.


Filed under: google hosting


The world is waiting for my thoughts on the nofollow tag, the new Google thing that lets it be told when links should not be considered in page rank.

The idea is that links in comments can be tagged as unreliable so google will not consider them in pagerank calculations. Eventually comment spammers will give up putting links in comments and we webmasters won't have to spend our time on captchas, spam filters etc.
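In practice the tag is a `rel="nofollow"` attribute on the link itself. A minimal sketch of how blog software might render a link from a comment, with a hypothetical `comment_link` helper:

```python
import html


def comment_link(url, text):
    """Render a link from an untrusted comment, marked rel="nofollow"."""
    return '<a href="%s" rel="nofollow">%s</a>' % (
        html.escape(url, quote=True), html.escape(text))
```

Links written by the site owner would simply be rendered without the attribute.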

I don't like it much, it is a long term strategy that relies on ALL webmasters that allow anonymous or unverified posting to implement it and it stops honest linking in comments.

Commenters on my site are welcome to plug their own site, any spam that gets past my captcha gets deleted from the approval queue anyway.

No, I don't have a better solution to this problem.

The only good use I have found for this new tag is from Scoble, which is to use it as a way of linking when you don't want to boost the target's page rank.

I can use this.

Here goes, one and one aka 1&1 have caused me big problems and expense and I urge you to go elsewhere.

There, revenge is sweet, best served spitefully.


1 Comment

I use 1&1 (geddit?) for my domain name registration. Last week they sent me an invoice for renewal of a domain that I no longer use. I emailed them back asking to cancel it and after a week they replied to say that I have to do it in writing. They have already debited my credit card for another year so I am stuck with it.

I may have to pre-emptively cancel all my services with them, except maybe bisiand.me.uk, as I don't like this way of doing business (debiting with no warning, delaying feedback, making cancellation awkward). They emailed me this morning about their affiliate program. I don't intend to recommend them to anyone.


Filed under: hosting

2 Comments

My Site5 hosting provider gives me access to Awstats and it's interesting to see what people have been looking at on this site. So far we are half way through 4th November and I've already had 74 hits from Google searches. The most interesting search phrase has to be 'raw nose', which turned up this old post from earlier blogging days when I wasn't so tech focused and more chit-chatty. I hope the posting answered their question.

This post has proved the most popular.

I am going to install Awstats locally so I can enable the reverse DNS functions and start watching the watchers.


Filed under: awstats google hosting site5


My Site5 hosting service allows me to download access logs, which I find endlessly fascinating. The netadmin administration tool offers AwStats, which shows incredibly detailed statistics, but it is slightly skewed by showing my own access.

So I wrote a python script to parse the log and dump out anything interesting. It filters out IP addresses I am likely to connect from. This is crude in that I have hard wired the log file name. Note that the log file I download is gzipped but that is no problem for python.

This dumps out:

  • suspicious looking attempts to hack in (extremely long strings etc)
  • a list of various user agents and the IP addresses they are coming from
  • a list of referrer strings

Things I find interesting in the dumps:

  • There are 171 different types of user agents listed. Most claim to be mozilla type browsers which is probably rarely true but even so, there are a lot of things crawling around out there. Someone out there is using lynx. Hi there.
  • I get at least one known spam email address harvester visiting (DTS Agent). Be warned. This particular one does not really bother to hide itself.
  • Referrers from drupal.org seem to arrive from random pages on that site. I think folk are browsing around, see something from me in the 'Drupal Talk' block and come here for a read. Drupal generates a misleading referrer string.
  • The referrer strings from google give the search terms. I get a number of people looking for r-s-y-n-c w-i-n-2-k (obscured to hide from google) and when I do that search this post comes in at #7 with its enticing title. Moral: give postings enticing titles.
  • Yahoo Slurp crawls the site about as much as google but gave me one referral compared to 81 from google.

These statistics are for a 7 day period.

import gzip
import re

#
# Open log file. Crude but effective. Reads directly from gzipped log file.
#
oFile = gzip.GzipFile( 'C:\\Tmp\\accesslog-bisiand.me.uk-9-28-2004.gz')

def Sorted( oArray):
    "Return sorted array"
    oTmp = oArray[:]
    oTmp.sort()
    return oTmp

#
# Scan through the log file.
# Use regular expression to split the entries up.
#
# Pattern is thus:
#
# 56.98.204.40 - - [09/Sep/2004:03:50:01 -0400] "GET / HTTP/1.0" 200 643 "-" "
#Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7) Gecko/20040803 Firefox/0.9.3"
#
oRE = re.compile( r'(\d+\.\d+\.\d+\.\d+).*(\[.*\])\s+"(GET|POST|HEAD|SEARCH|PUT)\s+([^"]+)'
                  r'"\s+([\d-]+)\s+([\d-]+)\s+"([^"]+)"\s+"([^"]+)"')

#
# Here build map of IP addresses to the log file entries.
#
oHits = {}

#
# Build map of unique referrers and how many folk they sent my way.
#
oReferrers = {}

#
# Go through file.
#
for strLine in oFile.readlines():
    print strLine[:-1]
    oMatch = oRE.search( strLine)
    if oMatch:
        #
        # These things seem to be used by hackers trying  to break in.
        #
        if oMatch.group(3) in ("PUT", "SEARCH"):
            print strLine
            continue

        #
        # Get the IP address.
        #
        strIP = oMatch.group(1)

        #
        # Ignore the entry if it is me.
        #
        if strIP in ('76.54.32.10', '12.3.45.67'):
            continue

        #
        # Get interesting fields from log file.
        #
        strAccess = oMatch.group( 3) + oMatch.group(4)
        strReferrer = oMatch.group(7)
        strAgent = oMatch.group( 8 )

        #
        # Build up hit map.
        #
        if oHits.has_key( strIP):
            oHits[strIP].append( (strAccess, strReferrer, strAgent))
        else:
            oHits[strIP] = [( strAccess, strReferrer, strAgent)]

        #
        # Build up referrer map.
        #
        if oReferrers.has_key( strReferrer):
            oReferrers[strReferrer] += 1
        else:
            oReferrers[strReferrer] = 1
    else:
        #
        # Did not match the regular expression. Just dump the line.
        #
        print "Miss:" + strLine

#
# Determine which user agents originate from which IP.
#
strAgents = {}

for strIP in Sorted(oHits.keys()):
    oHit = oHits[strIP]
    if strAgents.has_key(oHit[0][2]):
        strAgents[oHit[0][2]].append( strIP)
    else:
        strAgents[oHit[0][2]] = [strIP]

#
# Display the unique User Agents and the IPs using them.
# This shows things like googlebot.
#
for strAgent in Sorted( strAgents.keys()):
    strIPs = strAgents[strAgent]
    print strAgent
    for strIP in strIPs:
        print "   %s %d" % (strIP.ljust( 15), len( oHits[strIP]))

#
# How did they get here? Show the referred name.
#
for strReferrer in Sorted( oReferrers.keys()):
    if strReferrer.find( '209.59.159.21') >= 0:
        continue
    if strReferrer.find( 'bisiand.me.uk') >= 0:
        continue
    if len(strReferrer) < 60:
        print strReferrer.ljust( 60) + str(oReferrers[strReferrer])
    else:
        print strReferrer + "\n" + (' ' * 60) + str(oReferrers[strReferrer])
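
Pulling the search terms out of the Google referrers can be sketched separately. This is written for Python 3 (the script above is Python 2), assumes the phrase is carried in the `q` query parameter, and uses a hypothetical function name.

```python
from urllib.parse import urlparse, parse_qs


def google_search_terms(referrer):
    """Return the search phrase from a Google referrer URL, or None."""
    parts = urlparse(referrer)
    if "google." not in parts.netloc:
        return None
    terms = parse_qs(parts.query).get("q")
    return terms[0] if terms else None
```

Counting the returned phrases in a dictionary, the same way the script counts referrers, would give a league table of what people were searching for.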


My new Site5 blog is up and running. It still needs theming but I've uploaded all my old blog postings and they are now all searchable. Google is finding the rss.xml file to do whatever it wants, but Google still lists a broken old version of bisiand.me.uk: I await a proper googlebot scan. Drupal is pinging http://blo.gs but I'm not sure if http://weblogs.com is functional as I can't find a search function on it (what is its point anyway? I'm only pinging it because it is hardwired into Drupal).

The Site5 netadmin thing has a search engine submission function to submit one's site to various engines. I tried it and got an assortment of 'Site not found' errors; only Google appeared to work.

If I appear obsessed with page ranking it is merely because I want folk to find this site and maybe get something from it. Otherwise why should I bother?

I've avoided putting one of those archive calendar things on this site. This is for a few reasons:

  • I think they shout out 'boilerplate blog'.
  • If I cannot remember the date I made a particular post, who else is going to be interested in what I was writing on that date?
  • If I want to find an old post I will search for it.
  • They are as naff as hit counters and dancing hamsters.

ToDo:

  • Sort out a new RSS aggregator, preferably web based so I can use it from home laptop, home desktop and work desktop.
  • Turn off home server. Site5 should make it redundant and switching it off will save me about £3 a month in electricity which almost pays for the Site5 hosting.


I've spent a good few hours trying to set up my email on my new Site5 account. I wanted to set it up as follows:

  • Mails to my various target addresses (personal, public and wife) all arrive in a single account in different folders. This way I only have to poll a single account. This account is also my backup email archive.
  • All mail (except maybe spam checked mail) is forwarded to gmail to use as my email client. Site5 gives access to three web based email packages (Squirrelmail, Neomail, Horde) but they are not as good as gmail.

This is how my work email is set up and also how my former home email was set up. Both use procmail to do filtering and forwarding and are working sweetly.

So how to do this in the more restricted world of Site5?

On the Site5 site this is the main clue on how to set things up. This describes a technique of editing a file called /etc/valiases/ to get the email system to forward emails to procmail. procmail is then used to filter mails to the appropriate target folders, run spamassassin etc.

I tried using the prescribed setup and encountered two problems:

  • forwarding to gmail from the procmailrc did not work. According to the procmail log it did try to use sendmail to send the message but the messages never arrived.
  • I ended up with two copies of each mail, one that had been through procmail and another that did not (it had no spamassassin headers).

I did a Google search on the email forwarding facility in cPanel (a relative of netadmin, which Site5 use) and worked out that it is not necessary to edit valiases directly: the file can be edited via the netadmin email forwarding section. For each of my required email addresses I added two forwarding entries:

  1. A forward to my gmail account
  2. A forward to procmail (e.g. "|/usr/bin/procmail /home//.procmailrc")

The .procmailrc is set up in a conventional enough way, e.g.:

## basic .procmailrc
SHELL=/bin/sh
LOGFILE=/home/blah/procmail_log   ## replace blah with your actual account/domain!
VERBOSE=yes    ## handy for debugging; set to no once things work
MAILDIR=/home/blah/mail/blah.com
DEFAULT=messages/inbox

# Only messages smaller than 100000 bytes go through SpamAssassin. Most spam
# isn't bigger than a few k and working with big messages can bring
# SpamAssassin to its knees.
#
# The lock file ensures that only 1 spamassassin invocation happens
# at 1 time, to keep the load down.
#
:0fw: spamassassin.lock
* < 100000
| spamassassin

# Mails with a score of 15 or higher are almost certainly spam (with 0.05%
# false positives according to rules/STATISTICS.txt). Let's put them in a
# different mbox. (This one is optional.)
:0:
* ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
messages/spamdefinitely

# All mail tagged as spam (e.g. with a score higher than the set threshold)
# is moved to "spampossibly".
:0:
* ^X-Spam-Status: Yes
messages/spampossibly

#
# Sort into recipients
#
:0:
* ^TO_someone@blah\.com
messages/someone

:0:
* ^TO_someoneelse@blah\.com
messages/someoneelse

# Everything else goes to the default mailbox.
:0:
$DEFAULT

All emails go into an account called 'messages' which has folders for each recipient and also folders for messages that might be spam or are definitely spam.
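To make the order of the recipes easier to follow, here is a rough Python model of the sorting decisions. It is only a sketch of the logic, not how procmail works internally: the function name choose_folder is made up for illustration, the addresses are the same placeholders as in the .procmailrc, and checking only the To and Cc headers is a simplification of procmail's ^TO_ macro.

```python
import re
from email import message_from_string

# Folder names come from the .procmailrc above; the addresses are the
# same placeholders used there.
RECIPIENT_FOLDERS = {
    "someone@blah.com": "messages/someone",
    "someoneelse@blah.com": "messages/someoneelse",
}
DEFAULT_FOLDER = "messages/inbox"


def choose_folder(raw_message):
    """Return the folder the recipes would pick, checked in file order."""
    msg = message_from_string(raw_message)

    # First recipe: fifteen or more stars in X-Spam-Level.
    if re.match(r"\*{15}", msg.get("X-Spam-Level", "")):
        return "messages/spamdefinitely"

    # Second recipe: anything SpamAssassin tagged as spam.
    if msg.get("X-Spam-Status", "").startswith("Yes"):
        return "messages/spampossibly"

    # Recipient recipes. Assumption: To and Cc are enough for this sketch;
    # procmail's ^TO_ macro looks at more headers than that.
    recipients = " ".join(msg.get_all("To", []) + msg.get_all("Cc", [])).lower()
    for address, folder in RECIPIENT_FOLDERS.items():
        if address in recipients:
            return folder

    # Nothing matched, so the message falls through to the default mailbox.
    return DEFAULT_FOLDER
```

The point to notice is that the spam checks come first, so a spam message addressed to a known recipient still ends up in a spam folder rather than in that recipient's folder.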

This setup seems to be working. I had been getting two copies of each email because I had also created email accounts with the same names as the addresses I was forwarding to procmail. If the Site5 email system receives a message for fred@somedomain.com it looks for an email account called fred@somedomain.com and, if one exists, delivers the message there. It then looks for a forwarding entry for fred@somedomain.com and, if one exists, forwards the message as well. So one copy was delivered directly to the account and my .procmailrc delivered another. After I deleted the extra email accounts the duplicates stopped, and everything now arrives via procmail.
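The duplicate problem is easier to see as a toy model. This is a sketch based only on the behaviour described above, not on how the Site5 mail server is actually implemented; the function name deliver and the "mailbox:" and "forward:" labels are made up for illustration.

```python
def deliver(address, mailboxes, forwards):
    """List every destination that receives a copy of a message to address.

    mailboxes: set of addresses that have a real email account
    forwards:  dict mapping an address to its list of forwarding targets
    """
    copies = []
    # Step 1: an account with the same name gets a direct copy.
    if address in mailboxes:
        copies.append("mailbox:" + address)
    # Step 2: every forwarding entry for the address also gets a copy.
    for target in forwards.get(address, []):
        copies.append("forward:" + target)
    return copies


forwards = {"fred@somedomain.com": ["procmail"]}

# With an account of the same name, the message is stored twice:
# once directly and once via procmail.
print(deliver("fred@somedomain.com", {"fred@somedomain.com"}, forwards))

# With the account deleted, only the procmail copy remains.
print(deliver("fred@somedomain.com", set(), forwards))
```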

I'm not sure why procmail could not forward email to gmail but I suspect the Site5 account may be set up to restrict sending of emails to stop spammers using Site5 hosting.

I don't think it is necessary to have shell access to Site5 to do this. Editing the valiases file would apparently require shell access, but that is not necessary if you use NetAdmin (aka cPanel) to set up the forwarding. The .procmailrc file can just be uploaded using FTP. I did ask Site5 for shell access and got it within an hour. So far I am very pleased with Site5: fast servers and fast support.



Getting Drupal to run on Site5 was not entirely straightforward. I used the fantastico script thing to install it but I got 403 errors whenever I tried to access the site. The error told me that I could not access /index.php. This was resolved by putting the following in my .htaccess file:

Options ExecCGI

Then I was still getting a 403 error on the directory /. The error log said:

[Thu Sep  9 06:07:36 2004] [error] [client 80.88.204.40] Options FollowSymLinks or
SymLinksIfOwnerMatch is off which implies that RewriteRule directive is forbidden:
/home/bisiand/public_html/403.shtml

So I changed .htaccess as follows:

# Set some options
Options -Indexes
Options +ExecCGI
Options +FollowSymLinks
Options +SymLinksIfOwnerMatch

Drupal started working but I kept getting the following errors on each page:

warning: Cannot modify header information - headers already sent

This was because I had been editing the Drupal conf.php file using Site5's NetAdmin tool, and whenever I saved the file a blank line was added to the end. PHP was treating this as content to be output, so it had already been sent by the time Drupal tried to set its headers. I had to download the conf.php file, edit it in Vim to delete the blank lines and upload it again.
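The same clean-up can be scripted instead of done by hand in Vim. This is a minimal sketch, assuming conf.php has already been downloaded to the local machine; the function name strip_trailing_blank_lines is made up for illustration.

```python
from pathlib import Path


def strip_trailing_blank_lines(path):
    """Remove any whitespace after the last non-blank character of a file.

    Returns True if the file was changed, False if it was already clean.
    """
    file = Path(path)
    original = file.read_bytes()
    # rstrip() with no arguments drops trailing spaces, tabs and newlines.
    cleaned = original.rstrip()
    if cleaned != original:
        file.write_bytes(cleaned)
        return True
    return False
```

Calling strip_trailing_blank_lines("conf.php") on the downloaded copy and uploading the result should have the same effect as the manual edit, provided the editing tool does not add the blank line back on the next save.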

Trying to modify the Drupal theme, I then got an error from marvin.theme about no base class to inherit from. To fix this I had to move the directories themes/marvin and themes/unconed to the /tmp directory to hide them. There may be a fix to get them going but I don't really care.

After this, everything was fine.



Created a new hosting account with Site5. Impressed so far: 1.5G of storage, PHP, unlimited MySQL databases, auto-install of Drupal, fast server, all for $7 a month. Sorry if this sounds like an advert.

My previous blog was hosted on Python Community Server and written using Python Desktop Server. I have a couple of reasons for changing, especially to something I have to pay for (!):

  • My Python Desktop Server install was not the 'official' version; it was a Gentoo package installation and it suffered from a number of little problems that I didn't have time to fix.
  • I've used Drupal at work and on another site and I really like it. Even when I have to drop into the PHP code I am not totally lost; the code is easy to follow.
  • Drupal is better documented.
  • All my blog entries are stored in a standard database format, not something obscure (heard of Metakit?).
  • I think I can set things up so that I can moderate comments. I was getting comment spam and I hate it; it makes me feel violated.
  • Drupal data entry has a cool 'preview' button; with Python Desktop Server it just gets published, complete with formatting errors.
  • Drupal search facilities just work. I never did work out the hack required to add a search to my Python Community Server blog. The only way to find old posts was using the calendar thing, and that was buggy and did not link every day that had postings.

I could go on and on but the fact is, here I am.

ToDo list:

  • Create nice custom theme
  • Upload old blog


Ordered MS web hosting service from www.oneandone.co.uk. Seems to have all the latest whizz-bang stuff and I won't have to run a server 24/7.

It's 5.99 + VAT a month but:

  • 3 months are free

  • I'm safe from hackers

  • I'm picking up skills

  • Can use it for church


Filed under: hosting


  • 8:39: Bisi woke up feeling sick. I am fine, skipping gym as I am still sore from the review on Thursday.

  • 11:51: Washed & vacuumed car.

  • 12:09: Ok, first entry using a Palm Premium Screen Protector and first impressions are very good. Not sure that my entry speed is better: I think that looking at the stylus tip makes some improvement. So
    • how much?

    • how long will it last?

  • 15:55: Signed up for bisiand.me.uk on oneandone.co.uk. Corny name designed to be easy to remember. Set up email addresses for bisi and me (ha ha).

  • 18:00: Waited an hour (as advised) and domain still not active. No use for it for now but it's the principle. Hair cut.


Filed under: hosting palm