Peter's Blog

Redefining the Impossible

Items filed under subversion


I needed to put an ftp server on my slicehost slice so that someone could upload stuff to a site I was hosting. I'd rather avoid ftp as a potential security hole, but the alternative was to try to convert him to sftp, and in any case the E editor only supports ftp.

I settled on vsftpd as the ftp server, but it took ages to get this person's login to work. I had him chrooted to the directory he needed to be in, with his shell set to /bin/false to prevent him logging into a shell. When testing, I couldn't log in as him: all I got was a generic error 530, and the log file said only that login was denied (it didn't feel the need to say why). Of course my own login was fine.

The answer was in the vsftpd faq: it seems that vsftpd looks through a file called /etc/shells to see if the person connecting has a legitimate login shell, and /bin/false wasn't in there. The faq says this check can be disabled, but the incantation didn't work, so I had to add /bin/false to the shells file.
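
A minimal sketch of that fix, assuming a Debian-style /etc/shells with one shell path per line. The helper name ensure_shell_listed is made up for illustration; it appends the shell only if an identical line is not already there, so it is safe to run more than once. The vsftpd.conf man page notes that the check_shell option only affects non-PAM builds, which may be why switching it off had no effect here.

```shell
#!/bin/bash

# Hypothetical helper: add a shell to the shells file unless it is already listed.
# $1 = shell path, $2 = shells file (defaults to /etc/shells).
ensure_shell_listed() {
    local shell=$1 file=${2:-/etc/shells}
    # -x matches whole lines only, -F treats the path as a literal string.
    grep -qxF -- "$shell" "$file" || printf '%s\n' "$shell" >> "$file"
}
```

Run as root, the call would be: ensure_shell_listed /bin/false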

I don't quite understand the logic of this design. Isn't it fairly standard to have users who can ftp in but not log in? The /bin/false trick follows a precedent set by the noble ubuntu/debian distributions.

I'm getting into the habit now of adding any file I edit in /etc to subversion (as noted here), if only as a way to keep track of which ones I have fiddled with. The 95% that I don't need to touch are not in subversion. I like this: I can recall what I did and why (through subversion comments), which will help me restore the system or replicate it. That way, next time I need to install vsftpd I can recall which other obscure system files need tweaking.


Filed under: ftp linux subversion vsftpd

3 Comments

I'm so taken with my new SliceHost VPS that I've moved this blog over to it. For the first time in four years this blog is hosted on something other than Apache: Nginx. It's still on drupal but running under php5 for the first time (only needing this fix).

The DNS has only just propagated, so this is my first posting on the new host. To me it feels snappier when navigating around, a definite improvement.

I've been moving all my sites over to the new host; I've decided to put all my eggs in one basket. Well, not quite: I've also signed up for a minimal rsync.net account, giving me just over 3 GB of backup space. It will cost me circa £2.50 a month but will be worth it for the peace of mind.

I've put all my sites and also my most important /etc directories (such as /etc/nginx) into a subversion repository. I found a great subversion tip for checking /etc into subversion in the subversion faq:

How can I do an in-place 'import' (i.e. add a tree to Subversion such that the original data becomes a working copy directly)?

Suppose, for example, that you wanted to put some of /etc under version control inside your repository:

  1. svn mkdir file:///root/svn-repository/etc -m "Make a directory in the repository to correspond to /etc"
  2. cd /etc
  3. svn checkout file:///root/svn-repository/etc .
  4. svn add apache samba alsa X11
  5. svn commit -m "Initial version of my config files"

This takes advantage of a not-immediately-obvious feature of svn checkout: you can check out a directory from the repository directly into an existing directory. Here, we first make a new empty directory in the repository, and then check it out into /etc, transforming /etc into a working copy. Once that is done, you can use normal svn add commands to select files and subtrees to add to the repository.

There is an issue filed for enhancing svn import to be able to convert the imported tree to a working copy automatically; see issue 1328.

So all the juicy stuff is in subversion, and I then rsync the subversion repository over to rsync.net for the backup (nb, don't rsync a live svn repository while anyone else is modifying it!). This will allow me to roll back changes if need be (or, more likely, see what I've changed to break something) using the Slicehost subversion repositories.
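
One way to sidestep the live-repository problem is to take a snapshot with svnadmin hotcopy first and rsync the snapshot rather than the repository itself. A sketch under that assumption: backup_repo, the staging directory and the destination are all hypothetical names, not the setup described in this post.

```shell
#!/bin/bash

# Hypothetical helper: snapshot a repository, then mirror the snapshot.
# $1 = live repository, $2 = local staging directory, $3 = rsync destination.
backup_repo() {
    local repo=$1 staging=$2 dest=$3
    # Start from a clean staging directory; ${staging:?} aborts if it is empty.
    rm -rf "${staging:?}" || return 1
    # hotcopy makes a consistent copy even if someone commits meanwhile.
    svnadmin hotcopy "$repo" "$staging" || return 1
    # Mirror the snapshot; --delete removes files that no longer exist in it.
    rsync -a --delete "$staging/" "$dest"
}
```

A call might look like: backup_repo /root/svn-repository /root/svn-snapshot user@host:svn-backup (the remote address is a placeholder).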

If anything does go wrong with my Slicehost slice (not saying it will, but it might), I can rent a new VPS or dedicated server and have my stuff back up and running in less than a day (DNS propagation to the new server would be the delaying factor). It's not perfect redundancy but I'm not shafted.

Probably a more likely scenario is that my server gets pwned, in which case I would remaster it.

The rsync copy between Slicehost and rsync.net runs at 583850.15 bytes/sec. Pretty good.

Possible improvements:

  • Use Duplicity to back up the repositories, with a full+incremental scheme.
  • Put a web front end on the subversion repositories so I can browse them (although subclipse, http://subclipse.tigris.org/, is working nicely).

UPDATE: I've noticed this article getting hits from people looking for how to copy a subversion repository. The simple answer is to just copy the files: cp -rf, rsync -a, however you like copying things. If the repository is live (i.e. people are using it) or it will be running on a different version of subversion, then the answer is to use:

svnadmin dump path-to-repository > dump.dat
cp dump.dat {wherever}
cd {wherever}
svnadmin create path-to-new-repository
svnadmin load path-to-new-repository < dump.dat

Filed under: rsync slicehost subversion svn

2 Comments

I have noticed more and more projects using trac so I decided to install it and give it a try. It is really nice. It does the following:

  • provides a web interface to subversion source repositories. The web interface allows you to look at different revisions of files, do diffs between revisions, all good stuff. You cannot commit changes, update or do anything with local working copies of files but these are just a batch file away.
  • it provides a wiki so you can document your project however you like. The wiki markup supports links to files in subversion, change sets and the like so you have no excuses for not describing the grand picture anywhere.
  • it provides a bug tracking database which is like bugzilla but cleaner and simpler.

It was easy to set up as a debian package: just a matter of installing the trac package and running the trac-admin command to create a new trac project. Tell it where your subversion repository is and you are away.

The more I use subversion, the more I like it. Not having to check files out is really nice and cuts down on the hassle: just edit any file.

Commercial development requires more formal documentation than a wiki but I do feel there could be a role for informal documentation attached to the source code: useful documentation, not the stuff that is only there to keep the QA department happy.

Oh, did I mention trac was written in python?

If I could integrate an email archive and a development blog into trac, I would have the fount of all knowledge.


Filed under: python subversion trac


I've discovered the 'Alias' keyword in apache2 config files. This keyword allows a pretty free hand at redirecting urls to directories and files on the hard disk. Consider this extract from the site config file (in /etc/apache2/sites-available/intranet):

<VirtualHost *>
    ServerName intranet
    ServerAdmin webmaster@localhost

    DocumentRoot /var/www/intranet
    <Directory />
        Options FollowSymLinks
        AllowOverride All
    </Directory>
    <Directory /var/www/intranet>
        # pcw No directory listings
        # Options Indexes FollowSymLinks MultiViews
        Options -Indexes FollowSymLinks MultiViews
        AllowOverride All
        Order allow,deny
        allow from all
    </Directory>

    Alias /bugzilla "/var/www/bugzilla/"
    <Directory "/var/www/bugzilla/">
        Options ExecCGI -Indexes MultiViews FollowSymLinks
        AllowOverride None
        Order deny,allow
        Deny from all
        Allow from all
    </Directory>

This is telling apache that the site is called 'intranet' and is normally served up from the directory /var/www/intranet. However, there is a subdirectory called 'bugzilla' that is addressed as http://intranet/bugzilla but is served up from /var/www/bugzilla rather than /var/www/intranet/bugzilla.

Why would I want to do this? Because /var/www/intranet is a drupal setup stored in subversion and I don't want to put the bugzilla stuff in subversion or fiddle around telling subversion to ignore it. It keeps each feature of the domain cleanly separated.


Filed under: apache bugzilla subversion


Trying to run subversion on ubuntu, I kept getting the error:

svn: error: cannot set LC_ALL locale
svn: error: environment variable LANG is en_GB.UTF-8
svn: error: please check that your locale name is correct
svn: Connection closed unexpectedly

Googling seems to imply that this one is a bit of a mystery: svn doesn't like the LANG variable and is happier if it is not set. I found that LANG was being set in /etc/environment on my ubuntu box, and that this file didn't exist on my debian server, where LANG was not defined.

I commented it out and reconnected and joy ensued.

Running

sudo dpkg-reconfigure locales

does not break it again.

I did a google for LANG and found k.d.lang's website.

Moral: hack it out and see what breaks.
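
One common cause of this message is that the locale named in LANG has not been generated on the machine svn is running on; that is a guess about this case rather than a diagnosis. A quick check, as a sketch: locale_in_list is a hypothetical helper that reads a list of locale names on stdin, and the normalisation assumes glibc-style naming, where en_GB.UTF-8 shows up in locale -a as en_GB.utf8.

```shell
#!/bin/bash

# Lower-case a locale name and drop hyphens, so that
# "en_GB.UTF-8" and "en_GB.utf8" compare as equal.
normalise_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Hypothetical helper: succeed if the locale named in $1 appears in the
# list of locale names read from stdin (one per line).
locale_in_list() {
    local wanted
    wanted=$(normalise_locale "$1")
    tr '[:upper:]' '[:lower:]' | sed 's/-//g' | grep -qxF -- "$wanted"
}
```

On the machine in question it could be used as: locale -a | locale_in_list "$LANG" || echo "locale $LANG is not installed"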


Filed under: debian subversion ubuntu

5 Comments

I have developed my strategy for putting drupal database dumps into subversion every day. This was slightly more complicated than I thought it would be. In principle I could just use mysqldump to dump the sql and put that in subversion, but the problem is that mysqldump by default will output the insert statements to reconstruct a table in a single line in the file, even if that line grows to 600k long. If subversion does a line-by-line diff on the files it stores, it will end up writing all 600k to its transaction log.

To minimise the sizes of each diff I dump the mysql in two parts, the data and the structure. For the data dump I do the following:

  • use --skip-opt to ensure that each row is inserted on a separate line: this limits the line length. The --opt option is the default for my version of mysql (4.1.13) so I have to turn it off.
  • remove comments
  • strip out data for the 'accesslog', 'cache', 'search_index' and 'sessions' tables. I don't think these need backing up.
  • sort the remaining lines into alphabetical order as I am not sure the sql dump is guaranteed to always dump the rows in the same order

The structure is dumped straight; I certainly don't sort the lines in the file!

This creates sql data files that should diff pretty optimally.

Here is the bash script that does this for me:

#!/bin/bash

# Dump row data only, one INSERT per line, dropping tables not worth backing up.
function SqlDumpData {
    mysqldump -u secret -psecret --no-create-info --skip-opt --comments=0 "$1" | \
      grep -Ev "INSERT INTO \`(accesslog|cache|search_index|sessions)\`" | sort >"$2"
}

# Dump the table definitions only; these are left unsorted.
function SqlDumpStructure {
    mysqldump -u secret -psecret --no-data "$1" >"$2"
}

# Stop here rather than dump into the wrong directory.
cd /home/peterc/DatabaseDumps || exit 1

SqlDumpData petersblog petersblog_data.sql

SqlDumpStructure petersblog petersblog_structure.sql

svn commit -m "daily backup"

This is all done on the server. Following this I can use my standard strategy to back up the subversion repository.

I have tested that the sql dumps can be reimported into mysql and give a functioning website.
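
The filter-and-sort stage of the script can be tried on its own, without a database. A small sketch: filter_dump is a hypothetical name, and LC_ALL=C is an addition (it is not in the script above) that pins the sort order so it does not change with the machine's locale.

```shell
#!/bin/bash

# Hypothetical helper: read dump lines on stdin, drop INSERTs for the
# throwaway tables, and write the rest in a stable byte-wise order.
filter_dump() {
    grep -Ev 'INSERT INTO `(accesslog|cache|search_index|sessions)`' | LC_ALL=C sort
}
```

Piping a few hand-written INSERT lines through filter_dump twice should give identical output both times, which is what keeps the daily diffs small.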


Filed under: backup drupal mysql subversion

12 Comments

I have put my drupal stuff under subversion for some source control so I can unify the source of three separate installations. In theory I could use drupal's virtual hosting features and have a single set of drupal files but:

  • I want these installations to have separate files and images directories and drupal's virtual hosting does not support this
  • I want apache2 to control all my virtual hosting.
  • I want my own drupal hacks under source control

Some notes on how I did this:

  • Create temporary copies of files I want submitted to subversion:
    mkdir ~/drupal
    mkdir ~/drupal/trunk
    cp -r /var/www/petersblog.org/* /var/www/petersblog.org/.htaccess ~/drupal/trunk
    
  • remove stuff I don't want shared with other drupal installations:
    rm -r ~/drupal/trunk/files
    rm -r ~/drupal/trunk/images
    rm -r ~/drupal/trunk/sites
    rm -r ~/drupal/trunk/favicon.ico
    
  • Put into subversion:
    svn import ~/drupal file:///repository_name/drupal -m "First Import"
    
  • Get out of subversion again:
    mkdir /var/www/petersblog2.org
    cd /var/www/petersblog2.org
    svn checkout file:///repository_name/drupal/trunk .
    
  • Since that worked ok, the copies of the files I created to put into subversion can now be zapped:
    rm -r ~/drupal
    
    If there is one thing I do not like about subversion it is this 'trunk' directory thing. I know it is not a requirement but just a convention but still.
  • restore non-archived files
    cp -r /var/www/petersblog.org/files /var/www/petersblog2.org/files
    cp -r /var/www/petersblog.org/images /var/www/petersblog2.org/images
    cp -r /var/www/petersblog.org/sites /var/www/petersblog2.org/sites
    cp -r /var/www/petersblog.org/favicon.ico /var/www/petersblog2.org
    
  • set up subversion to ignore these files that I don't want under svn control. After this running svn status will not list these files and subversion will not attempt to add them to the archive:
    svn propedit svn:ignore .
    files
    images
    sites
    favicon.ico
    
  • commit changes, namely the changes to the svn:ignore properties
    svn commit -m "Ignore files I do not want shared"
    
  • make sure my new copy has everything:
    diff -r --exclude=.svn /var/www/petersblog.org /var/www/petersblog2.org
    
  • change to the new site archive
    cd ..
    mv petersblog.org petersblog_presvn.org
    mv petersblog2.org petersblog.org
    
  • list anything that has changed:
    cd /var/www/petersblog.org
    svn status
    
  • Get the code out on a new site:
    mkdir /var/www/newsite
    cd /var/www/newsite
    svn checkout file:///repository_name/drupal/trunk .
    
  • After modifying the archive, get latest code out of subversion:
    svn update
    
  • Put changes to code into subversion:
    svn commit -m "did some hacking"
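
svn propedit opens an editor, which is awkward to repeat on each new site. A non-interactive sketch using svn propset with -F, which reads the property value from a file. The helper name set_svn_ignores is made up for illustration, and the directory in the usage line is only an example.

```shell
#!/bin/bash

# Hypothetical helper: set svn:ignore on a working-copy directory.
# $1 = directory, remaining arguments = names to ignore (one per property line).
set_svn_ignores() {
    local dir=$1
    shift
    local list status
    list=$(mktemp) || return 1
    # svn:ignore expects one pattern per line.
    printf '%s\n' "$@" > "$list"
    svn propset svn:ignore -F "$list" "$dir"
    status=$?
    rm -f "$list"
    return "$status"
}
```

For the site above that would be: set_svn_ignores /var/www/petersblog.org files images sites favicon.ico, followed by the usual svn commit.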
    

This cheat sheet was useful.


Filed under: drupal subversion

5 Comments

I've been looking at Subversion for Source Control and am liking what I see. The documentation is really good: it has enlightened me about the copy-modify-merge methodology and how much nicer it is than the laborious lock-modify-unlock method I use in Visual SourceSafe. It seems that merges are not to be dreaded: they are rare, and subversion will not try to automatically munge together two sets of code and generate a mess that won't build. The users are left to resolve merges themselves, which is easy if you are the only user :-)

There is even a cool Windows Explorer extension to manage files.

I am contemplating putting my Ubuntu configuration under Source Control. The advantages are:

  • configuration is backed up
  • each change I, debconf, webmin or whatever makes to the configuration can be encapsulated in a changeset.
  • I can easily recall precisely what I had to edit to solve a particular problem.

The only problem I can anticipate is Subversion fiddling around with file permissions, as is normally the case with the lock-modify-unlock model. Maybe copy-modify-merge does not need this? Restoring an old configuration setup is very likely to mess up permissions but hopefully that would rarely be necessary and can be done in a controlled manner.

Being responsible for the IT systems at work, I don't see why the configurations should not be under the same change control as the software we write. Unfortunately our servers are mostly Windows so we cannot simply archive the /etc directory :-(


Filed under: linux subversion ubuntu

2 Comments