Peter's Blog

Redefining the Impossible

Items filed under wikipedia


An interesting way to pass the time is the 'Random Article' feature on Wikipedia. You press it and get one of 2.2 million pages at random. Often they are about obscure American towns, but today it turned up Amazon SimpleDB. This looks interesting: an online database where you pay for the amount of data you use.

From what I can make out, data is stored in a huge hash, although each hash entry can hold multiple items (a bag in a hash?). They compare it to a worksheet, but that analogy seems to be aimed at pointy-haired bosses who can't think beyond Microsoft Office. The API looks very simple and shuns SQL.
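To make the 'bag in a hash' idea concrete, here is a rough sketch of how I picture it. This is only my reading of the data model, modelled in plain Python; the names `domain` and `put` and the sample items are made up for illustration and are not the SimpleDB API.

```python
from collections import defaultdict

# Hypothetical in-memory model, NOT the real SimpleDB API:
# item name -> attribute name -> set of string values
domain = defaultdict(lambda: defaultdict(set))

def put(item, attribute, value):
    """Add a value to an attribute, keeping any values already there."""
    domain[item][attribute].add(str(value))

put("shirt-01", "colour", "blue")
put("shirt-01", "colour", "white")  # a second value for the same attribute
put("shirt-01", "size", "M")

print(sorted(domain["shirt-01"]["colour"]))
```

The point is that an attribute can hold several values at once, which is where the worksheet analogy starts to creak.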

Advantages:

  • someone else worries about scalability, administration, backups

Disadvantages:

  • not SQL: not a bad thing in itself, but it could cause problems when used with applications written for SQL (i.e. most of them).
  • conversely, if you developed an application that relied on this, you had better hope they keep it running.

So, it's interesting but would I want to rely on it? Probably only if it had an SQL shim on top so I could migrate away pronto, should the need arise.

Google for SimpleDB and there is lots of analysis around. 1024 bytes per attribute value? Lexicographical comparison of dates and numbers?
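Why lexicographical comparison matters is easiest to see with a small example. This is a sketch in plain Python, not SimpleDB code; the `pad` helper and its width of 10 are my own assumptions about how one might work around it.

```python
from datetime import date

# Comparing numbers as strings goes character by character,
# so "10" sorts before "9".
as_strings = sorted(["9", "10", "100"])

def pad(n, width=10):
    """Zero-pad a non-negative integer so string order matches numeric order."""
    return str(n).zfill(width)

# After padding, string order and numeric order agree.
as_padded = sorted(pad(n) for n in [9, 10, 100])

# ISO 8601 dates (YYYY-MM-DD) happen to sort correctly as strings.
earlier = date(2008, 2, 9).isoformat()
later = date(2008, 10, 1).isoformat()

print(as_strings)
print(as_padded)
print(earlier < later)
```

Negative numbers would need more than padding (an offset, say), which is exactly the kind of housekeeping SQL column types normally handle for you.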

I think I'll stick with SQL.


Filed under: randomarticle wikipedia


Often, going through articles on Wikipedia, one finds sentences tagged with 'citation needed'. It seems to be Wikipedia's way for people to cast doubt on facts they don't agree with [citation needed]. It's a way to say 'prove it'. Surely, following this logic, every statement of fact on Wikipedia should be linked to a citation? [citation needed]

Maybe I should be using it here when I'm not totally sure what I am talking about and can't be bothered to look it up?


Filed under: wikipedia