Peter's Blog

Redefining the Impossible

Items filed under spam


I got this email:

Hello,

Browsing on the Internet I came upon your website ($name) and I find it very interesting and useful. My name is Daniel Lee and the reason I am contacting you is my interest in purchasing advertising spot on your site. I will be very thankful if you tell me how much a text link or banner 120x60 / 125x125 on your home page or all pages will cost.

Thank you in advance!

Daniel Lee

It's tempting apart from two things:

  • my site is not called $name. This looks suspiciously like a failed template expansion. A form email! How insincere can you get?!
  • it is signed Daniel Lee and yet the email came from someone called Sylvana Cowper (which sounds like an auto-invented spam email name).
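
For the curious, here is a minimal sketch of how a literal $name can survive into a sent email. It assumes the sender's tool behaves like Python's string.Template with safe_substitute, which leaves unknown placeholders alone. What the spammer actually used is unknown, and the template text and the render helper are made up for illustration.

```python
from string import Template

# Hypothetical template, loosely modelled on the email above.
TEMPLATE = Template(
    "I came upon your website ($name) and I find it very interesting."
)


def render(fields):
    """Fill in the template, leaving unknown placeholders untouched."""
    # safe_substitute() does not raise on a missing key; it simply
    # leaves the placeholder text in the output.
    return TEMPLATE.safe_substitute(fields)


if __name__ == "__main__":
    print(render({"name": "Peter's Blog"}))
    print(render({}))  # the site name was never filled in
```

Using substitute() instead of safe_substitute() would raise a KeyError on the missing field, which is the safer choice if you would rather send nothing than send a form letter that gives itself away.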


I think there is a new comment spam technique. The comments are left by a humanoid and the text is something along the lines of "that's very interesting but could you please explain further" and beneath it is a link to some website about fly fishing (?).

It's a clever generic comment that could be put anywhere and would hardly be noticed. I know a humanoid submitted it because the website url was modified to get past my spam filter.

It didn't make it past the approval queue.

As a sidenote, you know you've been playing too much WoW when:

  • you refer to unknown hostile people as humanoids
  • you have the Stormwind music going through your head
  • green subtitles on the telly make you think of green drops

Filed under: spam wow


What is it about Barclays Bank customers that makes the phishers target them more than any other bank (according to my gmail spam list)?

  • Are they particularly gullible?
  • Are they particularly rich and worth swindling?
  • Is the Barclays web site easy to copy?
  • Are the profits from Barclays customers so good that it's not worth copying more sites?
  • Do the other banks (including mine) hire hit men?

What could it be?

Spam on gmail is definitely getting worse, 30-50 messages a day. I've given up checking the list for false positives beyond a quick scan for subjects that make sense (why would I open an email with the subject 'Re: ceiling banana'?).


Filed under: gmail google spam


Deleting my spam today I pondered the random text they use to get past the Bayesian filters. Could it be possible that a piece of great literature (like Shakespeare but comprehensible) or the funniest joke in the world (© m. python 1969) could be created randomly and then just deleted by a million people without ever being read?


Filed under: spam


Got a new kind of spam email:

hi Maude i hope this is your mailbox. I was glad to meet you the other day. I hope you are really had like the New York. So much so much happening all the time, lots of great opportunities. And speaking of opportunities, the deal I was speaking you about other day involves a company called {censored company name}. It's already heading up, but the big information isn't even out yet, so there's still time. I have got this shares already and made 2000. I recommend you to do the same today.

Hope this helps you out. I'll see you this weekend. Yours Maude Gardner

It is designed to look like an email to a mistaken address but is really doing some share tipping. Clever idea, got through the google spam filters. Still suffers from the usual bad grammar (or is it a good impersonation of sloppy NY email speak?).


Filed under: spam


I knocked up a quick python script to scan my drupal watchdog list for comment spammers. The log covers the last week. In total there were 1250 spam attempts from 448 distinct ip addresses.

All these comment spams pretend to come from Windows XP, IE 6 so they cannot be filtered out by user agent.

My hack to the comment module to prevent urls being submitted generates watchdog messages and this script looks for these.

Here is the script:

    import MySQLdb

    # Connect to the local MySQL server and pick the Drupal database.
    o = MySQLdb.connect('127.0.0.1', 'me', 'secret')
    o.select_db('drupal_db')

    c = o.cursor()

    # Only the watchdog entries written by the comment module are of interest.
    c.execute("""select message, hostname from watchdog
                 where message like 'Comment:%'""")

    oBadGuys = {}   # ip address -> number of blocked attempts
    oGoodGuys = {}  # ip address -> number of comments accepted

    while True:
        oRow = c.fetchone()
        if not oRow:
            break

        strMessage, strSender = oRow

        if strMessage.startswith('Comment: attempted'):
            oBadGuys[strSender] = oBadGuys.get(strSender, 0) + 1

        if strMessage.startswith('Comment: added'):
            oGoodGuys[strSender] = oGoodGuys.get(strSender, 0) + 1

    #
    # Good guys manage to submit comments without problems.
    # Remove them from the bad guy list.
    #
    for strKey in oGoodGuys.keys():
        if strKey in oBadGuys:
            print(strKey + ' is not so bad')
            del oBadGuys[strKey]

    nTotal = 0

    # One line per remaining address, then the grand total.
    for strKey, nCount in oBadGuys.items():
        print("%s %d" % (strKey, nCount))
        nTotal += nCount

    print("%d spams from %d bad guys" % (nTotal, len(oBadGuys)))

I must get on with my TurboGears-based blog so I can do more about this. Drupal logging is a bit lame: it doesn't log the referrer or user agent, which might be useful, so I have to cross-reference with the Apache logs. There is more that I could do to make it harder to suck up my bandwidth, but my PHP is not strong enough and it's more fun to do in Python (and it won't wear my $ key out).
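
The cross-referencing with the Apache logs could look something like the sketch below. It assumes the access log is in Apache's standard "combined" format; the function name details_for_hosts and the sample log line are invented for illustration, and the pattern is deliberately simple (it will not cope with escaped quotes inside a field).

```python
import re

# Apache "combined" log format: host, identity, user, [time], "request",
# status, size, "referrer", "user agent".
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def details_for_hosts(log_lines, hosts):
    """Map each host in `hosts` to the set of (referrer, agent) pairs seen."""
    seen = {}
    for line in log_lines:
        match = COMBINED.match(line)
        if not match or match.group('host') not in hosts:
            continue  # unparseable line, or not one of the bad guys
        pair = (match.group('referrer'), match.group('agent'))
        seen.setdefault(match.group('host'), set()).add(pair)
    return seen


if __name__ == "__main__":
    # Made-up log line using a documentation-only IP address.
    sample = [
        '203.0.113.7 - - [10/Oct/2006:13:55:36 +0000] '
        '"POST /comment/reply/12 HTTP/1.1" 200 2326 '
        '"http://example.com/node/12" '
        '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"'
    ]
    for host, pairs in details_for_hosts(sample, {'203.0.113.7'}).items():
        print(host, sorted(pairs))
```

Feeding it the keys of the bad guy dictionary from the watchdog script as hosts would give the referrer and user agent for each offending address.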


Filed under: captcha drupal python spam


My site is being really hammered by comment spammers today, but not one has got through thanks to my policy of refusing to allow comments containing URLs to be submitted (not even for moderation: I moderate all comments, and I found deleting comment spam tedious as well as annoying).

It is a simple hack to the Drupal comment module but it is very effective. OK, I could get spam without URLs in it, but what's the point, apart from vandalism? Those still go into the moderation queue and get deleted.

And when people do want to post URLs they soon figure out how to get around the block. If they cannot do that then their comments are probably not worth consideration anyway.

I'd give the details of the hack here but it gives the spammers a clue. If you are interested then email me.

Update: following on from the vast surge in comment spam attempts (three or four a minute, 24/7), StatCounter tells me people are searching for Drupal captchas. I have given up on these: something about Drupal states, redirects, session management or whatever stops them working reliably. The spam comment check is just part of the comment validation. There is nothing much that can go wrong with it; it is just straight if/then/else code.
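
To give a flavour of what "straight if/then/else" validation means, here is a generic sketch in Python. It is not the actual hack (that lives in the PHP comment module and stays private); the pattern, the function name validate_comment and the messages are all assumptions made up for illustration.

```python
import re

# Deliberately crude: anything that looks like the start of a web address.
URL_PATTERN = re.compile(r'(https?://|www\.)', re.IGNORECASE)


def validate_comment(text):
    """Return an error message if the comment is refused, else None."""
    if not text.strip():
        return "Comment is empty."
    elif URL_PATTERN.search(text):
        return "Comments containing URLs are not accepted."
    else:
        return None


if __name__ == "__main__":
    for comment in ("Nice post!", "Cheap pills at http://example.com"):
        print(repr(comment), "->", validate_comment(comment))
```

There is no session state, no redirect and no image to generate, so there is very little to break.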

In a way the spam check is a captcha (Completely Automated Public Turing Test to Tell Computers and Humans Apart): you can still get through if you show some smarts. It doesn't use graphics, so it doesn't look cool, but it doesn't shut out blind people either.

The comment spam is coming from a range of IP addresses, maybe an array of compromised PCs (thanks Microsoft). Each 'failure' page uses some of my 10G/month bandwidth. I'll have to keep an eye out and see what kind of impact this is having. It could be even worse than Inktomi's Slurp bots doing 100M of crawling a month and not directing anyone here through their search results.
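
A back-of-envelope estimate of the damage is easy enough to script. The rate of three or four attempts a minute comes from the update above; the size of a 'failure' page is not something I have measured, so the 10 KB used here is purely an assumed figure, as is the function name.

```python
def monthly_bandwidth_gb(requests_per_minute, page_kb, days=30):
    """Rough monthly transfer in GB (taking 1 GB as 1,000,000 KB)."""
    requests = requests_per_minute * 60 * 24 * days
    return requests * page_kb / 1e6


if __name__ == "__main__":
    # 10 KB per refused page is a guess, not a measurement.
    for rate in (3, 4):
        print(rate, "per minute:", monthly_bandwidth_gb(rate, 10), "GB/month")
```

At four attempts a minute that is 172,800 requests in a 30-day month, so with the assumed 10 KB page it comes to about 1.7G of the 10G allowance. The real figure depends entirely on how big the failure page actually is.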


Filed under: captcha drupal spam
