Peter's Blog

Redefining the Impossible

Google searching with Python


One of my daily chores is to try out various search terms on Google to see how popular this site is. Yes, it's sad, but it's also tedious, especially if this site is only found on the 6th page. Today I knocked up this quick script to do it for me. It searches through up to 10 pages of results for the first reference to this site.

As usual in my scripts, I'm too lazy to put in support for command line parameters, so I edit the script directly to set it up.
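For anyone less lazy, command line support would not take much. Here is a minimal sketch using the standard argparse module. The function name and the option names (term, site, --proxy, --pages) are my own inventions for illustration and are not used by the script further down.

```python
import argparse


def parse_args(argv=None):
    """Parse the settings that the script otherwise hard-codes.

    All argument names here are hypothetical; pick whatever suits you.
    """
    parser = argparse.ArgumentParser(
        description="Find where a site first appears in search results")
    parser.add_argument("term", help="search term, e.g. 'hello world'")
    parser.add_argument("site", help="site to look for, e.g. bisiand.me.uk")
    parser.add_argument("--proxy", default=None,
                        help="proxy as host:port; leave out if not needed")
    parser.add_argument("--pages", type=int, default=10,
                        help="maximum number of result pages to check")
    return parser.parse_args(argv)


# Passing an explicit list stands in for a real command line
args = parse_args(["hello world", "bisiand.me.uk", "--proxy", "firewall:8080"])
print(args.term, args.site, args.proxy, args.pages)
```

The parsed values would then replace strTerm, strSite and strProxy, with args.pages taking the place of the hard-coded 10.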

Apart from Python, this needs a Google API developer key and the pygoogle library. You can sign up for the developer key here and put it in the code. Google lets you query their server 1000 times a day provided you give them this key.

Even if nobody else is vain or attention-seeking enough to need this script, it is interesting to see how easy searching Google programmatically can be.

The script does give a warning about using a deprecated SOAP library.


#
# How cool am I?
#

import google

google.LICENSE_KEY = "censored"

strTerm = 'hello world'
strSite = 'bisiand.me.uk'
#
# Proxy server: set to None if not needed
#
strProxy = 'firewall:8080'

bFound = False

# Check up to 10 pages of 10 results each
for nPage in range(10):
   oSearchResult = google.doGoogleSearch( strTerm,
                                          start = nPage * 10,
                                          maxResults = 10,
                                          http_proxy = strProxy)
   # An empty page means the results have run out
   if len(oSearchResult.results) == 0:
       break

   nResult = 0
   for oResult in oSearchResult.results:
       nResult += 1
       if oResult.URL.find( strSite) >= 0:
           print "Found on page %d item %d" % (nPage + 1, nResult)
           print oResult.snippet.encode( 'ascii', 'ignore')
           bFound = True
           break

   if bFound:
       break

if not bFound:
   print "Beneath contempt"
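The page and item bookkeeping above has nothing to do with Google as such, so it can be tried out without a developer key. Here is a rough sketch of the same logic as a plain function. The name find_site and the sample URLs are made up for illustration, and each page is assumed to be a simple list of URL strings rather than pygoogle result objects.

```python
def find_site(pages, site):
    """Return (page, item), both counted from 1, for the first URL
    containing site, or None if it never turns up."""
    for page_no, urls in enumerate(pages, start=1):
        if not urls:
            break  # an empty page means the results have run out
        for item_no, url in enumerate(urls, start=1):
            if site in url:
                return page_no, item_no
    return None


# Made-up result pages, two results per page, purely for illustration
sample_pages = [
    ["http://example.com/a", "http://example.org/b"],
    ["http://example.net/c", "http://bisiand.me.uk/blog"],
]

hit = find_site(sample_pages, "bisiand.me.uk")
if hit:
    print("Found on page %d item %d" % hit)
else:
    print("Beneath contempt")
```

Returning from the function as soon as there is a match does away with the bFound flag and the double break.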


Filed under: google python
