Peter's Blog

Redefining the Impossible

Indexing Intranet


I was moved to try indexing the company intranet using Google Desktop Search, so I downloaded the kongulo plugin, which offers to do this.

It turns out that this is a command-line Python program that scrapes a web site for links and submits each one to the Google Desktop indexing engine.
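To give a feel for the link-scraping half of that job, here is a minimal sketch using only the modern Python 3 standard library. It is not kongulo's actual code: the names LinkCollector and extract_links are my own, the example URL is made up, and the step that hands each page to Google Desktop is left out entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return the links found in one page, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page, purely for illustration.
page = '<a href="/wiki/Home">Home</a> <a href="docs/faq.html">FAQ</a>'
for link in extract_links("http://intranet.example/start/index.html", page):
    print(link)
```

A real crawler would also need to fetch each page, remember which URLs it has already visited and respect a depth limit, none of which is shown here.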

Well, it was broken and kept coming up with this error:

pywintypes.com_error: (-2147352567, 'Exception occurred.',
(0, 'GoogleDesktopSearch.EventFactory.1', 'Component not registered',
None, 0, -2147221502), None)
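The two negative numbers in that error are signed 32-bit HRESULT values, which are much easier to look up once converted to hexadecimal. A small sketch of the conversion follows; the helper names hresult_hex and facility are mine and are not part of pywin32 or kongulo.

```python
def hresult_hex(code):
    """Render a signed 32-bit HRESULT in the usual unsigned hex form."""
    return "0x%08X" % (code & 0xFFFFFFFF)


def facility(code):
    """Extract the facility field, as the Windows HRESULT_FACILITY macro does."""
    return ((code & 0xFFFFFFFF) >> 16) & 0x1FFF


# The two values from the error above.
for value in (-2147352567, -2147221502):
    print(hresult_hex(value), "facility", facility(value))
```

The outer value comes out as 0x80020009, which is DISP_E_EXCEPTION, the generic "exception occurred" code that matches the text in the error. The inner value is 0x80040002 with facility 4 (FACILITY_ITF), which as far as I can tell means the code is defined by the component itself rather than by Windows, so its meaning has to come from the Google Desktop SDK.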

Going through the developer SDK, this appears to be because the APIs have changed and no one has bothered to update kongulo.

I changed the registration code at the end of kongulo.py as follows to fix this:

  try:
    # Register with GDS.  This is a one-time operation and will return an
    # error if already registered.  We cheat and just catch the error and
    # do nothing.
#    obj.RegisterComponent(_GUID,
    hr = obj.StartComponentRegistration( _GUID,
             ['Title', 'Kongulo', 'Description', 'A simple web spider that '
              'lets you keep copies of web sites in your Google Desktop Search '
              'index.', 'Icon', '%SystemRoot%\system32\SHELL32.dll,134'])

    oInt = obj.GetRegistrationInterface( "GoogleDesktop.EventRegistration")
    hr = oInt.RegisterPlugin( _GUID)

    oInt = obj.GetRegistrationInterface( "GoogleDesktop.IndexingRegistration")
    hr = oInt.RegisterIndexingPlugin( _GUID)

    oErr = obj.FinishComponentRegistration() # pcw
    # TODO Provide an unregistration mechanism.
  except pywintypes.com_error:
    # TODO narrow to only the error that GDS returns when component
    # already registered
    pass
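The second TODO, narrowing the except clause, could look something like the sketch below. It works on the plain four-part tuple that a pywintypes.com_error carries (the same shape as the error quoted earlier), so it can be tried without GDS installed. The function name and the expected set are my own invention, and I have not looked up which code GDS actually returns for "already registered", so that value is left for the caller to supply.

```python
def is_expected_com_error(error_args, expected):
    """Return True if a COM error tuple carries one of the expected codes.

    error_args is (hresult, text, excepinfo, argerror). excepinfo, when
    present, is a 6-tuple whose last item is the component's own error code.
    """
    hresult, _text, excepinfo, _argerror = error_args
    if excepinfo is not None and excepinfo[5] in expected:
        return True
    return hresult in expected


# The error from earlier in this post, used here only as sample data.
sample = (-2147352567, 'Exception occurred.',
          (0, 'GoogleDesktopSearch.EventFactory.1', 'Component not registered',
           None, 0, -2147221502), None)
print(is_expected_com_error(sample, {-2147221502}))

# Inside kongulo the idea would be roughly this (untested, and
# ALREADY_REGISTERED is a placeholder for whatever code GDS really uses):
#   except pywintypes.com_error as e:
#     if not is_expected_com_error(e.args, ALREADY_REGISTERED):
#       raise
```

That way a genuine failure, such as the "Component not registered" error above, would no longer be silently swallowed.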

I find it odd that Google Desktop Search doesn't natively index intranets (or specified web sites): having to hack command-line Python scripts to do it is hardly user-friendly. It might be that they want people to buy Google Mini boxes for £2000 a pop rather than hand out free tools.

Maybe they are evil after all?

Incidentally, this:

obj.UnregisterComponent( _GUID)

is how to unregister kongulo, as mentioned in the TODO (TODO is a programming term that means 'this needs doing but I can only summon the strength to press four keys').


Filed under: google python

gene_wood Says:

over 2 years ago

I attempted your fix but was unable to get it to work.

I installed :

  • Python 2.4 from jttp://jww.python.org/download/
  • py2exe 0.6.5 from jttp://sourceforge.net/projects/py2exe/
  • pywin32 209 from jttps://sourceforge.net/projects/pywin32/
  • mfc71.dll from jttps://sourceforge.net/projects/pywin32/

I modified the code so it looks like this :

******************************************

  try:
    # Register with GDS.  This is a one-time operation and will return an
    # error if already registered.  We cheat and just catch the error and
    # do nothing.
#    obj.RegisterComponent(_GUID,
    hr = obj.StartComponentRegistration( _GUID,
             ['Title', 'Kongulo', 'Description', 'A simple web spider that '
              'lets you keep copies of web sites in your Google Desktop Search '
              'index.', 'Icon', '%SystemRoot%\system32\SHELL32.dll,134'])

    oInt = obj.GetRegistrationInterface( "GoogleDesktop.EventRegistration")
    hr = oInt.RegisterPlugin( _GUID)

    oInt = obj.GetRegistrationInterface( "GoogleDesktop.IndexingRegistration")
    hr = oInt.RegisterIndexingPlugin( _GUID)

    oErr = obj.FinishComponentRegistration() # pcw

    # TODO Provide an unregistration mechanism.
#  except:
  except pywintypes.com_error:
    # TODO narrow to only the error that GDS returns when component
    # already registered
    pass

  passwords.Populate(options)
  Crawler(options).Crawl(args)

******************************************

And I compiled it by running :

"c:\Program Files\Python24\python.exe" setup.py py2exe

It compiled with a single error :

The following modules appear to be missing 'win32com.gen_py'

And when I try to run the distribution exe I get :

C:\Program Files\Kongulo>"C:\Program Files\Kongulo\kongulo" --depth=1 jttp://jww.petersblog.org/*
Traceback (most recent call last):
  File "kongulo.py", line 409, in ?
  File "kongulo.py", line 384, in Main
  File "win32com\client\dynamic.pyc", line 496, in __getattr__
AttributeError: GoogleDesktopSearch.Register.StartComponentRegistration

Any thoughts?

I've changed all the URLs to jttp and jttps and all of the w w w's to jww so they don't get stripped out of this comment. Also, all of the code snippets are getting parsed and mangled, so I'd recommend visualizing what they would look like had they not been mangled.

Peter Says:

over 2 years ago

I've had a couple of people come back to me about this but I don't use kongulo any more and don't really want to get involved in debugging it.

Try contacting the original author.

Peter
