Tuesday, January 30, 2007

ANeT - an invisible ant resource

ANeT is a network to promote ant research in Asia. It has some interesting things, although it's a shocking example of poor design. Almost all the text on the home page is not plain text but text rendered in a GIF image, which means search engines like Google will have a tough time indexing the page, which in turn means it will be hard to find.

And if you can't be found by Google, you don't exist...

Wednesday, January 24, 2007

Searching Hymenoptera Name Server literature

These are some notes on efforts to make the Hymenoptera Name Server literature database searchable. This work builds on LSID stuff I did earlier, and is also a response to the TAXACOM thread started by Roger Hyam. Donat Agosti has also been requesting something along these lines.

The first step is to suck all the records off HNS and convert them to RIS format. I then want to import them into an instance of MyPHPBib (an old project of mine languishing on SourceForge), which gives me a MySQL database of the literature to play with. What I'd like is an OpenURL-style search interface that can be used to return records matching a user query.
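To make the RIS step concrete, here is a minimal sketch in Python of writing one reference out as an RIS record. It is not the script I actually ran: the dictionary keys (authors, year, title and so on) are made up for illustration, the record type is assumed to be a journal article, and the sample data is a placeholder. The only thing taken from the RIS format itself is the tag layout (a two-letter tag, two spaces, a hyphen and a space), with TY opening the record and ER closing it.

```python
def record_to_ris(record):
    """Serialise one reference (a plain dict) as an RIS record.

    The dict keys used here are hypothetical; they are not the field
    names that HNS returns.
    """
    # Every RIS record opens with a type tag; JOUR (journal article)
    # is assumed for all records in this sketch.
    lines = ["TY  - JOUR"]

    # One AU line per author.
    for author in record.get("authors", []):
        lines.append("AU  - " + author)

    # Map the remaining (hypothetical) keys onto standard RIS tags,
    # skipping any field that is missing or empty.
    for tag, key in (
        ("PY", "year"),
        ("TI", "title"),
        ("JO", "journal"),
        ("VL", "volume"),
        ("SP", "start_page"),
        ("EP", "end_page"),
    ):
        value = record.get(key)
        if value:
            lines.append(f"{tag}  - {value}")

    # ER closes the record.
    lines.append("ER  - ")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder data, not a real HNS record.
    print(record_to_ris({
        "authors": ["Example, A.", "Sample, B."],
        "year": 2007,
        "title": "A placeholder title",
    }))
```

Skipping empty fields keeps the output tidy when a record has no volume or page numbers.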

Notes to self. Character encoding is a major, major pain. I'm running the script on a Fedora Core 4 box because Mac OS X drove me nuts. I'm ensuring that the XML style sheet outputs ISO-8859-1 encoding (to match that returned by HNS), and I've set the Terminal character encoding to Western (ISO-8859-1) as well.
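For what it's worth, the same idea (name the encoding explicitly at every step rather than trusting platform defaults) can be sketched in Python. This is an illustration only, not part of my pipeline; the two helper names are made up, and falling back to numeric character references for characters that ISO-8859-1 can't represent is my assumption about what would be acceptable downstream.

```python
def decode_latin1(raw: bytes) -> str:
    """Decode bytes that are known to be ISO-8859-1 (hypothetical helper)."""
    return raw.decode("iso-8859-1")


def encode_latin1(text: str) -> bytes:
    """Encode text as ISO-8859-1 (hypothetical helper).

    Characters outside ISO-8859-1 are written as numeric character
    references instead of raising an error; that choice is an assumption.
    """
    return text.encode("iso-8859-1", errors="xmlcharrefreplace")


if __name__ == "__main__":
    raw = b"Andr\xe9"            # "André" as ISO-8859-1 bytes
    text = decode_latin1(raw)
    # Round-tripping through the same encoding gives back the same bytes.
    assert encode_latin1(text) == raw
```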