Tuesday, July 11, 2006

Implementing a Google Mini

.. if only everything in life was this simple

The Google Mini implementation, together with a redesign (or rather re-alignment) of our website, will go into production on 24 October. It took longer than this post suggests because other, higher-priority projects were implemented in the past few weeks. But hey! It is coming at last.

As briefly mentioned in my previous post, I am working on incorporating a Google Mini appliance into the Univé website. Here I will go into a little more detail.

Side stepping

I am a bit of a DIY person. I like to do things myself. Most of the time that means things take longer to finish, but once they are finished the feeling of pride and self-esteem makes up for it. Along the way in any of these DIY projects I reach stages where I have to do something new with new tools or materials. Most of these have words on the package like "can be assembled in 5 minutes", and over time I have learned that this is the moment to be aware, to be very aware, of pitfalls. Usually I have picked something in order to use it in a slightly different way, or I want to use it inside a space so small that it is impossible to get a screwdriver in to fasten it.

I think that these problems are directly related to my lack of skills, because usually (after some tears and sweat from me, and some cursing of the people who designed the stuff) I can use it for the purpose I had intended. I think of myself as a reasonably capable DIY person.

The same applies to my IT skills. I sometimes wander about and pick the wrong route at first, but usually I get stuff working, again after some tears and sweat from me and some cursing of the people who designed the stuff.

But these marketing words, "up and running in NO time", still put me in a special mode. When something cannot fail, I know it will.

Stepping back

So, when we had wheeled in the Google Mini appliance, I was ready for weeks of administering and tweaking to finally get some search results on my screen that would vaguely look like what we wanted to see.

I know I have been very enthusiastic about the stuff coming from Google in the past months, even years, and maybe it seems like I am a blind follower of the company in Mountain View, but I am not. I am, however, of the opinion that they have created some very good stuff in recent years. And when they haven't created it themselves, they give it a twist by offering it for free, like Google Analytics.

Back to the topic

After these lengthy opening paragraphs I am now returning to the subject of this post: implementing a Google Mini.

During the build of our new website we dropped search as one of the deliverables for Go-Live. It was postponed to the Point One Release. You could have a nice long discussion about that, but don't bother: we had that discussion as well, management decided to drop it, and so we did.

We had acquired a Google Mini to facilitate the site search and had it hanging in a server rack for a couple of months gathering dust, but once we were ready for launch I was given time to explore its potential. I must admit that I was a bit afraid to have a stab at it. To me it was a black box (although it's happily blue) and I did not see an opening. Misplacing the manuals was another hurdle we had to take, but after some Googling on Google we found a way to download them and have a go at it. Then a colleague opened his drawer and said that he had secured the manuals. Sigh!

Initial setup

The first thing you have to do is some basic setup, like assigning an IP address and an administrator password. This was done very simply using a laptop connected directly to the box. Then, using the web interface, it was really easy to configure the crawling and, after the first crawl, to see the first results through the familiar Googlish interface. Tweaking a bit of XSLT brought out the company colors and logo, and I was ready to be enthusiastic again. This machine would provide an easy site search for our website. And so we moved on.

Integration into the site

Well, we did not want to give visitors direct access to the Google Mini, but we wanted to really integrate search into the site context. We have added a search box to our menu bar and created a new search container or template. Communication with the Google Mini is managed through a Web Service.

Since the Google Mini already came with an API, it was relatively simple to set this all up. Cutting some corners, we decided to stay with a plain search screen and leave out the advanced bits for the moment. This meant that we needed a simple Web Service method that would only take a string as the search query. On top of that, we wanted navigation from one page of search results to the next, so another method was needed to accept the (more complex) query with page indication and the like. This second method was also used for the spelling suggestions and the keymatch results (aka Sponsored Links).

Communication with the Google Mini

The Google Mini API is very simple in nature. All one needs to do is build a long URL with query parameters, and in return one gets an XML string that holds the outcome of the search request.
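To give an idea of what "building a long URL" amounts to, here is a minimal sketch in Python (our own service is written in .NET; Python is used here only for brevity). The host name, the function name `build_search_url` and the collection and front end values are all made up for illustration, and the parameter names (`q`, `site`, `client`, `output`, `start`, `num`) are given as I remember them from the appliance documentation, so check them against your own box before relying on them.

```python
from urllib.parse import urlencode


def build_search_url(host, query, start=0, num=10):
    """Build a search request URL for the appliance.

    All parameter names and values below are assumptions for illustration;
    verify them against the documentation of your own appliance.
    """
    params = {
        "q": query,                     # the visitor's search terms
        "site": "default_collection",   # hypothetical collection name
        "client": "default_frontend",   # hypothetical front end name
        "output": "xml_no_dtd",         # ask for raw XML instead of HTML
        "start": start,                 # offset of the first result (paging)
        "num": num,                     # number of results per page
    }
    # urlencode takes care of escaping spaces and special characters
    return f"http://{host}/search?{urlencode(params)}"


# Second page of results for a two-word query
url = build_search_url("mini.example.com", "car insurance", start=10)
print(url)
```

The simple method described earlier would call this with only the query string, while the paging method would also pass `start`.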

The XML string is then returned to the site and there it is loaded into an XmlDocument and then parsed using XPath queries and presented to the visitor in simple XHTML.
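A rough sketch of that parsing step is shown here, in Python rather than with the .NET XmlDocument we actually use. The sample XML is hand-written and heavily trimmed, and the element names (`RES`, `R`, `U`, `T`, `S`, `Spelling`, `Suggestion`, `GM`, `GL`, `GD`) are as I recall them from the appliance output, so treat them as assumptions. The function name `parse_results` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, trimmed-down sample; real responses contain more elements.
SAMPLE_XML = """<GSP VER="3.2">
  <Q>insurence</Q>
  <Spelling><Suggestion q="insurance">insurance</Suggestion></Spelling>
  <GM><GL>http://www.example.com/offer</GL><GD>Special offer</GD></GM>
  <RES SN="1" EN="2">
    <M>2</M>
    <R N="1"><U>http://www.example.com/a</U><T>Page A</T><S>Snippet for A</S></R>
    <R N="2"><U>http://www.example.com/b</U><T>Page B</T><S>Snippet for B</S></R>
  </RES>
</GSP>"""


def parse_results(xml_text):
    """Pull results, spelling suggestion and keymatches out of the XML."""
    root = ET.fromstring(xml_text)

    # One dict per <R> element: URL, title and snippet
    results = [
        {
            "url": r.findtext("U", ""),
            "title": r.findtext("T", ""),
            "snippet": r.findtext("S", ""),
        }
        for r in root.findall("./RES/R")
    ]

    # Spelling suggestion, if the appliance offered one
    suggestion = root.find("./Spelling/Suggestion")

    # Keymatch entries (the "Sponsored Links")
    keymatches = [
        {"url": gm.findtext("GL", ""), "description": gm.findtext("GD", "")}
        for gm in root.findall("./GM")
    ]

    return {
        "results": results,
        "suggestion": suggestion.get("q") if suggestion is not None else None,
        "keymatches": keymatches,
    }


parsed = parse_results(SAMPLE_XML)
print(len(parsed["results"]), parsed["suggestion"])
```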

Presentation to the user

First of all, we did not want to reinvent yet another better wheel. We feel that Google does a great job of presenting search results. But being a web developer, I also know that the code used to produce the list of results is not exactly what we would call modern or semantic web design.

So, we liked what it looks like on the outside and decided to change the inside.

Shown in the screenshot is part of what the page will look like. The similarity with Google search result pages should be obvious. We even have space for AdSense ;-)

The search results are presented in a <dl>, with the linked title in a <dt> and the snippet from the page in a <dd>. This seems to me a reasonable microformat for this purpose. Keeping the pages lightweight was and is one of our primary goals.
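As an illustration of that markup, here is a small Python sketch that turns a list of results into such a definition list. The function name `render_results` and the dictionary keys are hypothetical, and this is not our production code. The sketch escapes everything it outputs; if the snippet arrives with markup of its own (for example for highlighted terms), that would need separate handling.

```python
from html import escape


def render_results(results):
    """Render results as a <dl>: linked title in <dt>, snippet in <dd>."""
    lines = ["<dl>"]
    for r in results:
        # Escape the URL for use inside an attribute, and the texts for use as content
        lines.append(
            f'  <dt><a href="{escape(r["url"], quote=True)}">{escape(r["title"])}</a></dt>'
        )
        lines.append(f'  <dd>{escape(r["snippet"])}</dd>')
    lines.append("</dl>")
    return "\n".join(lines)


html_out = render_results(
    [{"url": "http://www.example.com/a?x=1&y=2", "title": "A & B", "snippet": "Text <here>"}]
)
print(html_out)
```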

Looking back

When I now look back at this project (just before it is moved into the final test cycle and put online) I think that we did a great job. And the Google Mini is a great way of providing site search. Implementation went smoothly, and the only challenges were, again, in getting the .NET Framework to work the way we wanted.

The Google Mini, once configured, does not need any real maintenance. If the machine should lose all its settings and its database, then setting it up fresh and letting it crawl our site can be done within an hour. The Google Mini really lives up to expectations. It was great fun to do.

Looking forward

After putting it in production we will have a very strong tool for our content writers to see how a search engine sees our site and how Google presents the results. I plan to work with them to make the site even better than it already is.

In that way the Google Mini will work as a benefit by providing search to the visitors, but also as a tool to improve the search engine performance of our site. Not bad for a little blue black box.