Works best if the page conforms to Wikipedia's Lead section guidelines, but most do.
Wikipedia is quite big: the database is around 2 GB now, partly due to the full-text search (FTS) index.
Querying
I had an idea: make the summaries available over DNS.
Not quite as mad as it sounds
TXT records work very well for this, and caching comes for free.
It's actually very fast: queries take advantage of nearby recursive DNS servers for caching (almost everyone has a recursive resolver close by; not everyone has an HTTP proxy), and DNS over UDP has less overhead than TCP.
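To illustrate how little goes on the wire, here's a sketch of the raw DNS query a client would send. The lookup name `perl.wp.dg.cx` is illustrative (the text above only names the HTTP host `js-wp.dg.cx`); everything else is standard DNS wire format.

```javascript
// Build a minimal DNS query for a TXT record, to show how small it is.
// The lookup name here is a hypothetical example.
function buildTxtQuery(name, id = 0x1234) {
  // Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT/NSCOUNT/ARCOUNT=0
  const header = Buffer.from([
    id >> 8, id & 0xff, 0x01, 0x00, 0x00, 0x01, 0, 0, 0, 0, 0, 0,
  ]);
  // QNAME: length-prefixed labels, terminated by a zero byte
  const qname = Buffer.concat([
    ...name.split(".").map(label =>
      Buffer.concat([Buffer.from([label.length]), Buffer.from(label, "ascii")])),
    Buffer.from([0]),
  ]);
  // QTYPE=TXT (16), QCLASS=IN (1)
  const question = Buffer.concat([qname, Buffer.from([0, 16, 0, 1])]);
  return Buffer.concat([header, question]);
}

const query = buildTxtQuery("perl.wp.dg.cx");
console.log(query.length); // 31 — the whole query fits in one tiny UDP packet
```

Compare that 31-byte datagram (plus a similar-sized answer, served from a nearby resolver's cache) with the TCP handshake and headers an HTTP request needs.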
But soon...
Not all applications can query TXT records; an HTTP interface would be useful too.
Wrote a JSON/JSON-P interface
HTTP interface
GET http://js-wp.dg.cx/json/Perl
{"text":"In computer programming, Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall, a linguist working as a systems administrator for NASA, in 1987, as a general purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and became widely popular...","url":"http://en.wikipedia.org/wiki/Perl","name":"Perl","id":""}
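A client just parses the response body with any JSON library; a sketch (the body is abridged from the example above):

```javascript
// Parse the (abridged) JSON body returned by the HTTP interface.
const body = '{"text":"In computer programming, Perl is a high-level, ' +
  'general-purpose, interpreted, dynamic programming language...",' +
  '"url":"http://en.wikipedia.org/wiki/Perl","name":"Perl","id":""}';

const summary = JSON.parse(body);
console.log(summary.name); // Perl
console.log(summary.url);  // http://en.wikipedia.org/wiki/Perl
```

For the JSON-P variant, the same object is wrapped in a caller-supplied callback name, so a page can load it cross-origin via a script tag.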
Greasemonkey
Adds a title (tooltip) to any link to (or within) Wikipedia
Uses the HTTP interface and a background HTTP request
Demo.
Script is at https://dgl.cx/2006/09/wikipedia-summary.user.js.
Currently it doesn't do anything clever: if the title isn't retrieved within the title display timeout (200 ms), it simply won't be displayed.
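The timeout behaviour can be sketched as a race between the background request and a 200 ms timer; the tooltip is set only if the request wins. (`fetchSummary` is a hypothetical stand-in for the script's real background request; the actual user script may be structured differently.)

```javascript
// Race a summary lookup against the title display timeout (200 ms).
// If the lookup loses, the tooltip for this hover simply isn't shown.
function withTimeout(promise, ms) {
  return Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error("timeout")), ms)),
  ]);
}

// Hypothetical usage inside the user script:
//   withTimeout(fetchSummary(link.href), 200)
//     .then(text => { link.title = text; })
//     .catch(() => { /* too slow; skip this hover */ });
```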
To Do
Parsing MediaWiki markup is hard
Use Wikipedia API to get more up-to-date data
Work out a way to expose the FTS index (it only searches the summaries)