Posted by Ian Holsman
Fri, 11 Apr 2008 05:57:00 GMT
locallucene, a geographical search plugin for Lucene and Solr, is now powering our yellowpages site.
All the props should go to Patrick; locallucene is his brainchild. If you use locallucene, you can always send him a pizza as thanks.
Posted in Development | Tags aol, locallucene, solr, yellowpages | 2 comments
Posted by Ian Holsman
Mon, 09 Jul 2007 19:43:00 GMT
One of the things I’m responsible for at AOL is their use of Solr in their upcoming web developments.
A task we keep finding ourselves doing is taking an input feed (be it CSV, XML, or a DB table) and transforming it into a Solr index (we call it ingestion). It’s a boring and thankless task, but it is critical to get it right, especially when you need to deal with both real-time and batch updates.
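To make the ingestion step concrete, here is a minimal sketch of the CSV case, assuming the goal is Solr's XML update message (an `<add>` element holding one `<doc>` per record). The function name, the sample feed and its column names are all made up for illustration; this is not the code we run at AOL.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_solr_add(rows):
    """Build a Solr <add> update message from an iterable of dicts."""
    add = ET.Element("add")
    for row in rows:
        doc = ET.SubElement(add, "doc")
        for name, value in row.items():
            if value is None or value == "":
                continue  # skip empty values rather than index blanks
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical feed: these column names are invented for the example.
SAMPLE_CSV = "id,name,city\n1,Joe's Pizza,Reston\n2,Book Nook,\n"

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    print(rows_to_solr_add(reader))
```

The resulting string is what would get POSTed to Solr's update handler. The transport, the commit, and any mapping between feed columns and schema fields are left out here, and those are exactly the parts that get fiddly with real feeds.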
That chore gave me a reason to try out Kettle, an open source ETL engine built for this kind of thing. But out of the box it had no support for Solr :-(
So I created this proof of concept plugin to show how easy it could be to just shove a data stream into Solr. I am also trying to get a demo going that shows how easy it is to take some input data and turn it into a Solr search engine (and do other things with it at the same time).
It works well enough for me to do a proof of concept with a couple of different feeds and show the channel development teams how easy life could be.
Disclaimer: before you go and start using it in production, please be aware that it needs a lot more work when it comes to setting options and stability.
So if you’re interested in this type of thing, feel free to ping me and I’ll add you to the project (with the aim that either Solr or Kettle takes this and makes it part of its standard packages).
Posted in Development | Tags kettle, solr | 1 comment
Posted by Ian Holsman
Wed, 14 Jun 2006 03:29:44 GMT
Continuing the recent thread about contenttypes in Django, I thought I would talk about a feature that was added in the magic-removal branch and doesn’t get as much attention as I think it deserves:
signals and the dispatcher.
Signals are a way of telling the rest of the world that something happened. If you are interested, you simply listen for it (connect, in Django speak).
Take, for example, my tagging application currently in use on zyons. One of its features is that it lets users store their own tags.
One of the performance improvements I added to this was a ‘summary’ tag, which aggregates the users’ preferences into a single record.
Now, the first approach I could have taken was to call a ‘generate_summary_tag’ function every time I modify the user tag, but that was just messy, and it would be quite possible that I would forget a call somewhere.
Instead I did the following in the models.py:
from django.db.models import signals
from django.dispatch import dispatcher

dispatcher.connect(increment_tag_summary, signal=signals.pre_save, sender=TagUserObject)
dispatcher.connect(decrement_tag_summary, signal=signals.post_delete, sender=TagUserObject)
Now, every time the Django ORM saves or deletes a TagUserObject record, the matching function gets called.
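If the connect/send mechanics are unfamiliar, the toy below may help. It is a self-contained stand-in written for this post, not Django's dispatcher: the `connect` and `send` functions, the sentinel signals and the stripped-down `TagUserObject` class are all assumptions made for the sketch, and the summary is just a dict rather than a database record.

```python
from collections import defaultdict

# Toy stand-in for a signal dispatcher; NOT Django's implementation.
_receivers = defaultdict(list)  # signal -> [(receiver, sender), ...]

pre_save = object()     # sentinel objects play the role of signals
post_delete = object()


def connect(receiver, signal, sender=None):
    """Register receiver for signal, optionally for one sender only."""
    _receivers[signal].append((receiver, sender))


def send(signal, sender, **named):
    """Call every receiver registered for this signal and sender."""
    results = []
    for receiver, wanted in _receivers[signal]:
        if wanted is None or wanted is sender:
            results.append(receiver(sender=sender, **named))
    return results


class TagUserObject:
    """Hypothetical model: one user's tag on one object."""

    def __init__(self, tag):
        self.tag = tag

    def save(self):
        send(pre_save, sender=TagUserObject, instance=self)

    def delete(self):
        send(post_delete, sender=TagUserObject, instance=self)


tag_summary = defaultdict(int)  # tag -> number of user tags seen


def increment_tag_summary(sender, instance, **named):
    tag_summary[instance.tag] += 1


def decrement_tag_summary(sender, instance, **named):
    tag_summary[instance.tag] -= 1


connect(increment_tag_summary, signal=pre_save, sender=TagUserObject)
connect(decrement_tag_summary, signal=post_delete, sender=TagUserObject)

if __name__ == "__main__":
    a, b = TagUserObject("django"), TagUserObject("django")
    a.save()
    b.save()
    a.delete()
    print(dict(tag_summary))
```

The point is that the model code never mentions the summary at all; it only announces that a save or delete happened. Note that the toy increments on every save, so a real handler would need to tell a new record from an update to an existing one.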
Other examples in the zyons code base include using signals to update the forum and conversation models to show the last-comment date and the number of posts. (instead of looking them up).
But you don’t need to use only Django’s pre-defined signals. You can create your own.
For example, my counter application (which is used to determine ‘popular’ conversations in the forums) uses a custom signal (object_viewed) to do its work.
Whenever a user views a forum or a conversation, an object_viewed signal is sent,
like so:
dispatcher.send(signal=signals.object_viewed, request=request, object=object)
At the moment I’m doing the heavy lifting at request time, but there is nothing stopping me from changing the logic of ‘increment_tag_summary’ to use ActiveMQ via Stomp and having a separate batch job do the work instead.
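A rough sketch of that split, assuming an in-process `queue.Queue` in place of ActiveMQ and plain dicts in place of model instances (both are stand-ins chosen so the example runs on its own; `run_batch` is a hypothetical name):

```python
import queue
from collections import Counter

# In-process stand-in for a message broker. A real setup would publish
# to ActiveMQ over Stomp instead, which is not shown here.
pending = queue.Queue()


def increment_tag_summary(sender, instance, **named):
    """Signal handler: record the event and return immediately."""
    pending.put(("increment", instance["tag"]))


def decrement_tag_summary(sender, instance, **named):
    """Signal handler: record the event and return immediately."""
    pending.put(("decrement", instance["tag"]))


def run_batch(summary):
    """Batch job: drain the queue and fold the events into summary."""
    processed = 0
    while True:
        try:
            action, tag = pending.get_nowait()
        except queue.Empty:
            return processed
        summary[tag] += 1 if action == "increment" else -1
        processed += 1


if __name__ == "__main__":
    summary = Counter()
    increment_tag_summary(None, {"tag": "solr"})
    increment_tag_summary(None, {"tag": "django"})
    decrement_tag_summary(None, {"tag": "solr"})
    print(run_batch(summary), dict(summary))
```

The handlers keep the same names and signatures, so nothing that connects them has to change; only where the work happens moves. The trade-off is that the summary lags behind until the batch job next runs.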
Another use of the pre_save signal that I plan in the near future is to update a Solr (Lucene-based) search server and use it instead of some complex/heavy MySQL queries that are currently run, by creating a ‘de-normalised’ version of some of the records and sticking it in Solr.
Oh, and a request: zyons.com is looking for a new home. If you can provide mod_python, MySQL and shell access, it would be appreciated. My home machine (a dual Pentium II 450) is beginning to show its age.
Posted in Development | Tags activemq, django, mysql, solr, stomp | 1 comment | no trackbacks