screeley.com

Django Solr

Feb. 2

After writing about a thousand lines of documentation, I'm spent. I'd like to write something witty or clever or fun, but I got nothing.

With some Optaros blessing, I released a Django Solr module out into the wild. It was written by Jay Dolan for a top 20 newspaper site about 9 months ago, but it had to be rewritten for Django 1.0, and I got sidetracked with clients and Alfresco.

Google tells me that about 4 people a day search for "django solr". So, good news: I just made 4 people's day. My ego boo is out of this world right now.

On a less bitter note, the project is pretty damn cool. You set up a few search documents and all of a sudden you have all your models searchable, facetable and highlighted. Dare I call it... magic.

Lessons Learned

  • Write documentation. It will quickly inform you where your code sucks.
  • It's really hard to justify to friends that going home and coding something you are going to give away is better than drinking.
  • Sphinx documentation is the best.
  • "Hey, we should open source that." I've often heard the sweet words fluttering softly around the office. I now know why it's never done. Simply because it's a lot more than writing a good piece of code. Making it easy to use and telling people how to use it is a lot harder.

In future posts I'll talk about how to use it. For now, read the documentation.

Comments

Thank you! Sounds great, I'll try to use it in my projects asap! (I was one of "the four" some months ago...)

Brilliant! I was one of those 4 people a couple of days ago, and I had almost settled on Sphinx as my search mechanism (it was the only indexer that had a sensible Django implementation), but now that you've put this out I'll give it a try!

Thanks for your work! And well... you can drink at home, just invite your friends over (writing over a Whisky Mac :P)

We have been using solrpy which was relatively easy to get running (took maybe 15 minutes from downloading to having working search), but django-solr does seem to offer improved integration and makes schema-handling easier.

Does your API allow you to pass additional headers to select calls (i.e. we forward referer information to Solr for logging purposes), or support more-like-this functionality?

We've pretty much agreed to switch to django-solr going forward though.

@fnl and @David. I was one of the 4 a year ago. Glad there is something out there you guys can use.

@Andrew. Just checked out solrpy; I'd never seen it before. Looks pretty decent, and I bet solango can learn a few things from it. As of now, we don't handle headers or more-like-this specifically. MLT is a Solr 1.3 thing and solango was written for 1.2. I'll add it to the fix-me list.

Excellent! Can't wait to get my hands on it.

Sean,

I got a chance to try it out and I MUST say that you have done an amazing job here. It is really very simple to get this up and running.

I would love to be involved in the project and would like to contribute if I find things that can make it better.

Can you set me up in Google Code?

Thanks,

Hello, I like the effort you have put into documenting this project. Congratulations on this and on this nice application.

As of now I am using djangosearch with the Solr backend, and there are 2 use cases covered there that I do not see in your documentation:

  • Query the documents added in a particular interval of time. I had to modify djangosearch to support this: http://code.google.com/p/djangosearch/issues/detail?id=11
  • "Advanced search": let the user select the content type he wants to query (Blog Post, Story, Event).

--yml

@yml.

  • I'll add it to the documentation, but something like this works: connection.select(date='[2009-01-31T23:59:59.999Z TO NOW]').
  • I'll add advanced search as well. It's kind of a UI thing, but it would be nice if there was an AdvancedSearchForm that was aware of facetable fields.

Thanks for your feedback.
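For anyone building that range string from Python datetimes, here is a minimal sketch. The helper names `solr_timestamp` and `solr_date_range` are made up for illustration and are not part of solango; the only thing taken from the reply above is the `[... TO NOW]` range syntax. It assumes timezone-aware datetimes.

```python
from datetime import datetime, timezone
from typing import Optional


def solr_timestamp(moment: datetime) -> str:
    """Format an aware datetime as a UTC ISO 8601 string for Solr."""
    utc = moment.astimezone(timezone.utc)
    # Zero-padded fields and a trailing 'Z'; milliseconds are optional
    # in Solr but kept here to match the example in the reply.
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


def solr_date_range(start: Optional[datetime] = None,
                    end: Optional[datetime] = None) -> str:
    """Build a Solr range expression; None leaves that side open."""
    lower = solr_timestamp(start) if start else "*"
    upper = solr_timestamp(end) if end else "NOW"
    return f"[{lower} TO {upper}]"


start = datetime(2009, 1, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)
date_filter = solr_date_range(start)
print(date_filter)  # [2009-01-31T23:59:59.999Z TO NOW]
```

The resulting string could then be passed the way the reply suggests, e.g. connection.select(date=date_filter), assuming select accepts the field name as a keyword argument.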

Sean, is there a Google group, mailing list, or IRC channel where I could get some help getting started?

I have created a search.py and updated my settings.py, but when I try to run --fields I get nothing.

    ./manage.py solr --fields

Do you have any idea what the issue could be? Thank you --yml

@yml

How about Multi-Value fields? I didn't see anything related to that.

@Sameer, check out the fields options for a SearchDocument.

Sean, using the FloatField was giving me 'type' not defined. Should the default type be set to float there?

    type = "float"

Sean, I couldn't find the code in fields that handles arrays in the "transform".

@Sameer. Posting comments is not the most efficient way of getting help. Use one of the following methods.

Hi, Thanks for your hard work. I'm using this to index a Japanese dictionary, with over 100k records. Unfortunately speed was a large problem at first, so I made some modifications that resulted in a very significant speed increase.

The best thing you could do is to only commit to Solr once at the end of reindexing. If you commit after every document you add, it becomes extremely slow when you have lots of documents.
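A rough sketch of that pattern, for illustration only: `CountingIndex` is a hypothetical stand-in that just records calls, not solango's API or a real Solr client. The point is that `reindex` calls `add` per document and `commit` exactly once.

```python
from typing import Dict, Iterable, List


class CountingIndex:
    """Hypothetical stand-in for a Solr client; it only records calls."""

    def __init__(self) -> None:
        self.pending: List[Dict] = []
        self.committed: List[Dict] = []
        self.commit_calls = 0

    def add(self, document: Dict) -> None:
        # Queue the document; nothing is searchable until commit().
        self.pending.append(document)

    def commit(self) -> None:
        # With a real Solr server this is the expensive step.
        self.committed.extend(self.pending)
        self.pending.clear()
        self.commit_calls += 1


def reindex(index: CountingIndex, documents: Iterable[Dict]) -> int:
    """Add every document, then commit once at the end."""
    count = 0
    for document in documents:
        index.add(document)
        count += 1
    index.commit()
    return count


index = CountingIndex()
entries = ({"id": n, "word": f"entry-{n}"} for n in range(1000))
total = reindex(index, entries)
print(total, index.commit_calls)  # 1000 1
```

Committing inside the loop instead would trigger 1000 commits for the same data, which is where the slowdown described above comes from.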

I'd also like a way to remove some of your default document fields which I don't use.

I could potentially contribute to your project by submitting a patch, but I'm not sure how to. If you tell me then I'd be glad to help.

Thanks

Django Solr is the coolest open source project I've seen in 2009. Kudos to Sean and his coworkers at Optaros.

Hi Sean,

A newbie question: it is my understanding that django-solr lets you create/index Solr documents out of your Django models. Correct? Wrong?

Can django-solr take my existing Solr index and make it searchable via the web? I have an existing Solr index and would like to use Django to make it viewable. (I don't need to index my models; the Solr index gets created from another system automatically.)

Thanks. Antonio

This is an awesome project, thanks Sean!

mm... luv it :)

I'm a developer out of San Francisco, CA, working at a startup.

This space will deal with the work I've participated in using the Django framework to build applications for enterprise clients.

