screeley.com

Django Solr

Feb. 2

After writing about a thousand lines of documentation, I'm spent. I'd like to write something witty or clever or fun, but I got nothing.

With some Optaros blessing I released a Django Solr module out into the wild. It was written by Jay Dolan for a top 20 newspaper site about 9 months ago, but it had to be rewritten for Django 1.0, and I got sidetracked with clients and Alfresco.

Google tells me that about 4 people a day search for "django solr". So, good news: I just made 4 people's day. My ego boo is out of this world right now.

On a less bitter note, the project is pretty damn cool. You set up a few search documents and all of a sudden all your models are searchable and facetable, with highlighted results. Dare I call it... magic.

Lessons Learned

  • Write documentation. It will quickly inform you where your code sucks.
  • It's really hard to justify to friends that going home and coding something you are going to give away is better than drinking.
  • Sphinx documentation is the best.
  • "Hey, we should open source that." I've often heard the sweet words fluttering softly around the office. I now know why it's never done. Simply because it's a lot more than writing a good piece of code. Making it easy to use and telling people how to use it is a lot harder.

In future posts I'll talk about how to use it. For now read the documentation.

Comments

Thank you! Sounds great, I'll try to use it in my projects asap! (I was one of "the four" some months ago...)

Brilliant! I was one of those 4 people a couple of days ago, and I had almost settled on Sphinx as my search mechanism (it was the only indexer that had a sensible Django implementation), but now that you've put this out I'll give it a try!

Thanks for your work! And well... you can drink at home, just invite your friends over (writing over a Whisky Mac :P)

We have been using solrpy which was relatively easy to get running (took maybe 15 minutes from downloading to having working search), but django-solr does seem to offer improved integration and makes schema-handling easier.

Does your API allow you to pass additional headers to select calls (i.e., we forward referer information to Solr for logging purposes), or support more-like-this functionality?

We've pretty much agreed to switch to django-solr going forward though.

@fnl and @David. I was one of the 4 a year ago. Glad there is something out there you guys can use.

@Andrew. Just checked out solrpy; I'd never seen it before. Looks pretty decent, and I bet solango can learn a few things from it. As of now, we don't handle headers or more-like-this specifically. MLT is a Solr 1.3 thing and solango was written for 1.2. I'll add it to the fix-me list.
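Until something like that lands, one workaround is to build the select request by hand. This is a minimal sketch using only the Python standard library, not solango's API: the base URL, the `title` and `content` field names, and the `build_select_request` helper are all assumptions made for illustration. The `mlt` and `mlt.fl` parameters are the standard Solr more-like-this parameters.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_select_request(base_url, query, referer=None, mlt_fields=None):
    """Build (but do not send) a Solr select request.

    Hypothetical helper: optionally forwards a Referer header and
    turns on Solr's more-like-this component for the given fields.
    """
    params = [("q", query)]
    if mlt_fields:
        params.append(("mlt", "true"))
        params.append(("mlt.fl", ",".join(mlt_fields)))
    headers = {}
    if referer:
        headers["Referer"] = referer
    return Request(base_url + "/select?" + urlencode(params), headers=headers)


# Assumed local Solr URL and field names; adjust to your own schema.
request = build_select_request(
    "http://localhost:8983/solr",
    "django",
    referer="http://example.com/search/",
    mlt_fields=["title", "content"],
)
print(request.full_url)
print(request.get_header("Referer"))
```

Passing the request to `urllib.request.urlopen` would send it to a running Solr instance; the sketch stops short of that so it runs without a server.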

Excellent! Can't wait to get my hands on it.

Sean,

I got a chance to try it out and I MUST say that you have done an amazing job here. It is really very simple to get this up and running.

I would love to be involved in the project and would like to contribute if I find things that can make it better.

Can you set me up in Google Code?

Thanks,

Hello, I like the effort you have put into documenting this project. Congratulations on this and on this nice application.

As of now I am using djangosearch with the Solr backend, and there are 2 use cases that are covered there that I do not see in your documentation:

  • Query the documents added in a particular interval of time. I had to modify djangosearch to support this: http://code.google.com/p/djangosearch/issues/detail?id=11
  • "Advanced search": let the user select the content type he wants to query: Blog Post, Story, Event.

--yml

@yml.

  • I'll add it to the documentation, but something like this works: connection.select(date='[2009-01-31T23:59:59.999Z TO NOW]').
  • I'll add advanced search as well. It's kind of a UI thing, but it would be nice if there was an AdvancedSearchForm that was aware of facetable fields.
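For building that range string from Python datetimes, here is a minimal sketch using only the standard library. Solr date fields expect UTC timestamps with zero-padded components and a trailing `Z`; the helper names `to_solr_date` and `solr_date_range` are hypothetical and not part of solango.

```python
from datetime import datetime, timezone
from typing import Optional


def to_solr_date(value: datetime) -> str:
    """Format a datetime as a Solr date: UTC, zero-padded, trailing Z."""
    if value.tzinfo is None:
        # Assumption: naive datetimes are already expressed in UTC.
        value = value.replace(tzinfo=timezone.utc)
    value = value.astimezone(timezone.utc)
    millis = value.microsecond // 1000
    return value.strftime("%Y-%m-%dT%H:%M:%S") + ".%03dZ" % millis


def solr_date_range(start: datetime, end: Optional[datetime] = None) -> str:
    """Build an inclusive range; an open upper bound becomes Solr's NOW."""
    upper = to_solr_date(end) if end is not None else "NOW"
    return "[%s TO %s]" % (to_solr_date(start), upper)


since = datetime(2009, 1, 31, 23, 59, 59, 999000)
date_filter = solr_date_range(since)
print(date_filter)  # [2009-01-31T23:59:59.999Z TO NOW]
```

The resulting string could then be passed the same way as in the reply above, for example connection.select(date=date_filter), assuming the document has a field named date.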

Thanks for your feedback.

Sean, is there a Google group, mailing list, or IRC channel where I could get some help getting started?

I have created a search.py and updated my settings.py, but when I try to run --fields I get nothing:

    ./manage.py solr --fields

Do you have any idea what the issue could be? Thank you --yml

@yml

How about Multi-Value fields? I didn't see anything related to that.

@Sameer, check out the fields options for a SearchDocument.

Sean, Using the FloatField was giving me 'type' not defined. Should the default type be set to float there?

type = "float"

Sean, I couldn't find the code in fields that handles arrays in the "transform".

@Sameer. Posting comments is not the most efficient way of getting help. Use one of the following methods.

Hi, Thanks for your hard work. I'm using this to index a Japanese dictionary, with over 100k records. Unfortunately speed was a large problem at first, so I made some modifications that resulted in a very significant speed increase.

The best thing you could do is to only commit to Solr once at the end of reindexing. If you commit after every document you add, it becomes extremely slow when you have lots of documents.
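A minimal sketch of that pattern: send documents in batches and commit once after the last one. `RecordingIndex` is a hypothetical stand-in that only counts calls, so the example runs without a Solr server; a real client's `add` and `commit` methods may be named differently, and the batch size of 500 is an arbitrary assumption.

```python
class RecordingIndex:
    """Hypothetical stand-in for a Solr client; it only counts calls."""

    def __init__(self):
        self.added = 0
        self.commits = 0

    def add(self, docs):
        self.added += len(docs)

    def commit(self):
        self.commits += 1


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def reindex(index, documents, batch_size=500):
    """Add documents in batches, then commit a single time at the end."""
    for batch in batches(documents, batch_size):
        index.add(batch)
    index.commit()


index = RecordingIndex()
reindex(index, ({"id": n} for n in range(100000)), batch_size=500)
print(index.added, index.commits)  # 100000 1
```

Committing per document would make 100,000 commit calls for the same data; here there is exactly one.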

I'd also like a way to remove some of your default document fields which I don't use.

I could potentially contribute to your project by submitting a patch, but I'm not sure how to. If you tell me then I'd be glad to help.

Thanks

Django Solr is the coolest open source project I've seen in 2009. Kudos to Sean and his coworkers at Optaros.

Hi Sean,

A newbie question: it is my understanding that django-solr lets you create/index Solr documents out of your Django models. Correct? Wrong?

Can Django-Solr take my existing Solr index and make it searchable via the web? I have an existing Solr index and would like to use Django to make it viewable. (I don't need to index my models; the Solr index gets created automatically by another system.)

Thanks. Antonio

This is an awesome project, thanks Sean!

mm... luv it :)


I'm a developer out of Boston, MA, and I work for a consulting firm specializing in open source technologies.

This space will deal with the work I've participated in using the Django framework to build applications for enterprise clients.

Finally, I hate the word blog and Drupal.
