screeley.com

Django Sitemaps and a Better ping_google Function

Jan.19

In an effort to play around with the sitemaps contrib package, I decided to add a sitemap to this website. It's pretty intuitive and easy, so there is really no excuse not to put a sitemap on your site: it takes about 15 minutes and two changes. The real effort comes when you want to notify search engines other than Google. Let's start by adding a sitemap, then we'll show how to ping Yahoo and Ask as well.

Step One: Add 'django.contrib.sitemaps' to your INSTALLED_APPS.
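In a settings file that might look like the fragment below. Only 'django.contrib.sitemaps' comes from this post; the other entries are illustrative placeholders. 'django.contrib.sites' is listed on the assumption that your Django version, like the one this post was written against, looks up the current site's domain when building the absolute sitemap URL.

```python
# settings.py (fragment). Only 'django.contrib.sitemaps' is required by
# this post; the remaining entries are placeholders for your own apps.
INSTALLED_APPS = (
    'django.contrib.contenttypes',
    'django.contrib.sites',      # assumed: used to look up the site's domain
    'django.contrib.sitemaps',
    'coltrane',
)
```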

Step Two: Modify your root URLconf:

from django.contrib.sitemaps import GenericSitemap
# Import your own models
from coltrane.models import Entry, Link

sitemaps = {
    'entries': GenericSitemap({'queryset': Entry.objects.all(),
                               'date_field': 'pub_date'},
                              changefreq='never', priority=0.4),
    'links': GenericSitemap({'queryset': Link.objects.all(),
                             'date_field': 'pub_date'},
                            changefreq='never', priority=0.4),
}

# Add this entry to your urlpatterns (note the escaped dot in the regex)
url(r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap',
    {'sitemaps': sitemaps}, name='sitemap')

Be forewarned: if you use the tagging module, you will probably have to move the sitemap declarations out of urls.py because of the tagging register function. Otherwise it will throw something like 'Model already registered'.

Ping Google, Yahoo, Ask and Windows Live

What I don't like about the sitemaps package is the ping_google() function. How about we rename it to 'ping_search_engines' and allow you to ping Yahoo and Ask.com as well? Here's what I did to add that functionality.

First we need to add the additional ping URLs to your settings file. This is a good reference for adding sitemaps.

SITEMAP_PING_URLS = (
    'http://search.yahooapis.com/SiteExplorerService/V1/ping',
    'http://submissions.ask.com/ping',
    'http://www.google.com/webmasters/tools/ping',
    'http://webmaster.live.com/ping.aspx',
)

Next we create a dispatcher to iterate over those URLs and send the updated sitemap.

from django.db.models.signals import post_save
from django.contrib.sitemaps import ping_google
from django.core.urlresolvers import reverse
from django.conf import settings

from coltrane.models import Entry

def ping_search_engines(sender, instance, created, **kwargs):
    # Business logic: I only send the sitemap when an entry is created.
    if created:
        # Resolve the sitemap URL once here, so ping_google
        # doesn't do it for every iteration.
        sitemap_url = reverse('sitemap')
        for ping_url in settings.SITEMAP_PING_URLS:
            ping_google(sitemap_url, ping_url)

post_save.connect(ping_search_engines, sender=Entry)
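One thing the handler does not cover is a failed ping. Each ping is a network request, so an unreachable search engine could raise from inside the post_save handler and interrupt the save. Here is a minimal sketch of a failure-tolerant loop. The `ping_all` helper and its `ping` argument are my own names, not part of Django; `ping` is assumed to be any callable that takes `(sitemap_url, ping_url)` in the same order as `ping_google`.

```python
import logging

logger = logging.getLogger(__name__)


def ping_all(ping_urls, sitemap_url, ping):
    """Call ping(sitemap_url, ping_url) for each URL; return the URLs that failed.

    `ping` is a hypothetical stand-in for any callable with the same
    (sitemap_url, ping_url) argument order as ping_google.
    """
    failed = []
    for ping_url in ping_urls:
        try:
            ping(sitemap_url, ping_url)
        except Exception:
            # Broad on purpose: a ping can fail with a variety of network
            # and HTTP errors, and none of them should abort the save.
            logger.warning("Sitemap ping failed for %s", ping_url, exc_info=True)
            failed.append(ping_url)
    return failed
```

Inside the signal handler you would then call something like `ping_all(settings.SITEMAP_PING_URLS, reverse('sitemap'), ping_google)` in place of the bare loop.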

With this in place we now send our update requests to Google, Yahoo and Ask. Per their ping API, Yahoo doesn't require you to register your site. Per their help docs, Ask doesn't require any extra work either. However, for Google to accept your ping you need to register your site through Google's Webmaster Tools. It's a simple process, similar to the one below for Windows Live Search.

Windows Live Search:

In true Microsoft fashion, they had to be different. Using the method I described above you can send a notification to Windows Live Search, but I have no idea if it's going to work. Why, you ask? Their GET parameter is 'siteMap' instead of the Django hard-coded value of 'sitemap'. Playing around with the service, I always receive the same success message no matter what I pass in the GET params. It wouldn't be that hard to fix, but I'm lazy and I'll just wait for this patch.

To make sure my sitemap was added, I used the Windows Live Webmaster tools located here. The first step is to get a Microsoft Live ID and sign in to use the tools. Select "add a site" and give them your website URL and sitemap.xml URL. Once you submit this you will get a meta tag for authentication that looks something like this:

<meta name="msvalidate.01" content="4113FD13B20DAC19358E560DAD5DF201" />

You can add it to the base.html, or for cleaner results just put it on the template that your homepage hits.
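As for the 'siteMap' versus 'sitemap' mismatch, here is a rough sketch of what a fix could look like if you built the ping URL yourself rather than waiting for the patch. Both function names and the `PING_PARAM_NAMES` mapping are hypothetical, not part of Django, and the code uses Python 3's `urllib.parse` rather than the Python 2 `urllib` this post was written against.

```python
from urllib.parse import urlencode

# Hypothetical per-engine overrides; engines not listed use 'sitemap'.
PING_PARAM_NAMES = {
    'http://webmaster.live.com/ping.aspx': 'siteMap',
}


def build_ping_url(ping_url, sitemap_url, param_name='sitemap'):
    """Return ping_url with the sitemap location attached as a query parameter."""
    return '%s?%s' % (ping_url, urlencode({param_name: sitemap_url}))


def ping_url_for(ping_url, sitemap_url):
    """Build the ping URL using the parameter name that engine expects."""
    param_name = PING_PARAM_NAMES.get(ping_url, 'sitemap')
    return build_ping_url(ping_url, sitemap_url, param_name)
```

The resulting URL could then be fetched with whatever HTTP client you prefer. Whether Windows Live actually honours the parameter is still something I couldn't confirm from their response.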

Comments

nice!

i'm using this guide to setup my sitemaps... can u believe i didn't know django.contrib.sitemaps existed?!

How about submitting that as a patch so that others can use contrib.sitemaps to ping non-Google search engines?

@Tom. Looks like someone skipped the last paragraph. Check out the existing patch.

I'm a developer out of Boston MA and I work for a consulting firm specializing in open source technologies.

This space will deal with the work I've participated in using the Django framework to build applications for enterprise clients.

Finally, I hate the word blog and Drupal.
