Django, Haystack and Elasticsearch – Part 1

I’m wrapping up a little side project at the moment (more on that very soon) which required full-text search, autocomplete, and a few other bits of search-related functionality.

After some research I landed upon the combination of Elasticsearch and the awesome Django application Haystack.

First step was to get Elasticsearch up and running locally on OS X…

1) Download latest zip from A good spot is:


2) Create the following directories:


3) Add the following to your .profile (allows you to run Elasticsearch from the command prompt without the full path):

# elasticsearch
export ES_HOME=/opt/elasticsearch-1.1.x
export PATH=$ES_HOME/bin:$PATH


4) Update the following values in the Elasticsearch config file:

# /opt/elasticsearch-1.1.x/config/elasticsearch.yml false [""] elasticsearch
http.port: 9200
path.conf: /opt/elasticsearch-1.1.x/config
path.data: /opt/elasticsearch-1.1.x/data
path.work: /opt/elasticsearch-1.1.x/work
path.logs: /opt/elasticsearch-1.1.x/logs

5) Ensure all requirements are installed (django-haystack, pyelasticsearch, requests, simplejson):

pip install django-haystack
pip install pyelasticsearch


6) You should now be able to start Elasticsearch:

$ elasticsearch
[2014-05-14 08:15:05,257][INFO ][node ] [Aminedi] version[1.1.1], pid[46224], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-14 08:15:05,258][INFO ][node ] [Aminedi] initializing …
[2014-05-14 08:15:05,271][INFO ][plugins ] [Aminedi] loaded [], sites []
[2014-05-14 08:15:07,136][INFO ][node ] [Aminedi] initialized
[2014-05-14 08:15:07,136][INFO ][node ] [Aminedi] starting …
[2014-05-14 08:15:07,211][INFO ][transport ] [Aminedi] bound_address {inet[/]}, publish_address {inet[/]}
[2014-05-14 08:15:10,262][INFO ][cluster.service ] [Aminedi] new_master [Aminedi][X4diAes4TrOMTk4eAdbhnA][mbp.home][inet[/]], reason: zen-disco-join (elected_as_master)
[2014-05-14 08:15:10,284][INFO ][discovery ] [Aminedi] elasticsearch/X4diAes4TrOMTk4eAdbhnA
[2014-05-14 08:15:10,298][INFO ][http ] [Aminedi] bound_address {inet[/]}, publish_address {inet[/]}
[2014-05-14 08:15:10,784][INFO ][gateway ] [Aminedi] recovered [1] indices into cluster_state
[2014-05-14 08:15:10,785][INFO ][node ] [Aminedi] started
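
Once the log shows the node as started, a quick way to confirm it is answering HTTP requests is to hit the root endpoint from Python. This is only a minimal sketch using the standard library: the `elasticsearch_info` helper is a hypothetical name, and the `http://localhost:9200` address is an assumption based on running locally with the `http.port` value configured above.

```python
import json
from http.client import HTTPException
from urllib.error import URLError
from urllib.request import urlopen


def elasticsearch_info(base_url="http://localhost:9200", timeout=2.0):
    """Return the parsed JSON from the root endpoint, or None if unreachable.

    The default URL is an assumption (local node, port 9200).
    """
    try:
        with urlopen(base_url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (URLError, OSError, HTTPException, ValueError):
        # Connection refused, timeout, malformed reply or invalid JSON.
        return None


if __name__ == "__main__":
    info = elasticsearch_info()
    if info is None:
        print("Elasticsearch is not reachable")
    else:
        print("Elasticsearch version:", info.get("version", {}).get("number"))
```

If the helper returns `None`, check that the process is still running and that the port matches the one in elasticsearch.yml.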

7) Add Haystack to your Django config:

# add to installed apps
INSTALLED_APPS = (
    # ... your existing apps ...
    'haystack',
)
# haystack search using elasticsearch
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': '',
        'INDEX_NAME': 'haystack',
    },
}
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
# increase the default number of results (from 20)


8) After you’ve added your search indexes, you can use manage.py to rebuild the search index:

$ python manage.py rebuild_index

Useful Python and Django Packages – Update

In a follow up to this post, here’s what I’ve been making use of lately:

HTML/XML parser for quick-turnaround applications like screen-scraping.

Python Imaging Library (Fork)

A high-level Python Screen Scraping framework

Object-oriented alternative to os/os.path/shutil

Distributed Task Queue

Easy to use generic breadcrumbs system for Django framework.

Django model mixins and utilities

Easy integration between Django and supervisord

A django package that allows easy identification of visitor’s browser, operating system and device information (mobile phone, tablet or has touch capabilities).

Tweak the form field rendering in templates, not in python-level form definitions.

Official Python driver for the Factual public API

A slug generator that turns strings into unicode slugs.

A library to identify devices (phones, tablets) and their capabilities by parsing (browser) user agent strings

Virtual Python Environment builder

Enhancements to virtualenv

Pytz 2013b Fails with PIP 1.4

UPDATE: Pytz is now using a numeric naming convention (i.e. 2013.7) which fixes this bug.

pip 1.4 only installs stable versions by default, so anything with a pre-release version name is ignored. Unfortunately the Pytz naming convention looks like a beta release (2013b) when it is actually the second release of 2013.

More on this bug can be found here:

For now the workaround is to add the version when installing:

pip install pytz==2013b
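
To see why 2013b gets skipped, here is a rough sketch of a pre-release check. It is a simplified approximation for illustration only, not pip's actual version parsing, and the `looks_like_prerelease` name is hypothetical: any version that ends in a, b, c or rc (optionally followed by digits) is treated as a pre-release.

```python
import re

# Simplified approximation: digits and dots, then a pre-release marker.
# pip's real version handling is more involved than this.
_PRERELEASE = re.compile(r"^\d+(\.\d+)*(a|b|c|rc)\d*$")


def looks_like_prerelease(version):
    """Return True if the version string ends in an alpha/beta/rc marker."""
    return bool(_PRERELEASE.match(version))


for candidate in ("2013b", "2013.7", "1.0rc1"):
    print(candidate, looks_like_prerelease(candidate))
```

Under this rule "2013b" reads as a beta, while the newer numeric style "2013.7" does not, which matches the behaviour described above.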

Useful Python and Django Packages

When I first started out with Python / Django it quickly became apparent that there are a ton of open source packages out there. Here’s a rundown of some of the ones I’ve found to be most helpful so far…

Python interface to Amazon Web Services.

Compresses linked and inline javascript or CSS into a single cached file.

Useful extensions for the Django framework.

Utilities for implementing a modified pre-order traversal tree.

A Django field for custom model ordering.

Helping you remember to do the stupid little things to improve your Django site’s security.

A collection of custom storage backends for Django.

A fully tested, abstract interface to creating OAuth clients and servers.

A Python-based API for communicating with the memcached distributed memory object cache daemon.

pytz brings the Olson tz database into Python.

simplejson is a simple, fast, extensible JSON encoder/decoder for Python.

Intelligent schema and data migrations for Django projects.

Twitter for Python!

Werkzeug is a WSGI utility library for Python.

Command-line tool for querying PyPI and Python packages installed on your system.

I’ll try to keep this list updated as I find more. Please add any must haves in the comments.

Dynamic Keyword Arguments (**kwargs)

Probably rarely needed, but it comes in handy when the name of a keyword argument is only known at runtime: build a dict and unpack it with **.

arg_name = 'myarg'
some_method(**{arg_name: 'value'})
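
A slightly fuller sketch of the same trick, assuming you want to build the argument name from pieces (much like the field__lookup style of keyword used in Django queryset filters). Both `build_lookup` and `fake_filter` are hypothetical names used only for illustration; `fake_filter` simply echoes back the keyword arguments it receives.

```python
def build_lookup(field, operator, value):
    """Build a one-item dict whose key is assembled at runtime."""
    return {"{0}__{1}".format(field, operator): value}


def fake_filter(**lookups):
    """Hypothetical stand-in for any function that accepts keyword arguments."""
    return sorted(lookups.items())


# The keyword name 'title__icontains' never appears literally in the call.
print(fake_filter(**build_lookup("title", "icontains", "search")))
# [('title__icontains', 'search')]
```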


Cleanup PYC Files

These should be regenerated automatically, but I’ve noticed that if you restructure your code (e.g. move things around) stale ones can be left behind and get out of whack. Best to clean them up from time to time…

find . -name '*.pyc' -exec rm {} \;
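
If find isn't available (on Windows, say), roughly the same cleanup can be done from Python. This is a sketch using only the standard library; the `remove_pyc_files` name is hypothetical.

```python
import os


def remove_pyc_files(root="."):
    """Delete every *.pyc file under root and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".pyc"):
                path = os.path.join(dirpath, filename)
                os.remove(path)
                removed.append(path)
    return removed
```

Calling remove_pyc_files('.') from the project root mirrors the find command above, and the returned list shows what was deleted.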
