Setting up Django-allauth

So I have wanted a good solution to OpenID authentication in my Django projects for a while now. I had been hearing good things about django-allauth for a while, and it seems to be the project with the most traction at the moment. This presentation was especially useful in convincing me to opt for django-allauth: https://speakerdeck.com/tedtieken/signing-up-and-signing-in-users-in-django-with-django-allauth

In hindsight I would say that setting up django-allauth for a new project is quite easy, though it did take me a couple of hours of trial and error to get it right. This post is intended to remind me how I did it and maybe help others.

One of the biggest head-scratchers came right at the beginning. Namely, how do I install this thing?? To be honest I’m still not sure I did this correctly.

So first I set up my virtualenv, pip installed what I needed and started a new Django project. Aside from my ‘essentials’ (django-extensions, django-debug-toolbar etc.), I pip installed django-allauth, which has its own dependencies.
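For the record, the install itself is just the standard pip command, which also pulls in allauth's own dependencies:

    $ pip install django-allauth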

1. So, much like my previous go-to for authentication, django-registration, this package has its own templates which will inevitably require customization at some point. So my solution was to grab the django-allauth source code from GitHub and copy just the templates directory into my_project/allauth/. The allauth/ directory is completely empty except for the templates/ directory. This seems to work, and means that I am free to customize those templates as part of my project.

2. Now with django-allauth installed, I need to set it up in Django. From my point of view this comprises essentially two setup paths: one in the Django code so that allauth can function, and the other via the Django admin to set up the various provider APIs. The second may seem optional, but without any social app APIs configured there will be no social login, so I would deem it pretty much essential.

So the Django code configuration steps run like this:

Note: more expansive details on this section can be found here.

a) Add the following to settings.py:

TEMPLATE_CONTEXT_PROCESSORS = (
    ...
    "django.core.context_processors.request",
    "allauth.account.context_processors.account",
    "allauth.socialaccount.context_processors.socialaccount",
    ...
)

AUTHENTICATION_BACKENDS = (
    ...
    "django.contrib.auth.backends.ModelBackend",
    "allauth.account.auth_backends.AuthenticationBackend",
    ...
)

INSTALLED_APPS = (
    ...
    'allauth',
    'allauth.account',
    'allauth.socialaccount',
    # ... include the providers you want to enable:
    'allauth.socialaccount.providers.bitly',
    'allauth.socialaccount.providers.dropbox',
    'allauth.socialaccount.providers.facebook',
    'allauth.socialaccount.providers.github',
    'allauth.socialaccount.providers.google',
    'allauth.socialaccount.providers.linkedin',
    'allauth.socialaccount.providers.openid',
    'allauth.socialaccount.providers.persona',
    'allauth.socialaccount.providers.soundcloud',
    'allauth.socialaccount.providers.stackexchange',
    'allauth.socialaccount.providers.twitch',
    'allauth.socialaccount.providers.twitter',
    'allauth.socialaccount.providers.vimeo',
    'allauth.socialaccount.providers.vk',
    'allauth.socialaccount.providers.weibo',
    ...
)
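One detail that is easy to miss: allauth builds on Django's sites framework (which is why the admin steps later involve choosing a site), so 'django.contrib.sites' needs to be among the apps hidden behind the '...' above, along with a SITE_ID in settings.py:

# allauth depends on django.contrib.sites being installed
SITE_ID = 1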

b) Add the allauth/templates directory to TEMPLATE_DIRS. I have recently started using Unipath, after reading the excellent Two Scoops of Django book.

from unipath import Path

# the number you use as an argument for ancestor may vary
# depending on where your settings.py is located. 
# I have mine in a settings module 
# (e.g. my_project/settings/base.py).

PROJECT_DIR = Path(__file__).ancestor(3)

TEMPLATE_DIRS = (
    ...
    PROJECT_DIR.child("allauth").child("templates"),
    ...
)

c) Add /accounts to the project urls.py

urlpatterns = patterns('',
    ...
    (r'^accounts/', include('allauth.urls')),
    ...
)

d) Now do an initial syncdb. Because I want to build all tables initially, I did:

./manage.py syncdb --all

3. So now that Django is set up to play nice with django-allauth, I can jump into the Django admin and start configuring my app APIs. These should mirror the providers listed in the INSTALLED_APPS tuple in settings.

Personally I wanted Google and LinkedIn logins, so I set about obtaining credentials for each of these:

Google

https://code.google.com/apis/console/#:access

LinkedIn

https://www.linkedin.com/secure/developer?newapp=

Twitter

https://dev.twitter.com/apps

Though the interfaces for registering apps vary, for our purposes we only need a couple of values:

Client ID: something along the lines of Client ID or API key

Secret Key: the secret key part

These details can be added at localhost:8000/admin/socialaccount/socialapp/

I also had to swap out the default site (www.example.com) for localhost:8000 (or wherever the devserver is running), and add it to ‘chosen sites’.

(Screenshot: the ‘Change social app’ page in the Django site admin, with the local site added to the chosen sites.)

Hey presto! I can now log in using my LinkedIn or Google credentials.
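For completeness: the stock templates under /accounts/ (e.g. /accounts/login/) already render links for the configured providers, but if you want login links elsewhere in your own templates, allauth's socialaccount template tags do the job. A minimal sketch, assuming the Google and LinkedIn providers set up above:

{% load socialaccount %}

<a href="{% provider_login_url 'google' %}">Sign in with Google</a>
<a href="{% provider_login_url 'linkedin' %}">Sign in with LinkedIn</a>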


Setup Memcached for Django in a Development Environment

So I just started adding a caching layer for a large Django project and found that the initial setup was much less painful than I expected. Here are the steps so far:

1. Install memcached system-wide.

sudo apt-get install memcached

2. Install the Python bindings.

pip install python-memcached

3. Add cache settings to settings.py (or local settings for a specific configuration). Set the location to the local IP address and memcached's default port (11211).

CACHES = {
    'default': {
        'BACKEND':'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION':'127.0.0.1:11211',
    }
}

4. Test that everything works from the Django shell (./manage.py shell):

>>> from django.core.cache import cache
>>> cache.get('foo')        # nothing cached yet, so this returns None
>>> cache.set('foo', 'bar')
>>> cache.get('foo')
'bar'

From here the Django cache framework can be used to cache the whole site, individual views, or template fragments: https://docs.djangoproject.com/en/dev/topics/cache/
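As a quick taste of per-view caching, the cache_page decorator is all it takes once the backend above is in place (article_list here is just a made-up view name):

from django.http import HttpResponse
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # cache this view's response for 15 minutes
def article_list(request):
    # any expensive queries in here only run when the cache is cold
    return HttpResponse("cached response")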

Django extensions

I just started using django-extensions and it is a really simple way to add some very useful features to Django.

Installation is really straightforward:

    $ pip install django-extensions

or

    $ easy_install django-extensions

I also needed to install pygraphviz:

    $ apt-get install python-pygraphviz

Then add django-extensions to INSTALLED_APPS:

INSTALLED_APPS = (
    ...
    'django_extensions',
    ...
)

The first command I wanted to use was graph_models, which creates a graphical relational diagram of the applications in the project. To visualize the whole project, with models grouped by application:

    $ ./manage.py graph_models -a -g -o my_project.png

and for specific apps:

    $ ./manage.py graph_models my_app | dot -Tpng -o my_app.png

This is a really nice way to let yourself and others visualize the db schema at a glance.

Another insanely useful feature is shell_plus. Among other things, it autoloads your models into the shell:

    $ ./manage.py shell_plus

You should see all the models loaded before the prompt.

So no more

    >>> from myapp.models import *

There are plenty of other features described in the documentation, but even in the short time I have played with this package I can tell it will be an indispensable tool in my django toolbox.

Benchmarking queries in Django

So this is not a post specific to Django, but the examples I give are.

The problem I wanted to solve was benchmarking my Django ORM queries to find ways to improve performance. More specifically, I wanted to find the optimum way to count objects returned as querysets.

I stumbled across Max() and Count() and wanted a raw speed comparison between the two. It turns out there is a way to do this without leaving the Python shell.

This code is attributed to sleepycal on djangosnippets.org. Original post can be found here.

Here it is:


>>> import time
>>> from django.db.models import Max, Count
>>> _t = time.time(); x = Article.objects.aggregate(Max('id')); "Took %ss"%(time.time() - _t )
'Took 0.00190091133118s'

and using Count():


>>> _t = time.time(); x = Article.objects.aggregate(Count('id')); "Took %ss"%(time.time() - _t )
'Took 1.34142112732s'

As you can see – quite a significant difference on ~108 000 rows. This is a good way to get a quick look at how a queryset call performs when there is more than one way in which it can be constructed.
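To save retyping the _t = time.time() dance for every comparison, a small context manager does the same job and keeps the shell session readable. Just a sketch, in plain Python:

import time
from contextlib import contextmanager

@contextmanager
def timed(label):
    # print how long the wrapped block of statements took
    start = time.time()
    yield
    print("%s took %.5fs" % (label, time.time() - start))

# usage in the shell:
# >>> with timed('aggregate Max'):
# ...     Article.objects.aggregate(Max('id'))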

Accessing Django Runserver across a network

Here is a simple trick which I have found immensely useful when it comes to browser compatibility testing. Basically, I want to run my Django development server on one machine (in my case a Linux box) and then connect from a different machine (say, running MS Windows with IE).

I have done this before with Apache and PHP projects, but when I tried with my.ip:8000 from another machine I just got my Apache welcome page. The trick, which I found here, was to first stop Apache like so:

$ sudo /etc/init.d/apache2 stop

Then start the Django runserver on port 80 like so:

$ sudo python manage.py runserver 0.0.0.0:80

And voilà! I can connect to the development server using my.ip:80 and test browser compatibility easily, as long as I have the relevant OS/browser combination lying around on another machine.
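As an aside, the sudo is only needed because ports below 1024 require root. If port 80 is not a requirement, the devserver can simply be bound to all interfaces on its usual port and reached at my.ip:8000:

$ python manage.py runserver 0.0.0.0:8000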

Multiple database implementation in Django

New with Django 1.2 came multiple database support.

It just so happened that I was putting together such a project and had need of this feature. It was one of those tasks which seemed dauntingly complicated at first, though once I got it working it seemed surprisingly easy. However, the documentation, of which the official docs are the best offering, left me scratching my head for some time. So in this post I aim to lay things out a little more clearly for someone who is attempting this for the first time, and of course as a reference for myself.

My use case may or may not be typical but I would bet it is not rare. I had two independent Django projects which had an app each. The time came when these needed to be two apps in a larger project, but they still required discrete databases. So I rolled one into the other and dealt with my need for multi-db support by using the manual method.

This definitely worked but was more of a stop-gap until I worked up the courage to tackle the automatic routing method. My primary motivation was that I needed to have access to both apps in the admin, which required the automatic method. Aside from that, the frequency of calls using the manual method, such as:

item = Item.objects.using('my_db_2').all() 

was getting ugly and daunting to keep track of.
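To give a sense of why: with the manual method the database alias has to be threaded through writes and deletes as well as reads. A rough sketch (Item and its fields here are just placeholders):

# every ORM call that touches the second app needs the alias spelled out
items = Item.objects.using('my_db_2').filter(active=True)

item = Item(name='example')
item.save(using='my_db_2')    # writes too

item.delete(using='my_db_2')  # and deletes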

So when it came to it, implementing automatic routing came down to three main steps:

  1. Define database connections in myproject/settings.py:
    DATABASES = {
        'default': {
            'NAME': 'db1',
            'ENGINE': 'django.db.backends.mysql',
            'USER': 'myuser1',
            'PASSWORD': 'mypass1',
        },
        'my_db_2': {
            'NAME': 'db2',
            'ENGINE': 'django.db.backends.mysql',
            'USER': 'myuser2',
            'PASSWORD': 'mypass2',
        }
    }
  2. Define router in myproject/myapp2/routers.py:
    class MyApp2Router(object):
        """
        A router to control all database operations on models in
        the myapp2 application
        """
    
        def db_for_read(self, model, **hints):
            """
            Point all operations on myapp2 models to 'my_db_2'
            """
            if model._meta.app_label == 'myapp2':
                return 'my_db_2'
            return None
    
        def db_for_write(self, model, **hints):
            """
            Point all write operations on myapp2 models to 'my_db_2'
            """
            if model._meta.app_label == 'myapp2':
                return 'my_db_2'
            return None
    
        def allow_syncdb(self, db, model):
            """
            Make sure the 'myapp2' app only appears on the 'my_db_2' database
            """
            if db == 'my_db_2':
                return model._meta.app_label == 'myapp2'
            elif model._meta.app_label == 'myapp2':
                return False
            return None
    

    This is pretty much a copy and paste job from the official example. The one key point left out was where to define this. It turns out that myapp2/models.py is not the right place and something like myapp2/routers.py is!

  3. Now all that remains is to tell our project about the router. In myproject/settings.py add:
    DATABASE_ROUTERS = ['myapp2.routers.MyApp2Router',]
    

You can see that throughout I am only concerned with my second app (myapp2). This is because, for the time being, myapp1 can rely on the default routing behaviour that comes out of the box with Django. Of course at some point I may well need a router for that app, or an additional one, which is when I would define a new router, hook it up to said app and add the router path to the settings.
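Two practical notes that go with this. First, syncdb operates on one connection at a time, so it needs to be run against each database (and given the allow_syncdb rules above, myapp2's tables only end up on my_db_2):

$ ./manage.py syncdb
$ ./manage.py syncdb --database=my_db_2

Second, once the router is active, the .using('my_db_2') calls shown earlier become unnecessary; the router supplies the alias behind the scenes.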

I am sure that there are a number of unexplored nuances to working with multiple databases in Django, and I admit I have not delved into the underlying code of this functionality, but for the time being everything seems to work as required. The really impressive part was seeing both apps in the admin without errors.

Thanks Django.

A keyword cloud in Django

Today I spent a large amount of time trying to do something which seemed very straightforward at first. I assume anyone with even a brief acquaintance with writing code is familiar with this experience.

Essentially I had an application which concerned itself with retrieving and displaying scientific articles from a database. Each article had zero or more keywords associated with it, and of course each unique keyword could be associated with one or more articles. This is the basis of a classic many-to-many relationship and so I coded my models as such:

# keywords
class Keyword(models.Model):
    keyword = models.CharField(max_length=355, blank=True)

# article
class Article(models.Model):
    volume = models.ForeignKey(Volume)
    title = models.TextField(blank=True)
    slug = models.SlugField(max_length=100, db_index=True)
    keywords = models.ManyToManyField(Keyword, related_name="keyword_set", null=True, blank=True)
    start_page = models.IntegerField(null=True, blank=True)
    end_page = models.IntegerField(null=True, blank=True)
    authors = models.ManyToManyField(Author, related_name="author_set", null=True, blank=True)
    file = models.CharField(max_length=765, blank=True)

Everything is pretty straightforward at this point. The Django ORM takes care of the association between these tables behind the scenes. It is enough to know that an intermediate (join) table is created at the database level to manage the relationship between keywords and articles, since relational databases do not provide a native many-to-many relationship between two tables.
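As an aside, the relationship can be walked from either side, and the related_name above is what gives Keyword instances their (slightly confusingly named) keyword_set manager back to articles; the Count() annotation below leans on the same name. A quick shell sketch, with made-up slug and keyword values:

>>> article = Article.objects.get(slug='some-article')
>>> article.keywords.all()       # the keywords attached to one article
>>> kw = Keyword.objects.get(keyword='stratigraphy')
>>> kw.keyword_set.all()         # the articles carrying that keyword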

Now, for my keyword cloud, what I wanted seemed quite simple: I need to get a count for each unique keyword (i.e. how many different articles refer to each keyword) and turn that into a relative weighting (popularity) which can be passed to a CSS class. The CSS class lets me colour/size each keyword according to its relative weighting.

The key is to get the count per keyword. After some experimentation in the shell (insanely useful for this type of stuff) and some discussion on StackOverflow I came up with the two key lines:

    # reverse lookup using the related_name 'keyword_set' manager
    # (this assumes `from django.db.models import Count` and that `excludes`
    # is a list of keyword strings to leave out of the cloud)
    keywords_with_article_counts = Keyword.objects.all().exclude(keyword__in=excludes).annotate(count=Count('keyword_set'))
    # make a list of dictionaries from the returned values, ordered by count descending
    keywords = keywords_with_article_counts.values('keyword', 'count').order_by('-count')

From here I could construct the rest of my view which would take care of the weightings and return a dictionary of keywords, weights and counts:

def keyword_cloud(request):
    # a function for generating a keyword cloud
    # (assumes `from django.db.models import Count` and that `excludes` is a
    # list of keyword strings to leave out of the cloud)
    # define maximum rank used as weight for the CSS classes
    MAX_WEIGHT = 5
    # define number of keywords to display in the cloud
    NUMBER_OF_KEYWORDS = 25
    # reverse lookup using the related_name 'keyword_set' manager
    keywords_with_article_counts = Keyword.objects.all().exclude(keyword__in=excludes).annotate(count=Count('keyword_set'))
    # make a list of dictionaries from the returned values, ordered by count descending
    keywords = keywords_with_article_counts.values('keyword', 'count').order_by('-count')[:NUMBER_OF_KEYWORDS]
    # set min_count and max_count to the highest returned count initially
    min_count = max_count = keywords[0]['count']
    for keyword in keywords:
        if keyword['count'] < min_count:
            min_count = keyword['count']
        if max_count < keyword['count']:
            max_count = keyword['count']
    count_range = float(max_count - min_count)
    if count_range == 0.0:
        count_range = 1.0
    for keyword in keywords:
        keyword['weight'] = int(
            MAX_WEIGHT * (keyword['count'] - min_count) / count_range
        )
    return {'keywords': keywords}

Then in my template I could display the cloud with just a couple of lines:

<div id="tagCloud">
	<h3>Keywords</h3>
	{% for keyword in keywords %}
		<a href="/search/?q={{ keyword.keyword|urlencode }}" class="tagCloud-{{ keyword.weight }}">{{ keyword.keyword }}</a>
	{% endfor %}
</div>

And the relevant CSS:

a.tagCloud-0 {
    font-size: x-small;
    color: #669EC2;
} 
a.tagCloud-1 {
    font-size: small;
    color: #6B66C2;
} 
a.tagCloud-2 {
    font-size:medium;
    color: #A666C2;
} 
a.tagCloud-3 {
    font-size:large;
    color: #C167AF;
} 
a.tagCloud-4 {
    font-size:larger;
    color: #0765D3;
}
a.tagCloud-5 {
    font-size:x-large;
    color: #0765D3;
}

Almost there. Left like this, the view/template displays an ordered cloud of keywords. However, what I really wanted was a fancy randomised list. I could shuffle my keywords in one of two ways. The first was in the view, by adding these lines at the end:

import random   # needed at the top of views.py

def keyword_cloud(request):
    ...
    # cast the keywords queryset to a list in order to use item assignment (required by random.shuffle)
    keywords = list(keywords)
    # shuffle the list in place
    random.shuffle(keywords)
    # return the randomised list
    return {'keywords': keywords}

Or I could add a custom template filter (in a templatetags/shuffle.py module, for example):

import random

from django import template

register = template.Library()

@register.filter
def shuffle(arg):
    # cast to a list so item assignment works (random.shuffle needs it,
    # and the argument may be a queryset rather than a plain list)
    tmp = list(arg)
    random.shuffle(tmp)
    return tmp

And then load it in the template, applying the filter in the for loop:

{% load shuffle %}
...
{% for keyword in keywords|shuffle %}

Either way worked for me, though I’m sure someone cleverer than I am can comment on the relative merits of each. Personally, I prefer to take care of this stuff in the view and leave the template less busy.

And that is my keyword cloud. Of course there are packaged alternatives such as django-tagging, but it was a useful exercise to do it myself and it enabled me to get the exact results I needed.