Multiple database connection: a simple use case

Django    2008-10-15

A few days ago, I posted (here and on Google's Django Users group) about my particular multi-DB situation. In brief, we have two different projects (and possibly more going forward) that have unique codebases and separate databases, but will share at least one object, a common type of content.

We only want to maintain the model and data for that content in one place, so I created a parent model and database as a separate, third project - a "core" project - that can be extended by the other two.

I want to point out that this is a really simple example - I just needed to figure out a way to establish a connection with a second database, but all we're doing with it is retrieving, updating, and saving to a single table. I haven't started dealing with joins yet, or caching, or any of those things that might make relationships between the two data sets tricky. And so I'm not doing any switching between, say, MySQL and Postgres, and certainly not throwing Oracle into the mix - all three databases are MySQL.

The model inheritance piece is pretty straightforward. (The Django documentation on model inheritance is very thorough - you might want to stop now and take a look at it, or at least bookmark it for later: http://docs.djangoproject.com/en/dev/topics/db/models/#id4.)

We have a parent model in "coreproject" that looks something like this:

class CoreContent(models.Model):
    title = models.CharField(max_length=255)
    slug = models.SlugField(max_length=255, unique=True)
    created_at = models.DateTimeField(auto_now_add=True)
    yaddayadda = models.CharField(max_length=255)

    class Meta:
        abstract = True


And in another project (I'm just going to call it "child project"), a child model like this:

from coreproject.content.models import CoreContent

class Content(CoreContent):
    def __unicode__(self):
        return u"%s" % self.title


With these two models alone, a view method in the child project would return content from the child database:

    content = Content.objects.all()


But we want it to retrieve records from that common coreproject database instead.

Now, I got a really great headstart on all this from Malcolm Treddinick, in his response to my Django users post "Multiple projects, model inheritance, and one common database". What he suggested was creating a QuerySet object and then setting the connection. I also found this older post from Simon Willison, leading to the Django users thread on the multi-database API proposal, that suggests a Query object first. Malcolm's suggestion is what I went with, but for my purposes they amount to essentially the same thing:

class MyManager(Manager):
    def get_query_set(self):
        qs = QuerySet(self.model)
        qs.query.connection = self.set_connection()  
        # pass the results off to another var:
        qset = qs
        # then close the connection if you need to return to your original db settings:
        qs.query.connection.close()
        return qset


So, back to the coreproject model to create a custom model manager to establish the connection parameters.

from django.db import models
from django.db.models.query import QuerySet
from django.db.models.sql.query import Query
from django.db.backends.mysql.base import DatabaseWrapper

class CoreContentManager(models.Manager):
    def get_query_set(self):
        qs = QuerySet(self.model)
        qs.query.connection = self.set_connection()  
	# logging out qs.query.connection should give you a DatabaseWrapper object:
	# <django.db.backends.mysql.base.DatabaseWrapper object at 0x170c300>
        qset = qs
        qs.query.connection.close()
        return qset

    def set_connection(self):
        self.connection = DatabaseWrapper()
        settings.DATABASE_NAME = "coreproject"
        return self.connection


It's that simple, at least in my case. I'm overriding the get_query_set method, and when I create the Query object, calling another method - set_connection() - that changes one of the db settings for this model. (This assumes that my coreproject database lives in the same place as the other two child databases, same user/passwd, etc., just a different database name.)

It took a bit of poking at the internal DatabaseWrapper class to figure out what to pass in to change the settings. I'm sure there are other ways this could be done, and there may be hidden costs that I haven't seen yet to changing the db settings on the fly here in the manager. I'll letcha know what happens.

So, now that I've got this custom manager forcing the model to point to the coreproject database, I need to revisit the parent CoreContent model and make a few small changes:

class CoreContent(models.Model):
    title = models.CharField(max_length=255)
    slug = models.SlugField(max_length=255, unique=True)
    created_at = models.DateTimeField(auto_now_add=True)
    yaddayadda = models.CharField(max_length=255)

    objects = models.Manager()
    contents = CoreContentManager()

    class Meta:
        abstract = True


See what I did there? The default QuerySet "objects" will retrieve from the local database of the project that calls it, but "contents" will be handled by the CoreContentManager - and will pull from the coreproject database. (For a little more information on how that works, take a look at the Django documentation on custom managers.)

So back down in my child project, that line where I'm doing the select in the view method changes to this:

    content = Content.contents.all()


A few extra notes:

One of the things I did while playing around in my custom manager was to figure out how to create a Python connection object:

>> import MySQLdb as Database

>> connection = Database.connect(user='myusername', passwd='mypassword', db='mycoredb', host=settings.DATABASE_HOST ... )


But that's not what Query wants - when you pass back a new connection setting, it needs to be an instance of the DatabaseWrapper.

It's probably worth reading through that DatabaseWrapper class and a few others to educate yourself - drill down to /django/db/backends/mysql/base.py (or look at it here).

The Python Database API Specification v2.0 was also an informative read.