QuerySets, Q Objects, datetime, and Far Too Much Context

Django    Django1.3    PostgreSQL    2012-08-07

Currently I'm working on a project that involves collections of activities that take place over the course of 12 weeks. I'm using Postgres, Django 1.3, and the DateTimeField class (which translates to the timestamptz type in Postgres) to handle dates in my models. (A note: Although it creates a few complications for me, I opted to use DateTimeField instead of DateField because there will be use cases that call for a time value, just not the one I'm about to outline. If you don't need times, use DateField - it will make your life and your QuerySets a lot simpler.)

I'm about to get into far too much detail here, so feel free to skip to the interesting bit.

The Activity model referenced below doesn't matter for this illustration, but the Date and Dateset models do:

class Date(models.Model):
    activity = models.ForeignKey(Activity)
    week = models.IntegerField()  # as in Week 1, Week 2, etc.
    day = models.IntegerField()   # as in Day 1, Day 2, etc.
    activity_date = models.DateTimeField(null=False, blank=False, editable=False)

The Date objects are then grouped into Datesets - the four activities that occur on Week 1/Day 1, Week 1/Day 2, and so on:

class Dateset(models.Model):
    date_one = models.ForeignKey(Date, related_name='date_one')
    date_two = models.ForeignKey(Date, related_name='date_two')
    date_three = models.ForeignKey(Date, related_name='date_three')
    date_four = models.ForeignKey(Date, related_name='date_four')
    complete = models.BooleanField(default=False, editable=True)

The Dates are created at the beginning of the 12-week program - in one of my view methods, I generate the Dates like so:

    # This starting date is explicitly created with  
    # zero values for time, e.g. 2012-08-07 00:00:00
    start = datetime(datetime.now().year, datetime.now().month, 
        datetime.now().day)

    # 'aset' contains ids for four Activity objects 
    # that occur on the same day
    for a in aset:  
        for w in range(12):  # 12 weeks
            for d in range(5):  # 5 days per week

                # Because we started with zero time values, 
                # each successive date will be created as yyyy-mm-dd 00:00:00
                ndate = (start + timedelta((7*w)) + timedelta(d))
	
                form = DateForm()
                obj = form.save(commit=False)
                obj.activity = Activity.objects.get(pk=a)
                obj.week = w+1
                obj.day = d+1
                obj.activity_date = ndate
                obj.save()

Then the Dates are grouped into Datesets:

    # Select just the activity_date values from those 240 newly minted Date objects
    dateset = Date.objects.values('activity_date').filter(activity__in=aset)
    dates = set()

    # Add each 'activity_date' to a set (using a set eliminates duplicates)
    for newdate in dateset:
        sdate = newdate["activity_date"]
        dates.add(sdate)

    # Now that we've got the list boiled down to just 60 unique calendar dates, 
    # we can gather the four activities that match each calendar date 
    # and group them into Dateset objects
    for xdate in dates:
        dobj = Date.objects.filter(activity_date=xdate)
        dsform = DatesetForm({'date_one': dobj[0].id, 'date_two': dobj[1].id, 
            'date_three': dobj[2].id, 'date_four': dobj[3].id})
        dsform.save()

So I have a total of 60 Datesets - one for each day of the program (12 weeks, 5 days per week). And each Dateset contains the four activities to be completed for that day.

Phew. Okay, now here's where I get to the point.

These Datesets can be marked as completed or not - the idea is that as you complete each day's set of activities, you check it off.

What I want to display on the page are a) any Datesets from the current day or previous days that are not yet marked as complete and b) the current day's Dateset, even after it has been marked as complete. Also, since I may be returning more than one Dateset (in case there are incomplete Datesets from previous days), I want to show the newest one first (so the older ones are pushed to the bottom of the page). Oh, and these Datesets must be unique to the current user.

Not only will this call for a compound condition in my query, but I'm also going to be working with datetime objects which are notoriously tricky.

I've imported these modules (timedelta was used above, in the original creation of the Date objects):

from datetime import datetime
from datetime import timedelta

The QuerySet uses Django's Q objects. Remember that, when using Q objects, those conditions must be passed in first, before any other filter. I've defined two conditions, separated by a pipe symbol (|) to indicate 'OR' (to link the two conditions with 'AND' you would instead separate them with a comma).

# A 'today' object that contains date but not time, e.g. 2012-08-07
today = datetime.now().date()

dates = Dateset.objects.filter(
    Q(date_one__activity_date__lte=today, complete=False) | 
    Q(date_one__activity_date=today),
    date_one__activity__user=request.user).order_by('-date_one__activity_date')

In the first condition, the activity_date must be less than or equal to today's date and 'complete' must be False.

In the second condition, the activity_date equals today's date, but the value for 'complete' is irrelevant - I want today's records regardless of the whether they're complete or not.

Finally, the result set is limited to those records belonging to request.user, and the ordering is set - activity_dates descending from newest to oldest.

This evaluates to:

SELECT 
  activities_dateset.id, 
  activities_dateset.date_one_id, 
  activities_dateset.date_two_id, 
  activities_dateset.date_three_id, 
  activities_dateset.date_four_id, 
  activities_dateset.complete 
FROM activities_dateset 
INNER JOIN activities_date ON (activities_dateset.date_one_id = activities_date.id) 
INNER JOIN activities_goal ON (activities_date.goal_id = activities_goal.id) 
WHERE (
  (
    (activities_dateset.complete = False  AND activities_date.activity_date <= 2012-08-07 00:00:00 ) 
  OR 
    (activities_date.activity_date = 2012-08-07 00:00:00)
  ) 
  AND activities_goal.user_id = 11 ) 
ORDER BY activities_date.activity_date DESC

Another neat trick I discovered along the way - once you've defined your QuerySet, you can print the SQL using its query attribute, especially helpful if you're just beginning to build using runserver:

    print dates.query

Further reading: