The origins of the class Meta idiom in python

So I keep finding this class Meta idiom in python APIs lately. Found it in factory-boy and WTForms and I suspected they both got it from Django, but I googled and couldn’t find any explanations of the reason for it or where it came from or why they’re all it class Meta. So here it is!

TL;DR What it is

The inner Meta class has absolutely no relation to python’s metaclasses. The name is just a coincidence of history (as you can read below).

There’s nothing magical about this syntax at all, here’s an example from Django’s documentation:

class Ox(models.Model):
    horn_length = models.IntegerField()
    class Meta:
        ordering = ["horn_length"]
        verbose_name_plural = "oxen"

Having an inner Meta class makes it easier for both the users and the ORM to tell what is a field on the model and what is just other information (or metadata if you like) about the model. The ORM can simply do your_model.pop('Meta') to retrieve the information it needs. You can also do this in any library you implement just as factory-boy and WTForms have done.

Some early Django history

Now for the longer story. I did some software archaeology, which should totally be a thing!, and discovered the first commit which mentions class Meta (actually class META1) in Django: commit 25264c86. There is a Release Notes wiki page which includes that change.

From there we can see how Django models were declared before the introduction of the internal class Meta. A Django Model class had a few special attributes. The db_table attribute held the SQL table name. A fields attribute was a tuple (!) of instances of field types (e.g. CharField, IntegerField, ForeignKey). These mapped to SQL table columns. One other interesting attribute was admin which was mostly used to describe how that model would behave in django’s admin interface. Now all these classes were defined in the django.core.meta package i.e. meta.Model, meta.CharField, meta.ForeignKey, meta.Admin. That’s so meta! (and probably where the name came from in the end)

In a django-developers mailing list thread from July 2005 titled Cleaner approach to ORM fields description user deelan suggests bringing some of SQLObject ‘s ideas to Django’s ORM. This seems to be the first seed of the idea of having an inner class in Django to store part of a model’s attributes:

it’s desiderable to avoid name clashes between fields, so it would be
good to have a way to wrap fields into a private namespace.
deelan, Cleaner approach to ORM fields description

At the end of the thread, django ticket 122 is created which seems to contain the first mention of a separate internal Meta class.

What started off as a backwards-compatible change, soon turned backwards-incompatible and was the first really big community-driven improvement to Django as Adrian Holovaty will later describe it in the release announcement which included the change.

The first patch on ticket 122 by Matthew Marshall started by suggesting that fields should be able to be defined directly on the model class, as class attributes (Build models using fieldname=FieldClass) rather than in the fields list. So:

class Poll(meta.Model):
    question = meta.CharField(maxlength=200)
    pub_date = meta.DateTimeField('date published')

rather than:

class Poll(meta.Model):
    fields = (
        meta.CharField(maxlength=200),
        pub_date = meta.DateTimeField('date published'),
    )

But also that there should be two ways of defining a ForeignKey:

ForeignKey = Poll, {'edit_inline':True, 'num_in_admin':3}
.
#the attribute name is irrelevant here:
anything = ForeignKey(Poll, edit_inline=True, num_in_admin=3)

In his first comment, mmarshall introduces the inner class Meta to hold anything that’s not a field: the table name (strangely renamed to module_name) and the admin options. The fields would be class attributes.

The decision over what goes in an inner class and what goes in the outer class seems to be left to the user. An optional class Field inner class would be supported so the fields would live there and the metadata would live as class attributes (this seemed to offer the advantage of being backwards-compatible with the admin class attribute while allowing tables to have a column that’s also named admin.

There are some other ideas thrown around and the syntax for ForeignKey is also discussed. At one point, Adrian Holovaty (adrian) intervenes to say (about the original class Meta/class Field suggestion):

It’s too flexible, to the point of confusion. Making it possible to do either class Meta or class Field or plain class attributes just smacks of “there’s more than one way to do it.” There should be one, clear, obvious way to do it. If we decide to change model syntax, let’s have class Meta for non-field info, and all fields are just attributes of the class.
— Adrian Holovaty, django ticket 122, comment 9

The thread goes on from there. There are some detractors to the idea (citing performance and conformance to other python APIs), there are discussions about implementation details and talking again about the ForeignKey syntax.

Then, in a dramatic turn of events, Adrian Holovaty closes the ticket as wontfix!:

Jacob [Kaplan-Moss] and I have talked this over at length, and we’ve decided the model syntax shouldn’t change. Using a fieldname=FieldClass syntax would require too much “magic” behind the scenes for minimal benefit.
— Adrian Holovaty, django ticket 122, comment 33

It’s interesting, because IMHO this was a huge differentiator in terms of making django’s models API more human and was also what other frameworks like Rails and SQLObject were doing at the time.

An IRC discussion is then referenced in the ticket.2 From that discussion, it seems that adrian’s reasons for closing were mostly concerns about the ForeignKey syntax and making a backwards-incompatible change to the model. rmunn does a great job of moderating the discussion, clarifying the situation and everyone’s opinions while strongly pushing for the new syntax.

The trac ticket is reopened as a consequence and it looks like smooth-sailing from the on. Some days later the new syntax is merged and the ticket is once again closed, this time with Resolution set to fixed.

Adrian will later announce the change in a django-developers mailing list post. Here are some interesting fragments from that mailing list post:


I apologize for the backwards-incompatibility, but this is still unofficial software. ;-) Once we reach 1.0 — which is much closer now that the model syntax is changed — we’ll be very dedicated to backwards-compatibility.

I can’t think of any other backwards-incompatible changes that we’re planning before 1.0 (knock on wood). If this isn’t the last one, though, it’s at least the last major one.
— Adrian Holovaty, IMPORTANT: Django model syntax is changing

Things didn’t go as planned. In May 2006, came commit f69cf70e which was exactly another let’s-change-everything-in-one-huge-branch commit which was released as part of Django 0.95. As part of this API change, class META was renamed to class Meta (because it’s easier on the eyes). You can find the details on RemovingTheMagic wiki page. It’s funny how in ticket 122 all the comments use the Meta capitalization, except for the last person (who I guess submitted the patch) who uses META. There was some discussion, both in the ticket and on IRC, about it and a few people had concerns that users of Django would actually want to have a field called Meta in their models and the inner class name would clash with that.

That’s it. Almost…

Anyway, so that’s the end of the story of how Django got its class Meta. Now what if I told you that all of this had already happened more than one year before in the SQLObject project? Remember that first post to django-developers which said Django models should hold some of its attributes in a separate inner class like SQLObject already does?

In April 2004, Ian Bicking (creator of SQLObject) sent an email to the sqlobject-discuss mailing list:

There’s a bunch of metadata right now that is being stored in various instance variables, all ad hoc like, and with no introspective interfaces. I’d like to consolidate these into a single object/class that is separated from the SQLObject class. This way I don’t have to worry about name clashes, and I don’t feel like every added little interface will be polluting people’s classes. (Though most of the public methods that are there now will remain methods of the SQLObject subclasses, just like they are now) So I’m looking for feedback on how that should work.
— Ian Bicking, Metadata container

His code example:

class Contact(SQLObject):
     class sqlmeta(SQLObject.sqlmeta):
         table = 'contact_table'
         cacheInstances = False
     name = StringCol()

SQLObject’s community did not seem nearly as animated as Django’s. There were a couple of emails on the sqlobject-discuss mailing list from Ian Bicking which included the proposal and asked for feedback. I suspect some discussion happened through some other channels, but this community was neither as big nor as good at docummenting its functioning as Django. (And sourceforge’s interface to the mailing list archives and cvs logs does not make this easy to navigate).

A year later, Ian Bicking takes part in the django-developers mailinglist discussion where he makes some small syntax suggestions, but it does not seem that he made any other contributions to the design of this part of the Django models API.

Conclusion

As far as I could tell, Ian Bicking is the originator of the idea of storing metadata in a metadata container inner class. Although it was the Django project which settled on the class Meta name and popularised it outside of its own community.

Anyway, that’s the end of the story. To me, it shows just how awesome open source and the open internet can be. The fact that I was able to find all of this 11 years later, complete with the original source code, commit logs and all the discussion around the implementation on the issue tracker, mailing lists and IRC logs is just amazing community work and puts a tear in my eye.

Hope you’ve enjoyed the ride!

1 because in 2005, people were less soft-spoken

2 It’s very fun, you should read it. At some point someone’s cat catches a blue jay. And I think they meant it literally.

Comments

testing a django blog's models

This post is a continuation of this post and I’ll be using that schema to write tests on top of. Here it is, for easy reference:

from django.db import models
class Category(models.Model):
    nume = models.CharField(max_length=20)
class Post(models.Model):
    title = models.CharField(max_length=50)
    body = models.TextField()
    category = models.ForeignKey(Category)
    published = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)
    modified_time = models.DateTimeField(auto_now=True)
class Commentator(models.Model):
    name = models.CharField(max_length=50, unique=True)
    email = models.EmailField(max_length=50, unique=True)
    website = models.URLField(verify_exists=True)
class Comment(models.Model):
    body = models.TextField()
    post = models.ForeignKey(Post)
    author = models.ForeignKey(Commentator)
    approved = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)

Ok, so here’s what we’re testing: our model — the emphasis is on our, because we’re only testing our code. And mostly, we’re actually testing that the code in models.py corresponds with what’s in the database. All test methods must begin with the word test.

One of the most annoying things which took me a while to figure out was that the setUp method is run every time before one of the other methods is run. That means that if you want to test for uniqueness, you have to build a tearDown method if you want to run any other independent tests. This is why snippet A won’t work, but snippet B will.
Here’s the model:

class Category(models.Model):
    name = models.CharField(max_length=20, unique=True)

snippet A

class CategoryTest(unittest.TestCase):
def setUp(self):
    self.cat1 = Category.objects.create(name="cat1")
def testexist(self):
    # make sure they get to the database
    self.assertEquals(self.cat1.name, "cat1")
def testunique(self):
    self.assertRaises(IntegrityError, Category.objects.create, name="cat1")

snippet B

class CategoryTest(unittest.TestCase):
def setUp(self):
    self.cat1 = Category.objects.create(name="cat1")
def testexist(self):
    # make sure they get to the database
    self.assertEquals(self.cat1.name, "cat1")
    self.assertRaises(IntegrityError, Category.objects.create, name="cat1")

The second snippet only calls the setUp method once because there is only one other method. But that’s not very nice. Ideally we’d to be able to run each test individually, so maybe we can write a tearDown method to be run after each other method, to restore the database.

However, there is an easier way to not have to write a tearDown method and that is using the django.test module which is an extention to unittest. All you have to do is import django.test instead of unittest and make every test object a sublclass of django.test.TestCase instead of unittest.TestCase.
Here is what it looks like now:

class CategoryTest(django.test.TestCase):
    def setUp(self):
        self.cat1 = Category.objects.create(name="cat1")
        self.cat2 = Category.objects.create(name="cat2")
    def testexist(self):
        # make sure they get to the database
        self.assertEquals(self.cat1.name, "cat1")
        self.assertEquals(self.cat2.name, "cat2")
    def testunique(self):
        self.assertRaises(IntegrityError, Category.objects.create, name="cat1")

Now, let’s test the Post class:

class Post(models.Model):
    title = models.CharField(max_length=50)
    body = models.TextField()
    category = models.ForeignKey(Category)
    published = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)
    modified_time = models.DateTimeField(auto_now=True)

There’s a bunch more stuff to test here, like the fact that everything gets to the database (title, body, category) and that everything has it’s right type/class.
We setUp a post, but also a category, since the test will be independent, but needs a Category to generate a Post.

class PostTest(django.test.TestCase):
    def setUp(self):
        self.cat1 = Category.objects.create(name="cat1")
        self.post1 = Post.objects.create(title="name",body="trala lala",
                category=Category.objects.all()[0])

Next, we need to do a bit of a trivial test to check that the title, the body and the right category get to the db

def testtrivial(self):
        self.assertEquals(self.post1.title, "name")
        self.assertEquals(self.post1.body, "trala lala")
        self.assertEquals(self.post1.category, Category.objects.all()[0])

I think this is a good way to test that the creation_time and modified_time are newly generated datetime.datetime objects:

def testtime(self):
    self.assertEquals(self.post1.creation_time.hour, datetime.now().hour)

No, wait. I think this looks a bit more professional:

def testtime(self):
        delta = datetime.now() - self.post1.creation_time
        self.assert_(delta.seconds < 10)
        delta_modified = datetime.now() - self.post1.modified_time
        self.assert_(delta_modified.seconds < 10)

So now, we’re looking for datetime objects that were generated less than 10 seconds ago. That’s really very generous since the time it takes to run the test from the time the setUp method is run is in the range of microseconds.
This test doesn’t show the true difference between modified and creation time. Modification time is changed every time the object is saved to the database while creation time is not. So let’s write a new test based on that knowledge:

 def testModifiedVsCreation(self):
        modified = self.post1.modified_time
        created = self.post1.creation_time
        self.post1.save()
        self.assertNotEqual(modified, self.post1.modified_time)
        self.assertEqual(created, self.post1.creation_time)

Testing for a boolean value is really easy:

 def testpublished(self):
        self.assertEquals(self.post1.published, False)

And then there’s more than one way I can think of to test the Category ForeignKey:

def testcategory(self):
        self.assertEquals(self.cat1.__class__, self.post1.category.__class__)
        self.assertRaises(ValueError, Post.objects.create, name="name",
                body="tralaalal", category="ooopsie!")

In the end, I’ll go for the more general one, even though the second one is more excentric. So:

def testcategory(self):
        self.assertEquals(self.cat1.__class__, self.post1.category.__class__)
        self.assertRaises(ValueError, Post.objects.create, name="name",
                body="tralaalal", category="ooopsie!")

Btw, if you don’t know the errors (like ValueError — I didn’t know it), you can always drop to a manage.py console and try to Post.object.create(name="name",body="tralaalal", category="ooopsie!") and see if you get lucky.

Ok, passing on to the Commentator class:

class Commentator(models.Model):
    name = models.CharField(max_length=50, unique=True)
    email = models.EmailField(max_length=50, unique=True)
    website = models.URLField(verify_exists=True, blank=True)

We’re only going to test that the data gets to the database and that the name and email fields are unique. At this stage we can’t test the validation of the email and website fields. We’ll be doing that later, when we write the forms.
This should seem trivial by now:

class CommentatorTest(django.test.TestCase):
    def setUp(self):
        self.comtor = Commentator.objects.create(name="hacketyhack",
                email="hackety@example.com", website="example.com")
    def testExist(self):
        self.assertEquals(self.comtor.name, "hacketyhack")
        self.assertEquals(self.comtor.email, "hackety@example.com")
        self.assertEquals(self.comtor.website, "example.com")
     def testUnique(self):
        self.assertRaises(IntegrityError, Commentator.objects.create,
                name="hacketyhack", email="new@example.com",
                website="example.com")
        self.assertRaises(IntegrityError, Commentator.objects.create,
                name="nothackety", email="hackety@example.com",
                website="example.com")

Now, let’s get to testing the Comment class:

class Comment(models.Model):
    body = models.TextField()
    post = models.ForeignKey(Post)
    author = models.ForeignKey(Commentator)
    approved = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)

There won’t be anything new here. And this is when and why testing is boring. But, hey! A man’s gotta do, what a man’s gotta do.

class CommentTest(django.test.TestCase):
    def setUp(self):
        self.cat = Category.objects.create(name="cat1")
        self.post = Post.objects.create(title="name",body="trala lala",
                category=Category.objects.all()[0])
        self.comtor = Commentator.objects.create(name="hacketyhack",
                email="hackety@example.com", website="example.com")
        self.com = Comment.objects.create(body="If the implementation is
        easy to explain, it may be a good idea.",
        post=Post.objects.all()[0], author=Commentator.objects.all()[0])
    def testExist(self):
        self.assertEquals(self.com.body, "If the implementation is
        easy to explain, it may be a good idea.")
        self.assertEquals(self.com.post, Post.objects.all()[0])
        self.assertEquals(self.com.author, Commentator.objects.all()[0])
        self.assertEquals(self.com.approved, False)
    def testTime(self):
        delta_creation = datetime.now() - self.comm.creation_time
        self.assert_(delta_creation.seconds < 7)
    def testCreationTime(self):
        # what if it's a modification_time instead?
        created = self.com.creation_time
        self.com.save()
        self.assertEqual(created, self.com.creation_time)

Now that we’ve written all the tests we have to make sure that they’re run against the actual database. Or better yet, a backup copy of it. Otherwise, the tests are useless, since django creates a new database based on the schema defined in models.py every time models.py test is run.

First, you’ll need to make a copy of django.test.simple (put it in your project’s directory for example). Then comment these lines:

# old_name = settings.DATABASE_NAME
# from django.db import connection
# connection.creation.create_test_db(verbosity, autoclobber=not interactive)
result = unittest.TextTestRunner(verbosity=verbosity).run(suite)
# connection.creation.destroy_test_db(old_name, verbosity)

And now, add this to your settings.py file:

TEST_RUNNER = 'myproject.simple.run_tests'

Be careful now. All the data in your database will be lost when you run manage.py test the next time. So back it up! First create a new database, say backup and then:

mysqldump -u DB_USER --password=DB_PASS DB_NAME|mysql -u DB_USER --password=DB_PASSWD -h localhost backup

You can reverse that when you’re done.

Here’s to show that it works (after I’ve made a little modification to the model, but not the database):

$ python manage.py test
..EEE..EEEEEE................
--> lots of tracebacks <--
----------------------------------------------------------------------
Ran 29 tests in 10.149s
FAILED (errors=9)

Ok, so that should provide a pretty good test coverage for now. Let’s go get breakfast!

Comments

blog database schema cu capsuni - Part 2

Tocmai am reușit (am găsit timp — furat timp) să scriu în django schema din postul trecut. Simplicity is divine:

from django.db import models
class Category(models.Model):
    nume = models.CharField(max_length=20)
class Post(models.Model):
    title = models.CharField(max_length=50)
    body = models.TextField()
    category = models.ForeignKey(Category)
    published = models.BooleanField()
    creation_time = models.DateTimeField(auto_now_add=True)
class Commentator(models.Model):
    name = models.CharField(max_length=50, unique=True)
    email = models.EmailField(max_length=50, unique=True)
    website = models.URLField(verify_exists=True)
class Comment(models.Model):
    body = models.TextField()
    post = models.ForeignKey(Post)
    author = models.ForeignKey(Commentator)
    approved = models.BooleanField()
    modified_time = models.DateTimeField(auto_now=True)

Pe lângă faptul că toate tipurile de date au nume și explicații pe care le poate înțelege oricine, django va folosi datele astea atunci când va construi interfața de administrare.
E interesant că trebuie să declari toate tabelele în ordine. La început pusesem Categoria ultima și n-o găsea când vroia să facă ForeignKey-ul de la Post. M-a cam răsfățat OOP-ul.
Alt lucru fain e că de pe acum se pregătesc feature-uri interesante cum ar fi URLField.verify_exists care verifică toate url-urile introduse și le refuză dacă primește 404. Așa că de-acum n-o să mai poată nimeni să pună variabile metasintactice de genu: caca, mumu și altele în field-ul ăla!

Și acum un mysql describe3 pentru tabelele rezultate:

mysql> describe revolution.blahg_category;
 +-------+-------------+------+-----+---------+
 | Field | Type        | Null | Key | Default |
 +-------+-------------+------+-----+---------+
 | id    | int(11)     | NO   | PRI | NULL    |
 | nume  | varchar(20) | NO   |     | NULL    |
 +-------+-------------+------+-----+---------+
 mysql> describe revolution.blahg_post;
 +---------------+-------------+------+-----+---------+
 | Field         | Type        | Null | Key | Default |
 +---------------+-------------+------+-----+---------+
 | id            | int(11)     | NO   | PRI | NULL    
 | title         | varchar(50) | NO   |     | NULL    |
 | body          | longtext    | NO   |     | NULL    |
 | category_id   | int(11)     | NO   | MUL | NULL    |
 | published     | tinyint(1)  | NO   |     | NULL    |
 | creation_time | datetime    | NO   |     | NULL    |
 | modified_time | datetime    | NO   |     | NULL    |
 +---------------+-------------+------+-----+---------+
 mysql> describe revolution.blahg_commentator;
 +---------+--------------+------+-----+---------+
 | Field   | Type         | Null | Key | Default | 
 +---------+--------------+------+-----+---------+
 | id      | int(11)      | NO   | PRI | NULL    |
 | name    | varchar(50)  | NO   | UNI | NULL    |
 | email   | varchar(50)  | NO   | UNI | NULL    |
 | website | varchar(200) | NO   |     | NULL    |
 +---------+--------------+------+-----+---------+
 mysql> describe revolution.blahg_comment;
 +---------------+------------+------+-----+---------+
 | Field         | Type       | Null | Key | Default |
 +---------------+------------+------+-----+---------+
 | id            | int(11)    | NO   | PRI | NULL    | 
 | body          | longtext   | NO   |     | NULL    |
 | post_id       | int(11)    | NO   | MUL | NULL    | 
 | author_id     | int(11)    | NO   | MUL | NULL    |
 | approved      | tinyint(1) | NO   |     | NULL    |
 | modified_time | datetime   | NO   |     | NULL    |
 +---------------+------------+------+-----+---------+

TADA!
Trebuie să studiez de ce se strică formatarea (de fapt știu de ce, trebuie să scot textile din blockul ăla), dar ideea de bază e clară.

3 merci gheorghe!

Comments

blog database schema cu capsuni

Am reușit să-mi scriu schema bazei de date a viitorului meu blog. Ah, da, m-am apucat de treabă. O să folosesc django cred (și deja mi s-a spus că era previzibil) deși încă mai am timp să mă răzgândesc. N-am găsit în 2 minute un script care să-mi deseneze scheme, așa că pun relațiile în engleză aici. Come bash me!

Post

  • belongs_to Category
  • has_many Comments

Comment

  • has_one Commentator
  • belongs_to Post

Commentator

  • has_many Comments

Category

  • has_many Posts

Ia să încerc să fac și niște tabele din ce scrie mai sus.

Post
id || title || body || category_id || created_at || published

Comment
id || post_id || commentator_id || body || approved || created_at

Commentator
id || name || email || website || gravatar_url

Category
id || name

Am renunțat la tabele și am improvizat o formatare. Sper să fie lizibil.

E evident ceea ce nu am făcut sper. Nu am lăsat commentatorii cu commenturile lor ceea ce ar fi dus la o relație cu 4 coloane redundante (nume, email, website, gravatar):

Comment
id || autor || email || website || gravatar_url || post_id || commentator_id || body || approved || created_at

Pe parcurs o să mai adaug rating la posturi și alte lucruri care mai îmi vin în minte. Ratingul o să încerce să fie ceva complex cu sus/jos, dar asta mai târziu.
Deci? Ce părere aveți? Ce să mai adaug? Am greșit ceva? Mă încadrez în forma normală 5? :-)

Comments
Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported License.