Tutorial: How To Implement Tagging With TurboGears and SQLAlchemy

In this tutorial I’ll show you how to create a simple, yet powerful, tagging system using SQLAlchemy with TurboGears. As the concept of tags, and social tagging in particular, have become very popular, clients now demand “tagging-enabled” applications. So, here’s a simple way to get you started.

Our application will associate sites with tags (many to many relationship), like delicious does, but in a much simplified manner. For instance, delicious keeps tracks of which user gave which tag to which URL. We will only associate sites with tags. But it will be very easy to add this functionality later.

We’ll quickstart a new project (the -s argument tells tg-admin that the project will use SQLAlchemy and not SQLObject)

$ tg-admin quickstart -s tags
Enter package name [tags]:
Do you need Identity (usernames/passwords) in this project? [no] yes

Defining The Model

We are going to have a table for the sites, a table for the tags and a table that associates sites with tags (many-to-many). Here’s the code which defines the tables (which goes in model.py):

sites_table = Table('sites', metadata,
        Column('site_id', Integer, primary_key=True),
        Column('title', Unicode(256)),
        Column('url', Unicode(1024)),

tags_table = Table('tags', metadata,
        Column('tag_id', Integer, primary_key=True),
        Column('name', Unicode(32), index='tag_idx'))

sites_tags_table = Table('sites_tags', metadata,
        Column('site_id', Integer,
        Column("tag_id", Integer,

We will now create the Python classes that correspond to these tables:

class Tag(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name
    def link(self):
        return "/tags/"+self.name

class Site(object):
    def __init__(self, url, title):
        self.url, self.title = url, title

Note the link() method in the Tag class. You might wonder what it does there. It just a little habbit that I wanted to share with you. I’ve found myself many times hard-coding URLs inside my templates. Then, if you want to make a tag linkable in many different places in your app, you have to hard-code the link every time. In this way, you can just pass the tag object to your template and do something like:

<a href="${tag.link()}">${tag.name}</a>

Ok, now we continue with mapping the classes to the tables:

mapper(Tag, tags_table)
mapper(Site, sites_table, properties = {
    'tags': relation(Tag, secondary=sites_tags_table, lazy=False),

Great. Now we can construct the database and start populating it:

$ tg-admin sql create
$ tg-admin shell
>>> g = Site('http://www.google.com', 'Search engine')
>>> g.tags
>>> g.tags = [Tag('search'), Tag('google')]
>>> session.save(g)
>>> session.flush()
(here SQLAlchemy echos the SQL statements it executes)

Handling tags

So we got the model right. The next step is to allow the users to provide tags for the site. The easiest way (for you and your users) is to ask them to enter the tags in a space-separated list. Suppose you are given this kind of space-seperated string of tags from a user, then you have to:

  • convert all tags to lower case, in order to avoid case senstivity issues
  • check if the string contains the same tag twice
  • find which tags are already in the database and which are new
  • recover from some nonsense that users might throw at you

and then get a list of Tag objects that you can assign to a site. So here’s a function that does just that:

def get_tag_list(tags):
    """Get a string of space sperated tag,
    and returns a list of tag objects"""
    result = []
    tags = tags.replace(';',' ').replace(',',' ')

    tags = [tag.lower() for tag in tags.split()]
    tags = set(tags)        # no duplicates!
    if '' in tags:

    for tag in tags:
        tag = tag.lower()
        tagobj = session.query(Tag).selectfirst_by(name=tag)
        if tagobj is None:
            tagobj = Tag(name=tag)
    return result

So you can now easily do something like:

>>> f = Site('http://www.flickr.com', 'Flickr!')
>>> f.tags = get_tag_list('photo sharing photograpy')
>>> f.tags
[photo, sharing, photograpy]
>>> f.tags[0].link()

Tag Search

It is straightforward to just list a site together with its tags:

<h3 class="site-title"><a href="${site.url}" target="_blank">${site.title}</a></h3>
        <p class="site-tags">Tags:
        <a py:for="tag in site.tags[:5]" href="${tag.link()}" class="tag">${tag.name}</a>

Search is a bit more tricky. It took me few attempts until I got the search queries right. Here’s how to fetch all sites that are tagged by ‘google’:

    q = session.query(Site)
    sites = q.select((Tag.c.name=='google') & q.join_to('tags'))

the magic is mostly inside the join_to method – it stands for the SQL statements that makes sure that the Tag clause is associated to the sites. Without it, the query runs over the entire cartesian product of Sites x Tags.

You can make the query simpler (for MySQL; not you), if you fetch the tag_id of ‘google’ first. Then, the query uses only 2 of the 3 tables:

    tagobj = session.query(Tag).get_by(name='google')
    if not tagobj:
        raise cherrypy.InternalRedirect('/notfound')
    sites = session.query(Site).select((sites_tags_table.c.tag_id == tagobj.tag_id) &
                 (sites_tags_table.c.site_id == Site.c.site_id))

To search for google|photo:

q = session.query(Site)
sites = q.select(
    Tag.c.name.in_('google', 'photo') &

To search for sharing+photos:

q = session.query(Site)
sites = q.select(
    Tag.c.name.in_('sharing','photos') &

The idea is that sites that are tagged both with ‘sharing’ and ‘photos’ will appear twice in the select, then after grouping by site_id and getting all which appear twice, we get the desired result.

There are many more things that can be done from this point, like: associating with the tag-site relationship which user added the tag, rendering a tag cloud and so on. Feel free to leave comments!

This entry was posted in howto, python, turbogears. Bookmark the permalink.