Tutorial: How To Implement Tagging With TurboGears and SQLAlchemy

In this tutorial I’ll show you how to create a simple, yet powerful, tagging system using SQLAlchemy with TurboGears. As the concept of tags, and social tagging in particular, have become very popular, clients now demand “tagging-enabled” applications. So, here’s a simple way to get you started.

Our application will associate sites with tags (many to many relationship), like delicious does, but in a much simplified manner. For instance, delicious keeps tracks of which user gave which tag to which URL. We will only associate sites with tags. But it will be very easy to add this functionality later.

We’ll quickstart a new project (the -s argument tells tg-admin that the project will use SQLAlchemy and not SQLObject)

$ tg-admin quickstart -s tags
Enter package name [tags]:
Do you need Identity (usernames/passwords) in this project? [no] yes

Defining The Model

We are going to have a table for the sites, a table for the tags and a table that associates sites with tags (many-to-many). Here’s the code which defines the tables (which goes in model.py):

sites_table = Table(‘sites’, metadata,
        Column(‘site_id’, Integer, primary_key=True),
        Column(‘title’, Unicode(256)),
        Column(‘url’, Unicode(1024)),
        )

tags_table = Table(‘tags’, metadata,
        Column(‘tag_id’, Integer, primary_key=True),
        Column(‘name’, Unicode(32), index=‘tag_idx’))

sites_tags_table = Table(‘sites_tags’, metadata,
        Column(‘site_id’, Integer,
              ForeignKey(‘sites.site_id’),
              primary_key=True),
        Column("tag_id", Integer,
              ForeignKey(‘tags.tag_id’),
              primary_key=True))

We will now create the Python classes that correspond to these tables:

class Tag(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name
    def link(self):
        return "/tags/"+self.name

class Site(object):
    def __init__(self, url, title):
        self.url, self.title = url, title

Note the link() method in the Tag class. You might wonder what it does there. It just a little habbit that I wanted to share with you. I’ve found myself many times hard-coding URLs inside my templates. Then, if you want to make a tag linkable in many different places in your app, you have to hard-code the link every time. In this way, you can just pass the tag object to your template and do something like:

<a href="${tag.link()}">${tag.name}</a>

Ok, now we continue with mapping the classes to the tables:

mapper(Tag, tags_table)
mapper(Site, sites_table, properties = {
    ‘tags’: relation(Tag, secondary=sites_tags_table, lazy=False),
    })

Great. Now we can construct the database and start populating it:

$ tg-admin sql create
$ tg-admin shell

>>> g = Site(‘http://www.google.com’, ‘Search engine’)
>>> g.tags
[]
>>> g.tags = [Tag(‘search’), Tag(‘google’)]
>>> session.save(g)
>>> session.flush()
(here SQLAlchemy echos the SQL statements it executes)

Handling tags

So we got the model right. The next step is to allow the users to provide tags for the site. The easiest way (for you and your users) is to ask them to enter the tags in a space-separated list. Suppose you are given this kind of space-seperated string of tags from a user, then you have to:

  • convert all tags to lower case, in order to avoid case senstivity issues
  • check if the string contains the same tag twice
  • find which tags are already in the database and which are new
  • recover from some nonsense that users might throw at you

and then get a list of Tag objects that you can assign to a site. So here’s a function that does just that:

def get_tag_list(tags):
    """Get a string of space sperated tag,
    and returns a list of tag objects"
""
    result = []
    tags = tags.replace(‘;’,‘ ‘).replace(‘,’,‘ ‘)

    tags = [tag.lower() for tag in tags.split()]
    tags = set(tags)        # no duplicates!
    if in tags:
        tags.remove()

    for tag in tags:
        tag = tag.lower()
        tagobj = session.query(Tag).selectfirst_by(name=tag)
        if tagobj is None:
            tagobj = Tag(name=tag)
        result.append(tagobj)
    return result

So you can now easily do something like:

>>> f = Site(‘http://www.flickr.com’, ‘Flickr!’)
>>> f.tags = get_tag_list(‘photo sharing photograpy’)
>>> f.tags
[photo, sharing, photograpy]
>>> f.tags[0].link()
‘/tags/photo’

Tag Search

It is straightforward to just list a site together with its tags:

<h3 class="site-title"><a href="${site.url}" target="_blank">${site.title}</a></h3>
        <p class="site-tags">Tags:
        <a py:for="tag in site.tags[:5]" href="${tag.link()}" class="tag">${tag.name}</a>
        </p>

Search is a bit more tricky. It took me few attempts until I got the search queries right. Here’s how to fetch all sites that are tagged by ‘google’:

    q = session.query(Site)
    sites = q.select((Tag.c.name==‘google’) &amp; q.join_to(‘tags’))

the magic is mostly inside the join_to method – it stands for the SQL statements that makes sure that the Tag clause is associated to the sites. Without it, the query runs over the entire cartesian product of Sites x Tags.

You can make the query simpler (for MySQL; not you), if you fetch the tag_id of ‘google’ first. Then, the query uses only 2 of the 3 tables:

    tagobj = session.query(Tag).get_by(name=‘google’)
    if not tagobj:
        raise cherrypy.InternalRedirect(‘/notfound’)
    sites = session.query(Site).select((sites_tags_table.c.tag_id == tagobj.tag_id) &amp;
                 (sites_tags_table.c.site_id == Site.c.site_id))

To search for google|photo:

q = session.query(Site)
sites = q.select(
    Tag.c.name.in_(‘google’, ‘photo’) &amp;
        q.join_to(‘tags’))

To search for sharing+photos:

q = session.query(Site)
sites = q.select(
    Tag.c.name.in_(‘sharing’,‘photos’) &amp;
        q.join_to(‘tags’),
    group_by=[Site.c.site_id],
    having=(func.count(Site.c.site_id)==2))

The idea is that sites that are tagged both with ‘sharing’ and ‘photos’ will appear twice in the select, then after grouping by site_id and getting all which appear twice, we get the desired result.

There are many more things that can be done from this point, like: associating with the tag-site relationship which user added the tag, rendering a tag cloud and so on. Feel free to leave comments!

This entry was posted in howto, python, turbogears. Bookmark the permalink.

17 Responses to Tutorial: How To Implement Tagging With TurboGears and SQLAlchemy

  1. Thanks! Very helpful.

  2. damjan says:

    You say “To search for google+photo” but then you do a search for ’sharing’ and ‘photos’…

  3. thesamet says:

    Thanks Damjan. Post updated.

  4. damjan says:

    Also, you use `session.query(Site)’ often but also you use `q.’ which is assumed to be from the first `q = session.query(Site)’ … it’s a bit confusing. It’s possible to just use `q’ always right?

  5. damjan says:

    Can’t the page.tags be a list of normal strings? Because currently it’s a list of Tag instances.

    For ex. if I wish for ‘photos’ in page.tags to work I had to implement this method in the Tag class:

    def __cmp__(self, other):
    if isinstance(other, Tag):
    return cmp(self.name, other.name)
    elif isinstance(other, basestring):
    return cmp(self.name, other)
    else:
    return cmp(self, other)

  6. thesamet says:

    Damjan, you can add to the Site class a property that will give you the tags as a list of strings:

    class Site(object):
        def __init__(self, url, title):
            self.url, self.title = url, title
        @property
        def tag_strings(self):
            return [tag.name for tag in self.tags]
    
  7. Steve Thomas says:

    One comment and a question:
    Comment: The quotes you use in the examples seem to be forward quote and backward quote, which when I paste into TextMate on my Macintosh give a syntax error and I need to replace the quotes.

    Example:
    I change: ‘google’
    to: ‘google’

    Question:
    When I do the:
    $ tg-admin sql create

    I get an error:
    File “build/bdist.macosx-10.3-fat/egg/sqlalchemy/schema.py”, line 149, in __call__
    File “build/bdist.macosx-10.3-fat/egg/sqlalchemy/schema.py”, line 31, in _init_items
    File “build/bdist.macosx-10.3-fat/egg/sqlalchemy/schema.py”, line 449, in _set_parent
    sqlalchemy.exceptions.ArgumentError: The ‘index’ keyword argument on Column is boolean only. To create indexes with a specific name, append an explicit Index object to the Table’s list of elements.

    it is caused by the 3rd line below:
    tags_table = Table(“tags”, metadata,
    Column(‘tag_id’, Integer, primary_key=True),
    Column(‘name’, Unicode(32), index=’tag_idx’))

    I am running with MySQL database.

    I can use the tutorial without specifying the index or index name, but it would be useful.

    Thanks.

  8. Pingback: sites poker en ligne

  9. Finance and treasury experts can take the Certified Treasury Professional Exam from Morgan International before taking the certification. Passing the first time is key.

  10. Nathaniel says:

    I see you share interesting stuff here, you
    can earn some additional money, your blog has huge potential, for the monetizing method, just search in google – K2 advices how to monetize
    a website

  11. Alycia says:

    I read a lot of interesting content here. Probably you spend a lot of time writing, i
    know how to save you a lot of time, there is an online tool that creates high quality,
    SEO friendly posts in seconds, just search in google – laranitas free content
    source

  12. Danielle1992 says:

    This post is on 13 spot in google’s search results, if you want
    more visitors, you should build more backlinks to your articles, there is
    one trick to get free, hidden backlinks from authority forums, search on youtube:
    how to get hidden backlinks from forums

  13. This post will assaist the internet viewers for creating new website or even a blog from start to end.

  14. Betty says:

    I was extremely pleased to discover this great site. I want to to thank you for your time for this
    fantastic read!! I definitely enjoyed every little bit of it and i also hsve you saved ass a favorite to loook at new information in your web site.

  15. Want to train muscles fast and become huge?
    No need to wait months and months, get the best Muscle Boosting BodyBuilding Supplements!

    Try it today muscles Supplements

    FEEL YOUNG AGAIN!
    Restore your youthful energy – Use Creatine that ups your training and helps to build muscles fast!

Leave a Reply

Your email address will not be published. Required fields are marked *