
Project Py-Tor-Red-Blog

by Andrew Zeneski  •  published February 12, 2011
 

I found Tornado while browsing through the list of open source software used by Facebook. I came across a few projects I wasn't familiar with; Tornado was one of them.

Browsing through the documentation, I found the framework to be very intuitive:

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")

With a few more lines of code to set up the application, this creates a very simple hello world example. Pretty nifty. Looks a lot like a GAE webapp (or web.py), eh?
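
For reference, the setup around the handler might look roughly like the sketch below. It follows the hello world pattern from the Tornado documentation as it stands in current releases (Tornado versions from 2011 started the loop with `IOLoop.instance()` rather than `IOLoop.current()`). The names `make_app` and `main` and the port 8888 are my own choices for illustration, not something the framework requires.

```python
import tornado.ioloop
import tornado.web


class MainHandler(tornado.web.RequestHandler):
    def get(self):
        self.write("Hello, world")


def make_app():
    # Map the root URL to the handler; more (pattern, handler) pairs go here.
    return tornado.web.Application([
        (r"/", MainHandler),
    ])


def main(port=8888):
    # The port is an arbitrary choice for this sketch.
    app = make_app()
    app.listen(port)
    # Blocks and serves requests until the process is stopped.
    tornado.ioloop.IOLoop.current().start()
```

Calling `main()` starts the server; visiting http://localhost:8888/ should then return the greeting.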

Not that long ago, I also came across a key-value NoSQL store, Redis, that caught my attention. Having been focused on OFBiz implementations that depend on transactional SQL databases, I was interested in investigating the NoSQL trend more.

Redis looked very interesting, so as an exercise to learn more about both Tornado and Redis, I ported the PHP sample application Retwis to Python using Tornado and the redis-py library.

(My port is available on GitHub.)

I decided I needed a new blog, and with the Retwis port under my belt I realized that a blogging application was a perfect candidate for this technology stack. Storing blog data in a SQL database never really made a whole lot of sense to me, but a document store or key-value store did seem like a good fit.

I reused a number of patterns and code blocks I found on Bret Taylor's blog, spent 1.5 days, and coded up this tool. My first impression: after spending the last several years writing Java exclusively, Python is very refreshing and MUCH faster to develop in, and with modern CPUs, Tornado + Nginx scaling issues are a thing of the past. I am very happy with the results.

The major meat of the app is in the posting class and entry module, but first I needed a way to secure the ability to post. So I created an author decorator, which is borrowed from the 'authenticated' decorator in Tornado:

def author(method):
    """Decorate methods with this to require that the user be an author."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if not self.current_user:
            if self.request.method in ("GET", "HEAD"):
                url = self.get_login_url()
                if "?" not in url:
                    if urlparse.urlsplit(url).scheme:
                        # if login url is absolute, make next absolute too
                        next_url = self.request.full_url()
                    else:
                        next_url = self.request.uri
                    url += "?" + urllib.urlencode(dict(next=next_url))
                self.redirect(url)
                return
            raise tornado.web.HTTPError(403)
        elif not self.is_author:
            raise tornado.web.HTTPError(403)
        return method(self, *args, **kwargs)
    return wrapper

The is_author property simply checks the user against a list of approved authors stored in Redis.
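
As a rough illustration of that check, here is a minimal sketch that runs without a Redis server. `FakeRedis` is a hypothetical in-memory stand-in that mimics only `sadd` and `sismember`, `BaseHandlerSketch` is a stripped-down placeholder for the real handler base class, and the key name "authors" (and the choice of a set) is an assumption on my part; the actual code may differ.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in for the two set commands used below."""

    def __init__(self):
        self._sets = {}

    def sadd(self, key, member):
        # Redis stores members as strings, so normalize here as well.
        self._sets.setdefault(key, set()).add(str(member))

    def sismember(self, key, member):
        return str(member) in self._sets.get(key, set())


class BaseHandlerSketch:
    """Placeholder for the handler base class; not a Tornado handler."""

    def __init__(self, redis, current_user):
        self.redis = redis
        self.current_user = current_user

    @property
    def is_author(self):
        # Anonymous visitors are never authors.
        if not self.current_user:
            return False
        # "authors" is an assumed key holding the approved user IDs.
        return self.redis.sismember("authors", self.current_user["user_id"])


store = FakeRedis()
store.sadd("authors", 42)
approved = BaseHandlerSketch(store, {"user_id": 42}).is_author
rejected = BaseHandlerSketch(store, {"user_id": 7}).is_author
```

With a real redis-py client in place of `FakeRedis`, the property body would stay the same, since `sismember` takes the same key and member arguments there.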

Next, I decided to store the blog data as a HASH and re-used a block of code I found (that I think was written by Benjamin Golub) to "slugify" the title and tags. I store the post, a link from the slug to the post, then add it to a list of latest posts, tag sets and to the proper monthly archive:

class ComposeHandler(BaseHandler):
    def _slugify(self, slug_str):
        slug = unicodedata.normalize("NFKD", slug_str).encode(
                "ascii", "ignore")
        slug = re.sub(r"[^\w]+", " ", slug)
        slug = "-".join(slug.lower().strip().split())
        return slug

    @author
    def get(self):
        id = self.get_argument('id', None)
        entry = None
        if id:
            entry = self.redis.hgetall('post:' + id)
        self.render('compose.html', entry=entry)

    @author    
    def post(self):
        # current user
        user = self.get_current_user()

        # id: if this is an update
        id = self.get_argument('id', None)

        # tags
        tags = set([self._slugify(unicode(tag)) for tag in
            self.get_argument("tags", "").split(",")])

        # dict for the post; on updates, start from the stored hash
        post = dict()
        if id:
            post = self.redis.hgetall('post:' + id)

        # title and content
        post['title'] = self.get_argument('title')
        post['content'] = string.replace(
            self.get_argument("markdown"), "\n", "")

        # only on creates
        if not id:
            post['created'] = time.time()
            post['author'] = user['user_id']

            # get the next post ID now, so the slug fallback below can use it
            post['id'] = self.redis.incr("global:nextPostId")

        # slug for urls; fall back to the post ID when the title
        # contains no slug-safe characters
        slug = self._slugify(post['title'])
        if not slug:
            slug = str(post['id'])
        post['slug'] = slug

        # set tags on the post
        post['tags'] = ",".join(tags)

        # updated time
        post['updated'] = time.time()

        # begin a redis Pipeline
        pipe = self.redis.pipeline()

        # store the post
        pipe.hmset("post:" + str(post['id']), post)

        # store the slug for quick lookup
        pipe.set('slug:' + slug, post['id'])

        # associate post with the tags
        for tag in tags:
            if tag: pipe.sadd('tag:' + tag, post['id'])

        # place in the archive
        archive_date = datetime.date.today().strftime("%Y%m")
        pipe.rpush("archive:" + archive_date, post['id'])
        if not self.redis.sismember("archives", int(archive_date)):
            pipe.sadd("archives", int(archive_date))

        # home page posts
        promote = self.get_argument('promote', 'N')
        if not id and promote == 'Y':
            pipe.lpush('latest', post['id'])
            pipe.ltrim('latest', 0, 50)

        # execute the pipeline
        pipe.execute()

        # tell redis to [background] save right away
        self.redis.bgsave()

        # redirect to the post page
        self.redirect("/entry/%s" % post['slug'])
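
The slug and tag handling above is Python 2 code (it relies on `unicode` and on `encode` returning a `str`). For anyone reading this on Python 3, a rough equivalent might look like the sketch below. The standalone function names `slugify` and `parse_tags` are mine, and `parse_tags` drops empty tags up front, which the handler above only does later when writing the tag sets.

```python
import re
import unicodedata


def slugify(text):
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace runs of non-word characters with spaces, then join on hyphens.
    words = re.sub(r"[^\w]+", " ", ascii_text).lower().split()
    return "-".join(words)


def parse_tags(raw):
    # Split a comma-separated string into a set of non-empty tag slugs.
    slugs = (slugify(tag) for tag in raw.split(","))
    return {slug for slug in slugs if slug}


title_slug = slugify("Hello, World!")
accent_slug = slugify("Café au lait")
tags = parse_tags("Redis, Tornado,,redis")
```

A title made up only of punctuation yields an empty slug, which is the case the post ID fallback in the handler is there to cover.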

The EntryHandler displays the post. First it pulls a dict directly out of Redis, converts the timestamps, parses the content using markdown2, then renders the template:

class EntryHandler(BaseHandler):
    def get(self, slug):
        post_id = self.redis.get('slug:' + slug)        
        if not post_id: raise tornado.web.HTTPError(404)

        entry = self.redis.hgetall("post:" + post_id)
        author = self.redis.hgetall("user:" + str(entry['author']))
        entry['created_time'] = datetime.datetime.utcfromtimestamp(float(entry['created']))
        entry['updated_time'] = datetime.datetime.utcfromtimestamp(float(entry['updated']))
        html = markdown.markdown(entry['content'])
        self.render("entry.html", is_author=self.is_author, html=html,
                    entry=entry, author=author)
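
The lookup chain here (slug to post ID, post ID to hash, stored timestamp strings to datetimes) can be tried out without Redis. In the sketch below a plain dict stands in for the database, the function name `load_entry` is hypothetical, and the conversion uses timezone-aware datetimes, which is a Python 3 substitution for the `utcfromtimestamp` call in the handler.

```python
from datetime import datetime, timezone


def load_entry(store, slug):
    """Resolve a slug to its post and attach parsed timestamps.

    `store` is a plain dict standing in for Redis in this sketch.
    """
    post_id = store.get("slug:" + slug)
    if post_id is None:
        return None  # the handler raises a 404 at this point

    entry = dict(store["post:" + post_id])
    for field in ("created", "updated"):
        # Values come back from Redis as strings, hence the float() call.
        entry[field + "_time"] = datetime.fromtimestamp(
            float(entry[field]), tz=timezone.utc
        )
    return entry


store = {
    "slug:hello": "1",
    "post:1": {"title": "Hello", "created": "0", "updated": "86400"},
}
entry = load_entry(store, "hello")
missing = load_entry(store, "no-such-post")
```

Keeping the slug as a separate key means a page view costs two small reads, with no scan over the posts themselves.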

A few more features, like an Atom feed, a JSON and plain text API, and integrations with Twitter, Facebook and Disqus, took no time at all to add. As usual, the most time-consuming part was working through the look and feel.

Time used: 1/2 day coding, 1 day design.
