Thursday, February 26, 2009

Ruby Regular Expression Gotchas

I love Ruby. I love Ruby on Rails. Rarely have I found a language or a framework that just works.

However, you still have to know the finer details sometimes.

I recently made a model for a DNS zone. The name in the model is the "front part" of a fully qualified domain name. For instance, if = "foo" then I would write the name into my name server's configuration files as ""

Knowing that people are evil, I saw that if a user entered a string like " NS\nfoo", I would happily write out two lines, one of them rather bad.

Knowing how easy this sort of data validation is in Rails, I made my model look like:

class Zone < ActiveRecord::Base
  validates_presence_of :name
  validates_uniqueness_of :name
  validates_format_of :name,
    :with => /^[a-zA-Z0-9\-\_\.]+$/,
    :message => "contains invalid characters."
end

Happy, I ran a few tests using my browser and found that I could not insert names with spaces, colons, tabs, etc. Then, several days later, I decided it was time to write tests for this.

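require 'test_helper'

class ZoneTest < ActiveSupport::TestCase
  # A name containing an embedded newline must be rejected.
  def test_name_with_newline_fails
    z = Zone.new(:name => "test\nzone")
    assert !z.valid?
    assert z.errors.on(:name)
  end

  # A name containing a space must be rejected.
  def test_name_with_space_fails
    z = Zone.new(:name => "test zone")
    assert !z.valid?
    assert z.errors.on(:name)
  end
end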

Imagine my surprise when test_name_with_space_fails() passed, and the one I was most worried about, test_name_with_newline_fails(), did not!

Not all regular expressions are alike

The problem is in what I thought ^ and $ actually matched. I thought they meant "match the beginning and end of the string." However, in Ruby they mean "match the beginning and end of each line contained in the string," where lines are divided by newlines. So "test\nzone" validates because the line "test" matches on its own. Oops.
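To see the line-anchor behavior outside of Rails, here is a minimal sketch in plain Ruby. It reuses the pattern from the validation above; the constant LINE_ANCHORED and the helper line_anchored_match? are names made up for this illustration, not part of the original application.

```ruby
# The pattern from the validation above, anchored with ^ and $.
LINE_ANCHORED = /^[a-zA-Z0-9\-\_\.]+$/

# Hypothetical helper: true if the pattern matches anywhere in the string.
def line_anchored_match?(name)
  !(name =~ LINE_ANCHORED).nil?
end

puts line_anchored_match?("test zone")   # false: no line consists only of allowed characters
puts line_anchored_match?("test\nzone")  # true: the line "test" matches by itself
puts line_anchored_match?(" NS\nfoo")    # true: the line "foo" matches by itself
```

Because ^ and $ can anchor at any newline, one clean line anywhere in the input is enough for the whole string to pass.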

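Changing ^ into \A and $ into \Z fixed the failing test. One caveat: \Z still matches just before a newline that ends the string, so a name like "foo\n" would pass. Lowercase \z matches only at the very end of the string and is the stricter choice. Now I'm auditing all the code in this application to see if there are other problems like this.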
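As a rough comparison of the string anchors, the sketch below runs the same character class with \A...\Z and with \A...\z. The constant names UPPER_Z and LOWER_Z and the helper full_match? are invented for this example. It relies on the fact that Ruby's \Z also matches just before a newline at the end of the string, while \z matches only at the true end.

```ruby
# Same character class as the validation, with string anchors instead of line anchors.
UPPER_Z = /\A[a-zA-Z0-9\-\_\.]+\Z/  # \Z: end of string, or just before a final newline
LOWER_Z = /\A[a-zA-Z0-9\-\_\.]+\z/  # \z: end of string only

# Hypothetical helper: true if the given pattern matches the name.
def full_match?(pattern, name)
  !(name =~ pattern).nil?
end

["test", "test\nzone", "test\n"].each do |name|
  puts "#{name.inspect}: \\Z => #{full_match?(UPPER_Z, name)}, \\z => #{full_match?(LOWER_Z, name)}"
end
```

Both patterns reject the embedded newline in "test\nzone", but only the \z version rejects "test\n", which would otherwise carry a stray newline into the configuration file.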

This is just one thing to add to an ever-growing security checklist for my Rails work. It's also a very typical security hole: programmer error.
