Fix THE bug, not A bug

Bugs exist. It's a fact and despite all the effort you can put building the perfect software, bugs will continue to exist.

Building a software is often a compromise between fixing bugs and creating new features. Well-written applications tends to make easier to fix bug so that excellent programmers usually spend less time repairing broken stuff and more time coding.

Because developers usually hate bugs, sometimes they tend to underestimate a problem to reduce the amount of time required to fix it. This is what I call "fixing a bug". Reported issues are often just the top of an iceberg and you should never underestimate an unexpected behavior if you want to "fix the bug".

This is exactly what happened to me a few days ago.

RoboDomain stores hostnames in a database table as lower case strings.

+-----+--------------------
| id  | name
+-----+--------------------
|  1  | weppos.com
|  2  | simonecarletti.com
...

The entire application is built on the assumption the Hostname class always return the name as lower case. Using a standard case is a common practice to make string comparison faster and more accurate.

The Hostname class doesn't care whether you create a new instance with lower, upper or mixed case. It's smart enough to normalize the value just before saving the record into the database.

class Hostname

  before_save :normalize_name

  def normalize_name
    unless name.blank?
      self.name = name.to_s.downcase
    end
  end

end

I want the data to be stored lower case so that I don't need to delegate the database engine the work to lowerize inputs each time I want to run a query.

SELECT * FROM hostnames WHERE LOWER(name) = #{value.downcase}

Moreover, the Hostname table contains an index for the name attribute and the query above causes the database engine to actually ignore the index in favor of a less efficient sequence (or similar) scan mode.

So far, this design worked really well but today my Exception logger tracked a really weird behavior. Someone has been able to create a duplicate hostname.

+-----+--------------------
| id  | name
+-----+--------------------
...
|  4  | simonecarletti.com
|  5  | simonecarletti.com
...

I'm not going into the full explanation of the problem because it's not the reason of this post. However, you should now the Hostname class actually contains a duplication filter based on the validates_uniqueness_of Rails validator.

class Hostname

  # I don't use the :case_sensitive => true option because it
  # performs a LOWER(name) = attribute.downcase comparison
  # that doesn't take advantage of my database index
  validates_uniqueness_of :name, :case_sensitive => false

end

As described by the official documentation, this method doesn't guarantee the absence of duplicate record insertions but, in this case, that wasn't the problem.

After a couple of minutes spent debugging the application, it turned out one of the RoboDomain tools forgot to normalize some input before passing the value to the Hostname class.

def action
  hostname = Hostname.get(params[:query])
  ...
end

At a first glance, that was easy to address. Just downcase the value before the call to the get method.

def action
  hostname = Hostname.get(params[:query].to_s.downcase)
  ...
end

Cool. I fixed a bug. But did I really fix the bug?

Absolutely no. In fact, I only fixed one occurrence of the bug but the bug will happen again and again somewhere else in my program.

Let's understand why.

There are at least three symptom that should warn me this is not the right solution.

  • Normalizing the value for the name attribute has never been a controller responsibility. I would expect the model to do this instead.
  • There are several libraries that call Hostname.get across the application codebase. Why did only that specific one fail? Do the others fail as well if an invalid input is supplied? If yes, should I normalize all the Hostname.get calls everywhere and all the time?
  • What the hell happened to make possible the creation of a duplicate record instead of two records with the same string but different cases?

I quickly noticed the bug wasn't fixed at all and it wasn't as trivial as initially estimated.

The Hostname.get method relies on find_or_create_by_name.

class Hostname
  def self.get(name)
    find_or_create_by_name(name)
  end
end

Let's track the full execution.

+-----+--------------------
| id  | name
+-----+--------------------
...
|  4  | simonecarletti.com
...

1. Hostname.get("simonecarletti.com")
1.1 find("simonecarletti.com") # => #4

1. Hostname.get("simonecarletti.com")
1.1 find("SIMONECARLETTI.com") # => nil
1.2 create :name => "SIMONECARLETTI.com", ...
1.2.2. validates_uniqueness_of :name is triggered but again, it performs a checks for SIMONECARLETTI.com
1.3 :normalize_name is triggered
1.4 record is saved # => #5

+-----+--------------------
| id  | name
+-----+--------------------
...
|  4  | simonecarletti.com
|  5  | simonecarletti.com
...

Forget for a moment the fact the database should have a database index (the full architecture isn't that easy actually, I simplified it just to make this article more easy to understand).

Again, the solution seems really straightforward. Just force Hostname.get to normalize the value.

class Hostname
  def self.get(name)
    find_or_create_by_name(name.to_s.downcase)
  end
end

But once more, I didn't fix the problem but just a problem. Can you guess why?

  • Despite Hostname.get now seems to work, find_or_create_by_name is still buggy. Using it elsewhere in my application forces me to apply all normalization filters all the time. What would happen if, one day, I decide to alter the filter chain with a new filter?
  • Hostname already has a normalization method called :normalize_name. Shouldn't Hostname take advantage of it?

At this point, it becomes clear the problem is the validates_uniqueness_of :name validator.

Once more, the final solution looked incredibly simple once completed. Instead of using the standard validates_uniqueness_of :name with the :case_sensitive => true option, I decided to wrote my custom validator to transfer the responsibility to normalize the values from SQL level to Ruby level. In other words, my validator is based on the assumption that all data stored in the database is lower case and I don't need to LOWER()ize them at runtime. Instead, I just need to ensure the string I want to compare is normalized without actually altering the state of the original object itself.

+-----+--------------------
| id  | name
+-----+--------------------
...
|  4  | simonecarletti.com
...
h = Hostname.new(:name => "SIMONECARLETTI.com")
h.name    # => "SIMONECARLETTI.com"

h.valid?  # => false
# The validator now executes the right SQL command
# SELECT * FROM ... WHERE name = 'simonecarletti.com'
# instead of the old one
# SELECT * FROM ... WHERE name = 'SIMONECARLETTI.com'
# Also, it doesn't ask the database engine a runtime LOWER case transformation
# and the script can take advantage of the database index
# SELECT * FROM ... WHERE LOWER(name) = 'simonecarletti.com'

h.name    # => "SIMONECARLETTI.com"

Ok, now I'm pretty sure I fixed the bug and not only a bug.

As you can see, it's quite easy to underestimate a bug. You should always carefully investigate a bug report. The more easy the bug looks like, the more attention you should pay to.

Once you spot a bug and you have been able to reproduce it, inspect the execution stack and find the level that originates the problem. Never try to immediately fix an issue where you just discovered it or you might end up fixing a bug instead of fixing the bug.