Yesterday’s Thoughts

May 23, 2006

Bug & Patch for acts_as_ferret

I’m putting the finishing touches on Endurance Central. Basically the UI is done and all of the user functionality is complete, if not yet fully exposed. I have started to collect race data (the current data on the site is unreliable test data – I need to clear that out).

One of the last features that I added was to make it possible for users to report inaccurate data. This feature was important to the community aspect of the site and is the bridge to get users into the membership system and to take some of the responsiblity for the content off of me.

I allowed the user to report an error on any event. To do this, I subclassed by Event model into LiveEvents (those you see) and EventCorrections, and then I created an administrative interface to accept or reject corrections in the EventCorrections. With Rails and Ruby this was brilliantly easy; using single table inheritance, it took less time to do than it had taken me to think about it and almost every point worked on the first pass.

But, and you knew there was going to be a “but” didn’t you, this somehow broke by search function for the site. Search had also been extremely simple to implement. I used Ferret in combination with the acts_as_ferret plugin for Rails. Somehow when I moves to LiveEvents from Events, searches that returned no results, broke, throwing mysterious exceptions without and traces.

I didn’t have a test for this case (I was still testing the base Events class – doh!) so I only saw it occassionally in development, but when I started seeing it in my production code, I tracked it down. I thought that it was from an upgrade to acts_as_ferret from a few weeks ago. Nope. I somehow caught

Here is the relevant bit of acts_as_ferret.

def find_by_contents(q, options = {}
   id_array = [] scores_by_id = {}
   find_id_by_contents(q, options) do |element|
     id_array < < id = element[:id].to_i
     scores_by_id[id] = element[:score]  
   end
   begin
      if self.superclass == ActiveRecord::Base
        result = self.find(id_array)
      else
        # no direct subclass of Base --> STI
        # TODO: AR will filter out hits from other classes for us, but this
        # will lead to less results retrieved --> scoping of ferret query
        # to self.class is still needed.
        if id_array.empty?
          result = []
        else
           result = self.find(:all, find_options.merge(:conditions => ["id in (?)",id_array]))
        end
     end
  rescue
     logger.debug "REBUILD YOUR INDEX! One of the id's didn't have an associated record: #{id_array}"
   end

   # sort results by score (descending)
   result.sort! { |b, a| scores_by_id[a.id] < => scores_by_id[b.id] }

   logger.debug "Query: #{q}nResult id_array: #{id_array.inspect},nresult: #{result},nscores: #{scores_by_id.inspect}"

   return result
end

I don’t really understand the comment, “no direct subclass of Base –> STI TODO: etc.” It seems like using the AR filtering would be a good thing in all cases. Maybe this would break paginate? Anyway, I was seeing the logger.debug in dev, but I had misattributed it to some manual interventions that I had made in the db to remove some old data. Wrong.

The problem was that by changing my model to STI, the LiveEvent was not a subclass of ActiveRecord::Base, so instead of using the well-behaved return = self.find(id_array) my queries were now going through the ill-behaved return = self.find(:all, :conditions =&gt; ["id in (?)",id_array]). Calling find with any empty array returns an empty array. Calling find with condtions and a nil argument throws an exception.

Since this exception is caught, without doing anything worthwhile, the value of return is nil. Calling sort! on nil a few lines later throws an error, but since this code is in a plugin, there is no usable trace.

Lessons:

  • Refactor the tests at the same time as code.
  • Don’t allow errors in the logs. These always came up when I was working on something unrelated, so I never took the time to track them down.
  • Anticipate the edge conditions. Seriously, 0 results is not unlikely.

I filed a ticket on acts_as_ferret and included a patch.There were a couple of ways I could have gone with the patch.

  1. Added return = [] to the rescue.
  2. Wrap the sort in a test for return.nil?
  3. Initialize return = [] at the top.
  4. Add another condition within the block, if id_array.size == 0 return = [] elsif self.superclass ...

I actually did all of these in that order. They all worked in my tests. I decided on #4 because it was the most contained patch and had the added benefit of skipping the find(id_array) if the id_array was empty. Saving a db call isn’t something I would have gone out of my way to do without a strong motivation, but it was a side benefit.

Post a Comment