Skip to content

Instantly share code, notes, and snippets.

@urmastalimaa
Last active August 29, 2015 14:24
Show Gist options
  • Save urmastalimaa/fd8c9ae4310dd4e26eec to your computer and use it in GitHub Desktop.
Save urmastalimaa/fd8c9ae4310dd4e26eec to your computer and use it in GitHub Desktop.
How to identify and debug N+1 ActiveRecord queries.

Description

Unless you are fetching millions of rows,
the problem is most likely not in the query,
but in the amount of queries.

You are probably iterating over an entity and making many queries per each entity,
which kills performance (also known as the N+1 queries problem).

For example in the case of the engagement logs,
every engagement was iterated over and
for every engagement ~20 separate DB queries were made.
This means that for 100 engagements 2000 queries are made.
Even if every query is fast, the overhead in doing 2000 queries is immense.

Identifying the N+1 queries problem

  1. Turn on ActiveRecord logging for your process.

     ActiveRecord::Base.logger = Logger.new(STDOUT)
    
  2. As an alternative use Logger.new(File.new('queries', 'w+')) to see the output in a different file.

  3. Make your process easily repeatable (start you service separately and send messages to it manually, or much better, create a test which has a good amount of data).

  4. Identify your main iteration and mark it (log newlines or some separators).

  5. Run the process and identify in the log all the queries that are made for each iteration.

In the end ideally there should be a few queries in the beginning and all the processing should stay in the memory.

Example of output in which some survey answers are being queried per every engagement.

Salemove::Models::Surveys::QuestionAnswer::TextAnswer Load (3.2ms)  SELECT  "question_text_answers".* FROM "question_text_answers" WHERE "question_text_answers"."question_id" = $1 AND "question_text_answers"."engagement_id" = $2 LIMIT 1  [["question_id", 6], ["engagement_id", 1072]]
Salemove::Models::Surveys::QuestionAnswer::ScaleAnswer Load (0.9ms)  SELECT  "question_scale_answers".* FROM "question_scale_answers" WHERE "question_scale_answers"."question_id" = $1 AND "question_scale_answers"."engagement_id" = $2 LIMIT 1  [["question_id", 5], ["engagement_id", 1072]]
Salemove::Models::Surveys::QuestionAnswer::ScaleAnswer Load (0.4ms)  SELECT  "question_scale_answers".* FROM "question_scale_answers" WHERE "question_scale_answers"."question_id" = $1 AND "question_scale_answers"."engagement_id" = $2 LIMIT 1  [["question_id", 4], ["engagement_id", 1072]]
Salemove::Models::Surveys::QuestionAnswer::TextAnswer Load (0.3ms)  SELECT  "question_text_answers".* FROM "question_text_answers" WHERE "question_text_answers"."question_id" = $1 AND "question_text_answers"."engagement_id" = $2 LIMIT 1  [["question_id", 6], ["engagement_id", 1071]]
Salemove::Models::Surveys::QuestionAnswer::ScaleAnswer Load (0.3ms)  SELECT  "question_scale_answers".* FROM "question_scale_answers" WHERE "question_scale_answers"."question_id" = $1 AND "question_scale_answers"."engagement_id" = $2 LIMIT 1  [["question_id", 5], ["engagement_id", 1071]]
Salemove::Models::Surveys::QuestionAnswer::ScaleAnswer Load (0.3ms)  SELECT  "question_scale_answers".* FROM "question_scale_answers" WHERE "question_scale_answers"."question_id" = $1 AND "question_scale_answers"."engagement_id" = $2 LIMIT 1  [["question_id", 4], ["engagement_id", 1071]]

Common problems

  1. Using ActiveRecord associations without eager loading or preloading. Read about different ways of preloading here.

     Car.eager_load(:parts)
     Car.eager_load(parts: :manufacturer)
     Car.preload(:parts)
    
  2. Using eager/preloading but doing new queries.
    When an associations is preloaded Car.preload(:parts),
    but you are ordering/filtering it using AR door_parts = Car.parts.where(name: 'door')
    then the preloaded association can not be used.
    Rewrite the ordering/filtering using .select and .sort_by
    or other Enumerable functions, which work in the memory.

  3. Simply doing queries in the iteration.
    find, find_by and anything similar creates new queries.
    Sometimes queries are hidden in functions deep in the entities.
    Either preload or fetch all the necessary information before the iteration.

     # In the case of one-to-many association
    
     # Either preload
     cars = Car.preload(:parts).where(id: car_ids)
     cars.each {|car| parts = car.parts}
    
     # or query by ids and use Enumerable#group_by
     parts_by_car = Parts.where(car_id: car_ids).group_by(&:car_id)
     cars.each {|car| parts = parts_by_car[car.id]}
    
     # In the case of one-to-one association
    
     # Either preload
     cars = Car.preload(:owner).where(id: car_ids)
     cars.each {|car| owner = car.owner}
    
     # or query by ids and put into a Hash
     owners_by_car = Hash[
       Owner.where(car_id: car_ids)
       .map {|owner| [owner.car_id, owner] }
     ]
     cars.each {|car| owner = owners_by_car[car.id]}
    

When in trouble

Sometimes the queries are hard to find but that's why you turned on AR logging.
Hone in on the place where the query is made, it can be quite deep in your logic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment