Eliminating Nondeterministic (“Flaky”) Tests in Ruby and RSpec

At Panorama, we strongly believe in automated tests. No code gets deployed without passing both a thorough code review and a battery of thousands of automated tests, and no code makes it through code review without updating tests for the new features it’s adding. Automated tests help us build new features and refactor old code without introducing bugs, and are a big part of the reason we are able to confidently deploy new changes to our production apps multiple times a day.

What are flaky tests?

But with an automated test suite comes the dreaded possibility of tests that fail nondeterministically—that is, most of the time they pass, but every once in a while they fail for no obvious reason. When the tests are retried, they pass again. Flaky tests.

Flaky tests reduce developer confidence in the test suite, waste our time when we need to have tests retry until they succeed, and can delay the release of changes or even critical bugfixes. As a result, we take a hard stance and make sure to squash flaky tests whenever we see them.

Common causes of test flakiness

Over time, we’ve found that in our codebase flaky tests tend to have one of three causes:

Cause #1: Tests share state

In RSpec, this typically means we’re creating something in our database in a before(:all) block rather than in a before(:each) or let block. Because we use RSpec’s transactional fixtures, any database insertions or updates that happen in a before(:each) or let block are rolled back after the test executes, whereas records created in a before(:all) block persist and can leak into later tests.

To track down these instances, we’ve eliminated before(:all) from all but a few special cases in our test suite. In addition, we’ve added a special after(:all) block that runs after each test file and checks whether anything has been left in the database:
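A minimal sketch of such a check, assuming a Rails app with ActiveRecord models. The `leftover_records` helper is a hypothetical name introduced for illustration, and the guard around the RSpec wiring is only there so the snippet loads outside a Rails test suite:

```ruby
# Hypothetical helper: maps model name => row count for every model that
# still has rows. `models` can be any objects responding to `name` and
# `count`, which ActiveRecord model classes do.
def leftover_records(models)
  models.each_with_object({}) do |model, leftovers|
    count = model.count
    leftovers[model.name] = count if count.positive?
  end
end

# Wiring for spec_helper.rb. The guard only lets this snippet load where
# RSpec or ActiveRecord is absent; a real spec helper would not need it.
if defined?(RSpec) && defined?(ActiveRecord::Base)
  RSpec.configure do |config|
    config.after(:all) do
      # Only model classes that are already loaded appear in `descendants`.
      models = ActiveRecord::Base.descendants
                                 .reject(&:abstract_class?)
                                 .select(&:table_exists?)
      leftovers = leftover_records(models)
      raise "Records left in the database: #{leftovers.inspect}" unless leftovers.empty?
    end
  end
end
```

If some tables legitimately hold seed or reference data during tests, those models would need to be excluded from the list before the check runs.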

Cause #2: Tests sort auto-incrementing names

We use the fabrication gem to easily build objects and save them to the database, and fabrication provides a sequence feature that lets you auto-increment fields. For instance:
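As a rough sketch of that kind of setup: the fabricator definition in the comments follows fabrication’s sequence syntax, while the runnable part uses a plain-Ruby stand-in (`StudentSequence` is a hypothetical name, and the sequence name and the 73 earlier calls are assumptions) so it works without the gem or a database.

```ruby
# With fabrication, the definition would look roughly like this
# (kept in comments so the snippet runs without the gem):
#
#   Fabricator(:student) do
#     name { sequence(:student_name) { |i| "Student #{i}" } }
#   end
#
# Plain-Ruby stand-in: one counter shared by every caller in the process,
# just as a fabrication sequence is shared by every test in the run.
module StudentSequence
  @counter = 0

  def self.next_name
    name = "Student #{@counter}"
    @counter += 1
    name
  end
end

# Earlier tests in the same run consume values from the sequence...
73.times { StudentSequence.next_name }

# ...so the test under discussion gets whatever numbers come next.
names = Array.new(3) { StudentSequence.next_name }

# The test expects reverse alphabetical order to equal reverse creation order.
expected = names.reverse
actual   = names.sort.reverse
```

Here `expected` and `actual` agree, because all three names have the same number of digits.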

But since our tests run in a random order and these sequences are global, the above code could generate students with names "Student 73", "Student 74", and "Student 75", or any other sequence of integers, depending on how many previous tests also called Fabricate(:student).

Since these strings are then used for sorting, we’ll run into problems when crossing number-of-digit boundaries, like with "Student 99", "Student 100", and "Student 101". Strings are compared character by character, so "Student 99" sorts after both "Student 100" and "Student 101". In that case, the reverse alphabetical sort would be "Student 99", "Student 101", and "Student 100", causing our test to fail.
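The boundary effect is easy to reproduce in plain Ruby, with no database or fabrication involved:

```ruby
# Names in creation order, crossing the two-digit/three-digit boundary.
created = ["Student 99", "Student 100", "Student 101"]

# What a test that assumes "newest name sorts last" would expect.
expected = created.reverse
# => ["Student 101", "Student 100", "Student 99"]

# What a reverse alphabetical sort actually returns: "9" sorts after "1",
# so "Student 99" ends up first.
actual = created.sort.reverse
# => ["Student 99", "Student 101", "Student 100"]
```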

While we could track down every place where we rely on this sort of automatic naming and sorting and change those tests, we found it much easier to globally start fabrication sequences at very high values to avoid this problem:
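A sketch of that configuration, using fabrication’s `sequence_start` setting. The value 10_000 is an illustrative assumption, and the `defined?` guard only exists so the snippet also runs where the gem is not loaded:

```ruby
# In spec_helper.rb or rails_helper.rb, after fabrication has been required.
if defined?(Fabrication)
  Fabrication.configure do |config|
    config.sequence_start = 10_000 # illustrative starting value
  end
end

# Why this helps: every number from 10_000 to 99_999 has five digits,
# so string order and numeric order agree across that whole range.
start = 10_000
names = (start...(start + 500)).map { |i| "Student #{i}" }
sorted_consistently = names.sort == names
```

The problem only returns if a single test run consumes enough sequence values to reach the next digit boundary.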

Cause #3: Tests manually set the id (the primary key) of database rows

Creating one record normally and a second with a hard-coded id (say, school1 with no id given and school2 with id 42) might seem innocent, but when the monotonically increasing id that the database generates for school1 happens to be 42, the creation of school2 will raise an error because two database records can’t have the same primary key.
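The collision can be sketched without a real database. In the snippet below, the `Fabricate` calls in the comments show the pattern in question, while `FakeTable` and `run_test` are hypothetical stand-ins that mimic an auto-incrementing primary key with a uniqueness constraint:

```ruby
# The pattern in a spec (kept in comments so the snippet runs without the gem):
#
#   school1 = Fabricate(:school)          # id assigned by the database
#   school2 = Fabricate(:school, id: 42)  # id hard-coded by the test

class DuplicateKeyError < StandardError; end

# Hypothetical in-memory table with an auto-incrementing primary key.
class FakeTable
  def initialize(next_id)
    @next_id = next_id
    @rows = {}
  end

  # Inserts a row, generating an id when none is given.
  def insert(id: nil)
    if id.nil?
      id = @next_id
      @next_id += 1
    end
    raise DuplicateKeyError, "duplicate primary key #{id}" if @rows.key?(id)

    @rows[id] = true
    id
  end
end

# Runs the two inserts from the spec above against the given table.
def run_test(table)
  table.insert         # school1
  table.insert(id: 42) # school2
  :passed
rescue DuplicateKeyError
  :failed
end

usual_run   = run_test(FakeTable.new(7))  # counter far from 42
unlucky_run = run_test(FakeTable.new(42)) # counter happens to sit at 42
```

The starting counter stands in for however many rows earlier tests have inserted, which is why the failure appears only on some runs.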

This error can be hidden by more subtle code, like:
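One hypothetical way this can happen is a shared attributes hash that carries an id along with it. The `SHARED_SCHOOL_ATTRS` constant and its values are assumptions for illustration:

```ruby
# Hypothetical shared attributes, e.g. defined once in a support file.
# The hard-coded :id is easy to overlook when the hash is reused.
SHARED_SCHOOL_ATTRS = { id: 42, name: "Lincoln High", city: "Boston" }.freeze

# In a spec, far away from the definition above:
#
#   school1 = Fabricate(:school)
#   school2 = Fabricate(:school, SHARED_SCHOOL_ATTRS.merge(name: "Other School"))
attrs = SHARED_SCHOOL_ATTRS.merge(name: "Other School")
carries_id = attrs.key?(:id) # the primary key is still in the hash

# Dropping the key before creating the record leaves id assignment
# to the database.
safe_attrs = attrs.reject { |key, _value| key == :id }
```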

The good news is that this issue is easy to spot as a red flag in code reviews: your database should set primary key id fields, not your application code.

Check your code!

So if you’ve got a test that sometimes passes and sometimes fails, try checking these three things:

  1. Does the test (or any others) use before(:all)? If so, change that to before(:each) or let blocks instead.
  2. Does the test check the sorted order of strings that contain digits? If so, change the test or, better yet, configure Fabrication or a similar tool to start these numbers at much higher values.
  3. Does the test set the id of any model? If so, rework the test so it does not.

If none of those issues turns out to be the cause, we’d love to hear what was! And if you’re interested in working on a codebase that takes tests seriously, we’re hiring!
