@johncarney
Last active July 15, 2024 22:32
RSpec Flake Finder
#!/usr/bin/env ruby
# frozen_string_literal: true

require "json"
require "open3"
require "pathname"
require "tempfile"

# Detects intermittent failures in an RSpec test suite and reports on varying
# coverage.
#
# Usage:
#   rspec-flake-finder [RUNS] [OPTIONS]
#
# Arguments:
#   RUNS  The number of runs to perform before exiting. If not given, the
#         script will run until manually interrupted.
#
# Options:
#   --only-<tag name>-tests  Only run tests with the specified tag.
#   --no-<tag name>-tests    Exclude tests with the specified tag.
#
# This script runs the RSpec test suite multiple times and reports which
# tests are failing and the rate at which they fail. This is useful for
# identifying tests that fail intermittently, which can cause problems for
# continuous integration pipelines and make it difficult to trust the test
# suite.
#
# If you have SimpleCov set up, it also reports on the range of line coverage
# across runs. This can help set realistic baseline coverage thresholds.
# Also, if the coverage varies across test runs, that could indicate that the
# test suite is not deterministic, which could potentially lead to bugs on
# infrequently visited code paths leaking into production.
#
# Notes:
#   This has no dependencies other than RSpec, which it runs via
#   `bundle exec rspec`. If you don't have SimpleCov installed or set up, it
#   will simply not report on coverage.
#
#   If your test suite takes a long time to run, this script will take a
#   correspondingly long time to complete. You may want to run it overnight
#   or over a weekend.
#
# Plans:
#   - Report intermittently covered lines of code.
#   - Non-interactive mode.
#   - Publish as a gem.
module RSpecFlakeFinder
  # Adds a class-level `call` that builds an instance and invokes `#call` on it.
  module Callable
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def call(...) = new(...).call
    end
  end

  # Entry point: runs the suite repeatedly and redraws the report after each run.
  class Main
    include Callable

    attr_reader :arguments, :options, :runs, :rspec_failures

    def initialize(argv)
      @options, @arguments = argv.partition { |arg| arg.start_with?("-") }
      @runs = []
      @rspec_failures = {}
    end

    def call
      render_report
      while runs_remaining.positive?
        runs << RSpecRunner.call(tags:)
        rspec_failures.merge!(runs.last.rspec_failures.group_by(&:id).transform_values(&:first))
        render_report
      end
    ensure
      $stdout.write("\r\e[0K")
    end

    def max_runs
      arguments.first&.to_i
    end

    def runs_remaining
      return Float::INFINITY unless max_runs

      max_runs - runs.size
    end

    # Translates --only-<tag>-tests / --no-<tag>-tests into RSpec tag filters.
    # Options that do not match the pattern are skipped, so an unrelated flag
    # no longer raises NoMethodError on a nil match.
    def tags
      options.filter_map do |option|
        match = option.match(/\A--(?<filter_type>only|no)-(?<tag>.+?)-tests\z/)
        next unless match

        [match[:filter_type] == "no" ? "~" : "", match[:tag]].join
      end
    end

    def render_report
      report = Reporter.call(self)
      report += ["", "Press Ctrl-C to exit"]
      report = report.join("\e[0K\n")
      $stdout.write "\e[H#{report}\e[0J"
      $stdout.flush
    end
  end

  class Reporter
    include Callable

    attr_reader :tracker

    def initialize(tracker)
      @tracker = tracker
    end

    def call
      report = ProgressReporter.call(tracker)
      coverage = CoverageReporter.call(tracker)
      report += ["", coverage] unless coverage.empty?
      failures = FailureReporter.call(tracker, indent: " ")
      if failures.any?
        report += ["", "Failures"]
        report += failures
      end
      report
    end
  end

  # Formats the lowest and highest line coverage seen across runs. Returns an
  # empty array when no coverage data is available.
  class CoverageReporter
    include Callable

    attr_reader :tracker

    def initialize(tracker)
      @tracker = tracker
    end

    def call
      return [] unless coverage_reports.any?

      range = coverage_reports.map(&:coverage).minmax.uniq.map { |value| Utility.to_percentage(value) }
      "Coverage: #{range.join(' - ')}"
    end

    def runs = tracker.runs

    def coverage_reports
      @coverage_reports ||= runs.map(&:coverage_report).compact
    end
  end

  # Builds a table of failing examples and how often each one failed.
  class FailureReporter
    include Callable

    HEADER = ["Line", "Failure rate", "Percentage", "Description"].freeze

    attr_reader :tracker, :indent, :separator

    def initialize(tracker, indent:, separator: " | ")
      @tracker = tracker
      @indent = indent
      @separator = separator
    end

    def call
      return [] if rspec_failures.empty?

      report = [
        HEADER,
        *failure_rates
      ]
      # Only add a total row when more than one example has failed.
      report << total_failure_rate if report.size > 2
      TableFormatter.call(report, indent:, separator:)
    end

    def failure_rates
      rspec_failure_counts.map do |example_id, count|
        example = rspec_failures[example_id]
        rate = Rational(count, runs.size)
        [example.line, rate, Utility.to_percentage(rate), example.full_description]
      end
    end

    def total_failure_rate
      total_rate = Rational(rspec_failure_counts.values.sum, runs.size)
      ["Total", total_rate, Utility.to_percentage(total_rate), ""]
    end

    def runs = tracker.runs
    def rspec_failures = tracker.rspec_failures

    # Number of runs in which each example failed, keyed by example id.
    def rspec_failure_counts
      runs.each_with_object({}) do |run, memo|
        run.rspec_failures.map(&:id).each do |example_id|
          memo[example_id] ||= 0
          memo[example_id] += 1
        end
      end
    end
  end

  # Reports elapsed time, average run time, and an estimate of time remaining.
  class ProgressReporter
    include Callable

    attr_reader :tracker

    def initialize(tracker)
      @tracker = tracker
    end

    def call
      report = [["Total run time:", total_run_time]]
      report << ["Average run time:", average_run_time] if average_run_time
      report += remaining if max_runs && runs_remaining.positive?
      TableFormatter.call(report)
    end

    def remaining
      report = [["Runs remaining:", "#{runs_remaining}/#{max_runs}"]]
      report << ["Estimated time remaining:", estimated_time_remaining] if estimated_time_remaining
      report
    end

    def runs = tracker.runs
    def max_runs = tracker.max_runs
    def runs_remaining = tracker.runs_remaining

    def total_run_time
      runs.reduce(Duration.new(0.0)) { |memo, run| memo + run.duration }
    end

    def average_run_time
      return unless runs.size.positive?

      total_run_time / runs.size
    end

    def estimated_total_time
      # average_run_time is nil until the first run completes.
      return unless max_runs && average_run_time

      average_run_time * max_runs
    end

    def estimated_time_remaining
      return unless max_runs && average_run_time

      average_run_time * runs_remaining
    end
  end

  # Overall line coverage for one run, read from SimpleCov's result set.
  class CoverageReport
    attr_reader :file_coverage

    def initialize(file_coverage)
      @file_coverage = file_coverage
    end

    # Covered lines divided by significant lines, summed over all files.
    def coverage
      @coverage ||= begin
        covered_lines, significant_lines = file_coverage.values.map do |fcov|
          [fcov.covered_lines, fcov.significant_lines]
        end.transpose.map(&:sum)
        Rational(covered_lines, significant_lines)
      end
    end

    # Returns nil when the result set is missing or has no RSpec coverage data.
    def self.load(filepath: "coverage/.resultset.json")
      resultset = Pathname(filepath)
      return unless resultset.exist?

      coverage = JSON.parse(resultset.read).dig("RSpec", "coverage")
      return unless coverage&.any?

      file_coverage = coverage.to_h { |file, data| [file, FileCoverage.from_report(file, data)] }
      new(file_coverage)
    end
  end

  class FileCoverage
    attr_reader :file, :line_coverage

    def initialize(file, line_coverage)
      @file = file
      @line_coverage = line_coverage
    end

    def covered_lines
      @covered_lines ||= line_coverage.compact.select(&:nonzero?).size
    end

    def significant_lines
      @significant_lines ||= line_coverage.compact.size
    end

    def coverage
      Rational(covered_lines, significant_lines)
    end

    def self.from_report(file, report)
      new(file, report["lines"])
    end
  end

  # Thin wrapper around one example entry from RSpec's JSON formatter output.
  class RSpecExample
    def initialize(example)
      @example = example
    end

    def id = @example["id"]
    def file_path = @example["file_path"]
    def line_number = @example["line_number"]
    def full_description = @example["full_description"]
    def status = @example["status"]
    def pending_message = @example["pending_message"]
    def failed? = status == "failed"
    def line = "#{file_path}:#{line_number}"

    def self.all_from_file(file)
      JSON.parse(Pathname(file).read)["examples"].map { |example| new(example) }
    end
  end

  # Runs the suite once, collecting failed examples, coverage, and duration.
  class RSpecRunner
    include Callable

    attr_reader :tags, :rspec_failures, :coverage_report, :duration

    def initialize(tags: [])
      @tags = tags
    end

    def call
      Tempfile.create("rspec-flake-finder") do |tempfile|
        @duration = Duration.measure { Open3.capture3(*command(tempfile.path)) }
        tempfile.close
        @rspec_failures = RSpecExample.all_from_file(tempfile.path).select(&:failed?)
        @coverage_report = CoverageReport.load
      end
      self
    end

    # RSpec writes its JSON report to `outfile`, which is parsed after the run.
    def command(outfile)
      tag_options = tags.map { |tag| ["--tag", tag] }.flatten
      [
        "bundle", "exec", "rspec",
        *tag_options,
        "--format", "json",
        "--out", outfile.to_s
      ]
    end
  end

  # A span of time in seconds, with arithmetic and human-readable formatting.
  class Duration
    attr_reader :duration

    def initialize(duration)
      @duration = duration
    end

    def to_f
      duration
    end

    def +(other)
      Duration.new(duration + other.to_f)
    end

    def /(other)
      Duration.new(duration / other.to_f)
    end

    def *(other)
      Duration.new(duration * other.to_f)
    end

    def to_s # rubocop:todo Metrics/MethodLength
      # Always return a String, as callers of to_s expect.
      return "" unless duration

      total_seconds = duration.round
      total_minutes = total_seconds / 60
      hours = total_minutes / 60
      minutes = total_minutes % 60
      seconds = total_seconds % 60
      if hours.positive?
        "#{hours}h#{minutes}m#{seconds}s"
      elsif minutes.positive?
        "#{minutes}m#{seconds}s"
      elsif seconds > 1
        "#{seconds}s"
      else
        seconds = duration % 60
        "#{seconds.round(2)}s"
      end
    end

    def self.measure
      # Use the monotonic clock so system clock adjustments cannot skew timings.
      start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      yield if block_given?
      new(Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time)
    end
  end

  # Pads each cell to its column's width and joins the cells into text rows.
  class TableFormatter
    include Callable

    attr_reader :rows, :separator, :indent

    def initialize(rows, separator: " ", indent: "")
      @rows = rows
      @separator = separator
      @indent = indent
    end

    def call
      rows.map do |row|
        [indent, row.zip(column_widths).map { |cell, width| cell.to_s.ljust(width) }.join(separator)].join
      end
    end

    def column_widths
      @column_widths ||= rows.transpose.map { |column| column.map(&:to_s).map(&:length).max }
    end
  end

  class Utility
    def self.to_percentage(value)
      "#{(100 * value.to_f).round(2)}%"
    end
  end
end

if __FILE__ == $PROGRAM_NAME
  Thread.report_on_exception = false
  Signal.trap("INT") { exit }
  RSpecFlakeFinder::Main.call(ARGV)
end
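
The `--only-<tag name>-tests` and `--no-<tag name>-tests` options documented in the header end up as RSpec `--tag` filters, with a `~` prefix marking an exclusion. The following is a minimal standalone sketch of that translation, using the same pattern as `Main#tags`. The names `TAG_OPTION`, `tag_filters`, and `rspec_command` are hypothetical helpers introduced for illustration only, and this sketch skips options that do not match the pattern.

```ruby
# Hypothetical helpers for illustration; they are not part of the script.
TAG_OPTION = /\A--(?<filter_type>only|no)-(?<tag>.+?)-tests\z/

# Converts command-line options into RSpec tag filters.
# "--only-slow-tests" becomes "slow"; "--no-js-tests" becomes "~js".
def tag_filters(options)
  options.filter_map do |option|
    match = option.match(TAG_OPTION)
    next unless match # skip anything that is not a tag filter option

    prefix = match[:filter_type] == "no" ? "~" : ""
    "#{prefix}#{match[:tag]}"
  end
end

# Builds the argument list for a single RSpec invocation.
def rspec_command(options)
  tag_options = tag_filters(options).flat_map { |tag| ["--tag", tag] }
  ["bundle", "exec", "rspec", *tag_options]
end
```

For example, `rspec_command(["--only-slow-tests", "--no-js-tests"])` would build a command that passes `--tag slow --tag ~js` to RSpec.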
@johncarney
Author

I was inspired to write this script while adding SimpleCov to a test suite at a new job. I first noticed some intermittently failing tests, then I noticed that the reported coverage was not consistent. So I threw together a very simple script to identify the intermittent tests and the lowest level of coverage. This is that script, but expanded somewhat.
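
As a rough illustration of the two measurements mentioned above (which tests fail intermittently, and the lowest level of coverage), the sketch below computes a per-example failure rate and the minimum and maximum line coverage across runs. The run data is made up, and the `Run` struct and the method names are hypothetical simplifications rather than the script's own classes. It assumes per-line hit counts in which `nil` marks a line that is not significant.

```ruby
# Hypothetical, simplified stand-in for one run of the test suite.
Run = Struct.new(:failed_ids, :line_coverage, keyword_init: true)

# Fraction of runs in which each example failed, keyed by example id.
def failure_rates(runs)
  counts = Hash.new(0)
  runs.each { |run| run.failed_ids.each { |id| counts[id] += 1 } }
  counts.transform_values { |count| Rational(count, runs.size) }
end

# Line coverage for one run: covered lines divided by significant lines.
# nil entries are lines that cannot be executed (blank lines, comments).
def coverage(line_coverage)
  significant = line_coverage.compact
  Rational(significant.count(&:positive?), significant.size)
end

# Lowest and highest coverage observed across all runs.
def coverage_range(runs)
  runs.map { |run| coverage(run.line_coverage) }.minmax
end

# Made-up data: one example fails in half of the runs, and the second
# line of the file is only exercised in those same runs.
runs = [
  Run.new(failed_ids: [], line_coverage: [1, 0, nil, 3]),
  Run.new(failed_ids: ["./spec/a_spec.rb[1:1]"], line_coverage: [1, 1, nil, 3]),
  Run.new(failed_ids: [], line_coverage: [1, 0, nil, 3]),
  Run.new(failed_ids: ["./spec/a_spec.rb[1:1]"], line_coverage: [1, 1, nil, 3])
]

rates = failure_rates(runs)
lowest, highest = coverage_range(runs)
```

The lowest value of the range is the one that matters when choosing a coverage threshold that should never fail spuriously.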