@cmelchior
Last active March 7, 2022 18:58
Small Ruby script that ranks all open Github issues in a repo by popularity
#!/usr/bin/ruby -w
#
# This script analyzes a Github repo and tries to rank all open issues by popularity.
# WARNING: This script runs quite a few HTTP requests against Github to do this analysis: at least one per issue.
# The current limit on Github is 5000 requests per hour: https://developer.github.com/v3/rate_limit/
#
# Usage: ruby ./github-issue-rankings.rb <github_user/repo> <github_api_access_token>
# Example: ruby ./github-issue-rankings.rb realm/realm-java $GITHUB_ISSUE_RANKINGS_ACCESS_TOKEN
#
# The algorithm for ranking issues is as follows:
#
# 1. Search the issue description and every comment for a positive keyword (see the list of keywords in the model).
#    Each post is counted at most once, no matter how many keywords it contains.
# 2. Add all positive reactions (+1, heart, hooray) on the issue description and on each comment.
# 3. Negative reactions are ignored.
#
# Output: CSV file '<repo>-issues.csv'
# - <Title>, <Url>, <totalReactions>, <biggestSingleComment>, <reactions/day>
#
# Rationale:
# - Github does not have a concept similar to "Google stars", so it is relatively hard to rank issues with respect to
# popularity.
# - Reactions are still a relatively new concept, so a lot of older issues are littered with "+1" comments. Only counting
# reactions would thus favour new issues.
# - If a single post is getting a lot of votes, it most likely describes a solution people want.
# - Reactions/Day is an attempt to capture new issues that might have strictly fewer upvotes but seem to acquire them
# faster, which would indicate they are more popular.
#
# Known errors:
# - When folding issues into other issues we sometimes write something like "+15 from other issue". Such a comment is
# counted as a single vote, not 15, mainly because it is fairly difficult to tell the difference between
# "+1000000 for awesome idea" and "+15 from other issue".
#
# Author: Christian Melchior <cm@realm.io>
#
require 'octokit'
require 'json'
require 'date'
require 'csv'
# Validate input
if ARGV.length <= 0 || ARGV.length > 2
  puts "Usage: ruby ./github-issue-rankings.rb <github_user/repo-name> <github_api_access_token>"
  exit
end
$repo = ARGV[0]
if ARGV[1]
  $access_token = ARGV[1]
elsif ENV['GITHUB_ISSUE_RANKINGS_ACCESS_TOKEN']
  $access_token = ENV['GITHUB_ISSUE_RANKINGS_ACCESS_TOKEN']
else
  puts "Usage: ruby ./github-issue-rankings.rb <github_user/repo-name> <github_api_access_token>"
  exit
end
#
# Internal class for keeping track of issues and their stats.
#
class GithubIssuesRankingModel
  class GithubIssue
    attr_accessor :positive_reactions, :biggest_comment

    def initialize(name, url, timestamp)
      @name = name
      @url = url
      @date = timestamp
      @days = (Time.now.to_date - timestamp.to_date).round
      @positive_reactions = 0
      @biggest_comment = 0
    end

    def values
      return [ @name, @url, @positive_reactions, @biggest_comment, @days > 0 ? @positive_reactions/@days.to_f : 0 ]
    end
  end

  def initialize
    @issues = Array.new
    @positive_reactions = ["+1", ":tada:", ":heart:"] # :+1: included in +1
    @body_regexp = @positive_reactions.map {|n| Regexp.escape(n) }.join('|')
  end

  #
  # Add a single issue + comments to the model.
  # Input should match the Github API v3 output: https://developer.github.com/v3/issues/
  #
  def add(issue, comments)
    github_issue = GithubIssue.new(issue.title, issue.html_url, issue.created_at)
    issue_total = 0
    # Issue description
    if /#{@body_regexp}/.match(issue.body)
      issue_total += 1
    end
    issue_total += count_positive_reactions(issue)
    biggest_comment = issue_total
    # Comments
    comments.each do |comment|
      comment_total = 0
      if /#{@body_regexp}/.match(comment.body)
        comment_total += 1
      end
      comment_total += count_positive_reactions(comment)
      issue_total += comment_total
      biggest_comment = [biggest_comment, comment_total].max
    end
    github_issue.positive_reactions = issue_total
    github_issue.biggest_comment = biggest_comment
    @issues.push(github_issue)
  end

  #
  # Count all positive reactions
  #
  def count_positive_reactions(response)
    reactions = 0
    reactions += response.reactions.heart
    reactions += response.reactions['+1']
    reactions += response.reactions.hooray
    return reactions
  end

  #
  # Save the result of the analysis to a csv file
  #
  def save_results
    CSV.open("#{ $repo.split('/').last }-issues.csv", "wb") do |csv|
      csv << ['Name', 'URL', 'PositiveReactions', 'SingleBiggestComment', 'Reactions/Day']
      @issues.sort { |left, right| right.positive_reactions <=> left.positive_reactions }.each { |x| csv << x.values }
    end
  end
end
# Provide authentication credentials
# See https://help.github.com/articles/creating-an-access-token-for-command-line-use/
client = Octokit::Client.new(access_token: $access_token, auto_paginate: true, per_page: 100)
model = GithubIssuesRankingModel.new
# Load and process all open issues from Github.
# The issues endpoint also returns pull requests, so the count printed below includes them; they are skipped in the loop.
gh_issues = client.list_issues($repo, :state => 'open', :accept => 'application/vnd.github.squirrel-girl-preview')
puts "No. of issues: #{gh_issues.size}"
gh_issues.each do |gh_issue|
  print "."
  unless gh_issue.pull_request
    comments = client.get(gh_issue['comments_url'], :accept => 'application/vnd.github.squirrel-girl-preview')
    model.add(gh_issue, comments)
  end
end
model.save_results
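
The scoring rules described in the script header can be tried without any network access. The following is a minimal sketch, not part of the original script: `Reactions`, `Post`, `post_score` and `rank` are hypothetical names introduced for illustration, and the sample issue and comments are made up. It assumes the same three keywords and the same three reaction types that the script counts.

```ruby
require 'date'

# Hypothetical stand-ins for the API objects the script receives.
Reactions = Struct.new(:plus_one, :heart, :hooray)
Post = Struct.new(:body, :reactions)

# Same keyword list as the script; Regexp.union escapes each string.
KEYWORD_REGEXP = Regexp.union(['+1', ':tada:', ':heart:'])

# Score of one post: 1 if the body mentions any keyword (counted once),
# plus every positive reaction attached to the post.
def post_score(post)
  keyword_hit = KEYWORD_REGEXP.match?(post.body.to_s) ? 1 : 0
  keyword_hit + post.reactions.plus_one + post.reactions.heart + post.reactions.hooray
end

# Returns [total, biggest_single_post, reactions_per_day] for one issue.
def rank(issue, comments, created_on, today)
  scores = [issue, *comments].map { |post| post_score(post) }
  total = scores.sum
  days = (today - created_on).round
  per_day = days > 0 ? total / days.to_f : 0
  [total, scores.max, per_day]
end

# Made-up sample data.
issue = Post.new('Please add feature X', Reactions.new(3, 1, 0))
comments = [
  Post.new('+1 +1 +1, need this', Reactions.new(0, 0, 0)),
  Post.new('Workaround posted here :tada:', Reactions.new(5, 0, 2)),
  Post.new('Not convinced', Reactions.new(0, 0, 0))
]
result = rank(issue, comments, Date.new(2016, 1, 1), Date.new(2016, 1, 11))
puts result.inspect
```

Passing the dates in explicitly, instead of calling `Time.now` as the script does, keeps the Reactions/Day value reproducible when experimenting with the scoring.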