In this tutorial we'll use ramda-cli with GitHub's Repos API to get a list of @jeresig's most starred repos.
ramda-cli is a command-line tool for processing JSON using functional pipelines. As the name suggests, its utility comes from Ramda and the wide array of functions it provides for operating on lists and collections of objects. It also employs LiveScript for its terse and powerful syntax.
On the way, there's a gentle introduction to some functional programming
concepts such as currying and function composition. We'll build a pipeline of
functions that takes a list of repos and returns the ten most starred repos
in descending order as a list of {name, stargazers_count} objects. Finally,
we'll print the result as a table.
Copy-pasting this into a shell session will make all the examples runnable:
npm install -g ramda-cli
url=https://api.github.com/users/jeresig/repos\?per_page\=100
Let's first use curl to get the list of repos in JSON format and pipe it to
R identity -p to get an idea of what we're working with.
curl -s https://api.github.com/users/jeresig/repos\?per_page\=100 | R identity -p
[ { id: 3549786,
name: 'apples2artworks',
full_name: 'jeresig/apples2artworks',
...
As in programming, in ramda-cli data is manipulated by applying a function to the data. The result will by default be written to standard output in JSON format.
Since identity stands for a function that simply returns its argument, our
command will pipe the JSON payload unchanged through to stdout in a more
readable (-p is for pretty) format.
Reader exercise: Replace identity with a function that would return the number of repos.
@jeresig has a lot of repos and the API returns a ton of info we don't care about, so we'll go ahead and see how the output could be reduced to just a list of the names of those repos.
curl -s $url | R 'pluck \name' -p
[ 'apples2artworks',
'babel',
'brooklynjs.github.io',
'casperjs',
...
For those unfamiliar with Ramda's curried API or LiveScript, this will require some explaining.
In LiveScript, as in CoffeeScript, parentheses are optional when calling a
function. A backslash preceding a word is sugar for a string literal.
Therefore, pluck \name compiles into pluck('name') in JavaScript.
pluck :: String -> {*} -> [*]
Returns a new list by plucking the same named property off all objects in the list supplied.
pluck is a function that, for a given key and a list of objects, returns a
list of the values corresponding to that key from all the objects in the
list. Ramda's functions are all curried by design, so we can partially apply
pluck with just the key we want, 'name', and thus get back a function that
will be waiting for the second argument, a list of objects.
Since curl gives us a list of repos, it's a great match for a function that is waiting for a list of objects to get the properties from.
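To make the partial application concrete, here is a minimal JavaScript sketch. The pluck below is a hand-written stand-in, not Ramda's implementation: it only accepts its arguments one at a time, whereas Ramda's curried functions also accept them all at once. The two sample repos and their fork flags are made up for illustration.

```javascript
// Simplified stand-in for Ramda's curried pluck (an assumption for
// illustration; it takes the key first and the list second).
const pluck = (key) => (list) => list.map((obj) => obj[key]);

// Made-up sample shaped like a trimmed API response.
const repos = [
  { name: 'apples2artworks', fork: false },
  { name: 'babel', fork: true },
];

// Supplying only the key returns a function that is still
// waiting for the list of objects.
const names = pluck('name');

console.log(names(repos)); // [ 'apples2artworks', 'babel' ]
```

On the command line, the JSON arriving from curl plays the role of the repos argument.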
Looks like the output contains repos that are forks. Not a big deal but for the sake of example we could get just the repos that are originally by @jeresig.
curl -s $url | R 'filter where-eq fork: false' 'pluck \name' -p
Here, where-eq set up with a spec object ({ fork: false }) creates a
predicate function to be used with filter. filter is now waiting for the
second argument that curl will provide, a list of repos to filter.
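The predicate-plus-filter idea can be sketched in plain JavaScript. Both functions below are simplified stand-ins written for this example, not Ramda's own code; in particular, this whereEq compares with ===, which is enough for a boolean field like fork. The fork flags in the sample data are invented.

```javascript
// Stand-in for R.whereEq: builds a predicate that checks every key of
// the spec against the object (strict equality only, an assumption).
const whereEq = (spec) => (obj) =>
  Object.keys(spec).every((key) => obj[key] === spec[key]);

// Stand-in for a curried filter: predicate first, list second.
const filter = (pred) => (list) => list.filter(pred);

// Made-up sample data; the fork flags are for illustration only.
const repos = [
  { name: 'apples2artworks', fork: false },
  { name: 'babel', fork: true },
  { name: 'casperjs', fork: true },
  { name: 'deepleap', fork: false },
];

const isNotFork = whereEq({ fork: false });
const ownRepos = filter(isNotFork)(repos);

console.log(ownRepos.map((repo) => repo.name)); // [ 'apples2artworks', 'deepleap' ]
```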
Notice that two independent pieces of code are now passed to R. What happens
here is that our program still evaluates to a single function, but it's
composed under the hood by ramda-cli from the given functions in order from
left to right. Therefore, what we just did is equivalent to explicitly using
R.pipe for function composition:
curl -s $url | R 'pipe( filter(where-eq({ fork: false })), pluck("name") )'
The list is first filtered, then the name property is plucked from each object.
In this way we can build a pipeline of operations to be applied on our data
in a specific order.
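For readers who want to see what left-to-right composition amounts to, the sketch below builds a tiny pipe by hand. It is an illustration of the idea under simplifying assumptions, not how ramda-cli or R.pipe is implemented, and the helper functions and sample data are stand-ins invented for the example.

```javascript
// Hand-rolled pipe: feeds the input through each function in turn,
// from left to right.
const pipe = (...fns) => (input) => fns.reduce((acc, fn) => fn(acc), input);

// Simplified stand-ins for the Ramda functions used in the tutorial.
const whereEq = (spec) => (obj) =>
  Object.keys(spec).every((key) => obj[key] === spec[key]);
const filter = (pred) => (list) => list.filter(pred);
const pluck = (key) => (list) => list.map((obj) => obj[key]);

// Made-up sample data.
const repos = [
  { name: 'apples2artworks', fork: false },
  { name: 'babel', fork: true },
  { name: 'deepleap', fork: false },
];

// One function composed from two steps: filter first, then pluck.
const ownRepoNames = pipe(
  filter(whereEq({ fork: false })),
  pluck('name')
);

console.log(ownRepoNames(repos)); // [ 'apples2artworks', 'deepleap' ]
```

Swapping the two steps would fail to filter anything, because after pluck the list holds plain strings with no fork property; the order of the pipeline matters.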
Now that we have a list of repo names that are not forks, we can make the output more interesting by also grabbing the number of stars.
Instead of pluck we need an operation that picks specific fields from a
list of objects. In Ramda, there's a function called project for just that.
project :: [k] → [{k: v}] → [{k: v}]
Reasonable analog to SQL select statement.
curl -s $url | R -p 'filter where-eq fork: false' 'project [\name \stargazers_count]'
[ { name: 'apples2artworks', stargazers_count: 1 },
{ name: 'datacook', stargazers_count: 2 },
{ name: 'deepleap', stargazers_count: 32 },
{ name: 'dromaeo', stargazers_count: 63 },
...
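As a rough picture of what project does to each object, here is a simplified stand-in in plain JavaScript. It is written for this example and is not Ramda's implementation. The names and star counts come from the output above; the fork field is an invented extra to show that unlisted fields are dropped.

```javascript
// Stand-in for R.project: keep only the listed keys of every object
// (a simplification that assumes each key is present).
const project = (keys) => (list) =>
  list.map((obj) => Object.fromEntries(keys.map((key) => [key, obj[key]])));

// Sample rows; the fork field is an invented extra to be dropped.
const repos = [
  { name: 'apples2artworks', stargazers_count: 1, fork: false },
  { name: 'datacook', stargazers_count: 2, fork: false },
];

const slim = project(['name', 'stargazers_count'])(repos);

console.log(slim);
// [ { name: 'apples2artworks', stargazers_count: 1 },
//   { name: 'datacook', stargazers_count: 2 } ]
```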
Before we can make the output more visually appealing, we have a few steps to
add to our pipeline. First, sorting by stargazers_count in descending order.
sortBy :: (a → String) → [a] → [a]
Sorts the list according to a key generated by the supplied function.
sortBy together with prop sorts a list of objects according to the field
given to prop.
curl -s $url | R -p 'filter where-eq fork: false' \
'project [\name \stargazers_count]' \
'sort-by prop \stargazers_count'
Finally, we apply reverse to get the most starred projects first and limit
the list to the first 10 items with take.
curl -s $url | R -p 'filter where-eq fork: false' \
'project [\name \stargazers_count]' \
'sort-by prop \stargazers_count' \
reverse 'take 10'
[ { name: 'processing-js', stargazers_count: 1682 },
{ name: 'node-stream-playground', stargazers_count: 311 },
{ name: 'fireunit', stargazers_count: 228 },
{ name: 'env-js', stargazers_count: 205 },
...
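The sort, reverse, take sequence can be traced step by step in plain JavaScript. The helpers below are simplified stand-ins written for this sketch rather than Ramda's own functions, and the four rows reuse names and star counts shown earlier in this tutorial.

```javascript
// Simplified stand-ins; each returns a new array and leaves its input alone.
const prop = (key) => (obj) => obj[key];
const sortBy = (keyFn) => (list) =>
  [...list].sort((a, b) => {
    const ka = keyFn(a);
    const kb = keyFn(b);
    return ka < kb ? -1 : ka > kb ? 1 : 0;
  });
const reverse = (list) => [...list].reverse();
const take = (n) => (list) => list.slice(0, n);

const repos = [
  { name: 'deepleap', stargazers_count: 32 },
  { name: 'apples2artworks', stargazers_count: 1 },
  { name: 'dromaeo', stargazers_count: 63 },
  { name: 'datacook', stargazers_count: 2 },
];

// Ascending sort, then flip to descending, then keep the top two.
const ascending = sortBy(prop('stargazers_count'))(repos);
const topTwo = take(2)(reverse(ascending));

console.log(topTwo);
// [ { name: 'dromaeo', stargazers_count: 63 },
//   { name: 'deepleap', stargazers_count: 32 } ]
```

The sort itself is ascending, which is why the reverse step is needed before take.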
Good, now that the data is getting transformed into a shape that has the info we want, it can be presented in a more readable format.
Using the --output-type table option, a list of objects may be printed as a
table in such a way that the objects' keys become the table headers. It's
convenient because all we need is a uniform list of objects to get a pretty
table. So we'll do just that. Remove the -p flag from before and slap
-o table at the end.
curl -s $url | R 'filter where-eq fork: false' \
'project [\name \stargazers_count]' \
'sort-by prop \stargazers_count' \
reverse 'take 10' \
-o table
┌────────────────────────┬──────────────────┐
│ name                   │ stargazers_count │
├────────────────────────┼──────────────────┤
│ processing-js          │ 1684             │
├────────────────────────┼──────────────────┤
│ node-stream-playground │ 311              │
├────────────────────────┼──────────────────┤
│ fireunit               │ 228              │
├────────────────────────┼──────────────────┤
│ env-js                 │ 205              │
├────────────────────────┼──────────────────┤
│ trie-js                │ 172              │
├────────────────────────┼──────────────────┤
│ pulley                 │ 171              │
├────────────────────────┼──────────────────┤
│ retweet                │ 72               │
├────────────────────────┼──────────────────┤
│ dromaeo                │ 63               │
├────────────────────────┼──────────────────┤
│ stack-scraper          │ 46               │
├────────────────────────┼──────────────────┤
│ jquery-workshop        │ 38               │
└────────────────────────┴──────────────────┘
Reader exercise: Add a URL column so that the projects can be viewed in a browser.
This concludes the tutorial. If you're new to Ramda and want to learn more, check out the list of articles in the wiki. For ramda-cli, the README provides helpful information and examples.
Thanks to buzzdecafe for providing feedback on this article.
As the pipeline grows, it becomes more manageable to write it in a separate
script file. For this, ramda-cli provides the --file option:
-f, --file String read a function from a js/ls file instead of args; useful for
larger scripts
// most-starred.js
var R = require('ramda');
var isNotFork = R.whereEq({ fork: false });
module.exports = R.pipe(
R.filter(isNotFork),
R.project([ 'name', 'stargazers_count' ]),
R.sortBy(R.prop('stargazers_count')),
R.reverse,
R.take(10)
);
curl -s $url | R -f most-starred.js -o table