One of the features of hubot-pr-fu, a Slack/HipChat bot that provides commands to aggregate information around GitHub Pull Requests, is listing out all PRs that are in a mergeable state. The GitHub API for listing all Pull Requests doesn't include the mergeability status in each PR object that gets returned. Instead, we have to iterate over the list of objects and make a call to fetch more information about each pull request. Written in CoffeeScript, it looks like:
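A rough JavaScript sketch of the same idea, with hypothetical fetchPullRequests and fetchPullRequest helpers standing in for the GitHub API calls:

```javascript
// Hypothetical stand-ins for the GitHub API calls the bot makes
const fetchPullRequests = () =>
  Promise.resolve([{ number: 1 }, { number: 2 }]);
const fetchPullRequest = number =>
  Promise.resolve({ number, mergeable: true });

// List every PR, then fetch each one individually to read its mergeability
const mergeablePrs = fetchPullRequests()
  .then(prs => Promise.all(prs.map(pr => fetchPullRequest(pr.number))))
  .then(prs => prs.filter(pr => pr.mergeable));
```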
I used the equivalent of Promise.all back then because I didn't know better. I've since found a neat way to think about the problem, and a more fine-grained solution for "making one request after the other", also known as cooperative scheduling in computer-speak.
Documentation for the pulls and pull APIs can be found on GitHub.
Preliminaries
In this article I'm using a couple of modern JavaScript syntax features. Skip this part if you already know about arrow functions and the spread operator.
Arrow functions
JavaScript engines now have a new syntax for defining functions:
If the braces are omitted, the value of the expression gets returned implicitly, i.e. the following functions are identical:
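For example, these three definitions behave the same:

```javascript
// Classic function expression
const addClassic = function (a, b) {
  return a + b;
};

// Arrow function with a block body: `return` stays explicit
const addArrow = (a, b) => {
  return a + b;
};

// Arrow function without braces: the expression is returned implicitly
const addImplicit = (a, b) => a + b;
```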
Spread operator
A new feature in JavaScript is the ... syntax. Ruby programmers will know this as the "splat" operator. Any iterable value can be "spread" into another construct using it. So you can do things like concatenate arrays, shallow-clone JavaScript objects, and more:
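For example:

```javascript
const first = [1, 2];
const second = [3, 4];

// Concatenate arrays
const combined = [...first, ...second]; // [1, 2, 3, 4]

// Shallow-clone an object, optionally overriding keys
const original = { name: "hubot", lang: "coffee" };
const clone = { ...original, lang: "js" }; // { name: "hubot", lang: "js" }

// Spread an iterable into function arguments
const max = Math.max(...combined); // 4
```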
Array.reduce
If you're unfamiliar with Array.reduce, Sarah Drasner has a very accessible introduction in which she does a super job explaining this powerful, possibly-confusing function: https://css-tricks.com/understanding-the-almighty-reducer/.
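As a one-line refresher, reduce folds an array into a single value by carrying an accumulator between iterations:

```javascript
const numbers = [1, 2, 3, 4];

// The accumulator starts at 0 and collects each element in turn
const sum = numbers.reduce((total, n) => total + n, 0);
console.log(sum); // 10
```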
fetch API
I won't go into too much detail, but fetch is a replacement for XMLHttpRequest in browsers (there's a package for NodeJS). It replaces the old callback-heavy API with a new Promise-based one, and its high-level API is closer to the ajax function in jQuery. More on the Mozilla Developer Network website.
Problem
- Take a list of pull request URLs (if you’re unaware of GitHub, think of some other list of URLs)
- Make calls to each of these, and aggregate the mergeability status of each. Something like this, probably:
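The exact shape isn't critical; one plausible aggregate maps each URL to its mergeable flag:

```javascript
// One plausible shape for the aggregated result: URL -> mergeable flag
const result = {
  "https://api.github.com/repos/owner/repo/pulls/1": true,
  "https://api.github.com/repos/owner/repo/pulls/2": false,
};
```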
There are a couple of ways to do this. Promise.all is one:
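A sketch using a stubbed fetchPr (the real one would be fetch(url).then(res => res.json())):

```javascript
// Stubbed fetch-and-parse helper, for illustration only
const fetchPr = url => Promise.resolve({ url, mergeable: true });

const urls = ["/pulls/1", "/pulls/2", "/pulls/3"];

// Fire all requests at once; resolves only when every one has resolved
const allStatuses = Promise.all(urls.map(url => fetchPr(url))).then(prs =>
  prs.map(pr => pr.mergeable)
);
```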
Promise.all will resolve once all the fetch calls have completed. While this works fine, it might choke the network or the target server if the list of URLs is very long! Another problem with Promise.all is that if any one of the requests fails, the entire promise chain gets rejected. We only want to mark the failed request as such, and continue on to the next one.
What if, given a list of URLs, we could make the calls one by one? Can that be done without an external library?
Cooperative Scheduling
Promises are composable. This means we can create promises and pass them around as if they were primitive values; the term used to describe this is "first-class". Another key idea is that we operate on arrays, and modern JavaScript has a good set of functions that operate on arrays. Combining both helps us achieve what we want.
We'll call fetch with each of the URLs from the list, one after the other, and try the next URL only once the current one has completed (whether with an error or a success). Just calling fetch(url) (which returns a promise) starts the request (almost) immediately, so we need to create each fetch promise only when we are ready. Once a call finishes, we loop over to the next URL in the list. We have to figure out a way to chain the requests to each other, and chaining is a super-power of Promises: any invocation of the Promise API methods returns a Promise, so they are infinitely chainable.
The final action when unrolled should look something like this:
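With a stubbed fetch that records when each request starts, the unrolled chain is:

```javascript
// Stubbed fetch that records the order in which requests start
const order = [];
const fetch = url => {
  order.push(url);
  return Promise.resolve({ url });
};

// Each request starts only after the previous one has settled
const done = fetch("/pulls/1")
  .then(() => fetch("/pulls/2"))
  .then(() => fetch("/pulls/3"));
```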
So, given a list of URLs, we can use Array.reduce to chain fetch promises over multiple iterations:
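A sketch, again with a stubbed fetch standing in for the real network call:

```javascript
// Stubbed fetch that records the order in which requests start
const order = [];
const fetch = url => {
  order.push(url);
  return Promise.resolve({ url });
};

const urls = ["/pulls/1", "/pulls/2", "/pulls/3"];

// Each iteration waits on the previous promise before starting the next fetch
const chain = urls.reduce(
  (previous, url) => previous.then(() => fetch(url)),
  Promise.resolve()
);
```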
In our code, if we unroll the loop, the chain would look like:
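With a stubbed fetch, the unrolled reduce for three URLs reads:

```javascript
// Stubbed fetch that records the order in which requests start
const order = [];
const fetch = url => {
  order.push(url);
  return Promise.resolve({ url });
};

// reduce's initial value, Promise.resolve(), kicks off the chain
const unrolled = Promise.resolve()
  .then(() => fetch("/pulls/1"))
  .then(() => fetch("/pulls/2"))
  .then(() => fetch("/pulls/3"));
```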
Now to handle the data that’s coming in.
Aggregating state – using a global object
I’m going to use a slightly different aggregate example compared to the one mentioned above. The structure is going to be:
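A sketch of that tracker; it maps each PR ID to its mergeable state (hypothetical IDs and values):

```javascript
// Hypothetical tracker: PR ID -> mergeable flag
// (failed requests will later be marked "unknown")
const mergeability = {
  "1": true,
  "2": false,
};
```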
We need an object to aggregate each Pull Request's mergeable state. This object should be passed down to each loop iteration, and should store the value of the mergeable key from the network response. One way to do this is to have each reduce iteration take in and return an aggregate state object that looks like this:
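A hypothetical sketch of the shape each reduce iteration receives and returns:

```javascript
// Hypothetical shape of the value each reduce iteration receives and returns
const aggregate = {
  combinedPromise: Promise.resolve(), // the promise chain built up so far
  id: "2",                            // the PR ID to fetch in the next iteration
  mergeability: { "1": true },        // the shared status tracker
};
```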
And our program could look like this:
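A sketch of what that version might look like, with a stubbed fetchPr (the real one wraps fetch) and hypothetical PR IDs:

```javascript
// Stubbed fetchPr; PR "2" is not mergeable, for variety
const fetchPr = id => Promise.resolve({ id, mergeable: id !== "2" });

const prIds = ["1", "2", "3"];
const mergeability = {}; // the "global" tracker object

const initialState = {
  combinedPromise: Promise.resolve(),
  id: prIds[0],
  mergeability,
};

const finalState = prIds.slice(1).reduce((aggregate, nextId) => {
  const combinedPromise = aggregate.combinedPromise.then(() =>
    fetchPr(aggregate.id).then(pr => {
      // mutate the aliased global tracker from inside the promise
      aggregate.mergeability[pr.id] = pr.mergeable;
    })
  );
  return { combinedPromise, id: nextId, mergeability: aggregate.mergeability };
}, initialState);

// one last step for the final ID; the resolved value is discarded
const done = finalState.combinedPromise.then(() =>
  fetchPr(finalState.id).then(pr => {
    mergeability[pr.id] = pr.mergeable;
  })
);
```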
For every loop iteration, we return a new object that represents the aggregate state of that iteration: a new promise that, once resolved, updates the mergeability status of the completed request; the PR ID for use in the next iteration; and the copied-over mergeability status tracker object.
If you're finding it hard to understand this, you're not alone. "What is the value of aggregate.mergeability in mergeability: aggregate.mergeability above?", "Why are we aliasing it outside and modifying it on the inside? Will there be race conditions?", "Why are we resolving the combinedPromise promise at the end, but discarding the value?" are some of the confusing questions that readers of this code (even you, in 3 months' time) might have.
In general, when we discard the value of a promise, the construction and shape of the promise object isn't completely clear while reading the code.
Aggregating State – without globals
We need a way to better compose the return values of the promises between the reduce iterations. Instead of thinking of the iteration first, let's think of the loop function. It needs to know:
- the mergeability status object, in order to set the status in the current loop
- the PR ID, to make the fetch call in that loop once the previous one resolves
Let's imagine a pure* function that takes each of these values and returns a promise that resolves with the current (post-resolve) mergeability state object.
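A sketch of such a function, with a stubbed fetchPr standing in for the network call:

```javascript
// Stubbed fetchPr; the real one wraps fetch()
const fetchPr = id => Promise.resolve({ id, mergeable: true });

// Takes the PR ID and the tracker so far; resolves with an updated copy
const fetchPrAndUpdateMergeability = (id, mergeability) =>
  new Promise((resolve, reject) => {
    fetchPr(id).then(pr => {
      resolve({ ...mergeability, [id]: pr.mergeable });
    });
  });
```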
By using a promise constructor, we can control what the final "fulfilled" value of the promise looks like, and we also control when the promise is considered fulfilled. Instead of returning the response object from fetch, we return the modified aggregate state after the fetch resolves. That is, in fetchPrAndUpdateMergeability(id, {}).then(variable => ()), variable will be set to the aggregate object, which can then be passed down to the subsequent fetchPrAndUpdateMergeability call:
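With that in place, the calls chain like this (stubbed fetchPr, repeated so the snippet stands alone):

```javascript
// Stubbed fetchPr; the real one wraps fetch()
const fetchPr = id => Promise.resolve({ id, mergeable: true });

const fetchPrAndUpdateMergeability = (id, mergeability) =>
  new Promise(resolve => {
    fetchPr(id).then(pr => {
      resolve({ ...mergeability, [id]: pr.mergeable });
    });
  });

// Each call receives the tracker the previous call resolved with
const finalMergeability = fetchPrAndUpdateMergeability("1", {})
  .then(mergeability => fetchPrAndUpdateMergeability("2", mergeability))
  .then(mergeability => fetchPrAndUpdateMergeability("3", mergeability));
```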
This is similar to what our reduce step did in the initial examples! So, plugging this back into our reduce iterator, we end up with:
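Plugged into reduce (again with a stubbed fetchPr), the whole pipeline might look like:

```javascript
// Stubbed fetchPr; the real one wraps fetch()
const fetchPr = id => Promise.resolve({ id, mergeable: true });

const fetchPrAndUpdateMergeability = (id, mergeability) =>
  new Promise(resolve => {
    fetchPr(id).then(pr => {
      resolve({ ...mergeability, [id]: pr.mergeable });
    });
  });

const prIds = ["1", "2", "3"];

// No global tracker: the state threads through the promise chain itself
const finalMergeability = prIds.reduce(
  (previous, id) => previous.then(m => fetchPrAndUpdateMergeability(id, m)),
  Promise.resolve({})
);
```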
This is way cleaner, and we get the advantage of not maintaining a global object, which makes it easy to add more logic or decompose further, as I'll explain below.
JSON parse handling
So far we've assumed that the response object is the parsed JSON of the pull request data, but in reality it's a Response object. We have to parse the body into a JSON object ourselves. The Response object has a built-in method for this: json(), which returns a promise that resolves once the parsing finishes. Let's add the parsing step inside the success handler of the fetchPr call.
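A sketch of that change; the stub now resolves with a Response-like object exposing json(), the way fetch does:

```javascript
// Stubbed fetchPr resolving with a Response-like object
const fetchPr = id =>
  Promise.resolve({ json: () => Promise.resolve({ id, mergeable: true }) });

const fetchPrAndUpdateMergeability = (id, mergeability) =>
  new Promise((resolve, reject) => {
    fetchPr(id).then(response => {
      // parse the body, then resolve with the updated tracker:
      // note the extra level of nesting
      response.json().then(pr => {
        resolve({ ...mergeability, [id]: pr.mergeable });
      });
    });
  });
```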
Apart from the obvious increase in indentation, one more complication is that we haven't yet added error handling. Doing that will complicate this function even further, to the point where it starts to resemble callback hell. Isn't this what Promises were supposed to help avoid? This is a natural progression, and something I've seen happen a lot. Without careful thought about the correct abstraction at every step of a change, it's not easy to make such code…less indented.
Instead of adding the JSON parsing step to the fetchPrAndUpdateMergeability function, let's move it into the fetchPr function instead:
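A sketch of that refactor (stubbed fetch, hypothetical URL scheme):

```javascript
// Stubbed fetch resolving with a Response-like object
const fetch = url =>
  Promise.resolve({ json: () => Promise.resolve({ url, mergeable: true }) });

// fetchPr now owns the parsing; callers receive parsed JSON directly
const fetchPr = id =>
  fetch(`https://api.github.com/repos/owner/repo/pulls/${id}`).then(
    response => response.json()
  );
```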
We saved the indentation, but still haven't handled error cases. Each promise can "redirect" its output to either the success or the error callback of the next promise. In that sense, it's almost like *nix pipelines, if you're aware of those (the standard-out and standard-error streams). Think of two data pipelines between which the data flows, with us in control of how it gets passed around. In *nix, if we want only the error output of a particular command, we can redirect it to a different file without mixing it up with the normal output. A similar technique can be used in our example:
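A sketch of that routing, with a stubbed fetch and a hypothetical prUrl helper; the explicit Promise.resolve/reject calls mark which pipeline each value continues down:

```javascript
// Stubbed fetch: rejects with a network error for a "missing" PR
const fetch = url =>
  url.includes("missing")
    ? Promise.reject(new Error("network down"))
    : Promise.resolve({ json: () => Promise.resolve({ url, mergeable: true }) });

const prUrl = id => `/repos/owner/repo/pulls/${id}`; // hypothetical helper

const fetchPr = id =>
  fetch(prUrl(id)).then(
    // success pipeline: parse the body and pass it on
    response => response.json().then(body => Promise.resolve(body)),
    // error pipeline: keep network failures on the error side
    error => Promise.reject(error)
  );
```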
If the fetch call succeeds or fails (with a network error, for instance), we try parsing the body, and explicitly pass the parsed object down either the success or the error pipeline. If we didn't add the calls to Promise.resolve/reject and simply returned the values, both would get diverted to the next success callback, and any further chaining would receive the actual pull request data in the success handler regardless of the outcome.
Aside
There is one more principle around promises in this example. If you pass undefined or null as one of the callbacks, the corresponding value flows down to the next joint in the pipeline. In the first then joint, even though it's not explicit, we are passing undefined as the callback for the error case. If fetchPr threw an error, the error message would be passed down to the second then. In the second then joint, we are passing undefined explicitly for the success case, so a success value from the previous then gets passed on to the next one. This technique can be used to create composable functions that can be reused in multiple places. I found this very useful at work while refactoring a complicated API-calling interface.
Error handling
Now that we have our parsing step in place, we can move on to adding error handling. The fetch API's success handler gets invoked for any response the server sends back, even ones with 4xx or 5xx statuses; the promise rejects only on network failures. The response object has an ok property attached to it that signifies a 2xx status. For our use case, all non-2xx statuses can be treated as errors, and that PR's mergeability status should be marked as 'unknown'. To do this, we'll modify our success handler in fetchPr like so:
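A sketch with the ok check inline (stubbed fetch, hypothetical prUrl helper):

```javascript
// Stubbed fetch that can return non-2xx responses
const fetch = url =>
  Promise.resolve({
    ok: !url.includes("404"),
    json: () => Promise.resolve({ url, mergeable: true }),
  });

const prUrl = id => `/repos/owner/repo/pulls/${id}`; // hypothetical helper

const fetchPr = id =>
  fetch(prUrl(id)).then(response => {
    if (response.ok) {
      return response.json();
    }
    // non-2xx: push the parsed body down the error pipeline
    return response.json().then(body => Promise.reject(body));
  });
```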
Again, the indentation alarm rings! We can do better. Instead of adding the if condition inline, we can move it out into a function of its own:
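The extracted helper might look like (stubs as before):

```javascript
// Stubs as before, for a self-contained example
const fetch = url =>
  Promise.resolve({
    ok: !url.includes("404"),
    json: () => Promise.resolve({ url, mergeable: true }),
  });
const prUrl = id => `/repos/owner/repo/pulls/${id}`; // hypothetical helper

// The extracted parsing/routing step: no HTTP involved, easy to test
const parseIfSuccess = response => {
  if (response.ok) {
    return response.json();
  }
  return response.json().then(body => Promise.reject(body));
};

const fetchPr = id => fetch(prUrl(id)).then(parseIfSuccess);
```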
This decomposition also helps with testing: adding tests for parseIfSuccess doesn't need any HTTP mocks! Now, on to handling both errors at the network level and any JSON-parsing errors:
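A sketch of the final handler; any rejection, whether a network error, a non-2xx status, or a parse failure, marks the PR as "unknown" (stubbed fetchPr):

```javascript
// Stub: PR "2" fails, standing in for a network error, a non-2xx
// response, or a JSON-parsing error further up the chain
const fetchPr = id =>
  id === "2"
    ? Promise.reject(new Error("boom"))
    : Promise.resolve({ id, mergeable: true });

const fetchPrAndUpdateMergeability = (id, mergeability) =>
  new Promise(resolve => {
    fetchPr(id).then(
      pr => resolve({ ...mergeability, [id]: pr.mergeable }),
      // all failures land here: record "unknown" and keep the chain going
      () => resolve({ ...mergeability, [id]: "unknown" })
    );
  });
```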
We can move the success and error callbacks inside fetchPrAndUpdateMergeability into separate functions; I'm leaving that as an exercise for the reader. The full working example can be found here: kgrz-promises-composition-example
Conclusion
By using the basic building blocks provided by JavaScript, we've been able to build a solution in a small sequence of steps. We could extend the program's functionality a little further and make it truly batched, where we make n queries concurrently, wait for all of them to resolve, and then move on to the next batch: convert the PR ID list into a 2-D array and use Promise.all inside each loop, for example.
Some of the techniques explained here were used to refactor a gnarly piece of API-calling code at work. The result was a set of simple functions, like readMergeableStatusIfSuccess, that could be reused when we changed a small part of the pipeline, instead of duplicating the majority of the logic. Try these techniques out in your next project and see if they work for you.
Bonus
async-await is the new 🔥 in JavaScript. These new syntax elements were added to help simplify the usage and behaviour of promises. Instead of the pipelines you get with Promises, async-await code looks more "serial", and so it's a little easier to follow. I've put up an example based on this post here: kgrz/promises-composition-example/async-await
I personally use Promises a lot because of the environment we use at work, and I like them. If you don't have that restriction, I recommend using async-await, but learn about Promises just enough that you won't get stuck searching the internet.
Resources
There have been some very informative posts on how Promises work (or don’t). Here’s a list that I think will help you if you’ve reached this far, in no specific order:
github.com/getify/You-Dont-Know-JS
In which Kyle Simpson does a great job going in depth on callbacks and promises. The entire series is a must-read for any JavaScript developer.
dist-prog-book.com/chapter/2/futures
A more holistic view of Futures and Promises that goes through some details on internal implementation, and execution semantics.
2ality.com/promise-callback-data-flow
A simple and informative post on various ways to pass data from one promise to another.
jcoglan.com/callbacks-are-imperative-promises-are-functional
A thorough post on how Promises help you write more abstractions based off of other existing abstractions to build more complex programs.
mathiasbynens.be/notes/async-stack-traces
An article on why capturing stack traces when using async-await is cheaper than with promises. It also mentions some details about the differences between async-await and promises.
staltz.com/promises-are-not-neutral-enough
A post that goes through some design issues in the Promise API.
brianmckenna.org/blog/category_theory_promisesaplus
A post that examines the Promise API's design through the lens of category theory.
Lastly, I gave a lightning talk at DotJS 2018 on the same topic.