Monitoring your application is important, full stop. When response times climb, you want to know before your customers do, and certainly before you bother to check your Twitter feed.
There are lots of options here, and many of them require a big upfront investment of time and effort. A simple way to get started, however, is to log total request durations and track the number of successful and failed responses.
This way you can set up alerts for when the success rate drops below a certain threshold, or when the mean response time goes from 80ms to 2s.
TL;DR - monitor your requests
I’m going to use the statsd-instrument gem written by Shopify. It takes care of handling different backends based on the environment, provides a clean wrapper around StatsD calls, and ships with a railtie.
To get started, you’ll need to add the gem to your Gemfile and add an initializer.
As I mentioned, the gem takes care of using different backends per environment.
The special cases are:
staging- calls are sent via UDP to
test- calls are ignored
All other environments will log calls to
For more details, check the README for statsd-instrument.
To measure request duration, we can use middleware. The trick here is to insert it at position 0. That way the entire request lives and dies within the middleware’s call method, making the measurement really simple.
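To make the idea concrete, here is a minimal plain-Ruby sketch of a middleware whose call method wraps the whole request. It times the request by hand rather than through the gem, and the names TimingMiddleware, recorder, and request.duration are made up for illustration, so treat it as a picture of the technique rather than the code this post ends up with.

```ruby
# Hypothetical stand-in for a timing middleware; not the class built in this post.
class TimingMiddleware
  # recorder is any callable taking (metric_name, milliseconds).
  def initialize(app, recorder:)
    @app = app
    @recorder = recorder
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # Everything further down the stack runs inside this call.
    @app.call(env)
  ensure
    # Runs whether the request succeeded or raised; the method still
    # returns the Rack response from the line above.
    elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
    @recorder.call('request.duration', elapsed_ms)
  end
end

# A tiny Rack-style app and an in-memory recorder to exercise the middleware.
inner_app = ->(_env) { [200, { 'content-type' => 'text/plain' }, ['ok']] }
recorded = []
stack = TimingMiddleware.new(inner_app, recorder: ->(name, ms) { recorded << [name, ms] })

status, _headers, _body = stack.call({})
```

Because the middleware sits outermost, the measured time includes every other middleware plus the controller action.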
In this post I’m going to create
Middleware::StatsDMonitor and expect Rails to find it in
middleware/statsd_monitor.rb (rather than
Fortunately, this isn’t the first time this has come up, and there is a solution. Add an acronym for
Normal controller tests don’t load the middleware stack. So for this, we’ll need to write an integration test. Add the
Now we can watch it fail so hard by running
Create the new middleware class in
Next, add the middleware to the beginning of the stack.
Finally, we need to go back to our StatsD initializer, and tell it to measure the calls.
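As a rough sketch, the initializer addition might look like the fragment below. It uses the gem's statsd_measure class method; the file path and the metric name request.duration are my assumptions, and it also assumes Middleware::StatsDMonitor is already loaded when the initializer runs.

```ruby
# config/initializers/statsd.rb (path assumed)
# Mix the instrumentation helpers into the middleware class...
Middleware::StatsDMonitor.extend StatsD::Instrument
# ...and record how long each invocation of #call takes.
# 'request.duration' is a hypothetical metric name; pick your own.
Middleware::StatsDMonitor.statsd_measure :call, 'request.duration'
```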
bin/rake test:integration should now pass. Now you can set up alerts for request duration spikes. Or at least show the average request time on a dashboard somewhere.
Request duration is a good start, but ideally you’ll want to increment a counter every time a response (that we’re interested in) happens. This way you can set up alerts if 400, 404, 429, 500, or 502 responses spike, or if 302 responses dip.
Add the following tests to the integration test. Each one simply ensures that a particular counter was incremented.
To increment the right counter, we’re going to use statsd_count_if. This will increment the count if the supplied block returns a truthy value.
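The conditional-increment idea can be sketched in plain Ruby without the gem. Everything here is hypothetical (the StatusCounter class, the increment_if helper, the response.<status> metric names, and the in-memory hash standing in for StatsD); it only illustrates "count when the block says so", not the gem's actual implementation.

```ruby
# Hypothetical middleware that counts only the statuses we care about.
class StatusCounter
  TRACKED = [302, 400, 404, 429, 500, 502].freeze

  # counts is a Hash with a default of 0, standing in for a StatsD client.
  def initialize(app, counts:)
    @app = app
    @counts = counts
  end

  def call(env)
    response = @app.call(env)
    status = response.first.to_i
    # The block decides whether this response is worth counting.
    increment_if("response.#{status}") { TRACKED.include?(status) }
    response
  end

  private

  # Bump the counter only when the block returns a truthy value.
  def increment_if(metric)
    @counts[metric] += 1 if yield
  end
end

# Exercise it with an app that returns a scripted sequence of statuses.
counts = Hash.new(0)
statuses = [200, 404, 404, 500]
app = ->(_env) { [statuses.shift, {}, []] }
stack = StatusCounter.new(app, counts: counts)
4.times { stack.call({}) }
```

After the four requests, the 200 is ignored while the two 404s and the single 500 each land in their own counter, which is the shape of data you would alert on.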
For those who didn’t want to read or just like copy pasta, here are the complete files.