Measuring the performance of your Dropwizard application
Dropwizard has out-of-the-box support for application metrics by integrating Metrics. Let’s see how we can use this tool to measure the behaviour of various components of our application. The Dropwizard manual has a section on Metrics configuration which is worth reading. It includes important information on setting up various reporters and the available configurations and their default values.
Setting up a Console Reporter
It’s very easy to set up a ConsoleReporter that periodically reports your application metrics to the console. At a bare minimum, you need to include the following lines in your application’s configuration YAML file:
metrics:
  reporters:
    - type: console
With these three lines you have set up a ConsoleReporter which, by default, writes to stdout every minute and uses UTC as its time zone. You can change the reporting frequency, the time zone, or whether output goes to stderr instead of stdout by configuring the metrics entry accordingly.
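For example (the values below are only illustrative; the field names are the ones described in the Dropwizard manual’s Metrics configuration section), a reporter that writes to stderr every 30 seconds in a different time zone could look roughly like this:

metrics:
  reporters:
    - type: console
      output: stderr
      frequency: 30 seconds
      timeZone: Europe/Berlin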
Interpreting the report takes a little more work than writing three lines of configuration. The reporter will write things to the console even if you have not yet added any extra code to measure the performance of particular methods in your application. Let’s look at the sections included in each report, which tell us how metrics works out of the box in the context of a Dropwizard application.
Gauges
A gauge is an instantaneous measurement of a value. By default Dropwizard reports the following categories of values:
Client and Server Error Rates
The values of the “io.dropwizard.jetty.MutableServletContextHandler.percent*” metrics report the rate of 4xx and 5xx responses. The values are by default reported as 1-, 5- and 15-minute moving percentages.
JVM
A wide range of JVM metrics are collected by Dropwizard. These include JVM attribute values; capacity, count and used values of both direct and mapped buffers; and metrics on the classloader, garbage collector, memory and threads. Nick Babcock of NBSoftSolutions has written on the JVM memory and thread metrics and I encourage you to read his post. Below I have listed the metrics that are most important to watch carefully:
heap.used
Represents the amount of heap memory currently in use. If the value of this metric grows without bound, there is a memory leak in the application.
heap.usage
Measures the ratio of used to maximum available memory. In other words, it reports what percentage of the maximum possible heap memory is currently in use.
threads.deadlocks
If any deadlocked threads exist, the value of this metric, which is a collection of strings, contains entries of the format %s locked on %s (owned by %s):%n%s. Also note that if any thread is deadlocked, the deadlock healthcheck fails.
Queued Thread Pool
All four metrics here are worth monitoring:
dw.jobs
Number of jobs waiting for a thread. We want this number to be as low as possible as requests should not wait in the queue for processing.
dw.size
This is the number of threads currently in Jetty’s pool to serve requests.
dw.utilization
The ratio of threads that are currently in use (not idle) to the number of threads currently in the pool.
dw.utilization-max
The ratio of the number of threads currently in use to the maximum number of threads the pool can grow to.

It is important to pay attention to the values of all these metrics together when assessing your application in production. If utilization is close to 100%, most of the threads currently in the pool are being used. That by itself should not be a matter of concern as long as utilization-max is low, because the pool can still grow and has not yet approached the maximum number of threads it can grow to. What becomes alarming is utilization-max climbing towards 100%. As the post mentioned earlier puts it: "If the idle server started to receive requests, the thread pool utilization would drop as more threads are allocated, but the max utilization would increase."
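As a rough illustration with made-up numbers (not your server’s actual defaults): if the pool currently holds 8 threads and all 8 of them are busy, utilization is 8/8 = 1.0, but if the pool is allowed to grow to 1024 threads, utilization-max is only 8/1024 ≈ 0.008, so there is still plenty of headroom.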
Counters
A counter is a simple incrementing and decrementing integer. According to the post mentioned earlier, Dropwizard’s three default counter metrics should not be relied on for much, because “they only represent the state of the program at an instant, which “lose” information between each time the metric is queried.” Here are the default counter metrics and their definitions:
active-dispatches
Number of requests currently being handled by Jetty’s threads
active-suspended
Number of requests currently suspended.
active-requests
Number of requests currently being serviced. When a request is first received, this counter is incremented and after the server has handled the request, the counter is decremented. Note that active-dispatches + active-suspended = active-requests
Meters
“A meter measures the rate of events over time. In addition to the mean rate, meters also track 1-, 5-, and 15-minute weighted moving averages.” The reason the weighted means are also included is because a basic mean rate reports the rate of events over the entire lifetime of the application and “doesn’t offer a sense of recency”.
Logging Meters
Among the default meter metrics set up by a Dropwizard application are the logging metrics: one for each level of logging (DEBUG, ERROR, INFO, TRACE and WARN), plus an ALL metric that counts all logging events regardless of their level. Nick Babcock recommends keeping an eye on WARN and ERROR, which should ideally be zero. Also keep track of ALL to figure out whether your application logs too much, especially if you save logs on disk.
Jetty Meters
There are also several Jetty meter metrics. For each class of response codes (1xx, 2xx, 3xx, 4xx and 5xx), a meter is registered:
- All 5xx responses should be investigated.
- 4xx responses may reveal a pattern in how clients interact with your API. Maybe they make wrong assumptions about how to construct a certain request and receive a Bad Request response, so these might be worth investigating too.
Asynchronous Requests Meters
Dropwizard also meters asynchronous request dispatches and timeouts.
Timers
From the Dropwizard Metrics guide we learn that a timer measures both the rate at which a particular piece of code is called and the distribution of its duration, which makes timers the most descriptive of the metric types.
Now that we know how to set up and interpret the output of the ConsoleReporter, the next step is to add some metrics to our application.
Adding Metrics to Dropwizard Resources
Dropwizard has made it very easy to add metrics to resource classes. With annotations provided in the metrics-annotations module, one only needs to annotate a resource method with one or more of the following: @CachedGauge, @Counted, @ExceptionMetered, @Gauge, @Metered, @Timed.
When you annotate a resource method, Dropwizard augments Jersey to automatically record metrics on that method.
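As a minimal sketch of what this looks like (PingResource and its path are made-up names, not part of the application described in this post), an annotated resource method could be as simple as:

import com.codahale.metrics.annotation.ExceptionMetered;
import com.codahale.metrics.annotation.Timed;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/ping")
@Produces(MediaType.TEXT_PLAIN)
public class PingResource {

    @GET
    @Timed              // registers a timer: call rate plus a distribution of call durations
    @ExceptionMetered   // registers a meter tracking exceptions thrown by this method
    public String ping() {
        return "pong";
    }
}

Once the resource is registered, the console report should include a timer and an exception meter named after the resource class and method.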
One important detail to note is that the annotations only work as expected if your resource is explicitly registered with the Jersey environment somewhere in the run() method of your application. This is how resources are usually registered, so most people never notice that the metric registration happens as part of this workflow. We had a couple of resources that were not registered the normal way; their endpoints were served through Jersey resource locators as sub-resources of other registered resources. While the API worked without any particular issues, I noticed that metrics from some resources were missing even though their methods were annotated with metrics annotations. That’s how I know that environment.jersey().register() is crucial if you need resource metrics to work magically with Dropwizard.
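For reference, here is a minimal sketch of that explicit registration (MyApplication is a placeholder name and PingResource is the example resource above; the Application and Environment types are the usual Dropwizard 1.x-style classes):

import io.dropwizard.Application;
import io.dropwizard.Configuration;
import io.dropwizard.setup.Environment;

public class MyApplication extends Application<Configuration> {

    public static void main(String[] args) throws Exception {
        new MyApplication().run(args);
    }

    @Override
    public void run(Configuration configuration, Environment environment) {
        // Explicit registration is what lets Jersey see the metric annotations on the
        // resource. Resources served only through sub-resource locators never pass
        // through this call, so their annotated methods go unmeasured.
        environment.jersey().register(new PingResource());
    }
}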
Adding Metrics to Non-Resource Classes
As we saw, Dropwizard comes with out-of-the-box support for adding metrics to your resource methods. But how can we add metrics to methods that are not part of a resource class?
In our application we were interested in measuring the performance of our authentication layer, as we have implemented custom authentication and authorization with Firebase. So I went ahead and annotated our authenticate method with @Timed and waited for the console reporter to report on our authentication, but no new metrics appeared. If just adding the annotation does not work, what is the right way?
Basically, the idea is to give your non-resource class access to your Dropwizard MetricRegistry. You can get hold of this object via environment.metrics() within your application’s run() method. We decided to pass the MetricRegistry to our authenticator’s constructor when we initialize custom authentication:
environment.jersey().register(new AuthDynamicFeature(
        new FirebaseAuthFilter(
                new FirebaseAuthenticator(environment.metrics()),
                new FirebaseAuthorizer(accountDAO))));
Once the FirebaseAuthenticator has access to the MetricRegistry object, it can add its own metrics to it. In our case we needed a Timer metric for the authenticate method. This is what FirebaseAuthenticator looks like:
public class FirebaseAuthenticator implements Authenticator<String, Account> {

    private final Timer timer;

    public FirebaseAuthenticator(MetricRegistry metricRegistry) {
        timer = metricRegistry.timer(MetricRegistry.name(getClass(), "authenticate"));
    }

    public Optional<Account> authenticate(String token) throws AuthenticationException {
        final Timer.Context context = timer.time();
        try {
            // perform authentication and return the optional account
        } finally {
            context.stop();
        }
    }
}
Notice how we initialize a Timer in the constructor and how we operate that Timer in the main body of our authenticate method.
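With the naming scheme used here, MetricRegistry.name(getClass(), "authenticate"), the new timer appears in the console report under the authenticator’s fully qualified class name, for example something like com.example.auth.FirebaseAuthenticator.authenticate (the package name here is only illustrative).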
Now if you wish to use annotations for collecting metrics from arbitrary classes, it gets more complicated and is by itself the subject of another blog post.
Resources
- Dropwizard Metrics Configuration
- Metrics
- Guide and Explanation for Metrics in Dropwizard