Metrics Driven Development?

At the Minneapolis DevOps Days during the open space portion of the program, Markus Silpala proposed the idea of Metrics Driven Development. The hope was that it could bring the same sort of value to monitoring and alerting that TDD brought to the testing world.

Admittedly, I was a bit skeptical of the idea, but I was intrigued enough to attend the space, and holy shit-balls am I glad I did.

The premise is simple: a series of automated tests that confirm a service is emitting the right kinds of metrics. The more I thought about it, the more I saw its power.

Imagine a world where Operations has a codified set of requirements (tests) describing what your application should provide feedback on. With a completely new project, you could run the test harness and see results like:

expected application to emit http.status.200 metric
expected application to emit application.health.ok metric
expected application to emit http.response.time metric

These are fairly straightforward examples, but they could quickly become more elaborate with a little codification by the organization.
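
For instance (a purely hypothetical, made-up pair of expectations), an organization might also require latency percentiles or business-level metrics:

expected application to emit http.response.time.p95 metric
expected application to emit business.orders.completed metric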

Potential benefits:

  • Operations could own the creation of the tests, serving as an easy way to tell developers the types of metrics that should be reported (a sketch of what that could look like follows this list).
  • It helps to codify how metrics should be named. A change in metric names (or a typo in the code) would be caught.
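
As a rough sketch of that ownership (the group name, the metrics helper, and the have_metric matcher are all hypothetical, not an existing library), Operations could maintain a shared example group that every service's suite pulls in:

# Hypothetical shared example group maintained by Operations.
# metrics.retrieve_all is assumed to return the metric names the
# application emitted into the local Graphite instance during the run.
RSpec.shared_examples "an instrumented service" do
  subject { metrics.retrieve_all }

  it { expect(subject).to have_metric("http.status.200") }
  it { expect(subject).to have_metric("application.health.ok") }
  it { expect(subject).to have_metric("http.response.time") }
end

A service's own spec would then just declare it_behaves_like "an instrumented service", and a renamed (or mistyped) metric fails the build instead of silently disappearing from the dashboards.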

Potential hurdles:

  • The team will need to provide some sort of bootstrap environment for the testing, perhaps a Docker container hosting a local Graphite instance for the system under test (a rough sketch follows this list).
  • You’ll need a naming convention or standard of some sort so that business-level metrics, which don’t fall under the usual operational naming patterns, can still be identified.
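
For the first hurdle, one possible bootstrap (a sketch only; the graphiteapp/graphite-statsd image name, container name, and port mappings are assumptions, and a real setup would want a readiness check) is a suite-level hook that starts and tears down a throwaway Graphite container:

# spec/spec_helper.rb -- rough sketch of bootstrapping a local Graphite
# instance for the system under test to report into.
RSpec.configure do |config|
  config.before(:suite) do
    # Start a disposable Graphite container (web UI on 8080, carbon on 2003).
    system("docker run -d --name mdd-graphite -p 8080:80 -p 2003:2003 graphiteapp/graphite-statsd")
  end

  config.after(:suite) do
    # Throw the container away so every run starts with a clean metric store.
    system("docker rm -f mdd-graphite")
  end
end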

I’m sure there are more, but I’m just trying to write down these thoughts while they’re relatively fresh in my mind. I’m thinking the test harness would be an extension of existing test framework DSLs. For an RSpec-style example:

describe "http call" do
  context "valid response" do
    subject { metrics.retrieve_all }

    before(:each) do
      get :index
    end

    it { expect(subject).to have_metric("http.status.200") }
    it { expect(subject).to have_metric("http.response.time") }
  end
end

This is just a rough sketch using RSpec, but I think it gets the idea across. You’d also have to configure the test to launch the Docker container, but I left that part out of the example (the suite-level hook sketched above is one way to do it).
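
The have_metric matcher in that sketch doesn’t exist anywhere yet; here’s roughly what it could look like using RSpec’s custom matcher DSL (the matcher name and the shape of the metrics list are assumptions on my part):

# Rough sketch of a custom have_metric matcher. It assumes the subject
# (metrics.retrieve_all) returns a collection of metric name strings
# pulled from the local Graphite instance.
RSpec::Matchers.define :have_metric do |expected|
  match do |emitted|
    emitted.include?(expected)
  end

  failure_message do |emitted|
    "expected application to emit #{expected} metric"
  end
end

That failure message also gives you the "expected application to emit ... metric" output shown earlier for free.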

Leaving the open space I was extremely curious about this as an idea and an approach, so I thought I’d ask the world. Does this make sense? Has Markus landed on something? What are the HUGE hurdles I’m missing? And most importantly, do people see potential value in this? Shout it out in the comments!

Jeffery Smith @darkandnerdy