Note: This is Part 2 of a two-part series.

Building the Foundation of the Correlation Model

So that’s the theory. But how do you actually put this into practice?

The most important thing is to start at the beginning. You need a clear understanding of the objective or business impact that drives your need to monitor your metrics in the first place. This can be an initial stumbling block, because IT organizations don’t typically think in those terms. Going back to MTRS, why are you measuring it? What’s the impact of a good or improving MTRS?

The answer is that you measure MTRS because it directly impacts the availability of your services to the business. Of course, your objective should be that your services don’t fail in the first place (that’s another metric stream), but when they do, MTRS is the measure of how quickly you are able to restore that service. So MTRS is tied to the business impact of Improving (or Maintaining) Service Availability. The better your MTRS, the more available services will be.

Because MTRS is the metric that is most closely aligned with your objective and sits at the top of the correlation stream, we refer to it as an Outcome metric. (Note: You may have multiple Outcome metrics for each Objective and you will likely have multiple Objectives.) Outcome metrics measure the result of processes, actions, or improvements in technology. You cannot directly manipulate an Outcome metric. You can only improve it by changing something that is correlated to it. Outcome metrics are typically the KPIs that senior IT leadership is most concerned with, and they report the results of IT’s overall efforts and effectiveness.

Ideally, you should start by developing the key objectives and impact you desire and then map the correlation model top-down. In practice, however, you are often starting with a set of metrics and KPIs that you are already measuring and will instead need to fit them together like a puzzle. That’s OK, but you need to keep asking yourself, “So what?” to understand why each metric matters in context, until that leads you to an Objective that is in line with your overall IT strategy and will be intuitively understood by the business. You may find it easiest to tackle this from both ends. Fit some of the pieces together and, once you are beginning to make some sense of them, take a fresh look at what the key business-driven objectives and desired impact should be. From there, you should be able to more easily identify your top-level Outcome metrics.

Identifying Indicator and Diagnostic Metrics

With your Objectives and Outcome Metrics defined, the fun can begin.

The process is pretty straightforward. You simply take each Outcome metric and ask yourself what things may impact or contribute to that outcome. Using our MTRS example, we identified initial response and overall incident volume as key contributing factors to MTRS. Could there be others? Definitely, but you need to be careful to protect the lines of causality.

For instance, you might say that an incorrectly categorized incident would lead to MTRS rising. That’s true, but it is probably a component of your response target breach rate. The resolving team couldn’t respond within the targets because they got the incident after it had already breached – and that occurred because it was incorrectly assigned.

You clearly need to adapt this to your organization and how, for instance, you choose to calculate an initial response time target. But the point is that you should strive to truly understand the correlation between the underlying metrics that support an Outcome metric. It’s also important to understand the concept of Indicator and Diagnostic metrics.

Your objective is to create a model that allows you to iteratively dig deeper to understand the corrective action that is required.

  • MTRS is rising – why?
  • Because too many incidents are breaching our initial response targets – why?
  • Because too many incidents are being categorized incorrectly.
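The "why?" chain above can be sketched in a few lines of Python. This is only an illustration, not a prescribed implementation: the Metric class, the worsening flag, and the kind assigned to each metric are assumptions made for the example, and the metric names are taken from the MTRS discussion.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    name: str
    kind: str        # "outcome", "indicator", or "diagnostic" (assumed labels)
    worsening: bool  # hypothetical flag: is the metric trending the wrong way?
    contributors: list["Metric"] = field(default_factory=list)


def drill_down(metric, path=()):
    """Yield each chain of worsening metrics from `metric` down to its deepest cause."""
    if not metric.worsening:
        return
    path = path + (metric.name,)
    worsening_children = [c for c in metric.contributors if c.worsening]
    if not worsening_children:
        # Nothing further down is worsening, so the chain ends here.
        yield path
    for child in worsening_children:
        yield from drill_down(child, path)


# Illustrative model only; the values are invented for the example.
miscategorized = Metric("Incorrect categorization rate", "diagnostic", True)
breach = Metric("Response target breach rate", "indicator", True, [miscategorized])
volume = Metric("Incident volume", "indicator", False)
mtrs = Metric("MTRS", "outcome", True, [breach, volume])

for chain in drill_down(mtrs):
    print(" -> ".join(chain))
# prints: MTRS -> Response target breach rate -> Incorrect categorization rate
```

Each arrow in the printed chain corresponds to one "why?" in the list, and the last name in the chain is where corrective action would be aimed.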

 

To follow this line of thinking, some of our metrics will be Indicator metrics (such as Response Target Breach Rate). They indicate a potential cause and are really signposts. You will normally not be able to do much to directly impact these Indicator metrics, so they should always be correlated to Diagnostic metrics.

Diagnostic metrics are those that enable you to take some direct action. Too many incidents are being categorized incorrectly? You can take direct action to correct this: you can simplify the categorization schema, introduce some new training, add an audit step, etc. Whatever it is, by taking that corrective and proactive action, you will be able to impact all of the upstream, correlated metrics. It is critical that you develop the model to the point that you always end each correlation stream with one or more Diagnostic metrics. If you don’t, you will have trouble making your correlation model actionable.
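One way to check that every correlation stream ends in a Diagnostic metric is a simple scan for dead ends. The sketch below assumes the model is stored as a plain dictionary; the structure, the dead_ends helper, and the unfinished "Incident volume" stream are all hypothetical and exist only to show the check.

```python
# Hypothetical model: each metric maps to (kind, list of contributing metrics).
MODEL = {
    "MTRS": ("outcome", ["Response target breach rate", "Incident volume"]),
    "Response target breach rate": ("indicator", ["Incorrect categorization rate"]),
    "Incorrect categorization rate": ("diagnostic", []),
    "Incident volume": ("indicator", []),  # deliberately left unfinished
}


def dead_ends(model):
    """Return metrics with no contributors that are not Diagnostic metrics."""
    return sorted(
        name
        for name, (kind, contributors) in model.items()
        if not contributors and kind != "diagnostic"
    )


print(dead_ends(MODEL))
# prints: ['Incident volume']
```

Any name this returns marks a stream that stops at a signpost: you can see that something is wrong, but the model does not yet tell you what to change.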

What a Correlation Model Isn’t

Once you have built your correlation model and put it into practice, you will find it to be an extremely powerful tool to help you manage your IT organization and direct it toward the continual achievement of your objectives. But because it can be so powerful, it is not uncommon to want to try to make it something that it’s not.

A Metrics Correlation Model should never be viewed as a replacement for transactional monitoring and management tools typically found in Service Management products. (Although it may be built using their reporting and dashboarding tools.) Those real-time dashboards and tools are designed to help you manage in-flight incidents, changes, etc. as they progress through the process. It’s important to do that well and many of the issues with IT performance can be tackled during the transactional process.

But working at a transactional level is like staring at the bark of a tree trunk. Eventually, you lose perspective and are unable to see the forest. The purpose of the Metrics Correlation Model is to enable you to see all of your points of measure in context and in relation to your overall objectives. It is meant to be a retrospective analysis tool.

The model will enable you to seek out and identify trends that manifest themselves across multiple metrics and over long periods of time, and to see the things that you miss at a transactional level. This is critically important, but it is not synonymous with transactional reporting. If you attempt to combine the two or make the model more than it should be, you risk ending up with something that serves neither the transactional nor the analytical need effectively.
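As a rough illustration of this kind of retrospective analysis, the sketch below computes a Pearson correlation coefficient between two monthly series. The numbers are invented purely for the example, and the choice of Pearson correlation is an assumption; your own model may call for a different measure or a longer history.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up monthly values, for illustration only.
breach_rate = [0.08, 0.10, 0.09, 0.13, 0.15, 0.18]  # share of incidents breaching response targets
mtrs_hours = [4.1, 4.4, 4.2, 5.0, 5.3, 5.9]         # mean time to restore service, in hours

r = pearson(breach_rate, mtrs_hours)
print(f"Correlation between breach rate and MTRS: {r:.2f}")
```

A value close to 1 suggests the two metrics have moved together over the period. It does not prove causality on its own, which is why the model still relies on your understanding of how one metric contributes to another.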

Leveraging Context to Take Action

Building a correlation model will enable you to overcome the single greatest challenge that most IT organizations have with their metrics efforts – how to go from data to action that has an impact. It is very easy to fall into the trap of measuring metrics for their own sake without having the clear context of how any given metric can be moved or how it affects the goals of the IT organization.

In the end, no business really cares about MTRS. And so, IT shouldn’t care about MTRS either – you should only care about how MTRS affects your ability to meet your objectives and deliver what the business does care about. That’s the power and the promise of a metrics correlation model.
