OKRs

At TaskRabbit, we have something called Objectives and Key Results, or OKRs for short. Many companies do the same.

We’ve always talked about them as one thing. As in, “What’s the OKR?” That would more or less mean, “What number are we trying to hit?” An example would be getting better at fulfilling tasks by focusing on raising the percentage of posted tasks that are successfully completed.

This metrics-driven approach can really work, but its efficacy depends on the metric chosen. If the team believes it’s the right number to move, it works out. If they see weird loopholes or tricks, it is less effective. Because of this, there is much discussion each time about the best metric to focus on. Assuming we don’t focus on something all-encompassing (“Revenue”), it has often turned out to be very difficult to find exactly the right metric.

My recent realization is that there are objectives and key results. Of course, that’s obvious because it’s the name of the thing. However, in my mind, I don’t think I ever separated them to the degree that they deserve to be separated. It was always about hitting the metric and that was the objective. The goal, as I now understand it, should be to make the objective more of the “intent” of the situation. The key result is simply a way to “sample” that intent.

Example

As an example, we could have an objective of “making TaskRabbit a habit.” I think it’s helpful to make it completely about the intent and not something like “increasing lifetime value” or “doubling monthly active users” or whatever. That helps when making decisions throughout the quarter and is open to interpretation as the understanding of it evolves. The key result, then, is for sampling the progress. It could be “increase the average number of tasks per month to X” or “get Y% more people to their 4th completed task,” or “increase the average number of consecutive weeks of usage to Z” or any number of other options.

Each metric will have its holes, but the best will be the one or two that:

  • are highly correlated to the intent of the objective
  • can be sampled frequently to understand and measure progress
  • are easily explainable and understandable to everyone

This produces a metric that doesn’t have to have its nuances explained every time we talk about it, can be on the wall in dashboard form to see if it’s going well, and that we believe is a good enough sample of the goal.

Metric

I think one of the trickier traps is the allure of percentages. Let’s say that we have the retention goal as noted above. One seemingly fine metric is to make up some concept called “habitors” (those who have made TaskRabbit a habit). We could define that to mean they get 4 tasks done in their first 2 months. The goal could be to get 50% of new users to be habitors, up from 40%. In my experience, this fails the “can be sampled frequently to understand and measure progress” point.

First, the window is too long. We can only really report a percentage after someone has had their 2 months on the platform. Anything before that would have to be caveated as a projection. The graph will also drop off towards the current date. I can explain that, but I believe it has a psychological effect on the team. Finally, the long window tends to suggest that we need to work forward from acquiring the user. We’ll start that tomorrow with a new cohort and we’ll know in 2 months whether we succeeded. That’s just not the right level of urgency.
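To make the lag concrete, here is a minimal sketch of computing such a percentage metric. All the names, data shapes, and thresholds are hypothetical illustrations, not TaskRabbit’s actual definitions; the point is that only users whose full window has closed can be counted without projecting.

```python
from datetime import date, timedelta

# Hypothetical data: each user maps to (signup date, task completion dates).
users = {
    "alice": (date(2017, 1, 5), [date(2017, 1, 10), date(2017, 1, 20),
                                 date(2017, 2, 1), date(2017, 2, 15)]),
    "bob":   (date(2017, 3, 1), [date(2017, 3, 5)]),
}

WINDOW = timedelta(days=60)   # "first 2 months"
TASKS_NEEDED = 4              # tasks completed to count as a "habitor"

def habitor_rate(users, today):
    """Percent of new users who became habitors. Only users whose full
    2-month window has already closed can be counted definitively."""
    measurable = {u: (signup, tasks) for u, (signup, tasks) in users.items()
                  if signup + WINDOW <= today}
    if not measurable:
        return None  # nothing reportable yet without projecting
    habitors = sum(
        1 for signup, tasks in measurable.values()
        if len([t for t in tasks if t <= signup + WINDOW]) >= TASKS_NEEDED
    )
    return 100.0 * habitors / len(measurable)

# As of April 1st, only alice's window has closed; bob is still a projection.
print(habitor_rate(users, date(2017, 4, 1)))  # -> 100.0
```

Note how bob simply vanishes from the denominator until his window closes: that is the drop-off towards the current date described above.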

Second, percentages as a whole seem to be a problem - at least when combined with the long time period. It’s just hard to visualize what a 1% change means. When prioritizing work, we have to understand whether the work will make a meaningful dent in the problem, and 1% could be 10 people or 10,000. Knowing which is essential when calculating the likely return on investment of a feature.

So if percentages are not actionable, what is? Straight numbers. I’ve seen better success with a metric like “1,000 people post their 4th task each day.” Because the time window is so long, we already know how many people have signed up and will fall within the window for at least the next two months. This lets us simply set a monthly, weekly, or daily number. It fixes the mindset issue by noting that there are people right now whom we should be pushing up that ladder, rather than starting a new cohort. It also reduces the metric’s lag: instead of waiting weeks, we can report on the number so far today. Importantly, it has a similar level of correlation to the objective.
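The straight-number version is also simpler to compute. A sketch, again with hypothetical names and data shapes: given a chronological log of task completions, count the users whose 4th task landed on a given day. That number is reportable the same day, with no cohort window to wait out.

```python
from datetime import date

# Hypothetical event log: (user, completion date) for every completed task,
# assumed to be in chronological order per user.
completions = [
    ("alice", date(2017, 4, 1)), ("alice", date(2017, 4, 2)),
    ("alice", date(2017, 4, 2)), ("alice", date(2017, 4, 3)),
    ("bob",   date(2017, 4, 3)),
]

def fourth_tasks_on(completions, day, n=4):
    """Count users whose n-th completed task landed on the given day --
    a straight number we can report today, not in two months."""
    counts = {}
    hits = 0
    for user, completed in completions:
        counts[user] = counts.get(user, 0) + 1
        if counts[user] == n and completed == day:
            hits += 1
    return hits

print(fourth_tasks_on(completions, date(2017, 4, 3)))  # -> 1
```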

Tactics

Another issue I’ve seen is focusing the objective or a key result on a tactic. It’s a viable tactic to try to reach the objective by instituting some sort of “punchcard” program like at the coffee shop: maybe you post 3 tasks and get 1 free. It’s a fine feature, but it would be a mistake to bake that hypothesis into the objective or a key result.

Maybe a year into the program it can be a goal to “increase punchcard participation” or something, but definitely not at the beginning. It’s just one tactic at this point. If it doesn’t move the numbers, we try the next thing.

Learnings

It should be easy to agree on the overall intent (objective). Let’s agree that we’re going to work towards the objective and not do any unnatural acts to hit any specific number. That said, we need to be able to sample a few metrics (key results) to course correct and drive motivation. And yes, that’s a good tactic, but let’s apply it to something higher-level so we’re open to learning as we go.

It’s completely possible that this is already a chapter in some book or a known theory with slightly different names. Either way, this framework helped make these conversations go much more smoothly this quarter, so I thought I would share.

Copyright © 2017 Brian Leonard