How to set team KPIs that won't cost your lemonade stand $3B in fines.

Principles for designing inspiring, human-centered performance metrics.
7 minute read

Meet Lola. She’s 9 years old. She has 2 siblings — twins, 7 — Jack and Jane. They have a neighborhood friend, Murray, 8½. Murray isn’t particularly sharp, but Lola says he’s dependable.

Unlike Jack, Jane and Murray, Lola is ambitious.

Lola was never one for dress-up or playtime. Instead of princess costumes, Lola favors pantsuits. She owns 7 of them, one for each day of her work week. She’s never asked her parents for an iPhone; she bought herself a Blackberry two years ago — she claims the physical keyboard is faster for email.

Lola’s latest endeavor is selling lemonade.

Jack, Jane and Murray are dutifully on board. She’s also recruited other children from the neighborhood and divided her helpers into teams, so they can cover multiple locations across the neighborhood.

Unfortunately…

After the first week, as you may expect, Lola’s helpers didn’t live up to her high expectations.

As an aspiring operator, Lola remembered a quote in John Doerr’s “Measure What Matters” where Larry Page attributes Google’s success to OKRs:

“OKRs have helped lead us to 10x growth, many times over… They’ve kept me and the rest of the company on time and on track when it mattered the most.”

Like many who are just starting their metrics journey (and because, at 9, Lola has not yet fully developed abstract reasoning), she’s not sure how to select smart KPIs her team can rally around. She’s also heard the horror stories about Wells Fargo’s $3B in fines and litigation, a consequence of the bank’s singular focus on “# accounts opened.”

Believing in the power of metrics, but wary of negative side effects, Lola starts with first principles. She asks: “What makes a good metric in the first place?”

What makes a good metric?

  • Good metrics are controllable.
  • Good metrics support fair comparisons.
  • Good metrics are hard to game.
  • Good metrics are concrete and familiar.
  • Good metrics instill a sense of pride.


1. Good metrics are controllable.

Teams must have agency and ability to impact their metrics.

As an example, let’s look at Lola’s metrics for tracking customer satisfaction. Lola knows that happy customers are key for creating word-of-mouth growth. And she knows that her customers’ satisfaction primarily comes from 2 factors:

  1. The quality of the lemonade
  2. The customer experience (when interacting with her team)

Lola could measure overall customer satisfaction by surveying customers with NPS or PMF surveys, but those measures are only partially driven by team performance, and therefore only partially within her teams’ control. Instead, she needs a metric determined by customer experience, independent of product quality.

One possible way to observe a happy customer is looking at the tip jar. Lola reasons that people will tip after the transaction, but before they’ve tasted the lemonade, so it’s a clearer measure of their experience, independent of the product itself.

So she decides to measure the amount of tips received by each stand. We’ll call this Absolute $ Tips. She knows this still isn’t a very robust metric, but it’s something the team can influence directly, so it’s a good start.

Why does this matter?

Self-efficacy helps determine motivation (src). If the target outcome is outside a team’s control, trying to optimize against it can feel disheartening (src). Beyond that, it also makes the metric noisy and ambiguous, and therefore difficult to track and optimize against.

2. Good metrics support fair comparisons.

Metrics should fairly represent the underlying, often-heterogeneous teams being measured. They shouldn’t advantage some groups over others.

In Lola’s case, tracking Absolute $ Tips is hardly fair…

In an unfortunate bout of nepotism, Lola gave her siblings the prime location (close to the Whole Foods AND the Soul Cycle), whereas Murray and the other children were assigned to lower-traffic stands.

Not only are Jack and Jane seeing more traffic, their location’s demographic is also likely wealthier and able to give larger tips. Even on their worst days, Jack and Jane will pull in more tips than Murray. This means measuring Absolute $ Tips gives Jack and Jane a measurement advantage over Murray.

Absolute $ Tips is okay for tracking an individual team’s customer service, but it doesn’t support fair comparisons between teams. To correct this, Lola needs to account for demographic and traffic differences between stands.

Instead of Absolute $ Tips, changing the metric to # Tips might account for differences in tip sizes (e.g. instead of “$100 in tips,” it’s “Customers tipped 50 times”). But # Tips is still susceptible to traffic differences between locations, so a further refinement might be to normalize the metric, defining it relative to overall # of Customers.

This gives us: (# Tips) / (# Customers). This is a fairer metric for lemonade stand customer service across multiple locations.
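To see why normalizing matters, here is a minimal sketch in Python. The stand names come from the story, but the daily numbers are invented purely for illustration, and the helper name `tip_rate` is hypothetical.

```python
# Hypothetical daily numbers for two stands (illustrative only, not real data).
stands = {
    "Jack & Jane": {"tip_dollars": 100.0, "tips": 50, "customers": 200},
    "Murray": {"tip_dollars": 12.0, "tips": 12, "customers": 30},
}


def tip_rate(tips: int, customers: int) -> float:
    """(# Tips) / (# Customers); returns 0.0 for a stand with no customers."""
    return tips / customers if customers else 0.0


# Compute the normalized metric for each stand.
rates = {name: tip_rate(s["tips"], s["customers"]) for name, s in stands.items()}

for name, s in stands.items():
    print(f"{name}: ${s['tip_dollars']:.2f} in tips, tip rate {rates[name]:.2f}")
```

With these made-up numbers, Jack and Jane win easily on Absolute $ Tips, but Murray's stand has the higher tip rate, which is the comparison Lola actually cares about.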

This is looking better! But there’s another concern…

Why does this matter?

For metrics (or any workplace intervention) to be successful, employees need to “buy in” and trust that it represents their interests (src).

Perceptions of organizational justice can significantly impact employee motivation and acceptance of organizational change (src, src). If team members perceive a metric as unfair, they’ll take it less seriously and it’ll be less influential.

3. Good metrics are hard to game.

It’s good that a single metric can be accomplished in a variety of ways. This leaves room for teams to apply their own creative solutions. But, without care, this can incentivize the wrong things. (For example, tracking employee workstation activity often leads to deceptive “mouse moving.”)

Lola is concerned that even the new normalized customer satisfaction metric can still be gamed.

For example, Murray’s partner Don Jr. has always gotten a little extra help from his parents, and Lola suspects they may be inflating Don’s numbers by tipping multiple times per order.

So she revises her metric again by changing # Tips in the numerator to # Tippers (i.e. number of customers who tipped). This minimizes the influence of any one customer on the ratio and brings the metric closer to the intended outcome: Happy customers.

This gives us: (# Tippers) / (# Customers).
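The difference between counting tips and counting tippers can be sketched in a few lines of Python. The order log, customer names and function names below are all hypothetical, chosen only to show how one enthusiastic customer distorts the first ratio but not the second.

```python
# Hypothetical order log: (customer_id, number_of_tips_left). Illustrative data only.
orders = [
    ("don_sr", 4),  # one customer tipping four times on a single order
    ("ana", 1),
    ("ben", 0),
    ("cy", 0),
]


def tip_rate(orders):
    """(# Tips) / (# Customers): every tip counts, so repeat tipping inflates it."""
    customers = {cid for cid, _ in orders}
    if not customers:
        return 0.0
    return sum(tips for _, tips in orders) / len(customers)


def tipper_rate(orders):
    """(# Tippers) / (# Customers): each customer counts at most once."""
    customers = {cid for cid, _ in orders}
    if not customers:
        return 0.0
    tippers = {cid for cid, tips in orders if tips > 0}
    return len(tippers) / len(customers)


print(tip_rate(orders))     # inflated by the repeat tipper
print(tipper_rate(orders))  # share of customers who tipped at all
```

In this sketch, the repeat tipper pushes the tip rate above 1.0, while the tipper rate stays bounded between 0 and 1 no matter how many times any single customer tips.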

As a further guard, Lola could also “qualify” the metric to only count “real” tips from “real” customers (where “real” means customers who aren’t employees’ family). She might offer a friends-and-family discount with a sign-in sheet, then subtract the # of F&F Signatures from # Tippers.

This would give us: (# Tippers - # F&Fs) / (# Customers - # F&Fs).
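As a rough sketch, the qualified ratio could be computed like this in Python. The function name and the sample counts are hypothetical. Note that the formula, as written, implicitly treats every friends-and-family signer as a tipper; the guards against a negative numerator or an empty denominator are assumptions added here, not part of Lola's definition.

```python
def qualified_tipper_rate(tippers: int, customers: int, ff_signatures: int) -> float:
    """(# Tippers - # F&Fs) / (# Customers - # F&Fs).

    Assumption: every F&F signer is counted as a tipper, as the formula implies.
    """
    real_customers = customers - ff_signatures
    if real_customers <= 0:
        return 0.0  # assumed fallback: no non-family customers to measure
    real_tippers = max(tippers - ff_signatures, 0)  # assumed clamp at zero
    return real_tippers / real_customers


# Illustrative numbers only: 20 tippers, 50 customers, 5 F&F signatures.
rate = qualified_tipper_rate(tippers=20, customers=50, ff_signatures=5)
print(f"Qualified tipper rate: {rate:.2f}")
```

If Lola could see which signers actually tipped, subtracting only those from the numerator would be more accurate, at the cost of an even more complicated metric.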

This is getting relatively abstract though…

Why does this matter?

The biggest challenge with any performance metric: “You’ll get what you ask for, but not necessarily what you want” (src). To avoid incentivizing sketchy behavior, metrics must be “hard to game,” typically by aligning them more closely with the intended result, qualifying them or considering them alongside some counterbalance metric (src).

4. Good metrics are concrete and familiar.

Good KPIs are easy to relate to the underlying work and behaviors. They use language that’s familiar to the team. They’re no more abstract or complex than necessary.

Lola feels good about her “tipper rate” metric as an indicator of team contributions to customer satisfaction, but now she’s worried it’s too abstract, especially considering her team is 100% children. She wants to make sure that, day-to-day, the team isn’t distracted by the particulars of how a metric is calculated.

So she takes a step back. What #s should her team focus on “in the moment”? That’s easy. Back to Absolute $ Tips. Seeing a customer put a $ into the jar is a visceral experience. And the tip jar itself is a visual indicator of that metric: the fuller the tip jar, the better you’re doing.

Why does this matter?

Information presented concretely is easier to remember and more likely to stick in our minds (src, src). For metrics to have an impact, people must be able to easily relate the numbers to the underlying behaviors and actions.

5. Good metrics instill a sense of pride.

Well-designed metrics memorialize a team’s hard work. They demonstrate how individual contributions build to some greater good for the business, or ideally, for our fellow humans. They look good on a resume or a humble-brag on LinkedIn.

Lola’s last concern with her “tipper rate” metric is that it sets a selfish tone. Her team obviously benefits from the tips, but how does that relate to their overall mission? Lola has, of course, articulated their mission statement: “Hydrating and refreshing the world with organic, citrus beverages.”

So, after a busy week, when the team reflects back on their work, how can they keep the focus on the mission and avoid fixating only on the tips they’ve earned for themselves (or Lola’s questionable labor practices)?

Lola, thinking back to a case study she’d read on Ray Kroc, remembers the #s on Golden Arches signs around the world: “Over 99 billion served.” Lola adapts this for her enterprise: ### Customers Refreshed.

After calculating each day’s sales, she updates the “customers refreshed” metric and sends it out to her team. This helps align her budding organization on the people they’re helping (vs. the money they’ve earned). And, at the end of the summer, when the number has accumulated a few extra zeros, they can reflect and feel proud of keeping their community hydrated.

Why does this matter?

We all want to be part of something bigger than ourselves. This is a fundamental part of what motivates us as people (src). The metrics we choose should reflect that. Further, as humans, we’re terrible at weighing “value now” vs. “value later” (src), so the more visible you can make long-term metrics, the easier it is to relate current actions to long-term outcomes.


Takeaway:

It’s true that “what you measure, you’ll improve.” But it’s also true that “getting what you ask for” doesn’t mean “getting what you want,” and your efforts at performance tracking might do more harm than good. To solve for this, design your metrics intentionally. Ensure that each KPI is actionable, fair, robust, concrete and inspiring.


