What goes down at performance review calibration?

The problem with people is that they’re all different. It’s really annoying. This manifests in performance reviews in two ways:

Direct reports all have different talents/skills/opportunities. Many spike in some areas but could improve in other areas. This makes it tough to compare people to “the bar” for their given role. What rating do you give someone who is reaching into the next level in some areas but not meeting their current expectations in other areas?
Managers all have different interpretations of what it means to be “at-level” for any given level. Some managers have higher expectations than others. Some managers tend to give higher ratings and earlier promotions than others.

Performance review calibration is an attempt to solve these problems. For non-managers, it can seem annoyingly opaque. So if you’re curious, here’s how it goes down. The exact process varies from company to company, but most seem to follow the same basic formula.

Step 1: Before the meeting, each manager enters ratings and justifications for their direct reports into a document/spreadsheet/app. Depending on the company, the justifications can be anywhere from a few sentences to multiple pages of evidence explaining what behaviors and actions led the manager to choose that rating.

The evidence should be specific, and at most companies there’s an emphasis on measurable impact. So “employee did X” is nowhere near as persuasive as “employee did X, which led to Y% improvement in Z.” Good managers will work with their people to make sure the evidence is results-based instead of actions-based.

Step 2: Then all of the managers in the org meet and go person by person, discussing whether each rating is fair and correct by comparing the expectations of the person’s job profile with their behaviors and achievements. That’s why it’s called calibration: each employee is calibrated against the expectations of their job profile and each manager’s interpretation of what those expectations mean in practice. Sometimes during the course of the discussion, the group decides to adjust a person’s rating up or down.

(Tangent rant) This, by the way, is one common problem with calibrations: managers are sometimes tempted to defend their ratings instead of working to find the best rating, even if it means admitting they were wrong. In those situations, managers who are charismatic presenters or not afraid to argue loudly can end up getting their people higher ratings than they have earned. And managers who are soft-spoken, unskilled speakers, or just nervous can fail to get their people the ratings and promotions they deserve.

That’s why it’s so important that the company culture around calibrations is one of healthy conflict and psychological safety. And making as much of it async as possible can help a lot too. I’ve ranted about this before. (End tangent rant)

So an example question could be: “Based on your description, it sounds like this person needs a bit more oversight than we’d expect in a Software Engineer Level 2 [or whatever leveling system your company uses] with an 8 rating [or whatever rating system your company uses]. Is that true?” And then the group would discuss whether that’s a legit concern and whether it warrants lowering the rating or if it’s balanced out by other things.

Step 3: By the end, the goal is that the group has “calibrated” their understanding of the levels, and that they’ve all agreed on ratings for each of their direct reports. The meeting is over once that’s true.

Step 4: Then they go off and enter those numbers into Workday [or whatever employee management app your company uses] and the rest is history.
