Calibration Sessions: The Quiet Engine of Fair Performance Management

60-Second Summary

Manager rating tendencies vary by 1–1.5 points on a 5-point scale — calibration normalizes the gap.
Calibrate in groups of 6–10 peer managers, facilitated by HR.
Discuss outliers, not every employee. Use the time on the disagreements.
Force comparison: 'Why is A a 4 and B only a 3?'
Audit calibration outcomes by protected class — bias compounds in unaudited sessions.

Industrial-organizational psychologists have measured manager rating tendencies for decades. The result is consistent: a 5-point performance scale, used by 20 untrained managers, produces a 1–1.5 point spread for identical performance. Calibration is the only known correction.

Why calibration exists

Without calibration, the rating an employee receives is partly a function of who their manager is. Lenient managers protect their teams; stringent managers punish theirs. The result is unfair compensation, unfair promotion, and lost trust in the entire performance system.
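To make the manager effect concrete, here is a minimal Python sketch of how a facilitator might spot lenient and stringent raters in the proposed ratings before the session. The data, the manager names, and the `manager_offsets` helper are all hypothetical, and a gap in average rating can also reflect a real difference in team performance, so the output only shows where to ask questions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical proposed ratings: (manager, employee, rating on a 1-5 scale).
proposed = [
    ("manager_a", "e1", 4), ("manager_a", "e2", 4), ("manager_a", "e3", 4),
    ("manager_b", "e4", 3), ("manager_b", "e5", 2), ("manager_b", "e6", 3),
    ("manager_c", "e7", 3), ("manager_c", "e8", 4), ("manager_c", "e9", 3),
]

def manager_offsets(ratings):
    """Return each manager's mean rating minus the overall mean rating."""
    by_manager = defaultdict(list)
    for manager, _employee, rating in ratings:
        by_manager[manager].append(rating)
    overall = mean(rating for _manager, _employee, rating in ratings)
    return {m: round(mean(rs) - overall, 2) for m, rs in by_manager.items()}

offsets = manager_offsets(proposed)
# Gap between the most lenient and the most stringent manager, in rating points.
spread = round(max(offsets.values()) - min(offsets.values()), 2)

print(offsets)  # {'manager_a': 0.67, 'manager_b': -0.67, 'manager_c': 0.0}
print(spread)   # 1.34
```

In this made-up data the gap between the most lenient and the most stringent manager is 1.34 points, which is the kind of spread the calibration discussion is meant to examine.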

Structure of a 90-minute session

A facilitated calibration that produces decisions

1. Pre-read (sent 48h before)
Manager-proposed ratings for all team members, with a one-paragraph rationale for each non-meets rating. A forced distribution is NOT used as a target.
2. Opening (10 min)
The facilitator restates the definition of each rating level and names the bias risks (recency, similarity, leniency, halo).
3. Outlier review (60 min)
The group focuses on the proposed top-rating and bottom-rating cases and forces comparisons across teams.
4. Distribution check (10 min)
Look at the final distribution by gender, ethnicity, tenure, and team. Flag anything that looks off for HR follow-up.
5. Close (10 min)
Confirm decisions, owners, and the timeline for manager-employee conversations. Everything stays confidential to this room.
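The distribution check in step 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a prescribed audit method: the group labels, the `TOP_RATING` cut-off, the `FLAG_GAP` threshold, and both helper functions are hypothetical, and shares computed on very small groups are noisy.

```python
from collections import Counter, defaultdict

# Hypothetical final ratings: (employee, group label, calibrated rating 1-5).
final = [
    ("e1", "group_x", 4), ("e2", "group_x", 3), ("e3", "group_x", 5),
    ("e4", "group_x", 3), ("e5", "group_y", 3), ("e6", "group_y", 2),
    ("e7", "group_y", 3), ("e8", "group_y", 4),
]

TOP_RATING = 4   # assumption: ratings of 4 and 5 count as "top" ratings
FLAG_GAP = 0.20  # assumption: illustrative threshold, not an established standard

def top_share_by_group(rows, top=TOP_RATING):
    """Return the share of employees in each group rated `top` or higher."""
    counts = defaultdict(Counter)
    for _employee, group, rating in rows:
        counts[group]["total"] += 1
        if rating >= top:
            counts[group]["top"] += 1
    return {g: c["top"] / c["total"] for g, c in counts.items()}

def flag_for_follow_up(shares, gap=FLAG_GAP):
    """Return groups whose top-rating share trails the best group by more than `gap`."""
    best = max(shares.values())
    return sorted(g for g, share in shares.items() if best - share > gap)

shares = top_share_by_group(final)
flagged = flag_for_follow_up(shares)

print(shares)   # {'group_x': 0.5, 'group_y': 0.25}
print(flagged)  # ['group_y']
```

The same two functions can be run once per dimension (gender, ethnicity, tenure, team) by changing which label is stored in the second field. A flag is a prompt for HR follow-up, not a finding.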

The facilitator's playbook

Open with the definitions: managers' shared understanding of each rating level drifts over 12 months.
Never let a manager defend their own team unchallenged. Ask peers: 'Does this rating compare cleanly to someone in your team at the same level?'
Name biases out loud when you see them. 'I think we're recency-biased here — what did this person do in Q1?'
Time-box every case to 7 minutes. Endless debate hides indecision.
Capture decisions in real time on a shared screen. No 'we'll write it up later'.
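The 7-minute time-box has a practical consequence for the agenda. The short Python calculation below uses only the durations given in this article; the variable names are illustrative.

```python
# Agenda durations in minutes, taken from the 90-minute session structure.
AGENDA_MINUTES = {
    "Opening": 10,
    "Outlier review": 60,
    "Distribution check": 10,
    "Close": 10,
}
CASE_TIME_BOX = 7  # minutes per case, as set in the facilitator's playbook

total_minutes = sum(AGENDA_MINUTES.values())
# Whole cases that fit in the outlier review at the full time-box.
max_cases = AGENDA_MINUTES["Outlier review"] // CASE_TIME_BOX
# Minutes left over once those cases are done.
leftover_minutes = AGENDA_MINUTES["Outlier review"] % CASE_TIME_BOX

print(total_minutes, max_cases, leftover_minutes)  # 90 8 4
```

At the full time-box, the outlier review has room for about eight cases. If the pre-read shows more top-rating and bottom-rating cases than that, the facilitator will probably need to prioritise them or plan more time.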

Post-calibration discipline

Within 48h: HR analyses the final distribution by protected class and reports back to the group.
Managers brief their direct reports within 2 weeks using the calibrated rating.
Compensation decisions flow from calibrated ratings, not pre-calibration proposals.
Notes from the room never leave the room. Confidentiality is the only reason managers will speak honestly next time.
