Kirkpatrick's four levels in modern L&D — what's still useful, what's been replaced
The 1959 evaluation model that still anchors enterprise L&D, the persistent measurement gap at Levels 3 and 4, and the Phillips ROI addition most companies skip.
- Donald Kirkpatrick's four-level evaluation model (Reaction, Learning, Behavior, Results) is still the dominant L&D framework more than 65 years after publication, because no replacement has been better.
- Most L&D programs measure Level 1 (satisfaction) and stop there. ATD research: only 14% of organizations measure Level 4 (results), and only 5% calculate Phillips' Level 5 ROI.
- The measurement gap isn't about tooling — it's about willingness. Measuring behavior change requires honest before/after comparison and isolation from other variables.
- Modern L&D adds two layers Kirkpatrick didn't: predictive analytics (will this person apply the learning?) and continuous rather than event-based measurement (Behavior at 30/90/180 days, not Behavior at 'after the course').
Donald Kirkpatrick wrote his four-level framework as a University of Wisconsin PhD dissertation in 1954 and published it in 1959, and it still anchors every serious L&D evaluation conversation in 2026. That's a shelf life of more than 65 years in a field that loves new frameworks. There's a reason.
The four levels revisited
| Level | Question | Typical measure | Honest difficulty |
|---|---|---|---|
| 1 — Reaction | Did they like it? | Post-program survey, NPS | Easy and over-measured |
| 2 — Learning | Did they learn it? | Pre/post test, certification | Moderate — depends on assessment quality |
| 3 — Behavior | Are they doing it differently at work? | Manager observation, work-product analysis, 360 feedback | Hard — requires longitudinal observation |
| 4 — Results | Did it move a business outcome? | KPI movement attributable to the intervention | Very hard — confounded by other variables |
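The "confounded by other variables" difficulty at Level 4 can be made concrete with a small sketch. The article does not prescribe an isolation method; the comparison-group adjustment below (a simple difference-in-differences) is one common option, swapped in here purely as an illustration. The function name and all figures are hypothetical.

```python
from statistics import mean


def comparison_adjusted_change(trained_before, trained_after,
                               comparison_before, comparison_after):
    """Change in the trained group minus change in an untrained comparison group.

    Assumption: both groups faced the same outside influences (seasonality,
    pricing changes, new tooling), so the comparison group's change
    approximates what would have happened without the program.
    """
    trained_change = mean(trained_after) - mean(trained_before)
    comparison_change = mean(comparison_after) - mean(comparison_before)
    return trained_change - comparison_change


# Hypothetical KPI readings (e.g. deals closed per rep per quarter).
trained_before = [10, 12, 11]
trained_after = [14, 15, 13]
comparison_before = [10, 11, 12]
comparison_after = [12, 12, 12]

effect = comparison_adjusted_change(
    trained_before, trained_after, comparison_before, comparison_after
)
print(f"Raw before/after change in trained group: "
      f"{mean(trained_after) - mean(trained_before)}")
print(f"Change after subtracting the comparison group's change: {effect}")
```

The raw before/after number overstates the program's effect whenever the comparison group also improved, which is the point of attempting isolation at all.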
Where most programs stop
The measurement collapse between Level 2 and Level 4 is the L&D function's central credibility problem. CFOs see a $5M training spend and ask what changed; the L&D team has detailed satisfaction scores and no behavior or results data. The CFO concludes correctly that nobody knows whether the spend worked.
Phillips' ROI extension
Jack Phillips added a fifth level in the 1980s: Return on Investment, calculated as ((Benefit − Cost) ÷ Cost) × 100. It's controversial because attribution is hard — but the discipline of attempting the calculation forces L&D teams to specify what behavior change they expect and what business outcome flows from it.
Tight ROI calculation is realistic for skill-based training tied to revenue-generating behavior (sales training, customer-success methodology, technical certification). It's harder for leadership development and culture work — where the right metric is more often retention or engagement than P&L.
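As a minimal sketch of the Phillips formula quoted above, the arithmetic looks like this. The function name and the dollar figures are hypothetical; the hard part in practice is deciding how much benefit can honestly be attributed to the program, which this code simply takes as an input.

```python
def phillips_roi_percent(benefit: float, cost: float) -> float:
    """ROI % = ((benefit - cost) / cost) * 100, as stated in the text."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost * 100.0


# Hypothetical figures for illustration only.
program_cost = 500_000.0        # fully loaded program cost
attributed_benefit = 650_000.0  # monetary benefit attributed to the program

roi = phillips_roi_percent(attributed_benefit, program_cost)
print(f"ROI: {roi:.1f}%")
```

A benefit equal to the cost gives an ROI of 0%, and a benefit below the cost gives a negative ROI, so the figure reads as net return per dollar spent rather than as a benefit-to-cost ratio.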
Modern additions
- Predictive analytics (Level 0): Before the program, who is likely to apply this learning based on role, manager, and team context? Saves spend on participants with no chance of transfer.
- Continuous Level 3 measurement: Behavior measured at 30, 90, 180 days post-program, not 'right after.' Kirkpatrick's original framing was event-based; the modern version is longitudinal.
- Manager-as-multiplier: Whether learning transfers is more dependent on the manager's behavior reinforcement than on program design. Track manager support as a Level 3 leading indicator.
- Skills-based credential capture: Level 2 verification now flows into skills inventory for internal mobility (closes the loop with talent marketplace systems).
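The longitudinal Level 3 cadence in the list above can be sketched as a simple schedule generator. This is an illustration under stated assumptions: the function name, the constant, and the example date are hypothetical, and the offsets are calendar days counted from the program's end date.

```python
from datetime import date, timedelta

# Check-in offsets in calendar days after the program ends (30/90/180 per the text).
CHECK_IN_OFFSETS_DAYS = (30, 90, 180)


def level3_check_ins(program_end: date,
                     offsets=CHECK_IN_OFFSETS_DAYS) -> list[date]:
    """Return the dates on which behavior should be observed after the program."""
    return [program_end + timedelta(days=days) for days in offsets]


# Hypothetical program end date.
schedule = level3_check_ins(date(2026, 1, 15))
for offset, when in zip(CHECK_IN_OFFSETS_DAYS, schedule):
    print(f"Day {offset}: {when.isoformat()}")
```

Generating the dates at enrollment time, rather than after the course ends, makes it harder for the 90- and 180-day observations to be quietly dropped.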
Frequently asked questions
Is Kirkpatrick obsolete?
No — but standalone Level 1 measurement is. The Kirkpatrick framework is still the cleanest way to think about what to measure; the obsolescence is in stopping at Level 1 and calling it evaluation.
How do we measure Level 3 without surveillance?
Manager observation at structured check-ins (30, 90, 180 days) with specific behavior anchors. Self-report with manager triangulation. 360 feedback if the program targets management or interpersonal skill. Skills demonstration for technical skills.
Should we measure ROI on every program?
No — tight ROI calculation is expensive and only worth it for major programs (>$500k spend). For smaller programs, manager-rated Level 3 behavior change at 90 days is the most honest cost-effective measure.
- Evaluating Training Programs (Kirkpatrick & Kirkpatrick) — Berrett-Koehler
- The Phillips ROI Methodology — ROI Institute
- ATD State of the Industry 2023 — Association for Talent Development