The Challenges
Our client was looking to deliver greater value for the organization through L&D programs, technologies, and resources that were aligned with the business strategy and desired business outcomes. The client also wanted to leverage program evaluation to support L&D’s continuous improvement efforts, highlight where L&D could deliver greater value, and help justify the investments the organization was making, as well as gain buy-in for additional investments that were needed. To do all of those things, the client needed a way to measure L&D’s effectiveness and impact across the organization’s large footprint.
#1: A War with Many Fronts
Like most large organizations, this client was running many different L&D initiatives and programs concurrently, from new hire onboarding and leadership development programs to supporting organizational change initiatives and addressing ad hoc business needs that arose throughout the year. How could they demonstrate the value those efforts were delivering across so many fronts?
#2: Do More with Less
Senior leaders were constantly competing for the organization’s finite budget, resources, and attention; certainly, L&D leaders were no exception. With a new CEO in place, this client faced increasing pressure to justify costs, do more with less, and stretch their budget and team members to address bigger and broader organizational needs without adding headcount and expenses. Where could they gain the economies of scale and the leverage to deliver more value without adding resources?
#3: What Metrics Really Matter?
Depending on which business leaders they asked, the client got different success criteria for their L&D programs and initiatives. When running a program evaluation survey for a national sales meeting, for example, the organizing committee came up with 78 different success factors that they wanted to measure. All of them related to satisfaction with some aspect of the week-long event: the programming, the team building activities, the catering, the awards banquet, and so on. But what metrics really mattered?
The Solution Architecture

Part 1: Evaluation Framework
As we often do, we started with the end results and worked backwards into the needed solutions. We asked ourselves and the client what decisions we wanted the program evaluation data to inform.
To aid with this conversation, we shared Kirkpatrick’s Levels of Evaluation and Phillips’ Return on Investment and discussed the kinds of questions that each can address as well as the pros and cons.

For example, we noted that Level 1 smile sheets are easy to administer and often point to favorable results, but business leaders tend to be more interested in the training’s impact on results. Based on that, should we prioritize Level 1 feedback, or should we allocate more resources to measuring Level 3 impact?
What L&D activities are most closely aligned with driving business results (Level 4), and where would it make sense to cross-tabulate L&D metrics with KPIs?
Are there certain L&D programs that are more dependent than others on a return-on-investment business rationale (Level 5), such as customer education and cost-reduction programs? Where is the effort to calculate ROI justified?
In addition to prioritizing which level(s) of evaluation mattered most, we also discussed what level of granularity was important to the client.

At the Component level, we have many individual learning assets: elearning modules, virtual and in-person instructor-led sessions, documents, videos, etc.
Learning Pathways string together those assets to help a learner achieve a certain skill development goal.
At the Curriculum level, an entire learning journey can span months or even years.
Then there are Function-level metrics that reflect how well the L&D department is influencing learning and business outcomes.
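To make those levels of granularity concrete, here is a minimal sketch in Python of how scores could roll up from Components to Pathways to Curricula and finally to the Function level. The asset names, scores, and simple unweighted averaging are illustrative assumptions for this write-up, not the client’s actual data model.

```python
from statistics import mean

# Hypothetical structure: component scores roll up into pathways,
# pathways into curricula, and curricula into the Function-level view.
components = {
    "intro_elearning": 4.2,     # average Level 1 rating per learning asset
    "virtual_workshop": 4.6,
    "job_aid_pdf": 4.0,
}
pathways = {"coaching_basics": ["intro_elearning", "virtual_workshop", "job_aid_pdf"]}
curricula = {"emerging_leaders": ["coaching_basics"]}

def pathway_score(pathway: str) -> float:
    # A pathway's score is the average of its component scores.
    return mean(components[c] for c in pathways[pathway])

def curriculum_score(curriculum: str) -> float:
    # A curriculum's score is the average of its pathway scores.
    return mean(pathway_score(p) for p in curricula[curriculum])

# Function-level metric: a simple average across all curricula (one choice of many).
function_score = mean(curriculum_score(c) for c in curricula)
print(round(function_score, 2))  # 4.27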
L&D was involved in many activities, big and small, some of which were taken for granted but still require time and resources. We worked with the L&D team to identify all those activities and then prioritize which ones would be within scope of the department’s Level 1 Dashboard and which would be considered too incidental. We organized the various activities into a few big buckets, including:
- Recurring Programs were curriculum-based learning journeys, such as new hire onboarding and emerging leader development. These programs ran on a recurring basis, whether quarterly, annually, or as needed. They typically involved many resources, events, and moving parts, so they were closely monitored by L&D Program Managers and Coordinators.
- Major Initiatives were closely linked to the business strategy and sometimes spun off programs that would sustain efforts after the initiative was concluded (e.g., a core values initiative).
- Special Events were unique, one-time gatherings, such as national meetings and town hall meetings. Each was custom designed to achieve a particular set of objectives.
- Learning Resource Collections could include documents, videos, subscriptions, 360 assessments, elearning courses, and other learning components.

Once we established purpose and priorities, we were able to start mocking up the dashboards, taking into account the many types of organizational capabilities that L&D supported.
We started with the highest-level Function dashboard view: the departmental scorecard that the client would use when talking with the CEO about the value the L&D team was delivering for the organization.
What would those top-line metrics be?

an example dashboard displaying Function-level evaluation results
Part 2: Holistic Evaluation Strategy
Once we had established the top-line dashboard, we needed to determine how best to drill down into the more granular metrics that would be needed for each curriculum, program, initiative, event, resource, etc.
While Satisfaction was easy enough to measure across the framework, Effectiveness, Transfer, and Impact were not.
Metrics That Matter
Once we catalogued the many L&D Activities and Organizational Capabilities that L&D supported, we shifted our focus to the metrics. We analyzed the framework to determine what specific metrics would be needed to gauge effectiveness and impact. For example, all recurring programs would require a summative evaluation. For short programs lasting a few weeks, that might suffice, but for yearlong and multi-year programs we would also want interim evaluations, so we could monitor more closely and adjust more quickly.
In addition to Level 1 participant satisfaction metrics, we also wanted Level 2 learning effectiveness metrics, so we looked for ways to capture pre-to-post gains with knowledge test scores, skills assessments, 360 assessments, and other instruments.
We also considered Level 3 transfer and Level 4 impact metrics, using post-training follow-up surveys that could be sent to both the participant and their manager. With some high-ticket initiatives and events, we also considered Level 5 return-on-investment metrics. We developed an impact-per-touchpoint metric that helped quantify L&D’s resource utilization.
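To show what a couple of those metrics look like in practice, the sketch below computes a simple pre-to-post gain and one possible impact-per-touchpoint figure. The formulas, field names, and numbers are illustrative assumptions rather than the client’s actual definitions.

```python
def pre_post_gain(pre_score: float, post_score: float, max_score: float = 100.0) -> float:
    """Normalized gain: share of the available headroom actually gained."""
    return (post_score - pre_score) / (max_score - pre_score)

def impact_per_touchpoint(impact_score: float, touchpoints: int) -> float:
    """Hypothetical resource-utilization metric: impact delivered per learner touchpoint."""
    return impact_score / touchpoints

# Example: a knowledge test rising from 62 to 85 out of 100...
print(round(pre_post_gain(62, 85), 2))           # 0.61
# ...and a program rated 4.4 on Level 4 impact, delivered across 12 touchpoints.
print(round(impact_per_touchpoint(4.4, 12), 2))  # 0.37
```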
From Metrics to Measurement
Once we zeroed in on the metrics that mattered, we shifted our focus to measurement and identifying sources of data that could be used to calculate metrics. This was as much art as science because we didn’t always have perfect data sources. We had to improvise, using available data where possible and establishing new data sources where needed.
For example, how could we measure participant engagement in a program? We could measure completion of learning tasks through the LMS, but that didn’t tell us how engaged the participant was when completing a task. We looked at possible measures, such as how far in advance of a deadline tasks were completed, how much free text they journaled, and how many questions they asked in class.
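As an illustration of how proxy signals like those could be blended into a single engagement score, here is a minimal sketch. The weights and thresholds are assumptions we chose for this example, not the client’s actual model.

```python
def engagement_score(days_before_deadline: float,
                     journal_words: int,
                     questions_asked: int) -> float:
    """Blend three proxy signals into a 0-1 engagement score (illustrative weights)."""
    # Normalize each signal against a cap so no single behavior dominates.
    timeliness = min(days_before_deadline / 7, 1.0)   # a week early = full credit
    journaling = min(journal_words / 300, 1.0)        # 300+ words of free text = full credit
    participation = min(questions_asked / 5, 1.0)     # 5+ questions in class = full credit
    return 0.4 * timeliness + 0.3 * journaling + 0.3 * participation

# A learner who finishes 3 days early, journals 150 words, and asks 2 questions.
print(round(engagement_score(3, 150, 2), 2))  # 0.44
```

Whatever the exact weights, the value of a composite like this is that it can be trended over time just like any other Level 1 or Level 2 metric.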
As we looked across programs and how we could apply our framework, we identified not only the metrics that mattered most to our client but also the sources of data that would eventually be needed to fuel our data funnels and dashboards.


an example dashboard displaying program-level results

The program dashboard allows you to drill down into individual sources like survey results
Another common approach that we used was to break results down by groups, typically business units or participant cohorts. This enabled leaders to make comparisons and identify where a particular group needed special attention.
A higher-level priority for our client was being able to compare current scores against historical trends to gauge whether the learning program was improving or experiencing some bumps in the road. We used sparklines to display the historical pattern going back 10 classes, along with the 10-class rolling average. Between the two, leaders were able to spot trouble quickly.
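The sketch below shows how both the group breakdowns and the rolling-average trend could be produced from a flat table of class-level scores. The column names and figures are placeholders, not the client’s actual schema.

```python
import pandas as pd

# One row per class cohort: which business unit it served and its average Level 1 score.
scores = pd.DataFrame({
    "business_unit": ["Sales", "Sales", "Ops", "Ops", "Sales", "Ops"],
    "class_date":    pd.to_datetime(["2023-01-15", "2023-02-15", "2023-02-20",
                                     "2023-03-10", "2023-03-18", "2023-04-05"]),
    "avg_score":     [4.1, 4.3, 3.8, 4.0, 4.5, 4.2],
})

# Breakdown by group: lets leaders compare business units side by side.
by_unit = scores.groupby("business_unit")["avg_score"].mean()

# Historical trend: a rolling average over the most recent classes
# (the client used a 10-class window; 3 shown here for brevity).
trend = (scores.sort_values("class_date")
               .assign(rolling_avg=lambda df: df["avg_score"]
                       .rolling(window=3, min_periods=1).mean()))

print(by_unit)
print(trend[["class_date", "avg_score", "rolling_avg"]])
```

In a real dashboard, aggregations like these would feed the sparklines and comparison views rather than console output.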

Part 3: Integrate L&D Dashboards with Business Metrics
Although L&D metrics matter to L&D leaders, the metrics that really drive change are the ones that are closely linked to the business. For that reason, our strategy channeled the most evaluation effort toward initiatives that were closely linked to the business’ Key Performance Indicators (KPIs).
For enterprise-wide programs, events, and initiatives, we were able to generate a spirit of competition by displaying dashboards that compared results across job families, functional areas, business units, and sites. Business leaders could filter the dashboards to their own teams and use those comparisons to motivate their teams to do better.

For metrics that were linked to the business’ KPIs, we were able to cross-reference our L&D data with business metrics to develop an integrated view. By making that connection between training and performance more explicit, we were able to drive the right behaviors. In this example, the vertical line could represent a leading indicator like employee engagement, with the blue bubbles showing how well a particular group scored in comparison to that enterprise average and to the other groups, business units, etc.
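As a simple sketch of that integrated view, the example below pairs each group’s L&D metric with its engagement KPI and positions it against the enterprise average, which plays the role of the vertical reference line described above. The group names and numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical per-group rollups: an L&D metric alongside a business KPI.
groups = pd.DataFrame({
    "group":            ["BU-North", "BU-South", "BU-East", "BU-West"],
    "training_score":   [4.5, 3.9, 4.2, 4.7],   # e.g., Level 3 transfer rating
    "engagement_index": [78, 71, 74, 82],        # leading business indicator
})

# The enterprise average is the reference point every group is compared against.
enterprise_avg = groups["engagement_index"].mean()

# Flag which groups sit above or below the enterprise average,
# alongside how their training scores line up with that position.
groups["vs_enterprise"] = groups["engagement_index"] - enterprise_avg
print(groups.sort_values("vs_enterprise", ascending=False))
print(f"Enterprise average engagement: {enterprise_avg:.1f}")
```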

Design Principles at Work
This case study illustrates the real-world application of four program evaluation principles:
- Start with Decisions. Before you start collecting program evaluation data, ask yourself why you’re doing it. What business questions are you actually trying to answer? Do you just want to know if an L&D program is well received by the audience? Or do you want to know if that program is teaching skills that are getting used on the job? Or do you want to know if improving those skills is yielding the impact that the organization needs? Or do you want to know if it was worth the money spent building and running it? Or are you trying to influence certain behaviors or skill development?
- Prioritize. You probably don’t have the time and money to answer all the questions you might want to answer about all your programs, so you’re going to need to prioritize. Deciding what’s important, what’s value added, and what’s mission critical will help you make sure that you channel the right amount of time, money, and effort towards answering the most important questions for your most important offerings.
- Get Holistic. Avoid designing new evaluation instruments one at a time as the need arises. Instead, take some time to think holistically about all the decisions you’re trying to make, all the L&D activities and offerings you plan to have, and how those activities and offerings intersect with the business KPIs, and then develop an overarching framework for what and how you will evaluate. This will guide you in developing the metrics, measurements, and dashboards that you will need.
- Foster Competition. Program evaluation can help you with more than just proving your worth: it can also be used to stimulate competition within the various business units you serve. Dashboards can compare business unit level L&D metrics cross-tabulated with the leading indicators of business performance. For example, you can cross-reference new hire onboarding metrics with productivity metrics to show which business units have achieved the best time-to-proficiency with their new hires.