Defining Evaluation II


  1. Foreword
    • Note to class
    • Ana-Paula Quote
    • Fitzpatrick Book Reference
  2. Definitions of ‘Evaluation’ by Various Resources
    • Google Definition of Evaluation
    • Wikipedia Definition of Evaluation
    • University of North Carolina Greensboro Evaluation Resources
    • Handbook of Human Performance Technology
    • Dictionary Definition of Evaluation
  3. Discussion Questions and Response
    • Question 1. How do Fitzpatrick, Sanders & Worthen (2011) define “evaluation”?
    • Question 1. How does Kirkpatrick (1998) define “evaluation”?
    • Question 1. How does Newby (1992) define “evaluation”?
    • Question 2. What are the differences and similarities in the meanings, contexts and scopes of their definitions?
  4. Resources


I’m looking forward to reading everyone’s thoughts on the definition of evaluation, reactions to the required reading, and how these ideas might apply in real life.

“This week’s discussion is around other broad concepts in Evaluation. For example, what does evaluation mean? How to define evaluation? We are also looking at how to make distinctions about different terminology: Needs Assessment, Monitoring, Outcome Studies, Testing, Measurement, and Assessment.”

— Ana-Paula, Welcome to Week 3 Online Discussion

I highlighted pages 7 and 9 of Fitzpatrick, J., Sanders, J. & Worthen, B. (2011), Program Evaluation: Alternative Approaches and Practical Guidelines (4th ed.), where the authors seem to define the word evaluation for the book. The book includes the dictionary definition, a research definition, and a methods-based definition for evaluators with external stakeholders.

Definitions of ‘Evaluation’ by Various Resources

1. Google Search: ‘define evaluation’:

e·val·u·a·tion /əˌvalyəˈwāSH(ə)n/ noun: evaluation; plural noun: evaluations

the making of a judgment about the amount, number, or value of something; assessment. “the evaluation of each method” synonyms: assessment, appraisal, judgment, gauging, rating, estimation, consideration

2. Wikipedia: ‘Evaluation’


“Evaluation is a systematic determination of a subject’s merit, worth and significance, using criteria governed by a set of standards. It can assist an organization, program, project or any other intervention or initiative to assess any aim, realizable concept/proposal, or any alternative, to help in decision-making; or to ascertain the degree of achievement or value in regard to the aim and objectives and results of any such action that has been completed. The primary purpose of evaluation, in addition to gaining insight into prior or existing initiatives, is to enable reflection and assist in the identification of future change.

Evaluation is often used to characterize and appraise subjects of interest in a wide range of human enterprises, including the arts, criminal justice, foundations, non-profit organizations, government, health care, and other human services.”

3. University of North Carolina Greensboro

“Program evaluation is a systematic process of gathering evidence to inform judgments of whether a program, product, policy, or system is meeting its goals and how it can be improved to better do so. This definition of program evaluation reflects the evolution in the conceptualization of program evaluation that began with the early work of Scriven (1967) and extends to the later works of Michael Scriven (1991), Carol Weiss (1998), Melvin Mark, Gary Henry, and George Julnes (2000), Robert Stake (2000), Daniel Stufflebeam (2001), and Thomas Schwandt (2008).

Several key terms are commonly used in describing the evaluation process. These key terms are:

Evaluand: The evaluand is the program, product, policy, or system that is being evaluated.

Evaluator: The evaluator is an individual involved in conducting the program evaluation. Evaluators who are internal to the client’s organization or group are referred to as internal evaluators. Evaluators who are hired from outside of the client’s organization or group are referred to as external evaluators.

Stakeholders: The stakeholders associated with a program evaluation are the individuals who participate in, or are affected by, the program, product, policy, or system being evaluated.

Formative Evaluation: The primary purpose of a formative evaluation is to provide ongoing information for evaluand improvement.

Summative Evaluation: The primary purpose of a summative evaluation is to provide information to make programmatic decisions or to judge whether a program should be adopted, continued, or expanded.”


4. Handbook of Human Performance Technology

According to the Handbook of Human Performance Technology (2006), “Evaluation is the means to ascertain the worth or value of a performance-improvement initiative. It can be used to improve a performance-improvement process or to decide to discontinue the effort. It is also useful in judging the relative worth of performance-improvement alternatives. Two types of evaluation are formative and summative.”

This definition corresponds well with pages 7-9 of Program Evaluation: Alternative Approaches and Practical Guidelines (4th ed.). The handbook continues on page 25 to explain the definitions of formative and summative evaluation methods: formative being a method for improvement while a product is being built or designed, and summative being a method used to judge merits after a project is complete.

  • Harold Stolovitch, Erica Keeps, James Pershing (2006). Handbook of Human Performance Technology. (3rd Edition). Pfeiffer.

5. Dictionary Definition of Evaluation

“…one typical dictionary definition of evaluation is ‘to determine or fix the value of: to examine and judge.’”

  • Fitzpatrick, J., Sanders, J. & Worthen, B. (2011). Program Evaluation: Alternative Approaches and Practical Guidelines (4th ed.), page 5. New York: Pearson.

Discussion Questions and Response

Question 1. How do Fitzpatrick, Sanders & Worthen (2011), Kirkpatrick (1998) and Newby (1992) each define “evaluation”?


According to page 5 of Program Evaluation: Alternative Approaches and Practical Guidelines (PEAAPG), the authors, Fitzpatrick, Sanders & Worthen (who I will refer to as the PEAAPG authors for brevity), state that, “…among professional evaluators there is no uniformly agreed-upon definition of precisely what the term evaluation means,” and they go on to imply that the evaluator’s role is similarly undefined.

Still on page 5 of PEAAPG, the authors define the uses for inquiry and judgment methods in three guiding principles: “1. determining standards for judging quality and deciding whether those standards should be relative or absolute, 2. collecting relevant information, and 3. applying the standards to determine value, quality, utility, effectiveness, or significance.”

On pages 6-7 the authors differentiate evaluation from research based on purpose, definition, implementation, stakeholders, criteria, and methodologies.

On pages 8-9, in a section about informal versus formal evaluation, the authors show how old the concept of evaluation is, based on the description, “examining and judging to determine value.” Their examples include the Welsh longbows of the Hundred Years’ War, Neanderthals, students, clients, and managers. Other sections, like Uses and Objects of Evaluation (page 14), also lay out examples and use cases for evaluators.

These examples set a foundation for the authors to show how ‘faulty’ judgments, which are “characterized by an absence of breadth and depth…(or) systematic procedures and formally collected evidence…(or) based on past experience,” can cause evaluators to make inaccurate decisions driven by “biases” that may lead to unintended consequences or outcomes. The authors further identify the flaws of informal evaluations and evaluators: less cognizant of limitations, biased, lacking information, and unrealistic, yet sometimes the only realistic approach given time or budget constraints. They do note that, if the evaluator is “knowledgeable, experienced, and fair,” an informal approach may be the best option in situations where formal evaluations cannot be conducted.

On page 10 the authors reiterate, “that the basic purpose of evaluation is to render judgments about the value of whatever is being evaluated. Many different uses may be made of those value judgments, as we shall discuss shortly but in every instance the central purpose of the evaluative act is the same: to determine the merit or worth of some thing (in program evaluation, of the program or some part of it).”

Page 11 offers differing views and methods on evaluation, but they are not relevant to this question.

On page 13 the authors seem to agree with Lipsey (2000) that evaluators may play a secondary role as a scientific expert whose “expertise is to track things down, systematically observe, analyze, and interpret with a good faith attempt at objectivity.” They also state that the evaluator’s role includes educating users on the evaluation’s purpose, spanning operational, managerial, program management, business development, and requirements-gathering roles, while advocating for users who may be disenfranchised.

On page 15, the evaluation object, evaluand, or evaluee is defined as anything or anyone being evaluated. The authors, however, have decided to avoid technical jargon (“precise language”) except in direct quotes.

Pages 23-25 define the roles of internal vs. external evaluators.

Based on the consistent references to Scriven and the positive references to his quotes, I would also assert that the authors of PEAAPG agree with most of his methods and definitions related to evaluation.

Page 27 defines the limitations of evaluations. In short, the authors state that evaluation is not a magic wand and will not effect a solution on its own; it is a model to give context and bring about improvement.

Pages 27-28, in the section “Major Concepts and Theories,” specifically define and recap the six key definitions relating to evaluation in this chapter:

  1. Evaluation is the identification, clarification, and application of defensible criteria to determine an evaluation object’s value, its merit or worth, in regard to those criteria. The specification and use of explicit criteria distinguish formal evaluation from the informal evaluations most of us make daily.
  2. Evaluation differs from research in its purpose, its concern with generalizability, its involvement of stakeholders, and the breadth of training those practicing it require.
  3. The basic purpose of evaluation is to render judgments about the value of the object under evaluation. Other purposes include providing information for program improvement, working to better society, encouraging meaningful dialogue among many diverse stakeholders, and providing oversight and compliance for programs.
  4. Evaluators play many roles including scientific expert, facilitator, planner, collaborator, aid to decision makers and critical friend.
  5. Evaluations can be formative or summative. Formative evaluations are designed for program improvement and the audience is, most typically, stakeholders close to the program. Summative evaluations serve decisions about program adoption, continuation, or expansion. Audiences for these evaluations must have the ability to make such “go-no go” decisions.
  6. Evaluators may be internal or external to the organization. Internal evaluators know the organizational environment and can facilitate communication and use of results. External evaluators can provide more credibility in high-profile evaluations and bring a fresh perspective and different skills to the evaluation.

  • Fitzpatrick, J., Sanders, J. & Worthen, B. (2011). Program Evaluation: Alternative Approaches and Practical Guidelines (4th ed.). New York: Pearson.

Question 1. How does Kirkpatrick (1998) define “evaluation”?


Kirkpatrick is introduced on page 15 of Program Evaluation: Alternative Approaches and Practical Guidelines (PEAAPG), in a section that defines the object of a formal evaluation study and mentions Kirkpatrick’s (1983) model for evaluating training efforts with broad and varying categories. This, however, is not directly related to the question regarding Kirkpatrick’s 1998 definition of evaluation.

In Evaluating Training Programs: The Four Levels (ETP), Kirkpatrick creates a format to evaluate training programs in a four-step sequence. The levels are interdependent and meant to be followed in sequential order: reaction, learning, behavior, and results. In the initial (formative) phase, reaction, Kirkpatrick defines customer satisfaction as the criterion for the phase, with customers and in-house participants of required programs as the stakeholders.

Learning is still a formative phase in Kirkpatrick’s training evaluation methodology. It is what program managers would call the norming and storming phase of a program, and it requires a change agent and behavioral modeling. Kirkpatrick defines the success criteria of this phase as “attitudes changed, knowledge increased, or skills improved.”

In the behavioral phase, Kirkpatrick’s model measures the success of phase 2, the learning phase, by assessing the quality and extent of behavioral change related to the criteria laid out in that phase. He also notes limitations and risks to development, including organizational climate, social factors, rewards, discouragement, and managerial support. This appears to me to be a summative phase, since it evaluates the progress of learning after the training has been completed.

The results phase is also summative. Kirkpatrick defines the criteria for this phase as increased production, improved quality, decreased costs, reduced frequency and/or severity of accidents, increased sales, reduced turnover, and higher profits. He also notes that not all results are quantitative or measurable in money; some are qualitative and measured by a change in office culture, motivation, time management, empowerment, communication, or morale.

In his summary, Kirkpatrick states that the four levels are “considered in reverse” to plan a program and evaluate reaction.

Based on pages 19-24 of Chapter 3, I do not see a clear-cut definition of evaluation laid out by Kirkpatrick. I suppose he implies a framework for the definition in his model: constantly reacting to behavioral criteria and goals pre-set by the evaluators.

  • Fitzpatrick, J., Sanders, J. & Worthen, B. (2011). Program Evaluation: Alternative Approaches and Practical Guidelines (4th ed.). New York: Pearson.
  • Kirkpatrick, D. (1998). Evaluating Training Programs: The Four Levels (2nd ed.). San Francisco: Berrett-Koehler (pages 19-24).

Question 1. How does Newby (1992) define “evaluation”?


Newby offers a model for evaluating training called context evaluation.

In the Training Evaluation Handbook, Newby ties activities to their stated intention as a method to “make investment decisions about merits and demerits of activities.” He further identifies example activities to evaluate, such as training, development, morale, and communication. The criterion, or goal, is to identify positive and negative risk in order to encourage positive change and mitigate negative activities.

Since Newby is specifically interested in facilitating change for industrial businesses, his methodology toward evaluation is based on “the context of purposeful activity,” or goal-oriented training and development. Newby’s strategic approach to evaluation reminds me a lot of S.M.A.R.T. goals and PMBOK program management planning. In the first phase, Newby encourages readers to create an action plan which will become “a detailed plan for implementing strategic and evaluation concepts.”

On page 2, Newby asks the reader to calculate the costs of training, of poor training, and of deficiencies where performance may suffer due to lack of training. This, I assume, is to create a business case to advocate to management while developing a model to evaluate return on investment (ROI). This might be considered the success criterion, or the metrics to determine whether training is worthwhile.
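Newby does not give a formula in these exact terms, but the cost comparison he asks for can be sketched as a simple calculation. This is my own illustrative version, with hypothetical names and figures, not anything from the handbook:

```python
# Illustrative sketch (not Newby's own formula): compare the cost of a
# training program against the cost of the performance deficiencies it
# is expected to prevent, yielding a simple return on investment (ROI).

def training_roi(training_cost: float, deficiency_cost_avoided: float) -> float:
    """Return ROI as a fraction: net benefit divided by training cost."""
    net_benefit = deficiency_cost_avoided - training_cost
    return net_benefit / training_cost

# Hypothetical example: a $20,000 program expected to avoid $50,000 in
# performance deficiencies yields an ROI of 1.5 (i.e., 150%).
print(training_roi(20_000, 50_000))  # → 1.5
```

A positive ROI here would support the business case to management; a negative one would suggest the training, as designed, is not worthwhile by this metric alone.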

By page 8, Newby begins to define the role of evaluation. This is still the formative phase of the evaluation process: it helps to justify and identify the usefulness and validity of maintaining training programs. Newby points out on page 12 that training requires clear-cut expectations and clear corporate purposes to be successful. His strategy offers various models, including a sales-based approach, diagnosis, a behavioral approach, and a marketing-based approach.

Newby goes on to compare good and bad approaches to training: why some work better than others, how they can be effective or wasteful, and what management needs to understand before initiating a training program. This chapter “emphasised the close relationship between corporate strategy, successful training and evaluation. Corporate purpose needs to be tested regularly against external reality, so that the mission statement – graduated down in the form of work performance standards – can drive the organization (and its training activities) to respond appropriately.”

  • Fitzpatrick, J., Sanders, J. & Worthen, B. (2011). Program Evaluation: Alternative Approaches and Practical Guidelines (4th ed.). New York: Pearson.
  • Newby, T. (1992). Training Evaluation Handbook.

Question 2. What are the differences and similarities in the meanings, contexts and scopes of their definitions?

Each definition of evaluation notes limitations to inform evaluators, so that they can be cognizant of risks and influences on their decisions.

On page 20, Newby emphasizes the importance of meta-evaluation, or evaluation of the evaluation. Fitzpatrick, Sanders & Worthen and Newby both stress the importance of a clear objective throughout the process, and all three texts state that evaluation requires criteria of achievement. Kirkpatrick and Newby do not specifically use the terms formative and summative, but they apply the premise in their methodologies. All three texts require accuracy, metrics (qualitative or quantitative), stakeholder identification, and a market analysis or analysis of existing programs, and all encourage ongoing post-evaluation.

Fitzpatrick, Sanders & Worthen primarily focus on defining terminology, while Kirkpatrick and Newby focus on methods for implementation, with little value placed on external jargon. Kirkpatrick strictly adheres to a four-step model, while Newby encourages varying stages of evaluation and change that can jump around. Newby is also very pessimistic about stakeholders and the optimization of training.

Additional discussion points

This book was last published in 2011. It is five years old and was published four years after the first iPhone. All of the information related to the term evaluation is still very relevant and valid, except perhaps the perception that evaluation is undefined. The definition may be disagreed upon, but based on my brief research for this discussion board, it appears to be thoroughly discussed, referenced, and defined in multiple formats for various types of work and methodologies. I wonder what the authors would say about modern evaluation methodology and its influence on stakeholders.

