Saturday, November 27, 2010

20101127 - concepts & definitions of Evaluation research (Stern)

 

2. Can evaluation be defined?

 

Stern, E. (2004).  Philosophies and types of evaluation research.  In Descy, P.; Tessaring, M. (eds), The foundations of evaluation and impact research.

 

There are numerous definitions and types of evaluation. There are, for example, many definitions of evaluation put forward in handbooks, evaluation guidelines and administrative procedures, by bodies that commission and use evaluation. All of these definitions draw selectively on a wider debate as to the scope and focus of evaluation. A recent book identifies 22 foundation models for 21st century programme evaluation (Stufflebeam, 2000a), although the authors suggest that a smaller subset of nine are the strongest. Rather than begin with types and models, this chapter begins with an attempt to review and bring together the main ideas and orientations that underpin evaluation thinking.

 

The question mark in the title of this section signals potential problems with 'definition' and warns the reader not to expect straightforward or consistent statements. Evaluation has grown up through different historical periods and in different policy environments, with inputs from many disciplines and methodologies, from diverse value positions, and rooted in hard-fought debates in the philosophy of science and theories of knowledge.

 

While there is some agreement, there is also persistent difference: evaluation is contested terrain. Most of the sources drawn on here are from North America, where evaluation has been established – as a discipline and a practice – and debated for 30 or more years.

 

2.1. Assessing or explaining outcomes

 

Among the most frequently quoted definitions is that of Scriven, who has produced an evaluation Thesaurus, his own extensive handbook of evaluation terminology: '"evaluation" refers to the process of determining the merit, worth or value of something, or the product of that process […] The evaluation process normally involves some identification of relevant standards of merit, worth or value; some investigation of the performance of evaluands on these standards; and some integration or synthesis of the results to achieve an overall evaluation or set of associated evaluations.' (Scriven, 1991, p. 139).

 

This definition prepares the way for what has been called 'the logic of evaluation' (Scriven, 1991; Fournier, 1995). This logic is expressed in a sequence of four stages, illustrated with a short sketch after the list:
(a) establishing evaluation criteria and related dimensions;
(b) constructing standards of performance in relation to these criteria and dimensions;
(c) measuring performance in practice;
(d) reaching a conclusion about the worth of the object in question.
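
To make the sequence concrete, here is a minimal, purely illustrative sketch in Python. The criteria, weights, standards and scores are invented for the example (they are not Scriven's or Stern's), and the weighted-sum synthesis in step (d) is only one of many possible ways of reaching an overall judgement:

```python
# Purely illustrative walk-through of the four-stage 'logic of evaluation'.
# All criteria, weights, standards and scores below are invented examples.

# (a) evaluation criteria and their relative weights (assumed)
criteria = {"relevance": 0.4, "effectiveness": 0.4, "efficiency": 0.2}

# (b) standards of performance: minimum acceptable score per criterion (assumed)
standards = {"relevance": 3.0, "effectiveness": 3.0, "efficiency": 2.5}

# (c) measured performance of the evaluand on a 1-5 scale (hypothetical data)
performance = {"relevance": 4.2, "effectiveness": 3.1, "efficiency": 2.0}

# (d) synthesis: combine the evidence into an overall judgement of worth
def synthesise(criteria, standards, performance):
    weighted_score = sum(criteria[c] * performance[c] for c in criteria)
    shortfalls = [c for c in criteria if performance[c] < standards[c]]
    verdict = "meets standards" if not shortfalls else "falls short on: " + ", ".join(shortfalls)
    return weighted_score, verdict

score, verdict = synthesise(criteria, standards, performance)
print(f"Overall weighted score: {score:.2f} ({verdict})")
```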

 

This logic is not without its critics (e.g. Schwandt, 1997), especially among those of a naturalistic or constructivist turn who cast doubt on the claims of evaluators to know, to judge and ultimately to control. Other stakeholders, it is argued, have a role, and this changed relationship with stakeholders is discussed further below.

 

The most popular textbook definition of evaluation can be found in Rossi et al.'s book Evaluation: a systematic approach: 'Program evaluation is the use of social research procedures to systematically investigate the effectiveness of social intervention programs. More specifically, evaluation researchers (evaluators) use social research methods to study, appraise, and help improve social programs in all their important aspects, including the diagnosis of the social problems they address, their conceptualization and design, their implementation and administration, their outcomes, and their efficiency.' (Rossi et al., 1999, p. 4).

 

Using words such as 'effectiveness' rather than Scriven's favoured 'merit, worth or value' begins to shift the perspective of this definition towards the explanation of outcomes and impacts. This is partly because Rossi and his colleagues identify helping to improve social programmes as one of the purposes of evaluation. Once there is an intention to make programmes more effective, the need to explain how they work becomes more important.

 

Yet explanation is an important, and intentionally absent, element in Scriven's definitions of evaluation:

'By contrast with evaluation, which identifies the value of something, explanation involves answering a Why or How question about it or a call for some other type of understanding. Often, explanation involves identifying the cause of a phenomenon, rather than its effects (which is a major part of evaluation). When it is possible, without jeopardizing the main goals of an evaluation, a good evaluation design tries to uncover microexplanations (e.g. by identifying those components of the curriculum package that are producing the major part of the good or bad effects, and/or those that are having little effect).

The first priority, however, is to resolve the evaluation issues (is the package any good at all, the best available? etc.). Too often the research orientation and training of evaluators leads them to do a poor job on evaluation because they became interested in explanation.' (Scriven, 1991, p. 158).

 

Scriven himself recognises that one pressure moving evaluation to pay greater attention to explanation is the emergence of programme theory, with its concern about how programmes operate so that they can be improved or better implemented. A parallel pressure comes from the uptake of impact assessment associated with the growth of performance management and other managerial reforms within public sector administrations.

 

The intellectual basis for this work was most consistently elaborated by Wholey and colleagues. They start from the position that evaluation should be concerned with the efficiency and effectiveness of the way governments deliver public services. A core concept within this approach is what is called 'evaluability assessment' (Wholey, 1981). The starting point for this assessment is a critical review of the logic of programmes and the assumptions that underpin them. This work constitutes the foundation for most of the thinking about programme theory and logical frameworks. It also prefigures a later generation of evaluation thinking rooted more in policy analysis that is concerned with the institutionalisation of evaluation within public agencies (Boyle and Lemaire, 1999), as discussed further below.
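
As a rough illustration of the kind of programme logic such an assessment reviews, the sketch below lays out a hypothetical training programme as a simple chain from inputs to intended impacts, together with some of the assumptions an evaluability assessment would question. All of the content is invented for illustration; real logical frameworks are normally richer, with indicators, means of verification and assumptions at each level:

```python
# Hypothetical programme logic (a logframe-style chain) of the sort an
# evaluability assessment would review. All content is invented for illustration.
logic_model = {
    "inputs":     ["funding", "trainers", "curriculum materials"],
    "activities": ["recruit unemployed adults", "deliver a 12-week vocational course"],
    "outputs":    ["participants trained", "certificates awarded"],
    "outcomes":   ["participants find employment within 6 months"],
    "impacts":    ["reduced local unemployment"],
}

# Underlying assumptions that a critical review of the programme logic would test
assumptions = [
    "the training addresses skills employers actually demand",
    "participants complete the course",
    "the local labour market can absorb newly trained workers",
]

for level, items in logic_model.items():
    print(f"{level:>10}: {'; '.join(items)}")
print("assumptions:", *("- " + a for a in assumptions), sep="\n  ")
```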

 

These management reforms generally link interventions with outcomes. As Rossi et al. recognise, this takes us to the heart of broader debates in the social sciences about causality: 'The problem of establishing a program's impact is identical to the problem of establishing that the program is a cause of some specified effect. Hence, establishing impact essentially amounts to establishing causality.' (Rossi et al., 1999).
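
Rossi et al.'s point can be illustrated with a small simulation (a sketch only, with invented numbers): if participants are randomly assigned to a programme, the difference in mean outcomes between the treated and control groups estimates the programme's causal impact; without random assignment, the same difference may reflect selection rather than causation.

```python
# Minimal sketch of the experimental logic behind impact assessment: under
# random assignment, the difference in mean outcomes between treatment and
# control groups estimates the programme's causal impact.
# The data are simulated here purely for illustration.
import random
import statistics

random.seed(42)

TRUE_EFFECT = 2.0   # programme effect built into the simulation (assumed)
n = 500             # participants per group

# Simulate outcomes: same baseline distribution, treatment shifted by TRUE_EFFECT
control = [random.gauss(10.0, 3.0) for _ in range(n)]
treated = [random.gauss(10.0 + TRUE_EFFECT, 3.0) for _ in range(n)]

# Impact estimate = difference in means (valid because assignment is random)
impact = statistics.mean(treated) - statistics.mean(control)
print(f"Estimated impact: {impact:.2f} (true simulated effect: {TRUE_EFFECT})")
```

This is also why the experimental and quasi-experimental designs discussed next matter: when randomisation is not possible, a simple difference in means no longer isolates the causal effect.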

 

The difficulties of establishing perfect, rather than good-enough, impact assessments are recognised by Rossi and colleagues. This takes us into the territory of experimentation and causal inference associated with some of the most influential founders of North American evaluation, such as Campbell, with his interest in experimental and quasi-experimental designs but also, in his later years, in the explanatory potential of qualitative evaluation methods.

 

The debate about experimentation and causality in evaluation continues to be vigorously pursued in various guises. For example, in a recent authoritative text on experimentation and causal inference (Shadish et al., 2002), the authors begin to take on board contemporary criticisms of experimental methods that have come from the philosophy of science and the social sciences more generally. In recent years, we have also seen a sustained realist critique of experimental methods, led in Europe by Pawson and Tilley (1997). But, whatever their orientations to experimentation and causal inference, explanations remain at the heart of the concerns of an important constituency within evaluation.

 

2.2. Evaluation, change and values

 

Another important strand in evaluation thinking concerns the relationship between evaluation and action or change. One distinction is between 'summative' and 'formative' evaluation, terms also coined by Scriven: the former assesses or judges results, while the latter seeks to influence or promote change.

 

Various authors have contributed to an understanding of the role of evaluation in change. For example, Cronbach (1982, 1989), rooted in policy analysis and education, sees an important if limited role for evaluation in shaping policy 'at the margins' through 'piecemeal adaptations'. The role of evaluation in Cronbach's framework is to inform policies and programmes through the generation of knowledge that feeds into the 'policy shaping community' of experts, administrators and policy-makers.

 

Stake (1996), on the other hand, with his notion of 'responsive evaluation', sees evaluation as a 'service' to programme stakeholders and participants. By working with those who are directly involved in a programme, Stake sees the evaluator as supporting their participation and their possibilities for initiating change. This contrasts with Cronbach's position, and even more strongly with that of Wholey (referred to earlier), given Stake's scepticism about the possibilities of change at the level of large-scale national (or, in the US context, Federal and State) programmes and their management.

 

Similarly, Patton (1997, and earlier editions), who has tended to eschew work at programme and national level, shares with Stake a commitment to working with stakeholders and (local) users. His concern is for 'intended use by intended users'.

 

Virtually everyone in the field recognises the political and value basis of much evaluation activity, albeit in different ways. While Stake, Cronbach and Wholey may recognise the importance of values within evaluation, the values that they recognise are variously those of stakeholders, participants and programme managers.

 

There is another strand within the general orientation towards evaluation and change which is decidedly normative. This category includes House, with his emphasis on evaluation for social justice, and the emancipatory logic of Fetterman et al. (1996) and 'empowerment evaluation'. In the view of Fetterman and his colleagues, evaluation is not undertaken by external experts but is rather a self-help activity in which – because people empower themselves – the role of any external input is to support self-help.

 

So, one of the main differences among those evaluators who explicitly address issues of programme and societal change lies in the role they envisage for the evaluator: expert who acts, facilitator and advocate, or enabler of self-help.

 

 

Source:  http://www.cedefop.europa.eu/EN/Files/BgR1_Stern.pdf

 

Comments:  Boring. Reading through all these concepts and definitions of Evaluation Research.


 
