Let's stop chasing our tails on impact measurement

2018-05-16


This blog was written by James Noble, Impact Management Lead at NPC. It is one of a pair of blogs, written by James and Bethia McNeil, Director of the Centre for Youth Impact, questioning the focus of impact measurement in youth work. You can read Bethia's blog “Thoughts on the Holy Grail” here.

It is easy to understand the impulse for impact measurement. We want to support young people to achieve good things, so logically we should try to understand how effective we are at this, and learn which kinds of practices get the best results for young people. Funders and providers want to know that their money and effort have made a difference. But just because something is understandable, and maybe even desirable, doesn’t make it easy to act on, and the youth sector has been stuck on this point for a while.

We need to face the fact that ‘measuring’ impact is difficult, particularly for youth work. The big problem is that youth work supports a developmental process that takes place over 18+ years, with umpteen influences. We cannot measure or ‘capture’ all these influences, and we may not see impact until years have passed. On top of this, the Government doesn’t collect or share data we can use, in the way it does in other policy areas like schools and health.

I see these challenges as purely practical and methodological, but the reaction to them often goes in different, and quite polarised, directions.

Firstly, there’s the tendency to deny the challenges: to continue to assert the underlying argument that we should be able to test ‘what works’, perhaps citing medical research as an exemplar. The problem is that the methodological challenges are intractable so there’s always disappointment. Some wrongly blame the sector for this and start to doubt its appetite for testing itself.

A different direction is to question the idea of ‘measurement’ altogether: to see it as a means of control or denigration of the sector, as a fundamental misunderstanding of what youth work is, or even as the product of a neoliberal worldview. I see this as the co-option of methodological challenges to make a political argument, and I want to remain focussed on the methodological issues, which I think can be better negotiated to help resolve these tensions.

A first—brief—point is that there is benefit to setting out what we want to achieve with and for young people and how. NPC calls this developing a “theory of change”, but what you call it doesn’t really matter. It’s basically a question of agreeing and articulating:

The nature of the issue you want to address (context)
The characteristics of the people you want to reach (users)
The long-term positive difference you hope they will achieve for themselves (impact)
The shorter-term changes / improvements / assets you want to develop in people, which you think will make impact more likely (outcomes)
How your provision plans to help people achieve these outcomes (activities and mechanisms).

This process is useful whatever your underlying perspective or aim: whether it’s trying to build young people’s ‘employability’ or empowering “questioning, compassionate young citizens committed to the development of a socially just and democratic society”. It helps because anyone doubting the sector’s commitment to evaluating outcomes and impact should be reassured by the articulation of a clear plan that can be tested, while those who want to argue for different approaches have an opportunity to do so on equal terms.

However, my main suggestion is that we make the measurement question more manageable by acknowledging that studying longer-term outcomes and impact is difficult, and that we should do it sparingly. In particular, I think it helps providers to think about two distinct questions:
1. Are we delivering our service ‘well’? In other words, are we effectively implementing the plan described in our theory of change? Do we reach, engage, and build positive relationships with the young people we want to support?
2. Does the service we are delivering make a difference? (the outcomes and impact part of our theory of change).

Providers should aim to answer the first question routinely by collecting user, engagement and feedback data, because this is part and parcel of delivering a good quality service. But, critically, providers do not have to answer the second question all the time. Once we are confident something is effective we can stop testing it. Measuring outcomes and impact should be reserved for when there are gaps in our understanding: new approaches or practices, user groups, contexts and other unanswered questions. And we should start small: record observational evidence of outcomes where possible, leading to more robust studies with small samples and eventually larger evaluations, but only if there is funding for it and a strong rationale in terms of improving the evidence base. Moreover, larger studies should be coordinated across providers and run by specialists to maximise quality, like the ongoing learning element of the Youth Investment Fund.

The effect of this change could be profound. If providers feel less pressure to ‘prove’ their own version of the youth work model, our energies might be better directed towards strategic questions like how to reach and engage those young people experiencing the greatest need, understanding the mechanisms that work across different settings, and which aspects of programme design are most valuable, for whom and why. This is the real opportunity for an evidence-led social sector, not the endless cycle of programme evaluations, which are often more about organisations ticking a box than about learning, and so have limited influence on wider practice.

This argument does raise the question of when we can be confident something is effective. My third point is to reject the view that confidence can only be provided by programme-level Randomised Controlled Trials (RCTs), which compare treatment to control groups[1]. The logic behind RCTs is powerful, but this power diminishes as the focus of what is being studied is broadened. So, a scientist in a laboratory can, theoretically, control all conditions to be sure they are testing the effect of one thing on another, but a youth provider cannot control or limit the countless daily processes and choices needed to deliver a ‘programme’ (nor should they try to).

The results of RCTs of youth programmes are the product of a unique context and innumerable events that will never be repeated, and at best they give us a clue that something about a ‘programme’ has worked for some participants. Such a result has limited generalisability, so it does not ‘prove’ that the programme will work elsewhere, and it doesn’t explain how the programme worked. It is argued that repeated RCTs will start to turn these clues into an evidence base, and this has happened, over decades, in specialist fields like cognitive behavioural therapy, but we are a long, long way from that. For example, the Realising Ambition programme spent six years and a good part of its budget producing just two inconclusive RCTs. It is hard to argue that more of these kinds of studies are the best use of our resources.

This is not an admission of defeat, but an opportunity. Once we accept the natural limits to the level of ‘proof’ available in the youth sector, we can refocus on genuinely useful research questions. It should also help us to appreciate the value in all types of research, from validated surveys, benchmarking and value for money analysis, to practitioners’ observations; what Nancy Cartwright has referred to as ‘vouching’ rather than ‘clinching’ evidence.[2] Indeed, if we draw on all the vouching evidence already available, we could probably make a very strong case for ‘what works’ in supporting young people.

In summary, this is a call for a rethink of what ‘measurement’ is for, and what it can achieve. I want us to move away from the cycle of funders and commissioners always expecting outcomes and impact data, and providers trying to ‘prove’ themselves in denial of the constraints, towards a ‘real world’ attitude to measurement, with a better set of research questions and methods that can answer them.

----------
[1] And by extension the standards of evidence that privilege these kinds of studies.