The Principled Agent

Thoughts on development economics and impact measurement

What can “Pay for Success” learn from health care?

Linking payment to performance on pre-defined outcomes makes a lot of sense. And in the social sector, where payment has historically been linked to simply providing services, it’s easy to understand the excitement for Social Impact Bonds and “Pay for Success” contracts. You get what you pay for, the thinking goes, and a shift from outputs to outcomes could be transformative.

Yet it is worth reflecting on the experience of those who have come this way before us. Just as the social sector has followed the health care industry’s lead on randomized controlled trials, so too the recent enthusiasm for “Pay for Success” follows the health care sector’s embrace of “pay for performance” (P4P) and “value-based purchasing”.

Since the early 2000s, private and public health care plans have experimented with paying providers based on their performance on various outcome measures (as well as compliance with guidelines). With over 100 private and public pilots underway and a mandate for the government to shift to “value-based” purchasing, adoption has been significant.

Yet as the industry looks back on more than a decade of P4P, it has not seen the promised transformation. In what I’ll call “The P4P Paradox”, were the proponents of health care P4P paid based on the performance of P4P, they would, alas, not be paid.

“If you were around and read those opinion pieces that came out in 2001 and 2002, there was excessive exuberance about how pay-for-performance was going to solve everything. We actually have remarkably few evaluations that have a comparison group of any kind, so the evidence on pay-for-performance is rather spotty. The programs we’ve evaluated over the last five years have been largely unimpressive in their results,” said Meredith B. Rosenthal, PhD, P4P researcher and associate professor of health economics and policy at the Harvard School of Public Health.

The Cochrane literature review summarizes, “There is insufficient evidence to support or not support the use of financial incentives to improve the quality of primary health care. Implementation should proceed with caution and incentive schemes should be more carefully designed before implementation.”

As noted, the absence of proof is not proof of absence, yet in light of the excitement for “Pay for Success”, the lack of proof of any sort of positive impact after a decade should be humbling. Looking forward, few would be satisfied if in 2023 we had to say there was “no evidence that financial incentives can improve [beneficiary] outcomes.”

At the same time, there is much for the social sector to learn from the health care experience to date. Perhaps the most important takeaway is that the devil is in the details:


Many speculate that the amount of performance pay (in some cases, 5-9% of revenue for a provider) is too small. Some physicians have suggested that 10% of revenue would need to be performance-based to have an impact. For the typical non-profit organization, what percentage of potential donor revenue do “pay for success” donors comprise? When outcomes-linked dollars are relatively scarce, a large amount of money may serve as only a small incentive if it is a small share of the total pie.

Frequent payment may increase awareness of performance targets. A one-time payment may not influence behavior as much as a monthly check, for example, as the latter may increase awareness and act as a more effective feedback mechanism.

Who cashes the check and who determines success are not always the same. Organizations typically receive performance incentives, yet it is the staff who determine success. On the other hand, rewarding individual staff for performance may not provide sufficient institutional incentive.

Defining what “performance” means is often difficult. For example, the quality of surgical processes doesn’t necessarily correlate with surgical outcomes. Quality can be difficult to measure, and the causal link between many outputs (e.g., successful surgeries) and outcomes (e.g., long-term complications, mortality) often isn’t as consistent or strong as our intuition would have us believe.
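To make the point about incentive size concrete, here is a minimal sketch of the arithmetic. The organization and dollar figures are hypothetical and chosen only for illustration; the 10% figure is the share some physicians suggested, as mentioned above, not an established threshold.

```python
# Share of revenue that is performance-linked, compared with the 10%
# figure some physicians suggested. All dollar amounts are hypothetical.

SUGGESTED_MIN_SHARE = 0.10  # from the physicians' suggestion, not a proven cutoff


def performance_share(performance_pay: float, total_revenue: float) -> float:
    """Return performance-linked pay as a fraction of total revenue."""
    if total_revenue <= 0:
        raise ValueError("total_revenue must be positive")
    return performance_pay / total_revenue


# Hypothetical non-profit: a $500,000 outcomes-linked contract
# inside a $10,000,000 annual budget.
share = performance_share(500_000, 10_000_000)
print(f"Performance-linked share of revenue: {share:.1%}")
print("Meets suggested share:", share >= SUGGESTED_MIN_SHARE)
```

In this made-up case, half a million dollars sounds large but amounts to 5% of the budget, which is the sense in which a large sum can still be a small incentive.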

Unintended Consequences

Providers may avoid people in greatest need as they require the most assistance to meet performance targets. If threshold targets are used, there may be less of an incentive to dramatically improve the outcomes of the worst off than to marginally improve the outcomes of those much better off. One innovation has been “risk adjustment”, whereby compensation takes into account the underlying health status of the person. This adjustment avoids penalizing providers for working with the poorest and indeed in theory allows scaling up compensation for helping those at greatest risk.

Rather than improving care, P4P may catalyze greater documentation. Improvements in performance measures may be due to better data capture, rather than actual improved outcomes.

The IT and data collection requirements of P4P may disadvantage smaller organizations. Larger organizations, which typically have more robust IT systems, often have a greater ability to monitor and report metrics.

Ongoing monitoring of incentive programs is critical to determine whether incentives are having unintended effects on quality of care. For example, there’s always the risk of “teaching to the test.”

While I find his evidence less than compelling, Dan Ariely suggests that in our desire to draw up more “complete contracts”, we may just increase administrative costs and crowd out non-monetary motivation.
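The contrast between threshold targets and risk adjustment, raised at the start of this list, can be sketched with a toy example. Everything here is an assumption for illustration: the 0-100 outcome score, the target, the bonus amount, and especially the simple weighting formula, which is a stand-in for risk adjustment and not how any actual health plan computes it.

```python
from dataclasses import dataclass

TARGET = 60.0   # hypothetical threshold target on a 0-100 outcome score
BONUS = 1000.0  # hypothetical maximum bonus per client


@dataclass
class Client:
    baseline: float  # score before the program (higher is better)
    final: float     # score after the program


def threshold_payment(c: Client) -> float:
    """Pay the full bonus only if the client ends at or above the target."""
    return BONUS if c.final >= TARGET else 0.0


def risk_adjusted_payment(c: Client) -> float:
    """Pay per point of improvement, weighted up for clients who start worse off."""
    # Illustrative weight: 1.0 for the best-off baseline, 2.0 for the worst-off.
    risk_weight = 1.0 + (100.0 - c.baseline) / 100.0
    improvement = max(c.final - c.baseline, 0.0)
    return BONUS * risk_weight * improvement / 100.0


worst_off = Client(baseline=20.0, final=50.0)    # large gain, still below target
near_target = Client(baseline=58.0, final=61.0)  # small gain, crosses the target

for name, c in [("worst off", worst_off), ("near target", near_target)]:
    print(f"{name}: threshold={threshold_payment(c):.0f}, "
          f"risk-adjusted={risk_adjusted_payment(c):.0f}")
```

Under the threshold rule the provider earns nothing for the 30-point gain and the full bonus for the 3-point gain; under the illustrative risk-adjusted rule the ordering flips, which is the behavior risk adjustment is meant to encourage.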



Examine how the effectiveness of P4P in improving outcomes varies by intervention. Health research suggests it will vary considerably.

Invest in evidence for the effectiveness of pay-for-performance in low- and middle-income countries. In health care, the quality of the existing evidence is too poor to draw any conclusions.


P4P does lead to greater IT adoption, improved data collection, and accountability for quality measures. A short-term outcome of P4P is more IT staff.

Payers and providers do appreciate the greater multistakeholder collaboration, emphasis on quality improvement and IT investments, and greater focus on accountability and transparency.

An incentive can catalyze a large change in behaviors or practices and yet have a minimal effect on outcomes. Provider perception and empirical evidence suggest that pay-for-performance has led to changes in health care provider practices; it’s just not translating into improved outcomes.

The theoretical cost savings that in part motivated payers to take part in P4P in the first place did not materialize.

More broadly, marginal results discourage payers. Health plans were left feeling that “the P4P program had not achieved the stated goal of breakthrough quality improvements.” As noted, the theoretical cost savings were also not realized. As a result, just when many believe that the secret is to increase the amount of performance-linked pay, “plans rated [increasing the amount] as a low priority because of the marginal results attained thus far and questions about what other types of investments might yield larger improvements.”


Written by Chris Prottas

January 14, 2013 at 9:01 pm

Posted in measurement

