Pay-for-Performance: Practical Guidance for Decision-Making
and the Latest Evidence
May 16, 2006
2:00-3:30 p.m. EST
A Free Web Conference for Health Care Purchasers, Providers, Health Plans,
and Health Policy Makers
Audience Questions and Faculty Responses
from a May 2006 Web Conference
On May 16, 2006, AHRQ conducted a Web conference for health care purchasers, health plans, and providers. The Web conference featured a panel of health services researchers and purchasers who shared the latest evidence and practical experiences on a range of issues related to the design and implementation of pay-for-performance initiatives. The discussion that followed resulted in a number of questions from the audience. These questions are answered below by the panelists who participated in the Web conference.
R. Adams Dudley, M.D., M.B.A., University of California, San Francisco
David Kelley, M.D., M.P.A., Pennsylvania Department of Public Welfare
Douglas Libby, R.Ph., Maine Health Management Coalition (MHMC)
Meredith B. Rosenthal, Ph.D., Harvard University School of Public Health
Gary Young, J.D., Ph.D., Boston University School of Public Health
Financial Incentives & Reimbursement
What percentage of a physician's income from P4P incentives is needed to influence a physician's behavior?
There are (at least) two ways of thinking about this—how much money will get an individual physician's attention and be worth altering current patterns of care? Or, how much will be needed to balance the cost of improvement and thus make change profitable for a physician/hospital? In both cases, the answers are going to vary depending upon the setting—there is published evidence that very small incentives work in some cases while very large ones have failed in other cases. We generally advise taking the second perspective and trying to learn something about what it takes for a physician/medical group/hospital to deliver the quality measure—then try to make the incentive match up with the additional (unreimbursed) costs that will be incurred to make the desired improvement. M. Rosenthal
What percentage of reimbursement have you seen at risk...for hospitals and for physicians?
For hospitals, the typical dollar amount at risk is 1 to 2 percent of payments (sometimes restricted to the diagnoses covered by the measures); for physicians 5 to 10 percent is the typical range of dollars at risk. M. Rosenthal
What level of financial incentives is necessary to change physician behavior or change practice patterns?
This depends on several things, such as your market share (Medicare is getting almost all hospitals to report 10 to 20 measures for a 0.4 percent difference in payment, but a small purchaser could never do that) and how hard it is to do what you are asking providers to do (you will need to offer a bigger incentive to get doctors to make smokers actually quit than you will just to get them to offer smoking advice or make referrals to smoking cessation programs).
In addition, you probably shouldn't focus just on the financial issues, as there are many other things that could ease the process and increase the probability of real behavior change. For instance, if purchasers were to provide financial incentives and do a public report, the public report increases the strength of the incentive. In addition, purchasers could fund a community-wide educational and quality improvement collaborative that would reduce the costs to providers of improving and build positive relationships at the same time. Having a multi-pronged approach might reduce some of the friction over the exact magnitude of the financial payout. A. Dudley
Please comment on the use of absolute increases that exceed a threshold vs. relative increases in performance. How should one reward the high performer already at 80 to 90 percent of goal? How should one reward the poor performer that moves from 10 to 30 percent toward goal?
One could imagine charting out a reward system where payments are a function of the reduction in the failure rate (1 minus the performance rate) so that a small improvement from 80 percent performance counts more than a small improvement from 10 percent. This assumes that it is easier to make improvements at the low end than the high end (which is surely true at some level, but probably varies depending upon the measure). M. Rosenthal
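One way to make this concrete is a minimal sketch, assuming the reward is simply proportional to the fraction of the baseline failure rate that a provider eliminates. The function names, the `max_payment` parameter, and the linear payout are illustrative assumptions, not a formula described by the panelists.

```python
def failure_rate_reduction(baseline: float, current: float) -> float:
    """Fraction of the baseline failure rate that was eliminated.

    Rates are performance rates between 0 and 1 (e.g. 0.80 for 80 percent).
    The failure rate is 1 minus the performance rate.
    """
    if not 0.0 <= baseline < 1.0:
        raise ValueError("baseline must be in [0, 1); at 1.0 there is no failure left to reduce")
    if not 0.0 <= current <= 1.0:
        raise ValueError("current must be in [0, 1]")
    baseline_failure = 1.0 - baseline
    current_failure = 1.0 - current
    return (baseline_failure - current_failure) / baseline_failure


def reward(baseline: float, current: float, max_payment: float) -> float:
    """Hypothetical payout: linear in the failure-rate reduction, never negative."""
    reduction = failure_rate_reduction(baseline, current)
    return max_payment * max(0.0, reduction)


if __name__ == "__main__":
    # Compare a high performer and a low performer from the question above.
    for start, end in [(0.80, 0.85), (0.10, 0.15), (0.10, 0.30)]:
        share = failure_rate_reduction(start, end)
        print(f"{start:.0%} -> {end:.0%}: failure rate reduced by {share:.1%}")
```

Under these assumptions, a move from 80 to 85 percent removes a quarter of the remaining failures, while a move from 10 to 30 percent removes about 22 percent of them, so the high performer's smaller absolute gain earns a slightly larger reward. Whether that weighting is appropriate depends on how much harder improvement really is at the high end for a given measure.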
What about combining physician and patient compliance with evidence-based clinical guidelines by incenting both for following such guidelines, e.g., $10 to patients who complete a survey regarding their primary care visit and $15 to the physician for following the evidence-based guidelines?
Incentives to patients can be a valid approach, but only if used VERY carefully. It is clear that incentives simply to reduce utilization lead patients to make bad decisions (they are just as likely to drop necessary care as unnecessary care, because they mostly can't tell the difference). These decisions can then be costly to payors, such as when a patient with difficult-to-control hypertension stops taking his medications because all the copayments have been raised, then has a stroke.
However, incentives to patients where you are certain the behavior is in the patient's and purchaser's best interests as well can be useful. Examples might include giving discounts to non-smokers, offering diabetics who complete all their preventive care a choice of movie tickets or a small cash prize, etc. (The response of consumers to airline frequent flier miles programs, for instance, strongly suggests that incentives do not always have to be cash.) A. Dudley
Doug's model has a 1 percent "carrot" and a 1 percent "stick." How does this tie in with the earlier comment that "carrot" is always better?
MHMC's employer/purchaser members may not agree that the carrot is always better. In fact, they believe that “new money” bonuses are likely not sustainable, absent a concrete/guaranteed ROI model. Thus, they feel that, absent bullet-proof evidence that savings in one area won't be offset by increased costs/utilization in other areas, a reformed reimbursement system must be a zero-sum game. That is, increased payments to higher value providers will ultimately come from decreased reimbursement for lower value providers, or increased payments to one provider sector (example: PCPs) will have to be offset by decreased costs in other sectors (example: specialists). D. Libby
How do we adequately address risk adjustment issues and truly reward provider performance?
This is a very complex issue that will likely receive more attention as P4P becomes more widely diffused throughout the country. At present, most programs focus on process (versus outcomes) measures and so risk adjustment has been less of a concern on the part of providers. If programs begin to include outcomes measures, risk adjustment will almost certainly become a salient issue. There has been much progress in the last decade in developing good risk adjustment techniques, though providers always worry that their performance on quality measures is negatively affected by the fact that they treat patients with less favorable clinical or socio-demographic characteristics than those of their colleagues. G. Young
Have non-fiscal incentives (e.g., peer/public recognition), as opposed to fiscal rewards, been tried as rewards for achieving quality goals?
Both public reporting (which is recognition for some and public humiliation for others) and other types of quality awards are fairly commonplace. There is not an enormous literature supporting their effectiveness although there are a couple of recent studies showing that public reporting encourages providers to change their behavior—both in positive (quality improvement) and negative (avoidance of sick patients) directions. When surveyed about the impact of public recognition, hospital administrators indicated that it did/would get them to respond but that such response would not be sustained unless there was a financial consequence. M. Rosenthal
What is an acceptable number of performance measures upon which physicians should be measured?
The goal should be to measure as much of a physician's practice as possible, as this would give a more accurate and fair assessment of performance. Unfortunately, however, the large measure set this would require is not easily available in any community. Therefore, all stakeholders must balance the costs of data collection versus the benefits of understanding performance. Historically, however, too much emphasis has been placed on the data collection and analysis costs, with the result that quality problems have run rampant through the system (the cost of complications we fail to prevent is almost certainly much greater than even a fairly elaborate performance measurement system would be). A. Dudley
What is the minimum number of cases that should be used to rate physician performance across a measure?
There is no magic threshold at which a measurement suddenly becomes valid while one case fewer would make it invalid. Rather, each increase in sample size increases your confidence in your estimate of performance. Furthermore, while random events will have a big effect on a particular performance indicator if a physician has only a few patients for that indicator, if you are actually using lots of indicators, the random effects across indicators would be expected to roughly balance each other out. As a result, many health plans currently use relatively low thresholds to include a physician or hospital in a measure (5 to 15 cases is common). A. Dudley
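To illustrate how confidence grows gradually with sample size rather than at a single cutoff, the sketch below computes a 95 percent Wilson score interval for an observed performance rate. The choice of the Wilson interval and the example case counts are assumptions made for illustration; the panelists did not specify a statistical method.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95 percent)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


def interval_width(successes: int, n: int) -> float:
    """Width of the interval: a rough gauge of how uncertain the estimate is."""
    low, high = wilson_interval(successes, n)
    return high - low


if __name__ == "__main__":
    # Same observed rate (80 percent) measured on hypothetical case counts.
    for n in (5, 10, 15, 50, 100):
        low, high = wilson_interval(round(0.8 * n), n)
        print(f"n={n:3d}: plausible range {low:.2f} to {high:.2f}")
```

The interval narrows steadily as cases are added, with no point at which the estimate flips from invalid to valid. A program sponsor would pick a minimum case count by deciding how wide a range is tolerable for the payment decision at hand.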
Do you have P4P experience with outcome measures such as mortalities?
There has been very little P4P on mortality, mainly because most P4P has been on the physician and outpatient side, where mortality is a long term issue influenced by many non-physician factors (e.g., death from hypertension occurs after many years and is greatly influenced by the patient's willingness to diet and exercise). Most mortality measures are calculated for hospitals (e.g., after surgeries or major medical events like heart attacks), and hospital P4P is just getting started. On the other hand, there is, in general, a much longer history of measuring outcomes at the hospital level than the physician level and there are more outcomes measures available for hospitals. A. Dudley
For some of the national measure sets a number of hospitals may have reached a high degree of compliance. How then can they differentiate providers well?
It is still unusual for all providers in an area to be doing well on all indicators. However, if this is the case in your area, your best option is to congratulate those providers on a job well done and ask them what new area of care should be measured and improved. Then offer to assist them (either in kind or through a planning grant) since they cannot simply take a national measure off the shelf. They will need some convener/organizer to call them together to select and define new local measures. A. Dudley
Under what circumstances would you retire a measure that is part of a well established P4P program?
You might consider this when all providers have achieved near-optimal performance, or even before then if performance community-wide is good enough that you think you can get more return on your investment of measurement resources by changing to a different measure. However, you would want to ask yourself whether performance is going to fall once you stop measuring it, and may consider checking it again every few years. If you announce ahead of time that you will be rechecking the measure at some point, it is less likely to fall off of radar screens.
For instance, by now, most patients having a heart attack receive an aspirin right away (this was one of The Joint Commission's first core measures and CMS' first measures on its Web site). In many situations, this high level of performance might be “sticky,” in that it has been achieved primarily by creating protocols at hospitals that are followed whenever a heart attack patient comes in. It is unlikely, if you remove aspirin from your P4P, that hospitals will go and remove aspirin from their protocols or get rid of the protocols altogether, so you might be able to drop this measure at some point in the near future. However, you should plan to go back and check performance later, since no one really knows how “sticky” quality improvements really are (in fact, it's a bit difficult to really understand why quality is so bad right now!) A. Dudley
What are some examples of efficiency or cost savings measures that researchers would recommend?
AHRQ has funded a review of the evidence on efficiency measures that should come out soon. However, in general there is much less data to support the validity of the efficiency measures currently being widely marketed than there is to support the widely used quality measures. That said, there are some situations in which you can probably get local providers to agree on important efficiency-related measurements, like risk-adjusted length of stay for certain conditions or the appropriateness of procedures. A. Dudley
Are there any P4P programs that measure performance based on measures such as A1C, LDL, etc?
Many programs use as quality measures the administration of these tests — whether patients received them. But far fewer programs use as quality measures the actual levels or scores from the tests themselves. Most programs use claims data for measurement purposes and the scores from these tests are not available from such data. G. Young
Is it feasible to group specialists and develop measures for the grouping rather than individual specialists? e.g. site verification for all surgical specialists vs. time to consult for non-procedural specialists
There is certainly some interest on the part of program sponsors and provider groups to develop such measures. However, I do not believe any group has yet really tested the feasibility of such measures. G. Young
If one is publicly reporting the measured quality, should one report incentivized quality differently than quality that is not incentivized?
The main reasons to report publicly are: 1) to help consumers make decisions, and 2) to give providers additional incentive beyond the direct payments. For consumer decision-making, it doesn't matter whether a financial incentive is also offered. All that matters is level of performance, however achieved. For creating a reputational incentive, I can't see a reason why it would be important to denote those measures on which you also pay, but maybe I'm missing something. A. Dudley
Can you discuss the importance of making public the results of any P4P program? Judith Hibbard's analysis of the Wisconsin hospital reporting program indicated that providers respond with much more quality improvement to public reports than to confidential reports.
Yes, these and other studies suggest that public reporting and P4P should be considered complementary incentives that work in different ways. The key issue is that the hardest part of both public reporting and P4P is getting valid data in the first place. Once you have done that, you might as well use it as fully as possible. The only thing to watch out for is that, for new measures with which providers have little experience, they often may not want public scrutiny of their initial performance estimates (they also may not want to have their pay based on this—providers vary about whether they are more sensitive about public reporting or P4P). So, a “pilot period” may be appropriate. A. Dudley
Are some measures better suited for P4P, while others are better suited for public reporting without any financial incentives?
In fact, I can't think of much reason not to pay on all measures on which you report EXCEPT when you are reporting something that is really a “style” measure rather than a performance measure. For instance, if you chose to report a C-section rate in maternity care, no one could tell you what the right rate is, but some expectant parents would have a preference for places with high or low rates and might want to choose on that basis, so public reporting could make sense. However, paying providers to hit some arbitrary target C-section rate when you can't defend that rate as better than any other would only cost you credibility with providers. A. Dudley
Please describe how "performance" is best operationally defined. How are measures selected? How is physician buy-in best generated?
Defining “performance” is much harder than answering the question about how to select measures and get physician buy-in. The best way to choose measures and get buy-in is to involve physicians and/or hospitals from the start, to take their input seriously (even when it seems they only offer an endless stream of objections), but then insist that they come up with a workable approach to measuring key aspects of care. This understanding but firm approach has worked in several major projects. A. Dudley
Patient Experience/Satisfaction
Are any patient experience (i.e. CAHPS) indicators involved in the compensation formulae in any of these programs?
Many health plans and other payers are using CAHPS or other patient experience measures for pay for performance—see for example, the IHA measures (www.iha.org). Our research suggests that more than half of health plan pay-for-performance programs use patient experience measures. M. Rosenthal
Should patient perception/satisfaction be considered along with clinical measures and cost for P4P standards?
Some programs do include patient satisfaction measures along with clinical measures. At least in principle, including such measures may help to create a more balanced set of incentives for providers that deter them from certain undesirable behaviors that are conducted solely to achieve better clinical scores. G. Young
For those clinics without electronic medical records or for those county clinics addressing the needs of underserved populations, what data is available as to the effectiveness of pay for performance?
Pay-for-performance programs implemented to date rarely rely on electronic medical records (EMRs) for data—even the large integrated groups in California do not generally have robust enough EMRs in place. Most rely on claims data—so here is where the financing of community clinics may pose an extra challenge, to the extent that claims/encounter data don't exist. But in NY and MA (among no doubt others), there are quality measurement and pay-for-performance programs involving community health centers. In some cases, the programs relate to Medicaid claims data, in others data are abstracted from medical records and submitted by the centers. M. Rosenthal
Is rewarding "improvement" as opposed to absolute performance an effective way to engage and reward safety-net providers?
Rewarding improvement explicitly is one way to make sure a pay-for-performance program does not put rewards out of reach for providers with low baseline performance—some of which may be safety-net providers. There is a long discussion on incentive issues in Pay-for-Performance: A Checklist of Issues for Purchasers to Consider (question 9) that I won't repeat, but other strategies to consider in the context of safety-net providers are: provide technical assistance with quality improvement; provide bigger rewards than for providers serving well-educated, insured populations; offer consumer/patient incentives. M. Rosenthal
P4P for Ancillary and other Providers
Does pay for performance work for ancillary providers?
At present, few if any programs include ancillary providers. There is some interest in expanding programs to include such providers but at present we have no data on how well P4P would work for this group. G. Young
What, if any, are the plans for implementing P4P in nursing home settings?
There appears to be mounting interest in this area because of increasingly available data (including CAHPS for nursing homes, CMS nursing home compare). See the section on Medicaid in the AHRQ-supported tool, Pay-for-Performance: A Checklist of Issues for Purchasers to Consider where we discuss this issue and provide some more specific examples. M. Rosenthal
What about the political difficulties of P4P at the individual doctor level? Has there been resistance from that community?
Yes, there certainly have been concerns and what some might call resistance from providers. However, as I noted during the Web conference, the surveys of physicians that my research team and I have conducted indicate that physicians are comfortable with P4P as a concept and their concerns focus largely on how programs have been designed and implemented. Particular concerns relate to the size of the financial incentives and communication on the part of program sponsors about how programs are administered. G. Young
I'm glad to see PA scoring Managed Care Plans as well. Is there any provider hesitancy to do P4P without a corresponding effort by health plans to improve performance?
Several of our health plans initiated provider P4P programs prior to us implementing a health plan P4P in July 2005. We have not surveyed the managed care (HealthChoices) providers about their perception of P4P. I am meeting with the medical directors of our health plans 6/1/06 to review their provider P4P programs. I will informally pose this question to them. D. Kelley
Why do you need to choose either a physician or hospital P4P program? Is there a way to have both programs but also develop interconnectivity between the two?
In principle, I agree with you completely. However, at present we do not have good models for how to develop this interconnectivity. G. Young
Are the panelists aware of any Medicaid pay-for-performance programs, and if so, could they share any findings?
Yes. There is a section in Pay-for-Performance: A Checklist of Issues for Purchasers to Consider which deals with this (question 18) and describes some example programs. Most of them—in particular the ones that pay providers as opposed to managed care plans—are too new for results to have been evaluated. There are a couple of pilot programs that have been evaluated in the literature—mostly with not very encouraging results but these programs were focused on a single quality measure and may not generalize to current efforts. M. Rosenthal
Are PA's Medicaid Managed Care Organizations required to be NCQA accredited so that they are already compiling audited HEDIS data?
Our managed care organizations are not required to be NCQA accredited but six plans are voluntarily NCQA accredited at the “excellent” status and one plan at “commendable.” We picked HEDIS parameters because the plans were already reporting results which were published in a public report. D. Kelley
For the Maine Health Management Coalition initiative, could you identify your process and sources for obtaining patient experience and cost data?
We used the results of a common patient experience survey administered by all but one of Maine's hospitals as part of a project coordinated by the Maine Hospital Association. Avatar was the vendor. Cost data was derived from a proprietary claims database maintained by our Coalition. We calculate the “allowed” amount (employer and employee paid dollars) for a market basket of case mix adjusted diagnoses/procedures on both inpatients and outpatients for each Maine hospital. D. Libby
Does Medicare have a formal policy favoring use of P4P? Where is this documented?
CMS is currently conducting demonstration programs to test the P4P concept for hospitals and physician organizations. Recent legislation has also called for a plan to be developed to include P4P in the Medicare program. In public statements, CMS officials have also signaled their interest in using P4P concepts in the Medicare program. I am not aware of any formal policy beyond what I have noted above. G. Young