Did paying drugs misuse treatment providers for outcomes lead to unintended consequences for hospital admissions? Difference-in-differences analysis of a pay-for-performance scheme in England

Aims To estimate how a scheme to pay substance misuse treatment service providers according to treatment outcomes affected hospital admissions. Design A controlled, quasi-experimental (difference-in-differences) observational study using negative binomial regression. Setting Hospitals in all 149 organisational areas in England for the period 2009-2010 to 2015-2016. Participants 572 545 patients admitted to hospital with a diagnosis indicating drug misuse, defined based on International Classification of Diseases 10th Revision (ICD-10) diagnosis codes (37 964 patients in 8 intervention areas and 534 581 in 141 comparison areas). Intervention and comparators Linkage of provider payments to recovery outcome indicators in 8 intervention organisational areas compared with all 141 comparison organisational areas in England. Outcome indicators included: abstinence from presenting substance, abstinent completion of treatment and non-re-presentation to treatment in the 12 months following completion. Measurements Annual counts of hospital admissions, emergency admissions and admissions including a diagnosis indicating drugs misuse. Covariates included age, sex, ethnic origin and deprivation. Findings For 37 245 patients in the intervention areas, annual emergency admissions were 1.073 times higher during the operation of the scheme compared with non-intervention areas (95% CI = 1.049, 1.097). There were an estimated additional 3 352 emergency admissions in intervention areas during the scheme. These findings were robust to a range of secondary analyses. Conclusion A programme in England from 2012 to 2014 to pay substance misuse treatment service providers according to treatment outcomes appeared to increase emergency hospital admissions.

Most studies have examined P4P schemes within either primary or secondary healthcare; there are relatively few assessments in substance misuse treatment. Evaluations of two United States (US) schemes tentatively suggest that their introduction led to improvements in certain treatment outcomes [16,17], but their interpretation is limited by study design: lack of a control group, missing baseline measurements and reliance on self-reported abstinence. A more recent Washington-based study adopted a robust, quasi-experimental design to evaluate a care quality improvement scheme within substance misuse treatment [18] and found no evidence of improved care quality overall.
New models of contracting in the public sector have become increasingly common internationally [19-22]. In the previous decade the United Kingdom (UK) government increased its focus on paying providers of public services according to outcomes, including a scheme incentivising recovery success within substance misuse treatment. This scheme was adopted amidst development of the UK government's drug strategy, prioritising the goal of 'recovery', a nebulous term that superseded 'abstinence' [23-25].
The P4P pilot scheme represented a key pillar of the revised drug strategy that would 'incentivise the system to deliver on recovery outcomes' [25] by making treatment providers financially dependent on their achievement. The P4P pilot was also aimed at increasing efficiency and value for money [25]. It was introduced in 2012 and linked payment for treatment to nationally agreed outcomes.
Previous studies have considered the impact of this P4P scheme during its operation, from 2012 to 2014 [23,26,27]. These studies showed that the policy led to lower rates of treatment initiation, treatment completion (free of dependence) and treatment completion without subsequent re-presentation, and to longer waiting times. However, these studies were limited to those in treatment and did not consider impacts across the wider drugs misusing population. Such an approach fails to identify the impacts that lower initiations and longer waiting times may have on the population at risk of the harms associated with drugs misuse. This wider population is observed across government services such as welfare, criminal justice and hospitals. Hospital data allow examination of a wider set of economic consequences and of the health outcomes of those experiencing the most severe harms.
Furthermore, previous evaluations only had access to data until 2014. The pilot scheme ended in April 2014 without national adoption of the scheme, although commissioners had autonomy to incorporate elements of P4P in reimbursement models. Analyses of outcomes after the scheme's termination can provide firmer evidence of overall impact. Insight into the wider impact of the scheme can inform local commissioners and decision makers of its potential efficiency; P4P schemes remain popular with decision makers internationally [28], including within substance misuse treatment [18].
This study therefore examines whether the P4P scheme, operational in 2012-2013 and 2013-2014, led to unintended consequences in a wider population at risk of drugs misuse. We applied quasi-experimental methods to individual-level hospital data for England from 2009-2010 to 2015-2016, covering the period before, during and after the scheme. Specifically, we examine the following research question. For individuals admitted to hospital with a diagnosis indicating substance misuse during the study period, did the introduction of the scheme lead to a differential change in their rate of: (1) hospital admissions, (2) emergency admissions and (3) hospital admissions including a diagnosis indicating drugs misuse?

Intervention
Eight organisational areas (local authorities) participated in the pilot scheme in England from April 2012 to March 2014. Participating areas were selected to be representative from an applicant sample of 29 areas. Non-participating areas were permitted to operate their own commissioning models, which had been traditionally based on block grants [26,27]. The P4P models applied to all services within participating areas, with all service users moved to P4P funding models during the study period.
Participant areas enjoyed a level of autonomy, such as adjusting income weights allocated to particular performance indicators, provided their schemes adhered to a national outcomes framework (Supporting information Tables S1-S4). Payments were conditional on performance indicators in three nationally specified domains: (i) progression toward abstinence from presenting drug(s) of dependence (and crack/heroin); (ii) reduction(s) in offending; and (iii) improvements in health and wellbeing.
Initial assessments and treatment plans were conducted by the newly constituted local area single assessment and referral system (LASARS). Complexity scores for individuals informed the adjustment of provider payments. LASARS could be established independently of treatment service providers, although they were often co-located.

Design
To examine the effect of participation in the scheme on hospital admissions, we used difference-in-differences (DiD) estimation to compare changes in participant areas to those not participating.

Participants
There is no authoritative measure of the populations needing drugs misuse treatment in each area in each year. We used International Classification of Diseases 10th Revision (ICD-10) codes related to substance misuse (Table 1), consistent with previous academic studies and government reports [29-33], to identify drug-related hospital admissions. The study population was then defined as individuals with at least one drug-related hospital admission (a 'qualifying admission') during the study period (2009-2010 to 2015-2016). This definition of the 'at risk' population allowed us to focus on those who experienced the most severe harms at some point before, during or after the adoption of the experimental payment system. Intervention and comparison groups were based on area of residence first recorded within the study period. Change of area of residence was possible within the study period and within financial years, possibly representing self-selection into intervention/control groups in response to or anticipation of the policy. We therefore fixed individuals to their initial area of residence, but generated a 'migration' sub-population for testing the sensitivity of the results to the identifying assumption of fixed residence.

Data
We used activity data from the admitted patient care data files in Hospital Episodes Statistics (HES) for the financial years 2009-2010 to 2015-2016. HES are episode-level data on care received in hospitals across England. These include privately paid patients treated in National Health Service (NHS) hospitals, patients resident outside of England and care delivered by non-NHS providers that is funded by the NHS [34]. HES data provide clinical information (such as diagnoses and operations), patient information (such as age and sex), administrative information (such as admission date and discharge method) and information on patient area of residence [34].
Episodes in HES represent a period of care delivered under a specific consultant in a single hospital [34]. These episodes are uniquely identified within spells, defined as the period from admission to discharge. We aggregated episodes into spells, referred to as admissions herein. From the admission-level data, we constructed a longitudinal data set for each individual in each financial year from 2009-2010 to 2015-2016 (Fig. 1).
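The aggregation described above can be sketched in a few lines. This is an illustration only: the record layout, the identifiers and the choice to assign a spell to a financial year by its admission date are assumptions, not details taken from HES or from the study. The financial year is taken to run from 1 April to 31 March.

```python
from collections import defaultdict
from datetime import date


def financial_year(d: date) -> str:
    """Label the financial year (1 April to 31 March) containing d, e.g. '2012-2013'."""
    start = d.year if d.month >= 4 else d.year - 1
    return f"{start}-{start + 1}"


# Hypothetical episode records: (patient_id, spell_id, admission_date).
episodes = [
    ("p1", "s1", date(2012, 3, 30)),
    ("p1", "s1", date(2012, 3, 30)),  # second episode within the same spell
    ("p1", "s2", date(2012, 4, 2)),
    ("p2", "s3", date(2013, 1, 15)),
]


def annual_admission_counts(records):
    """Collapse episodes into spells, then count spells per person per financial year."""
    # Keying on (patient, spell) keeps one entry per spell, however many episodes it has.
    spells = {(pid, sid): admitted for pid, sid, admitted in records}
    counts = defaultdict(int)
    for (pid, _sid), admitted in spells.items():
        counts[(pid, financial_year(admitted))] += 1
    return dict(counts)


counts = annual_admission_counts(episodes)
```

In this toy input, patient p1 has one admission in 2011-2012 and one in 2012-2013, even though three episodes were recorded.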
We examined the effect of the scheme on the following annual counts of hospital activity per person: the number of hospital admissions, the number of emergency hospital admissions and the number of hospital admissions containing a diagnosis of substance misuse. The DiD is the change over time in average outcome in the intervention areas minus that in the comparison areas [35]. DiD assumes that, conditional on the other variables in the model, changes over time would have been the same in the intervention areas as in the comparison areas in the absence of the intervention. We formally tested for parallel pre-trends to assess the plausibility of this assumption.
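The DiD definition above can be written out for the simplest case of four group means. The figures below are invented for illustration and are not study data; the study itself estimated the DiD within a regression model with covariates.

```python
def difference_in_differences(pre_intervention, post_intervention, pre_comparison, post_comparison):
    """Change over time in the intervention group minus the change in the comparison group."""
    return (post_intervention - pre_intervention) - (post_comparison - pre_comparison)


# Hypothetical mean annual admissions per person (not taken from the study).
estimate = difference_in_differences(
    pre_intervention=1.00,
    post_intervention=1.10,
    pre_comparison=0.90,
    post_comparison=0.95,
)
```

Here the intervention group rises by 0.10 and the comparison group by 0.05, so the DiD is about 0.05 admissions per person. Under the parallel trends assumption, the comparison group's change stands in for what would have happened in the intervention group without the scheme.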

Analysis
We also compared intervention areas to a restricted subset of 51 local authorities, matched for similarity on two population domains: the percentage of persons with opiate and crack disorders in drugs misuse treatment populations (selected because of relatively high hospitalisation rates for persons with opiate/crack disorders [36]) and 2010 indices of multiple deprivation (IMD) [26,37].
We conducted analyses at the individual level using annual counts of hospital admissions for each financial year from 2009-2010 to 2015-2016 as the outcomes. We included individual-level controls for age intervals, sex, IMD deprivation deciles and ethnic origin (all of which have been shown to affect health outcomes in substance misusers) [23,26,27,41]. We also included year indicators, a binary indicator for being initially resident in the intervention areas, an indicator for the intervention period, an interaction between the intervention period and intervention areas (the DiD term) and an interaction between the post-intervention period and intervention areas.
All models clustered standard errors by individual, controlled for hospital facility fixed effects and were estimated using the nbreg command in Stata version 16. Estimating equations for all analyses and statistical tests are presented in the Supporting information.
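The estimating equations themselves are not reproduced in this text. As a rough sketch only, the specification described above might be written as follows, using the log link that is standard for negative binomial regression. All symbols are illustrative labels, not notation taken from the Supporting information.

```latex
\log \mathrm{E}[y_{it}] = \alpha + \beta_1 T_i + \beta_2 D_t
  + \beta_3 (T_i \times D_t) + \beta_4 (T_i \times P_t)
  + \mathbf{x}_{it}'\boldsymbol{\gamma} + \lambda_t + \eta_h
```

Here $y_{it}$ is the annual admission count for individual $i$ in financial year $t$, $T_i$ indicates initial residence in an intervention area, $D_t$ indicates the intervention period, $P_t$ indicates the post-intervention period, $\mathbf{x}_{it}$ holds the individual-level controls, $\lambda_t$ are year indicators and $\eta_h$ are hospital facility fixed effects. Under this sketch, $\exp(\beta_3)$ is the incident rate ratio on the DiD term.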
We present both incident rate ratios (IRRs) and estimated marginal effects (MEs) on the component parts of the DiD term [42-44]. The marginal effects are estimates of the magnitude of the effects in terms of the number of admissions and were estimated using the margins postestimation command [44].
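An IRR is the exponential of a coefficient from the count model. The sketch below shows that relationship and backs out an approximate log-scale standard error from the emergency admissions estimate reported in this paper (IRR = 1.073; 95% CI = 1.049, 1.097). It assumes a Wald-type interval that is symmetric on the log scale with a critical value of 1.96; the source does not state how its intervals were constructed.

```python
import math

Z_95 = 1.96  # assumed normal critical value for a 95% interval


def irr_with_ci(beta, se, z=Z_95):
    """Exponentiate a log-scale coefficient and its Wald interval to get an IRR and its CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Reported emergency admissions estimate for the primary analysis.
reported_irr, reported_low, reported_high = 1.073, 1.049, 1.097

# Back out the implied log-scale coefficient and standard error (approximate).
beta = math.log(reported_irr)
se = (math.log(reported_high) - math.log(reported_low)) / (2 * Z_95)

irr, low, high = irr_with_ci(beta, se)
```

Rebuilding the interval from the implied coefficient and standard error returns the reported bounds to three decimal places, which is consistent with, but does not prove, a log-scale Wald interval.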

Secondary analyses
As secondary analyses, we restricted the sample to exclude individuals who changed residence during the study period. We tested the sensitivity to the sample inclusion criteria by restricting the sample to those who had a qualifying admission during the intervention period. We also tested the sensitivity of the results to a lagged definition of the intervention period (2013-2014 and 2014-2015). Finally, our definition of the 'at risk' population could have caused bias if the intervention had prevented individuals who had not previously had a drugs misuse hospitalisation from having any drugs misuse hospitalisations during and after the intervention. These individuals would not have been included in the study population, and such an impact would be missed.
We used two approaches to examine this. The first approach examined impacts only on those individuals who had at least one qualifying admission in the pre-intervention period. This measure of the at-risk population is not affected by the introduction of the experimental payment system. The second approach was to estimate a two-part area-level model. The first part of the area-level model considered the impact of the scheme on the number of individuals in each area having at least one qualifying admission in each year, with the at-risk population being the total resident population. The second part of the area-level model considered the impact of the scheme on the number of hospital admissions amongst those who had at least one qualifying admission within that year. This two-part approach separately identified the effect of the experimental payment system on the size of the population having a qualifying admission during the intervention period.
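The two-part structure rests on a simple accounting identity: admissions per resident equal the share of residents with a qualifying admission multiplied by admissions per qualifying individual. The sketch below shows only that decomposition for one hypothetical area-year. It is not the regression model the study estimated, and the numbers are invented.

```python
def two_part_rates(residents, qualifying_individuals, admissions):
    """Split admissions per resident into its two component rates.

    part_one: share of residents with at least one qualifying admission in the year.
    part_two: admissions per person among those with a qualifying admission.
    """
    part_one = qualifying_individuals / residents
    part_two = admissions / qualifying_individuals
    return part_one, part_two, part_one * part_two


# Hypothetical area-year (not study data).
part_one, part_two, per_resident = two_part_rates(
    residents=200_000, qualifying_individuals=1_000, admissions=2_500
)
```

A scheme could raise admissions per resident through either component, which is why modelling them separately helps show whether the size of the population with a qualifying admission changed.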
The full results of secondary analyses are presented in the Supporting information. The analysis in this study was not pre-registered and the results should be considered exploratory.

Descriptive statistics
The intervention areas were less ethnically diverse than the comparison areas and generally had a higher percentage of individuals resident in areas of lower deprivation (Table 2). The populations were generally similar in terms of age and gender in both the intervention and comparison areas. Matched comparison areas were more similar to the intervention areas in terms of ethnicity, age and IMD deprivation.

Trends in admissions before, during and after the intervention
For unadjusted mean values of the outcomes, the trends in pre-intervention years (2009-2010 to 2011-2012) were similar in intervention and comparison areas (Table 2; Fig. 2). There was then a differentially higher increase in the intervention areas during the scheme's operation, followed by a reversion to lower levels of admissions after the scheme was discontinued in the intervention areas. For matched comparison areas, pre-intervention trends appeared more parallel, and there was a pronounced differential increase in the pilot areas, with a more noticeable decrease after the cessation of the scheme.
Formal tests could not reject the null hypothesis of parallel pre-trends in any primary or secondary analysis, except one secondary analysis of all admissions (Table 3).

Regression analyses
Table 3 outlines the estimated effects of the P4P intervention on all outcomes for all primary and secondary analyses in both relative (IRRs) and absolute terms (MEs).
Table 4 presents full regression estimates for primary analyses.

All hospital admissions
For individuals initially resident in P4P pilot areas, annual hospital admissions (of all types) were 1.064 times higher in the intervention period compared with non-intervention areas (95% CI = 1.035, 1.094). This corresponds to 0.062 additional annual hospital admissions per person in the intervention areas. For analyses restricted to matched comparators, admissions were 1.065 times higher (95% CI = 1.035, 1.096), or 0.059 more admissions yearly per person. The largest relative increase in admissions was for the restricted sample who had a qualifying admission in the pre-intervention period (IRR = 1.085; 95% CI = 1.039, 1.133).

Emergency admissions
For individuals initially resident in P4P pilot areas, annual emergency admissions were 1.073 times higher in the intervention period compared with non-intervention areas (95% CI = 1.049, 1.097). This corresponds to 0.045 additional annual emergency admissions per person. For analyses restricted to matched comparators, emergency admissions were also higher, by a factor of 1.066 (95% CI = 1.041, 1.092), or 0.039 more per person yearly.

Drugs misuse diagnosed admissions
For individuals initially resident in P4P pilot areas, annual admissions including a substance misuse diagnosis were 1.055 times higher in the intervention period compared with non-intervention areas (95% CI = 1.026, 1.084). This represents 0.013 additional annual admissions per person. The largest relative increase in drugs misuse diagnosed admissions was for the restricted sample who had a qualifying admission in the intervention period (IRR = 1.079; 95% CI = 1.023, 1.138). There were no differences in drugs misuse admissions for the restricted sample that had a qualifying admission in the pre-intervention period.

Summary of main findings
Hospital admissions were differentially higher during the scheme for individuals initially resident in the intervention areas. For the 37 245 individuals identified in intervention areas, over 2012-2013 and 2013-2014 we estimated 4 618 additional hospital admissions, 3 352 additional emergency admissions and 968 additional admissions with a diagnosis indicating substance misuse.
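These totals can be reproduced from the per-person marginal effects reported in the regression results. The check below assumes each total is the annual per-person marginal effect multiplied by the 37 245 individuals and by the two scheme years. That is an inference from the reported figures, not a calculation described in the text.

```python
INDIVIDUALS = 37_245  # individuals identified in intervention areas
SCHEME_YEARS = 2      # 2012-2013 and 2013-2014

# Annual per-person marginal effects reported for the primary analyses.
marginal_effects = {
    "all admissions": 0.062,
    "emergency admissions": 0.045,
    "drugs misuse diagnosed admissions": 0.013,
}

# Assumed calculation: marginal effect x individuals x years, rounded to whole admissions.
additional_admissions = {
    outcome: round(effect * INDIVIDUALS * SCHEME_YEARS)
    for outcome, effect in marginal_effects.items()
}

for outcome, total in additional_admissions.items():
    print(f"{outcome}: {total}")
```

Under that assumption the three products round to 4 618, 3 352 and 968, matching the totals stated above.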
Secondary analyses showed that the relative increase in qualifying (drugs misuse diagnosed) admissions was largest in those who had at least one qualifying admission in the intervention period. Results from two area-level secondary analyses were consistent with this, showing that the scheme increased the population rate of individuals having at least one qualifying admission in the intervention areas during the scheme. The relative increase in all and emergency admissions was largest in those who had a qualifying admission in the pre-intervention period.

Explanation of notable findings
Differentially increased hospitalisations in the intervention areas may have been partly related to changes in access to treatment (i.e. lower treatment retention and initiation), outlined by previous evaluations of the scheme [23,26,27]. New drug misuse treatments decreased by 6.57% in intervention areas in 2012-2013 and 2013-2014 compared with 2010-2011 and 2011-2012, more than double the relative decrease in comparison areas (Table 5) [23].
Causes of reduced access included higher waiting times because of implementation difficulties, and changes in the acceptability of treatment for some patients (for example, changes in substitute prescribing) [23].
Reduced treatment retention and initiation represent higher levels of unmet need, where need is defined as the capacity to benefit from treatment [45]. Unmet need in individuals with substance misuse disorders is significant and well established [46], and engagement with treatment services for people with substance misuse disorders has been shown to reduce hospitalisations [47].
The results of primary analyses incorporate the net effects of two impacts of the scheme: differentially increasing the number of individuals having qualifying admissions in the intervention areas during the intervention, and also increasing overall admissions amongst those who had qualifying admissions before the intervention period. The second effect, especially for emergency admissions, indicates that the health needs of this population were not being met as well during the scheme.
Evaluations of the scheme did not find evidence of cherry-picking by providers [23]. This may indicate the effectiveness of (i) the risk adjusted tariff payments, and (ii) the quasi-independence of LASARS, which were introduced as a feature of the intervention with the aim of establishing an independent function responsible for the assessment of all patients in and referred to the treatment system and their subsequent tariffing. Alternatively, cherry-picking may have occurred in ways that were not possible to identify from available data.

Limitations
The generalisability of this study in part depends on the identification of differential changes over time in the appropriate population of interest: individuals with a (potentially unmet) need for substance misuse treatment, identified using hospital diagnosis codes related to substance misuse. Although this allows for the capture of possible wider impacts of the scheme on access to treatment and treatment acceptability, as well as on the hospital sector, it does not precisely capture each individual with a need for substance misuse treatment. Our approach treated admission to hospital including a drugs misuse diagnosis at any point during the study period as a reasonable proxy of a potential need for drugs misuse treatment. Although this excludes estimating impacts on those not hospitalised at all, it includes those subject to the most severe health harms. The sample we identified is comparable to the substance misuse treatment population; the relatively young mean age of patients in this study, at 35, is closer to the substance misuse treatment population (age = 33) than the general admitted population in England (age = 53) [34]. However, we sought to also identify potential impacts on unmet need for treatment, and individuals with unmet need are unlikely to be representative of those in treatment [48-50].
Only a relatively small proportion of those in substance misuse treatment in 2012-2013 had planned hospital inpatient detoxification (1.57%) [51], with the vast majority of hospitalisation relating to drugs misuse being independent of planned drugs misuse treatment.
Differential changes over time in factors not captured in the analyses (including factors entirely unrelated to the payment scheme) could in theory provide alternative explanations of some of the estimated effects. For example, there was an increase in opioid/crack using subjects in treatment during the study in the intervention areas. If this increase represented something important about differential comparative change in drug use, it might help explain the increase in hospital admissions. Equally, current evidence on community prevalence of opioid/crack use suggests no significant increase in use in the intervention areas during the pilot (https://www.ndtms.net/).
Diagnosis recording in hospital is inexact: in some cases drugs misuse may be suspected but not recorded, and in other cases not suspected despite drugs misuse being a contributing cause of an admission [38].

Implications for policy and research
Although the effect sizes estimated at patient-level in each year may at first appear modest, we estimated that there were 3 352 additional emergency admissions for 37 245 patients in the 8 intervention areas in 2012-2013 and 2013-2014, equivalent to a minimum £2.1 million additional hospital expenditure [52,53]. If the effects on admissions were repeated (even weakly) under national adoption of the scheme across all 149 areas, the economic (and clinical) impact to hospitals would be substantial.
More effective P4P schemes have been targeted on process indicators for health conditions linked to best practice pathways applicable to a majority of the target population (such as stroke) [54]. Such examples contrast with this scheme, which was focused on outcomes in a setting characterised by a heterogeneous population with complex needs and that involves multiple agencies. Future policies focused on reforming payment systems for substance misuse treatment providers to include P4P may prefer to link incentive payments to processes evidenced to be universally effective in specific groups of patients (for example, vaccination against Hepatitis B infection [55]).
Considering impacts on hospital admissions and not just those in contact with treatment agencies in research allows for capture of populations at risk of the most severe harms. Administrative data are well suited to this as information is captured on the whole population of interest, not just those in treatment. Furthermore, examination of this wider population is useful as there may be effects on who enters and remains in treatment.
The 'efficiency savings' that can be generated by P4P schemes may be offset by costs in other sectors. This scheme is a reminder that research should consider possible effects across various government services, especially for substance misuse policy that affects a range of agencies.

Declaration of interests
None.

Figure 1
Figure 1 Estimation sample construction.
We estimated regression analyses on n = 3 910 286 annual observations on N = 572 545 individuals for the period 2009-2010 to 2015-2016. Analyses of matched comparison areas included n = 1 458 038 observations on N = 213 525 individuals. Data excluded after initial sample construction are outlined in Fig. 1.

Figure 2
Figure 2 Unadjusted trends in outcomes in intervention and comparison areas. Dotted lines indicate beginning and end of the P4P scheme in the intervention areas.
* P < 0.05. ** P < 0.01. *** P < 0.001. n = number of annual observations on N = number of individuals; DiD = difference-in-differences; ME = marginal effect; models estimated using negative binomial regression; DiD of MEs takes the DiD between the estimated marginal effects of the components of the DiD term (see Supporting information); null hypothesis in test of pre-trends assumes no significant difference in slope comparing intervention and comparators.

Table 1
ICD-10 codes used to define diagnoses of substance misuse in hospital

Table 2
Outcomes and population characteristics in 2 years before the pilot (descriptive statistics and t tests; primary analysis sample)
IMD = index of multiple deprivation; mean values of outcomes are mean counts per person year; mean values of age and IMD are continuous variables; means of male and ethnicity groups are proportions; N = number of individuals in areas in specified year.

Table 3
Difference-in-differences estimates and tests of pre-trends; primary and secondary analyses

Table 5
New drugs misuse treatment journeys, from Donmall et al. [23]. Source used in study is the National Drug Treatment Monitoring System (NDTMS).