Charting the Opioid Terrain: A Decade of Hydrocodone and Oxycodone Distribution Patterns Across the U.S.


The opioid epidemic in the United States has unfolded as a devastating public health crisis, characterized by a marked increase in opioid misuse, addiction, and overdose-related deaths. The crisis began in the late 1990s, driven by an upsurge in prescribing opioids for pain management amidst aggressive marketing tactics by pharmaceutical companies. The epidemic has evolved, seeing a shift from prescription opioids to more potent illicit substances, such as heroin and fentanyl, complicating the efforts to manage and mitigate its impacts (Judd, King, & Galke, 2023). As of 2021, the opioid epidemic has cost over 80,000 lives in the US (Madera, et al., 2023).

The scope and severity of the epidemic varies significantly across different geographic and demographic groups, with some communities experiencing disproportionately higher rates of opioid misuse and mortality. This variation underscores the importance of examining the epidemic through a geographically nuanced lens, considering factors like healthcare accessibility, socio-economic status, and urban versus rural settings (Lyden & Binswanger, 2019; Balidemaj, 2021).

Recognizing the spatial characteristics of the opioid epidemic, this research incorporates advanced geostatistical analyses to explore the distribution patterns of opioid prescriptions quantified through morphine milligram equivalents (MME) of hydrocodone and oxycodone. By applying methods such as cluster analysis and Moran’s I, the study aims to identify regions of high opioid prescription densities and assess spatial autocorrelation to determine whether high-prescription areas are geographically clustered. In this analysis, the Drug Enforcement Agency’s Automation of Reports and Consolidated Orders System (ARCOS) dataset was used.The ARCOS data closely aligns with opioid dispensing reports from state prescription drug monitoring programs and has been utilized in previous studies (Madera, et al., 2023, Cabera, et al, 2019, Collins, et al., 2019). These analyses will provide critical insights into the spatial dynamics of the epidemic, helping to tailor public health interventions more effectively and efficiently.

Furthermore, this study will correlate MME rates per capita with key demographic variables such as healthcare facilities per capita, race, urbanization, median household income, gender, and median age, aiming to uncover the multi-layered drivers of opioid use. This comprehensive approach not only seeks to map the current landscape of opioid usage but also to inform future policy and clinical practices aimed at curbing the ongoing epidemic.


In this analysis, I developed a dataset, tract_mme_rates, derived ARCOS dataset, to bring out the geographic and demographic distribution of opioids across various communities. This dataset compiles opioid transaction metrics into weighted averages and total sums of MME per census tract, thereby offering a more granular view of distribution patterns adjusted for population sizes. Notably, the original data from the DEA lacked geocoding. For the purpose of geocoding, I utilized the Geocodio service, which successfully geocoded 99.6% of the 200,753 addresses tied to unique DEA buyer numbers. Addresses that could not be geocoded were predominantly located outside the continental United States, with the majority in Puerto Rico. While the opioid crisis may certainly be of concern in Puerto Rico, the current analysis will focus on the continental United States.

For demographic context, I selected the 2010 Census population data. This choice reflects an alignment with the midpoint of the study period (June 2012) and provides a representative demographic snapshot that simplifies the analysis.

A weighted average of opioid transaction size per capita was created by dividing the total MME of a census tract by the number of transactions in that tract, and then dividing the result by the 2010 population. These weighted averages reflect the volume of transactions, effectively scaling the impact of each transaction to its true magnitude. This methodological refinement enhances the accuracy of representing opioid distribution patterns by accounting for both the frequency and volume of transactions.

Additionally, I calculated per capita metrics of opioid distribution, normalized to 10,000 residents within each census tract. This normalization facilitates the comparison of opioid exposure across tracts with varying population densities, thereby pinpointing areas with disproportionately high levels of opioid distribution.

For the regression analysis, I included the number of healthcare facilities per capita as a covariate. I sourced the location data for healthcare facilities from OpenStreetMap, using the osmium tool to filter relevant entries:

osmium tags-filter us-latest.osm-2.pbf \                                                                                                            
  nwr/amenity=hospital \
  nwr/amenity=clinic \
  nwr/amenity=doctors \
  nwr/healthcare=doctor \
  nwr/healthcare=hospital \
  -o healthcare_facilities.osm.pbf
ogr2ogr -f GPKG healthcare_facilities.gpkg healthcare_facilities.osm.pbf

I then utilized the ogr2ogr tool to convert this data into a geopackage, which I subsequently imported into QGIS for integration into my database.

Additionally, I acquired rural/urban status, median household income, median age, and race, from the US Census Bureau.

For this project, I decided to exclude observations where the Morphine Milligram Equivalent (MME) rate is missing. This decision is based on the understanding that these missing values correspond to census tracts without healthcare facilities or pharmacies. Consequently, these tracts inherently report zero opioid distribution, which does not contribute to the analysis of opioid distribution patterns among areas with active opioid transactions. Including these tracts would introduce bias into the model, as they do not reflect the dynamics or factors influencing opioid availability and usage. By excluding these observations, we ensure that the analysis accurately reflects the distribution patterns and factors within tracts where opioid distribution occurs.

Exploring Clustering Patterns

To determine if local clusters were present, I conducted a cluster analysis using inverse distance spatial relationship, euclidean distance method, and 999 false discovery rate permutations. The resulting cluster map shows many small areas of high-rate clusters and several regional areas of low rate clusters. However, the Moran scatterplot created by the cluster analysis indicates an R2 value of 0. This low R2 value suggests that while the cluster analysis has identified specific areas with notably high or low opioid distribution rates, the overall explanatory power of geographical location in predicting these rates is minimal. In other words, despite the presence of localized clusters, the geographical location alone does not strongly predict the variation in opioid distribution across the entire study area. This may indicate that other non-spatial factors, such as socio-economic or policy differences, could play significant roles in influencing these distribution patterns, or that the spatial configuration of the data is complex and influenced by multiple interacting factors that are not captured solely by the geographical coordinates.

I then conducted a cluster analysis on the SEB smoothed MME rates using the inverse distance spatial relationship, Euclidean distance method, and 999 permutations of the false discovery rate correction. The resulting cluster map indicated several small areas of high-rate clusters and several regional areas of low-rate clusters. However, like the non-smoothed rate, the smoothed rate’s Moran scatterplot indicated an R2 of 0, indicating that there is likely more than a spatial relationship to the data at play.

The presence of statistically significant spatial autocorrelation in the rates, combined with the detection of localized high and low rate clusters, indicates that opioid distribution is not randomly distributed but exhibits a pattern influenced by geographic proximity. However, the low R2 value from the Moran’s scatterplots suggest that while the spatial distribution of opioid rates does follow a pattern, the specific geographic location alone explains very little of the overall variability in opioid distribution.

This apparent contradiction can be interpreted to mean that while opioids are clustered in certain areas more than would be expected by chance—indicating some level of dependency between neighboring locations—the specifics of these locations and the degree to which one can predict opioid distribution based on location alone are limited. This could imply that other underlying variables, which are spatially structured but not directly observed, are influencing these distributions. For instance, socioeconomic status, healthcare accessibility, local regulations, or cultural factors that correlate with geography might be driving the observed patterns of opioid distribution. We will be looking at these factors in the regression analysis section.

I have chosen to use the SEB smoothed rate as the dependent variable for subsequent analyses. This decision is grounded in the enhanced stability and reliability that SEB smoothing provides, particularly in handling variability associated with small area sizes and sparse data points (Assunção & Reis, 1999). The SEB method adjusts for these local discrepancies, offering a more consistent and representative measure of opioid distribution across diverse geographic settings. This smoothed variable reduces the impact of outliers and random fluctuations that can distort the analysis, making it particularly useful for developing more accurate models that are generalizable across different populations (Clayton & Kaldor, 1987). Additionally, using the SEB smoothed rate allows for a clearer interpretation of underlying trends and patterns, facilitating more robust conclusions and informed policy recommendations based on the geographic distribution of opioid usage.

Regression Modelling

Before implementing regression models, we must first determine if the variables are spatially autocorrelated. The results from the Moran’s I tests, which were statistically significant for most variables, are documented in the accompanying table.

Variable Moran_I Expectation Variance Std_Error Z_value P_value CI_lower CI_upper
Smoothed MME 0.0005987 -0.0000152 0.0000037 0.0019220 0.3194156 0.375 -0.0031684 0.0043659
Smoothed Weighted MME 0.0000017 -0.0000140 0.0000043 0.0020668 0.0075947 0.497 -0.0040492 0.0040526
Sm. Weighted MME Per 10k 0.0000069 -0.0000138 0.0000041 0.0020346 0.0101549 0.496 -0.0039809 0.0039946
Healthcare Facs. Per 10k 0.9179674 -0.0000138 0.0000059 0.0024242 378.6768105 < 0.001 0.9132160 0.9227188
Smoothed White Race % 0.8325875 -0.0000152 0.0000067 0.0025888 321.6123154 < 0.001 0.8275134 0.8376617
Smoothed Urban % 0.5662481 -0.0000526 0.0000313 0.0055968 101.1820818 < 0.001 0.5552783 0.5772180
Med. Household Income 0.5760376 -0.0000138 0.0000059 0.0024335 236.7147079 < 0.001 0.5712679 0.5808074
Smoothed Male % 0.1476578 -0.0000157 0.0000070 0.0026394 55.9490147 < 0.001 0.1424845 0.1528311
Median Age 0.3795698 -0.0000138 0.0000059 0.0024336 155.9766409 < 0.001 0.3748000 0.3843397

Implementation of Spatial Regression Models

To analyze the spatial dynamics of opioid distribution, I incorporated spatial regression models into the analysis. Specifically, I employed the Spatial Error Model (SEM) and the Spatial Lag Model (SLM), both of which are designed to handle spatial dependencies that are often overlooked in traditional Ordinary Least Squares (OLS) regression.

Spatial Error Model (SEM): The SEM is particularly effective at accounting for spatial autocorrelation present in the regression residuals. This model assumes that the error terms are spatially correlated, meaning that the error for one observation may be similar to errors of nearby observations. This correlation is modeled through a spatial weights matrix, which specifies the relationship between each observation and its neighbors. In our analysis, the spatial weights matrix was constructed using the Geoda software, identifying the four nearest neighbors for each observation, thus ensuring each observation’s influence is appropriately modeled. The SEM helps in providing more accurate and robust estimates, particularly important in the spatial analysis of opioid distribution patterns, where data points are not independently distributed but influenced by their spatial context.

Spatial Lag Model (SLM): Conversely, the SLM explicitly incorporates a spatially lagged dependent variable into the regression equation. This model captures the influence of neighboring regions directly, assuming that the value of the dependent variable for a given observation is influenced by the values of that variable for its neighbors. This approach is suited to understanding diffusion processes, such as how the opioid distribution in one area might influence neighboring areas, potentially reflecting spill-over effects or the movement of influencing factors across borders.

Model Selection and Application: In this study, variables included in the spatial models were selected based on their potential influence on opioid distribution patterns across census tracts. These included demographic factors such as the percentage of the population identifying as white, median age, and socioeconomic factors such as median household income and the rate of healthcare facilities per 10,000 residents. Initial models included multiple demographic indicators; however, due to multicollinearity, as indicated by high Variance Inflation Factors (VIF), adjustments were necessary. Specifically, certain overlapping demographic variables had to be removed to stabilize the model and enhance the precision of our estimates.

An evaluation of several statistical models was conducted to address the spatial characteristics inherent to the data. The Spatial Lag Model (SLM) was ultimately selected over the Ordinary Least Squares (OLS) and the Spatial Error Model (SEM) after a detailed comparative analysis, primarily based on its performance in handling spatial dependencies and its effectiveness in integrating socio-economic variables.

Rationale for Model Selection:

The decision to select the Spatial Lag Model (SLM) was driven by its robust ability to capture the spatial diffusion effects that are vital for understanding how opioid distribution in one area may influence adjacent areas. This model incorporates a spatially lagged dependent variable, directly modeling the influence of neighboring regions, which is particularly relevant given the geographic spread and local variations in opioid usage.

Comparative Analysis of Model Outcomes:

  1. Ordinary Least Squares (OLS):
    • R-squared: 0.979
    • AIC: 65102.5
    • Schwarz Criterion (BIC): 65157.7
    • Moran’s I (error): 80.5, p<0.001
    • Lagrange Multiplier (lag): 11364.3, p<0.001
    • Despite its high R-squared, OLS was found inadequate due to significant spatial dependence and heteroskedasticity, evidenced by the diagnostic tests. These flaws potentially compromise the reliability of the model’s outputs.
  2. Spatial Lag Model (SLM):
    • R-squared: 0.982
    • AIC: 53046.3
    • Schwarz Criterion (BIC): 53110.7
    • Spatial Lag LR Test: 12058.2, p<0.001
    • SLM not only showed an improvement over OLS in terms of fit but also effectively modeled spatial interactions, albeit the healthcare facilities variable showed non-significant results (p = 0.27489).
  3. Spatial Error Model (SEM):
    • R-squared: 0.983
    • AIC: 56424.1
    • Schwarz Criterion (BIC): 56479.3
    • Spatial Error LR Test: 678.4460, p<0.001
    • While SEM demonstrated a solid performance in handling unobserved spatial errors, it did not significantly outperform SLM concerning AIC and handling of spatial autocorrelation.

Discussion of SLM Coefficients and Model Fit:

The spatial lag coefficient (0.126) from the Spatial Lag Model (SLM) plays a crucial role in interpreting the interconnectedness and diffusion effects in the context of opioid distribution across census tracts. This coefficient, with a statistically significant z-value of 114.43 (p < 0.001), indicates a moderate yet meaningful influence of neighboring areas on opioid distribution metrics within a given census tract. The positive sign of this coefficient suggests that areas surrounded by regions with high opioid distribution rates are likely to exhibit similar characteristics. This diffusion effect highlights the spatial dependency and can be interpreted as a contagion effect, where high opioid distribution in one area potentially contributes to similar outcomes in adjacent areas.

Implications of Other Model Coefficients:

  • Median Age (Coefficient = 0.004, p < 0.001): The positive coefficient for median age, statistically significant with a z-value of 21.13, implies that areas with an older population tend to have higher levels of opioid distribution. This could reflect higher prescription rates among older adults due to chronic pain or other health issues.
  • Male Percentage (Coefficient = 0.144, p < 0.001): The substantial positive coefficient for the percentage of males in a population, with a remarkably high z-value of 923.22, suggests a significant association between the male demographic and opioid distribution rates. This might indicate higher opioid usage or prescription rates among males in the sampled regions.
  • White Percentage (Coefficient = 0.0009, p < 0.001): The coefficient for the percentage of those identifying as white is also positively associated with opioid distribution.
  • Median Household Income (Coefficient = -2.2177e-07, p < 0.001): The small negative coefficient, though significant (z-value = -4.53), suggests that as median household income slightly decreases, opioid distribution slightly increases, potentially indicating socioeconomic factors influencing health behaviors or access to healthcare.
  • Healthcare Facilities per 10k (Coefficient = -0.00124, p = 0.27): Although not statistically significant, the negative coefficient for healthcare facilities per 10,000 residents suggests a complex relationship where higher numbers of healthcare facilities do not correlate with increased opioid distribution. This could reflect regional variations in healthcare provision strategies, such as stricter prescribing practices or more robust healthcare resources leading to better management of opioid prescriptions.

Overall Impact on Research and Policy:

The findings from the SLM, particularly the spatial lag coefficient, underscore the necessity for targeted public health interventions that consider the spatial and demographic dynamics of opioid distribution. By acknowledging the contagion effect evident in the spatial lag, policymakers and health practitioners can design more geographically tailored strategies that address not only the areas with high rates of opioid distribution but also their neighboring regions to curb the spread effectively. This approach ensures that interventions are not only localized but also cognizant of the broader spatial network of influence, thereby maximizing their reach and effectiveness in combating the opioid crisis.


In evaluating the effectiveness of the Spatial Lag Model (SLM) and the Spatial Error Model (SEM) in managing spatial dependencies within our dataset, diagnostics reveal that neither model completely mitigates spatial dependence, as evidenced by the results from the Likelihood Ratio Tests (SLM: 𝜒2=12058.2223,𝑝<0.001; SEM: 𝜒2=8678.4460,𝑝<0.001). These tests indicate significant remaining spatial autocorrelation within the residuals of both models.

This persistence of spatial dependence, despite the use of models specifically designed to account for such effects, underscores the complexity of the spatial dynamics at play in opioid distribution. These findings suggest that the opioid distribution patterns may be influenced by additional spatial processes or interactions not captured by the current specifications of the SLM and SEM. For instance, higher-order spatial interactions or non-linear spatial relationships could be present and warrant further investigation.

The implications of these results are twofold. First, they suggest caution in the interpretation of the model coefficients, as the residual spatial dependence may bias the estimates. Second, they highlight the need for continued development and refinement of spatial analytical methods in public health research. Future studies might explore more complex models that incorporate higher-order spatial dependencies or apply non-linear modeling techniques to better capture the intricate patterns observed.

In conclusion, while the SLM and SEM provide valuable insights into the spatial aspects of opioid distribution, this analysis indicates that these models do not fully account for all spatial dependencies. This recognition not only informs an understanding of the current study’s limitations but also sets a clear agenda for further methodological research in this critical area of public health.


  • Assunção, R. M., & Reis, E. A. (1999). A new proposal to adjust Moran’s I for population density. Statistics in Medicine, 18(16), 2147-2162.
  • Balidemaj (2021). Public health strategies to combat opioid epidemic in the United States: A systematic review. European Journal of Public Health, 31(3).
  • Cabrera FF, Gamarra ER, Garcia TE, Littlejohn AD, Chinga PA, Pinentel-Morillo LD, Tirado JR, Chung DY, Pande LJ, McCall KL, Nichols SD, Piper BJ. Opioid distribution trends (2006-2017) in the US Territories. PeerJ. 2019.
  • Clayton, D., & Kaldor, J. (1987). Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics, 43(3), 671-681.
  • Judd, King, & Galke (2023). The Opioid Epidemic: A Review of the Contributing Factors, Negative Consequences, and Best Practices. Cureus,15(7)
  • Lyden & Binswanger (2019). The United States Opioid Epidemic. Semin Perinatal, 43(3), 123-131
  • Madera, J. D., Ruffino, A. E., Felix, A., McCall, K. L., Davis, C. S., & Piper, B. J. (2023). Decline but Pronounced State-Level Disparities in Prescription Opioid Distribution in the United States. Pharmacy

Similar Posts

Leave a Reply