Advances in Consumer Research
Issue 4 : 676-687
Original Article
Predicting Hospital Length of Stay Using Machine Learning and Spatial Analytics: A Large Open Health Dataset
1 PhD Scholar, IIC University of Technology, Cambodia (Enrolment No. FNR210604).
2 Professor at Prin. L. N. Welingkar Institute of Management Development and Research (PGDM), Mumbai.
3 Geomatics Scientist
4 London School of Management Education
Abstract

This mixed-methods study develops and validates machine learning models for hospital length of stay (LOS) and cost prediction, integrates Geographic Information System (GIS) spatial analytics for population-level healthcare governance, and examines cross-national transferability through qualitative validation with Indian healthcare professionals. Analysing 1,048,575 inpatient discharge records from the New York State SPARCS database, Random Forest regression achieved R² = 0.41 for LOS prediction (MAE = 3.18 days, 71.3% accuracy within ±3 days) and R² = 0.79 for cost prediction. Clinical classification variables, principally APR-DRG severity, accounted for 74.61% of feature importance versus 8.21% for demographics, a roughly nine-to-one ratio that identifies classification infrastructure as the binding constraint on prediction capability. Extreme-severity patients had 5.11-fold longer stays than minor cases (15.58 vs. 3.05 days), males had 19% longer LOS than females (6.30 vs. 5.28 days, p < 0.0001), and patients aged 50+ had 68% longer stays. GIS integration extended individual-level predictions to spatial governance, identifying high-burden ZIP code hotspots (~120,000 cases in ZIP 112), a 4.7-fold severity-stratified LOS gradient across geographic units, and a 2-fold county-level cost disparity (New York County ~$41,000 vs. Clinton County ~$20,000), revealing equity gaps invisible to individual-level models. Cost-effectiveness analysis yielded a dominant strategy with a negative ICER of –$2,499.55 per bed-day avoided (ROI: 561,637%). Qualitative validation revealed complete APR-DRG unfamiliarity among all seven Indian healthcare professionals, exposing a structural classification gap with cascading consequences for reimbursement fairness, hospital benchmarking, and spatial resource allocation.
The findings support phased adoption of severity-adjusted classification frameworks adapted to India’s disease burden, integrated with spatial analytics for district-level healthcare governance.
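
The evaluation measures named in the abstract (R², MAE, accuracy within ±3 days, and the ICER) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names and the toy LOS values are hypothetical and chosen only for illustration, and the ICER is computed under the standard definition (incremental cost divided by incremental effect), which the abstract does not spell out. The two ratio checks at the end use only figures quoted in the abstract.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target (days for LOS)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def within_tolerance(y_true, y_pred, tol=3.0):
    """Share of predictions that fall within +/- tol of the observed value."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


def icer(delta_cost, delta_effect):
    """Incremental cost-effectiveness ratio (assumed standard definition).

    A negative value with a positive effect means the intervention both
    saves money and improves the outcome, i.e. it is a dominant strategy.
    """
    return delta_cost / delta_effect


# Hypothetical toy data (NOT from the SPARCS dataset).
actual_los = [2, 4, 5, 7, 12]
predicted_los = [3, 4, 6, 5, 7]

toy_mae = mae(actual_los, predicted_los)
toy_r2 = r_squared(actual_los, predicted_los)
toy_acc = within_tolerance(actual_los, predicted_los, tol=3.0)

# Ratio checks using only figures quoted in the abstract.
severity_fold = round(15.58 / 3.05, 2)    # extreme vs. minor severity LOS
importance_ratio = round(74.61 / 8.21, 1)  # clinical vs. demographic importance
```

On real data the same three functions would be applied to held-out predictions from the fitted model; the tolerance of 3 days mirrors the ±3-day window reported in the abstract.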

 

Volume 3, Issue 4
© Copyright Advances in Consumer Research