Advances in Consumer Research
Issue 4 : 676-687
Original Article
Predicting Hospital Length of Stay Using Machine Learning and Spatial Analytics: A Large Open Health Dataset
1 PhD Scholar, IIC University of Technology, Cambodia (Enrolment No. FNR210604).
2 Professor at Prin. L. N. Welingkar Institute of Management Development and Research (PGDM), Mumbai.
3 Geomatics Scientist
4 London School of Management Education
Abstract

This mixed-methods study develops and validates machine learning models for predicting hospital length of stay (LOS) and cost, integrates Geographic Information System (GIS) spatial analytics for population-level healthcare governance, and examines cross-national transferability through qualitative validation with Indian healthcare professionals. In an analysis of 1,048,575 inpatient discharge records from the New York State SPARCS database, Random Forest regression achieved R² = 0.41 for LOS prediction (MAE = 3.18 days; 71.3% of predictions within ±3 days) and R² = 0.79 for cost prediction. Clinical classification variables, principally APR-DRG severity, accounted for 74.61% of feature importance versus 8.21% for demographics, a roughly nine-to-one ratio that identifies classification infrastructure as the binding constraint on prediction capability. Extreme-severity patients had stays 5.11 times as long as minor cases (15.58 vs. 3.05 days), males had 19% longer LOS than females (6.30 vs. 5.28 days, p < 0.0001), and patients aged 50+ had 68% longer stays. GIS integration extended individual-level predictions to spatial governance, identifying high-burden ZIP code hotspots (~120,000 cases in ZIP 112), a 4.7-fold severity-stratified LOS gradient across geographic units, and a 2-fold county-level cost disparity (New York County ~$41,000 vs. Clinton County ~$20,000), revealing equity gaps invisible to individual-level models. Cost-effectiveness analysis yielded a dominant strategy with a negative ICER of –$2,499.55 per bed-day avoided (ROI: 561,637%). In the qualitative validation, none of the seven Indian healthcare professionals interviewed were familiar with APR-DRG, exposing a structural classification gap with cascading consequences for reimbursement fairness, hospital benchmarking, and spatial resource allocation.
The findings support phased adoption of severity-adjusted classification frameworks adapted to India’s disease burden, integrated with spatial analytics for district-level healthcare governance.
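The headline ratios in the abstract can be recomputed from the figures it reports, and the two LOS error metrics it cites (MAE and the share of predictions within ±3 days) have simple definitions. The following is a minimal sketch, not the authors' code: the variable names, the helper functions `mae` and `within_tolerance`, and the four-patient toy arrays are all hypothetical and serve only to illustrate the arithmetic.

```python
# Figures copied from the abstract; variable names are illustrative only.
los_extreme, los_minor = 15.58, 3.05            # mean LOS in days by severity
los_male, los_female = 6.30, 5.28               # mean LOS in days by sex
imp_clinical, imp_demographic = 74.61, 8.21     # feature importance, percent
cost_new_york, cost_clinton = 41_000, 20_000    # approximate county costs, USD

severity_fold = los_extreme / los_minor             # about 5.11
sex_gap_pct = (los_male / los_female - 1) * 100     # about 19 percent
importance_ratio = imp_clinical / imp_demographic   # about 9.1, i.e. nine-to-one
county_ratio = cost_new_york / cost_clinton         # about 2


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted LOS (days)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def within_tolerance(y_true, y_pred, tol=3.0):
    """Fraction of predictions whose absolute error is at most `tol` days."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tol)
    return hits / len(y_true)


# Hypothetical toy data (NOT from the study), used only to exercise the helpers.
observed = [2, 5, 10, 4]
predicted = [3, 9, 8, 4]
toy_mae = mae(observed, predicted)
toy_within_3 = within_tolerance(observed, predicted, tol=3.0)

print(f"severity fold: {severity_fold:.2f}")
print(f"male vs. female LOS gap: {sex_gap_pct:.0f}%")
print(f"clinical vs. demographic importance: {importance_ratio:.1f} to 1")
print(f"county cost ratio: {county_ratio:.2f}")
print(f"toy MAE: {toy_mae}, toy share within ±3 days: {toy_within_3}")
```

The recomputed ratios agree with the rounded values stated in the abstract. Applying the same two helpers to a model's held-out predictions would give figures comparable in kind to the reported MAE and ±3-day accuracy.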

 

Volume 3, Issue 4
© Copyright Advances in Consumer Research