Ishanu Chattopadhyay
ML | Data Science Biomedical Informatics | Social Science | Assistant Professor
Zero-burden Risk Assessment for Test-free Screening &
Predictive Prognosis of Complex Diseases
Ishanu Chattopadhyay, PhD
Assistant Professor of Internal Medicine
Institute of Biomedical Informatics
University of Kentucky
The Laboratory for Zero Knowledge Discovery
mathematics
computer science
social science
medicine
Complex systems
AI/ML learning theory and applications
Implications of AI for the Future of Society
Dmytro Onishchenko, staff
Nathan Russel, PhD student
Postdoc TBD
Student Researcher TBD
collaborators
Alex Leow
Psychiatry UIC
Anna Podolanczuk, Pulmonary Care, Weill Cornell
Gary Hunninghake, Pulmonary C, Harvard
Robert Gibbons, Bio-statistics
Daniel Rubins, Anesthesia and Critical Care
Peter Smith, Pediatrics
Michael Msall Pediatrics
Fernando Martinez, Pulmonary Critical Care, Weill Cornell
James Mastrianni, Neurology
James Evans, sociology
Erika Claud, Pediatrics
Aaron Esser-Kahn Molecular Engineering
David Llewellyn
University of Exeter
Kenneth Rockwood
Dalhousie University
Andrew Limper Mayo Clinic
Publications
&
Impact
Nature Medicine
Nature Human Behaviour
Nature Communications
Science Advances
(3)
PNAS
JAMA
JAHA
JACC
"test-free" screening?
Lack of Universal Screening at the point of care
Early diagnosis is difficult, late or missed diagnosis costs lives
We lack Universal Screening
for most diseases
Known Co-morbidities of PF
Are there more? Subtle footprints in the medical history that are more heterogeneous?
Aim 1: Map AP Patient Journeys to Identify Risk Patterns in Acute and Recurrent Episodes.
Acute Pancreatitis
(with UK PRIME)
Aim 2: Model Transitions from AP to Type 3c Diabetes for Early Intervention.
Aim 3: Predict ICU Admission in AP Patients Based on Disease Severity Indicators.
Rapid Universal Point-of-care Screening for ILD/IPF Using Comorbidity Signatures in Electronic Health Records
Flag patients before they (or doctors) suspect
Primary Care
Pulmonologist
Zero-burden Co-morbid Risk Score (ZCoR)
Referral
Prognosis at Point-of-Diagnosis
Patient Journey
Early Diagnosis
Interstitial Lung Disease / Pulmonary Fibrosis
1
2
3
>50 years old
more men than women
Rare disease
~5 in 10,000
Post-Dx
Survival
~4 years
At least one misdiagnosis
~55%
Two or more misdiagnoses
38%
Initially attributed to age related symptoms:
72%
ZCoR
~ 4yrs
current survival ~4yrs
~ 4yrs
current clinical DX
ZCoR screening
Onishchenko, D., Marlowe, R.J., Ngufor, C.G. et al. Screening for idiopathic pulmonary fibrosis using comorbidity signatures in electronic health records. Nat Med 28, 2107–2116 (2022). https://doi.org/10.1038/s41591-022-02010-y
n=~3M
AUC~90%
Likelihood ratio ~30
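To make the likelihood ratio concrete, the sketch below shows how a positive likelihood ratio turns a pre-test probability into a post-test probability. Using the rare-disease prevalence quoted earlier (~5 in 10,000) as the pre-test probability is an assumption made for illustration, and the function name is hypothetical.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Bayes update in odds form: post-test odds = pre-test odds * LR."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    # Convert odds back to a probability.
    return post_odds / (1.0 + post_odds)


prevalence = 5 / 10_000   # assumed pre-test probability (from "~5 in 10,000")
lr_positive = 30.0        # likelihood ratio quoted on the slide
flagged_risk = post_test_probability(prevalence, lr_positive)
print(f"pre-test: {prevalence:.4%}  post-test: {flagged_risk:.4%}")
```

Under these assumptions a flagged patient's probability rises from 0.05% to roughly 1.5%. A flag is therefore a reason for referral, not a diagnosis.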
Conventional AI/ML attempts to model the physician
AI in IPF Research
ICD administrative codes
IPF
ILD
target codes appear
Past medical history
No target codes appear
case
control
2yrs
2yrs
prediction
Truven MarketScan (IBM) Commercial Claims & Encounters Database 2003-2018
>100M patients visible
>7B individual claims
>87K unique diagnostic codes
>7% Medicare data present
2,053,277 patients included in study
University of Chicago Medical Center 2012-2021
68,658 patients
Random sample from OptumLabs Data Warehouse, courtesy of Mayo Clinic
861,280 patients
2,983,215 patients
Data: Onishchenko et al., Nat. Med. 2022
patient A
patient B
patient C
Beyond "risk factors" to personalized risk patterns
Clinical Trial Cohort Selection
Current screen failure rate ~50-60%
ZCoR reduces screen failure rate to ~20%
cohort size: 2000
initial cohort size: 5000
initial cohort size with ZCoR: 2500
Cost per patient for confirmatory tests: ~7k USD
Savings: ~20M USD
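The cohort-size and savings figures above can be reproduced with a few lines of arithmetic. This is a sketch only: it assumes the upper end of the quoted failure range (60%) for the baseline, and the function and variable names are hypothetical.

```python
def patients_to_screen(target_enrolled: int, screen_failure_pct: int) -> int:
    """Patients who must enter screening so that target_enrolled pass.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    if not 0 <= screen_failure_pct < 100:
        raise ValueError("screen_failure_pct must be in [0, 100)")
    pass_pct = 100 - screen_failure_pct
    return -(-target_enrolled * 100 // pass_pct)


TARGET_COHORT = 2000          # final cohort size from the slide
COST_PER_PATIENT_USD = 7_000  # confirmatory tests per screened patient

baseline = patients_to_screen(TARGET_COHORT, 60)   # current failure rate, upper end
with_zcor = patients_to_screen(TARGET_COHORT, 20)  # failure rate with ZCoR
savings = (baseline - with_zcor) * COST_PER_PATIENT_USD

print(baseline, with_zcor, savings)
```

With these inputs the initial cohorts come out at 5000 and 2500 patients, and the saving is 17.5M USD, the same order as the ~20M USD quoted.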
Up to 4-year "signal" resolution
decreases risk
increases risk
Patient Journey: Tracking Risk over time
n = 3,294,608
average age: 57 years 2 months
Predicting Acute Pancreatitis
Autism
1 in 59
36
MCHAT/F
Alzheimer's Disease and Related Dementia
state of art with EHR:
~67% AUC*
ZCoR: ~87%
Preempting ADRD accurately up to a decade in the future
ZeD Lab: Predictive Screening from Comorbidity Footprints
CELL Reports
Disease | ZCoR | Competition |
---|---|---|
Autism | >83% | "obvious" |
Alzheimer's Disease | ~90% | 60-70% |
Idiopathic Pulmonary Fibrosis | ~90% | NA |
MACE | ~80% | ~70% |
Bipolar Disorder | ~85% | NA |
CKD | ~85% | NA |
Rare Cancers (Bladder, Uterus) | ~75-80% | Low |
Suicidality (with CAT-SS) | 98% PPV | Low |
Odds ratios combined via ML
1
Data
cases
control
odds ratios for all ICD codes
ML Model
odds-based risk estimator
minimize generalization error by constraining model capacity
Combining ~1000 features with constrained capacity models
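The first stage of this pipeline, per-code odds ratios computed from case and control cohorts, can be sketched as follows. This is not the ZCoR implementation. The cohorts are toy data, the 0.5 smoothing term (Haldane-Anscombe correction) is an assumption, and the capacity-constrained model that combines the features is not shown.

```python
import math
from collections import Counter


def code_odds_ratios(cases, controls, smoothing=0.5):
    """Per-code odds ratio of exposure in cases vs. controls.

    cases / controls: lists of sets of ICD codes, one set per patient.
    smoothing is added to every 2x2 cell so that codes missing from one
    cohort do not yield a zero or infinite ratio.
    """
    case_counts = Counter(code for patient in cases for code in patient)
    control_counts = Counter(code for patient in controls for code in patient)
    ratios = {}
    for code in set(case_counts) | set(control_counts):
        a = case_counts[code] + smoothing                      # cases with code
        b = len(cases) - case_counts[code] + smoothing         # cases without
        c = control_counts[code] + smoothing                   # controls with code
        d = len(controls) - control_counts[code] + smoothing   # controls without
        ratios[code] = (a * d) / (b * c)
    return ratios


def log_odds_features(patient_codes, ratios):
    """Log odds ratios for the codes a patient has; input to a downstream model."""
    return {c: math.log(ratios[c]) for c in patient_codes if c in ratios}


# Toy cohorts for illustration only (not study data).
cases = [{"K21.9", "I10"}, {"K21.9"}, {"K21.9", "J01.90"}, {"I10"}]
controls = [{"I10"}, {"J01.90"}, {"I10", "J01.90"}, {"K21.9"}]

ratios = code_odds_ratios(cases, controls)
features = log_odds_features({"K21.9", "I10"}, ratios)
print(ratios, features)
```

A code that is equally common in both cohorts gets a ratio of 1, which is a log-odds feature of 0. Codes over-represented among cases get positive features.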
Cloud Deployment
Theoretical formulation
Multi-cohort validation
Launch User-Accessible Platform
3 years
2 years
[
{
"patient_id": "P000038",
"sex": "F",
"birth_date": "01-01-2006",
"DX_record": [
{"date": "07-31-2006", "code": "Z38.00"},
{"date": "08-07-2006", "code": "P59.9"},
{"date": "08-29-2016", "code": "J01.90"},
{"date": "09-10-2016", "code": "J01.90"},
{"date": "11-14-2016", "code": "J01.91"}
],
"RX_record": [
{"date": "10-29-2011", "code": "rxLDA017"},
{"date": "05-16-2015", "code": "rxIDG004"},
{"date": "08-08-2015", "code": "rxIDG004"},
{"date": "06-04-2016", "code": "rxIDD013"}
],
"PROC_record": [
{"date": "02-05-2007", "code": "90723"},
{"date": "11-05-2007", "code": "J1100"}
]
}
]
{
"predictions": [
{
"error_code": "",
"patient_id": "P000012",
"predicted_risk": 0.005794344620009157,
"probability": 0.8253881317184486
}
],
"target": "TARGET"
}
Data In
Data Out
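A client would typically validate a record in the "Data In" shape before submitting it. The sketch below does that for a trimmed copy of the sample record. The MM-DD-YYYY date format is inferred from values such as "07-31-2006", and the function name and checks are hypothetical, not part of the platform.

```python
import json
from datetime import datetime

DATE_FORMAT = "%m-%d-%Y"  # assumed: inferred from the sample record
RECORD_KEYS = ("DX_record", "RX_record", "PROC_record")


def parse_patient(raw: dict) -> dict:
    """Parse dates and merge all record types into one sorted event list."""
    birth = datetime.strptime(raw["birth_date"], DATE_FORMAT).date()
    events = []
    for key in RECORD_KEYS:
        for entry in raw.get(key, []):
            when = datetime.strptime(entry["date"], DATE_FORMAT).date()
            if when < birth:
                raise ValueError(f"{key} entry dated before birth: {entry}")
            events.append((when, key, entry["code"]))
    events.sort()  # chronological; ties broken by record type, then code
    return {"patient_id": raw["patient_id"], "sex": raw["sex"],
            "birth_date": birth, "events": events}


payload = json.loads("""
[{"patient_id": "P000038", "sex": "F", "birth_date": "01-01-2006",
  "DX_record": [{"date": "07-31-2006", "code": "Z38.00"},
                {"date": "08-07-2006", "code": "P59.9"}],
  "RX_record": [{"date": "10-29-2011", "code": "rxLDA017"}],
  "PROC_record": [{"date": "02-05-2007", "code": "90723"}]}]
""")

patient = parse_patient(payload[0])
for when, kind, code in patient["events"]:
    print(when.isoformat(), kind, code)
```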
Cohort Selection and Risk Analysis Testbed
Misleading Diagnosis of Idiopathic Pulmonary Fibrosis: A Clinical Concern
Javier Ramos-Rossy, MD, Onix Cantres-Fonseca, MD, Ginger Arzon-Nieves, Yomayra Otero-Dominguez, MD, Stella Baez-Corujo, MD, and William Rodríguez-Cintrón, MD
Questions.
ishanu_ch@uky.edu
target codes appear
Past medical history
No target codes appear
case
control
2yrs
2yrs
IPF drugs prescribed
Signature of IPF diagnostic sequence
pirfenidone or nintedanib
ICD Codes can be noisy
"cases" are not always true IPF
PFSAs
from code sequences
Model control and case cohorts separately
Given a new test case, compute the likelihood of the sample arising from case models vs. control models
sequence likelihood defect
Huang, Yi, Victor Rotaru, and Ishanu Chattopadhyay. "Sequence likelihood divergence for fast time series comparison." Knowledge and Information Systems 65, no. 7 (2023): 3079-3098.
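The idea of comparing a new sequence against separately fitted case and control models can be sketched as below. This is a simplified illustration. It uses a first-order Markov chain with Laplace smoothing as a stand-in for a PFSA, and a plain log-likelihood difference as a stand-in for the sequence likelihood defect defined in the cited paper. All names and sequences are hypothetical toy values.

```python
import math

ALPHABET = (0, 1, 2)


def fit_markov(sequences, alpha=1.0):
    """Estimate transition probabilities with add-alpha (Laplace) smoothing."""
    counts = {s: {t: alpha for t in ALPHABET} for s in ALPHABET}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {s: {t: counts[s][t] / sum(counts[s].values()) for t in ALPHABET}
            for s in ALPHABET}


def log_likelihood(seq, model):
    """Log-probability of the observed transitions under the model."""
    return sum(math.log(model[p][n]) for p, n in zip(seq, seq[1:]))


def case_vs_control_score(seq, case_model, control_model):
    """Positive when the sequence is better explained by the case model."""
    return log_likelihood(seq, case_model) - log_likelihood(seq, control_model)


# Toy symbol streams for illustration only (not study data).
case_sequences = [[0, 1, 1, 0, 1, 1, 1], [1, 1, 0, 1, 1, 0, 1]]
control_sequences = [[0, 0, 0, 2, 0, 0, 0], [0, 0, 2, 0, 0, 0, 0]]

case_model = fit_markov(case_sequences)
control_model = fit_markov(control_sequences)

score_a = case_vs_control_score([0, 1, 1, 1, 0, 1], case_model, control_model)
score_b = case_vs_control_score([0, 0, 0, 2, 0, 0], case_model, control_model)
print(score_a, score_b)
```

Fitting the two cohorts separately means each model only has to describe its own population. The comparison happens at scoring time.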
Off-the-shelf AI does not suffice
How?
Odds ratios combined via ML
1
Data
cases
control
odds ratios for all ICD codes
ML Model
odds-based risk estimator
Probabilistic Finite State Automata (PFSA)
Map health history to trinary streams
Chattopadhyay, Ishanu, and Hod Lipson. "Abductive learning of quantized stochastic processes with probabilistic finite automata." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 371, no. 1984 (2013): 20110543.
2
Longitudinal stochastic patterns
Timestamped Diagnostic Data
choose disease category
(e.g. infections)
(specialized HMMs)
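One way to map timestamped diagnostic data to a trinary stream for a chosen disease category is sketched below. The weekly resolution, the symbol meanings (0 = no diagnosis, 1 = diagnosis in the chosen category, 2 = other diagnosis), the category membership, and the extra "I10" record are all assumptions made for illustration.

```python
from datetime import date


def to_trinary_stream(records, category_codes, start, n_weeks):
    """Encode (date, code) records as one symbol per week.

    0: no diagnosis that week
    1: at least one diagnosis in the chosen category
    2: diagnoses present, but none in the chosen category
    """
    stream = [0] * n_weeks
    for when, code in records:
        week = (when - start).days // 7
        if not 0 <= week < n_weeks:
            continue  # outside the observation window
        if code in category_codes:
            stream[week] = 1          # category code takes precedence
        elif stream[week] == 0:
            stream[week] = 2
    return stream


# Hypothetical category definition; real categories would span many codes.
infections = {"J01.90", "J01.91"}

records = [
    (date(2016, 8, 29), "J01.90"),
    (date(2016, 9, 10), "J01.90"),
    (date(2016, 10, 3), "I10"),     # toy record, not from the sample patient
    (date(2016, 11, 14), "J01.91"),
]

stream = to_trinary_stream(records, infections, start=date(2016, 8, 29), n_weeks=12)
print(stream)
```

Each disease category yields its own stream per patient, so a single history becomes a small set of symbol sequences that sequence models can consume.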
Modeling & predicting complex social interactions
Point-of-care screening for complex diseases
AI
Electronic Healthcare Record
IPF
ASD
ADRD
ZeDlab Research Thrusts
General framework for inferring digital twins in biology and medicine
>5 Million in US. >13 Million in next 10 years
Alzheimer's Disease and Related Dementia
MOCA, Blood Tests
Current Practice:
state of art with EHR:
~67% AUC*
ZCoR: ~87%
By Ishanu Chattopadhyay
AI for medicine