AML Combo Validation
A mechanism-aware, patient-specific combination therapy predictor for acute myeloid leukemia.
Does mechanism-informed combination prediction outperform the best-predicted single drug for AML patients? This project answers that question, identifies the precision-medicine subpopulation where combinations win, and ships an end-to-end clinical kit pipeline for new patients.
Headline findings
| Metric | Value |
|---|---|
| Single-drug baseline (per-patient Spearman) | ρ = 0.704 — gate ≥ 0.40 passed |
| FLT3-mutant AML (n=179) combo vs best single | Δ = +16.67 AUC units [95% CI 14.98–18.19] |
| FLT3-mutant win rate | 89.9% of patients |
| FLT3 wild-type (n=434) | Δ = −14.14 (no combo advantage) |
| Permutation p-value | < 0.001 |
| TCGA-LAML validation | Top picks match published clinical combos |
| Clinical drugs annotated | 20/20 |
| Tests passing | 86/86 |
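The Δ, confidence-interval, win-rate, and permutation rows above can be reproduced in spirit with a few lines of NumPy. The following is a minimal sketch, not the project's actual analysis code: the function name `paired_delta_summary`, the percentile bootstrap, and the sign-flip permutation scheme are all assumptions chosen for illustration. It takes per-patient Δ values with the same sign convention as the table (positive means the combination is predicted to do better).

```python
import numpy as np

def paired_delta_summary(delta, n_boot=2000, n_perm=2000, seed=0):
    """Summarize per-patient deltas (hypothetical helper, not the repo's API).

    delta: one value per patient, positive = combination beats best single drug.
    """
    rng = np.random.default_rng(seed)
    delta = np.asarray(delta, dtype=float)
    n = delta.size
    mean_delta = delta.mean()

    # Percentile bootstrap: resample patients with replacement,
    # recompute the mean each time, take the 2.5th and 97.5th percentiles.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = delta[idx].mean(axis=1)
    ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

    # Sign-flip permutation test (assumed scheme): under the null of no
    # combination advantage, each patient's delta is equally likely to be
    # positive or negative, so signs are flipped at random.
    signs = rng.choice([-1.0, 1.0], size=(n_perm, n))
    perm_means = (signs * delta).mean(axis=1)
    # Add-one correction keeps the estimated p-value strictly above zero.
    exceed = np.sum(np.abs(perm_means) >= abs(mean_delta))
    p_value = (exceed + 1) / (n_perm + 1)

    return {
        "mean_delta": float(mean_delta),
        "ci95": (float(ci_low), float(ci_high)),
        "win_rate": float((delta > 0).mean()),
        "p_value": float(p_value),
    }
```

With 2,000 permutations the smallest reportable p-value is 1/2001, so a result of "< 0.001" corresponds to no permuted mean reaching the observed one. Running the function separately on the FLT3-mutant and wild-type subgroups would give one row per subgroup.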
Top combination recommendations
For FLT3-mutant AML patients, the model's most frequent rank-1 picks are:
- Gilteritinib + Venetoclax (115 patients) — matches NCT03625505, LACEWING rationale
- Quizartinib + Venetoclax (100 patients) — matches NCT03441555
- Trametinib + Venetoclax (82 patients) — parallel-pathway MEK + BCL2 block
All align with ongoing or completed AML clinical trials.
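A tally like the one above can be produced by counting each patient's top-ranked pair. The sketch below assumes a hypothetical input shape, a dict mapping patient ID to a ranked list of `(drug_a, drug_b)` tuples; the repository's real output structure may differ.

```python
from collections import Counter

def tally_rank1_pairs(ranked_by_patient):
    """Count how often each drug pair is a patient's rank-1 pick.

    ranked_by_patient: dict of patient_id -> list of (drug_a, drug_b)
    tuples, best first (assumed structure, for illustration only).
    """
    counts = Counter()
    for ranked_pairs in ranked_by_patient.values():
        if not ranked_pairs:
            continue  # patient with no scored combinations
        # Sort the pair so (A, B) and (B, A) count as the same combination.
        counts[tuple(sorted(ranked_pairs[0]))] += 1
    return counts.most_common()
```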
Documents
📘 Research
- Pre-registered thesis — one-line hypothesis + four acceptable outcomes
- Manuscript draft — full paper with methods, results, discussion
- Week-4 head-to-head summary — Δ distribution + subgroup stats
- Week-5 TCGA independent-cohort validation
🧪 Clinical kit
- Kit readiness report — 104-dim features, ELN computer, top-3 combo API
- RNA-Seq SOP for upstream pipeline — STAR + featureCounts requirements and OOD-detection layers
- Induction-response validation — honest negative finding vs 7+3 CR
🔬 Research pipelines (3+ drug extensions)
- Path A — Clonal coverage × IDA — N-drug scoring via patient-specific clonal decomposition
- Path B — Set Transformer — arbitrary-arity neural architecture
- Path D — Higher-Order Factorization feasibility — tensor-decomposition benchmark
- Route C — Clinical regimen retrieval — trial-matched evidence layer
📂 Drug knowledge base
- Drug mechanism annotation v2 — 32 drugs × 39 mechanism axes
📅 Development log
- Week 1 — Day 1: scaffold
- Week 1 — Day 2: Baseline A single-drug MLP
- Week 1 — Day 3: DrugComb ETL
- Week 1 — Day 4: TCGA-LAML ETL
- Week 1 — Day 5: integration + Week-3 smoke test
Using the kit
For a new AML patient:
```python
import pandas as pd

from combo_val.clinical.kit_schema import KitInput, MutationCall
from combo_val.clinical.kit_predict import predict_for_patient, pretty_print_kit_output

kit = KitInput(
    patient_id="P-2026-0001",
    mutations=[
        MutationCall(gene="FLT3", is_ITD=True, allelic_ratio=0.58),
        MutationCall(gene="NPM1"),
    ],
    karyotype_text="46,XX[20]",
    wbc=90, platelet=35, hemoglobin=8.1, ldh=1100,
    age=48, sex="female",
    blast_pct_bm=72,
    is_initial_diagnosis=True,
)

# gene_symbol → raw count, matching BeatAML pipeline (STAR + featureCounts)
rna_counts = pd.read_csv("patient_counts.tsv", sep="\t").set_index("gene")["count"]

out = predict_for_patient(rna_counts, kit)
print(pretty_print_kit_output(out))
```
The output includes the ELN 2017 risk class, top-5 combination recommendations with predicted AUC and mechanism scores, top-5 single-drug picks, driver-specific cautions, and RNA-Seq QC confidence notes.
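Because the kit expects raw counts keyed by gene symbol, it may help to sanity-check the input Series before calling `predict_for_patient`. The check below is a minimal sketch and is not part of the kit: `check_raw_counts` is a hypothetical helper, and it does not replace the kit's own QC and OOD-detection layers.

```python
import pandas as pd

def check_raw_counts(counts: pd.Series) -> list:
    """Return a list of problems found in a gene -> raw count Series.

    Hypothetical pre-flight check; an empty list means nothing obvious is wrong.
    """
    problems = []
    if counts.index.has_duplicates:
        problems.append("duplicate gene symbols in index")
    if counts.isna().any():
        problems.append("missing count values")

    # Coerce to numbers; anything that fails to parse becomes NaN.
    values = pd.to_numeric(counts, errors="coerce")
    if values.isna().sum() > counts.isna().sum():
        problems.append("non-numeric count values")

    present = values.dropna()
    if (present < 0).any():
        problems.append("negative count values")
    # Fractional values suggest normalized data (e.g. TPM) rather than raw counts.
    if not (present % 1 == 0).all():
        problems.append("non-integer count values")
    return problems
```

A typical use would be to call `check_raw_counts(rna_counts)` right after loading the TSV and stop if the returned list is non-empty.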
Data sources
| Source | Role | Size |
|---|---|---|
| BeatAML 2.0 (Bottomly et al., Cancer Cell 2022) | Training + internal validation | 613 patients × 165 drugs (~55K AUC measurements) |
| DrugComb v1.5 (Zagidullin et al., NAR 2019) | Combination synergy training | 186 strict AML pairs (ALMANAC on HL-60) |
| TCGA-LAML PanCancer Atlas 2018 | Independent cohort | 173 patients |
Citation
Manuscript in preparation. For now, cite the repository:
Tom, E. (2026). AML Combo Validation: mechanism-aware combination therapy prediction for AML.
https://github.com/Smugpigeon/AML-Combo-Validation
Contact
Erick Tom — ericktom94720@gmail.com