Building predictive models of cancer drug resistance requires longitudinal data (serial measurements over time), not just a single snapshot of tumor genetics. Models trained on single-timepoint data hit a hard performance ceiling regardless of algorithm sophistication.
OncoTraj is a public benchmark dataset of 813 lung cancer patients for predicting resistance to osimertinib treatment. It combines data from three clinical sources and defines three prediction tasks: whether a patient progresses within 12 months, how long it takes until progression, and which resistance mechanism emerges.
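The three tasks correspond to three kinds of prediction target: a binary label, a time-to-event pair, and a categorical label. The sketch below shows one way those targets might be derived from a single patient record. The `PatientRecord` fields, the function names, the inclusive 12-month cutoff, and the handling of censored patients are all assumptions made for illustration, not the actual OncoTraj schema or task definitions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PatientRecord:
    # Hypothetical fields; the real OncoTraj schema may differ.
    patient_id: str
    months_to_event: float     # time to progression, or to last follow-up if none seen
    progressed: bool           # True if progression was observed
    mechanism: Optional[str]   # resistance mechanism label, if one was recorded


def progression_within_12_months(r: PatientRecord) -> Optional[bool]:
    """Task 1 (binary): did the patient progress within 12 months?

    Returns None when follow-up ended at or before 12 months without
    progression, because the label is then unknown (assumed handling).
    """
    if r.progressed and r.months_to_event <= 12:
        return True
    if r.months_to_event > 12:
        # Either progressed later than 12 months, or still progression-free.
        return False
    return None


def time_to_progression(r: PatientRecord) -> Tuple[float, bool]:
    """Task 2 (time-to-event): duration plus an event-observed flag."""
    return (r.months_to_event, r.progressed)


def resistance_mechanism(r: PatientRecord) -> Optional[str]:
    """Task 3 (categorical): mechanism label, defined only after progression."""
    return r.mechanism if r.progressed else None
```

Returning `None` for patients whose follow-up ends before the cutoff keeps unknown outcomes out of the binary task instead of counting them as non-progressors. The benchmark may treat such cases differently.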