Johnson & Johnson

Predicting Alzheimer's Disease Progression

🏢 Industry Research, Johnson & Johnson Internship (Unpublished)

Alzheimer's disease doesn't affect every cognitive ability equally. Memory, language, and spatial reasoning can decline at very different rates in different patients. Most prediction models only forecast the total ADAS-Cog score, collapsing all that heterogeneity into a single number. During my summer internship at Johnson & Johnson, I built Gaussian Process models with ordinal likelihood that predict each of the 11 individual ADAS-Cog subscores. This lets trial designers and clinicians see how specific cognitive abilities change over time, not just the total.

ADAS-Cog gold-standard cognitive endpoint in AD trials
Ordinal likelihood respects ordered score structure
GP models provide calibrated uncertainty bands
Subscore level predictions across 5 cognitive domains

Why predict individual subscores?

Most Alzheimer's models focus on the total ADAS-Cog score (0–70), which sums across all cognitive domains. But this total hides important patterns. One patient might be losing language skills rapidly while memory stays relatively intact; another might show the opposite trajectory. Predicting all 11 subscores individually lets us:

  • Spot which cognitive domains are declining fastest for a given patient, so interventions can be targeted accordingly
  • Detect early signs that a treatment is helping one aspect of cognition even if the total score hasn't moved much
  • Design smarter clinical trials by enriching for patients whose specific decline patterns match the drug's mechanism

How we approached it

Each ADAS-Cog subscore is an ordered integer. For example, a Word Recall score of 5 is meaningfully "between" 4 and 6, not just a category label. A standard Gaussian likelihood ignores this structure and can produce predictions like 4.7 that don't correspond to any real score. So we used an ordinal likelihood that maps the latent GP function through learned thresholds to produce proper probability distributions over each discrete score level.

We compared five GP configurations, from a continuous-response baseline to a full variational GP with ordinal likelihood, using 5-fold cross-validation to see which setup gives the best predictions with well-calibrated uncertainty bands.

ADAS-Cog 11 Subscales

ADAS-Cog 11 subscales breakdown by cognitive domain with maximum scores
ADAS-Cog 11 subscales. The assessment covers 5 cognitive domains: Memory (Word Recall, Delayed Word Recall, Orientation, Word Recognition), Language (Commands, Naming, Spoken Language, Word Finding), Visuospatial (Constructional Praxis), Praxis (Ideational Praxis), and Attention (Remembering Test Instructions). Total score range: 0–70 (higher = worse cognition).

Modeling Pipeline

Modeling pipeline: CPAD Database → Feature Engineering → GP Models → 11 ADAS-Cog Subscore Predictions
End-to-end pipeline. Patient-level data from CPAD (demographics, APOE4, MMSE, visit timeline) feeds into 5 GP model configurations evaluated with 5-fold cross-validation. Each configuration outputs uncertainty-aware predictions for all 11 ADAS-Cog subscores.
Illustrative GP-predicted ADAS-Cog trajectories with uncertainty bands for different severity levels
Illustrative GP trajectories. ADAS-Cog total score predictions over 36 months for Normal Cognition (green), Mild AD (blue), and Moderate AD (red). Shaded bands represent 68% and 95% prediction intervals. Uncertainty grows with prediction horizon, reflecting genuine clinical unpredictability in long-term disease progression.

Ordinal Likelihood

Why ordinal matters

ADAS-Cog subscores are ordered integers (e.g., Word Recall: 0-10). Treating them as continuous means a prediction of "4.7" has no clinical meaning. The ordinal likelihood maps the GP's latent function through ordered thresholds \(\tau_k\) to produce proper probabilities for each score level:

\[ p(y_i = k \mid f_i) = g\big(\tau_k - f_i\big) - g\big(\tau_{k-1} - f_i\big) \]

where \(\tau_0 = -\infty\), \(\tau_K = +\infty\), the interior thresholds \(\tau_1 < \cdots < \tau_{K-1}\) are learned, and \(g = \Phi\) is the standard normal CDF (probit link).

What this gives us

The ordinal model naturally captures something intuitive: if the true score is 5, the model should be more likely to guess 4 or 6 than 0 or 10. This "nearness-aware" structure produces clinically meaningful uncertainty estimates. A wide band between 4 and 7 means something very different from a wide band between 1 and 10.

Combined with the GP's flexible longitudinal modeling, this gives us patient-specific progression curves with properly calibrated confidence intervals, which is what clinical trial planners need for power calculations and what physicians need for treatment conversations.