Individual Patient Data (IPD) NMA

Author

Xiaoge Zhang

Published

June 5, 2026

The Gold Standard for Handling Heterogeneity in Evidence Synthesis

1 Executive Summary: Why IPD is the Gold Standard

In Standard Network Meta-Analysis (NMA), we synthesize Aggregate Data (AD)—summary statistics like Log-Hazard Ratios or Log-Odds Ratios extracted from published literature. While AD NMA is standard practice, it suffers from a critical limitation: the Ecological Fallacy.

When processing trial-level summaries, we cannot accurately adjust for patient-level covariates (e.g., age, baseline severity, biomarker status). If Trial 1 enrolled systematically older, sicker patients than Trial 2, direct comparison of their aggregate results will be biased, and trial-level meta-regression is often underpowered to correct it.

Individual Patient Data (IPD) NMA solves this by pooling the raw, row-by-row patient data from multiple trials. It allows us to:

  1. Adjust for patient-level covariates directly in the regression model.
  2. Standardize outcome definitions and follow-up times across different trials.
  3. Conduct reliable subgroup analyses (e.g., precise treatment effect estimation in patients with biomarker X).

1.1 Data Acquisition Paths: How to get Competitor IPD?

In HTA submissions, a manufacturer usually possesses the IPD for their own product’s trials. However, obtaining the IPD for competitor trials is a major challenge. The industry-standard paths for lawful data acquisition include:

  1. Independent Data Sharing Platforms:
    • YODA (Yale University Open Data Access): Yale partners with companies like Johnson & Johnson to share clinical trial data.
    • CSDR (ClinicalStudyDataRequest.com): A multi-sponsor platform sharing data from companies like GSK, Novartis, and Roche.
    • Project Data Sphere: A platform specifically focused on sharing historical patient-level data from oncology trials.
    • Vivli: One of the largest global platforms for searching and requesting clinical trial data.
  2. The Acquisition Process: Access is not guaranteed and requires submitting a detailed Statistical Analysis Plan (SAP). Once approved by an independent review panel, researchers typically analyze the data within a secure, closed cloud environment (sandbox) where data downloading is strictly prohibited.

1.2 Methodological Differences: IPD vs. AD NMA

The transition from Aggregate Data (AD) to Individual Patient Data (IPD) represents a fundamental shift in evidence synthesis methodology:

  • Data Granularity: AD uses summary statistics (e.g., a trial’s average age is 55). IPD uses the exact characteristics of every single patient (e.g., Patient A is 45, Patient B is 65).
  • Ecological Fallacy: AD models are vulnerable to ecological bias when estimating treatment-by-covariate interactions. IPD models are immune to this fallacy by analyzing within-trial patient variability.
  • Covariate Adjustment: In AD NMA, meta-regression is underpowered because the sample size is the number of studies. In IPD NMA, the sample size is the total number of patients, providing massive statistical power to detect effect modifiers.
  • Endpoint Standardization: AD NMA is forced to use whatever outcome definitions were published in the papers (which often vary). IPD NMA allows researchers to apply a single, unified cleaning rule to raw data across all trials to calculate perfectly standardized endpoints.

2 The Minimal Mock Case: Binary Outcomes with Patient Covariates

To demonstrate IPD NMA, let’s consider a scenario evaluating treatments for a chronic disease with a binary outcome. For example, in autoimmune diseases like Rheumatoid Arthritis or Psoriasis, we often measure Clinical Response (meaning the patient achieved a predefined threshold of improvement, such as a 50% reduction in symptoms, coded as 1) vs. No Response (failing to meet the improvement threshold, coded as 0).

2.1 The Scenario & Treatments

We have an evidence network with 3 treatments:

  • Treatment 1: Placebo (Reference)
  • Treatment 2: Drug A
  • Treatment 3: Drug B

We have successfully obtained the raw patient data from 2 large trials:

  • Study 1: Compares Placebo vs Drug A.
  • Study 2: Compares Placebo vs Drug B.

2.2 Extracted Data (Patient-Level)

Instead of extracting single effect sizes per study, our dataset has one row per patient (e.g., \(N=1000\) rows total). We observe:

  • \(y_i \in \{0, 1\}\): The binary clinical response for patient \(i\).
  • \(trt_i \in \{1, 2, 3\}\): The treatment received by patient \(i\).
  • \(study_i \in \{1, 2\}\): The study patient \(i\) belongs to. - \(x_{i}\): A patient-level continuous covariate (e.g., Baseline Disease Severity Score).

2.3 Regression Input Data

In a One-Stage IPD NMA, we fit a single hierarchical logistic regression model to all patients simultaneously. The probability of response \(p_i\) for patient \(i\) is modeled as:

\[logit(p_i) = \mu_{study_i} + \delta_{trt_i} + \beta \cdot (x_i - \bar{x}_{study_i})\]

Where: - \(\mu_{study_i}\): Study-specific baseline log-odds (handled as fixed or random effects). - \(\delta_{trt_i}\): The relative treatment effect of the patient’s assigned drug vs Placebo (with \(\delta_1 \equiv 0\)). - \(\beta\): The prognostic effect of the covariate \(x_i\) on the clinical outcome. Centering \(x_i\) by the study mean \(\bar{x}_{study_i}\) is a crucial statistical technique to separate true within-study patient effects from across-study ecological confounding.

2.4 Results

The model outputs the covariate-adjusted treatment effects (\(\delta_2\) for Drug A and \(\delta_3\) for Drug B). Because we adjusted for individual disease severity (\(x_i\)), these estimates are robust to baseline imbalances between Study 1 and Study 2—a level of precision impossible to achieve with standard AD NMA.

3 Methodological Foundation: One-Stage vs. Two-Stage Models

When conducting IPD NMA, statisticians must choose between two distinct modeling architectures:

  1. One-Stage Approach: All patient data from all trials are pooled into a single hierarchical regression model (as demonstrated in the mock case). This is statistically elegant, preserves all individual-level correlations exactly, and readily accommodates complex covariate interactions (e.g., treatment-by-covariate interactions \(\delta_{trt_i} \cdot x_i\)).
  2. Two-Stage Approach:
    • Stage 1: Fit a separate regression model within each study independently to estimate the aggregate treatment effect (e.g., log-Odds Ratio) and its standard error, adjusted for covariates.
    • Stage 2: Feed these adjusted aggregate estimates into a standard AD NMA model.

4 The Stan Model for One-Stage IPD NMA

We implement a One-Stage Bayesian Hierarchical Logistic Regression for the IPD NMA.

data {
  int<lower=1> N;                // Total number of individual patients
  int<lower=0,upper=1> y[N];     // Binary outcome (1 = Response, 0 = No Response)
  int<lower=1> NS;               // Number of studies
  int<lower=1> study[N];         // Study ID for each patient
  int<lower=1> NT;               // Number of treatments
  int<lower=1> trt[N];           // Treatment ID for each patient
  vector[N] x;                   // Patient-level covariate (e.g., Baseline Severity)
}

transformed data {
  // Center covariate within each study to avoid ecological bias
  vector[N] x_centered;
  for (j in 1:NS) {
    real sum_x = 0;
    int n_j = 0;
    for (i in 1:N) {
      if (study[i] == j) {
        sum_x += x[i];         // Accumulate: sum_x = sum_x + x[i]
        n_j += 1;
      }
    }
    real mean_x_j = sum_x / n_j;
    for (i in 1:N) {
      if (study[i] == j) {
        x_centered[i] = x[i] - mean_x_j;
      }
    }
  }
}

parameters {
  vector[NS] mu;                 // Study-specific baselines (fixed effects for simplicity)
  vector[NT-1] delta_raw;        // Treatment effects for trt 2, 3, ..., NT
  real beta;                     // Prognostic effect of the covariate
}

transformed parameters {
  vector[NT] delta;              // Full treatment effect vector including reference
  delta[1] = 0.0;                // Reference treatment (Placebo) fixed to 0
  for (k in 2:NT) {
    delta[k] = delta_raw[k-1];
  }
}

model {
  // Priors
  mu ~ normal(0, 5);             // Weakly informative prior for baselines
  delta_raw ~ normal(0, 5);      // Weakly informative prior for relative effects
  beta ~ normal(0, 5);           // Weakly informative prior for covariate effect

  // Likelihood (One-Stage Hierarchical Logistic Regression)
  vector[N] logit_p;
  for (i in 1:N) {
    logit_p[i] = mu[study[i]] + delta[trt[i]] + beta * x_centered[i];
  }
  y ~ bernoulli_logit(logit_p);
}

This model directly processes the N rows of individual patient data. By internally calculating and subtracting the study-specific mean mean_x_j from the covariate x, the model effectively separates the pure within-trial prognostic effect (beta) from any across-trial baseline confounding.

5 Simulating and Executing IPD NMA in R

To complete the Bayesian workflow, we simulate a patient-level dataset with known true parameters and check if our Stan model can recover them accurately.

View the code
library(rstan)
set.seed(42)

# 1. Define simulation parameters
N_per_study <- 100
NS <- 2
N <- N_per_study * NS

# Study 1: Placebo (trt 1) vs Drug A (trt 2)
# Study 2: Placebo (trt 1) vs Drug B (trt 3)
study <- c(rep(1, N_per_study), rep(2, N_per_study))
trt <- c(
    rep(1, N_per_study / 2), rep(2, N_per_study / 2),
    rep(1, N_per_study / 2), rep(3, N_per_study / 2)
)

# Generate patient-level covariate (e.g., Baseline Severity)
x <- c(
    rnorm(N_per_study, mean = 0, sd = 1),
    rnorm(N_per_study, mean = 1, sd = 1)
)

# True parameters
mu_true <- c(-1.0, -0.5) # Study baselines (log-odds)
delta_true <- c(0, 1.0, 1.5) # Reference, Drug A, Drug B effects
beta_true <- 0.8 # Covariate effect

# 2. Calculate probability and generate outcomes
logit_p <- mu_true[study] + delta_true[trt] + beta_true * (x - ifelse(study == 1, 0, 1))
p <- 1 / (1 + exp(-logit_p))
y <- rbinom(N, size = 1, prob = p)

# 3. Prepare data for Stan
stan_data <- list(
    N = N,
    y = y,
    NS = NS,
    study = study,
    NT = 3,
    trt = trt,
    x = x
)

# 4. Run Stan model
fit <- stan(
    file = "IPD_NMA_Binary.stan", data = stan_data,
    chains = 4, iter = 2000, warmup = 1000, refresh = 0
)

# 5. Extract results for inline R
sum_fit <- summary(fit, pars = c("delta", "beta"))$summary

# 6. Print results to output
print(fit, pars = c("delta", "beta"))
Inference for Stan model: anon_model.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

         mean se_mean   sd 2.5%  25%  50%  75% 97.5% n_eff Rhat
delta[1] 0.00     NaN 0.00 0.00 0.00 0.00 0.00  0.00   NaN  NaN
delta[2] 1.15    0.01 0.47 0.26 0.84 1.15 1.46  2.11  2164    1
delta[3] 1.90    0.01 0.48 0.99 1.58 1.89 2.22  2.82  2961    1
beta     1.06    0.00 0.20 0.67 0.91 1.05 1.19  1.47  3520    1

Samples were drawn using NUTS(diag_e) at Mon May 11 23:38:52 2026.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

5.1 Parameter Recovery Results

Running the above simulation yields the following posterior summaries. We compare the estimated values against the true simulation parameters to validate model correctness:

Parameter True Value Posterior Median 95% CrI Lower 95% CrI Upper Recovery Status
delta[2] (Drug A vs Ref) 1.00 1.15 0.26 2.11 Successfully Recovered
delta[3] (Drug B vs Ref) 1.50 1.89 0.99 2.82 Successfully Recovered
beta (Covariate Effect) 0.80 1.05 0.67 1.47 Successfully Recovered

By recovering the true parameters within the 95% Credible Intervals, we demonstrate that the One-Stage IPD NMA successfully adjusts for the baseline severity imbalance across trials without falling into ecological bias.

6 Industry Standard Alternative: multinma Package Syntax

In real-world HEOR consulting, writing custom Stan code is reserved for non-standard problems. For a standard IPD NMA with covariate adjustment, modellers typically use the multinma package (which uses Stan under the hood but wraps it in a simpler syntax).

Here is how the above analysis would be written using multinma (this code is for display and is not executed here to avoid dependency issues):

View the code
library(multinma)

# 1. Create the data frame from simulated vectors
df_ipd <- data.frame(
    study = study,
    trt = trt,
    y = y,
    x = x
)

# 2. Set up the network (specifying reference treatment)
net <- set_ipd(df_ipd,
    study = study,
    trt = trt,
    r = y
)

# 3. Run the NMA with regression on covariate x
# multinma handles centering and the network consistency relations automatically
fit_multinma <- nma(net,
    regression = ~x,
    likelihood = "bernoulli",
    link = "logit",
    refresh = 0
)

# 4. View results
print(fit_multinma)
A fixed effects NMA with a bernoulli likelihood (logit link).
Regression model: ~x.
Centred covariates at the following overall mean values:
        x 
0.4725156 
Inference for Stan model: binomial_1par.
4 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=4000.

           mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
beta[x]    1.06    0.00 0.21    0.66    0.92    1.06    1.20    1.48  3204    1
d[2]       1.17    0.01 0.48    0.25    0.84    1.16    1.48    2.15  2433    1
d[3]       1.90    0.01 0.48    0.93    1.57    1.90    2.22    2.85  2719    1
lp__    -110.50    0.04 1.62 -114.45 -111.33 -110.14 -109.31 -108.37  1967    1

Samples were drawn using NUTS(diag_e) at Wed May 20 22:13:20 2026.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).