Background: A preliminary relationship between objective response rate (ORR) and overall survival (OS) in metastatic non-small cell lung cancer (mNSCLC) has been established by model-based meta-analysis (MBMA) for chemotherapy, pembrolizumab, and nivolumab. However, this relationship has not been precisely quantified across the broader PD-(L)1 landscape or across patient groups with different population characteristics. Oncology trials are prone to cross-study heterogeneity from multiple sources, including varying trial designs, diverse patient populations with numerous treatment options, and different prior therapies. MBMA can quantify distinct treatment effects, account for explained and unexplained variability, and investigate the correlation between ORR and survival. Once developed, an MBMA also allows indirect comparisons between compounds and predictions of future trial outcomes.
Methods: MBMA with mixed-effects logistic regression was applied to quantify treatment-specific and covariate effects on ORR. Semi-parametric longitudinal mixed-effects MBMA models of progression-free survival (PFS) and OS (Kaplan–Meier curves) were developed as a function of observed ORR and other influential factors. Non-parametric reference survival curves described the baseline hazard, and covariates were added under the proportional hazards assumption. Key treatments were immune checkpoint inhibitors given as monotherapy or in combination. Model-based head-to-head trials were simulated to predict hazard ratios (HRs) for PD-1 versus PD-L1 inhibitors.
Results: Data comprised ≥36 treatments from ≥90 studies. Significant ORR-PFS and ORR-OS correlations were established for each general treatment type. Between-trial variability remained high after including the significant factors: treatment line (higher ORR for earlier lines), mean PD-L1 expression (higher ORR with higher expression in PD-(L)1-treated patients), and ECOG performance status (PS) (longer PFS/OS with a higher proportion of patients with PS=0). Simulated PFS and OS HRs consistently favored PD-1 over PD-L1 inhibitors (alone or with chemotherapy), but none of the differences were statistically significant.
Conclusion: ORR is significantly correlated with PFS and OS in mNSCLC. This work supports evidence-based decision-making in late-stage trial design using earlier-phase ORR data. It also enables accurate benchmarking of emerging data by adjusting for known and unknown sources of variability across existing and emerging trials.
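Illustration: The proportional hazards step described in Methods can be sketched as follows. Under that assumption, a covariate-adjusted survival curve is the reference curve raised to the power of the hazard ratio, S(t | x) = S0(t)^HR. This is a minimal sketch, not the fitted model: the function name, the logit scaling of ORR, the coefficient `beta_orr`, and all numeric values are hypothetical and chosen only to show the mechanics.

```python
import math


def logit(p):
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def adjusted_survival(reference_survival, orr, reference_orr, beta_orr):
    """Shift a reference survival curve under proportional hazards.

    Assumption (not from the abstract): ORR enters the log hazard
    linearly on the logit scale, centered at the reference ORR.
    """
    log_hr = beta_orr * (logit(orr) - logit(reference_orr))
    hazard_ratio = math.exp(log_hr)
    # Proportional hazards: S(t | x) = S0(t) ** HR at every time point.
    curve = [s ** hazard_ratio for s in reference_survival]
    return curve, hazard_ratio


# Hypothetical reference curve (survival fraction at successive time points).
reference_survival = [1.0, 0.80, 0.60, 0.45, 0.35]

# Hypothetical trial arm with ORR of 45% against a reference ORR of 30%.
# A negative coefficient means higher ORR lowers the hazard.
curve, hr = adjusted_survival(
    reference_survival, orr=0.45, reference_orr=0.30, beta_orr=-0.5
)

print(f"hazard ratio vs reference: {hr:.2f}")
print("adjusted survival:", [round(s, 3) for s in curve])
```

With a negative coefficient, an arm whose ORR exceeds the reference gets a hazard ratio below 1, so its predicted survival curve lies on or above the reference curve at every time point. In the actual analysis the reference curve is estimated non-parametrically and between-trial random effects are included, neither of which is shown here.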