Cox Proportional Hazards Regression
Cox Proportional Hazards Regression models the effect of multiple variables on time-to-event outcomes. Use it to identify which factors accelerate or delay an event such as churn, conversion, or failure.

Quick Hits
- Models the effect of one or more covariates on time-to-event outcomes
- Produces hazard ratios: HR > 1 means higher risk, HR < 1 means lower risk
- Semi-parametric: does not assume a specific survival distribution shape
- Requires proportional hazards: the hazard ratio must be constant over time
- Handles censoring and multiple covariates simultaneously
What is Cox Proportional Hazards Regression?
Cox Proportional Hazards Regression (also called Cox regression or the Cox model) is a semi-parametric regression model for time-to-event data. It models how covariates (predictors) affect the hazard function, which is the instantaneous rate of the event occurring at any given time.
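In symbols, the standard form of the model can be written as below. The notation is introduced here for illustration: $h_0(t)$ is the baseline hazard, $x_1, \dots, x_p$ are the covariates, and $\beta_1, \dots, \beta_p$ are the coefficients to be estimated.

```latex
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \dots + \beta_p x_p),
\qquad
\mathrm{HR}_k = \exp(\beta_k)
```

Here $\mathrm{HR}_k$ is the hazard ratio for a one-unit increase in $x_k$ with the other covariates held fixed. Taking logs gives $\log h(t \mid x) = \log h_0(t) + \beta_1 x_1 + \dots + \beta_p x_p$, which is why covariate effects are described as linear on the log-hazard scale.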
The model is "semi-parametric" because it does not specify the shape of the baseline hazard function. It only models the relative effect of covariates on the hazard, expressed as hazard ratios. This makes it very flexible: you do not need to assume the survival distribution is exponential, Weibull, or any other specific shape.
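To make the "semi-parametric" idea concrete, here is a minimal sketch of how a single coefficient could be estimated from the partial likelihood, in which the baseline hazard cancels out and never has to be specified. It uses only the Python standard library, a made-up toy dataset with no tied event times, and hypothetical function names. In practice you would use a survival analysis library rather than this hand-rolled version.

```python
import math

def risk_set_sums(beta, times, x, t):
    """Weighted sums over subjects still at risk at time t (observed time >= t)."""
    s0 = s1 = s2 = 0.0
    for tj, xj in zip(times, x):
        if tj >= t:
            w = math.exp(beta * xj)
            s0 += w
            s1 += w * xj
            s2 += w * xj * xj
    return s0, s1, s2

def score_and_information(beta, times, events, x):
    """Slope and negative curvature of the partial log-likelihood at beta."""
    score = info = 0.0
    for ti, di, xi in zip(times, events, x):
        if di == 1:  # only observed events add a term; censored subjects only sit in risk sets
            s0, s1, s2 = risk_set_sums(beta, times, x, ti)
            mean = s1 / s0
            score += xi - mean
            info += s2 / s0 - mean * mean
    return score, info

def fit_cox_one_covariate(times, events, x, iterations=25):
    """Newton-Raphson on the partial likelihood, starting from beta = 0."""
    beta = 0.0
    for _ in range(iterations):
        score, info = score_and_information(beta, times, events, x)
        if abs(score) < 1e-10:
            break
        beta += score / info
    return beta

# Made-up toy data: 8 subjects; event = 1 means observed, 0 means censored
times = [2, 3, 4, 5, 6, 7, 8, 9]
events = [1, 1, 1, 0, 1, 1, 0, 1]
group = [1, 0, 1, 1, 0, 1, 0, 0]

beta_hat = fit_cox_one_covariate(times, events, group)
hazard_ratio = math.exp(beta_hat)
print(f"beta = {beta_hat:.3f}, hazard ratio = {hazard_ratio:.3f}")
```

Notice that no baseline hazard appears anywhere in the code: each event contributes only a comparison between the subject who had the event and everyone still at risk at that time.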
Other names you may encounter for the same method include the Cox PH Model and the Proportional Hazards Model.
Assumptions for Cox Proportional Hazards Regression
Every statistical method has assumptions: properties your data must satisfy for the method's results to be accurate.
The assumptions for Cox Proportional Hazards Regression include:
- Proportional Hazards
- Time-to-Event Outcome
- Non-Informative Censoring
- Linear Covariate Effects (on the log-hazard scale)
- Independent Observations
Proportional Hazards
The ratio of hazards between any two groups must be constant over time. If Group A has twice the hazard of Group B at day 1, it should have approximately twice the hazard at day 30 and day 90 as well.
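As a small illustration of what "constant over time" means, the sketch below uses a hypothetical baseline hazard for Group B (any positive function of time would work) and multiplies it by a fixed factor for Group A. The baseline changes from day to day, but the ratio does not.

```python
import math

def baseline_hazard(day):
    """Hypothetical baseline hazard for Group B; chosen arbitrarily for illustration."""
    return 0.01 + 0.002 * math.sqrt(day)

beta = math.log(2.0)  # Group A has twice the hazard of Group B

def hazard_group_a(day):
    """Group A's hazard is the baseline scaled by exp(beta) at every time point."""
    return baseline_hazard(day) * math.exp(beta)

ratios = {day: hazard_group_a(day) / baseline_hazard(day) for day in (1, 30, 90)}
for day, ratio in ratios.items():
    print(f"day {day}: hazard ratio = {ratio:.2f}")
```

If the true ratio drifted, for example from 2 at day 1 toward 1 at day 90, a single hazard ratio would misdescribe the data, which is what the assumption guards against.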
Test this assumption using Schoenfeld residuals. If it is violated, consider stratified models, time-varying coefficients, or alternative approaches like the Accelerated Failure Time model.
Time-to-Event Outcome
Your dependent variable is the time from a defined starting point to an event. All the standard survival analysis requirements apply: a clear origin, a well-defined event, and proper handling of censoring.
Non-Informative Censoring
Censored subjects must not be systematically different from those still at risk. If subjects drop out because they are about to experience the event, the model estimates will be biased.
Linear Covariate Effects
Cox regression assumes a linear relationship between covariates and the log hazard. For continuous predictors, check for non-linearity using martingale residuals or by testing polynomial or spline terms.
Independent Observations
Observations must be independent. If subjects are clustered (e.g., patients within hospitals, users within companies), use a frailty model or robust standard errors.
If you want to compare survival curves without covariates, use the Log-Rank Test instead. If you want to estimate a survival curve, use the Kaplan-Meier Estimator.
When to use Cox Proportional Hazards Regression?
You should use Cox Proportional Hazards Regression when the following conditions hold:
- Your outcome is time until an event (churn, conversion, failure)
- You want to understand which factors affect the rate of the event
- You have multiple covariates to control for
- You have censored observations
- The hazard ratios are approximately constant over time
Multivariate Survival Analysis
The main advantage of Cox regression over the log-rank test is the ability to include multiple covariates simultaneously. While the log-rank test compares groups, Cox regression quantifies the effect of each predictor while controlling for the others.
Censored Data with Covariates
Like all survival methods, Cox regression handles censoring correctly. Unlike logistic regression, it uses the full timing information rather than reducing the outcome to a binary indicator.
If the proportional hazards assumption is violated, consider the Accelerated Failure Time (AFT) model. If your outcome is binary (event within a fixed window: yes/no), Logistic Regression may be more appropriate.
Cox Proportional Hazards Example
Outcome: Time from free-trial signup to paid conversion. Covariates: Plan tier viewed, number of features used in first week, company size, referral source.
We follow a cohort of free-trial users and track when each one converts to a paid plan. Some users never convert during the observation period and are censored.
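Before fitting anything, each user needs a duration and an event flag. The sketch below shows one way this encoding might look, using hypothetical dates, a hypothetical end of the observation window, and a hypothetical helper name. Users who have not converted by the end of the window are censored at that date.

```python
from datetime import date

OBSERVATION_END = date(2024, 3, 31)  # hypothetical end of the observation period

# Hypothetical trial users: (signup date, conversion date or None if not converted)
users = [
    (date(2024, 1, 1), date(2024, 1, 11)),
    (date(2024, 1, 15), None),
    (date(2024, 2, 1), date(2024, 3, 2)),
]

def to_survival_row(signup, converted):
    """Return (duration_in_days, event): event is 1 if converted, 0 if censored."""
    if converted is not None and converted <= OBSERVATION_END:
        return (converted - signup).days, 1
    # Not converted within the window: we only know the user lasted at least this long
    return (OBSERVATION_END - signup).days, 0

rows = [to_survival_row(signup, converted) for signup, converted in users]
print(rows)
```

The censored user still contributes information: they were at risk of converting for their whole observed duration, which a simple converted yes/no flag would throw away.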
We fit a Cox model and obtain hazard ratios for each covariate. For example:
- Features used in first week: HR = 1.12 (each additional feature used increases conversion hazard by 12%)
- Enterprise plan viewed: HR = 0.75 (enterprise prospects convert more slowly, likely due to longer procurement cycles)
- Referred by existing customer: HR = 1.45 (referred users have a 45% higher conversion hazard at any given time, so they tend to convert sooner)
Each hazard ratio tells us how the covariate changes the rate of conversion while controlling for all other variables. A small p-value (conventionally below 0.05) for a covariate indicates that its effect is statistically significant.
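The arithmetic behind these interpretations can be sketched in a few lines. The hazard ratios are the ones quoted in the example. The variable names and the combined scenario (a referred user who used three more features than a comparison user) are assumptions made for illustration.

```python
import math

# Hazard ratios quoted in the example above
hazard_ratios = {
    "features_used_first_week": 1.12,
    "enterprise_plan_viewed": 0.75,
    "referred_by_customer": 1.45,
}

def percent_change(hr):
    """Percent change in the hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0

for name, hr in hazard_ratios.items():
    coefficient = math.log(hr)  # the model's coefficient is the log of the hazard ratio
    print(f"{name}: coefficient = {coefficient:+.3f}, hazard change = {percent_change(hr):+.0f}%")

# Effects multiply on the hazard scale: a referred user who used 3 more features
# than an otherwise identical comparison user (hypothetical scenario)
combined = hazard_ratios["referred_by_customer"] * hazard_ratios["features_used_first_week"] ** 3
print(f"combined hazard ratio = {combined:.2f}")
```

Because the model is additive on the log-hazard scale, the effects of several covariates multiply rather than add on the hazard scale, so three extra features correspond to 1.12 cubed, not 3 times 12%.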
References
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3059453/
- https://www.bmj.com/content/317/7172/1572
- https://lifelines.readthedocs.io/
Key Takeaway
Cox Proportional Hazards regression is the standard method for understanding what factors affect how quickly an event occurs. It produces hazard ratios that quantify risk, handles censored data and multiple covariates, and makes no assumption about the baseline hazard shape. The key requirement is that hazard ratios remain constant over time.