Multi-Group Comparisons

Testing Trends Across Ordered Groups: Jonckheere-Terpstra and Alternatives

When your groups have a natural order (dose levels, experience tiers, usage intensity), standard ANOVA ignores this structure. Learn about trend tests that leverage ordering for more power.

Quick Hits

  • Standard ANOVA ignores group ordering—trend tests use this information for more power
  • Jonckheere-Terpstra is a non-parametric trend test based on ranks
  • Linear contrasts in ANOVA test for linear trends parametrically
  • Use trend tests when you expect outcomes to increase (or decrease) with group level

TL;DR

When groups have a natural ordering (low/medium/high, dose levels, experience tiers), standard ANOVA ignores this structure. Trend tests leverage the ordering to detect monotonic patterns more powerfully. Jonckheere-Terpstra is the non-parametric choice; linear contrasts work within ANOVA. Both are typically more powerful than standard ANOVA when a monotonic trend actually exists.


The Problem with Standard ANOVA

The omnibus ANOVA F-test asks only whether any group mean differs from the others. Its result is unchanged if you shuffle the group labels, so it treats "Low" vs "High" the same as "Low" vs "Medium." This ignores valuable information.

import numpy as np
from scipy import stats

# Example: Response to increasing dose levels
np.random.seed(42)
placebo = np.random.normal(50, 10, 30)
low_dose = np.random.normal(52, 10, 30)
medium_dose = np.random.normal(55, 10, 30)
high_dose = np.random.normal(58, 10, 30)

# Standard ANOVA
f_stat, p_anova = stats.f_oneway(placebo, low_dose, medium_dose, high_dose)
print(f"Standard ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")

# The omnibus F-test asks only whether "any difference" exists, so it can miss
# this modest effect even though the population means rise steadily (50, 52, 55, 58)

Jonckheere-Terpstra Test

The Jonckheere-Terpstra test is a non-parametric test for ordered alternatives: it tests whether values tend to increase (or decrease) across the ordered groups.

How It Works

  1. For each pair of groups (i < j), count how many times an observation in group j exceeds an observation in group i
  2. Sum these counts
  3. Large sums indicate an increasing trend
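
In symbols, the three steps above can be written as follows. This is the standard no-ties form of the large-sample approximation; the notation ($G_i$ for group $i$, $n_i$ for its size, $N$ for the total sample size, $U_{ij}$ for the pairwise count) is introduced here for illustration, and ties are counted as one half.

```latex
U_{ij} = \sum_{x \in G_i} \sum_{y \in G_j}
         \left[ \mathbf{1}(y > x) + \tfrac{1}{2}\,\mathbf{1}(y = x) \right],
\qquad
J = \sum_{i < j} U_{ij}

\mathbb{E}[J] = \frac{N^2 - \sum_i n_i^2}{4},
\qquad
\operatorname{Var}[J] = \frac{N^2 (2N + 3) - \sum_i n_i^2 (2 n_i + 3)}{72},
\qquad
z = \frac{J - \mathbb{E}[J]}{\sqrt{\operatorname{Var}[J]}}
```

Under the null hypothesis of no trend, $z$ is approximately standard normal when the groups are reasonably large.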

Python Implementation

import numpy as np
from scipy import stats  # stats.norm supplies the normal approximation below

def jonckheere_terpstra(groups, alternative='increasing'):
    """
    Jonckheere-Terpstra test for ordered alternatives.

    groups: list of arrays in order (lowest to highest expected values)
    alternative: 'increasing', 'decreasing', or 'two-sided'
    """
    k = len(groups)

    # Calculate J statistic: sum of Mann-Whitney U counts over all ordered pairs
    J = 0.0
    for i in range(k - 1):
        for j in range(i + 1, k):
            # All pairwise differences (group j value minus group i value)
            diff = np.subtract.outer(np.asarray(groups[j]), np.asarray(groups[i]))
            # Count pairs where group j > group i; ties count one half
            J += np.sum(diff > 0) + 0.5 * np.sum(diff == 0)

    # Expected value and variance under null
    ns = [len(g) for g in groups]
    N = sum(ns)

    E_J = (N**2 - sum(n**2 for n in ns)) / 4

    # Variance under the null (no-ties formula; only approximate with many ties)
    var_J = (N**2 * (2*N + 3) - sum(n**2 * (2*n + 3) for n in ns)) / 72

    # Z-score (large-sample normal approximation)
    z = (J - E_J) / np.sqrt(var_J)

    # P-value; norm.sf is the upper tail and is more accurate than 1 - cdf for large z
    if alternative == 'increasing':
        p_value = stats.norm.sf(z)
    elif alternative == 'decreasing':
        p_value = stats.norm.cdf(z)
    else:  # two-sided
        p_value = 2 * stats.norm.sf(abs(z))

    return {
        'J_statistic': J,
        'z_score': z,
        'p_value': p_value
    }


# Test for increasing trend
result = jonckheere_terpstra([placebo, low_dose, medium_dose, high_dose],
                             alternative='increasing')
print(f"\nJonckheere-Terpstra test:")
print(f"  J = {result['J_statistic']:.0f}")
print(f"  z = {result['z_score']:.2f}")
print(f"  p = {result['p_value']:.4f}")
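
The normal approximation above can be rough when groups are small. One possible alternative, sketched here under stated assumptions, is a Monte Carlo permutation p-value: shuffle the pooled observations across groups and see how often the shuffled J is at least as large as the observed one. The helper names `jt_statistic` and `jt_permutation_pvalue`, the number of permutations, and the tiny demo data are all hypothetical choices made for illustration.

```python
import numpy as np

def jt_statistic(groups):
    """Sum over ordered group pairs of the count of (x, y) with y > x; ties count 0.5."""
    J = 0.0
    for i in range(len(groups) - 1):
        for j in range(i + 1, len(groups)):
            diff = np.subtract.outer(np.asarray(groups[j]), np.asarray(groups[i]))
            J += np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return J

def jt_permutation_pvalue(groups, n_perm=2000, seed=0):
    """One-sided (increasing) Monte Carlo permutation p-value for J."""
    rng = np.random.default_rng(seed)
    sizes = [len(g) for g in groups]
    pooled = np.concatenate(groups)
    split_points = np.cumsum(sizes)[:-1]  # where to cut the pooled array
    observed = jt_statistic(groups)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        if jt_statistic(np.split(shuffled, split_points)) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive
    return observed, (exceed + 1) / (n_perm + 1)

# Small made-up samples, ordered from lowest to highest expected values
demo_groups = [np.array([4.1, 5.0, 5.6]),
               np.array([5.2, 6.1, 6.3]),
               np.array([6.0, 7.2, 7.9])]
J_obs, p_perm = jt_permutation_pvalue(demo_groups)
print(f"J = {J_obs:.1f}, permutation p = {p_perm:.4f}")
```

For the demo data, 24 of the 27 cross-group pairs are in increasing order, so J = 24. The permutation p-value is an estimate and varies slightly with the seed and the number of permutations.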

Using scipy (if available)

# scipy.stats doesn't have J-T built in. Third-party packages such as
# scikit-posthocs (pip install scikit-posthocs) may provide one; check the
# package documentation before relying on it.
import numpy as np
import pandas as pd

# Long-format dataframe: one row per observation, groups coded 1-4 in order
all_data = np.concatenate([placebo, low_dose, medium_dose, high_dose])
groups = np.repeat([1, 2, 3, 4], 30)
df = pd.DataFrame({'value': all_data, 'group': groups})
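
One option that scipy does ship is Kendall's tau, which is closely related to Jonckheere-Terpstra (Terpstra's 1952 paper frames the test as Kendall's test against trend). Correlating the ordered group codes with the outcome via `scipy.stats.kendalltau` gives a tie-corrected trend test. The sketch below uses freshly simulated data with the same means as the dose example (the seed and variable names are illustrative assumptions) and also shows how tau can be recovered from the J statistic when the outcome has no ties.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
means = [50, 52, 55, 58]  # same population means as the dose example
groups_demo = [rng.normal(m, 10, 30) for m in means]
values = np.concatenate(groups_demo)
codes = np.repeat([1, 2, 3, 4], 30)  # ordered group codes

# Kendall's tau-b between group code and outcome (two-sided by default)
tau, p_two_sided = stats.kendalltau(codes, values)
print(f"Kendall tau-b = {tau:.3f}, two-sided p = {p_two_sided:.4f}")

# Recover tau from the J statistic (valid when the outcome has no ties)
J = sum(np.sum(np.subtract.outer(groups_demo[j], groups_demo[i]) > 0)
        for i in range(4) for j in range(i + 1, 4))
N = len(values)
cross_pairs = (N**2 - sum(len(g)**2 for g in groups_demo)) / 2  # pairs in different groups
all_pairs = N * (N - 1) / 2
tau_from_J = (2 * J - cross_pairs) / np.sqrt(cross_pairs * all_pairs)
print(f"tau recovered from J = {tau_from_J:.3f}")
```

The two printed tau values should agree, which is one way to sanity-check a hand-rolled J-T implementation against a library routine.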

R Implementation

# Assumes a data frame df with columns value and group (coded 1-4 in order)
library(clinfun)
jonckheere.test(df$value, df$group, alternative = "increasing")

# Or using PMCMRplus
library(PMCMRplus)
jonckheereTest(value ~ group, data = df, alternative = "greater")

Linear Contrast in ANOVA

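A parametric alternative: test whether outcomes change linearly with the group codes.

How It Works

Assign numeric codes to the ordered groups (e.g., 1, 2, 3, 4) and test whether the slope from regressing the outcome on these codes differs from zero. With equally spaced codes this targets the same linear trend as the ANOVA linear contrast, although the regression pools any departure from linearity into its error term, so the two p-values can differ slightly.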

import numpy as np
import statsmodels.api as sm

def linear_contrast_test(groups, codes=None):
    """
    Test for linear trend using contrast coding.

    groups: list of arrays in order
    codes: optional linear codes (default: 1, 2, 3, ...)
    """
    k = len(groups)
    if codes is None:
        codes = list(range(1, k + 1))

    # Create data
    all_data = np.concatenate(groups)
    all_codes = np.repeat(codes, [len(g) for g in groups])

    # Regression
    X = sm.add_constant(all_codes)
    model = sm.OLS(all_data, X).fit()

    return {
        'slope': model.params[1],
        'se': model.bse[1],
        't_statistic': model.tvalues[1],
        'p_value': model.pvalues[1],
        'r_squared': model.rsquared
    }


result = linear_contrast_test([placebo, low_dose, medium_dose, high_dose])
print(f"\nLinear Contrast Test:")
print(f"  Slope = {result['slope']:.2f} per unit increase")
print(f"  t = {result['t_statistic']:.2f}")
print(f"  p = {result['p_value']:.4f}")
print(f"  R² = {result['r_squared']:.3f}")
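
The regression above is a convenient stand-in for the classical ANOVA contrast. For comparison, here is a minimal sketch of the contrast itself, tested against the pooled within-group variance with N − k degrees of freedom. The function name `anova_linear_contrast`, the centered-code coefficients, and the small demo data are illustrative assumptions, not part of any library.

```python
import numpy as np
from scipy import stats

def anova_linear_contrast(groups, codes=None):
    """Linear-trend contrast tested against the pooled within-group variance."""
    k = len(groups)
    if codes is None:
        codes = np.arange(1, k + 1, dtype=float)
    codes = np.asarray(codes, dtype=float)
    coefs = codes - codes.mean()  # contrast coefficients sum to zero
    ns = np.array([len(g) for g in groups])
    means = np.array([np.mean(g) for g in groups])

    # Pooled within-group variance (the ANOVA mean squared error)
    df_error = int(ns.sum() - k)
    sse = sum(np.sum((np.asarray(g) - np.mean(g)) ** 2) for g in groups)
    mse = sse / df_error

    estimate = np.sum(coefs * means)
    se = np.sqrt(mse * np.sum(coefs ** 2 / ns))
    t_stat = estimate / se
    p_value = 2 * stats.t.sf(abs(t_stat), df_error)  # two-sided
    return {'estimate': estimate, 'se': se, 't_statistic': t_stat,
            'df': df_error, 'p_value': p_value}

# Made-up data: group means 2, 3, 4, 5 with identical spread
demo = [np.array([1.0, 2.0, 3.0]), np.array([2.0, 3.0, 4.0]),
        np.array([3.0, 4.0, 5.0]), np.array([4.0, 5.0, 6.0])]
res = anova_linear_contrast(demo)
print(f"Contrast = {res['estimate']:.2f}, t({res['df']}) = {res['t_statistic']:.2f}, "
      f"p = {res['p_value']:.4f}")
```

With unequal group sizes, other choices of coefficients are possible; the unweighted centered codes used here are one common convention.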

R Implementation

# Linear contrast in ANOVA
# Assign orthogonal polynomial contrasts (group must be a factor)
df$group <- factor(df$group)
contrasts(df$group) <- contr.poly(4)
model <- aov(value ~ group, data = df)
summary.lm(model)  # Linear trend is the group.L coefficient

Comparing Methods

def compare_trend_tests(groups):
    """Compare standard ANOVA, J-T, and linear contrast."""
    print("Method Comparison for Ordered Groups:")
    print("=" * 50)

    # Standard ANOVA
    f_stat, p_anova = stats.f_oneway(*groups)
    print(f"Standard ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")

    # Jonckheere-Terpstra (two-sided, to match the two-sided linear contrast p-value)
    jt = jonckheere_terpstra(groups, alternative='two-sided')
    print(f"Jonckheere-Terpstra: z = {jt['z_score']:.2f}, p = {jt['p_value']:.4f}")

    # Linear contrast (OLS slope test, two-sided)
    lc = linear_contrast_test(groups)
    print(f"Linear Contrast: t = {lc['t_statistic']:.2f}, p = {lc['p_value']:.4f}")


compare_trend_tests([placebo, low_dose, medium_dose, high_dose])

Power Comparison

When a linear trend exists, trend tests are typically more powerful than the omnibus F-test, as this simulation at α = 0.05 illustrates:

def simulate_power(n_per_group=30, effect_per_step=2, sd=10, n_sims=1000):
    """Compare power of different tests for trend detection."""
    anova_sig = 0
    jt_sig = 0
    linear_sig = 0

    for _ in range(n_sims):
        # Generate data with linear trend
        g1 = np.random.normal(50, sd, n_per_group)
        g2 = np.random.normal(50 + effect_per_step, sd, n_per_group)
        g3 = np.random.normal(50 + 2*effect_per_step, sd, n_per_group)
        g4 = np.random.normal(50 + 3*effect_per_step, sd, n_per_group)

        groups = [g1, g2, g3, g4]

        # ANOVA
        _, p = stats.f_oneway(*groups)
        if p < 0.05:
            anova_sig += 1

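        # J-T (two-sided, so it is comparable to the two-sided linear
        # contrast p-value; a one-sided test would inflate its apparent power)
        jt = jonckheere_terpstra(groups, 'two-sided')
        if jt['p_value'] < 0.05:
            jt_sig += 1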

        # Linear contrast
        lc = linear_contrast_test(groups)
        if lc['p_value'] < 0.05:
            linear_sig += 1

    return {
        'ANOVA_power': anova_sig / n_sims,
        'JT_power': jt_sig / n_sims,
        'Linear_power': linear_sig / n_sims
    }


power = simulate_power()
print("\nPower to detect linear trend:")
for method, pwr in power.items():
    print(f"  {method}: {pwr:.1%}")

When to Use Which

Situation                              Recommended Test
Ordered groups, expect linear trend    Linear contrast
Ordered groups, any monotonic trend    Jonckheere-Terpstra
Non-normal data, ordered groups        Jonckheere-Terpstra
No expected ordering                   Standard ANOVA
Non-monotonic pattern possible         Standard ANOVA
Confirmatory dose-response             Linear or polynomial contrast
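
The "non-monotonic pattern" row can be illustrated with a deliberately artificial umbrella-shaped example: outcomes rise and then fall back, so a trend test sees nothing while the omnibus F-test does. This sketch uses `scipy.stats.kendalltau` on the group codes as a stand-in for Jonckheere-Terpstra (a closely related rank-based trend test); the data are constructed, not simulated, so the result is deterministic.

```python
import numpy as np
from scipy import stats

# Identical within-group spread; only the group location changes
base = np.linspace(-1, 1, 21)
umbrella = [base, base + 3, base + 3, base]  # low, high, high, low

# Omnibus ANOVA: sensitive to any difference in means
f_stat, p_anova = stats.f_oneway(*umbrella)

# Rank-based trend test on the ordered group codes
codes = np.repeat([1, 2, 3, 4], len(base))
tau, p_trend = stats.kendalltau(codes, np.concatenate(umbrella))

print(f"ANOVA:      F = {f_stat:.1f}, p = {p_anova:.2e}")
print(f"Trend test: tau = {tau:.3f}, p = {p_trend:.3f}")
```

Here the rise from the first to the middle groups is exactly cancelled by the fall to the last group, so tau is zero even though the group means clearly differ.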

Practical Example: User Experience Tiers

# User engagement by experience level
np.random.seed(42)
new_users = np.random.normal(5, 2, 100)      # 0-30 days
regular_users = np.random.normal(7, 2, 100)   # 31-90 days
power_users = np.random.normal(10, 2, 100)    # 91-180 days
veterans = np.random.normal(12, 2, 100)       # 180+ days

print("Engagement by Experience Tier:")
print(f"  New users: M = {new_users.mean():.1f}")
print(f"  Regular users: M = {regular_users.mean():.1f}")
print(f"  Power users: M = {power_users.mean():.1f}")
print(f"  Veterans: M = {veterans.mean():.1f}")

# Test for increasing trend
groups = [new_users, regular_users, power_users, veterans]

jt = jonckheere_terpstra(groups, 'increasing')
print(f"\nTrend test: z = {jt['z_score']:.2f}, p = {jt['p_value']:.4f}")

lc = linear_contrast_test(groups)
print(f"Engagement increases by {lc['slope']:.1f} units per tier")

Effect Size for Trend

def trend_effect_size(groups):
    """
    Calculate effect sizes for trend: correlation, R², and slopes.
    """
    codes = list(range(1, len(groups) + 1))
    all_data = np.concatenate(groups)
    all_codes = np.repeat(codes, [len(g) for g in groups])

    # Correlation between codes and values
    r = np.corrcoef(all_codes, all_data)[0, 1]

    # Raw slope: rescale r by the SD ratio (in simple regression the
    # standardized slope is r itself)
    raw_slope = r * (np.std(all_data) / np.std(all_codes))

    return {
        'correlation': r,
        'r_squared': r**2,
        'standardized_slope': r,
        'raw_slope': raw_slope
    }


effect = trend_effect_size([placebo, low_dose, medium_dose, high_dose])
print(f"\nTrend effect size:")
print(f"  r = {effect['correlation']:.3f}")
print(f"  R² = {effect['r_squared']:.3f}")


Key Takeaway

When comparing ordered groups (dose levels, experience tiers, usage intensity), trend tests like Jonckheere-Terpstra or linear contrasts are more powerful than standard ANOVA because they use the ordering information. Use them when you expect outcomes to increase or decrease systematically across groups.


References

  1. https://www.jstor.org/stable/2332611
  2. https://www.jstor.org/stable/2333011
  3. Jonckheere, A. R. (1954). A distribution-free k-sample test against ordered alternatives. *Biometrika*, 41(1/2), 133-145.
  4. Terpstra, T. J. (1952). The asymptotic normality and consistency of Kendall's test against trend, when ties are present in one ranking. *Indagationes Mathematicae*, 14(3), 327-333.
  5. Cuzick, J. (1985). A Wilcoxon-type test for trend. *Statistics in Medicine*, 4(4), 543-547.

Frequently Asked Questions

When should I use a trend test instead of ANOVA?
When groups have a natural order (low/medium/high, dose levels, experience tiers) and you expect a monotonic relationship. Trend tests are more powerful for detecting ordered patterns.

Does Jonckheere-Terpstra require normality?
No. It's a rank-based non-parametric test, so it doesn't assume any particular distribution.

What if the trend isn't linear?
Jonckheere-Terpstra detects any monotonic trend (always increasing or always decreasing). For non-monotonic patterns, use standard ANOVA or polynomial contrasts.
