Hypothesis Testing - Machine Learning Courses

In this lab you learn how to test statistical hypotheses based on sample data. We use several statistical tests to make decisions under uncertainty.

import numpy as np
from scipy import stats

✍️ p-value

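Compute the one-sided p-value under the null hypothesis $H_0: X \sim \mathcal{N}(55, 81^2)$ for the following values of $x$: 15, 120, 63, 888. For values below the mean we report the left tail $P(X \le x)$, for values above the mean the right tail $P(X \ge x)$.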

mu = 55
sigma = 81

for x in [15, 120, 63, 888]:
    p_right = 1 - stats.norm.cdf(x, mu, sigma)
    p_left = stats.norm.cdf(x, mu, sigma)

    if x < mu:
        print(f"x = {x:3d}: P(X≤x) = {p_left:.4f}")
    else:
        print(f"x = {x:3d}: P(X≥x) = {p_right:.4f}")

x =  15: P(X≤x) = 0.3107
x = 120: P(X≥x) = 0.2111
x =  63: P(X≥x) = 0.4607
x = 888: P(X≥x) = 0.0000
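For a very extreme observation such as x = 888, the right-tail probability is far smaller than the floating-point resolution around 1, so `1 - stats.norm.cdf(...)` ends up as zero. The sketch below is an illustrative addition, not part of the original solution: it compares that computation with the survival function `stats.norm.sf`, which evaluates the tail directly.

```python
from scipy import stats

mu = 55
sigma = 81
x = 888

# Right tail as 1 - CDF: the CDF rounds to 1.0 in float64, so the tail is lost
p_naive = 1 - stats.norm.cdf(x, mu, sigma)

# Survival function sf(x) = P(X > x), evaluated directly in the tail
p_sf = stats.norm.sf(x, mu, sigma)

print(f"1 - cdf: {p_naive:.3e}")
print(f"sf:      {p_sf:.3e}")
```

For the other three values of x both approaches agree to four decimals, so the printed results above are unaffected.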

✍️ One-Sample t-test

A company claims that the average commute time of its employees is 30 minutes. A sample of 20 employees shows the following commute times (in minutes):

[28, 35, 32, 29, 31, 33, 27, 36, 30, 34, 31, 29, 32, 28, 35, 31, 33, 30, 32, 29]

Test whether the mean commute time differs significantly from 30 minutes (α = 0.05). Do this both by manually computing the test statistic and with scipy.stats.ttest_1samp.
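For reference, the manual calculation relies on the standard one-sample t-statistic. With sample mean $\bar{x}$, sample standard deviation $s$ (computed with $n - 1$ in the denominator) and sample size $n$:

$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1, \qquad p = 2 \cdot P\left(T_{n-1} \ge |t|\right)
$$

Here $\mu_0 = 30$ is the hypothesized mean and $T_{n-1}$ follows a Student t-distribution with $n - 1$ degrees of freedom.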

# Sample data
commute_times = np.array(
    [28, 35, 32, 29, 31, 33, 27, 36, 30, 34, 31, 29, 32, 28, 35, 31, 33, 30, 32, 29]
)
mu_0 = 30  # Hypothesized mean

# Manual calculation
# Calculate sample statistics
sample_mean = commute_times.mean()
sample_std = commute_times.std(ddof=1)  # N-1 for sample standard deviation
n = len(commute_times)
se = sample_std / np.sqrt(n)  # Standard error

# Calculate t-statistic manually
t_statistic_manual = (sample_mean - mu_0) / se

# Degrees of freedom
df = n - 1

# Calculate p-value (two-tailed test)
p_value_manual = 2 * (1 - stats.t.cdf(abs(t_statistic_manual), df))

print("Manual")
print(f"t-statistic: {t_statistic_manual:.4f}")
print(f"p-value (two-tailed): {p_value_manual:.4f}")

# Using scipy.stats.ttest_1samp
t_statistic_scipy, p_value_scipy = stats.ttest_1samp(commute_times, mu_0)


print("\nUSING stats.ttest_1samp():")
print(f"t-statistic: {t_statistic_scipy:.4f}")
print(f"p-value (two-tailed): {p_value_scipy:.4f}")

# Hypothesis test conclusion
print("\nTest result:")
if p_value_scipy < 0.05:
    print("The mean commute time significantly differs from 30 minutes.")
else:
    print("No significant evidence that mean commute time differs from 30 minutes.")

Manual
t-statistic: 2.1904
p-value (two-tailed): 0.0412

USING stats.ttest_1samp():
t-statistic: 2.1904
p-value (two-tailed): 0.0412

Test result:
The mean commute time significantly differs from 30 minutes.
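The same decision can also be reached by comparing the test statistic with a critical value instead of comparing the p-value with α. The sketch below is an illustrative addition under the same assumptions as the exercise (two-sided test, α = 0.05); the names `alpha`, `t_crit` and `reject_h0` are introduced here and do not appear in the original solution.

```python
import numpy as np
from scipy import stats

commute_times = np.array(
    [28, 35, 32, 29, 31, 33, 27, 36, 30, 34, 31, 29, 32, 28, 35, 31, 33, 30, 32, 29]
)
mu_0 = 30  # Hypothesized mean
alpha = 0.05  # Significance level

n = len(commute_times)
df = n - 1
se = commute_times.std(ddof=1) / np.sqrt(n)
t_statistic = (commute_times.mean() - mu_0) / se

# Two-sided critical value: the (1 - alpha/2) quantile of the t-distribution
t_crit = stats.t.ppf(1 - alpha / 2, df)

# Reject H0 when the statistic falls outside [-t_crit, t_crit]
reject_h0 = abs(t_statistic) > t_crit
print(f"|t| = {abs(t_statistic):.4f}, critical value = {t_crit:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Both approaches are equivalent: the p-value is below α exactly when |t| exceeds the critical value.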

✍️ Two-Sample t-test

A pharmaceutical company tests a new medicine to lower blood pressure. They compare two groups:

Control group (placebo): [120, 118, 122, 125, 119, 121, 123, 120, 124, 122, 121, 119, 123, 120, 122]
Treatment group (medicine): [115, 112, 118, 114, 116, 113, 115, 117, 114, 116, 115, 113, 112, 118, 116]

Test whether the medicine significantly lowers blood pressure (one-sided test, α = 0.05), assuming a pooled sample variance (equal variances in both groups). Do this both by manually computing the test statistic and with scipy.stats.ttest_ind.
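For reference, the pooled-variance test statistic used in the manual calculation can be written as follows, with group 1 the control group and group 2 the treatment group:

$$
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}, \qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad
\text{df} = n_1 + n_2 - 2
$$

Since the alternative hypothesis is that the control mean is larger than the treatment mean, the one-sided p-value is $P(T_{\text{df}} \ge t)$.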

# Blood pressure data
control = np.array([120, 118, 122, 125, 119, 121, 123, 120, 124, 122, 121, 119, 123, 120, 122])
treatment = np.array([115, 112, 118, 114, 116, 113, 115, 117, 114, 116, 115, 113, 112, 118, 116])

# Manual calculation
n1 = len(control)
n2 = len(treatment)
mean1 = control.mean()
mean2 = treatment.mean()
std1 = control.std(ddof=1)
std2 = treatment.std(ddof=1)

# Pooled standard deviation (assuming equal variances)
pooled_std = np.sqrt(((n1 - 1) * std1**2 + (n2 - 1) * std2**2) / (n1 + n2 - 2))

# Standard error of difference
se_diff = pooled_std * np.sqrt(1 / n1 + 1 / n2)

# t-statistic
t_stat_manual = (mean1 - mean2) / se_diff

# Degrees of freedom
df_manual = n1 + n2 - 2

# p-value for one-sided test (control > treatment)
p_value_manual = 1 - stats.t.cdf(t_stat_manual, df_manual)

print("Manual")
print(f"t-statistic: {t_stat_manual:.4f}")
print(f"p-value: {p_value_manual:.4f}")

# Using scipy.stats.ttest_ind
# equal_var=True (the default) gives the pooled-variance test
t_stat_scipy, p_value_scipy = stats.ttest_ind(
    control, treatment, equal_var=True, alternative="greater"
)

print("\nUsing scipy.stats.ttest_ind():")
print(f"t-statistic: {t_stat_scipy:.4f}")
print(f"p-value: {p_value_scipy:.4f}")

# Hypothesis test conclusion
if p_value_scipy < 0.05:
    print("The medicine significantly lowers blood pressure.")
else:
    print("No significant evidence that the medicine lowers blood pressure.")

Manual
t-statistic: 8.8369
p-value: 0.0000

Using scipy.stats.ttest_ind():
t-statistic: 8.8369
p-value: 0.0000
The medicine significantly lowers blood pressure.
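The pooled test assumes equal variances in both groups. If that assumption is in doubt, Welch's t-test is a common alternative. The sketch below is an illustrative addition, not part of the original exercise: it reruns the comparison with `equal_var=False`. Because both groups have the same size, the t-statistic is identical to the pooled one and only the degrees of freedom (and therefore the p-value) change.

```python
import numpy as np
from scipy import stats

control = np.array([120, 118, 122, 125, 119, 121, 123, 120, 124, 122, 121, 119, 123, 120, 122])
treatment = np.array([115, 112, 118, 114, 116, 113, 115, 117, 114, 116, 115, 113, 112, 118, 116])

# Pooled-variance test (equal variances assumed), as in the exercise
t_pooled, p_pooled = stats.ttest_ind(control, treatment, equal_var=True, alternative="greater")

# Welch's t-test: no equal-variance assumption
t_welch, p_welch = stats.ttest_ind(control, treatment, equal_var=False, alternative="greater")

print(f"Pooled: t = {t_pooled:.4f}, p = {p_pooled:.2e}")
print(f"Welch:  t = {t_welch:.4f}, p = {p_welch:.2e}")
```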

✍️ Binomial test

A webshop claims that 80% of its customers are satisfied. In a sample of 100 customers, 72 are satisfied.

Test whether the satisfaction rate is significantly lower than 80% (one-sided test, α = 0.05). Use scipy.stats.binomtest for this.

# Binomial test parameters
n_trials = 100  # Total number of customers
n_successes = 72  # Number of satisfied customers
p_claimed = 0.80  # Claimed satisfaction rate

result = stats.binomtest(n_successes, n_trials, p_claimed, alternative="less")

# Calculate sample proportion
p_observed = n_successes / n_trials

print(f"Sample proportion: {p_observed:.2%}")
print(f"p-value (one-sided): {result.pvalue:.4f}")
if result.pvalue < 0.05:
    print("Satisfaction is significantly lower than 80%.")
else:
    print("No significant evidence that satisfaction is lower than 80%.")

Sample proportion: 72.00%
p-value (one-sided): 0.0342
Satisfaction is significantly lower than 80%.
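As with the t-tests, the p-value can also be computed by hand. For the alternative "less", it is the probability of observing at most 72 satisfied customers under the null hypothesis, $P(X \le 72)$ with $X \sim \text{Binomial}(100, 0.80)$. The sketch below is an illustrative addition: it computes this with `stats.binom.cdf` and compares it with `stats.binomtest`. The variable names `p_value_cdf` and `p_value_binomtest` are introduced here.

```python
from scipy import stats

n_trials = 100  # Total number of customers
n_successes = 72  # Number of satisfied customers
p_claimed = 0.80  # Claimed satisfaction rate

# P(X <= 72) under H0: X ~ Binomial(100, 0.80)
p_value_cdf = stats.binom.cdf(n_successes, n_trials, p_claimed)

# Same one-sided test via binomtest, for comparison
p_value_binomtest = stats.binomtest(
    n_successes, n_trials, p_claimed, alternative="less"
).pvalue

print(f"p-value via binom.cdf: {p_value_cdf:.4f}")
print(f"p-value via binomtest: {p_value_binomtest:.4f}")
```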