Comparing Birth Weights: Probability Using Normal Distributions

Statistics & Probability 11th-12th Grade
Problem
In a long-term study it was found that the distribution of birth weights for full-term male babies was approximately normal with a mean of 3622 grams and a standard deviation of 532 grams. The distribution of birth weights for full-term female babies was also approximately normal with a mean of 3465 grams and a standard deviation of 414 grams. Assuming the birth weights are independent, what is the probability that a randomly selected female baby weighs more than a randomly selected male baby?

What This Problem Teaches

  • Working with differences of independent normal random variables
  • Understanding how variances combine when subtracting random variables
  • Converting probability questions into standard normal calculations
  • Interpreting negative z-scores and their corresponding probabilities
  • Recognizing when to model comparative scenarios using distribution theory

Visualizing the Problem

Before diving into calculations, let's see what we're comparing. We have two overlapping normal distributions with different means and standard deviations:

In a long-term study it was found that the distribution of birth weights for full-term male babies was approximately...

Notice that while males have a higher average birth weight, there's substantial overlap between the distributions. The question asks for the probability that a randomly chosen female weighs more than a randomly chosen male.

Solution: Method 1 — The Difference Distribution Approach

When comparing two normal random variables, the key insight is to work with their difference. This transforms a two-variable problem into a single-variable problem.

Step 1 — Define the random variables

Let M be the weight of a randomly selected male baby: M ~ N(3622, 532²)

Let F be the weight of a randomly selected female baby: F ~ N(3465, 414²)

We want to find P(F > M), which is equivalent to P(F - M > 0).

Step 2 — Create the difference variable

Define D = F - M. Since F and M are independent normal random variables, D is also normally distributed.

For independent normal variables X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²):

X - Y ~ N(μ₁ - μ₂, σ₁² + σ₂²)

Step 3 — Calculate the mean of D

The mean of the difference is:

μ_D = μ_F - μ_M = 3465 - 3622 = -157 grams

Step 4 — Calculate the variance and standard deviation of D

The variance of the difference is (note that variances add, even when subtracting):

σ²_D = σ²_F + σ²_M = 414² + 532² = 171,396 + 283,024 = 454,420

Therefore: σ_D = √454,420 ≈ 674.1 grams

Step 5 — Convert to standard normal

We want P(D > 0). Using standardization:

z = (0 - μ_D) / σ_D = (0 - (-157)) / 674.1 = 157 / 674.1 ≈ 0.233

Step 6 — Find the probability

Using the standard normal table:

P(D > 0) = P(Z > 0.233) = 1 - Φ(0.233) = 1 - 0.5920 = 0.4080

Solution: Method 2 — Direct Integration Approach

We can also solve this using the joint distribution of the two independent variables, though this leads to the same calculation through a different path.

Step 1 — Set up the joint probability

Since F and M are independent, their joint probability density function is:

f(f,m) = f_F(f) × f_M(m)

Step 2 — Define the region of interest

We want P(F > M), which means integrating over the region where f > m:

P(F > M) = ∫∫_{f>m} f_F(f) × f_M(m) df dm

Step 3 — Transform the integration

Making the substitution d = f - m, this double integral becomes:

P(F > M) = P(F - M > 0) = P(D > 0)

Step 4 — Apply the normal difference formula

This brings us back to the same calculation as Method 1:

D ~ N(-157, 674.1²), so P(D > 0) ≈ 0.408

Why this works: The integration approach confirms that when we have independent normal variables, finding P(X > Y) is equivalent to finding P(X - Y > 0), which reduces to a single-variable normal probability.

P(Female weighs more than Male) = 0.408 = 40.8%

Verification

Checking our calculation

Mean check:μ_D = 3465 - 3622 = -157 ✓

Variance check:σ²_D = 414² + 532² = 171,396 + 283,024 = 454,420 ✓

Standard deviation:σ_D = √454,420 = 674.07 ≈ 674.1 ✓

Z-score:z = 157/674.1 = 0.2328 ≈ 0.233 ✓

Probability: From standard normal table, P(Z > 0.233) = 0.408 ✓

Sanity check

Our answer of 40.8% makes intuitive sense. Since males have a higher average birth weight (3622g vs 3465g), we'd expect the probability of a female weighing more than a male to be less than 50%. The substantial overlap between the distributions means this probability isn't tiny—40.8% represents significant overlap between the two distributions.

Watch Out For These

❌ Mistake #1: Subtracting variances instead of adding them

Wrong:σ²_D = 414² - 532² = -111,628 (impossible!)

Why it's wrong: When working with independent random variables, Var(X - Y) = Var(X) + Var(Y), not Var(X) - Var(Y). Variances always add because they measure uncertainty, and subtracting two uncertain quantities increases the total uncertainty.

❌ Mistake #2: Using the wrong mean in the z-score calculation

Wrong:z = (0 - 3465) / 674.1 or z = (0 - 3622) / 674.1

Why it's wrong: We're working with the distribution of D = F - M, which has mean μ_D = -157, not the mean of either original distribution.

❌ Mistake #3: Misinterpreting P(Z > negative z-score)

Confusion: "The z-score is positive, so the answer should be less than 0.5"

Why it's wrong: A positive z-score means we're looking to the right of the mean. Since P(Z > positive value) < 0.5, our answer of 40.8% is correct. The fact that it's less than 50% reflects that females typically weigh less than males.

The General Pattern

This problem illustrates a fundamental principle in probability: comparing normal random variables using their difference distribution.

General Formula: If X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²) are independent, then:

P(X > Y) = P(Z > (μ₂ - μ₁) / √(σ₁² + σ₂²))

where Z is standard normal.

This pattern appears throughout statistics and applied probability:

  • Quality control: Comparing measurements from two production lines
  • A/B testing: Determining if one version significantly outperforms another
  • Medical research: Comparing treatment effects between groups
  • Finance: Comparing returns from different investment strategies

Important limitation: This approach only works for independent normal variables. If there's correlation between the variables (e.g., birth weights of twins), you need to account for the covariance term.

Why This Matters

Beyond birth weight studies, this type of calculation has widespread applications:

Manufacturing: Quality engineers use this to compare products from different production lines. If line A produces items with mean strength 100 (σ = 8) and line B produces items with mean 95 (σ = 12), what's the probability a random item from B is stronger than one from A?

Education: Comparing test scores between different teaching methods or schools. Educational researchers constantly need to determine if observed differences are statistically meaningful.

Clinical trials: When testing a new treatment, researchers compare patient outcomes between treatment and control groups. The statistical framework is identical to our birth weight problem.

Four "What-If?" Problems

1
Different Weight Threshold
Using the same birth weight distributions, what is the probability that a randomly selected female baby weighs at least 200 grams more than a randomly selected male baby?
Step 1 — Reframe the question

We want P(F - M ≥ 200) instead of P(F - M > 0).

Step 2 — Use the same difference distribution

We already know D = F - M ~ N(-157, 674.1²)

Step 3 — Calculate the z-score

z = (200 - (-157)) / 674.1 = 357 / 674.1 = 0.530

Step 4 — Find the probability

P(Z > 0.530) = 1 - 0.7019 = 0.2981

Answer

29.8% probability that a female weighs at least 200g more than a male.

2
Comparing to Percentiles
What female birth weight corresponds to the 75th percentile of the male distribution? What's the probability a randomly selected female exceeds this weight?
Step 1 — Find the 75th percentile for males

For M ~ N(3622, 532²), find x such that P(M < x) = 0.75

Step 2 — Use inverse normal

z₀.₇₅ = 0.674, so x = 3622 + 0.674 × 532 = 3622 + 358.6 = 3980.6g

Step 3 — Find P(F > 3980.6)

For F ~ N(3465, 414²): z = (3980.6 - 3465) / 414 = 515.6 / 414 = 1.245

Step 4 — Calculate probability

P(Z > 1.245) = 1 - 0.8934 = 0.1066

Answer

The 75th percentile male weight is 3981g. Only 10.7% of females exceed this weight.

3
Larger Sample Comparison
Suppose we randomly select 5 female babies and 5 male babies. What is the probability that the average weight of the females exceeds the average weight of the males?
Step 1 — Define sample means

Let be the mean of 5 females and be the mean of 5 males.

Step 2 — Distribution of sample means

F̄ ~ N(3465, 414²/5) = N(3465, 171.4²)

M̄ ~ N(3622, 532²/5) = N(3622, 238.0²)

Step 3 — Find distribution of difference

D = F̄ - M̄ ~ N(-157, √(171.4² + 238.0²)) = N(-157, 293.8)

Step 4 — Calculate probability

z = (0 - (-157)) / 293.8 = 0.534

P(Z > 0.534) = 1 - 0.7033 = 0.2967

Answer

29.7% probability that the female sample average exceeds the male sample average.

4
Reverse Engineering
If we wanted the probability that a female weighs more than a male to be exactly 45%, what would the female mean birth weight need to be (keeping all standard deviations the same)?
Step 1 — Set up the equation

We want P(F > M) = 0.45, which means P(Z > z) = 0.45

Step 2 — Find the required z-score

If P(Z > z) = 0.45, then P(Z ≤ z) = 0.55, so z = 0.126

Step 3 — Work backwards to find μ_F

z = (3622 - μ_F) / 674.1 = 0.126

3622 - μ_F = 0.126 × 674.1 = 84.9

Step 4 — Solve for μ_F

μ_F = 3622 - 84.9 = 3537.1 grams

Answer

The female mean would need to be 3537 grams for a 45% probability.

Frequently Asked Questions

How do you find the probability that one normal random variable exceeds another?+

Create a new random variable D = X - Y representing their difference. Since both original variables are normal and independent, D is also normal with mean μ_D = μ_X - μ_Y and variance σ²_D = σ²_X + σ²_Y. Then P(X > Y) = P(D > 0), which you find using the standard normal distribution.

Why do variances add when subtracting normal random variables?+

For independent random variables, Var(X - Y) = Var(X) + Var(Y), not Var(X) - Var(Y). This is because variance measures spread, and subtracting two variable quantities increases the total uncertainty. In this problem, σ²_D = 532² + 414² = 454,168, so σ_D = 674 grams.

What does a negative z-score mean in probability problems?+

A negative z-score means the value is below the distribution's mean. When finding P(Z > negative value), you're looking for the area to the right of a point left of center, which is greater than 0.5. Here, z = 0.233 gives P(Z > 0.233) = 0.408 or about 40.8%.

NJ
Neven Jurkovic, PhD

Professor of Computer Science, Palo Alto College, Alamo Colleges District, San Antonio, TX

Developer of Algebrator

Contact

This solution was prepared with AI assistance and reviewed by Dr. Jurkovic for mathematical accuracy and pedagogical clarity.

2026-06-05