Normal Distribution

The normal, or Gaussian, distribution is one of the most important probability distributions and is widely used in statistics, engineering, and the applied sciences. Montgomery and Runger [1] note that it is symmetric about its mean, with probability density highest near the mean and decreasing toward the extremes. Wasserman [2] emphasizes its role as a foundation for probabilistic models and statistical inference.

Definitions

The normal distribution is characterized by two parameters:

  • \(\mu\): the mean, which represents the central value of the distribution.
  • \(\sigma\): the standard deviation, which quantifies the spread of the distribution around the mean.

The variance, \(\sigma^2\), is the square of the standard deviation. The probability density function (PDF) and the cumulative distribution function (CDF) are essential for describing the normal distribution's behavior.

Probability Density Function (PDF)

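The probability density function (PDF) of the normal distribution describes the relative likelihood of a random variable \(X\) taking values near a given point. Because \(X\) is continuous, the probability of any single exact value is zero, and probabilities are obtained by integrating the PDF over an interval. The PDF is given in Equation 1: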

\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \quad \text{for } -\infty < x < \infty. \]

(1)
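
As a minimal sketch, Equation 1 can be evaluated directly with Python's standard library. The function name `normal_pdf` and the example parameters are illustrative assumptions, not something taken from the cited references.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Evaluate the normal PDF of Equation 1 at x (hypothetical helper)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / math.sqrt(2.0 * math.pi * sigma**2)
    exponent = -((x - mu) ** 2) / (2.0 * sigma**2)
    return coeff * math.exp(exponent)


# Density at the mean of a standard normal, which equals 1/sqrt(2*pi).
peak = normal_pdf(0.0, mu=0.0, sigma=1.0)
print(f"{peak:.4f}")  # prints 0.3989
```

Note that the returned value is a density, not a probability: for a small enough \(\sigma\) it exceeds 1, for instance at the mean when \(\sigma = 0.1\).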

Standard Normal Distribution and Transformations

The cumulative distribution function (CDF) of the normal distribution, given in Equation 2, has no closed-form expression in terms of elementary functions:

\[ F(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left(-\frac{(t - \mu)^2}{2\sigma^2}\right) dt. \]

(2)

For this reason, results are typically expressed in terms of the standard normal distribution, which has a mean of 0 and a standard deviation of 1. Any random variable \(X \sim N(\mu, \sigma^2)\) can be transformed into a standard normal variable \(Y \sim N(0, 1)\) using the transformation in Equation 3:

\[ Y = \frac{X - \mu}{\sigma}. \]

(3)

For the standard normal variable \(Y\), the probability density function \(\phi(y)\) and the cumulative distribution function \(\Phi(y)\) are defined as follows:

\[ \begin{aligned} \phi(y) &= \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{y^2}{2}\right), \quad \text{for } -\infty < y < \infty, \\ \Phi(y) &= \int_{-\infty}^{y} \phi(z) \, dz, \quad \text{for } -\infty < y < \infty. \end{aligned} \]

The PDF \(f(x)\) for a variable \(X \sim N(\mu, \sigma^2)\) can be expressed in terms of \(\phi\), where the factor \(1/\sigma\) arises from the change of variables \(y = (x - \mu)/\sigma\):

\[ f(x) = \frac{1}{\sigma} \, \phi\left(\frac{x - \mu}{\sigma}\right). \]
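
Although \(\Phi\) cannot be written with elementary functions, it can be expressed through the error function as \(\Phi(y) = \tfrac{1}{2}\left[1 + \operatorname{erf}(y/\sqrt{2})\right]\), which Python provides as `math.erf`. The sketch below combines this identity with the standardization in Equation 3 to evaluate the CDF in Equation 2. The function names and the example values are assumptions chosen for illustration.

```python
import math


def standardize(x: float, mu: float, sigma: float) -> float:
    """Map x from N(mu, sigma^2) to the standard normal scale (Equation 3)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def std_normal_cdf(y: float) -> float:
    """Phi(y), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """F(x) of Equation 2, obtained as Phi((x - mu) / sigma)."""
    return std_normal_cdf(standardize(x, mu, sigma))


# Hypothetical example: X ~ N(10, 2^2), evaluate P(X <= 13).
y = standardize(13.0, mu=10.0, sigma=2.0)
p = normal_cdf(13.0, mu=10.0, sigma=2.0)
print(y, round(p, 4))  # prints 1.5 0.9332
```

Working on the standardized scale means a single implementation (or table) of \(\Phi\) serves every choice of \(\mu\) and \(\sigma\).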

Probability Intervals

Equation 4 summarizes the probabilities that a normal random variable falls within one, two, and three standard deviations of the mean [2]:

\[ \begin{aligned} P(\mu - \sigma < X < \mu + \sigma) &\approx 0.6827, \\ P(\mu - 2\sigma < X < \mu + 2\sigma) &\approx 0.9545, \\ P(\mu - 3\sigma < X < \mu + 3\sigma) &\approx 0.9973. \end{aligned} \]

(4)

Additionally, due to the symmetry of \(f(x)\), we have:

\[ P(X > \mu) = P(X < \mu) = 0.5. \]

Equation 4 shows that, for a normal distribution, approximately 68% of values fall within one standard deviation of the mean, 95% within two, and 99.7% within three. This is known as the empirical rule or the 68-95-99.7 rule [2].
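
The probabilities in Equation 4 do not depend on \(\mu\) or \(\sigma\), because standardizing reduces each interval to \((-k, k)\). A short numerical check is sketched below, using the identity \(\Phi(k) - \Phi(-k) = \operatorname{erf}(k/\sqrt{2})\). The function name `prob_within_k_sigma` is an illustrative assumption.

```python
import math


def prob_within_k_sigma(k: float) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for any normal X, via erf."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(k, round(prob_within_k_sigma(k), 4))
# prints:
# 1 0.6827
# 2 0.9545
# 3 0.9973
```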

Confidence Intervals and Applications

The normal distribution is often used to model errors or deviations in manufacturing or production processes. In this setting, a confidence interval is defined in terms of a factor \(k\), the number of standard deviations on either side of the mean. The probability that \(X\) falls within \(k\) standard deviations of the mean is:

\[ P(x_{\text{inf}} < X < x_{\text{sup}}) = P(\mu - k\sigma < X < \mu + k\sigma) = \int_{-k}^{k} \phi(y) \, dy = \Phi(k) - \Phi(-k). \]

The limits \(x_{\text{inf}} = \mu - k\sigma\) and \(x_{\text{sup}} = \mu + k\sigma\) are used as filters in quality control. For example, with \(k = 2\), which corresponds to a confidence level of about 95.45%, components with dimensions \(x_i < \mu - 2\sigma\) or \(x_i > \mu + 2\sigma\) are considered out of specification. This helps prevent excessive variation from compromising the quality of the final product.
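
The filtering step described above can be sketched as follows. This is only an illustration: the function names, the measurements, and the process parameters \(\mu = 10.0\) mm and \(\sigma = 0.05\) mm are hypothetical, and \(\mu\) and \(\sigma\) are assumed to be known rather than estimated from the data.

```python
def spec_limits(mu: float, sigma: float, k: float) -> tuple[float, float]:
    """Return (x_inf, x_sup) = (mu - k*sigma, mu + k*sigma)."""
    if sigma <= 0 or k <= 0:
        raise ValueError("sigma and k must be positive")
    return mu - k * sigma, mu + k * sigma


def out_of_spec(dimensions: list[float], mu: float, sigma: float,
                k: float = 2.0) -> list[float]:
    """Return the measurements that fall outside the k-sigma limits."""
    x_inf, x_sup = spec_limits(mu, sigma, k)
    return [x for x in dimensions if x < x_inf or x > x_sup]


# Hypothetical shaft diameters in mm, with assumed process parameters.
measurements = [10.02, 9.97, 10.11, 9.88, 10.00, 10.05]
rejected = out_of_spec(measurements, mu=10.0, sigma=0.05, k=2.0)
print(rejected)  # prints [10.11, 9.88]
```

With \(k = 2\), a process that is truly normal and on target would still see roughly 4.55% of conforming output rejected, so the choice of \(k\) trades false rejections against the risk of accepting excessive deviations.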

Applications in Engineering

The normal distribution is widely applied in engineering, particularly in reliability analysis, quality control, and structural design. As highlighted by Choi et al. [3], it is often used to model uncertainties in material properties, load capacities, and environmental conditions.

Montgomery and Runger [1] emphasize its critical role in statistical process control, where control charts based on the normal distribution help monitor manufacturing processes and detect deviations from expected performance. In structural engineering, reliability assessments use the normal distribution to estimate the probability of failure under various loading conditions.
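
As a rough illustration of such a reliability estimate, consider a common textbook simplification that is not taken from the cited references: the resistance \(R\) and the load \(S\) are assumed to be independent normal variables. The margin \(R - S\) is then normal with mean \(\mu_R - \mu_S\) and variance \(\sigma_R^2 + \sigma_S^2\), so the probability of failure is \(P(R < S) = \Phi(-\beta)\) with \(\beta = (\mu_R - \mu_S)/\sqrt{\sigma_R^2 + \sigma_S^2}\). All names and numbers in the sketch below are hypothetical.

```python
import math


def std_normal_cdf(y: float) -> float:
    """Phi(y), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def failure_probability(mu_r: float, sigma_r: float,
                        mu_s: float, sigma_s: float) -> float:
    """P(R < S) for independent normal resistance R and load S (assumed model)."""
    beta = (mu_r - mu_s) / math.sqrt(sigma_r**2 + sigma_s**2)
    return std_normal_cdf(-beta)


# Hypothetical values: resistance N(300, 30^2), load N(200, 40^2), same units.
p_f = failure_probability(mu_r=300.0, sigma_r=30.0, mu_s=200.0, sigma_s=40.0)
print(f"{p_f:.5f}")  # beta = 2, prints 0.02275
```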

Wasserman [2] discusses the application of the normal distribution in machine learning and data analysis, particularly in Bayesian inference, where it serves as a key component of probabilistic models. These applications illustrate the versatility of the normal distribution in solving complex, real-world engineering problems.

References