NORMAL DISTRIBUTION - Quantitative Techniques for management

The normal distribution is a theoretical function commonly used in inferential statistics as an approximation to sampling distributions. In general, the normal distribution provides a good model for a random variable, when:

  1. There is a strong tendency for the variable to take a central value;
  2. Positive and negative deviations from this central value are equally likely;
  3. The frequency of deviations falls off rapidly as the deviations become larger.

As an underlying mechanism that produces the normal distribution, we can think of an infinite number of independent random (binomial) events that bring about the values of a particular variable. For example, there are probably a nearly infinite number of factors that determine a person's height (thousands of genes, nutrition, diseases, etc.). Thus, height can be expected to be normally distributed in the population.

Since Gauss used this curve to describe the theory of accidental errors of measurement involved in the calculation of orbits of heavenly bodies, it is also called the Gaussian curve.

The Conditions of Normality

In order that the distribution of a random variable X is normal, the factors affecting its observations must satisfy the following conditions:

  1. A large number of chance factors: The factors, affecting the observations of a random variable, should be numerous and equally probable so that the occurrence or non-occurrence of any one of them is not predictable.
  2. Condition of homogeneity: The factors must be similar over the relevant population although, their incidence may vary from observation to observation.
  3. Condition of independence: The factors, affecting observations, must act independently of each other.
  4. Condition of symmetry: Various factors operate in such a way that the deviations of observations above and below mean are balanced with regard to their magnitude as well as their number.

Random variables observed in many phenomena related to economics, business and other social as well as physical sciences are often found to be distributed normally. For example, observations relating to the life of an electrical component, weight of packages, height of persons, income of the inhabitants of certain area, diameter of wire, etc., are affected by a large number of factors and hence, tend to follow a pattern that is very similar to the normal curve.

In addition to this, when the number of observations becomes large, a number of probability distributions like Binomial, Poisson, etc., can also be approximated by this distribution.

Probability Density Function

If X is a continuous random variable, distributed normally with mean µ and standard deviation σ, then its p.d.f. is given by

p(X) = [1 / (σ√(2π))] e^(–(X – µ)² / (2σ²)), where –∞ < X < ∞.

Here π and e are absolute constants with values 3.14159.... and 2.71828.... respectively. It may be noted here that this distribution is completely known if the values of mean µ and standard deviation σ are known. Thus, the distribution has two parameters, viz. mean and standard deviation.
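As a rough illustration of the density formula, the sketch below evaluates it directly and compares the result with Python's standard-library `statistics.NormalDist`. The function name `normal_pdf` and the parameter values (mean 10, standard deviation 2) are illustrative assumptions, not taken from the text.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mean, sd):
    """Density of a normal variate with the given mean and standard deviation."""
    coefficient = 1.0 / (sd * math.sqrt(2.0 * math.pi))
    exponent = -((x - mean) ** 2) / (2.0 * sd ** 2)
    return coefficient * math.exp(exponent)


# Illustrative parameters (assumed, not from the text).
mean, sd = 10.0, 2.0
for x in (6.0, 10.0, 14.0):
    # The hand-written formula and the library value should agree.
    print(x, normal_pdf(x, mean, sd), NormalDist(mean, sd).pdf(x))
```

The two columns should match, and the values at 6 and 14 should be equal because both points lie the same distance from the mean.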

Shape of Normal Probability Curve

For given values of the parameters, µ and σ, the shape of the curve corresponding to normal probability density function p(X) is as shown in Figure.

[Figure: Shape of Normal Probability Curve]

It should be noted here that although we seldom encounter variables that have a range from –∞ to ∞, as shown by the normal curve, the curves generated by the relative frequency histograms of various variables closely resemble the shape of the normal curve.

Properties of Normal Probability Curve

A normal probability curve or normal curve has the following properties:

  1. It is a bell-shaped curve, symmetrical about the ordinate at X = µ. The ordinate is maximum at X = µ.
  2. It is a unimodal curve and its tails extend infinitely in both directions, i.e., the curve is asymptotic to the X axis in both directions.
  3. All the three measures of central tendency coincide, i.e., mean = median = mode.
  4. The total area under the curve gives the total probability of the random variable taking values between –∞ and ∞. Mathematically, it can be shown that

    ∫ p(X) dX = 1, the integral being taken from X = –∞ to X = ∞.

  5. Since median = µ, the ordinate at X = µ divides the area under the normal curve into two equal parts, i.e.,

    ∫ p(X) dX (from –∞ to µ) = ∫ p(X) dX (from µ to ∞) = 0.5.

  6. The value of p(X) is always non-negative for all values of X, i.e., the whole curve lies above the X axis.
  7. The points of inflexion (the points at which curvature changes) of the curve are at X = µ ± σ.
  8. The quartiles are equidistant from the median, i.e., Md – Q1 = Q3 – Md, by virtue of symmetry. Also Q1 = µ – 0.6745σ, Q3 = µ + 0.6745σ, quartile deviation = 0.6745σ and mean deviation = 0.8σ, approximately.
  9. Since the distribution is symmetrical, all odd ordered central moments are zero.
  10. The successive even ordered central moments are related according to the recurrence formula µ_{2n} = (2n – 1) σ² µ_{2n–2} for n = 1, 2, 3, ......
  11. The value of the moment coefficient of skewness β1 is zero.
  12. The coefficient of kurtosis

    β2 = µ_4 / µ_2² = 3σ⁴ / σ⁴ = 3.

    Note that the above expression makes use of property 10, which gives µ_4 = 3σ² µ_2 = 3σ⁴.

  13. Area property: The area under the normal curve is distributed by its standard deviation in the following manner.

    [Figure: Area property]

    1. The area between the ordinates at µ – σ and µ + σ is 0.6826. This implies that for a normal distribution about 68% of the observations will lie between µ – σ and µ + σ.
    2. The area between the ordinates at µ – 2σ and µ + 2σ is 0.9544. This implies that for a normal distribution about 95% of the observations will lie between µ – 2σ and µ + 2σ.
    3. The area between the ordinates at µ – 3σ and µ + 3σ is 0.9974. This implies that for a normal distribution about 99.7% of the observations will lie between µ – 3σ and µ + 3σ. This result shows that, practically, the range of the distribution is 6σ although, theoretically, the range is from –∞ to ∞.
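The area property and the quartile multiplier can be checked numerically. The following is a minimal sketch using the standard-library `statistics.NormalDist`; the helper name `area_within` is an assumption made for illustration. The computed areas (about 0.6827, 0.9545 and 0.9973) differ slightly in the fourth decimal from the figures quoted above, which come from doubling four-digit table entries.

```python
from statistics import NormalDist

standard_normal = NormalDist(0.0, 1.0)


def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


areas = {k: area_within(k) for k in (1, 2, 3)}
for k, area in areas.items():
    print(f"within {k} s.d. of the mean: {area:.4f}")

# The upper quartile of z gives the multiplier used for Q1 and Q3.
upper_quartile_z = standard_normal.inv_cdf(0.75)
print(f"Q3 lies {upper_quartile_z:.4f} s.d. above the mean")
```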

Probability of Normal Variate in an Interval

Let X be a normal variate distributed with mean µ and standard deviation σ, also written in abbreviated form as X ~ N(µ, σ). The probability of X lying in the interval (X1, X2) is given by

P(X1 ≤ X ≤ X2) = ∫ [1 / (σ√(2π))] e^(–(X – µ)² / (2σ²)) dX, the integral being taken from X = X1 to X = X2.

[Figure: Probability of Normal Variate in an Interval]

In terms of figure, this probability is equal to the area under the normal curve between the ordinates at X = X1 and X = X2 respectively.

Note: It may be recalled that the probability that a continuous random variable takes a particular value is defined to be zero even though the event is not impossible.

It is obvious from the above that, to find P(X1 ≤ X ≤ X2), we have to evaluate an integral, which might be a cumbersome and time-consuming task. Fortunately, an alternative procedure is available for performing this task. To devise this procedure, we define a new variable z = (X – µ)/σ.

The mean of z is E(z) = [E(X) – µ]/σ = 0 and its variance is Var(z) = Var(X)/σ² = 1.

Further, from the reproductive property, it follows that the distribution of z is also normal.

Thus, we conclude that if X is a normal variate with mean µ and standard deviation σ, then z = (X – µ)/σ is a normal variate with mean zero and standard deviation unity. Since the parameters of the distribution of z are fixed, it is a known distribution and is termed the standard normal distribution (s.n.d.). Further, z is termed a standard normal variate (s.n.v.).

It is obvious from the above that the distribution of any normal variate X can always be transformed into the distribution of standard normal variate z. This fact can be utilised to evaluate the integral given above.

P(X1 ≤ X ≤ X2) = P(z1 ≤ z ≤ z2), where z1 = (X1 – µ)/σ and z2 = (X2 – µ)/σ.

[Figure: Distribution of standard normal variate z]

In terms of figure, this probability is equal to the area under the standard normal curve between the ordinates at z = z1 and z = z2.

Since the distribution of z is fixed, the probabilities of z lying in various intervals are tabulated. These tables can be used to write down the desired probability.
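The transformation to a standard normal variate can be sketched in a few lines. The function name `prob_between` and the numbers used (mean 100, standard deviation 15, interval 85 to 130) are hypothetical, chosen only to show that standardising and then using the fixed distribution of z gives the same answer as working with X directly.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0.0, 1.0)


def prob_between(x1, x2, mean, sd):
    """P(x1 <= X <= x2) for a normal X, computed through the standard normal variate z."""
    z1 = (x1 - mean) / sd
    z2 = (x2 - mean) / sd
    return STANDARD_NORMAL.cdf(z2) - STANDARD_NORMAL.cdf(z1)


# Hypothetical example: the limits correspond to z1 = -1 and z2 = 2.
via_z = prob_between(85, 130, mean=100, sd=15)
direct = NormalDist(100, 15).cdf(130) - NormalDist(100, 15).cdf(85)
print(via_z, direct)
```

Here `cdf` plays the role of the printed table: it returns the area to the left of a given value.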

Example: Using the table of areas under the standard normal curve, find the following probabilities :

(i) P(0 ≤ z ≤ 1.3) (ii) P(–1 ≤ z ≤ 0) (iii) P(–1 ≤ z ≤ 12)
(iv) P( z ≥ 1.54) (v) P(|z| > 2) (vi) P(|z| < 2)

Solution: The required probability, in each question, is indicated by the shaded area of the corresponding figure.

  (i) From the table, we can write P(0 ≤ z ≤ 1.3) = 0.4032.
  (ii) We can write P(–1 ≤ z ≤ 0) = P(0 ≤ z ≤ 1) = 0.3413, because the distribution is symmetrical.
    [Figure: required probability]
  (iv) We can write
    P(z ≥ 1.54) = 0.5000 – P(0 ≤ z ≤ 1.54) = 0.5000 – 0.4382 = 0.0618.
  (v) P(|z| > 2) = P(z > 2) + P(z < –2) = 2P(z > 2) = 2[0.5000 – P(0 ≤ z ≤ 2)]
    = 1 – 2P(0 ≤ z ≤ 2) = 1 – 2 × 0.4772 = 0.0456.
  (vi) P(|z| < 2) = P(–2 ≤ z ≤ 0) + P(0 ≤ z ≤ 2) = 2P(0 ≤ z ≤ 2) = 2 × 0.4772 = 0.9544.
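The table look-ups in this solution can be reproduced in code. The sketch below assumes a helper `area_0_to` (a hypothetical name) that mimics a table of areas between 0 and z; the results agree with the table values up to rounding in the fourth decimal.

```python
from statistics import NormalDist

standard_normal = NormalDist()


def area_0_to(z):
    """Area under the standard normal curve between 0 and z (for z >= 0), as tabulated."""
    return standard_normal.cdf(z) - 0.5


results = {
    "(i)": area_0_to(1.3),                # P(0 <= z <= 1.3)
    "(ii)": area_0_to(1.0),               # P(-1 <= z <= 0), by symmetry
    "(iv)": 0.5 - area_0_to(1.54),        # P(z >= 1.54)
    "(v)": 1.0 - 2.0 * area_0_to(2.0),    # P(|z| > 2)
    "(vi)": 2.0 * area_0_to(2.0),         # P(|z| < 2)
}
for part, value in results.items():
    print(part, round(value, 4))
```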

Example : Determine the value or values of z in each of the following situations:
(a) Area between 0 and z is 0.4495.
(b) Area between – ∞ to z is 0.1401.
(c) Area between – ∞ to z is 0.6103.
(d) Area between – 1.65 and z is 0.0173.
(e) Area between – 0.5 and z is 0.5376.

Solution:

(a) On locating the value of z corresponding to an entry of area 0.4495 in the table of areas under the normal curve, we have z = 1.64. We note that the same situation may correspond to a negative value of z. Thus, z can be 1.64 or - 1.64.

(b) Since the area between –∞ and z is less than 0.5000, z will be negative. Further, the area between z and 0 = 0.5000 – 0.1401 = 0.3599. On locating the value of z corresponding to this entry in the table, we get z = –1.08.

(c) Since the area between –∞ and z is greater than 0.5000, z will be positive. Further, the area between 0 and z = 0.6103 – 0.5000 = 0.1103. On locating the value of z corresponding to this entry in the table, we get z = 0.28.

(d) Since the area between –1.65 and z is less than the area between –1.65 and 0 (which, from table, is 0.4505), z is negative. Further, z can lie to the right or to the left of –1.65. When z lies to the right of –1.65, it corresponds to an area (0.4505 – 0.0173) = 0.4332, which gives z = –1.5 (from table). When z lies to the left of –1.65, it corresponds to an area (0.4505 + 0.0173) = 0.4678, which gives z = –1.85 (from table).

(e) Since the area between –0.5 and z is greater than the area between –0.5 and 0 (which, from table, is 0.1915), z is positive. The value of z corresponding to an area (0.5376 – 0.1915) = 0.3461 is 1.02.
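Reading the table backwards, from an area to a value of z, corresponds to the inverse of the cumulative distribution function. The sketch below redoes parts (a) to (e) with `NormalDist.inv_cdf`; the variable names are illustrative, and for (a) only the positive root is computed (the negative root follows by symmetry).

```python
from statistics import NormalDist

sn = NormalDist()  # standard normal variate z

z_a = sn.inv_cdf(0.5 + 0.4495)                   # (a) area 0 to z is 0.4495
z_b = sn.inv_cdf(0.1401)                         # (b) area -inf to z is 0.1401
z_c = sn.inv_cdf(0.6103)                         # (c) area -inf to z is 0.6103
z_d_right = sn.inv_cdf(sn.cdf(-1.65) + 0.0173)   # (d) z to the right of -1.65
z_d_left = sn.inv_cdf(sn.cdf(-1.65) - 0.0173)    # (d) z to the left of -1.65
z_e = sn.inv_cdf(sn.cdf(-0.5) + 0.5376)          # (e) area -0.5 to z is 0.5376

for name, z in [("a", z_a), ("b", z_b), ("c", z_c),
                ("d, right", z_d_right), ("d, left", z_d_left), ("e", z_e)]:
    print(name, round(z, 2))
```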

Example : If X is a random variate which is distributed normally with mean 60 and standard deviation 5, find the probabilities of the following events:
(i) 60 ≤ X ≤ 70, (ii) 50 ≤ X ≤ 65, (iii) X > 45, (iv) X ≤ 50.

Solution: It is given that µ = 60 and σ = 5.

(i) Given X1 = 60 and X2 = 70, we can write

z1 = (X1 – µ)/σ = (60 – 60)/5 = 0 and z2 = (X2 – µ)/σ = (70 – 60)/5 = 2.

∴ P(60 ≤ X ≤ 70) = P(0 ≤ z ≤ 2) = 0.4772 (from table).


(ii) Here X1 = 50 and X2 = 65, therefore, we can write

z1 = (50 – 60)/5 = –2 and z2 = (65 – 60)/5 = 1

Hence P(50 ≤ X ≤ 65) = P(–2 ≤ z ≤ 1) = P(0 ≤ z ≤ 2) + P(0 ≤ z ≤ 1)
= 0.4772 + 0.3413 = 0.8185

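(iii) P(X > 45) = P(z ≥ (45 – 60)/5) = P(z ≥ –3)
= P(–3 ≤ z ≤ 0) + P(0 ≤ z ≤ ∞) = P(0 ≤ z ≤ 3) + P(0 ≤ z ≤ ∞)
= 0.4987 + 0.5000 = 0.9987

(iv) P(X ≤ 50) = P(z ≤ (50 – 60)/5) = P(z ≤ –2)
= 0.5000 – P(0 ≤ z ≤ 2) = 0.5000 – 0.4772 = 0.0228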
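The four events of this example can also be evaluated without standardising by hand, since `statistics.NormalDist` accepts the mean and standard deviation directly. This is only a cross-check sketch; the variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=60, sigma=5)

p_60_to_70 = X.cdf(70) - X.cdf(60)   # (i)   60 <= X <= 70
p_50_to_65 = X.cdf(65) - X.cdf(50)   # (ii)  50 <= X <= 65
p_over_45 = 1.0 - X.cdf(45)          # (iii) X > 45
p_up_to_50 = X.cdf(50)               # (iv)  X <= 50

print(round(p_60_to_70, 4), round(p_50_to_65, 4),
      round(p_over_45, 4), round(p_up_to_50, 4))
```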

Example : The average monthly sales of 5,000 firms are normally distributed with mean Rs 36,000 and standard deviation Rs 10,000. Find :
(i) The number of firms with sales of over Rs 40,000.
(ii) The percentage of firms with sales between Rs 38,500 and Rs 41,000.
(iii) The number of firms with sales between Rs 30,000 and Rs 40,000.

Solution: Let X be the normal variate which represents the monthly sales of a firm. Thus X ~ N(36,000, 10,000).

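(i) P(X > 40,000) = P(z > (40,000 – 36,000)/10,000) = P(z > 0.4) = 0.5000 – 0.1554 = 0.3446. Thus, the number of firms having sales over Rs 40,000 = 0.3446 × 5,000 = 1,723.

(ii) P(38,500 ≤ X ≤ 41,000) = P(0.25 ≤ z ≤ 0.5) = 0.1915 – 0.0987 = 0.0928. Thus, the required percentage of firms = 9.28%.

(iii) P(30,000 ≤ X ≤ 40,000) = P(–0.6 ≤ z ≤ 0.4) = 0.2257 + 0.1554 = 0.3811. Thus, the number of firms having sales between Rs 30,000 and Rs 40,000 = 0.3811 × 5,000 ≈ 1,906.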
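The three quantities asked for in the sales example follow from multiplying normal probabilities by the number of firms. A minimal sketch, with illustrative variable names, assuming the distribution N(36,000, 10,000) stated in the solution:

```python
from statistics import NormalDist

sales = NormalDist(mu=36_000, sigma=10_000)
n_firms = 5_000

# (i) expected number of firms with sales over Rs 40,000
firms_over_40k = n_firms * (1.0 - sales.cdf(40_000))

# (ii) percentage of firms with sales between Rs 38,500 and Rs 41,000
pct_38500_to_41000 = 100.0 * (sales.cdf(41_000) - sales.cdf(38_500))

# (iii) expected number of firms with sales between Rs 30,000 and Rs 40,000
firms_30k_to_40k = n_firms * (sales.cdf(40_000) - sales.cdf(30_000))

print(round(firms_over_40k), round(pct_38500_to_41000, 2), round(firms_30k_to_40k))
```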

Example : In a large institution, 2.28% of employees have income below Rs 4,500 and 15.87% of employees have income above Rs. 7,500 per month. Assuming the distribution of income to be normal, find its mean and standard deviation.

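Solution: Let the mean and standard deviation of the given distribution be µ and σ respectively.

It is given that P(X < 4,500) = 0.0228, so that (4,500 – µ)/σ = –2, since P(z < –2) = 0.5000 – 0.4772 = 0.0228. Similarly, P(X > 7,500) = 0.1587 gives (7,500 – µ)/σ = 1, since P(z > 1) = 0.5000 – 0.3413 = 0.1587. Solving µ – 2σ = 4,500 and µ + σ = 7,500 simultaneously, we get σ = Rs 1,000 and µ = Rs 6,500.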
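When two percentile points of a normal distribution are known, the mean and standard deviation follow from two linear equations in z. The sketch below does this for the income example; the variable names are illustrative, and the answers come out close to, rather than exactly at, round figures because the given percentages are themselves rounded.

```python
from statistics import NormalDist

sn = NormalDist()

# z-values matching the two given tail percentages.
z_low = sn.inv_cdf(0.0228)          # 2.28% of incomes lie below Rs 4,500
z_high = sn.inv_cdf(1.0 - 0.1587)   # 15.87% of incomes lie above Rs 7,500

# Solve 4500 = mean + z_low * sd and 7500 = mean + z_high * sd.
sd = (7_500 - 4_500) / (z_high - z_low)
mean = 4_500 - z_low * sd

print(round(mean), round(sd))
```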

Example : Marks in an examination are approximately normally distributed with mean 75 and standard deviation 5. If the top 5% of the students get grade A and the bottom 25% get grade F, what mark is the lowest A and what mark is the highest F?

Solution: Let A be the lowest mark in grade A and F be the highest mark in grade F. From the given information, we can write

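P(X ≥ A) = 0.05, so that (A – 75)/5 = 1.645 (from table) and A = 75 + 5 × 1.645 = 83.225.

P(X ≤ F) = 0.25, so that (F – 75)/5 = –0.6745 (from table) and F = 75 – 5 × 0.6745 = 71.6275.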
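Cut-off marks of this kind are percentiles of the distribution, so they can be obtained with the inverse c.d.f. A short sketch with illustrative variable names, assuming marks distributed N(75, 5) as in the example:

```python
from statistics import NormalDist

marks = NormalDist(mu=75, sigma=5)

lowest_a = marks.inv_cdf(0.95)    # 5% of students score above this mark
highest_f = marks.inv_cdf(0.25)   # 25% of students score below this mark

print(round(lowest_a, 2), round(highest_f, 2))
```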

Example : The mean inside diameter of a sample of 200 washers produced by a machine is 5.02 mm and the standard deviation is 0.05 mm. The purpose for which these washers are intended allows a maximum tolerance in the diameter of 4.96 to 5.08 mm, otherwise the washers are considered as defective. Determine the percentage of defective washers produced by the machine on the assumption that diameters are normally distributed.

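Solution: Let X denote the inside diameter of a washer, so that X ~ N(5.02, 0.05). The probability that a washer is non-defective is P(4.96 ≤ X ≤ 5.08) = P(–1.2 ≤ z ≤ 1.2) = 2 × 0.3849 = 0.7698. Thus, the percentage of defective washers = 100 × (1 – 0.7698) = 23.02%.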

Example : The average number of units produced by a manufacturing concern per day is 355 with a standard deviation of 50. It makes a profit of Rs 1.50 per unit. Determine the percentage of days when its total profit per day is (i) between Rs 457.50 and Rs 645.00, (ii) greater than Rs 682.50 (assume the distribution to be normal). The area between z = 0 to z = 1 is 0.34134, the area between z = 0 to z = 1.5 is 0.43319 and the area between z = 0 to z = 2 is 0.47725, where z is a standard normal variate.

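Solution: Let X denote the profit per day. The mean of X is 355 × 1.50 = Rs 532.50 and its S.D. is 50 × 1.50 = Rs 75. Thus, X ~ N(532.50, 75).

(i) P(457.50 ≤ X ≤ 645.00) = P(–1 ≤ z ≤ 1.5) = 0.34134 + 0.43319 = 0.77453. Thus, the required percentage of days = 77.453%.

(ii) P(X > 682.50) = P(z > 2) = 0.5000 – 0.47725 = 0.02275. Thus, the required percentage of days = 2.275%.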

Example : The distribution of 1,000 examinees according to marks percentage is given below:

[Table: distribution of 1,000 examinees according to marks percentage]

Assuming the marks percentage to follow a normal distribution, calculate the mean and standard deviation of marks. If not more than 300 examinees are to fail, what should be the passing marks?

Solution: Let X denote the percentage of marks and its mean and S.D. be µ and σ respectively. From the given table, we can write

[Equations: mean and standard deviation]

Example : In a certain book, the frequency distribution of the number of words per page may be taken as approximately normal with mean 800 and standard deviation 50. If three pages are chosen at random, what is the probability that none of them has between 830 and 845 words each?

Solution: Let X be a normal variate which denotes the number of words per page. It is given that X ~ N(800, 50).

The probability that a page, selected at random, does not have between 830 and 845 words is given by

1 – P(830 ≤ X ≤ 845) = 1 – P(0.6 ≤ z ≤ 0.9) = 1 – (0.3159 – 0.2257) = 1 – 0.0902 = 0.9098 ≈ 0.91.

Thus, the probability that none of the three pages, selected at random, has between 830 and 845 words = (0.91)³ = 0.7536.
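The calculation combines a normal probability for one page with independence across the three pages. A minimal sketch with illustrative variable names follows; because it does not round the single-page probability to 0.91, it gives roughly 0.7531 rather than 0.7536.

```python
from statistics import NormalDist

words = NormalDist(mu=800, sigma=50)

# Probability that one page has between 830 and 845 words.
p_in_range = words.cdf(845) - words.cdf(830)

# Pages are chosen independently, so the "not in range" probabilities multiply.
p_none_of_three = (1.0 - p_in_range) ** 3

print(round(p_in_range, 4), round(p_none_of_three, 4))
```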

Example : At a petrol station, the mean quantity of petrol sold to a vehicle is 20 litres per day with a standard deviation of 10 litres. If on a particular day, 100 vehicles took 25 or more litres of petrol, estimate the total number of vehicles who took petrol from the station on that day. Assume that the quantity of petrol taken from the station by a vehicle is a normal variate.

Solution: Let X denote the quantity of petrol taken by a vehicle. It is given that X ~ N(20, 10).

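P(X ≥ 25) = P(z ≥ (25 – 20)/10) = P(z ≥ 0.5) = 0.5000 – 0.1915 = 0.3085.

If N is the total number of vehicles, then N × 0.3085 = 100, so that N = 100/0.3085 ≈ 324.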
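The estimate works backwards from an observed count to a total: the count is divided by the probability of the event. A short sketch with illustrative variable names, assuming X ~ N(20, 10) as stated in the solution:

```python
from statistics import NormalDist

petrol = NormalDist(mu=20, sigma=10)

# Probability that a vehicle takes 25 litres or more.
p_25_or_more = 1.0 - petrol.cdf(25)

# 100 vehicles were observed in this category, so total * p = 100.
estimated_total = 100 / p_25_or_more

print(round(p_25_or_more, 4), round(estimated_total))
```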

Normal Approximation to Binomial Distribution

Normal distribution can be used as an approximation to binomial distribution when n is large and neither p nor q is very small. If X denotes the number of successes with probability p of a success in each of the n trials, then X will be distributed approximately normally with mean np and standard deviation √(npq), so that

z = (X – np)/√(npq)

is approximately a standard normal variate.

Correction for Continuity
Since the number of successes is a discrete variable, to use normal approximation, we have to make a correction for continuity: each integer value a is replaced by the interval from a – 0.5 to a + 0.5, so that, for example, P(X ≥ a) is approximated by the normal probability P(X ≥ a – 0.5).

Example : An unbiased die is tossed 600 times. Use normal approximation to binomial to find the probability of obtaining

(i) more than 125 aces,
(ii) number of aces between 80 and 110,
(iii) exactly 150 aces.

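Solution: Let X denote the number of successes, i.e., the number of aces. Here n = 600 and p = 1/6, so that the mean is np = 100 and the standard deviation is √(npq) = √(600 × 1/6 × 5/6) ≈ 9.13.

(i) P(X > 125) ≈ P(z ≥ (125.5 – 100)/9.13) = P(z ≥ 2.79) = 0.5000 – 0.4974 = 0.0026.

(ii) Taking both end points as included, P(80 ≤ X ≤ 110) ≈ P((79.5 – 100)/9.13 ≤ z ≤ (110.5 – 100)/9.13) = P(–2.25 ≤ z ≤ 1.15) = 0.4878 + 0.3749 = 0.8627.

(iii) P(X = 150) ≈ P((149.5 – 100)/9.13 ≤ z ≤ (150.5 – 100)/9.13) = P(5.42 ≤ z ≤ 5.53) ≈ 0.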
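To see how well the approximation performs for the die example, the sketch below compares exact binomial probabilities with the normal approximation using a correction for continuity. The helper `binomial_pmf` is a hypothetical name, the log-gamma form is used only to avoid overflow, and part (ii) is read as including both end points, which is an assumption.

```python
import math
from statistics import NormalDist

n, p = 600, 1 / 6
q = 1 - p
approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))


def binomial_pmf(k):
    """Exact P(X = k) for X ~ Binomial(n, p), computed through logarithms."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(q))
    return math.exp(log_pmf)


# (i) more than 125 aces: the discrete event X >= 126 becomes X >= 125.5
exact_more_than_125 = sum(binomial_pmf(k) for k in range(126, n + 1))
approx_more_than_125 = 1.0 - approx.cdf(125.5)

# (ii) 80 to 110 aces, both end points included (assumed reading)
exact_80_to_110 = sum(binomial_pmf(k) for k in range(80, 111))
approx_80_to_110 = approx.cdf(110.5) - approx.cdf(79.5)

# (iii) exactly 150 aces: the point 150 becomes the interval 149.5 to 150.5
exact_150 = binomial_pmf(150)
approx_150 = approx.cdf(150.5) - approx.cdf(149.5)

print(exact_more_than_125, approx_more_than_125)
print(exact_80_to_110, approx_80_to_110)
print(exact_150, approx_150)
```

The approximation is weakest far out in the tails, as in part (iii), where both values are practically zero but differ in relative terms.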

Normal Approximation to Poisson Distribution

Normal distribution can also be used to approximate a Poisson distribution when its parameter m ≥ 10. If X is a Poisson variate with mean m, then, for m ≥ 10, the distribution of X can be taken as approximately normal with mean m and standard deviation √m, so that z = (X – m)/√m is a standard normal variate.


Example : A random variable X follows Poisson distribution with parameter 25. Use normal approximation to Poisson distribution to find the probability that X is greater than or equal to 30.

Solution:

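Here m = 25, so that X is approximately normal with mean 25 and standard deviation √25 = 5. Applying the correction for continuity, P(X ≥ 30) ≈ P(z ≥ (29.5 – 25)/5) = P(z ≥ 0.9) = 0.5000 – 0.3159 = 0.1841.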
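The quality of the normal approximation for this Poisson example can be judged against the exact Poisson probability. In the sketch below, `poisson_pmf` is a hypothetical helper, and the approximation is shown both with and without a correction for continuity; which of the two the original solution used is not known, so both are given.

```python
import math
from statistics import NormalDist

m = 25  # Poisson parameter (mean)


def poisson_pmf(k, mean):
    """Exact P(X = k) for a Poisson variate with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


# Exact P(X >= 30) = 1 - P(X <= 29)
exact = 1.0 - sum(poisson_pmf(k, m) for k in range(30))

approx = NormalDist(mu=m, sigma=math.sqrt(m))
with_correction = 1.0 - approx.cdf(29.5)    # X >= 30 treated as X >= 29.5
without_correction = 1.0 - approx.cdf(30)   # no continuity correction

print(round(exact, 4), round(with_correction, 4), round(without_correction, 4))
```

In this case the corrected value lies noticeably closer to the exact probability than the uncorrected one.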

Fitting a Normal Curve

A normal curve is fitted to the observed data with the following objectives:

1. To provide a visual device to judge whether it is a good fit or not.
2. To estimate the characteristics of the population.

The fitting of a normal curve can be done by

(a) The Method of Ordinates or
(b) The Method of Areas.

(a) Method of Ordinates: In this method, the ordinate f(X) of the normal curve, for various values of the random variate X are obtained by using the table of ordinates for a standard normal variate.

[Equation: Method of Ordinates]

Example : Fit a normal curve to the following data :

[Table: data for fitting a normal curve]

Note: If the class intervals are not continuous, they should first be made so.

∴ µ = 45 – 10 × (10/100) = 44

[Table: class intervals]

(b) Method of Areas: Under this method, the probabilities or the areas of the random variable lying in various intervals are determined. These probabilities are then multiplied by N to get the expected frequencies. This procedure is explained below for the data of the above example.

[Table: Method of Areas]

Exercise with Hints

  1. In a metropolitan city, there are on the average 10 fatal road accidents in a month (30 days). What is the probability that (i) there will be no fatal accident tomorrow,
    (ii) next fatal accident will occur within a week?
    Hint: Take m = 1/3 and apply exponential distribution.
  2. A counter at a super bazaar can entertain on the average 20 customers per hour. What is the probability that the time taken to serve a particular customer will be
    (i) less than 5 minutes, (ii) greater than 8 minutes?
    Hint: Use exponential distribution.
  3. The marks obtained in a certain examination follow normal distribution with mean 45 and standard deviation 10. If 1,000 students appeared at the examination, calculate the number of students scoring (i) less than 40 marks, (ii) more than 60 marks and (iii) between 40 and 50 marks.
    Hint: See example 30.
  4. The ages of workers in a large plant, with a mean of 50 years and standard deviation of 5 years, are assumed to be normally distributed. If 20% of the workers are below a certain age, find that age.
    Hint: Given P(X < X1) = 0.20, find X1.
  5. The mean and standard deviation of certain measurements computed from a large sample are 10 and 3 respectively. Use normal distribution approximation to answer the following:
    (i) About what percentage of the measurements lie between 7 and 13 inclusive?
    (ii) About what percentage of the measurements are greater than 16?
    Hint: Apply correction for continuity.
  6. There are 600 business students in the post graduate department of a university and the probability for any student to need a copy of a particular text book from the university library on any day is 0.05. How many copies of the book should be kept in the library so that the probability that none of the students, needing a copy, has to come back disappointed is greater than 0.90? (Use normal approximation to binomial.)
    Hint: If X1 is the required number of copies, P(X ≤ X1) > 0.90.
  7. The grades on a short quiz in biology were 0, 1, 2, 3, ...... 10 points, depending upon the number of correct answers out of 10 questions. The mean grade was 6.7 with standard deviation of 1.2. Assuming the grades to be normally distributed, determine
    (i) the percentage of students scoring 6 points, (ii) the maximum grade of the lowest 10% of the class.
    Hint: Apply normal approximation to binomial.
  8. The following rules are followed in a certain examination. "A candidate is awarded a first division if his aggregate marks are 60% or above, a second division if his aggregate marks are 45% or above but less than 60% and a third division if the aggregate marks are 30% or above but less than 45%. A candidate is declared failed if his aggregate marks are below 30% and awarded a distinction if his aggregate marks are 80% or above."

    At such an examination, it is found that 10% of the candidates have failed, 5% have obtained distinction. Calculate the percentage of students who were placed in the second division. Assume that the distribution of marks is normal. The areas under the standard normal curve from 0 to z are

    z: 1.28 1.64 0.41 0.47
    Area : 0.4000 0.4500 0.1591 0.1808

    Hint: First find parameters of the distribution on the basis of the given information.
  9. For a certain normal distribution, the first moment about 10 is 40 and the fourth moment about 50 is 48. What is the mean and standard deviation of the distribution?
    Hint: Use the condition β2 = 3, for a normal distribution.
  10. In a test of clerical ability, a recruiting agency found that the mean and standard deviation of scores for a group of fresh candidates were 55 and 10 respectively. For another experienced group, the mean and standard deviation of scores were found to be 62 and 8 respectively. Assuming a cut-off scores of 70, (i) what percentage of the experienced group is likely to be rejected, (ii) what percentage of the fresh group is likely to be selected, (iii) what will be the likely percentage of fresh candidates in the selected group? Assume that the scores are normally distributed.
    Hint: See example above.
  11. 1,000 light bulbs with mean life of 120 days are installed in a new factory. Their length of life is normally distributed with standard deviation of 20 days. (i) How many bulbs will expire in less than 90 days? (ii) If it is decided to replace all the bulbs together, what interval should be allowed between replacements if not more than 10 percent bulbs should expire before replacement?
    Hint: (ii) P(X ≥ X1) = 0.9.
  12. The probability density function of a random variable is expressed as
    [Equation: probability density function]
    (i) Identify the distribution.
    (ii) Determine the mean and standard deviation of the distribution.
    (iii) Write down two important properties of the distribution.
    Hint: Normal distribution.
  13. The weekly wages of 2,000 workers in a factory are normally distributed with a mean of Rs 200 and a variance of 400. Estimate the lowest weekly wages of the 197 highest paid workers and the highest weekly wages of the 197 lowest paid workers.
    Hint: See example.
  14. Among 10,000 random digits, in how many cases do we expect that the digit 3 appears at the most 950 times. (The area under the standard normal curve for z = 1.667 is 0.4525 approximately.)
    Hint: µ = 10000 × 0.10 and σ² = 1000 × 0.9.
  15. Marks obtained by certain number of students are assumed to be normally distributed with mean 65 and variance 25. If three students are taken at random, what is the probability that exactly two of them will have marks over 70?
    Hint: Find the probability (p) that a student gets more than 70 marks. Then find ³C₂ p² q.
  16. The wage distribution of workers in a factory is normal with mean Rs 400 and standard deviation Rs 50. If the wages of 40 workers are less than Rs 350, what is the total number of workers in the factory? [Given ∫ f(t) dt = 0.34, the integral being taken from t = 0 to t = 1, where t ~ N(0, 1).]
    Hint: N * Probability that wage is less than 350 = 40.
  17. The probability density function of a continuous random variable X is given by
    f(X) = kX(2 - X), 0 < X < 2
    = 0 elsewhere.
    Calculate the value of the constant k and E(X).
    Hint: To find k, use the fact that total probability is unity.
  18. take a screen shot of the question.
    Hint: Transform X into standard normal variate z.
  19. The income of a group of 10,000 persons was found to be normally distributed with mean Rs 1,750 p.m. and standard deviation Rs 50. Show that about 95% of the persons of the group had income exceeding Rs 1,668 and only 5% had income exceeding Rs 1,832.
    Hint: See example 30.
  20. A complex television component has 1,000 joints soldered by a machine which is known to produce on an average one defective in forty. The components are examined and the faulty soldering corrected by hand. If the components requiring more than 35 corrections are discarded, what proportion of the components will be thrown away?
    Hint: Use normal approximation to Poisson distribution.
  21. The average number of units produced by a manufacturing concern per day is 355 with a standard deviation of 50. It makes a profit of Rs 1.50 per unit. Determine the percentage of days when its total profit per day is (i) between Rs 457.50 and Rs 645.00, (ii) greater than Rs 628.50.
    Hint: Find the probabilities of producing 457.50/1.50 to 645/1.50 units etc.
  22. A tyre manufacturing company wants 90% of its tyres to have a wear life of at least 40,000 kms. If the standard deviation of the wear lives is known to be 3,000 kms, find the lowest acceptable average wear life that must be achieved by the company. Assume that the wear life of tyres is normally distributed.
  23. The average mileage before the scooter of a certain company needs a major overhaul is 60,000 kms with a S.D. of 10,000 kms. The manufacturer wishes to warranty these scooters, offering to make necessary overhaul free of charge if the buyer of a new scooter has a breakdown before covering certain number of kms. Assuming that the mileage, before an overhaul is required, is distributed normally, for how many kms should the manufacturer warranty so that not more than 3% of the new scooters come for free overhaul?


  24. After an aeroplane has discharged its passengers, it takes crew A an average of 15 minutes (σ = 4 min.) to complete its task of handling baggage and loading food and other supplies. Crew B fuels the plane and does maintenance checks, taking an average of 16 minutes (σ = 2 min.) to complete its task. Assume that the two crews work independently and their times, to complete the tasks, are normally distributed. What is the probability that both crews will complete their tasks soon enough for the plane to be ready for take off within 20 minutes?
    Hint: P(A).P(B) = P(AB).
  25. An automobile company buys nuts of a specified mean diameter µ. A nut is classified as defective if its diameter differs from µ by more than 0.2 mm. The company requires that not more than 1% of the nuts may be defective. What should be the maximum variability that the manufacturer can allow in the production of nuts so as to satisfy the automobile company?
    Hint: Find S.D.

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd