Statistics - Required Sample Size


A critical part of sampling is the choice of the sample size, i.e. the number of units to be selected from the population for carrying out the research. There is no definite answer or solution for defining the most appropriate size. There are certain misconceptions regarding the size of the sample, such as that the sample should be 10% of the population or that the sample size is proportional to the size of the universe. These, however, are only misconceptions. How large a sample should be is a function of the variation in the population parameters under study and the estimating precision required by the researcher.

The decision on the optimum size of the sample can be approached from two angles, viz. the subjective and the mathematical.

  1. Subjective Approach to Determining Sample Size

  2. Mathematical Approach to Sample Size Determination

Subjective Approach to Determining Sample Size

The choice of the size of the sample is affected by the various factors discussed below:

  • The Nature of Population - The level of homogeneity or heterogeneity affects the size of a sample. If the population is homogeneous with respect to the characteristics of interest, then even a small sample is sufficient. However, if the population is heterogeneous, then a larger sample is required to ensure sufficient representativeness.

  • Nature of Respondent - If the respondents are easily available and accessible, then the required information can be obtained from a small sample. If, however, the respondents are uncooperative and non-response is expected to be high, then a larger sample is required.

  • Nature of Study - A one-time study can be conducted using a large sample. In the case of research studies which are of a continuous nature and are to be carried out intensively, a small sample is more suitable, as it is easy to manage and retain a small sample over a long span of time.

  • Sampling Technique Used - An important factor influencing the size of the sample is the sampling technique adopted. Firstly, a non-probability technique requires a larger sample than a probability technique. Secondly, within probability sampling, simple random sampling requires a larger sample than stratification, where a small sample is sufficient.

  • Complexity of Tabulation - While deciding on the sample size, the researcher should also consider the number of categories and classes into which the findings are to be grouped and analyzed. It has been seen that the more categories are to be generated, the larger the sample size needs to be. Since every class should be adequately represented, a larger sample is required to give reliable measures of the smallest category.

  • Availability of Resources - The resources and the time available to the researcher influence the size of the sample. Research is a time- and money-intensive task, with activities like preparation of the instrument, hiring and training field staff, transportation costs and so forth taking up a considerable amount of resources. Hence, if the researcher does not have enough time and funds available, he will settle on a smaller sample.

  • Degree of Precision and Accuracy Required - It has become clear from our earlier discussion that precision, which is measured by standard error, will be high only if the S.E. is small, that is, if the sample size is large. Likewise, to get a high level of accuracy a larger sample is required.

Other than these subjective considerations, the sample size can also be determined mathematically.
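The link between precision and sample size mentioned in the last factor can be illustrated with a small Python sketch. It assumes the usual standard error of a sample mean, S.E. = σ/√n; the function name `standard_error` and the value σ = 10 are illustrative assumptions, not taken from the text.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean, sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)


# sigma = 10 is an assumed, illustrative population standard deviation.
sigma = 10.0
for n in (25, 100, 400):
    # Each fourfold increase in n halves the standard error.
    print(f"n = {n:>3}: S.E. = {standard_error(sigma, n)}")
```

Because the standard error falls only with the square root of n, halving it requires roughly four times as many units.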

Mathematical Approach to Sample Size Determination

In the mathematical approach to sample size determination, the precision of estimate required is stated first and then the sample size is worked out. The precision can be specified as, for example, ${\pm}$ 1 of the true mean at a 99% confidence level. This means that if the sample mean is 200, then, with 99% confidence, the true value of the mean lies between 199 and 201. This level of precision is denoted by the term 'c'.

Sample Size Determination for Means

The confidence interval for the universe mean is given by

${\bar x \pm Z\frac{\sigma_p}{\sqrt n}\ or\ \bar x \pm e}$

Where −

  • ${\bar x}$ = Sample mean

  • ${e}$ = Acceptable error

  • ${Z}$ = Value of standard normal variate at a given confidence level

  • ${\sigma_p}$ = Standard deviation of the population

  • ${n}$ = Size of the sample
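The value of Z for a chosen confidence level can be read from a standard normal table or computed. The sketch below uses `NormalDist` from Python's standard `statistics` module and assumes a two-sided interval; the helper name `z_for_confidence` is a hypothetical one introduced only for illustration.

```python
from statistics import NormalDist


def z_for_confidence(confidence_level: float) -> float:
    """Two-sided standard normal critical value for a confidence level.

    For a confidence level CL, the interval leaves (1 - CL) / 2 in each
    tail, so Z is the (1 + CL) / 2 quantile of the standard normal.
    """
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence level must be strictly between 0 and 1")
    return NormalDist().inv_cdf((1.0 + confidence_level) / 2.0)


for cl in (0.90, 0.95, 0.99):
    print(f"CL = {cl:.2f}: Z = {z_for_confidence(cl):.3f}")
```

For a 95% confidence level this gives the familiar value of about 1.96.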

The acceptable error 'e', i.e. the difference between ${\mu}$ and ${\bar x}$, is given by

${e = Z.\frac{\sigma_p}{\sqrt n}}$

Squaring both sides and solving for n, the size of the sample is:

${n = \frac{Z^2{\sigma_p}^2}{e^2}}$


In case the sample size is significant vis-à-vis the population size, the above formula is corrected by the finite population multiplier.

${n = \frac{Z^2.N.{\sigma_p}^2}{(N-1)e^2 + Z^2.{\sigma_p}^2}}$

Where −

  • ${N}$ = size of the population
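The two formulas for means can be sketched in Python as follows. The function name `sample_size_mean` and the inputs (σ = 15, e = 2, N = 1000) are illustrative assumptions; rounding the result up to the next whole unit is also a choice made here, so that the error does not exceed e.

```python
import math
from typing import Optional


def sample_size_mean(z: float, sigma: float, e: float,
                     population_size: Optional[int] = None) -> int:
    """Sample size for estimating a mean to within +/- e.

    Without population_size: n = Z^2 * sigma^2 / e^2.
    With population_size N:  n = Z^2 * N * sigma^2 / ((N - 1) * e^2 + Z^2 * sigma^2).
    """
    if e <= 0 or sigma <= 0:
        raise ValueError("sigma and e must be positive")
    z2_sigma2 = z ** 2 * sigma ** 2
    if population_size is None:
        n = z2_sigma2 / e ** 2
    else:
        n = (z2_sigma2 * population_size) / (
            (population_size - 1) * e ** 2 + z2_sigma2
        )
    return math.ceil(n)  # round up to a whole number of units


# Assumed inputs: 95% confidence (Z = 1.96), sigma = 15, acceptable error 2.
print(sample_size_mean(1.96, 15, 2))                        # 217
print(sample_size_mean(1.96, 15, 2, population_size=1000))  # 178
```

With these assumed inputs the finite population multiplier lowers the requirement from 217 to 178 units.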

Sample Size Determination for Proportions

The method for determining the sample size when estimating a proportion remains the same as the method for estimating the mean. The confidence interval for universe proportion ${\hat p}$ is given by

${ p \pm Z. \sqrt{\frac{p.q}{n}}}$

Where −

  • ${p}$ = sample proportion

  • ${q = (1 - p)}$

  • ${Z}$ = Value of standard normal variate at a given confidence level

  • ${n}$ = Size of the sample

Since ${ \hat p}$ is itself the quantity to be estimated, the value of p is not known in advance. One option is to take p = 0.5, an acceptable value that gives a conservative sample size, because the product p.q is largest at p = 0.5. The other option is to estimate the value of p either through a pilot study or on the basis of personal judgement. Given the value of p, the acceptable error 'e' is given by

${ e= Z. \sqrt{\frac{p.q}{n}} \\[7pt] e^2 = Z^2\frac{p.q}{n} \\[7pt] n = \frac{Z^2.p.q}{e^2}}$

In case the population is finite then the above formula will be corrected by the finite population multiplier.

${n = \frac{Z^2.p.q.N}{e^2(N-1) + Z^2.p.q}}$
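A minimal Python sketch of the two formulas for proportions is given below. The function name `sample_size_proportion`, the default p = 0.5 (the conservative choice discussed above), and the population size of 2000 are illustrative assumptions; the result is rounded up to a whole unit.

```python
import math
from typing import Optional


def sample_size_proportion(z: float, e: float, p: float = 0.5,
                           population_size: Optional[int] = None) -> int:
    """Sample size for estimating a proportion to within +/- e.

    Without population_size: n = Z^2 * p * q / e^2.
    With population_size N:  n = Z^2 * p * q * N / (e^2 * (N - 1) + Z^2 * p * q).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if e <= 0:
        raise ValueError("e must be positive")
    z2pq = z ** 2 * p * (1.0 - p)
    if population_size is None:
        n = z2pq / e ** 2
    else:
        n = (z2pq * population_size) / (e ** 2 * (population_size - 1) + z2pq)
    return math.ceil(n)  # round up to a whole number of units


# Conservative case: p = 0.5, 95% confidence (Z = 1.96), e = 0.05.
print(sample_size_proportion(1.96, 0.05))                        # 385
print(sample_size_proportion(1.96, 0.05, population_size=2000))  # 323
```

Any other value of p gives a smaller p.q and therefore a smaller required sample, which is why p = 0.5 is the safe default when no prior estimate is available.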


Problem Statement:

A shopping store is interested in estimating the proportion of households possessing the store Privilege Membership card. Previous studies have shown that 59% of the households had a store credit card. At a 95% confidence level with a tolerable error level of .05:

  1. Determine the sample size required to conduct the study.

  2. What would be the sample size if the number of target households is known to be 1000?


The store has the following information:

${ p = .59 \\[7pt] \Rightarrow q = (1-p) = (1-.59) =.41 \\[7pt] CL = .95 \\[7pt] And\ the\ Z\ standard\ variate\ for\ CL\ .95\ is\ 1.96 \\[7pt] e = \pm .05 }$

The sample size can be determined by applying the following formula:

${n = \frac{Z^2.p.q}{e^2}}$
${n = \frac{(1.96)^2.(.59).(.41)}{(.05)^2} \\[7pt] = \frac{.9293}{.0025} \\[7pt] = 371.7 }$

Rounding up, a sample of 372 households is sufficient to conduct the study.

Since the population, i.e. the number of target households, is known to be 1000 and the above sample is a significant proportion of the total population, the corrected formula, which includes the finite population multiplier, is used.

${n = \frac{Z^2.p.q.N}{e^2(N-1) + Z^2.p.q} \\[7pt] = \frac{(1.96)^2.(.59).(.41).(1000)}{(.05)^2 \times 999 + (1.96)^2(.59)(.41)} \\[7pt] = \frac{929.3}{2.4975 + .9293} \\[7pt] = 271.2 }$

Thus if the population is a finite one with 1000 households, then the sample size required to conduct the study, rounded up, is 272.

It is evident from this illustration that when the population is known to be finite, the required sample size is smaller than the one given by the uncorrected formula.
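How strongly the finite population multiplier matters depends on the size of the population. The Python sketch below reuses the inputs of the problem (Z = 1.96, p = .59, e = .05) and applies the correction for several population sizes. The helper name `corrected_sample_size` and the population sizes other than 1000 are illustrative assumptions. The helper uses an algebraically equivalent form of the corrected formula, obtained by dividing its numerator and denominator by e².

```python
import math


def corrected_sample_size(n0: float, population_size: int) -> float:
    """Apply the finite population multiplier to an uncorrected size n0.

    Equivalent to Z^2.p.q.N / (e^2(N-1) + Z^2.p.q) with n0 = Z^2.p.q / e^2.
    """
    return n0 * population_size / (population_size - 1 + n0)


# Inputs of the worked problem.
z, p, e = 1.96, 0.59, 0.05
n0 = z ** 2 * p * (1 - p) / e ** 2  # uncorrected sample size

# Population sizes other than 1000 are assumed for illustration.
for N in (1000, 5000, 20000, 100000):
    n = corrected_sample_size(n0, N)
    print(f"N = {N:>6}: n = {math.ceil(n)} (uncorrected: {math.ceil(n0)})")
```

As N grows, the corrected size approaches the uncorrected one, so the multiplier only makes a practical difference when the sample is a sizeable fraction of the population.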