Data Sufficiency In Statistics



A sufficient statistic is a statistic that summarizes all of the information in a sample about a chosen parameter. For example, the sample mean, x̄, estimates the population mean, μ. x̄ is a sufficient statistic for μ if it retains all of the information about μ that was contained in the original data points; this is the case, for instance, for a random sample from a normal distribution with known variance.

Let X1, X2, …, Xn be a random sample from a probability distribution with unknown parameter θ. Then, the statistic:

Y=u(X1,X2,…,Xn)

is said to be sufficient for θ if the conditional distribution of X1, X2, …, Xn, given the statistic Y, does not depend on the parameter θ.
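
As an illustration of this definition, the sketch below checks it exactly for a small case that is not in the text above and is chosen only as an assumption for the example: a sample of n independent Bernoulli(p) observations with Y = X1 + X2 + … + Xn. The function name `conditional_given_sum` and the values of n, t and p are hypothetical. The code enumerates every 0/1 sample with a given total and computes its conditional probability for two different values of p.

```python
from fractions import Fraction
from itertools import product
from math import comb


def conditional_given_sum(p, n, t):
    """Return P(X1..Xn = x | X1 + ... + Xn = t) for every 0/1 sample x with sum t."""
    # Joint probability of each sample whose total equals t.
    joint = {
        x: p ** t * (1 - p) ** (n - t)
        for x in product((0, 1), repeat=n)
        if sum(x) == t
    }
    total = sum(joint.values())  # this is P(Y = t)
    return {x: prob / total for x, prob in joint.items()}


n, t = 4, 2
dist_a = conditional_given_sum(Fraction(1, 4), n, t)
dist_b = conditional_given_sum(Fraction(2, 3), n, t)

# The two conditional distributions coincide, although p differs.
print(dist_a == dist_b)
# Each arrangement has probability 1 / C(n, t), with no p in it.
print(set(dist_a.values()) == {Fraction(1, comb(n, t))})
```

Because the conditional probabilities come out the same for both values of p, the sum Y carries everything the sample says about p in this Bernoulli setting. Exact fractions are used so that the comparison is not affected by floating-point rounding.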

Guidelines to solve questions

Each problem below consists of a question and two statements, numbered I and II, given below it. You have to decide whether the data provided in the statements are sufficient to answer the question. Read both statements and give your answer:

  • (A) if the data in statement I alone are sufficient to answer the question, while the data in statement II alone are not sufficient to answer the question;
  • (B) if the data in statement II alone are sufficient to answer the question, while the data in statement I alone are not sufficient to answer the question;
  • (C) if the data either in statement I alone or in statement II alone are sufficient to answer the question;
  • (D) if the data given in both statements I and II together are not sufficient to answer the question; and
  • (E) if the data in both statements I and II together are necessary to answer the question.
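
The five cases above amount to a small decision rule, which can be sketched as follows. This is only an illustrative helper, not part of any standard library: the function name `answer_choice` and its three boolean parameters are hypothetical, and the labels A to E are assumed to follow the order in which the cases are listed.

```python
def answer_choice(i_alone: bool, ii_alone: bool, together: bool) -> str:
    """Map sufficiency findings to an answer label.

    i_alone  -- statement I by itself answers the question
    ii_alone -- statement II by itself answers the question
    together -- statements I and II combined answer the question
    """
    if i_alone and ii_alone:
        return "C"  # either statement alone is sufficient
    if i_alone:
        return "A"  # only statement I alone is sufficient
    if ii_alone:
        return "B"  # only statement II alone is sufficient
    if together:
        return "E"  # both statements are needed together
    return "D"      # even both together are not sufficient


# Neither statement works alone, but the pair does.
print(answer_choice(False, False, True))
```

The single-statement checks come first because a statement that suffices alone settles the answer regardless of what the combined statements give.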

 

Example of Data sufficiency

Question: In which year was Rahul born?

Statements:

  I. Rahul is at present 25 years younger than his mother.
  II. Rahul’s brother, who was born in 1964, is 35 years younger than his mother.

 

A. I alone is sufficient while II alone is not sufficient

B. II alone is sufficient while I alone is not sufficient

C. Either I or II is sufficient

D. Neither I nor II is sufficient

E. Both I and II are sufficient

 

Answer: E

Explanation:

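Statement I alone gives only the age gap between Rahul and his mother, and statement II alone says nothing about Rahul, so neither is sufficient by itself. From II, the mother was born in 1964 – 35 = 1929; from I, Rahul was born 25 years after her, in 1929 + 25 = 1954. Equivalently, from both I and II we find that Rahul is (35 – 25) = 10 years older than his brother, who was born in 1964. So, Rahul was born in 1954.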
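
The arithmetic in this explanation can be checked with a few lines of Python. The variable names are chosen for this sketch only; the numbers are those given in the two statements.

```python
# Data taken from the two statements.
BROTHER_BIRTH_YEAR = 1964  # statement II
BROTHER_AGE_GAP = 35       # brother is 35 years younger than the mother
RAHUL_AGE_GAP = 25         # Rahul is 25 years younger than the mother

# Statement II fixes the mother's birth year.
mother_birth_year = BROTHER_BIRTH_YEAR - BROTHER_AGE_GAP

# Statement I then places Rahul 25 years after his mother.
rahul_birth_year = mother_birth_year + RAHUL_AGE_GAP

print(mother_birth_year, rahul_birth_year)
```

Dropping either statement leaves one of the two steps impossible: without II there is no calendar year to start from, and without I nothing links Rahul to that year.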