Basic Definitions
Statistical Inference
Statistical inference makes use of information from a sample to draw conclusions (inferences) about the population from which the sample was taken.
Experiment
An experiment is any process or study which results in the collection of data, the outcome of which is unknown. In statistics, the term is usually restricted to situations in which the researcher has control over some of the conditions under which the experiment takes place.
Experimental (or Sampling) Unit
A unit is a person, animal, plant or thing that is actually studied by a researcher: the basic object upon which the study or experiment is carried out. Examples include a person, a monkey, a sample of soil, a pot of seedlings, a postcode area, or a doctor's practice.
Population
A population is any entire collection of people, animals, plants or things from which we may collect data. It is the entire group we are interested in, which we wish to describe or draw conclusions about. To generalise about a population, a sample that is meant to be representative of the population is often studied. For each population there are many possible samples. A sample statistic gives information about a corresponding population parameter. For example, the sample mean for a set of data gives information about the overall population mean. It is important that the investigator carefully and completely defines the population before collecting the sample, including a description of the members to be included.
Sample
A sample is a group of units selected from a larger group (the population). By studying the sample, the researcher hopes to draw valid conclusions about the larger group. A sample is generally selected for study because the population is too large to study in its entirety. The sample should be representative of the population, which is often best achieved by random sampling. Before collecting the sample, the researcher should also carefully and completely define the population, including a description of the members to be included.
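Random sampling as described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 10,000 heights is simulated, and the helper name draw_simple_random_sample is hypothetical, not part of any library.

```python
import random
import statistics


def draw_simple_random_sample(population, sample_size, seed=None):
    """Select sample_size units without replacement, each unit equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical population: heights in cm of 10,000 units (simulated data).
population_rng = random.Random(0)
population = [population_rng.gauss(170, 10) for _ in range(10_000)]

sample = draw_simple_random_sample(population, sample_size=100, seed=1)

# In practice the population mean is unknown; it is computed here only
# because the whole simulated population is available for comparison.
population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The sample mean will usually be close to, but not equal to, the population mean, which is the sense in which a representative sample supports conclusions about the larger group.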
Parameter
A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic. For example, the population mean is a parameter that is often used to indicate the average value of a quantity. Within a population, a parameter is a fixed value which does not vary. Each sample drawn from the population has its own value of any statistic that is used to estimate this parameter. For example, the mean of the data in a sample is used to give information about the overall mean in the population from which that sample was drawn. Parameters are often assigned Greek letters (e.g. σ), whereas statistics are assigned Roman letters (e.g. s).
Statistic
A statistic is a quantity that is calculated from a sample of data. It is used to give information about unknown values in the corresponding population. For example, the average of the data in a sample is a statistic, and it is used to give information about the overall average in the population from which that sample was drawn. It is possible to draw more than one sample from the same population, and the value of a statistic will in general vary from sample to sample: the average values in several samples drawn from the same population will not necessarily be equal. Statistics are often assigned Roman letters (e.g. m and s), whereas the equivalent unknown values in the population (parameters) are assigned Greek letters (e.g. µ and σ).
Sampling Distribution
The sampling distribution describes probabilities associated with a statistic when a random sample is drawn from a population. The sampling distribution is the probability distribution or probability density function of the statistic. Derivation of the sampling distribution is the first step in calculating a confidence interval or carrying out a hypothesis test for a parameter.
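One way to get a feel for a sampling distribution is to approximate it by simulation: draw many random samples from the same population and record the statistic each time. The sketch below does this for the sample mean. The skewed population, the sample size of 40, and the helper name simulate_sample_means are all assumptions made for illustration.

```python
import math
import random
import statistics


def simulate_sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical skewed population (simulated); mu and sigma are its parameters.
population_rng = random.Random(42)
population = [population_rng.expovariate(1 / 50) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Each entry of `means` is one value of the statistic m.
means = simulate_sample_means(population, sample_size=40, n_samples=2_000)

print(f"population mean (mu):          {mu:.2f}")
print(f"mean of the sample means:      {statistics.fmean(means):.2f}")
print(f"spread of the sample means:    {statistics.stdev(means):.2f}")
print(f"sigma / sqrt(sample size):     {sigma / math.sqrt(40):.2f}")
```

The parameter µ stays fixed throughout, while the statistic varies from sample to sample. The collection of sample means approximates the sampling distribution of the mean, and its spread should be close to σ divided by the square root of the sample size.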
Estimate
An estimate is an indication of the value of an unknown quantity based on observed data. More formally, an estimate is the particular value of an estimator that is obtained from a particular sample of data and used to indicate the value of a parameter.
Estimator
An estimator is any quantity calculated from the sample data which is used to give information about an unknown quantity in the population. For example, the sample mean is an estimator of the population mean.
Example
If the value of the estimator in a particular sample is found to be 5, then 5 is the estimate of the population mean µ.
Estimation
Estimation is the process by which sample data are used to indicate the value of an unknown quantity in a population. Results of estimation can be expressed as a single value, known as a point estimate, or a range of values, known as a confidence interval.
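The two ways of expressing an estimate can be illustrated with a minimal Python sketch. The data values and the helper name mean_confidence_interval are made up for illustration. The interval uses a normal approximation, which is an assumption and is rough for a sample this small; a t-based critical value would normally give a somewhat wider interval.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (point estimate, (lower, upper)) for the population mean.

    Assumes a normal approximation to the sampling distribution of the mean.
    """
    m = statistics.fmean(sample)   # point estimate of mu
    s = statistics.stdev(sample)   # sample standard deviation
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * s / math.sqrt(len(sample))
    return m, (m - half_width, m + half_width)


# Hypothetical sample of ten measurements (illustrative data only).
sample = [4.1, 5.3, 4.8, 5.9, 5.2, 4.6, 5.5, 4.9, 5.1, 4.6]

point_estimate, (lower, upper) = mean_confidence_interval(sample)
print(f"point estimate of the mean: {point_estimate:.2f}")
print(f"approximate 95% interval:   ({lower:.2f}, {upper:.2f})")
```

Here the sample mean plays the role of the estimator, the number it returns for this particular sample is the estimate, and the interval conveys how far that estimate might plausibly lie from µ.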
