
Population and Sample – simply explained!

A sample is a subset of the elements of a population (e.g. society) from which data are actually collected in a study. These data can then be used for statistical analysis.

The population is the entirety of all units under investigation. The aim of statistical analysis is to be able to make statements about this group.

Samples are used to conduct scientific experiments and to determine whether there is a statistical relationship between several variables (Correlation and Causation).

The image shows several groups of people. The largest is the entire population and the smaller one is the sample.
Population and Sample | Source: Author

A brief example: On the evening of the Bundestag election (the election of the German parliament), the first projection is shown punctually at 6 p.m. Since the polling stations do not close until this time, only a fraction of all votes cast can be taken into account at that point. This fraction is the sample. The purpose of the extrapolation is to make an accurate statistical statement about the result for all votes cast, the population. As the evening progresses and more ballots are counted, the extrapolation approaches the actual election result and reflects reality more and more accurately.
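The convergence described in this example can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the ballots, the party labels "A" and "B", and the 30% vote share are made up, and the ballots are counted in random order, which real election-night counts are not.

```python
import random

def running_estimates(votes, checkpoints, seed=0):
    """Share of votes for party "A" after counting the first k shuffled ballots."""
    rng = random.Random(seed)
    shuffled = votes[:]      # copy so the caller's list is left untouched
    rng.shuffle(shuffled)    # assumption: ballots are counted in random order
    estimates = {}
    for k in checkpoints:
        counted = shuffled[:k]
        estimates[k] = counted.count("A") / k
    return estimates

# Hypothetical population: 100,000 ballots, 30% of them for party "A".
population = ["A"] * 30_000 + ["B"] * 70_000
true_share = population.count("A") / len(population)

estimates = running_estimates(population, [100, 1_000, 10_000, 100_000])
for k, share in estimates.items():
    print(f"{k:>7} ballots counted: estimate {share:.3f} (true {true_share:.3f})")
```

Once every ballot has been counted, the sample equals the population and the estimate matches the true share exactly.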

What are the types of the population?

In statistics, there are three types of populations. We will describe these in more detail below:

  • Finite population: A finite population has a finite number of members. For example, the number of students in a particular school, the number of employees in a company, or the total number of households in a particular area are finite populations.
  • Infinite population: An infinite population has an infinite number of members. For example, the set of all possible coin tosses or the set of all real numbers are infinite populations. Populations that are finite but too large to count, such as all bacteria in a given area, are often treated as infinite in practice.
  • Theoretical population: A theoretical population is used to represent a group of people, animals, or objects. For example, a theoretical population may represent the set of all people with a particular genetic characteristic or the set of all households with a particular income level. These are often used in research studies to make statistical inferences about a real population.

Knowledge of the nature of the population is important for selecting an appropriate sampling method and for statistical inference.

Population vs. Sample Examples

Research Question | Population | Sample
How much money does a German citizen spend on food per month? | All German citizens (over 18 years) | 10,000 randomly encountered supermarket visitors
How old is the average student at the University of Stuttgart? | All students enrolled at the University of Stuttgart | Survey of students visiting Stuttgart University Library on a Saturday
How long is a song on the streaming platform Spotify? | All songs uploaded to the platform at the time, excluding podcasts | 100,000 randomly selected songs available in Germany
Practical examples for population and sample

4 Reasons for using samples instead of population

  • Practicability: It is easier and more feasible to collect data only from the sample, rather than the entire population.
  • Resource efficiency: A sample reduces the cost of the survey, for example, through less time spent by the researchers or lower logistical costs, such as travel costs.
  • Necessity: Depending on the research question, it may be nearly impossible to study the entire population. For example, the U.S. conducts a complete census only every 10 years. Because there is no mandatory resident registration in the U.S., counting everyone is so expensive that it is only done once a decade.
  • Simpler data management: Due to the smaller number of people surveyed, less data is generated overall. Thus, there are lower costs for storing and processing the data. In addition, the calculations can also be performed much more quickly and easily.

Sampling Methods

To obtain a sample of a population, two types of sampling are distinguished: 

Probability sampling is characterized by the fact that each element of a population has a known, non-zero chance of being part of the sample. In the simplest case, simple random sampling, every element has the same chance. For a population of 100 people from which 10 are drawn, for example, this means that each person has a 10 in 100 (= 10%) chance of becoming part of the sample. These methods are usually very costly and time-consuming.

Non-probability sampling is the opposite. In this case, the selection probabilities are unknown, and some elements of the population may have no chance at all of becoming part of the study. An example of this would be if the University of Stuttgart wanted to make an evaluation for all German students, but only surveyed students from its own university. This saves the research team the time and expense of interviewing and studying students outside of Stuttgart, but the results may not carry over to students elsewhere.
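The difference between the two approaches can be sketched with simulated data. Everything here is an assumption made for illustration: the two university labels, the age distributions, and the sample size of 500 are hypothetical and not taken from any real survey.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: ages of 10,000 students at two made-up groups of
# universities. Students at "S" are assumed to be older on average.
population = (
    [("S", rng.gauss(26, 3)) for _ in range(2_000)]
    + [("other", rng.gauss(23, 3)) for _ in range(8_000)]
)
true_mean = statistics.mean(age for _, age in population)

# Probability sample: every student has the same chance of being selected.
random_sample = rng.sample(population, 500)
random_mean = statistics.mean(age for _, age in random_sample)

# Non-probability sample: only students from university "S" can be selected.
convenience_pool = [p for p in population if p[0] == "S"]
convenience_sample = rng.sample(convenience_pool, 500)
convenience_mean = statistics.mean(age for _, age in convenience_sample)

print(f"true mean age:           {true_mean:.2f}")
print(f"probability sample:      {random_mean:.2f}")
print(f"non-probability sample:  {convenience_mean:.2f}")
```

Under these assumptions, the probability sample lands close to the true mean, while the sample restricted to one university is systematically off, no matter how many students are surveyed there.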

In addition to these general sampling methods, there are more detailed methods, such as:

  • Stratified sampling: In this method, the population is divided into subgroups (or strata) based on certain characteristics (e.g., age, gender, or place of residence), and a random sample is then selected from each subgroup, typically in proportion to its size in the population. This helps ensure that the sample is representative of the population with respect to these characteristics.
  • Cluster sampling: This method divides the population into clusters (e.g., households or neighborhoods), randomly selects some of the clusters, and then surveys the members of the selected clusters. This can be more efficient than simple random sampling if each cluster is a good miniature of the population, but it can be less accurate if the members within a cluster are very similar to each other and the clusters differ strongly from one another.
  • Systematic sampling: This method selects every nth member of the population, starting from a randomly chosen position. This can be more convenient than simple random sampling if the population is large and an ordered list of its members is available. It can introduce bias, however, if the list follows a periodic pattern that coincides with the selection interval.
  • Convenience sampling: This method selects individuals who are readily available or easy to reach. This can be a quick and inexpensive method to obtain a sample but may introduce bias if the sample is not representative of the population.
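The first three methods in the list can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the tuple layout `(id, age_group, neighborhood)`, and the toy population are all hypothetical.

```python
import random
from collections import defaultdict

def systematic_sample(units, n, rng):
    """Every k-th unit after a random start, with k = len(units) // n (n <= len(units))."""
    k = len(units) // n
    start = rng.randrange(k)
    return units[start::k][:n]

def stratified_sample(units, stratum_of, n, rng):
    """Proportional allocation: each stratum contributes its population share.
    Rounding can make the total differ slightly from n for uneven strata."""
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)
    sample = []
    for members in strata.values():
        size = round(n * len(members) / len(units))
        sample.extend(rng.sample(members, size))
    return sample

def cluster_sample(units, cluster_of, n_clusters, rng):
    """Pick whole clusters at random and keep every unit inside them."""
    clusters = defaultdict(list)
    for u in units:
        clusters[cluster_of(u)].append(u)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [u for c in chosen for u in clusters[c]]

# Hypothetical population: 1,000 people, 60% "young", spread over 20 neighborhoods.
rng = random.Random(7)
people = [(i, "young" if i < 600 else "old", i % 20) for i in range(1_000)]

sys_s = systematic_sample(people, 100, rng)
strat_s = stratified_sample(people, lambda p: p[1], 100, rng)
clu_s = cluster_sample(people, lambda p: p[2], 4, rng)

print(len(sys_s), len(strat_s), len(clu_s))
print(sum(1 for p in strat_s if p[1] == "young"), "young people in the stratified sample")
```

The stratified sample reproduces the 60/40 age split of this toy population by construction, whereas the cluster sample's composition depends entirely on which neighborhoods happen to be drawn.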

The choice of an appropriate sampling method depends on several factors, such as the research question, the characteristics of the population, the resources available, and the desired level of precision and accuracy. It is important to carefully consider these factors and select a sampling method that is appropriate for the specific research study.

How to find the right sample size?

Determining the appropriate sample size is an important aspect of statistical analysis, and several factors must be considered. The size of the population matters mainly when the population is small: once the population is much larger than the sample, the required sample size hardly increases any further. The sampling method used also affects the size needed, because designs such as cluster sampling typically require more observations than simple random sampling to reach the same precision.

The desired degree of precision is another factor that must be considered. A higher degree of precision requires a larger sample size. The variability of the population also affects the required size, with a more variable population requiring a larger sample size.
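How precision depends on sample size and variability can be checked with a small simulation. The sketch below draws repeated samples from two made-up populations with the same mean but different spread, and measures how much the sample mean fluctuates. The population parameters and the helper name `spread_of_sample_means` are assumptions chosen for illustration.

```python
import random
import statistics

def spread_of_sample_means(population, n, repeats, rng):
    """Standard deviation of the sample mean across repeated samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

rng = random.Random(1)

# Two hypothetical populations with the same mean but different variability.
low_var = [rng.gauss(50, 5) for _ in range(20_000)]
high_var = [rng.gauss(50, 20) for _ in range(20_000)]

results = {}
for name, pop in [("low variability", low_var), ("high variability", high_var)]:
    for n in (25, 400):
        results[(name, n)] = spread_of_sample_means(pop, n, repeats=300, rng=rng)
        print(f"{name:>16}, n={n:>3}: spread of sample mean = {results[(name, n)]:.2f}")
```

In this setup, a larger sample makes the estimate fluctuate less, and the more variable population needs a larger sample to reach the same precision.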

The diagram shows the bell curve with the expected value in orange in the middle of the curve.
Confidence interval for a normal distribution | Source: Author

The confidence level and margin of error sought for the analysis also affect the required sample size. A higher confidence level and lower margin of error require a larger sample size. Finally, available resources such as time and budget may also limit the sample size.
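One widely used formula for estimating a proportion from a simple random sample illustrates these relationships. The sketch below is one possible approach, not the only one: it uses n0 = z² · p(1 − p) / e², with an optional finite population correction. The function name and its parameters are hypothetical, and p = 0.5 is assumed as the most conservative choice.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(confidence, margin_of_error, p=0.5, population_size=None):
    """Required sample size to estimate a proportion under simple random sampling.

    Uses n0 = z^2 * p * (1 - p) / e^2 and, if a population size is given,
    the finite population correction n = n0 / (1 + (n0 - 1) / N).
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin_of_error ** 2
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n0)

print(sample_size_for_proportion(0.95, 0.05))
print(sample_size_for_proportion(0.99, 0.05))
print(sample_size_for_proportion(0.95, 0.03))
print(sample_size_for_proportion(0.95, 0.05, population_size=1_000))
print(sample_size_for_proportion(0.95, 0.05, population_size=1_000_000))
```

Raising the confidence level or tightening the margin of error increases the result, while the population size only makes a noticeable difference when the population is small.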

Thus, several factors must be considered when determining the appropriate sample size, including the size of the population, the sampling procedure, the desired level of precision, the variability in the population, the confidence level and margin of error, and the available resources. There are several statistical formulas and software tools that can be used to assist in calculating the required sample size for a particular study.

This is what you should take with you

  • A sample is a subset of the elements of a population from which data are actually collected in an investigation.
  • The population is the entirety of all units of study.
  • The use of samples is preferable to the use of the entire population for various reasons, such as practicality or resource efficiency.
  • Samples can be collected either by probability sampling or by non-probability sampling. The difference is that in probability sampling, every element of the population has a known, non-zero chance of appearing in the sample. In non-probability sampling, this is not the case.

Other Articles on the Topic of Population and Sample

  • The selection procedures for research units are described in more detail here.