Statistics | Data Basecamp

Statistics is one of the most important foundations of any machine learning application. Basic knowledge of its main areas is therefore indispensable for anyone who wants to understand the algorithms behind machine learning in more detail.

Statistical Connections | Photo: dilbert.com

In general, statistical methods are used to express relationships between variables mathematically and to draw inferences from them. Often the aim is to identify cause-effect relationships (causation), as in questions such as:

How much better does the grade on an exam get if you study more?
How does the election result change depending on the campaign that was run?
Is it safer to fly by plane or to take the train?

To examine such relationships more precisely, data analysis also includes tools for evaluating, displaying, and summarizing large amounts of data. Graphical evaluations, such as bar charts, pie charts, or line charts, are just as much a part of the statistical repertoire as the calculation of means or medians.
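As a small illustration of such summary values, the following sketch computes a mean and a median with Python's standard `statistics` module. The exam scores are made up for this example and are not taken from any real data set.

```python
from statistics import mean, median

# Hypothetical exam scores (invented for illustration); 100 is an outlier.
scores = [62, 68, 71, 74, 75, 79, 100]

avg = mean(scores)    # arithmetic mean: sum of values divided by their count
mid = median(scores)  # middle value of the sorted data

print(f"mean:   {avg:.2f}")
print(f"median: {mid}")
```

The mean is pulled toward the single high score, while the median stays at the middle value. This is why both are often reported together when summarizing data.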

Some of our Articles in the Field of Statistics

What is the F-Statistic?

29. March 2025

Explore the F-statistic: Its Meaning, Calculation, and Applications in Statistics. Learn to Assess Group Differences.

What is Gibbs Sampling?

5. October 2024

Explore Gibbs sampling: Learn its applications, implementation, and how it's used in real-world data analysis.

What is a Bias?

27. July 2024

Unveiling Bias: Exploring its Impact and Mitigating Measures. Understand, recognize, and address bias in this insightful guide.

What is the Variance?

13. July 2024

Explore variance's role in statistics and data analysis. Understand how it measures data dispersion.


What is the Kullback-Leibler Divergence?

3. July 2024

Explore Kullback-Leibler Divergence, a vital metric in information theory and machine learning, and its applications.


What is the Maximum Likelihood Estimation?

29. June 2024

Unlocking insights: Understand Maximum Likelihood Estimation (MLE), a potent statistical tool for parameter estimation and data modeling.

Difference between statistical methods and stochastics

In everyday language, probability theory is often treated as part of statistics, but this is not accurate. Statistics is a subfield of stochastics, which covers both data analysis and probability theory, i.e. all calculations relating to random experiments such as coin tossing, dice rolling, or betting.
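The two branches can be contrasted with a coin-tossing sketch. Probability theory gives the exact chance of a given outcome from a model of the experiment, while a statistical approach estimates the same quantity from observations. The fair-coin assumption, the number of tosses, and the simulated data below are all choices made for this illustration.

```python
import math
import random

N_TOSSES = 10  # tosses per experiment (assumed for this example)
N_HEADS = 5    # outcome of interest: exactly 5 heads

# Probability theory: exact value from the binomial formula for a fair coin.
exact = math.comb(N_TOSSES, N_HEADS) / 2 ** N_TOSSES

# Statistics: estimate the same quantity from (here simulated) observations.
rng = random.Random(0)  # fixed seed so the run is reproducible
n_trials = 20_000
hits = sum(
    1
    for _ in range(n_trials)
    if sum(rng.random() < 0.5 for _ in range(N_TOSSES)) == N_HEADS
)
estimate = hits / n_trials

print(f"exact probability:  {exact:.4f}")
print(f"estimated from data: {estimate:.4f}")
```

With enough repetitions the estimate approaches the exact value, but it is still derived from data rather than from the model.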

This distinction matters because statistics, in this sense, works with observed data, while probability theory works with models of random experiments. The two are often conflated, and inferential statistics does build on probability theory. Statistical methods form one of the most significant foundations for machine learning algorithms. Within machine learning, probabilities are most visible when results are output. A machine learning algorithm can rarely make a prediction with complete certainty. Instead, many models output probabilities to express how confident they are in the outcome. A probability of 99.5% therefore means that the model is very confident in its prediction, not that the prediction is guaranteed to be correct.
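How a model arrives at such an output probability depends on the model. One common approach for classifiers, shown here as a minimal sketch, is the softmax function, which turns raw scores into values between 0 and 1 that sum to 1. The class labels and scores are hypothetical and only serve as an example.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    # without changing the result.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical classifier output for three classes (invented values).
labels = ["cat", "dog", "bird"]
raw_scores = [6.0, 0.5, 0.2]

probs = softmax(raw_scores)
best = max(range(len(labels)), key=lambda i: probs[i])

print(f"prediction: {labels[best]} ({probs[best]:.1%})")
```

The highest probability marks the predicted class, and its size indicates how confident the model is in that prediction.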

Conclusion

Statistical methods are among the most important foundations for understanding and correctly applying machine learning models. The articles in this section explain the methods that are indispensable for the basics of machine learning.