Statistics

Statistics is one of the most important foundations of any machine learning application. A basic knowledge of its main areas is therefore indispensable for anyone who wants to understand the algorithms behind machine learning in more detail.

Statistical Connections | Photo: dilbert.com

In general, statistical methods are used to describe relationships between different variables and to express these relationships mathematically. Often, the aim is to uncover cause-effect relationships (causation), keeping in mind that an observed association alone does not prove one. Typical questions are:

  • How much better does the grade on an exam get if you study more?
  • How does the election result change depending on the campaign that was run?
  • Is it safer to fly by plane or to take the train?

In order to examine such relationships more precisely, data analysis also includes tools for evaluating, displaying, and summarizing large amounts of data. Graphical evaluations such as bar charts, pie charts, or line charts are just as much a part of the statistical repertoire as the calculation of means or medians.
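As a small illustration of these summary tools, the following sketch computes a mean, a median, and a Pearson correlation coefficient using only the Python standard library. The data (hours studied and exam scores for five students) and all variable names are hypothetical and chosen purely for illustration; they do not come from a real study.

```python
import math
import statistics

# Hypothetical data: hours studied and exam scores for five students.
hours_studied = [2, 4, 6, 8, 10]
exam_scores = [55, 60, 70, 80, 85]

# Two common summary statistics for a single variable.
mean_score = statistics.mean(exam_scores)      # arithmetic mean
median_score = statistics.median(exam_scores)  # middle value of the sorted data


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


r = pearson_correlation(hours_studied, exam_scores)

print(f"mean score:   {mean_score}")
print(f"median score: {median_score}")
print(f"correlation between hours and score: {r:.3f}")
```

A correlation close to 1 in such made-up data only says that the two variables move together. Whether studying more actually causes the better grade is a separate question that the coefficient alone cannot answer.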

Some of our Articles in the Field of Statistics

What is the Variance?

Explore variance's role in statistics and data analysis. Understand how it measures data dispersion.

What is the Kullback-Leibler Divergence?

Explore Kullback-Leibler Divergence, a vital metric in information theory and machine learning, and its applications.

What is the Maximum Likelihood Estimation?

Unlocking insights: Understand Maximum Likelihood Estimation (MLE), a potent statistical tool for parameter estimation and data modeling.

What is the Variance Inflation Factor (VIF)?

Learn how Variance Inflation Factor (VIF) detects multicollinearity in regression models for better data analysis.

What is the Dummy Variable Trap?

Escape the Dummy Variable Trap: Learn about dummy variables, their purpose, the trap's consequences, and how to detect it.

What is the R-squared?

Introduction to R-Squared: Learn its Significance, Calculation, Limitations, and Practical Use in Regression Analysis.

Difference between statistical methods and stochastics

In everyday language, probability theory is often treated as part of statistics, but in this classification it is not. Statistics is a subfield of so-called stochastics, which comprises both data analysis (statistics) and probability theory, i.e. all calculations relating to random experiments such as coin tossing, dice rolling, or betting.

This distinction matters because, in this narrower sense, statistical methods deal with analyzing observed data, while probability calculations describe random experiments. The two are nevertheless closely linked: several statistical methods used in machine learning, such as Maximum Likelihood Estimation or the Kullback-Leibler Divergence, are formulated in terms of probability distributions. Probabilities are most visible in machine learning when results are output. A machine learning algorithm generally cannot make a prediction with complete certainty. Instead, results are output together with probabilities that express how certain the algorithm is about the outcome. A probability of 99.5% therefore means that the model is very confident that its prediction is correct.
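To make the idea of a probability attached to a prediction more concrete, here is a minimal sketch of one common way classifiers turn raw scores into probabilities, the softmax function. The choice of softmax, the class labels, and the raw scores are assumptions made for illustration; the text above does not prescribe a specific method.

```python
import math


def softmax(scores):
    """Convert raw model scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() numerically stable
    # without changing the resulting probabilities.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores of a classifier for three classes.
class_labels = ["cat", "dog", "bird"]
raw_scores = [6.0, 0.5, 0.2]

probabilities = softmax(raw_scores)

# The predicted class is the one with the highest probability.
best_index = max(range(len(probabilities)), key=probabilities.__getitem__)

for label, p in zip(class_labels, probabilities):
    print(f"{label}: {p:.1%}")
print(f"prediction: {class_labels[best_index]}")
```

The highest probability is what would be reported as the model's confidence. It describes how strongly the model favors one class over the others, not a guarantee that the prediction is right.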

Conclusion

Statistical methods are among the most important foundations for understanding and correctly applying machine learning models. The articles in this section aim to explain the methods that are indispensable for basic machine learning.
