The Naive Bayes algorithm is a classification method based on Bayes' theorem. In essence, it assumes that, within a class, the occurrence of one feature is independent of the occurrence of any other feature.

The algorithm is called naive because it treats the features as completely independent of each other, with each one contributing separately to the probability of the class. A simple example: a car is characterized by having four wheels, being about 4-5 meters long, and being able to drive. Each of these three features contributes independently to the object being classified as a car.

### How does the Algorithm work?

The Naive Bayes algorithm is based on Bayes' theorem. The theorem provides a formula for calculating the conditional probability P(A|B), or in words: What is the probability that event A occurs given that event B has occurred? As an example: What is the probability that I have Corona (= event A) if my rapid test is positive (= event B)?

According to Bayes, this conditional probability can be calculated using the following formula:

\[P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)} \]

- P(B|A) = probability that event B occurs if event A has already occurred
- P(A) = probability that event A occurs
- P(B) = probability that event B occurs

Why should we use this formula? Let us return to our example with the positive test and the Corona disease. The conditional probability P(A|B) is not known directly and could only be determined through an elaborate experiment. The inverse probability P(B|A), on the other hand, is easier to find out. In words, it means: How likely is it that a person suffering from Corona has a positive rapid test?

This probability can be found out relatively easily by having demonstrably ill persons perform a rapid test and then calculating the ratio of how many of the tests were actually positive. The probabilities P(A) and P(B) are similarly easy to find out. The formula then makes it easy to calculate the conditional probability P(A|B).
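The calculation can be sketched in a few lines of Python. The numbers below are hypothetical and chosen only for illustration; they are not measurements of any real test. The sketch also assumes that P(B) is obtained from the law of total probability, which requires the false positive rate P(B | not A) as an additional input.

```python
def posterior(p_b_given_a: float, p_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) using Bayes' theorem.

    P(B) is expanded with the law of total probability:
    P(B) = P(B|A) * P(A) + P(B|not A) * P(not A)
    """
    p_not_a = 1.0 - p_a
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    return p_b_given_a * p_a / p_b


# Hypothetical numbers, chosen only for illustration:
sensitivity = 0.95          # P(test positive | infected)
prevalence = 0.02           # P(infected)
false_positive_rate = 0.03  # P(test positive | not infected)

p_infected_given_positive = posterior(sensitivity, prevalence, false_positive_rate)
print(f"P(infected | positive test) = {p_infected_given_positive:.3f}")
```

With these assumed inputs the posterior is roughly 0.39, noticeably lower than the sensitivity of 0.95, because the low prior P(A) carries a lot of weight in the formula.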

If we have only one feature, this already explains the complete Naive Bayes algorithm. For a single feature x, the conditional probability P(K | x) is calculated for each class K, and the class with the highest probability wins. For our example, this means that the conditional probabilities P(the person is sick | test is positive) and P(the person is healthy | test is positive) are calculated using Bayes' theorem, and the person is assigned to the class with the higher probability.

If our dataset consists of more than one feature, we proceed similarly and compute the conditional probability P(x | K) for each combination of feature x and class K. For each class, we then multiply the probabilities of all features together with the class probability P(K). The denominator of Bayes' theorem is the same for every class and can therefore be ignored in the comparison. The class K with the highest product of probabilities is the predicted class for the data point.
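The multi-feature case can be illustrated with a minimal sketch for categorical features. The function names `train` and `predict` and the tiny dataset are made up for this example. The probabilities are estimated by simple counting, and the score of each class is the class probability P(K) multiplied by the per-feature probabilities P(x | K).

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Estimate the counts needed for P(K) and P(x_i | K) from categorical data."""
    class_counts = Counter(labels)
    # feature_counts[(feature_index, class)][value] = number of occurrences
    feature_counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts


def predict(row, class_counts, feature_counts):
    """Return the class with the highest score P(K) * prod_i P(x_i | K)."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(K)
        for i, value in enumerate(row):
            score *= feature_counts[(i, label)][value] / count  # P(x_i | K)
        scores[label] = score
    return max(scores, key=scores.get), scores


# Made-up data: (rapid test result, cough) for eight people.
rows = [
    ("positive", "cough"), ("positive", "cough"),
    ("positive", "no cough"), ("negative", "cough"),
    ("negative", "cough"), ("negative", "no cough"),
    ("negative", "no cough"), ("positive", "no cough"),
]
labels = ["sick"] * 4 + ["healthy"] * 4

model = train(rows, labels)
predicted, scores = predict(("positive", "cough"), *model)
print(predicted, scores)
```

For the person with a positive test and a cough, the score for "sick" is 0.5 · 3/4 · 3/4 = 0.28125 and the score for "healthy" is 0.5 · 1/4 · 1/4 = 0.03125, so the sketch predicts "sick".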

### What are the Advantages and Disadvantages of the Naive Bayes Algorithm?

The Naive Bayes algorithm is a popular starting point for a classification application since it is very easy and fast to train and can deliver good results in some cases. If the independence assumption actually holds for the features, it can even perform better than comparable classification models, such as logistic regression, and requires less training data.

Although the Naive Bayes algorithm can achieve good results with little data, it needs enough data that each class, and each feature value within each class, appears at least once in the training dataset. Otherwise, the Naive Bayes classifier returns a probability of 0 for that class as soon as the unseen combination occurs in the test dataset, because a single zero factor makes the whole product zero. Moreover, in reality, it is very unlikely that all input variables are completely independent of each other, and this is also very difficult to test.
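The zero-probability problem can be made concrete with a small sketch. It also shows additive (Laplace) smoothing, a common remedy that is not discussed in this article and is included here only as an assumption about how the problem is usually handled. The helper `likelihood` and the word lists are hypothetical.

```python
from collections import Counter


def likelihood(value, values_in_class, n_distinct_values, alpha=0.0):
    """Estimate P(value | class) from counts, with optional additive smoothing.

    alpha = 0 gives the plain relative frequency; alpha = 1 is Laplace smoothing.
    """
    counts = Counter(values_in_class)
    return (counts[value] + alpha) / (len(values_in_class) + alpha * n_distinct_values)


# Hypothetical training data: words observed in emails of the class "spam".
spam_words = ["offer", "offer", "win", "offer"]
vocabulary_size = 3  # assumed vocabulary: "offer", "win", "meeting"

# "meeting" never occurs in spam, so the plain estimate is 0 / 4.
unsmoothed = likelihood("meeting", spam_words, vocabulary_size)
# With alpha = 1 the estimate becomes (0 + 1) / (4 + 3).
smoothed = likelihood("meeting", spam_words, vocabulary_size, alpha=1.0)

print(unsmoothed, smoothed)
```

Without smoothing, any email containing "meeting" would receive a spam probability of exactly 0, no matter what other words it contains. With smoothing, the estimate is small but positive.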

### What Applications use the Naive Bayes Algorithm?

In the field of machine learning, Naive Bayes is used as a classification model, i.e. to assign a data point to a certain class. There are various concrete applications for these models for which Naive Bayes is also used:

#### Text Classification

In this area, the model can be used to assign a section of text to a specific class. E-mail programs, for example, are interested in classifying incoming emails as “spam” or “not spam”. For this purpose, the conditional probabilities of the individual words given each class are calculated and combined into a score for the class. The same procedure can also be used to classify social media comments as “positive” or “negative”.
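A word-based spam filter along these lines could look roughly like the following sketch. The training sentences are made up, and the functions `train` and `classify` are illustrative names, not part of any library. Two implementation choices are assumptions on top of the text: the probabilities are added as logarithms instead of being multiplied, to avoid very small numbers, and add-one smoothing is used so that no word probability is exactly zero.

```python
import math
from collections import Counter


def train(documents):
    """Count words per class. `documents` is a list of (text, label) pairs."""
    word_counts = {}          # label -> Counter of words
    class_counts = Counter()  # label -> number of documents
    for text, label in documents:
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocabulary


def classify(text, class_counts, word_counts, vocabulary):
    """Return the label with the highest log P(K) + sum of log P(word | K)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # skip words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs above zero.
            p = (word_counts[label][word] + 1) / (n_words + len(vocabulary))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tiny, made-up training set for illustration only.
training_data = [
    ("win money now", "spam"),
    ("cheap offer win prize", "spam"),
    ("meeting schedule for monday", "not spam"),
    ("project report attached", "not spam"),
]
model = train(training_data)
print(classify("win a cheap prize", *model))
print(classify("monday project meeting", *model))
```

On this toy data the first message is classified as "spam" and the second as "not spam". A real filter would need far more training data and proper tokenization.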

Although Naive Bayes provides a fast and simple approach for these applications in the text domain, there are other models, such as Transformers, that provide much better results. This is because the Naive Bayes model does not take into account the order or arrangement of the words. For example, the sentence “I don’t like this product.” is probably not a positive product review just because it contains the word “like”.
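The loss of word order can be seen directly when a text is reduced to word counts, which is the representation a word-based Naive Bayes model typically works with. The helper `bag_of_words` below is a hypothetical, simplified tokenizer written for this illustration.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Reduce a text to word counts, discarding the order of the words."""
    return Counter(text.lower().replace(".", "").split())


review = "I don't like this product."
shuffled = "This product I like don't."

# Both texts contain the same words the same number of times.
same_representation = bag_of_words(review) == bag_of_words(shuffled)
print(same_representation)
```

Both sentences produce identical word counts, so a model that only sees these counts cannot distinguish them, and the word "like" contributes to the score in the same way regardless of the "don't" in front of it.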

#### Classification of Credit Risks

For banks, loan default is an immense risk, as they lose large sums of money if a customer can no longer repay the loan. That’s why a lot of work is put into models that can calculate the individual default risk of each customer. In the end, this is also a classification in which the customer is assigned to either the “loan repayment” or the “loan default” group. For this purpose, specific characteristics are used, such as loan amount, income, or the number of previous loans. With the help of Naive Bayes, a classification model can be trained from these characteristics.
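Characteristics such as loan amount and income are numeric, so counting how often each value occurs is not practical. One common variant, assumed here and not described in this article, models each feature within a class as normally distributed (Gaussian Naive Bayes). The following sketch uses entirely made-up customer data and illustrative function names.

```python
import math
from statistics import mean, pvariance


def fit(rows, labels):
    """Estimate per-class priors plus mean and variance of every numeric feature."""
    model = {}
    for label in set(labels):
        class_rows = [row for row, lab in zip(rows, labels) if lab == label]
        columns = list(zip(*class_rows))
        model[label] = {
            "prior": len(class_rows) / len(rows),
            "means": [mean(col) for col in columns],
            # Small constant avoids division by zero for constant features.
            "variances": [pvariance(col) + 1e-9 for col in columns],
        }
    return model


def log_gaussian(x, mu, var):
    """Log density of a normal distribution with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def predict(row, model):
    """Return the class with the highest log prior plus summed log likelihoods."""
    def score(label):
        params = model[label]
        return math.log(params["prior"]) + sum(
            log_gaussian(x, mu, var)
            for x, mu, var in zip(row, params["means"], params["variances"])
        )
    return max(model, key=score)


# Made-up customers: (loan amount in thousands, yearly income in thousands, previous loans)
customers = [
    (10, 60, 1), (15, 80, 2), (12, 70, 0), (8, 55, 1),   # repaid
    (40, 30, 4), (35, 25, 5), (50, 35, 3), (45, 28, 4),  # defaulted
]
outcomes = ["loan repayment"] * 4 + ["loan default"] * 4

model = fit(customers, outcomes)
print(predict((11, 65, 1), model))
print(predict((42, 30, 4), model))
```

On this toy data the first customer is assigned to "loan repayment" and the second to "loan default". Whether the normal distribution is a reasonable fit for real credit data would have to be checked separately.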

#### Prediction of Medical Treatment

In medicine, a doctor has to decide which treatment and which drugs are most promising for the individual patient and their clinical picture, i.e. which have the highest probability of making the patient healthy again. To support this, a Naive Bayes classification model can be trained that calculates the probability that the patient will recover, depending on characteristics of the health condition, such as blood pressure, well-being, or symptoms, as well as the possible treatment (medication). The results of the model can in turn be used by the physician when making the decision.

### This is what you should take with you

- The Naive Bayes Algorithm is a simple method to classify data.
- It is based on Bayes’ theorem and is naive because it assumes that all input variables and their values are independent of each other.
- The Naive Bayes Algorithm is relatively quick and easy to train, but in many cases, it does not give good results because the assumption of independence of the variables is violated.

### Other Articles on the Topic of Naive Bayes

- Scikit-Learn provides some examples and programming instructions for the Naive Bayes algorithm in Python.