Naive Bayes

Pradeep Dhote
3 min read · Aug 3, 2020

Bayes Theorem

Bayes' theorem works on conditional probability. Conditional probability is the probability that an event will happen, given that another event has already occurred. Using prior knowledge, conditional probability gives us the probability of an event.

Conditional probability (Bayes' theorem):

P(H|E) = P(E|H) * P(H) / P(E)

Where,

P(H): The probability of hypothesis H being true. This is known as prior probability.

P(E): The probability of the evidence.

P(E|H): The probability of the evidence given that the hypothesis is true.

P(H|E): The probability of the hypothesis given that the evidence is true.
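
These four quantities can be put together in a short worked example. The numbers below are invented purely for illustration (a hypothetical test for a condition affecting 1% of a population):

```python
# Hypothetical numbers for illustration only.
p_h = 0.01              # P(H): prior probability the hypothesis is true
p_e_given_h = 0.95      # P(E|H): probability of the evidence if H is true
p_e_given_not_h = 0.05  # P(E|not H): probability of the evidence if H is false

# P(E): total probability of the evidence (law of total probability)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 4))  # → 0.161
```

Note how a small prior P(H) keeps the posterior P(H|E) modest even when P(E|H) is high.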

Naive Bayes Classifier

  • It is a classifier based on Bayes' theorem.
  • Membership probabilities are predicted for every class, i.e. the probability that a data point belongs to a particular class.
  • The class with the maximum probability is chosen as the most suitable class.
  • This is also referred to as Maximum A Posteriori (MAP) estimation.
  • The MAP estimate for a hypothesis is:
  • MAP(H) = max P(H|E)
  • MAP(H) = max (P(E|H) * P(H) / P(E))
  • MAP(H) = max (P(E|H) * P(H))
  • P(E) is the evidence probability; it only normalizes the result, so removing P(E) does not change which class is maximal.
  • NB classifiers assume that all the variables or features are independent of each other.
  • The presence or absence of one feature does not affect the presence or absence of any other feature.
  • Example:
  • A fruit may be classified as an apple if it is red, round, and about 4″ in diameter.
  • Even if these features depend on each other in reality, a NB classifier treats each of them as contributing independently to the probability that the fruit is an apple.
  • On real datasets with many features, modeling the dependencies between features would make the computation complex; the "naive" independence assumption keeps it tractable.
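
The MAP rule above can be sketched in a few lines. The priors and per-feature likelihoods below are invented toy numbers for the fruit example, not learned from data:

```python
# Toy sketch of MAP classification under the naive independence assumption.
# All probabilities are illustrative, not estimated from real data.
priors = {"apple": 0.5, "cherry": 0.5}
likelihoods = {
    "apple":  {"red": 0.80, "round": 0.90, "about_4in": 0.70},
    "cherry": {"red": 0.90, "round": 0.95, "about_4in": 0.01},
}

def map_class(features):
    """Pick the class maximizing P(E|H) * P(H), where P(E|H) is
    factored into a product of per-feature likelihoods."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for f in features:
            score *= likelihoods[cls][f]
        scores[cls] = score
    return max(scores, key=scores.get)

print(map_class(["red", "round", "about_4in"]))  # → apple
```

Because P(E) is the same for every class, dividing by it is skipped entirely, exactly as the bullet points note.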

Types Of Naive Bayes Algorithms

1. Gaussian Naïve Bayes: When the feature values are continuous, the values associated with each class are assumed to follow a Gaussian (normal) distribution.
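
A minimal sketch with scikit-learn's `GaussianNB` on the Iris dataset, whose features are continuous (assumes scikit-learn is installed):

```python
# GaussianNB fits a per-class mean and variance for each continuous feature.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)  # 4 continuous features, 3 classes
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GaussianNB()
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # accuracy on held-out data
```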

2. Multinomial Naïve Bayes: Multinomial Naive Bayes is preferred for multinomially distributed data. It is widely used for text classification in NLP, where each feature corresponds to the count of a word in a document.
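
A tiny text-classification sketch with `MultinomialNB` over word counts; the documents and labels are toy examples invented for illustration (assumes scikit-learn is installed):

```python
# MultinomialNB over word-count features from CountVectorizer.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

docs = ["free prize win money", "win free cash now",     # toy spam
        "meeting schedule today", "project meeting notes"]  # toy ham
labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()          # each feature = count of a word in a document
X = vec.fit_transform(docs)

clf = MultinomialNB()
clf.fit(X, labels)
print(clf.predict(vec.transform(["free money now"])))  # → ['spam']
```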

3. Bernoulli Naïve Bayes: When the data follows multivariate Bernoulli distributions, Bernoulli Naive Bayes is used. That is, there may be multiple features, but each is assumed to take a binary value, so it requires features to be binary valued.
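
The same toy text task can illustrate the Bernoulli variant: instead of counts, each feature is a binary word-presence indicator (again, invented documents; assumes scikit-learn is installed):

```python
# BernoulliNB models each feature as a per-class Bernoulli (present/absent).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

docs = ["free prize win", "win free cash",      # toy spam
        "meeting today", "project meeting"]     # toy ham
labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer(binary=True)  # 1 if word present, 0 otherwise
X = vec.fit_transform(docs)

clf = BernoulliNB()
clf.fit(X, labels)
print(clf.predict(vec.transform(["free cash"])))  # → ['spam']
```

Unlike MultinomialNB, BernoulliNB also penalizes the *absence* of a word, which is why it requires strictly binary features.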

Advantages And Disadvantages Of Naive Bayes

Advantages:

  • It is a highly scalable algorithm and is very fast.
  • It can be used for both binary and multiclass classification.
  • It has three main variants: GaussianNB, MultinomialNB, and BernoulliNB.
  • It is a popular algorithm for spam email classification.
  • It can be trained easily on small datasets and used on large volumes of data as well.

Disadvantages:

  • The main disadvantage of NB is its assumption that all the features contributing to the probability are independent of each other, which rarely holds in real data.

Applications of Naive Bayes Algorithms

  • Real-time Prediction: Being a fast learning algorithm, it can be used to make predictions in real time as well.
  • Multiclass Classification: It can also be used for multiclass classification problems.
  • Text Classification: Because it performs well on multiclass prediction, it has a comparatively high success rate on text problems, so it is widely used for sentiment analysis and spam detection.
