# Maximum Likelihood Estimator (MLE) vs. Maximum a Posteriori Estimator (MAP)

## What's the difference?

We want to estimate the best parameters Theta for our model given some data D.

The MLE chooses

Theta_MLE = argmax_{Theta} P(D|Theta)

i.e. the parameters that maximize the likelihood.

The MAP chooses

Theta_MAP = argmax_{Theta} P(D|Theta) P(Theta) = argmax_{Theta} P(Theta|D)

So in contrast to the MLE, the MAP estimate also takes the prior probability P(Theta) of the parameters into account when choosing the most probable model parameters Theta.

By Bayes' theorem, P(D|Theta) P(Theta) is proportional to the posterior probability P(Theta|D), so maximizing one maximizes the other. This is why the estimate is called the Maximum a Posteriori estimate.
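
The proportionality can be spelled out in one step. The following is a short sketch of the standard argument, written in LaTeX; the only fact it relies on is that the evidence P(D) does not depend on Theta, so dropping it does not change where the maximum lies.

```latex
\begin{aligned}
P(\Theta \mid D) &= \frac{P(D \mid \Theta)\, P(\Theta)}{P(D)} \\[4pt]
\Theta_{MAP} &= \arg\max_{\Theta} P(\Theta \mid D)
  = \arg\max_{\Theta} \frac{P(D \mid \Theta)\, P(\Theta)}{P(D)}
  = \arg\max_{\Theta} P(D \mid \Theta)\, P(\Theta)
\end{aligned}
```

One consequence: if the prior P(Theta) is constant (uniform) over all Theta, it drops out of the argmax as well, and the MAP estimate coincides with the MLE.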

An important difference between the two estimators is:

• MLE tends to overfit the parameters Theta to the data, especially when little data is available
• MAP is less prone to overfitting, since the prior probabilities of the parameters Theta act as a regularizer when choosing its estimate
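
To make the overfitting point concrete, here is a minimal Python sketch for a coin-flip (Bernoulli) model. The example itself, the function names `mle_bernoulli` and `map_bernoulli`, and the choice of a Beta(2, 2) prior are assumptions made for illustration; they are not part of the notes above. For this model the MLE is the observed fraction of heads, and the MAP estimate is the mode of the Beta posterior.

```python
def mle_bernoulli(heads, flips):
    """MLE of the heads probability: argmax_theta P(D | theta) = heads / flips."""
    return heads / flips


def map_bernoulli(heads, flips, alpha=2.0, beta=2.0):
    """MAP estimate under an assumed Beta(alpha, beta) prior.

    The posterior is Beta(alpha + heads, beta + tails); its mode is the MAP
    estimate. The closed form below requires alpha > 1 and beta > 1.
    """
    return (heads + alpha - 1) / (flips + alpha + beta - 2)


# Very little data: 3 flips, all heads.
small_mle = mle_bernoulli(3, 3)   # 3 / 3 = 1.0, i.e. "tails never happens"
small_map = map_bernoulli(3, 3)   # (3 + 1) / (3 + 2) = 0.8, pulled toward 0.5

# Much more data: 600 heads in 1000 flips.
large_mle = mle_bernoulli(600, 1000)  # 0.6
large_map = map_bernoulli(600, 1000)  # 601 / 1002, very close to the MLE

print(small_mle, small_map)
print(large_mle, large_map)
```

With only three flips the MLE commits to an extreme value, while the prior keeps the MAP estimate more moderate. As the amount of data grows, the likelihood dominates the prior and the two estimates nearly agree. Setting `alpha=1.0, beta=1.0` (a uniform prior) makes the MAP formula reduce to the MLE.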

## Videos: MLE

intro to MLE with pros/cons, part I

intro to MLE with pros/cons, part II

## Videos: MAP

intro to MAP with pros/cons 