Logistic Regression as Maximum Likelihood
The goal of maximum likelihood estimation is to find the modeling hypothesis that best explains the observed data:

maximize sum i to n log(P(xi ; h))

where:

- h is the modeling hypothesis
- maximize ... indicates that we choose h to maximize the (log-)likelihood function
In the case of logistic regression, a Binomial probability distribution is assumed for the data sample, where each example is one outcome of a Bernoulli trial. The Bernoulli distribution has a single parameter: the probability of a successful outcome (p).
> The probability distribution that is most often used when there are two classes is the binomial distribution. This distribution has a single parameter, p, that is the probability of an event or a specific class.
>
> — Page 283, Applied Predictive Modeling, 2013.
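To make the Bernoulli distribution concrete, here is a minimal sketch of its probability mass function (the name `bernoulli_pmf` is illustrative, not from the original article):

```python
# probability mass function of the Bernoulli distribution:
# P(y) = p^y * (1 - p)^(1 - y), where y is 0 or 1
def bernoulli_pmf(y, p):
    return p ** y * (1 - p) ** (1 - y)

# with p = 0.8, a success (y=1) has probability 0.8, a failure (y=0) has 0.2
for y in (1, 0):
    print('P(y=%d) = %.1f' % (y, bernoulli_pmf(y, 0.8)))
```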
The expected value (mean) of the Bernoulli distribution can be calculated as follows:

mean = P(y=1) * 1 + P(y=0) * 0 = p

In statistics and probability analysis, the expected value is calculated by multiplying each of the possible outcomes by the likelihood each outcome will occur and then summing all of those values (see https://www.investopedia.com/terms/e/expected-value.asp).
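As a quick numeric check of the weighted-sum definition, the mean of a Bernoulli variable collapses to p itself (the value 0.3 below is an arbitrary example):

```python
# expected value of a Bernoulli variable: each outcome times its probability
p = 0.3
mean = 1 * p + 0 * (1 - p)  # outcomes are 1 (with prob p) and 0 (with prob 1 - p)
print(mean)  # the weighted sum recovers p
```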
One can treat the prediction as a weighted mean over the two outcomes; thus the likelihood of a single example is:

likelihood(y, yhat) = yhat * y + (1 - yhat) * (1 - y)

Taking the logarithm gives the log-likelihood:

log-likelihood = log(yhat) * y + log(1 - yhat) * (1 - y)

Summing the log-likelihood function across all examples gives the quantity to maximize:

maximize sum i to n log(yhat_i) * y_i + log(1 - yhat_i) * (1 - y_i)

Optimization frameworks conventionally minimize a cost function, so we invert the sign and minimize the negative log-likelihood:

minimize sum i to n -(log(yhat_i) * y_i + log(1 - yhat_i) * (1 - y_i))
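The minimization objective can be sketched as follows (the helper name `neg_log_likelihood` and the toy labels/predictions are illustrative):

```python
from math import log

# negative log-likelihood summed over a set of examples
def neg_log_likelihood(ys, yhats):
    return -sum(y * log(yhat) + (1 - y) * log(1 - yhat)
                for y, yhat in zip(ys, yhats))

ys = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.1, 0.2]  # predictions close to the labels
uncertain = [0.6, 0.5, 0.4, 0.5]  # predictions near chance
print(neg_log_likelihood(ys, confident))
print(neg_log_likelihood(ys, uncertain))
```

Predictions that agree with the labels yield a smaller negative log-likelihood, which is exactly what a minimizer seeks.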
**Computing the negative of the log-likelihood function for the Bernoulli distribution is equivalent to calculating the cross-entropy function.**
```python
# test of the Bernoulli likelihood function

# likelihood function for the Bernoulli distribution
def likelihood(y, yhat):
    return yhat * y + (1 - yhat) * (1 - y)

# test for y=1
y, yhat = 1, 0.9
print('y=%.1f, yhat=%.1f, likelihood: %.3f' % (y, yhat, likelihood(y, yhat)))
y, yhat = 1, 0.1
print('y=%.1f, yhat=%.1f, likelihood: %.3f' % (y, yhat, likelihood(y, yhat)))
# test for y=0
y, yhat = 0, 0.1
print('y=%.1f, yhat=%.1f, likelihood: %.3f' % (y, yhat, likelihood(y, yhat)))
y, yhat = 0, 0.9
print('y=%.1f, yhat=%.1f, likelihood: %.3f' % (y, yhat, likelihood(y, yhat)))
```
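The cross-entropy equivalence noted above can be checked numerically: treating the label and prediction as the two-class distributions P = [y, 1 - y] and Q = [yhat, 1 - yhat], the cross-entropy H(P, Q) matches the per-example negative log-likelihood. A minimal sketch (the function name is illustrative):

```python
from math import log

# cross-entropy between two discrete distributions: H(P, Q) = -sum P(x) * log(Q(x))
def cross_entropy(p, q):
    return -sum(pi * log(qi) for pi, qi in zip(p, q))

y, yhat = 1, 0.9
nll = -(y * log(yhat) + (1 - y) * log(1 - yhat))  # negative log-likelihood
ce = cross_entropy([y, 1 - y], [yhat, 1 - yhat])  # same quantity, cross-entropy form
print('nll=%.6f, cross-entropy=%.6f' % (nll, ce))
```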