Machine Learning Part 3 -- Logistic Regression
Classification
To attempt classification, one method is to use linear regression and map all predictions greater than 0.5 to 1 and all predictions less than 0.5 to 0. However, this method doesn't work well because classification is not actually a linear function, as the sketch below illustrates.
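The following is a minimal sketch (not from the original notes) of that thresholding idea, assuming NumPy: fit ordinary least squares on 0/1 labels, then map any prediction at or above 0.5 to class 1. The data values are illustrative; a single point far to the right flattens the fitted line and misclassifies examples near the boundary, which is the failure mode described above.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least-squares fit; X is (m, n), y is (m,) with values 0 or 1."""
    X_b = np.c_[np.ones(len(X)), X]               # add intercept column
    theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return theta

def predict_class(X, theta):
    """Threshold the linear prediction at 0.5 to get a 0/1 class."""
    X_b = np.c_[np.ones(len(X)), X]
    return (X_b @ theta >= 0.5).astype(int)

# Illustrative data: the outlier at x = 50 drags the fitted line down,
# so the points at x = 4 and x = 5 fall below the 0.5 threshold and are misclassified.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [50.0]])
y = np.array([0, 0, 0, 1, 1, 1])
theta = fit_linear(X, y)
print(predict_class(X, theta))    # [0 0 0 0 0 1], but the true labels are [0 0 0 1 1 1]
```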
The classification problem is just like the regression problem, except that the values we now want to predict take on only a small number of discrete values. For now, we will focus on the binary classification problem in which y can take on only two values, 0 and 1. (Most of what we say here will also generalize to the multiple-class case.) For instance, if we are trying to build a spam classifier for email, then x^(i) may be some features of a piece of email, and y may be 1 if it is a piece of spam mail, and 0 otherwise. Hence, y∈{0,1}. 0 is also called the negative class, and 1 the positive class, and they are sometimes also denoted by the symbols "-" and "+". Given x^(i), the corresponding y^(i) is also called the label for the training example.
Hypothesis Representation
We could approach the classification problem ignoring the fact that y is discrete-valued, and use our old linear regression algorithm to try to predict y given x. However, it is easy to construct examples where this method performs very poorly. Intuitively, it also doesn't make sense for hθ(x) to take values larger than 1 or smaller than 0 when we know that y ∈ {0, 1}. To fix this, let's change the form of our hypotheses hθ(x) to satisfy 0 ≤ hθ(x) ≤ 1. This is accomplished by plugging θᵀx into the Logistic Function.
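As a minimal sketch of this hypothesis representation, assuming NumPy and a parameter vector theta whose first entry is the intercept term: the logistic (sigmoid) function g(z) = 1/(1 + e^(-z)) maps any real input into (0, 1), so hθ(x) = g(θᵀx) always satisfies the desired bound. The theta and x values below are illustrative, not from the original notes.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x); always satisfies 0 <= h <= 1."""
    return sigmoid(np.dot(theta, x))

# Illustrative values: theta holds the intercept and one feature weight,
# and x_0 = 1 is the usual intercept feature.
theta = np.array([-3.0, 1.0])
x = np.array([1.0, 4.0])
print(h(theta, x))    # ≈ 0.73, read as the estimated probability that y = 1 for this x
```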