Machine Learning Part 3 -- Logistic Regression
Classification
To attempt classification, one method is to use linear regression and map all predictions greater than 0.5 to 1 and all predictions less than 0.5 to 0. However, this method doesn't work well because classification is not actually a linear function, as the sketch below illustrates.
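The following is a minimal sketch (not from the original notes) of that thresholding idea, assuming NumPy: fit ordinary least squares on 0/1 labels, then map any prediction at or above 0.5 to class 1. The data values are illustrative; a single point far to the right flattens the fitted line and misclassifies examples near the boundary, which is the failure mode described above.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least-squares fit; X is (m, n), y is (m,) with values 0 or 1."""
    X_b = np.c_[np.ones(len(X)), X]               # add intercept column
    theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return theta

def predict_class(X, theta):
    """Threshold the linear prediction at 0.5 to get a 0/1 class."""
    X_b = np.c_[np.ones(len(X)), X]
    return (X_b @ theta >= 0.5).astype(int)

# Illustrative data: the outlier at x = 50 drags the fitted line down,
# so the points at x = 4 and x = 5 fall below the 0.5 threshold and are misclassified.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [50.0]])
y = np.array([0, 0, 0, 1, 1, 1])
theta = fit_linear(X, y)
print(predict_class(X, theta))    # [0 0 0 0 0 1], but the true labels are [0 0 0 1 1 1]
```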
The classification problem is just like the regression problem, except that the values we now want to predict take on only a small number of discrete values. For now, we will focus on the binary classification problem in which y can take on only two values, 0 and 1. (Most of what we say here will also generalize to the multiple-class case.) For instance, if we are trying to build a spam classifier for email, then x^(i) may be some features of a piece of email, and y may be 1 if it is a piece of spam mail, and 0 otherwise. Hence, y∈{0,1}. 0 is also called the negative class, and 1 the positive class, and they are sometimes also denoted by the symbols "-" and "+". Given x^(i), the corresponding y^(i) is also called the label for the training example.
Hypothesis Representation
We could approach the classification problem ignoring the fact that y is discrete-valued, and use our old linear regression algorithm to try to predict y given x. However, it is easy to construct examples where this method performs very poorly. Intuitively, it also doesn't make sense for hθ(x) to take values larger than 1 or smaller than 0 when we know that y ∈ {0, 1}. To fix this, let's change the form of our hypotheses hθ(x) to satisfy 0 ≤ hθ(x) ≤ 1. This is accomplished by plugging θᵀx into the Logistic Function.
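As a minimal sketch of this hypothesis representation, assuming NumPy and a parameter vector theta whose first entry is the intercept term: the logistic (sigmoid) function g(z) = 1/(1 + e^(-z)) maps any real input into (0, 1), so hθ(x) = g(θᵀx) always satisfies the desired bound. The theta and x values below are illustrative, not from the original notes.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z)); maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def h(theta, x):
    """Hypothesis h_theta(x) = g(theta^T x); always satisfies 0 <= h <= 1."""
    return sigmoid(np.dot(theta, x))

# Illustrative values: theta holds the intercept and one feature weight,
# and x_0 = 1 is the usual intercept feature.
theta = np.array([-3.0, 1.0])
x = np.array([1.0, 4.0])
print(h(theta, x))    # ≈ 0.73, read as the estimated probability that y = 1 for this x
```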