Machine Learning Part 5 -- Neural Networks: Learning
Cost Function
Let's first define a few variables that we will need to use:
●L = total number of layers in the network
●$s_l$ = number of units (not counting the bias unit) in layer $l$
●K = number of output units/classes
Recall that in neural networks, we may have many output nodes. We denote $h_\Theta(x)_k$ as being a hypothesis that results in the $k^{\text{th}}$ output. Our cost function for neural networks is going to be a generalization of the one we used for logistic regression. Recall that the cost function for regularized logistic regression was:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y^{(i)} \log\big(h_\theta(x^{(i)})\big) + \big(1 - y^{(i)}\big)\log\big(1 - h_\theta(x^{(i)})\big) \,\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$
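As a minimal sketch of the regularized logistic regression cost described above (the function name `logistic_cost` and the convention that `X` carries a leading column of ones are my own assumptions, not from the notes):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(theta, X, y, lam):
    """Regularized logistic regression cost J(theta).

    X is (m, n+1) with a leading column of ones, theta is (n+1,),
    y is (m,) of 0/1 labels. The bias term theta[0] is not regularized.
    """
    m = len(y)
    h = sigmoid(X @ theta)
    # Cross-entropy term, averaged over the m training examples.
    cross_entropy = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    # Regularization term: sum of squared weights, skipping the bias.
    reg = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
    return cross_entropy + reg
```

With all-zero parameters the hypothesis is 0.5 for every example, so the cost is $\log 2 \approx 0.693$ regardless of the labels, a handy sanity check.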
For neural networks, it is going to be slightly more complicated:

$$J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[\, y_k^{(i)} \log\big((h_\Theta(x^{(i)}))_k\big) + \big(1 - y_k^{(i)}\big)\log\big(1 - (h_\Theta(x^{(i)}))_k\big) \,\right] + \frac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\big(\Theta_{j,i}^{(l)}\big)^2$$
We have added a few nested summations to account for our multiple output nodes. In the first part of the equation, before the square brackets, we have an additional nested summation that loops through the number of output nodes.
In the regularization part, after the square brackets, we must account for multiple theta matrices. The number of columns in our current theta matrix is equal to the number of nodes in our current layer (including the bias unit). The number of rows in our current theta matrix is equal to the number of nodes in the next layer (excluding the bias unit). As before with logistic regression, we square every term.
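The whole computation can be sketched as follows (a sketch, not a definitive implementation; the helper name `nn_cost`, the list-of-matrices layout for the weights, and the one-hot `Y` encoding are my assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def nn_cost(thetas, X, Y, lam):
    """Cost J(Theta) for a feedforward network.

    thetas: list of weight matrices Theta^(l), each of shape
            (s_{l+1}, s_l + 1) -- rows index next-layer units
            (excluding bias), columns index current-layer units
            plus the bias unit, matching the text above.
    X: (m, n) inputs without a bias column; Y: (m, K) one-hot labels.
    """
    m = X.shape[0]
    # Forward propagation: prepend the bias unit at each layer.
    a = X
    for theta in thetas:
        a = np.hstack([np.ones((m, 1)), a])
        a = sigmoid(a @ theta.T)
    h = a  # (m, K) hypothesis outputs

    # Cross-entropy summed over all m examples and K output units.
    cost = -np.sum(Y * np.log(h) + (1 - Y) * np.log(1 - h)) / m
    # Regularization: square every weight except the bias column t[:, 0].
    reg = sum(np.sum(t[:, 1:] ** 2) for t in thetas)
    return cost + (lam / (2 * m)) * reg
```

Slicing `t[:, 1:]` is what implements "excluding the bias unit" from the description above: the first column of each theta matrix multiplies the bias activation and is left out of the penalty.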