#loss_function #cost_function
Approximation
$$J(\theta) = \mathrm{MSE} = \mathbb{E}\big[(Y - \hat{Y})^2\big]$$
with:
- $\mathrm{MSE}$ = the Mean Squared Error
- $Y$ = the ground truth output values for the training examples
- $\hat{Y}$ = the predicted output values for the training examples
- $\mathbb{E}[z]$ = the expectation, estimated here by the sample mean: $\bar{X} = \frac{1}{m}\sum_{i=1}^{m} X_i$
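Estimating the expectation with the sample mean makes the MSE a one-liner in NumPy. A minimal sketch, where the arrays `y` and `y_hat` are illustrative values, not taken from the note:

```python
import numpy as np

# Ground truth and predicted outputs for m = 4 training examples (illustrative)
y = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

# MSE = E[(Y - Y_hat)^2], with the expectation estimated by the sample mean
mse = np.mean((y - y_hat) ** 2)
print(mse)  # 0.375
```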
Expanded
Contrary to the Mean Squared Error (MSE), whose expression is often divided by 2 so that the factor of 2 produced when differentiating the square cancels out, the RMSE doesn't need this division: the square root already eases the descent.
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2 = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - h_\theta(x_i)\right)^2$$
with:
- $m$ = the number of training examples
- $x_i$ = the input (features) of the $i^{th}$ training example
- $y_i$ = the ground truth output of the $i^{th}$ training example
- $h_\theta(x_i)$ or $\hat{y}_i$ = the predicted output of the $i^{th}$ training example
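The expanded sum translates directly into an explicit loop. In this sketch the hypothesis $h_\theta$ is assumed to be a simple linear model $\theta_0 + \theta_1 x$; that choice is illustrative, since the formula itself doesn't fix the form of $h_\theta$:

```python
import numpy as np

def h(theta, x):
    # Assumed hypothesis: a linear model theta_0 + theta_1 * x (illustrative)
    return theta[0] + theta[1] * x

def mse_cost(theta, xs, ys):
    # J(theta) = (1/m) * sum_i (y_i - h_theta(x_i))^2
    m = len(xs)
    return sum((y - h(theta, x)) ** 2 for x, y in zip(xs, ys)) / m

theta = np.array([1.0, 2.0])
xs = [0.0, 1.0, 2.0]  # illustrative inputs
ys = [1.0, 3.5, 4.0]  # illustrative ground truth outputs
print(mse_cost(theta, xs, ys))  # ≈ 0.4167
```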
Vectorized
$$J(\theta) = \frac{1}{m}\left(X\theta - \vec{y}\right)^T\left(X\theta - \vec{y}\right)$$
with:
- $X$ = a matrix of the training examples arranged as rows of $X$
- $\vec{y}$ = a vector of all the expected output values
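The vectorized form maps directly onto NumPy matrix operations. A sketch, assuming the first column of $X$ is all ones so that $\theta_0$ acts as the bias term (a common convention, not stated in the formula):

```python
import numpy as np

def mse_cost_vec(theta, X, y):
    # J(theta) = (1/m) * (X @ theta - y)^T (X @ theta - y)
    residuals = X @ theta - y
    return (residuals @ residuals) / len(y)

# Rows of X are training examples; the column of ones absorbs the bias theta_0
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.5, 4.0])
theta = np.array([1.0, 2.0])

print(mse_cost_vec(theta, X, y))  # ≈ 0.4167, same value as the summed form
```

The vectorized version avoids the Python-level loop, which matters once $m$ is large.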