#loss_function #cost_function
Approximation
$$J(\theta) = \mathrm{MSE} = \mathbb{E}\big[(Y - \hat{Y})^2\big]$$
with:
- $\mathrm{MSE}$ = the Mean Squared Error
- $Y$ = the ground truth output values for the training examples
- $\hat{Y}$ = the predicted output values for the training examples
- $\mathbb{E}[z]$ = the expectation, estimated here by the sample mean: $\bar{X} = \frac{1}{m}\sum_{i=1}^{m} X_i$
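Estimating the expectation with the sample mean makes the MSE a one-liner in NumPy. A minimal sketch, where the arrays `y` and `y_hat` are illustrative values, not taken from the note:

```python
import numpy as np

# Ground truth and predicted outputs for m = 4 training examples (illustrative)
y = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])

# MSE = E[(Y - Y_hat)^2], with the expectation estimated by the sample mean
mse = np.mean((y - y_hat) ** 2)
print(mse)  # 0.375
```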
Expanded
Contrary to the Mean Squared Error (MSE), whose expression is often divided by 2 so that the factor of 2 produced when differentiating the square cancels out, the RMSE doesn't need this division: the square root already eases the descent.
$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2 = \frac{1}{m}\sum_{i=1}^{m}\left(y_i - h_\theta(x_i)\right)^2$$
with:
- $m$ = the number of training examples
- $x_i$ = the input (features) of the $i^{th}$ training example
- $y_i$ = the ground truth output of the $i^{th}$ training example
- $h_\theta(x_i)$ or $\hat{y}_i$ = the predicted output of the $i^{th}$ training example
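The expanded sum translates directly into an explicit loop. In this sketch the hypothesis $h_\theta$ is assumed to be a simple linear model $\theta_0 + \theta_1 x$; that choice is illustrative, since the formula itself doesn't fix the form of $h_\theta$:

```python
import numpy as np

def h(theta, x):
    # Assumed hypothesis: a linear model theta_0 + theta_1 * x (illustrative)
    return theta[0] + theta[1] * x

def mse_cost(theta, xs, ys):
    # J(theta) = (1/m) * sum_i (y_i - h_theta(x_i))^2
    m = len(xs)
    return sum((y - h(theta, x)) ** 2 for x, y in zip(xs, ys)) / m

theta = np.array([1.0, 2.0])
xs = [0.0, 1.0, 2.0]  # illustrative inputs
ys = [1.0, 3.5, 4.0]  # illustrative ground truth outputs
print(mse_cost(theta, xs, ys))  # ≈ 0.4167
```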
Vectorized
$$J(\theta) = \frac{1}{m}\left(X\theta - \vec{y}\right)^T\left(X\theta - \vec{y}\right)$$
with:
- $X$ = a matrix of the training examples arranged as rows of $X$
- $\vec{y}$ = a vector of all the expected output values
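The vectorized form maps directly onto NumPy matrix operations. A sketch, assuming the first column of $X$ is all ones so that $\theta_0$ acts as the bias term (a common convention, not stated in the formula):

```python
import numpy as np

def mse_cost_vec(theta, X, y):
    # J(theta) = (1/m) * (X @ theta - y)^T (X @ theta - y)
    residuals = X @ theta - y
    return (residuals @ residuals) / len(y)

# Rows of X are training examples; the column of ones absorbs the bias theta_0
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.5, 4.0])
theta = np.array([1.0, 2.0])

print(mse_cost_vec(theta, X, y))  # ≈ 0.4167, same value as the summed form
```

The vectorized version avoids the Python-level loop, which matters once $m$ is large.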