Validation
- We cannot use the test set for tuning hyperparameters; if we did, the model would overfit to the test data and the test performance would no longer reflect generalization.
- Instead, we split the training data into a (smaller) train set and a validation set, and tune on the validation set (see the sketch below).
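A minimal sketch of such a split, assuming the data lives in NumPy arrays `X` and `y` (the names and the 20% validation fraction are illustrative, not prescribed by the notes):

```python
import numpy as np

def train_val_split(X, y, val_fraction=0.2, seed=0):
    """Randomly hold out a fraction of the training data for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))        # shuffle example indices
    n_val = int(len(X) * val_fraction)   # size of the validation set
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return X[train_idx], y[train_idx], X[val_idx], y[val_idx]

# Usage (X, y are the full training arrays):
# X_tr, y_tr, X_val, y_val = train_val_split(X, y)
```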
Cross-validation
- Instead of fixing a single randomly chosen validation set, iterate over the training data, using a different part as the validation set each time, and average the performance across these splits.
- Example: 5-fold cross-validation: split the training data into 5 folds, use 4 folds for training and 1 for validation, repeat so that each fold serves as the validation set exactly once, and finally average the performance across the 5 folds (see the sketch after this list).
- In practice, cross-validation is often avoided because it is computationally expensive (the model must be trained k times); a single validation split is usually preferred.
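A minimal sketch of k-fold cross-validation under the same assumptions as above; `train_fn` and `eval_fn` are hypothetical callables that fit a model and score it on held-out data:

```python
import numpy as np

def cross_validate(train_fn, eval_fn, X, y, k=5, seed=0):
    """k-fold cross-validation: each fold is the validation set exactly once."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, k)       # k (nearly) equal chunks of indices
    scores = []
    for i in range(k):
        val_idx = folds[i]                                    # 1 fold for validation
        train_idx = np.concatenate(folds[:i] + folds[i + 1:])  # remaining k-1 folds for training
        model = train_fn(X[train_idx], y[train_idx])
        scores.append(eval_fn(model, X[val_idx], y[val_idx]))
    return float(np.mean(scores))        # average performance across the k folds
```

Note that the model is retrained once per fold, which is exactly why a single validation split is often preferred when training is expensive.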