How would you split up a data set in order to choose from multiple models?

How would you split up a data set in order to choose from multiple models?



Answer: In such a situation, you should split the data into three parts: a training set for building models, a validation set for choosing among trained models (called the cross-validation set), and a test set for judging the final model.

Popular posts from this blog

After analyzing the model, your manager has informed that your regression model is suffering from multicollinearity. How would you check if he's true? Without losing any information, can you still build a better model?

Is rotation necessary in PCA? If yes, Why? What will happen if you don't rotate the components?

What does Latency mean?