Do you know of, or have you used, data reduction techniques other than PCA? What do you think of step-wise regression? What kinds of step-wise techniques are you familiar with?

Data reduction techniques other than Principal component analysis (PCA):

- Partial least squares (PLS): like principal component regression (PCR), but it chooses the components in a supervised way, giving higher weights to the variables most strongly related to the response
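
To make the supervised/unsupervised contrast concrete, here is a minimal NumPy sketch comparing the first PCA direction with the first PLS direction. The synthetic data, the seed, and all variable names are assumptions made for illustration; the PLS direction is taken as proportional to X^T y on centred data, which is the usual first step of single-response PLS.

```python
import numpy as np

rng = np.random.default_rng(0)          # synthetic data, assumed for illustration
n = 500
X = rng.normal(size=(n, 3))
X[:, 0] *= 3.0                          # column 0: high variance, unrelated to y
y = 2.0 * X[:, 2] + rng.normal(scale=0.1, size=n)   # y depends on column 2 only

# Centre the predictors and the response.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# First PCA direction: leading eigenvector of the covariance matrix (ignores y).
_, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pca_dir = eigvecs[:, -1]

# First PLS direction: proportional to X^T y (uses y).
pls_dir = Xc.T @ yc
pls_dir /= np.linalg.norm(pls_dir)

pca_top = int(np.argmax(np.abs(pca_dir)))
pls_top = int(np.argmax(np.abs(pls_dir)))
print("PCA loads most on column", pca_top)
print("PLS loads most on column", pls_top)
```

In this setup PCA should favour the high-variance column, while PLS should favour the column that actually drives the response.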

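Step-wise regression:
- The choice of predictor variables is carried out by an automatic, systematic procedure
- The selection criterion is usually a sequence of F-tests or t-tests, or a measure such as adjusted R-squared, AIC, or BIC
- At any given step, the model is fit using unconstrained least squares
- The search is greedy, so it can get stuck in local optima
- Often a better alternative: the Lasso, which performs variable selection and shrinkage together through an L1 penalty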
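
As a rough illustration of the Lasso alternative, the sketch below implements it with cyclic coordinate descent in plain NumPy. The helper names (`soft_threshold`, `lasso_cd`), the penalty value `alpha=0.3`, and the synthetic data are assumptions for this example, not part of any library API.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z towards zero by t, setting it to zero if |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1 by cyclic coordinate descent.

    Assumes X and y are already centred, so no intercept is fitted.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()                      # residual for b = 0
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * b[j]       # residual with variable j left out
            rho = X[:, j] @ resid / n
            b[j] = soft_threshold(rho, alpha) / col_sq[j]
            resid -= X[:, j] * b[j]
    return b

# Synthetic data (assumed): only columns 1 and 4 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=300)
Xc, yc = X - X.mean(axis=0), y - y.mean()

coef = lasso_cd(Xc, yc, alpha=0.3)
print(np.round(coef, 2))
```

Unlike a step-wise search, the L1 penalty solves one convex problem and sets the coefficients of weakly related variables exactly to zero, at the cost of shrinking the remaining ones.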

Step-wise techniques:
- Forward selection: begin with no variables, adding one at a time when it improves a chosen model comparison criterion
- Backward selection: begin with all the variables, removing one at a time when doing so improves a chosen model comparison criterion
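
Forward selection can be sketched in a few lines. The version below uses AIC as the comparison criterion and ordinary least squares at each step; the function names `aic` and `forward_select` and the synthetic data are assumptions made for this example.

```python
import numpy as np

def aic(X, y, cols):
    """AIC (up to an additive constant) of an OLS fit with intercept on the given columns."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return n * np.log(rss / n) + 2 * A.shape[1]

def forward_select(X, y):
    """Greedily add the variable that lowers AIC the most; stop when none does."""
    selected, remaining = [], list(range(X.shape[1]))
    best = aic(X, y, selected)            # intercept-only model
    while remaining:
        score, j = min((aic(X, y, selected + [k]), k) for k in remaining)
        if score >= best:                 # no candidate improves the criterion
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return selected

# Synthetic data (assumed): only columns 1 and 4 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + rng.normal(scale=0.5, size=300)

chosen = forward_select(X, y)
print("selected columns, in order of entry:", chosen)
```

Backward selection is the mirror image: start from all columns and repeatedly drop the one whose removal lowers the criterion the most. Because each step only looks one move ahead, neither variant is guaranteed to find the best subset.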

When the full data is better than reduced data:
- Example 1: if all the components have a high variance, which ones can be discarded with a guarantee that no significant information is lost?
- Example 2 (classification): with 2 classes whose within-class variance is very high compared to the between-class variance, PCA might discard the very information that separates the two classes
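
The classification example can be reproduced on toy data. In the sketch below, both classes share a large spread along one axis and differ only slightly along the other; the data, the seed, and the `separation` helper are assumptions chosen to make the effect visible.

```python
import numpy as np

rng = np.random.default_rng(0)          # synthetic data, assumed for illustration
n = 500
labels = np.repeat([0, 1], n)

# Large shared spread along axis 0, small spread along axis 1.
X = rng.normal(size=(2 * n, 2)) * np.array([10.0, 0.5])
# The class means differ only along axis 1.
X[:, 1] += np.where(labels == 0, -1.0, 1.0)

Xc = X - X.mean(axis=0)
_, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pc1, pc2 = eigvecs[:, -1], eigvecs[:, -2]   # highest and lowest variance directions

def separation(direction):
    """Gap between class means along a direction, in units of within-class std."""
    z = Xc @ direction
    z0, z1 = z[labels == 0], z[labels == 1]
    pooled = np.sqrt(0.5 * (z0.var() + z1.var()))
    return abs(z0.mean() - z1.mean()) / pooled

sep1, sep2 = separation(pc1), separation(pc2)
print("class separation along PC1:", round(sep1, 2))
print("class separation along PC2:", round(sep2, 2))
```

Keeping only the first principal component here would retain almost all of the variance but almost none of the class separation, which sits in the low-variance component.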

When the full data is better than a sample:
- When the number of variables is high relative to the number of observations
