How frequently must an algorithm be updated?

You want to update an algorithm when:
- You want the model to evolve as data streams through your infrastructure
- The underlying data source is changing
- Example: a retail store model that remains accurate as the business grows
- You are dealing with non-stationarity (the data-generating distribution shifts over time)

Some options (a sketch of both follows this list):
- Incremental algorithms: the model is updated every time it sees a new training example
Note: simple, and you always have an up-to-date model, but you cannot easily control how much weight different examples receive.
Sometimes mandatory: when data must be discarded as soon as it has been seen (e.g. for privacy reasons)
- Periodic re-training in "batch" mode: simply buffer the relevant data and update the model every so often
Note: this requires more decisions (what to buffer, how often to re-train) and a more complex implementation
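
To make the contrast concrete, here is a minimal sketch in Python, assuming scikit-learn is available. SGDClassifier supports both styles: partial_fit for incremental updates, and a plain fit on buffered data for periodic batch re-training. The data_stream generator, batch sizes, and re-training interval are hypothetical placeholders, not values from the text.

```python
# A minimal sketch contrasting the two update strategies, assuming scikit-learn.
# The synthetic data_stream generator and all numeric parameters are illustrative.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def data_stream(n_batches=20, batch_size=50, n_features=5):
    """Yield small labelled batches (a stand-in for data arriving from infrastructure)."""
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, n_features))
        y = (X[:, 0] + 0.1 * rng.normal(size=batch_size) > 0).astype(int)
        yield X, y

# Option 1: incremental updates -- the model is refreshed on every batch and
# the raw data can be discarded immediately after it is seen.
incremental_model = SGDClassifier()
for X, y in data_stream():
    incremental_model.partial_fit(X, y, classes=[0, 1])

# Option 2: periodic re-training in "batch" mode -- buffer the data and
# re-fit from scratch every `retrain_every` batches.
buffer_X, buffer_y = [], []
batch_model = SGDClassifier()
retrain_every = 5
for i, (X, y) in enumerate(data_stream(), start=1):
    buffer_X.append(X)
    buffer_y.append(y)
    if i % retrain_every == 0:
        batch_model.fit(np.vstack(buffer_X), np.concatenate(buffer_y))
```

Note that the incremental model never stores past examples, which is what makes it compatible with discard-after-seen privacy constraints, while the buffered variant has to decide how much history to keep and how often to re-fit.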

How frequently?
- Is the trade-off worth it? Does the gain in model freshness justify the added complexity and cost of updating more often?
- Data horizon: how quickly do you need the most recent training example to be part of your model?
- Data obsolescence: how long does it take before data becomes irrelevant to the model? Are some older instances more relevant than newer ones?
Example from economics: newer instances are generally more relevant than older ones, but with seasonal data, observations from the same month or quarter of last year can be more relevant than more recent observations from a different period of the current year. Likewise, in a recession, data from previous recessions can be more relevant than newer data from a different phase of the economic cycle (a weighting sketch follows below).
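
One common way to act on data obsolescence when re-training in batch mode is to down-weight older instances, for example with an exponential decay on their age. The sketch below assumes scikit-learn and pandas; the 90-day half-life, the synthetic drifting target, and the age_weights helper are illustrative assumptions, not anything prescribed by the text.

```python
# A minimal sketch of down-weighting older data at re-training time.
# The 90-day half-life and the synthetic drifting target are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def age_weights(timestamps, now, half_life_days=90.0):
    """Exponential decay: an instance half_life_days old counts half as much as a new one."""
    age_days = (now - timestamps).dt.total_seconds() / 86400.0
    return 0.5 ** (age_days / half_life_days)

# Two years of daily observations whose target slowly drifts over time.
dates = pd.date_range("2022-01-01", periods=730, freq="D")
rng = np.random.default_rng(0)
X = rng.normal(size=(len(dates), 3))
drift = np.linspace(0.0, 2.0, len(dates))
y = X @ np.array([1.0, -0.5, 0.2]) + drift + rng.normal(scale=0.1, size=len(dates))

weights = age_weights(pd.Series(dates), now=dates[-1])

model = LinearRegression()
model.fit(X, y, sample_weight=weights)  # recent instances dominate the fit
```

A seasonal variant of the same idea would instead boost the weight of instances from the same month or quarter of previous years, matching the economics example above.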
