Posts

Showing posts from July, 2017

What is L-BFGS?

What is L-BFGS? Answer: Limited-memory BFGS (L-BFGS or LM-BFGS) is an optimization algorithm in the family of quasi-Newton methods that approximates the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm using a limited amount of computer memory. It is a popular algorithm for parameter estimation in machine learning.

What does BFGS stand for?

What does BFGS stand for? Answer: Broyden-Fletcher-Goldfarb-Shanno algorithm

What is the trade-off between precision and recall?

What is the trade-off between precision and recall? Answer: A model that predicts "yes" when it's even a little bit confident will probably have a high recall but a low precision; a model that predicts "yes" only when it's extremely confident is likely to have a low recall and a high precision. Alternatively, you can think of this as a trade-off between false positives and false negatives. Saying "yes" too often (high recall) will give you lots of false positives; saying "no" too often will give you lots of false negatives (high precision).

How would you divide up a data set for training and testing?

How would you divide up a data set for training and testing? Answer: Split your data set, so that two-thirds of it is used to train the model, after which we test/measure the model's performance on the remaining third.

Machine learning: What tends to happen with the cross-validation error in a linear model as the degree of polynomial increases?

Machine learning: What tends to happen with the cross-validation error in a linear model as the degree of polynomial increases? Answer: It starts high (high bias) and decreases, reaching a minimum, and then increases (high variance).

Machine learning: What tends to happen with the training error in a linear model as the degree of polynomial increases?

Machine learning: What tends to happen with the training error in a linear model as the degree of polynomial increases? Answer: The error decreases, but too high a degree of polynomial will cause overfitting.

How many hidden layers should there be in a neural network?

How many hidden layers should there be in a neural network? Answer: Start with 1 as a default, and if more than one, have the same number of units at each layer. The more the better. The number of units in each hidden layer should be more than, or a multiple of, the input units.

What are the number of neurons (units) at the output layer of a neural network performing classification?

What are the number of neurons (units) at the output layer of a neural network performing classification? Answer: The number of classes you are classifying.

What are the number of neurons (units) at the input layer of a neural network?

What are the number of neurons (units) at the input layer of a neural network? Answer: The number of features.

How should you initialize Theta for a neural network?

How should you initialize Theta for a neural network? Answer: Initialize as a matrix of random reals between 0 and 1. Constrain within a range of +/- epsilon using Theta 2epsilon - epsilon.

What makes neural networks superior over regression or classification?

What makes neural networks superior over regression or classification? Answer: Each hidden layer learns its own features instead of being given features.

What are some alternative algorithms that can optimize for a logistic regression problem?

What are some alternative algorithms that can optimize for a logistic regression problem? Answer: - conjugate gradient - BGFS - L-BGFS

What can you do if your model suffers from overfitting due to high variance?

What can you do if your model suffers from overfitting due to high variance? Answer: You can remove features. Another solution is to obtain more training examples (if you can). Or use regularization.

What can you do when your model has high bias (which means it performs poorly even on your training data)?

What can you do when your model has high bias (which means it performs poorly even on your training data)? Answer: One thing to try is adding more features.

Low bias but very high variance typically correspond to _______.

Low bias but very high variance typically correspond to _______. Answer: overfitting

High bias and low variance typically correspond to _______.

High bias and low variance typically correspond to _______. Answer: underfitting

What is another name for the F1 score?

What is another name for the F1 score? Answer: the harmonic mean of precision and recall

How would you calculate the F1 score?

How would you calculate the F1 score? Answer: 2 precision recall / (precision + recall)

In statistics, what does recall measure?

In statistics, what does recall measure? Answer: Recall measures what fraction of the positives our model identified.

In statistics, what does precision measure?

In statistics, what does precision measure? Answer: Precision measures how accurate our positive predictions are.

In statistics, how would you calculate precision?

In statistics, how would you calculate precision? Answer: true_pos / (true_pos + false_pos)

What is a Type 2 error?

What is a Type 2 error? Answer: A false negative

What is a Type 1 error?

What is a Type 1 error? Answer: A false positive

How would you split up a data set in order to choose from multiple models?

How would you split up a data set in order to choose from multiple models? Answer: In such a situation, you should split the data into three parts: a training set for building models, a validation set for choosing among trained models (called the cross-validation set), and a test set for judging the final model.

What is the time and space complexity of merge sort?

What is the time and space complexity of merge sort? Answer: O(n lg n) time O(n) space

What is the time and space complexity of heapsort?

What is the time and space complexity of heapsort? Answer: O(n lg n) time O(1) space