A: The first 80 rows of the data are assigned to training; the last 20 are assigned to testing.
B: The first 20 rows of the data are assigned to training; the last 80 are assigned to testing.
C: Sample 100 rows from the table with replacement. The first 80 rows are assigned to training; the last 20 are assigned to testing.
D: Sample 100 rows from the table without replacement. The first 80 rows are assigned to training; the last 20 are assigned to testing.
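To make the difference between sampling with and without replacement concrete, here is a minimal Python sketch. Python, the helper name `split_rows`, and the stand-in table of 100 row indices are assumptions for illustration, not part of the original slides. Sampling every row without replacement is a random permutation, so no row can land in both sets; sampling with replacement may copy the same row into both the training and the testing set.

```python
import random


def split_rows(rows, n_train, seed=0):
    """Shuffle rows (sample all of them without replacement), then split.

    `split_rows` is a hypothetical helper written for this example.
    """
    rng = random.Random(seed)
    shuffled = rng.sample(rows, k=len(rows))  # a random permutation of rows
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in for a 100-row table: each "row" is just its index.
rows = list(range(100))

# Without replacement: first 80 shuffled rows train, last 20 test.
train, test = split_rows(rows, n_train=80)

# With replacement: the same row can be drawn more than once,
# so a row may appear in both train_wr and test_wr.
rng = random.Random(0)
resampled = rng.choices(rows, k=100)
train_wr, test_wr = resampled[:80], resampled[80:]
shared_rows = set(train_wr) & set(test_wr)

print(len(train), len(test))   # sizes of the two disjoint sets
print(len(shared_rows))        # rows shared between train_wr and test_wr
```

Taking the first 80 rows without shuffling would also give disjoint sets, but it only yields a representative split if the table is not ordered in some meaningful way.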
A: A student fails a final exam because they did not go over the practice final exam, which would have given them practice with important concepts.
B: A student fails a final exam because they memorized the answers to the practice final exam without understanding the important concepts.
C: Fitting a model that crosses through all points in the training set.
D: Fitting a model function that more or less follows the points in the training set.
E: A student fails a final exam despite healthy study habits because the professor gives an exam containing topics that were not taught in the course.
A: True
B: False
Which one of these (open pollev.com) is not an example of overfitting (either in real life or in statistics)?
If I decrease the RSS (e.g., by fitting a more accurate model), does the \(R^2\) value necessarily increase?
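One way to reason about this question is to compute both quantities directly. The sketch below assumes the usual definition \(R^2 = 1 - \text{RSS}/\text{TSS}\), which is not stated on this slide, and uses made-up numbers; the function names are hypothetical. Because TSS depends only on the observed values, two models evaluated on the same data share the same TSS.

```python
def rss(y, y_hat):
    """Residual sum of squares between observations and predictions."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))


def r_squared(y, y_hat):
    """R^2 = 1 - RSS / TSS, where TSS depends only on the observed y."""
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - rss(y, y_hat) / tss


# Illustrative observations and two sets of predictions for the same data.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
rough = [2.0, 2.0, 3.0, 4.0, 4.0]    # larger residuals
closer = [1.5, 2.0, 3.0, 4.0, 4.5]   # smaller residuals

print(rss(y, rough), r_squared(y, rough))
print(rss(y, closer), r_squared(y, closer))
```

Under that definition, a smaller RSS on the same data always gives a larger \(R^2\); the comparison is not meaningful across different data sets, where TSS differs.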
Fill in the blanks: overfit models tend to
fit the training data (well/poorly)
fit the testing data (well/poorly)
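The pattern in this fill-in-the-blank can be tried out on a small example. The sketch below is an illustration under stated assumptions, not the course's own code: the data are made up (a true trend of 2x plus hand-picked noise), the overfit model is a polynomial that passes through every training point (Lagrange interpolation), and the comparison model is an ordinary least-squares line. All function names are hypothetical.

```python
def lagrange_predict(xs, ys, x):
    """Evaluate the polynomial that passes through every (xs, ys) point at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def fit_line(xs, ys):
    """Least-squares slope and intercept for a simple linear model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(ys, preds):
    """Mean squared error between observations and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Made-up data: the true trend is y = 2x, plus hand-picked noise.
x_train = [0.0, 1.0, 2.0, 3.0, 4.0]
y_train = [1.0, 1.0, 5.0, 5.0, 9.0]
x_test = [0.5, 1.5, 2.5, 3.5]
y_test = [1.3, 2.8, 5.1, 6.7]

slope, intercept = fit_line(x_train, y_train)

line_train_mse = mse(y_train, [slope * x + intercept for x in x_train])
line_test_mse = mse(y_test, [slope * x + intercept for x in x_test])
poly_train_mse = mse(y_train, [lagrange_predict(x_train, y_train, x) for x in x_train])
poly_test_mse = mse(y_test, [lagrange_predict(x_train, y_train, x) for x in x_test])

print("line:        train", line_train_mse, "test", line_test_mse)
print("interpolant: train", poly_train_mse, "test", poly_test_mse)
```

On this data the interpolating polynomial has zero training error but a larger testing error than the line, because it has fit the noise in the training points rather than the underlying trend.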
Suppose I overfit my model to the training data. In which scenario (for which training data) would I expect the test set performance to be significantly worse? Assume that the testing sets A and B look like their corresponding training sets.
