Model Training and Evaluation

Model Training

Optimization

Cross-Validation

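  • leave-one-out vs k-fold
    • information leaks when the same folds are used both to tune parameters and to report the error
  • Bootstrapping
    • 1/e rule: about 1/e (roughly 36.8%) of the rows are left out of each bootstrap sample
  • Time-series data
    • do not use future information to predict the past
    • ARIMA
  • How to decompose bias and variance
    • in-sample vs out-of-sample error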
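
The bootstrap and time-series bullets above can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: `out_of_bag_fraction` and `expanding_window_splits` are hypothetical helper names, and the sample sizes are arbitrary. The first function estimates the share of rows a bootstrap resample never draws, which should land near 1/e. The second builds train/test index pairs in which every test index comes after every train index.

```python
import math
import random


def out_of_bag_fraction(n, trials=200, seed=0):
    """Average share of the n rows that a bootstrap resample never draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw n row indices with replacement and keep the distinct ones.
        drawn = {rng.randrange(n) for _ in range(n)}
        total += (n - len(drawn)) / n
    return total / trials


def expanding_window_splits(n, n_splits, test_size):
    """Yield (train_idx, test_idx) pairs where the test block follows the train block."""
    for k in range(n_splits):
        test_start = n - (n_splits - k) * test_size
        if test_start <= 0:
            raise ValueError("not enough observations for the requested splits")
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


oob = out_of_bag_fraction(1000)
splits = list(expanding_window_splits(n=10, n_splits=3, test_size=2))

print(f"out-of-bag fraction: {oob:.3f}, 1/e: {1 / math.e:.3f}")
for train_idx, test_idx in splits:
    print("train up to", train_idx[-1], "-> test", test_idx)
```

The training window grows with each split while the test block always lies in the future, so no fold validates on data that precedes its training rows.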

Model Evaluation

Confusion Matrix

  • When accuracy can be irrelevant
    • unbalanced sample
  • High Precision/High Recall situation
    • system error/security - recall first
    • spam mail - precision first
    • investment signal
  • F1 score, F-beta score
  • ROC Curve, PR Curve
    • TPR, FPR
    • ROC's relationship with rank correctness: AUC is the probability that a random positive is scored above a random negative
    • ROC vs PR Curve*
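
The metrics in the list above can be computed by hand on a toy example. This is a minimal sketch in plain Python, and the labels, scores, and function names are made up for illustration. It shows an imbalanced sample where accuracy looks good while recall is zero, the F-beta formula, and AUC computed directly as the share of positive/negative pairs that are ranked correctly.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta > 1 weights recall more, beta < 1 weights precision more."""
    b2 = beta * beta
    return safe_ratio((1 + b2) * precision * recall, b2 * precision + recall)


def auc_by_ranking(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Imbalanced sample: 5 positives, 95 negatives, and a model that always predicts 0.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = safe_ratio(tp, tp + fp)
recall = safe_ratio(tp, tp + fn)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")

# Ranking view of AUC on four scored examples.
auc = auc_by_ranking([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(f"AUC={auc:.2f}, F1={f_beta(0.5, 1.0):.3f}, F2={f_beta(0.5, 1.0, beta=2.0):.3f}")
```

The always-negative model is right 95% of the time yet finds none of the positives, which is the case where accuracy alone says little.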

Loss Functions and Measures

  • RMSE weakness
    • sensitive to noise and outliers, because errors are squared
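
A small comparison makes the outlier sensitivity concrete. The sketch below uses made-up numbers and plain Python. MAE is brought in only as a contrast and is not part of the original notes. One bad prediction is added to an otherwise uniform set of errors.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error, used here only as a contrast to RMSE."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [10.0, 12.0, 11.0, 13.0, 12.0]
clean_pred = [11.0, 11.0, 12.0, 12.0, 13.0]    # every error has magnitude 1
outlier_pred = [11.0, 11.0, 12.0, 12.0, 32.0]  # last error has magnitude 20

rmse_clean, mae_clean = rmse(y_true, clean_pred), mae(y_true, clean_pred)
rmse_out, mae_out = rmse(y_true, outlier_pred), mae(y_true, outlier_pred)

print(f"clean:   RMSE={rmse_clean:.2f} MAE={mae_clean:.2f}")
print(f"outlier: RMSE={rmse_out:.2f} MAE={mae_out:.2f}")
```

With equal-sized errors the two measures agree. After the single outlier, RMSE grows much more than MAE because the large error is squared before averaging.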

A/B Testing

Overfitting and Underfitting

  • Explain/derive Bias-Variance Tradeoff
  • How to decompose error into bias and variance
    • in-sample and out-of-sample error
  • lower variance/overfitting
    • more data/data augmentation
    • lower model complexity
      • AIC, BIC
    • regularization
    • ensemble learning
  • lower bias/underfitting
    • new feature/feature engineering
    • higher model complexity
    • lower regularization
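
For the "explain/derive" bullet, the standard decomposition for squared error can be written as follows. The notation is an assumption rather than something taken from these notes: the target is $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2$, $\hat f$ is the model fitted on a random training set, and the expectation is taken over training sets and noise at a fixed point $x$.

```latex
\mathbb{E}\big[(y - \hat f(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat f(x) - \mathbb{E}[\hat f(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The cross terms vanish because $\varepsilon$ has zero mean and is independent of $\hat f(x)$, and because $\hat f(x) - \mathbb{E}[\hat f(x)]$ has zero mean by construction.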

Hyper-parameter Search