Skip to content

Latest commit

 

History

History
57 lines (41 loc) · 2.39 KB

08-xgb-tuning.md

File metadata and controls

57 lines (41 loc) · 2.39 KB

6.8 XGBoost parameter tuning

Slides

Notes

XGBoost has various tunable parameters but the three most important ones are:

  • eta (default=0.3)
    • It is also called learning_rate and is used to prevent overfitting by regularizing the weights of new features in each boosting step. range: [0, 1]
  • max_depth (default=6)
    • Maximum depth of a tree. Increasing this value will make the model more complex and more likely to overfit. range: [0, inf]
  • min_child_weight (default=1)
    • Minimum number of samples in leaf node. range: [0, inf]

For XGBoost models, there are other ways of finding the best parameters as well but the one we implement in the notebook follows the sequence of:

  • First find the best value for eta
  • Second, find the best value for max_depth
  • Third, find the best value for min_child_weight

Other useful parameter are:

  • subsample (default=1)
    • Subsample ratio of the training instances. Setting it to 0.5 means that model would randomly sample half of the training data prior to growing trees. range: (0, 1]
  • colsample_bytree (default=1)
    • This is similar to random forest, where each tree is made with the subset of randomly choosen features.
  • lambda (default=1)
    • Also called reg_lambda. L2 regularization term on weights. Increasing this value will make model more conservative.
  • alpha (default=0)
    • Also called reg_alpha. L1 regularization term on weights. Increasing this value will make model more conservative.

Add notes from the video (PRs are welcome)

⚠️ The notes are written by the community.
If you see an error here, please create a PR with a fix.

Navigation