Update default tree-learning parameters and training budgets #12214

Draft
RAMitchell wants to merge 4 commits into dmlc:master from RAMitchell:codex/default-params-rfc

@RAMitchell
Member

This implements the default-parameter update proposed in #12131.

Changes:

  • Updates core tree-learning defaults:
    • eta / learning_rate: 0.3 -> 0.1
    • min_child_weight: 1 -> 2
    • subsample: 1.0 -> 0.8
    • colsample_bytree: 1.0 -> 0.8
    • leaves max_depth at 6, following the later issue discussion
  • Aligns Python-facing default training budgets:
    • xgboost.train(..., num_boost_round=300)
    • xgboost.cv(..., num_boost_round=300)
    • xgboost.dask.train(..., num_boost_round=300)
    • sklearn fallback DEFAULT_N_ESTIMATORS = 300
  • Updates Spark/JVM defaults to match the new tree defaults and numRound=300
  • Updates parameter docs, R docs, and Spark tests that asserted the previous sklearn fallback default
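For callers who want to keep the previous behaviour after this change, one option is to pin the old values explicitly. The sketch below only builds a parameter dictionary from the values listed above; `pin_legacy_defaults` and the two constant names are hypothetical helpers for illustration, not part of XGBoost.

```python
# Previous and proposed tree defaults, copied from the list in this PR description.
LEGACY_TREE_DEFAULTS = {
    "eta": 0.3,
    "min_child_weight": 1,
    "subsample": 1.0,
    "colsample_bytree": 1.0,
}
PROPOSED_TREE_DEFAULTS = {
    "eta": 0.1,
    "min_child_weight": 2,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
}


def pin_legacy_defaults(params=None):
    """Return a copy of `params` with the previous defaults filled in.

    Hypothetical helper, not an XGBoost API. Keys the caller already set
    are left untouched.
    """
    user = dict(params or {})
    pinned = {**LEGACY_TREE_DEFAULTS, **user}
    # learning_rate is an alias of eta, so avoid emitting both
    # when the caller only supplied learning_rate.
    if "learning_rate" in user and "eta" not in user:
        del pinned["eta"]
    return pinned


params = pin_legacy_defaults({"objective": "binary:logistic", "max_depth": 4})
print(params)
```

The resulting dictionary could then be passed as the `params` argument of `xgboost.train`, together with an explicit `num_boost_round`, so that neither the tree defaults nor the training budget depend on the installed version.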

One risk to watch in CI is flakiness in tests that train with implicit defaults but do not set a seed. Because the new defaults turn on row and column subsampling, any unseeded test that relies on deterministic full-sample training may need an explicit seed or explicit subsample=1 / colsample_bytree=1.
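The two mitigations mentioned above could be sketched as a small test helper. `make_deterministic` is a hypothetical name used only for illustration; it is not part of XGBoost or its test suite, and it only edits the parameter dictionary.

```python
def make_deterministic(params=None, seed=None):
    """Return a copy of `params` adjusted for a deterministic test run.

    Hypothetical test helper, not an XGBoost API.
    - With `seed`: keep subsampling enabled but fix the RNG seed.
    - Without `seed`: disable row and column subsampling entirely.
    """
    fixed = dict(params or {})
    if seed is not None:
        fixed["seed"] = seed
    else:
        fixed["subsample"] = 1.0
        fixed["colsample_bytree"] = 1.0
    return fixed


# Option 1: seeded run, subsampling left at whatever the defaults are.
print(make_deterministic({"max_depth": 3}, seed=7))
# Option 2: full-sample training, independent of the default subsampling rates.
print(make_deterministic({"max_depth": 3}))
```

Which option fits depends on the test: a fixed seed still exercises the subsampling code path, while forcing full-sample training matches what the test relied on before the defaults changed.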

Local checks:

  • cmake -S . -B build-cpu -DUSE_CUDA=OFF -DGOOGLE_TEST=ON -DUSE_DMLC_GTEST=ON
  • cmake --build build-cpu --target testxgboost -j35
  • cmake --build build-cpu --target xgboost -j35
  • ./build-cpu/testxgboost --gtest_filter=Param.*:XGBoostParameter.*:Learner.ParameterValidation:GBTree.SelectTreeMethod
  • Focused Python pytest slice: 23 passed, 1 skipped
