
Does adanet support GBDT as subnetwork? #121

Open
fangkuann opened this issue Sep 5, 2019 · 3 comments
Labels: question (Further information is requested)

No description provided.

cweill added the "question (Further information is requested)" label Sep 9, 2019

cweill (Contributor) commented Sep 9, 2019

@fangkuann: Yes it does! You can see how we use it here. Note that this is the old contrib version of GBDT; there is a new one called tf.estimator.BoostedTreesEstimator which should also work, though we haven't tried it.

Please give it a shot and let us know how it works for you.
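
For reference, a minimal sketch of constructing the newer estimator as a candidate (untested on our end; the feature columns, head, and hyperparameters below are placeholders, not values we have validated):

```python
import tensorflow as tf

# Illustrative placeholders; substitute your real feature columns and head.
feature_columns = [
    tf.feature_column.numeric_column("f%d" % i) for i in range(10)
]
head = tf.estimator.BinaryClassHead()  # any tf.estimator head for your task

tree_estimator = tf.estimator.BoostedTreesEstimator(
    feature_columns=feature_columns,
    n_batches_per_layer=100,  # batches consumed to build each tree layer
    head=head,
    n_trees=100,
    max_depth=6,
)
```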

fangkuann (Author) commented Sep 10, 2019

@cweill thanks for your answer!
In this paper https://ai.google/research/pubs/pub48133/, the authors combine a GBDT-based model and a DNN-based model to improve model performance.
I want to reproduce this in the product I work on using AdaNet, since I want to train both models with a single training tool (training the GBDT with LightGBM and the NN with TensorFlow makes the system complex and hard to maintain).

According to the tutorial https://medium.com/tensorflow/combining-multiple-tensorflow-hub-modules-into-one-ensemble-network-with-adanet-56fa73588bb0, using BoostedTreesEstimator here seems almost the same as using LinearEstimator: just pass tree_estimator in place of linear_estimator, as in the code below.

```python
import adanet

# tree_estimator, dnn_estimator, ranking_head, and run_config are defined
# elsewhere; the only change from the tutorial is swapping the tree
# candidate in for the linear one.
estimator = adanet.AutoEnsembleEstimator(
    head=ranking_head,
    candidate_pool=[
        tree_estimator,
        dnn_estimator,
    ],
    config=run_config,
    max_iteration_steps=50000,
)
```

However, our DNN-based model has already been trained and deployed using the TF-Ranking library, so the training input_fn produces each feature with shape [None, list_size, 1].

When running, the GBDT throws an exception while creating buckets for each feature:

ValueError: List argument 'bucket_boundaries' to 'boosted_trees_bucketize' Op with length 1 must match length 64 of argument 'float_values'.

where 64 is the list_size in our training config. It seems the GBDT expects shape [None, 1] for each feature.

I don't know whether it is still possible to combine TF-Ranking and AdaNet. If it is, what is the recommended way to do it?
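
For what it's worth, one workaround I have considered (purely a sketch; the helper name is mine, and it assumes treating each list item as an independent example is acceptable for the tree model):

```python
import tensorflow as tf

def flatten_listwise_features(features, labels):
    """Sketch: reshape listwise [batch, list_size, 1] tensors to
    [batch * list_size, 1], so each list item becomes its own example,
    the [None, 1] shape the boosted-trees bucketizer appears to expect."""
    flat_features = {
        name: tf.reshape(value, [-1, 1]) for name, value in features.items()
    }
    flat_labels = tf.reshape(labels, [-1])  # one label per list item
    return flat_features, flat_labels

# Would be applied inside the tree candidate's input pipeline, e.g.:
# dataset = dataset.map(flatten_listwise_features)
```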

cweill (Contributor) commented Sep 12, 2019

@fangkuann Thanks for sharing that paper; I wasn't aware of it and will share it with our team. As for the Core GBDT in AdaNet: your sample code looks fine. However, I have not used TF-Ranking.

One thing you could try is using the same TF-Ranking head for both the NN and the GBDT, as well as for the AutoEnsembleEstimator. What does dnn_estimator look like? Is it using ranking_head too?
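
A rough, untested sketch of that wiring (every feature column and hyperparameter below is a placeholder, and whether the ranking head is compatible with the boosted-trees internals is exactly the open question):

```python
import adanet
import tensorflow as tf
import tensorflow_ranking as tfr

# Placeholder inputs; replace with your real columns and config.
feature_columns = [
    tf.feature_column.numeric_column("f%d" % i) for i in range(10)
]
run_config = tf.estimator.RunConfig()

# One shared TF-Ranking head for both candidates and the ensemble.
ranking_head = tfr.head.create_ranking_head(
    loss_fn=tfr.losses.make_loss_fn(tfr.losses.RankingLossKey.SOFTMAX_LOSS),
    eval_metric_fns={
        "metric/ndcg@5": tfr.metrics.make_ranking_metric_fn(
            tfr.metrics.RankingMetricKey.NDCG, topn=5),
    },
    optimizer=tf.compat.v1.train.AdagradOptimizer(learning_rate=0.1),
)

dnn_estimator = tf.estimator.DNNEstimator(
    head=ranking_head,
    hidden_units=[64, 32],
    feature_columns=feature_columns,
)

tree_estimator = tf.estimator.BoostedTreesEstimator(
    feature_columns=feature_columns,
    n_batches_per_layer=100,
    head=ranking_head,
)

estimator = adanet.AutoEnsembleEstimator(
    head=ranking_head,
    candidate_pool=[tree_estimator, dnn_estimator],
    config=run_config,
    max_iteration_steps=50000,
)
```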

cweill self-assigned this Sep 12, 2019
cweill pushed a commit that referenced this issue Sep 19, 2019
…#1 #121

In estimator_distributed_test_runner.py.

PiperOrigin-RevId: 270085601