The flex-trees package is a set of tools and utilities for working with Decision Tree (DT) models in Federated Learning (FL). It is an extension of the FLEXible framework and is designed to be used with it.
flex-trees ships with several state-of-the-art decision tree models for federated learning. It also provides multiple tabular datasets for testing the models.
The methods implemented in the repository are:
Model | Description | Citation |
---|---|---|
Federated ID3 | The ID3 model adapted to a federated learning scenario. | A Hybrid Approach to Privacy-Preserving Federated Learning |
Federated Random Forest | The Random Forest (RF) model adapted to a federated learning scenario. Each client builds an RF locally, then N trees are randomly sampled from each client to form a global RF composed of the trees retrieved from the clients. | Federated Random Forests can improve local performance of predictive models for various healthcare applications |
Federated Gradient Boosting Decision Trees | The Gradient Boosting Decision Trees model adapted to a federated learning scenario. A global hash table is first created to align the data between the clients without sharing it. After that, N trees (CART) are built by the clients. The ensemble is built iteratively: one client builds a tree, the tree is added to the ensemble, and the instance weights are then updated so that the next client can build the next tree with the updated weights. | Practical Federated Gradient Boosting Decision Trees |
Interpretable Client Decision Tree Aggregation For Federated Learning process (ICDTA4FL process) | The ICDTA4FL process lets each client build a decision tree locally; the local trees are then aggregated into a global tree by merging the rules extracted from them. The process is iterative: the clients build their trees, and the trees that surpass a threshold are selected for merging. To merge the trees, they are transformed into rules, and the merged rules are used to build a global tree. The process is tree independent, and the code supports merging ID3, CART and C4.5 trees. | Interpretable Client Decision Tree Aggregation For Federated Learning process |
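
To make the Federated Random Forest aggregation step more concrete, the following is a minimal sketch in plain Python. It does not use the flex-trees API: the names `make_stump`, `sample_global_forest` and `predict_majority` are hypothetical, the local trees are replaced by toy one-split stumps, and it assumes that N trees are drawn from each client and that the global forest predicts by majority vote.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence

# A "tree" here is any callable mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], int]


def make_stump(feature: int, threshold: float) -> Tree:
    """Hypothetical stand-in for a locally trained tree: a one-split stump."""
    def predict(x: Sequence[float]) -> int:
        return int(x[feature] > threshold)
    return predict


def sample_global_forest(
    client_forests: List[List[Tree]], n_trees_per_client: int, seed: int = 0
) -> List[Tree]:
    """Randomly sample up to N trees from each client's local forest."""
    rng = random.Random(seed)
    global_forest: List[Tree] = []
    for forest in client_forests:
        k = min(n_trees_per_client, len(forest))
        global_forest.extend(rng.sample(forest, k))
    return global_forest


def predict_majority(forest: List[Tree], x: Sequence[float]) -> int:
    """Predict by majority vote over all trees in the forest."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Three clients, each holding a local forest of four toy stumps.
client_forests = [
    [make_stump(0, t) for t in (0.2, 0.4, 0.6, 0.8)],
    [make_stump(1, t) for t in (0.1, 0.3, 0.5, 0.7)],
    [make_stump(0, t) for t in (0.25, 0.5, 0.75, 0.9)],
]

global_forest = sample_global_forest(client_forests, n_trees_per_client=2)
print(len(global_forest))                             # 3 clients x 2 trees
print(predict_majority(global_forest, [0.95, 0.95]))  # every stump votes 1
```

Only the sampled trees leave each client in this sketch; the raw training data never does, which is the point of aggregating at the model level.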
The tabular datasets available in the repository are:
Dataset | Description | Citation |
---|---|---|
Adult | Demographic information about individuals; the task is to predict whether a person's income is greater than 50K. | UCI Machine Learning Repository |
Breast Cancer | Information about breast cancer cases; the task is to predict whether the cancer is benign or malignant. | UCI Machine Learning Repository |
Credit Card | Information about credit card transactions; the task is to predict whether a transaction is fraudulent. | Kaggle |
ILPD | Records from the Indian Liver Patient Dataset; the task is to predict whether a patient has liver disease. | UCI Machine Learning Repository |
Nursery | Information about nursery school applications; the task is to predict the acceptability of each application. | UCI Machine Learning Repository |
Bank Marketing | Information about a bank's marketing campaigns; the task is to predict whether a client will subscribe to a term deposit. | UCI Machine Learning Repository |
Magic Gamma | Data on MAGIC gamma telescope events; the task is to predict whether an event is signal or background. | UCI Machine Learning Repository |
To get started with flex-trees, you can check the notebooks available in the repository. They cover the following topics:
- Federated ID3 with FLEXible.
- Federated Random Forest with FLEXible.
- Practical Federated Gradient Boosting Decision Trees with FLEXible.
We recommend Anaconda/Miniconda as the package manager. The following table lists the corresponding flex and flex-trees versions and the supported Python versions.
flex | flex-trees | Python |
---|---|---|
main / nightly | main / nightly | >=3.8, <=3.11 |
v0.6.0 | v0.1.0 | >=3.8, <=3.11 |
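
As a quick sanity check before installing, the supported Python range can be tested from the interpreter itself. This is a minimal sketch; `is_supported` is a hypothetical helper, and it assumes that `<=3.11` covers every 3.11.x patch release.

```python
import sys

# Supported range from the compatibility table (assumed to include all 3.11.x).
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 11)


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version is within the supported range."""
    return MIN_PYTHON <= tuple(version_info[:2]) <= MAX_PYTHON


print(is_supported())
```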
To install the package, you can use the following commands:
Using pip:
pip install flextrees
Download the repository and install it locally:
git clone git@github.com:FLEXible-FL/flex-trees.git
cd flex-trees
pip install -e .
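
After installing, one way to confirm that the package is visible to the current environment is to query its metadata with the standard library. This is a sketch: `installed_version` is a hypothetical helper, and it assumes the distribution is registered under the name `flextrees`, as in the pip command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "flextrees") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
print(version if version is not None else "flextrees is not installed")
```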
If you use this package, please cite the following paper:
title={FLEX: FLEXible Federated Learning Framework},
author={Herrera, Francisco and Jim{\'e}nez-L{\'o}pez, Daniel and Argente-Garrido, Alberto and Rodr{\'\i}guez-Barroso, Nuria and Zuheros, Cristina and Aguilera-Martos, Ignacio and Bello, Beatriz and Garc{\'\i}a-M{\'a}rquez, Mario and Luz{\'o}n, M},
journal={arXiv preprint arXiv:2404.06127},
year={2024}
}