This repository is the official implementation of the KDD 2023 paper "Cognitive Evolutionary Search to Select Feature Interactions for Click-Through Rate Prediction". For more background, see the promotional video, the slides, and the Chinese-language walkthrough (中文解读).
- Install Python and PyTorch (version 1.8 or higher). We used Python 3.8 and PyTorch 1.13.0.
- To train on a GPU, install CUDA as well.
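A quick way to confirm that the environment meets these requirements is sketched below. The helper names `meets_minimum` and `describe_environment` are illustrative and not part of this repository. The `torch.__version__` attribute and `torch.cuda.is_available()` are standard PyTorch, and the check reports a missing installation instead of failing if PyTorch cannot be imported.

```python
import re
import sys


def meets_minimum(version, minimum):
    """Return True if a version string such as '1.13.0+cu117' is >= minimum."""
    # Drop local suffixes like '+cu117', then compare leading numeric components.
    parts = re.findall(r"\d+", version.split("+")[0])
    parsed = tuple(int(p) for p in parts[: len(minimum)])
    return parsed >= minimum


def describe_environment():
    """Summarize the Python, PyTorch, and CUDA status of this interpreter."""
    lines = [f"Python {sys.version_info.major}.{sys.version_info.minor}"]
    try:
        import torch
    except ImportError:
        lines.append("PyTorch is not installed")
        return "\n".join(lines)
    ok = meets_minimum(torch.__version__, (1, 8))
    lines.append(f"PyTorch {torch.__version__} ({'OK' if ok else 'older than 1.8'})")
    lines.append(f"CUDA available: {torch.cuda.is_available()}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe_environment())
```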
Before preprocessing, run ./data/mkdir.sh.
When it finishes, the following three directory trees will exist at the same level as the project:
criteo
├── bucket
├── feature_map
└── processed
avazu
└── processed
huawei
└── processed
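For reference, the layout above can be reproduced with a few lines of Python. This is a sketch of an equivalent, not the contents of ./data/mkdir.sh. The function name `make_data_dirs` and its `root` argument are illustrative, and `root` is assumed to be the directory that contains the project.

```python
from pathlib import Path

# Directory layout listed above: dataset name -> subdirectories.
LAYOUT = {
    "criteo": ["bucket", "feature_map", "processed"],
    "avazu": ["processed"],
    "huawei": ["processed"],
}


def make_data_dirs(root):
    """Create the dataset directories under root and return the created paths."""
    created = []
    for dataset, subdirs in LAYOUT.items():
        for sub in subdirs:
            path = Path(root) / dataset / sub
            # exist_ok makes the call safe to repeat.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created


if __name__ == "__main__":
    import tempfile

    # Demonstrate in a throwaway directory; pass the real parent directory in practice.
    with tempfile.TemporaryDirectory() as tmp:
        for p in make_data_dirs(tmp):
            print(p.relative_to(tmp))
```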
We ran our experiments on three public real-world datasets: Criteo, Avazu, and Huawei. Download links are listed below.
- Criteo: Download the raw dataset from https://www.kaggle.com/c/criteo-display-ad-challenge/data or https://www.kaggle.com/datasets/mrkmakr/criteo-dataset?resource=download. For preprocessing, see ./data/criteoPreprocess.py.
- Avazu: Download the raw dataset from https://www.kaggle.com/c/avazu-ctr-prediction/data. For preprocessing, see ./data/avazuPreprocess.py.
- Huawei: Download the raw dataset from https://www.kaggle.com/louischen7/2020-digix-advertisement-ctr-prediction. For preprocessing, see ./data/huaweiPreprocess.py.
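As background for the Criteo preprocessing step, the Kaggle training file is tab-separated: one click label, then 13 integer features, then 26 hashed categorical features, with missing values left empty. The sketch below parses one such line. It is not taken from ./data/criteoPreprocess.py; the function name `parse_criteo_line` and the choice of `None` for missing values are assumptions made for illustration.

```python
NUM_INT = 13  # integer features I1..I13
NUM_CAT = 26  # categorical features C1..C26


def parse_criteo_line(line):
    """Split one raw Criteo training line into (label, ints, cats)."""
    # Strip only the newline: trailing tabs can mark empty final fields.
    fields = line.rstrip("\n").split("\t")
    expected = 1 + NUM_INT + NUM_CAT
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    label = int(fields[0])
    # Empty strings mark missing values; represent them as None here.
    ints = [int(v) if v else None for v in fields[1 : 1 + NUM_INT]]
    cats = [v if v else None for v in fields[1 + NUM_INT :]]
    return label, ints, cats


if __name__ == "__main__":
    # Synthetic line: label, 13 integer slots, 26 categorical slots.
    sample = "\t".join(["1"] + ["5", ""] + ["0"] * 11 + ["68fd1e64", ""] + ["a"] * 24)
    label, ints, cats = parse_criteo_line(sample)
    print(label, ints[:2], cats[:2])
```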
With the source code in place, train the CELS model as follows:
$ cd main
$ python train.py --dataset=[dataset] --strategy=[strategy] --gpu=[gpu_id]
The --strategy option accepts one of '1,1', '1+1', 'n,1', or 'n+1'.
Model parameters can be changed in ./config/configs.py.
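To run several strategies in sequence, the training command can be assembled programmatically. This is a minimal sketch: `build_train_command` is a hypothetical helper, not part of the repository, and the dataset name `criteo` in the usage loop is an assumption about what `--dataset` accepts. Only the flags `--dataset`, `--strategy`, and `--gpu` come from the command shown in this README.

```python
import sys

# Strategy values listed in this README.
STRATEGIES = ("1,1", "1+1", "n,1", "n+1")


def build_train_command(dataset, strategy, gpu_id=0):
    """Return the argument list for one train.py run, validating the strategy."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy {strategy!r}; choose from {STRATEGIES}")
    return [
        sys.executable,
        "train.py",
        f"--dataset={dataset}",
        f"--strategy={strategy}",
        f"--gpu={gpu_id}",
    ]


if __name__ == "__main__":
    for strategy in STRATEGIES:
        # 'criteo' is an assumed dataset name; print the command instead of launching it.
        print(" ".join(build_train_command("criteo", strategy)))
```

Passing the returned list to `subprocess.run(cmd, cwd="main", check=True)` would launch the run from the main directory. Because the arguments are passed as a list, strategies containing commas or plus signs need no shell quoting.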
To visualize the model's evolution path as gene maps, run:
$ cd main
$ python plotUtils.py --dataset_strategy=[dataset_strategy] --datetime=[datetime]
If you have any questions about our paper or code, please email yrunl@mail.ustc.edu.cn or demon@mail.ustc.edu.cn.
Our code is built on DeepCTR-Torch (GitHub: shenweichen/DeepCTR-Torch), an easy-to-use, modular, and extensible PyTorch package of deep-learning-based CTR models.