CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning

This repository contains code for the paper "CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning". We implement an attack that measures the empirical privacy of federated training pipelines. The attack is based on designing out-of-distribution examples, which we call canaries.

Our implementation is based on a modified FLSim, but the code contained within canife can be used in any PyTorch-based FL framework of your choice.

Installation

Via pip and anaconda

conda create -n "canife" python=3.9 
conda activate canife
pip install -r ./requirements.txt

Paper Replication

CelebA

To replicate the above plot on CelebA with epsilon=50, first clone the LEAF repo and then, under ../leaf/data/celeba/, run:

./preprocess.sh -s niid --sf 1.0 -k 0 -t user

This should produce the splits all_data_0_0_keep_0_train_9.json and all_data_0_0_keep_0_test_9.json under ../leaf/data/celeba/train/ and ../leaf/data/celeba/test/ respectively.
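
Before launching training, it can help to confirm that preprocessing produced the expected files. The following is a minimal sketch and not part of the CANIFE codebase: the helper name missing_leaf_splits is hypothetical, and it assumes the train/ and test/ layout and the file names described above.

```python
from pathlib import Path

# File names taken from the CelebA instructions above.
CELEBA_SPLITS = {
    "train": "all_data_0_0_keep_0_train_9.json",
    "test": "all_data_0_0_keep_0_test_9.json",
}


def missing_leaf_splits(data_root, splits=CELEBA_SPLITS):
    """Return the expected split files that do not exist under data_root.

    Hypothetical helper: assumes each split lives at
    <data_root>/<subdir>/<file name>.
    """
    root = Path(data_root)
    expected = [root / subdir / name for subdir, name in splits.items()]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Path relative to the CANIFE checkout, as in the instructions above.
    for path in missing_leaf_splits("../leaf/data/celeba"):
        print(f"missing split: {path}")
```

For Shakespeare or Sent140, the same function could be called with a different splits mapping that uses the file names listed for those datasets.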

To begin training with periodic attacking run the command:

python launcher.py --dataset celeba --dump-path DUMP_PATH --data-root DATA_ROOT --epsilon 50 --canary-setup holdout --canary-test-type train_freeze --canary-loss loss2 --canary-design-pool-size 100 --canary-design-minibatch-size num_users --canary-num-test-batches 100 --canary-insert-offset 40 --users-per-round 100 --local-batch-size 128 --canary-epochs 2000 --fl-epochs 30 --model-arch resnet --canary-norm-matching True --canary-norm-constant 5 --device cpu

Where:

- DATA_ROOT is the path to the CelebA LEAF data folder, i.e. .../leaf/data/celeba
- DUMP_PATH is the path where CANIFE writes experiment training plots and logs; the default is ./local_checkpoints

One can also set --plot-path PLOT_PATH to output to DUMP_PATH/PLOT_PATH/ (by default plot_path=""). Change --device cpu to --device gpu to run on a single GPU.

NOTE: ./local_checkpoints/ has a toy example attack output (from a checkpointed DP model on Sent140) which should be deleted before running any new experiments. Alternatively, change the --plot-path arg to a new folder within ./local_checkpoints/.
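
To avoid mixing new results with the toy example, one option is to check the output folder before launching. The sketch below is an illustration only: resolve_output_dir and existing_outputs are hypothetical helpers, the folder name celeba_eps50 is made up, and the DUMP_PATH/PLOT_PATH layout is taken from the description above rather than from the launcher's source.

```python
from pathlib import Path


def resolve_output_dir(dump_path="./local_checkpoints", plot_path=""):
    """Return DUMP_PATH/PLOT_PATH, the folder described in the text above."""
    return Path(dump_path) / plot_path


def existing_outputs(output_dir):
    """List files already under output_dir (empty list if it does not exist)."""
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        return []
    return sorted(p for p in output_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    # "celeba_eps50" is an illustrative value for --plot-path.
    out = resolve_output_dir(plot_path="celeba_eps50")
    leftovers = existing_outputs(out)
    if leftovers:
        print(f"{out} already contains {len(leftovers)} file(s); pick a new --plot-path")
    else:
        print(f"{out} is free to use")
```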

Once the experiment has been completed you can extract the data and plot as follows:

- Extract attack logs from attack checkpoints (.tar files) by running the script python extract_exp.py --path DUMP_PATH/ --csv-name celeba_eps50_tf.csv
- Plot the attack using the script python plot_sweep.py --csv-path PATH_TO_CSV/celeba_eps50_tf.csv

This will output the figure under /plotting/.

NOTE: To generate the full figure, rerun the experiment with --epsilon equal to 10 and 30 with different --plot-path args, extract each .csv and combine them before plotting.
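
The combining step is not scripted in this README. A minimal sketch using only the standard library is shown below; combine_csvs is a hypothetical helper, and it assumes that every extracted .csv has the same header row. Whether plot_sweep.py needs anything beyond a plain concatenation is also an assumption.

```python
import csv


def combine_csvs(csv_paths, out_path):
    """Concatenate CSV files that share one header row into out_path.

    Returns the number of data rows written.
    """
    header = None
    rows = []
    for path in csv_paths:
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"{path} has a different header than the first file")
            rows.extend(reader)
    if header is None:
        raise ValueError("no non-empty CSV files were given")
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # write the shared header once
        writer.writerows(rows)
    return len(rows)
```

For example, combine_csvs(["celeba_eps10_tf.csv", "celeba_eps30_tf.csv", "celeba_eps50_tf.csv"], "celeba_all_tf.csv") would write one file that can then be passed to --csv-path. Only celeba_eps50_tf.csv appears in the instructions above; the other file names are illustrative.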

Other Datasets

For other LEAF datasets make the following changes:

Ensure you have the correct splits under leaf/data; see the LEAF repo for installation instructions.
- For Shakespeare:
  - Use LEAF preprocess cmd: ./preprocess.sh -s niid --sf 1.0 -k 0 -t sample -tf 0.8 in ../leaf/data/shakespeare/
  - Will form splits all_data_0_0_keep_0_train_9.json and all_data_0_0_keep_0_test_9.json
- For Sent140:
  - Use LEAF preprocess cmd: ./preprocess.sh -s niid --sf 1.0 -k 0 -t user in ../leaf/data/sent140/
  - Will form splits all_data_0_15_keep_1_train_6.json and all_data_0_15_keep_1_test_6.json
For running Shakespeare:
- Add/replace args --model-arch shakes-lstm --dataset shakespeare --users-per-round 60 --fl-epochs 30 --fl-client-lr 3 --local-batch-size 128 --canary-insert-offset 8 --canary-design-pool-size 100
For running Sent140:
- Add/replace args --model-arch lstm --dataset sent140 --users-per-round 100 --fl-epochs 15 --local-batch-size 32 --canary-insert-offset 100 --canary-design-pool-size 800
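
The "add/replace" step can be sketched as a small script that rewrites a launcher command. This is an illustration, not part of the repository: apply_overrides is a hypothetical helper, the base command in the usage example is an abbreviated form of the CelebA command, and the code assumes every flag is followed by exactly one value. The flag values themselves are copied from the lists above.

```python
import shlex

# Per-dataset overrides, copied from the argument lists above.
DATASET_OVERRIDES = {
    "shakespeare": {
        "--model-arch": "shakes-lstm",
        "--dataset": "shakespeare",
        "--users-per-round": "60",
        "--fl-epochs": "30",
        "--fl-client-lr": "3",
        "--local-batch-size": "128",
        "--canary-insert-offset": "8",
        "--canary-design-pool-size": "100",
    },
    "sent140": {
        "--model-arch": "lstm",
        "--dataset": "sent140",
        "--users-per-round": "100",
        "--fl-epochs": "15",
        "--local-batch-size": "32",
        "--canary-insert-offset": "100",
        "--canary-design-pool-size": "800",
    },
}


def apply_overrides(base_command, overrides):
    """Add or replace `--flag value` pairs in a launcher command string."""
    tokens = shlex.split(base_command)
    # Keep everything before the first flag (e.g. "python launcher.py") as-is.
    first_flag = next(
        (i for i, tok in enumerate(tokens) if tok.startswith("--")), len(tokens)
    )
    prefix, rest = tokens[:first_flag], tokens[first_flag:]
    if len(rest) % 2:
        raise ValueError("expected every flag to be followed by one value")
    args = dict(zip(rest[0::2], rest[1::2]))
    args.update(overrides)  # existing flags keep their position, new ones are appended
    flat = [tok for pair in args.items() for tok in pair]
    return shlex.join(prefix + flat)


if __name__ == "__main__":
    # Abbreviated base command for illustration; use the full CelebA command in practice.
    base = "python launcher.py --dataset celeba --epsilon 50 --model-arch resnet --users-per-round 100"
    print(apply_overrides(base, DATASET_OVERRIDES["sent140"]))
```

DATA_ROOT is not part of the overrides and would still need to point at the matching LEAF data folder.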

Reference

If the code and/or paper contained in this repository were useful to you, please consider citing this work:

@article{maddock2022canife,
  title={{CANIFE}: Crafting Canaries for Empirical Privacy Measurement in Federated Learning},
  author={Maddock, Samuel and Sablayrolles, Alexandre and Stock, Pierre},
  journal={arXiv preprint arXiv:2210.02912},
  year={2022}
}

Contributing

See the CONTRIBUTING file for how to contribute to this library.

License

This code is released under BSD-3-Clause, as found in the LICENSE file.
