The location of the macros (BRAMs and DSPs) plays an important role in overall routability and timing performance. The Cumple tool is an enhanced macro placer based on the SimPL framework that considers two design constraints important in industry: the relative placement constraint (RPC) and the regional constraint (RC). Both RPC and RC are important for achieving timing closure, but they increase the difficulty of placement through sparse locations and increased localized density. The Cumple tool is able to decrease congestion levels under these two constraints after integration with Vivado ML 2021.1. Although our macro placer initially ranked 3rd in the MLCAD 2023 FPGA Macro Placement Contest, the enhanced version is competitive with the open-source DreamplaceFPGA-MP. In particular, our macro placer performs better on the cases with both RPC and RC.
The overall evaluation is based on the MLCAD 2023 contest benchmark. The contest website is:
https://github.com/TILOS-AI-Institute/MLCAD-2023-FPGA-Macro-Placement-Contest
The contest benchmark is available at:
https://www.kaggle.com/datasets/ismailbustany/updated-mlcad-2023-contest-benchmark
Download the benchmark from this link, then symbolically link it under the repo with the following commands:
cd CUMPLE_MLCAD
mkdir benchmarks
ln -s <benchmark> benchmarks/mlcad2023_v2
Note that some bugs in the benchmark, mentioned in the FAQ https://github.com/TILOS-AI-Institute/MLCAD-2023-FPGA-Macro-Placement-Contest/blob/main/Documentation/FAQ.md, were not corrected in the released benchmark in time. We provide scripts to correct them:
cd scripts/data_correction
bash replace.sh
If you want to check the routability of the macro placement solution, you need to install Vivado ML 2021.1. The download link for Vivado is:
You can obtain a license from the AMD University Program:
https://www.amd.com/en/corporate/university-program/donation-program.html
design.nodes: nodes in the netlist
design.nets: nets in the netlist
design.lib: library containing all the cells
design.scl: distribution of the macros on the FPGA board
design.cascade_shape: cascaded macros type
design.cascade_shape_instances: cascade macro instances
design.regions: regions constraining all the macros
design.pl: fixed IBUF/OBUF ports on the FPGA
design.dcp: the dcp/checkpoint file in Vivado
macroplacement.pl: the location of the macros
place_io.tcl: the Tcl script placing the IBUF/OBUF ports
place_macro.tcl: the Tcl script placing the macros
flow.tcl: the Tcl script for the overall place and route flow in Vivado
run.log: the log file of the macro placement
vivado.log: the report of the overall place and route in Vivado
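Before launching a run, it can help to confirm that a design folder or a log directory contains the files listed above. The following is a minimal sketch, not part of the repository: the helper `missing_files` is hypothetical, and it assumes that the `design.*` files sit together in one design folder and that the remaining files are written to the log directory.

```python
from pathlib import Path

# File names are taken from the list above. The grouping into inputs and
# outputs is an assumption made for this sketch.
DESIGN_INPUTS = (
    "design.nodes", "design.nets", "design.lib", "design.scl",
    "design.cascade_shape", "design.cascade_shape_instances",
    "design.regions", "design.pl", "design.dcp",
)
RUN_OUTPUTS = (
    "macroplacement.pl", "place_io.tcl", "place_macro.tcl",
    "flow.tcl", "run.log", "vivado.log",
)


def missing_files(directory, expected):
    """Return the names in `expected` that are not regular files in `directory`."""
    root = Path(directory)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_files(design_dir, DESIGN_INPUTS)` returns an empty list when every expected input is present, so a non-empty result can be reported before any placement is attempted.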
binary name: Cumple
options:
--benchmark_path: the path storing the benchmark folders (default: ../benchmarks/mlcad2023_v2/)
--flow: the flow to run (cumple: macro placement only, all: overall place and route)
--log_dir: the directory storing all the output files
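To make the option semantics concrete, here is a minimal `argparse` sketch that mirrors the three options above. It is an illustration only and not the actual front end of this repository: the function `build_parser`, the positional `case` argument (borrowed from the `run.py` usage in this README), the default value of `--flow`, and making `--log_dir` required are all assumptions.

```python
import argparse


def build_parser():
    """Build a hypothetical parser that mirrors the documented options."""
    parser = argparse.ArgumentParser(
        description="Illustrative option parser (not the real run.py)"
    )
    # Assumption: the test case is given as a positional name such as "d2".
    parser.add_argument("case", help="test case name, e.g. d2")
    parser.add_argument(
        "--benchmark_path",
        default="../benchmarks/mlcad2023_v2/",  # default stated in the option list
        help="path storing the benchmark folders",
    )
    parser.add_argument(
        "--flow",
        choices=("cumple", "all"),
        default="cumple",  # assumption: the real default is not documented
        help="cumple: macro placement only, all: overall place and route",
    )
    # Assumption: treated as required here because no default is documented.
    parser.add_argument("--log_dir", required=True,
                        help="directory storing all the output files")
    return parser
```

Restricting `--flow` with `choices` makes a mistyped flow name fail at parse time rather than partway through a run.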
Step 1: Go to the project root and build with:
$ cd CUMPLE_MLCAD
$ ./scripts/build.py -o release
Step 2: Compile the io_map.cxx file by:
$ cd scripts
$ g++ io_map.cxx -o io_map
$ cp io_map ../run
$ cd ../run
Then run a case with:
$ python3 run.py <case> --benchmark_path <benchmark_path> --flow <flow> --log_dir <path_log>
For example, to run Design_2 with the overall place and route flow:
$ python3 run.py d2 --benchmark_path ../benchmarks/mlcad2023_v2/ --flow all --log_dir case_2
To run all the test cases in the MLCAD 2023 benchmark in parallel:
bash run_all.sh <parallel_num> <start_id> <end_id> <log_dir>
For example, to run all the cases with 5 cases in parallel:
bash run_all.sh 5 0 70 version_1
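For readers who prefer to drive the batch from Python, the parallel run can be sketched as below. This is not how `run_all.sh` is implemented. It assumes that cases are named `d<id>` (as in the `d2` example), that `end_id` is inclusive, and that each case writes to its own subdirectory of the log directory. The functions `build_commands` and `run_batch` are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(start_id, end_id, log_dir,
                   benchmark_path="../benchmarks/mlcad2023_v2/", flow="all"):
    """Build one run.py command line per case id."""
    commands = []
    for case_id in range(start_id, end_id + 1):  # assumption: end_id is inclusive
        case = f"d{case_id}"  # assumption: cases are named d<id>
        commands.append([
            "python3", "run.py", case,
            "--benchmark_path", benchmark_path,
            "--flow", flow,
            "--log_dir", f"{log_dir}/{case}",  # assumption: one subdirectory per case
        ])
    return commands


def run_batch(commands, parallel_num):
    """Run the commands with at most `parallel_num` at a time; return exit codes."""
    def run_one(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=parallel_num) as pool:
        return list(pool.map(run_one, commands))
```

Under these assumptions, `run_batch(build_commands(0, 70, "version_1"), 5)` would roughly correspond to the shell example above. Threads are sufficient here because each worker only waits on an external process.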
Step 3: Calculate the congestion scores and the PnR time information with the statistics.py script:
python3 statistics.py <log_dir>
For example, to obtain the congestion scores and the PnR time information for version_1:
python3 statistics.py version_1
There are two comparable baselines, Vivado ML 2021.1 and DreamPlaceFPGA-MP. Here we provide the scripts for reproducing the results in Vivado ML 2021.1 with:
$ python3 run_vivado.py <case> --benchmark_path <benchmark_path> --log_dir <path_log>
To run all the cases, use:
bash run_all_vivado.sh 5 0 70 version_vivado_1
For the DreamplaceFPGAMP, please refer to: https://github.com/zhilix/DREAMPlaceFPGA-MP
We have also added DreamplaceFPGAMP as a submodule; you can refer to the Readme.md in DreamplaceFPGAMP to reproduce the results.
First, replace the PlaceDB.py file under DreamplaceFPGAMP/dreamplacefpga/PlaceDB.py with scripts/PlaceDB.py:
cp scripts/PlaceDB.py DreamplaceFPGAMP/dreamplacefpga/
Then build and run the Docker image for DreamplaceFPGAMP:
cp scripts/run_all_DreamplaceFPGAMP.sh DreamplaceFPGAMP/
cd DreamplaceFPGAMP/
docker build . --file Dockerfile --tag utda_macro_placer/dreamplace_fpga:1.0
docker run -it -v $(pwd):/DREAMPlaceFPGA-MP -v /data/ssd/qluo/benchmark/mlcad2023_v2/:/Designs utda_macro_placer/dreamplace_fpga:1.0 bash
Here, replace "/data/ssd/qluo/benchmark/mlcad2023_v2/" with the path to your MLCAD 2023 benchmark.
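If the benchmark location varies between machines, the `docker run` line can be generated instead of edited by hand. The sketch below is a hypothetical helper (`docker_run_command` is not part of the repository). It reuses the image tag and the two container mount points from the command above and only substitutes the host paths.

```python
import os
import shlex

# Image tag and container mount points are copied from the docker commands above.
IMAGE = "utda_macro_placer/dreamplace_fpga:1.0"


def docker_run_command(repo_dir, benchmark_dir):
    """Return a shell-quoted `docker run` line with the given host paths mounted."""
    repo = os.path.abspath(repo_dir)        # host path of the DreamplaceFPGAMP checkout
    bench = os.path.abspath(benchmark_dir)  # host path of the MLCAD 2023 benchmark
    return shlex.join([
        "docker", "run", "-it",
        "-v", f"{repo}:/DREAMPlaceFPGA-MP",
        "-v", f"{bench}:/Designs",
        IMAGE, "bash",
    ])
```

Printing the result of `docker_run_command(".", "<benchmark>")` from inside the DreamplaceFPGAMP directory gives a line that can be pasted into the shell. `shlex.join` quotes paths that contain spaces.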
Then go to the DREAMPlaceFPGA-MP directory inside the Docker container and install the package:
cd /DREAMPlaceFPGA-MP
rm -rf build
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=/DREAMPlaceFPGA-MP -DPYTHON_EXECUTABLE=$(which python)
make
make install
To run all the cases, use the run_all_DreamplaceFPGAMP.sh script from the scripts folder as follows:
source <path_to_root_dir>/run_all_DreamplaceFPGAMP.sh <path_to_root_dir> <benchmark_path> <log_dir> <start_id> <end_id> <gpu_flag>
For example, to run the CPU version:
source /DREAMPlaceFPGA-MP/run_all_DreamplaceFPGAMP.sh /DREAMPlaceFPGA-MP/ /Designs/ /DREAMPlaceFPGA-MP/dreamfpgalog/ 0 70 0
The log files and the macroplacement.pl files are all in /DREAMPlaceFPGA-MP/dreamfpgalog/. Then exit the Docker container and run the Vivado flow for all the macro placement solutions generated by DreamplaceFPGA-MP:
cd ..
cp -r DreamplaceFPGAMP/dreamfpgalog run/
cp scripts/run_DreamplaceFPGA_vivado.sh run/
cd run
bash run_DreamplaceFPGA_vivado.sh 5 0 70 dreamfpgalog
To better demonstrate the effectiveness of the introduced techniques, we ran the following ablation experiments.
For V1 (without macro-size aware pseudo nets):
bash run_all_nomacropseudo.sh 5 0 70 nomacropseudo
For V2 (without regional constraint guided spreading):
bash run_all_gp2region.sh 5 0 70 nogp2region
- g++ (version >= 5.4.0) or another working C++ compiler
- CMake (version >= 3.5.1)
- Boost (version >= 1.58)
- Python (version 3, optional, for running the scripts; pyjson5 is also required)
If you use our FPGA macro placer under the design constraints, please cite our DOI, or cite our published FCCM paper as:
@INPROCEEDINGS{10653672,
author={Luo, Qin and Zang, Xinshi and Wang, Qijing and Wang, Fangzhou and Young, Evangeline F.Y. and Wong, Martin D.F.},
booktitle={2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)},
title={A Routability-Driven Ultrascale FPGA Macro Placer with Complex Design Constraints},
year={2024},
volume={},
number={},
pages={1-7},
keywords={Benchmark testing;Routing;Field programmable gate arrays;Optimization;FPGA placement;Design constraints;Routability},
doi={10.1109/FCCM60383.2024.00024}}
Finally, we are sincerely grateful to Tingyuan Liang for his great work on AMF-Placer, an open-source mixed-size FPGA placer (https://github.com/zslwyuan/AMF-Placer), from which we have learned a lot.