LaneGCN_ref

Implementation of LaneGCN (Learning Lane Graph Representations for Motion Forecasting)

refactors and reimplements parts of LaneGCN and adds more detailed comments
changes the overall framework to make it easier to run experiments with other models
reprocesses the data into MXNet Record format to make it more flexible to use (compared with the originally provided data; see issue #4)
adds visualization and evaluation (Argo eval.ai submission) scripts
provides a Docker environment

I hope and believe this project can help learners get familiar with spatial-temporal prediction and trajectory prediction conveniently and efficiently. If you find this work interesting, please also see and refer to the official work.

Prepare

Environment preparation

Make sure you have nvidia-docker installed on your machine, or follow the instructions to install it.

Pull the image and start a container:

docker pull zhaone/lanegcn:v1 # from Docker Hub; this takes a while, the image is about 30 GB (with data)
sh ./startc.sh # start a container
# you will get a container_id like: e563f358af72fd60c14c5a5...
docker exec -it e563(your container_id) /bin/bash

All the following operations happen in the container.

Now you should be at /workspace/ in the container. Clone this repo into /workspace/:
```
git clone git@github.com:zhaone/LaneGCN_ref.git
```

You can refer to ./docker to see how to build the image.

Data preparation

In the container, under /workspace/, extract the datasets:

tar -xzf ./datasets.tar.gz

Bingo!

Training

To train the model, go to the root directory of this project and run

bash ./run_exp.sh lanegcn_ori train

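Experiment arguments can be found and modified in the file commands/lanegcn_ori.sh. The trailing comments below are annotations for this README only; in a real script, a comment placed after a line-continuation backslash breaks the command.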

train() {
  horovodrun -np 4 python cli.py \ # -np 4 means using 4 GPUs
    --mixed_precision \ # enable mixed-precision training
    --epochs 36 \ # total training epochs
    --lr 0.001 \ # base learning rate
    --lr_decay 0.90 \ # lr decay rate
    --min_decay 0.1 \ # minimum lr decay rate
    --save_checkpoints_steps 100 \ # step interval for saving the model
    --print_steps 10 \ # step interval for printing to the screen
    --board_steps 50 \ # step interval for writing to tensorboard
    --eval_steps 1600 \ # step interval for evaluating on the eval dataset
    --is_dist \ # use multiple GPUs; if you have only one GPU, delete this arg
    --optimizer "adam" \ # type of optimizer
    --pin_memory \ # dataloader arg
    --name "val_exp" \ # experiment name
    --model "lanegcn_ori" \ # model name
    --hparams_path "hparams/lanegcn_ori.json" \ # hyperparameters of the model
    --num_workers 0 \ # dataloader arg
    --data_name "lanegcn" \ # dataset name
    --data_version "full" \ # dataset version
    --mode "train_eval" \ # experiment type
    --save_path "/workspace/expout/lanegcn/train" \ # output dir
    --batch_size 32 \
    --reload "latest" # which model to load when resuming training: best or latest
}
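The step-interval flags above decide how often each periodic action runs. The following is a minimal sketch of that idea, assuming each interval is checked with a simple modulo on the global step (the source does not confirm how the checks are implemented). `INTERVALS` and `actions_at` are hypothetical names used only for illustration and are not part of this repo.

```python
# Hypothetical helper: which periodic actions would fire at a given global
# step, ASSUMING each interval is tested with `step % interval == 0`.
INTERVALS = {
    "print": 10,    # --print_steps
    "board": 50,    # --board_steps
    "save": 100,    # --save_checkpoints_steps
    "eval": 1600,   # --eval_steps
}


def actions_at(step, intervals=INTERVALS):
    """Return the sorted names of the actions due at `step`."""
    if step <= 0:
        return []
    return sorted(name for name, every in intervals.items() if step % every == 0)


if __name__ == "__main__":
    for step in (10, 50, 100, 1600):
        print(step, actions_at(step))
```

Under this assumption, a checkpoint would be written every 100 steps while evaluation would run only on multiples of 1600, so most saved steps would not have a matching evaluation result.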

All output of the experiment is saved in save_path, which has the following structure:

.
|-- env	# save some experiment configuration
|   |-- args.json	# experiment args
|   |-- hparams.json	# model hyperparameters
|   `-- src.tar.gz	# source code (model part)
|-- eval # evaluation results
|   |-- BEST-train_eval-debug-debug.json
|   |-- lanegcn_ori-eval-debug-000057800.json
|   |-- lanegcn_ori-train_eval-debug-000001600.json
|   |-- ...
|   |-- lanegcn_ori-train_eval-debug-000056000.json
|   `-- lanegcn_ori-train_eval-debug-000057600.json
|-- hooks # custom hooks
|   |-- hook_eval_vis # SVG visualizations of predictions
|   `-- hook_test_submit # prediction results of test dataset
|       |-- res_0.pkl
|       |-- res_1.pkl
|       `-- res_mgpu.h5 # prediction results of test dataset
|-- log # tensorboard event
|   |-- events.out.tfevents.1610454343.22683b51c1b9
|   |-- events.out.tfevents.1610784779.22683b51c1b9
|   |-- events.out.tfevents.1614741403.0f7fe7279276
|   |-- key.txt
|   `-- log.txt
`-- params # model params
    |-- best-000036800.params
    |-- best-000040000.params
    |-- best-000041600.params
    |-- best-000043200.params
    |-- best-000048000.params
    |-- best-000052800.params
    |-- best-000054400.params
    |-- latest-000057200.params
    |-- latest-000057300.params
    |-- latest-000057400.params
    |-- latest-000057500.params
    |-- latest-000057600.params
    |-- latest-000057700.params
    |-- latest-000057800.params
    `-- meta.json # some meta info
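The checkpoint file names in params encode the training step, so a script can find the newest one. This is a sketch that relies only on the `best-<step>.params` / `latest-<step>.params` naming shown in the tree above. `newest_checkpoint` is a hypothetical helper, not a function provided by this repo, and it does not read meta.json.

```python
import re
from pathlib import Path

# Matches names such as "latest-000057800.params" or "best-000054400.params".
PARAM_RE = re.compile(r"^(best|latest)-(\d+)\.params$")


def newest_checkpoint(params_dir, kind="latest"):
    """Return the Path of the highest-step checkpoint of `kind`, or None."""
    candidates = []
    for path in Path(params_dir).iterdir():
        match = PARAM_RE.match(path.name)
        if match and match.group(1) == kind:
            candidates.append((int(match.group(2)), path))
    if not candidates:
        return None
    # Compare on the step number only.
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    print(newest_checkpoint("/workspace/expout/lanegcn/train/params"))
```

Calling `newest_checkpoint(params_dir, kind="best")` would select among the `best-*` files instead.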

Note: persist data in host machine

If you want to save the output on the host machine rather than in the Docker container, add a volume binding in startc.sh, for example:

docker run \
    --runtime=nvidia \
    --name lanegcn \
    --rm \
    --shm-size="20g" \
    -d \
    -v /tmp/lanegcn(your persisting host directory):/workspace/expout(root save_path in container) \ # add this binding to persist data in host machine
    -p 0.0.0.0:16006:6006 \
    -it \
    zhaoyi/lanegcn:v3 \

Make sure the container has write permission on the bound host directory (for example, by changing the permission of the host directory to 777).
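Before starting a long training run, it can help to confirm from inside the container that the bound directory is actually writable. The snippet below is a small sketch of such a check. `can_write` is a hypothetical helper, and the path in the usage example assumes the `/workspace/expout` mount point shown above.

```python
import os
import tempfile


def can_write(directory):
    """Return True if a file can actually be created inside `directory`."""
    if not os.path.isdir(directory):
        return False
    try:
        # Creating (and automatically removing) a real file exercises the
        # same kind of write the training job will perform.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print(can_write("/workspace/expout"))
```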

Visualize training

tensorboard --logdir=your save_path --bind_all --port 6006

Then you can access tensorboard at docker_host_machine_ip:16006 (the port binding is defined in startc.sh).
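If the page does not load, a quick way to tell a port-mapping problem from a browser problem is to test whether the port accepts TCP connections. This is a generic sketch using only the Python standard library. `port_is_open` is a hypothetical helper, and the host and port in the usage example are placeholders for your own Docker host and the 16006 mapping.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    # Replace "127.0.0.1" with the IP of the Docker host machine.
    print(port_is_open("127.0.0.1", 16006))
```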

Evaluation

To evaluate the model's performance on the evaluation set, run

bash ./run_exp.sh lanegcn_ori val

Visualize prediction

If you want to visualize the prediction results, add the --enable_hook argument to the val function:

val() {
  horovodrun -np 4 python cli.py \
    --mixed_precision \
    --epochs 36 \
    ...
    --reload "latest" \
    --enable_hook # add this arg to visualize during evaluation
}

The prediction for each sample will be saved in save_path/hooks/hook_eval_vis in .svg format.

Note

It will draw each evaluation sample and its prediction; press Ctrl+C to stop plotting when you have enough.
The script is located at util/argo_vis.py. You can modify it to customize the plots.

Testing

To generate predictions, run

bash ./run_exp.sh lanegcn_ori _test

Submit to Argo eval.ai

If you want to generate an h5 result file that can be submitted to Argo eval.ai, add the --enable_hook argument to the test function:

test() {
  horovodrun -np 4 python cli.py \
    --mixed_precision \
    --epochs 36 \
    --reload "latest" \
    --enable_hook # add this arg to generate result h5 file
}

The output result file is located at save_path/hooks/hook_test_submit/res_mgpu.h5; you can then upload it to the website.

Performance

Training

Evaluation result

Other materials

awesome trajectory prediction

Contact

Please open an issue or email yizhaome@gmail.com.
