# DeePMD-kit v3: Multiple-backend Framework, DPA-2 Large Atomic Model, and Plugin Mechanisms
After eight months of public testing, we are excited to present the first stable release of DeePMD-kit v3, which supports deep potential models with the TensorFlow, PyTorch, or JAX backend. DeePMD-kit v3 also introduces support for the DPA-2 model, a novel architecture optimized for large atomic models. This release enhances the plugin mechanisms, making it easier to integrate and develop new models.
## Highlights
### Multiple-backend framework: TensorFlow, PyTorch, and JAX support
DeePMD-kit v3 adds a versatile, pluggable framework that provides a consistent training and inference experience across multiple backends. Version 3.0.0 includes:
- TensorFlow backend: Known for its computational efficiency with a static graph design.
- PyTorch backend: A dynamic graph backend that simplifies model extension and development.
- DP backend: Built with NumPy and the Array API; a reference backend for development without heavy deep-learning frameworks.
- JAX backend: A static graph backend built on top of the DP backend via the Array API.
| Features | TensorFlow | PyTorch | JAX | DP |
| --- | --- | --- | --- | --- |
| Descriptor local frame | ✅ | | | |
| Descriptor se_e2_a | ✅ | ✅ | ✅ | ✅ |
| Descriptor se_e2_r | ✅ | ✅ | ✅ | ✅ |
| Descriptor se_e3 | ✅ | ✅ | ✅ | ✅ |
| Descriptor se_e3_tebd | | ✅ | ✅ | ✅ |
| Descriptor DPA1 | ✅ | ✅ | ✅ | ✅ |
| Descriptor DPA2 | | ✅ | ✅ | ✅ |
| Descriptor Hybrid | ✅ | ✅ | ✅ | ✅ |
| Fitting energy | ✅ | ✅ | ✅ | ✅ |
| Fitting dipole | ✅ | ✅ | ✅ | ✅ |
| Fitting polar | ✅ | ✅ | ✅ | ✅ |
| Fitting DOS | ✅ | ✅ | ✅ | ✅ |
| Fitting property | | ✅ | ✅ | ✅ |
| ZBL | ✅ | ✅ | ✅ | ✅ |
| DPLR | ✅ | | | |
| DPRc | ✅ | ✅ | ✅ | ✅ |
| Spin | ✅ | ✅ | ✅ | |
| Gradient calculation | ✅ | ✅ | ✅ | |
| Model training | ✅ | ✅ | | |
| Model compression | ✅ | ✅ | | |
| Python inference | ✅ | ✅ | ✅ | ✅ |
| C++ inference | ✅ | ✅ | ✅ | |
Critical features of the multiple-backend framework include the ability to:
- Train models using different backends with the same training data and input script, allowing backend switching based on your efficiency or convenience needs.
```sh
# Training a model using the TensorFlow backend
dp --tf train input.json
dp --tf freeze
dp --tf compress

# Training a model using the PyTorch backend
dp --pt train input.json
dp --pt freeze
dp --pt compress
```
- Convert models between backends using `dp convert-backend`, with backend-specific file extensions (e.g., `.pb` for TensorFlow and `.pth` for PyTorch).
```sh
# Convert from a TensorFlow model to a PyTorch model
dp convert-backend frozen_model.pb frozen_model.pth
# Convert from a PyTorch model to a TensorFlow model
dp convert-backend frozen_model.pth frozen_model.pb
# Convert from a PyTorch model to a JAX model
dp convert-backend frozen_model.pth frozen_model.savedmodel
# Convert from a PyTorch model to the backend-independent DP format
dp convert-backend frozen_model.pth frozen_model.dp
```
- Run inference across backends via interfaces like `dp test`, the Python/C++/C interfaces, or third-party packages (e.g., dpdata, ASE, LAMMPS, AMBER, GROMACS, i-PI, CP2K, OpenMM, ABACUS, etc.), as shown in the LAMMPS and Python examples below.
```lammps
# In a LAMMPS file:
# run LAMMPS with a TensorFlow backend model
pair_style deepmd frozen_model.pb
# run LAMMPS with a PyTorch backend model
pair_style deepmd frozen_model.pth
# run LAMMPS with a JAX backend model
pair_style deepmd frozen_model.savedmodel
# Calculate model deviation using different models
pair_style deepmd frozen_model.pb frozen_model.pth frozen_model.savedmodel out_file md.out out_freq 100
```
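In the Python interface, a single class loads a model from any backend, inferred from the file extension. Below is a minimal sketch using `DeepPot`; the model file name follows the examples above, and the three-atom water-like configuration is purely illustrative:

```python
import numpy as np

from deepmd.infer import DeepPot

# The backend is selected from the model file extension:
# .pb (TensorFlow), .pth (PyTorch), or .savedmodel (JAX).
dp = DeepPot("frozen_model.pth")

# One frame of three atoms: flattened coordinates,
# a 10 Å cubic cell, and the type index of each atom.
coord = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.5], [1.0, 0.0, 3.0]]).reshape([1, -1])
cell = np.diag(10.0 * np.ones(3)).reshape([1, -1])
atype = [1, 0, 1]

# Evaluate energy, forces, and virial for the frame.
e, f, v = dp.eval(coord, cell, atype)
```

Swapping the model file for `frozen_model.pb` or `frozen_model.savedmodel` switches the backend with no other code changes.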
- Add a new backend to DeePMD-kit with much less effort if you want to contribute one.
### DPA-2 model: a large atomic model as a multi-task learner
The DPA-2 model offers a robust architecture for large atomic models (LAM), accurately representing diverse chemical systems for high-quality simulations. In this release, DPA-2 can be trained using the PyTorch backend, supporting both single-task (see `examples/water/dpa2`) and multi-task (see `examples/water_multi_task/pytorch_example`) training schemes. DPA-2 is also available for Python/C++ inference in the JAX backend.
The DPA-2 descriptor comprises two modules, `repinit` and `repformer`.
The PyTorch backend supports training strategies for large atomic models, including:
- Parallel training: Train large atomic models on multiple GPUs for efficiency.
```sh
torchrun --nproc_per_node=4 --no-python dp --pt train input.json
```
- Multi-task training: Train large atomic models on a broad range of data calculated at different DFT levels, with shared descriptors. An example is given in `examples/water_multi_task/pytorch_example/input_torch.json`.
- Fine-tuning: Train a pre-trained large atomic model on a smaller, task-specific dataset. The PyTorch backend supports the `--finetune` argument in the `dp --pt train` command line.
### Plugin mechanisms for external models
In version 3.0.0, plugin capabilities have been implemented to support the development and integration of potential energy models using the TensorFlow, PyTorch, or JAX backend, leveraging DeePMD-kit's trainer, loss functions, and interfaces. An example plugin is deepmd-gnn, which supports training the MACE and NequIP models within DeePMD-kit using the familiar commands:
```sh
dp --pt train mace.json
dp --pt freeze
dp --pt test -m frozen_model.pth -s ../data/
```
## Other new features
- Descriptor `se_e3_tebd` (#4066).
- Property fitting (#3867).
- New training parameters: `max_ckpt_keep` (#3441), `change_bias_after_training` (#3993), and `stat_file`.
- New command-line interfaces: `dp change-bias` (#3993) and `dp show` (#3796).
- Support for generating a JSON schema for integration with VSCode (#3849).
- The latest LAMMPS version (stable_29Aug2024_update1) is supported (#4088, #4179).
## Breaking changes
- The deepmodeling conda channel is deprecated. Use the conda-forge channel instead. (#3462, #4385)
- The offline package and conda packages for CUDA 11 are dropped.
- Support for Python 3.7 and 3.8 is dropped. (#3185, #4185)
- The minimum supported versions of the deep learning frameworks are TensorFlow 2.7, PyTorch 2.1, JAX 0.4.33, and NumPy 1.21.
- We require all model files to have the correct filename extension for all interfaces so that the corresponding backend can load them. For example, TensorFlow model files must end with the `.pb` extension.
- Bias is removed by default from the type embedding. (#3958)
- The spin model is refactored, and its usage in the LAMMPS module has been changed. (#3301, #4321)
- Multi-task training support is removed from the TensorFlow backend. (#3763)
- The `set_prefix` key is deprecated. (#3753)
- `dp test` now uses all sets for training and testing. In previous versions, only the last set was used as the test set in `dp test`. (#3862)
- The Python module structure is fully refactored. The old `deepmd` module was moved to `deepmd.tf` without other API changes, and `deepmd_utils` was moved to `deepmd` without other API changes. (#3177, #3178)
- The Python class `DeepTensor` (including `DeepDipole` and `DeepPolar`) now returns the atomic tensor in the dimension of `natoms` instead of `nsel_atoms`; see the sketch after this list. (#3390)
- C++ 11 support is dropped. (#4068)
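To illustrate the `DeepTensor` change, here is a minimal sketch of the new behavior, assuming a trained dipole model saved as `dipole_model.pb` (a hypothetical file name) and an illustrative three-atom frame:

```python
import numpy as np

from deepmd.infer import DeepDipole

# Hypothetical file name; any trained dipole model applies.
dd = DeepDipole("dipole_model.pb")

# One frame of three atoms in a 10 Å cubic cell.
coord = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.5], [1.0, 0.0, 3.0]]).reshape([1, -1])
cell = np.diag(10.0 * np.ones(3)).reshape([1, -1])
atype = [1, 0, 1]

# In v3, the returned atomic dipole covers all natoms atoms (3 here);
# previous versions returned entries only for the nsel_atoms selected atoms.
dipole = dd.eval(coord, cell, atype)
```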
For other changes, refer to Full Changelog: v2.2.11...v3.0.0rc0
## Contributors
The PyTorch backend was developed in the dptech-corp/deepmd-pytorch repository and was then fully merged into the deepmd-kit repository in #3180. Contributors to the deepmd-pytorch repository:
- @20171130
- @CaRoLZhangxy
- @amcadmus
- @guolinke
- @iProzd
- @nahso
- @njzjz
- @qin2xue3jian4
- @shishaochen
- @zjgemi
Contributors to the deepmd-kit repository:
- @CaRoLZhangxy: #3162 #3287 #3337 #3375 #3379 #3434 #3436 #3612 #3613 #3614 #3656 #3657 #3740 #3780 #3917 #3919 #4209 #4237
- @Chengqian-Zhang: #3615 #3796 #3828 #3840 #3867 #3912 #4120 #4145 #4280
- @ChiahsinChu: #4246 #4248
- @Cloudac7: #4031
- @HydrogenSulfate: #4117
- @LiuGroupHNU: #3978
- @Mancn-Xu: #3567
- @Yi-FanLi: #3822 #4013 #4084 #4283
- @anyangml: #3192 #3210 #3212 #3248 #3266 #3281 #3296 #3309 #3314 #3321 #3327 #3338 #3351 #3362 #3376 #3385 #3398 #3410 #3426 #3432 #3435 #3447 #3451 #3452 #3468 #3485 #3486 #3575 #3584 #3654 #3662 #3663 #3706 #3757 #3759 #3812 #3824 #3876 #3946 #3975 #4194 #4205 #4292 #4296 #4335 #4339 #4370 #4380
- @caic99: #3465 #4165 #4401
- @chazeon: #3473 #3652 #3653 #3739
- @cherryWangY: #3877 #4227 #4297 #4298 #4299 #4300
- @dependabot: #3231 #3312 #3446 #3487 #3777 #3882 #4045 #4127 #4374
- @hztttt: #3762
- @iProzd: #3180 #3203 #3245 #3261 #3301 #3355 #3359 #3367 #3371 #3378 #3380 #3387 #3388 #3409 #3411 #3441 #3442 #3445 #3456 #3480 #3569 #3571 #3573 #3607 #3616 #3619 #3696 #3698 #3712 #3717 #3718 #3725 #3746 #3748 #3758 #3763 #3768 #3773 #3774 #3775 #3781 #3782 #3785 #3803 #3813 #3814 #3815 #3826 #3837 #3841 #3842 #3843 #3873 #3906 #3914 #3916 #3925 #3926 #3927 #3933 #3944 #3945 #3957 #3958 #3967 #3971 #3976 #3992 #3993 #4006 #4007 #4015 #4066 #4089 #4138 #4139 #4148 #4162 #4222 #4223 #4224 #4225 #4243 #4244 #4321 #4323 #4324 #4344 #4353 #4354 #4372 #4375 #4394 #4395 #4440
- @iid-ccme: #4340
- @nahso: #3726 #3727
- @njzjz: #3164 #3167 #3169 #3170 #3171 #3172 #3173 #3174 #3175 #3176 #3177 #3178 #3179 #3181 #3185 #3186 #3187 #3191 #3193 #3194 #3195 #3196 #3198 #3200 #3201 #3204 #3205 #3206 #3207 #3213 #3217 #3220 #3221 #3222 #3223 #3226 #3228 #3229 #3237 #3238 #3239 #3243 #3244 #3247 #3249 #3250 #3253 #3254 #3257 #3258 #3263 #3267 #3271 #3275 #3276 #3283 #3284 #3285 #3286 #3288 #3290 #3292 #3293 #3294 #3303 #3304 #3306 #3307 #3308 #3310 #3313 #3315 #3316 #3318 #3323 #3325 #3326 #3330 #3331 #3332 #3333 #3335 #3339 #3342 #3343 #3346 #3348 #3349 #3350 #3356 #3358 #3360 #3361 #3364 #3365 #3366 #3369 #3370 #3373 #3374 #3377 #3381 #3382 #3383 #3384 #3386 #3390 #3393 #3394 #3395 #3396 #3397 #3399 #3402 #3403 #3404 #3405 #3415 #3418 #3419 #3421 #3422 #3423 #3424 #3425 #3431 #3437 #3438 #3443 #3444 #3449 #3450 #3453 #3461 #3462 #3464 #3484 #3519 #3570 #3572 #3574 #3580 #3581 #3583 #3600 #3601 #3605 #3610 #3617 #3618 #3620 #3621 #3624 #3625 #3631 #3632 #3633 #3636 #3651 #3658 #3671 #3676 #3682 #3685 #3686 #3687 #3688 #3694 #3695 #3701 #3709 #3711 #3714 #3715 #3716 #3721 #3737 #3753 #3767 #3776 #3784 #3787 #3792 #3793 #3794 #3798 #3800 #3801 #3810 #3811 #3816 #3820 #3829 #3832 #3834 #3835 #3836 #3838 #3845 #3846 #3849 #3851 #3855 #3856 #3857 #3861 #3862 #3870 #3872 #3874 #3875 #3878 #3880 #3888 #3889 #3890 #3891 #3893 #3894 #3895 #3896 #3897 #3918 #3921 #3922 #3930 #3956 #3964 #3965 #3972 #3973 #3977 #3980 #3981 #3982 #3985 #3987 #3989 #3995 #3996 #4001 #4002 #4005 #4009 #4010 #4012 #4021 #4024 #4025 #4027 #4028 #4032 #4038 #4047 #4049 #4059 #4067 #4068 #4070 #4071 #4073 #4074 #4075 #4079 #4081 #4083 #4088 #4095 #4100 #4106 #4110 #4111 #4113 #4131 #4134 #4136 #4144 #4146 #4147 #4152 #4153 #4155 #4156 #4160 #4172 #4176 #4178 #4179 #4180 #4185 #4187 #4190 #4196 #4199 #4200 #4204 #4212 #4213 #4214 #4217 #4218 #4219 #4220 #4221 #4226 #4228 #4230 #4236 #4238 #4239 #4240 #4242 #4247 #4251 #4252 #4254 #4256 #4257 #4258 #4259 #4260 #4261 #4263 #4264 #4269 #4271 #4274 #4275 #4278 #4284 #4285 #4286 #4287 #4288 #4289 #4290 #4293 #4294 #4301 #4304 #4307 #4309 #4313 #4315 #4318 #4319 #4320 #4325 #4326 #4327 #4329 #4330 #4331 #4336 #4338 #4341 #4342 #4343 #4345 #4350 #4351 #4352 #4355 #4356 #4357 #4363 #4365 #4369 #4377 #4383 #4384 #4385 #4386 #4387 #4388 #4390 #4391 #4392 #4402 #4403 #4404 #4405 #4406
- @njzjz-bot: #3669 #3953 #3988 #4119 #4266
- @pre-commit-ci: #3163 #3236 #3264 #3305 #3454 #3489 #3599 #3634 #3659 #3675 #3700 #3720 #3754 #3779 #3825 #3850 #3863 #3883 #3900 #3938 #3955 #3983 #4003 #4048 #4053 #4065 #4080 #4097 #4115 #4130 #4159 #4173 #4192 #4235 #4268 #4310 #4337 #4378
- @robinzyb: #3647
- @shiruosong: #3344 #3345
- @sigbjobo: #4150
- @wanghan-iapcm: #3184 #3188 #3190 #3199 #3202 #3208 #3219 #3225 #3232 #3234 #3235 #3240 #3241 #3246 #3260 #3262 #3268 #3274 #3279 #3280 #3282 #3289 #3295 #3340 #3352 #3357 #3389 #3391 #3400 #3413 #3458 #3469 #3609 #3611 #3626 #3628 #3639 #3642 #3649 #3650 #3755 #3761 #4052 #4116 #4135 #4142 #4166 #4233 #4241
- @wangzyphysics: #3597 #4312
We also thank everyone who ran tests and reported bugs over the past eight months.