-
Hello, thank you for the great package! I have a large number of 80-atom structures that I am attempting to relax. To try and speed this up I am using the Python multiprocessing package to execute multiple relaxations at once. I am betting the error is most likely on my end... Here is a rough outline of the parallelization I am running:

```python
import multiprocessing as multip

from chgnet.model import CHGNet, StructOptimizer


def relax_one_struc(single_struc_info, model):
    struc_name, ini_struc = single_struc_info
    relaxer = StructOptimizer(model=model, use_device="cpu")
    result = relaxer.relax(ini_struc, verbose=True)
    final_struc = result["final_structure"]
    return struc_name, final_struc.as_dict()


def relax_many_strucs(all_strucs, model):
    with multip.Pool(processes=3) as pool:
        results = pool.starmap(
            relax_one_struc,
            [(struc_info, model) for struc_info in all_strucs.items()],
        )
    return dict(results)


def main():
    # Load three pymatgen structures, each a dict of the form {struc_name: Structure}:
    # first_initial_struc = {struc_1_name: struc_1}
    # second_initial_struc = {struc_2_name: struc_2}
    # third_initial_struc = {struc_3_name: struc_3}
    three_strucs = {**first_initial_struc, **second_initial_struc, **third_initial_struc}
    pretrained_chgnet = CHGNet.load(use_device="cpu")
    relaxed_strucs = relax_many_strucs(three_strucs, pretrained_chgnet)
    return relaxed_strucs


if __name__ == "__main__":
    main()
```

In testing on my laptop, when I run the three structures back to back, it takes ~100 seconds total (about 30 relaxation steps per structure). However, when running in parallel, the total execution takes ~400 seconds. I have run similar tests on an HPC and there is an equivalent slowdown. When running on my laptop's GPU (via MPS), the parallelized calculation takes about the same amount of time as the sequential one (which I think makes sense because everything is accessing the same compute). This is long enough, so I will leave it at that. I would greatly appreciate any insight as to why there might be a slowdown, and any changes or alternative approaches I can take for better results. Please let me know if I can provide more details. Thanks!

Quick edit - forgot to include, but I did some time testing, and the relaxation steps are what experience the slower speeds (time for loading the model, initializing classes, and overhead for multiprocessing are all minimal).
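For reference, one possibility I have not ruled out (purely an assumption on my part, not a confirmed diagnosis) is PyTorch's default intra-op threading: each pool worker may spawn threads on all available cores, so three workers could oversubscribe the CPU. A minimal sketch of how I could pin each worker to a single thread:

```python
import os

import torch


def _init_worker():
    # Runs once in every pool process before any relaxation starts;
    # caps each worker at a single intra-op thread so the workers
    # don't compete for the same cores.
    os.environ["OMP_NUM_THREADS"] = "1"
    torch.set_num_threads(1)


# then: with multip.Pool(processes=3, initializer=_init_worker) as pool: ...
```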
-
arguably a more efficient way to parallelize structure relaxation would be to load a single model and then batch the structures in the model's forward pass, making better use of large tensor processing. since different structures need different numbers of relaxation steps, that would require implementing a pool-based ASE calculator that checks if any structures have finished relaxing and swaps those out for new structures from the pool in the next forward pass. let me know if you're interested in working on that, happy to collaborate
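roughly what i have in mind, as an untested sketch: it assumes `CHGNet.predict_structure` accepts a list of structures and returns per-structure forces under the key `"f"` (check the current chgnet API for the exact signature), and it uses plain steepest descent where a real implementation would use an ASE optimizer like FIRE:

```python
import numpy as np
from pymatgen.core import Structure

from chgnet.model import CHGNet


def batched_relax(structures: dict, fmax=0.1, step_size=0.02, batch_size=8, max_steps=500):
    """Relax a pool of structures with batched forward passes, swapping
    converged structures out for waiting ones between passes."""
    model = CHGNet.load(use_device="cpu")  # one shared model for all structures
    pool = list(structures.items())  # (name, Structure) pairs waiting to start
    active, done = [], {}
    for _ in range(max_steps):
        # top up the active batch from the pool
        while pool and len(active) < batch_size:
            active.append(pool.pop())
        if not active:
            break
        # one batched forward pass over every active structure
        preds = model.predict_structure([struc for _, struc in active])
        still_active = []
        for (name, struc), pred in zip(active, preds):
            forces = np.asarray(pred["f"])
            if np.linalg.norm(forces, axis=1).max() < fmax:
                done[name] = struc  # converged: free its slot for a pool structure
                continue
            # steepest-descent position update (stand-in for a real optimizer)
            new_coords = struc.cart_coords + step_size * forces
            still_active.append(
                (name, Structure(struc.lattice, struc.species, new_coords, coords_are_cartesian=True))
            )
        active = still_active
    return done  # structures still unconverged at max_steps are dropped in this sketch
```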
-
Hello, hope you both are doing well! Has there been progress on this? I'd be interested in this functionality in ASE with CHGNet (and other PyTorch-based MLIPs) and can offer some assistance if needed, as I'm trying to do many NEB calculations split over several GPUs. At the moment, sequentially doing NEB takes me about 1-2 days for 10000 barriers... Thanks!
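For context, the split I have in mind on my end looks roughly like the sketch below: one worker process per GPU, each loading its own copy of the model. `run_neb` is a hypothetical placeholder for an actual ASE NEB setup, and I am assuming `use_device` accepts per-GPU strings like `"cuda:0"`:

```python
import multiprocessing as mp

from chgnet.model import CHGNet


def run_neb(model, images):
    # Hypothetical placeholder: attach the model's calculator to the images
    # and run an ASE NEB here, returning the converged barrier.
    raise NotImplementedError


def gpu_worker(device, job_q, out_q):
    model = CHGNet.load(use_device=device)  # one model per GPU, loaded once
    for name, images in iter(job_q.get, None):  # None is the shutdown sentinel
        out_q.put((name, run_neb(model, images)))


def run_all(jobs, devices=("cuda:0", "cuda:1")):
    ctx = mp.get_context("spawn")  # spawn is required when workers use CUDA
    job_q, out_q = ctx.Queue(), ctx.Queue()
    workers = [ctx.Process(target=gpu_worker, args=(dev, job_q, out_q)) for dev in devices]
    for w in workers:
        w.start()
    for job in jobs:  # jobs: list of (name, images) pairs
        job_q.put(job)
    for _ in workers:
        job_q.put(None)  # one sentinel per worker
    results = dict(out_q.get() for _ in jobs)
    for w in workers:
        w.join()
    return results
```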