HPE recommends that you run each Swarm Network node and each Swarm Learning node on a dedicated system to get the best performance from the platform.
The recommended requirements for each system are as follows:
NOTE: The requirements of the system running the user ML node are driven by the complexity of the ML algorithm. GPUs may also be needed.
- Any x86-64 hardware
- System memory of 32 GB or more
- Hard disk space of 200 GB or more
- Qualified with HPE Edgeline, ProLiant DL380, and Apollo 6500
- A minimum of one and a maximum of four open TCP/IP ports on each node. All Swarm nodes must be able to access the ports of every other node. For details on the ports that must be opened, see Exposed ports.
- Stable internet connectivity to download the Swarm Learning package and Docker images.
- Linux, qualified on Ubuntu 20.04 and RHEL 8.1.
- For the Swarm Web UI installer, any x86-64 hardware running Linux, Windows, or macOS.
- HPE Swarm Learning is qualified with Docker 20.10.5.
- Configure Docker to run as a non-root user. For more details, see Manage Docker as a non-root user.
- Configure network proxy settings for Docker. For more details, see HTTP/HTTPS proxy.
- Configure Docker to use IPv4.
- Qualified with Keras (TensorFlow 2 backend) and PyTorch 1.5 based machine learning models implemented using Python 3.
- Synchronized time across all systems using NTP.
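The architecture, memory, and disk items in the list above can be checked with a short script before installation. The following is a minimal sketch, not an HPE-provided tool: the function names `check_host` and `probe_host` are hypothetical, the thresholds are interpreted as decimal gigabytes (the list does not say GB or GiB), and disk space is taken as the total capacity of the file system that contains the given path. The probe relies on `os.sysconf`, so it is intended for Linux hosts.

```python
import os
import platform
import shutil

GB = 10**9  # assumption: decimal gigabytes; the requirements do not say GB vs GiB

MIN_MEMORY_BYTES = 32 * GB   # "System memory of 32 GB or more"
MIN_DISK_BYTES = 200 * GB    # "Hard disk space of 200 GB or more"
SUPPORTED_ARCHS = {"x86_64", "amd64"}  # common spellings of x86-64


def check_host(arch, memory_bytes, disk_bytes):
    """Return a list of problems; an empty list means all three checks pass."""
    problems = []
    if arch.lower() not in SUPPORTED_ARCHS:
        problems.append(f"architecture {arch!r} is not x86-64")
    if memory_bytes < MIN_MEMORY_BYTES:
        problems.append(f"memory {memory_bytes / GB:.1f} GB is below 32 GB")
    if disk_bytes < MIN_DISK_BYTES:
        problems.append(f"disk {disk_bytes / GB:.1f} GB is below 200 GB")
    return problems


def probe_host(path="/"):
    """Collect architecture, physical memory, and disk capacity for `path` (Linux)."""
    memory = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk = shutil.disk_usage(path).total  # assumption: total capacity, not free space
    return platform.machine(), memory, disk
```

Calling `check_host(*probe_host())` on a candidate system returns an empty list when all three checks pass. Keeping the comparison separate from the probing makes the thresholds easy to adjust and test.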
NOTE: 'Qualified' in this section means that HPE has qualified the product with the respective versions. Swarm Learning may work with other versions as well.
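Because every Swarm node must be able to reach the open ports of every other node, it can help to test TCP reachability between systems before starting the platform. The following is a minimal sketch using only the Python standard library. The function name `unreachable_ports` is hypothetical, and the actual port numbers are not listed in this section, so they are passed in as a parameter; take them from Exposed ports.

```python
import socket


def unreachable_ports(host, ports, timeout=3.0):
    """Return the subset of `ports` on `host` that do not accept a TCP connection."""
    failed = []
    for port in ports:
        try:
            # create_connection resolves the host and completes a TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts, and name-resolution failures.
            failed.append(port)
    return failed
```

Run the check from each node against every other node while the services are listening on the remote side. An empty list means that all of the given ports accepted a connection. A port with no listener is reported as unreachable even if the firewall allows it.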