We provide several video tutorials on YouTube.
- Introduction to $\lambda$-ML
- Programming Interface
- Deploying $\lambda$-ML with S3
- Deploying $\lambda$-ML with ElastiCache
- Deploying $\lambda$-ML with DynamoDB
- Deploying $\lambda$-ML with Hybrid Parameter Server
- awscli (version 1)
- botocore
- boto3
- numpy
- torch=1.0.1
- thrift
- redis
- grpcio
- Create a Lambda layer with PyTorch 1.0.1.
- Compress the whole project and upload to Lambda.
- Create a VPC and a security group in AWS.
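The compression step above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the function name `zip_project` and the paths are hypothetical, and the upload to Lambda itself is not shown.

```python
import zipfile
from pathlib import Path


def zip_project(project_dir: str, archive_path: str) -> int:
    """Zip every file under project_dir and return the number of files added.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    root = Path(project_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives inside root.
            if path.is_file() and path.resolve() != archive:
                # Store paths relative to the project root, with "/" separators.
                zf.write(path, path.relative_to(root).as_posix())
                count += 1
    return count
```

Calling `zip_project("my_project", "my_project.zip")` would produce an archive whose entries are relative to the project root, which is the layout Lambda expects for a deployment package.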
The storage layer offers basic operations to manipulate external storage.
- S3 (storage/s3/s3_type.py). Storage operations: list/save/load/delete/clear/...
- ElastiCache (storage/memcached/memcached_type.py). Storage operations: list/save/load/delete/clear/...
- DynamoDB (storage/dynamo/dynamo_type.py). Storage operations: list/save/load/delete/clear/...
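To make the shape of these operations concrete, here is a minimal in-memory sketch. The class name `InMemoryStorage` and all method signatures are assumptions made for illustration; the actual classes in `storage/` may differ.

```python
from typing import Dict, List


class InMemoryStorage:
    """Hypothetical stand-in for a storage backend (not the repository's class)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def save(self, key: str, data: bytes) -> None:
        """Store data under key, overwriting any existing object."""
        self._objects[key] = data

    def load(self, key: str) -> bytes:
        """Return the object stored under key; raises KeyError if absent."""
        return self._objects[key]

    def list(self, prefix: str = "") -> List[str]:
        """Return the sorted keys that start with prefix."""
        return sorted(k for k in self._objects if k.startswith(prefix))

    def delete(self, key: str) -> bool:
        """Remove key and report whether it existed."""
        return self._objects.pop(key, None) is not None

    def clear(self) -> None:
        """Remove every object."""
        self._objects.clear()
```

A worker could, for example, call `save("grad/w0", payload)` after a training step and another worker could discover it with `list("grad/")`.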
The communication layer provides popular communication primitives.
- S3 communicator (communicator/s3_comm.py). Primitives: async/reduce/reduce_scatter.
- ElastiCache communicator (communicator/memcached_comm.py). Primitives: async/reduce/reduce_scatter.
- DynamoDB communicator (communicator/dynamo_comm.py). Primitives: async/reduce/reduce_scatter.
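The difference between `reduce` and `reduce_scatter` can be illustrated with a small single-process simulation. This sketch assumes each worker has already written its vector to a shared store under a hypothetical key `worker_<id>`; the function signatures are illustrative and are not the communicators' actual API. The `async` primitive is not covered here.

```python
from typing import Dict, List

Vector = List[float]


def reduce(store: Dict[str, Vector], n_workers: int) -> Vector:
    """Sum the full vectors of all workers, as a single reducer would."""
    vectors = [store[f"worker_{w}"] for w in range(n_workers)]
    return [sum(vals) for vals in zip(*vectors)]


def reduce_scatter(store: Dict[str, Vector], n_workers: int) -> List[Vector]:
    """Worker w sums only chunk w of every vector.

    Returns one aggregated chunk per worker, so the aggregation work is
    spread over all workers instead of a single reducer.
    """
    dim = len(store["worker_0"])
    chunk = (dim + n_workers - 1) // n_workers  # ceiling division
    results: List[Vector] = []
    for w in range(n_workers):
        lo, hi = w * chunk, min((w + 1) * chunk, dim)
        parts = [store[f"worker_{v}"][lo:hi] for v in range(n_workers)]
        results.append([sum(vals) for vals in zip(*parts)])
    return results
```

Concatenating the chunks returned by `reduce_scatter` gives the same vector as `reduce`; the two differ only in which worker performs which part of the summation.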
In addition to storage services, $\lambda$-ML also implements a hybrid parameter server.
- Launch the parameter server; see thrift_ps/start_service.py.
- Communication interfaces: ping/register/pull/push/delete.
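The five interface calls can be sketched with a toy in-process server. Everything below is an assumption for illustration: the class name `ToyParameterServer`, the argument lists, and the choice to apply a plain SGD update on `push` are not taken from the Thrift service definition in `thrift_ps/`.

```python
from typing import Dict, List


class ToyParameterServer:
    """Hypothetical in-process parameter server mirroring the five calls."""

    def __init__(self) -> None:
        self._models: Dict[str, List[float]] = {}

    def ping(self) -> bool:
        """Liveness check."""
        return True

    def register(self, name: str, length: int) -> None:
        """Create a zero-initialised model with the given number of parameters."""
        self._models[name] = [0.0] * length

    def pull(self, name: str) -> List[float]:
        """Return a copy of the current parameters."""
        return list(self._models[name])

    def push(self, name: str, grad: List[float], lr: float) -> None:
        """Apply a gradient with an SGD step (an assumed update rule)."""
        params = self._models[name]
        if len(grad) != len(params):
            raise ValueError("gradient length does not match the model")
        for i, g in enumerate(grad):
            params[i] -= lr * g

    def delete(self, name: str) -> None:
        """Drop the model if it exists."""
        self._models.pop(name, None)
```

A worker would typically `register` a model once, then alternate `pull` and `push` in each iteration, and `delete` the model when training finishes.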
The general usage of $\lambda$-ML:
- Partition the dataset and upload to S3.
- Create a trigger Lambda function and an execution Lambda function.
- Set configurations (e.g., dataset location) and hyperparameters (e.g., learning rate).
- Set VPC and security group.
- Execute the trigger function.
- See the logs in CloudWatch.
See examples for more details.
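As an illustration of the first step, the sketch below splits a dataset into shards in round-robin order. The function name `partition` and the `shard_<i>` key scheme are hypothetical; uploading each shard to S3 (for example with boto3) is left out.

```python
from typing import Dict, List


def partition(samples: List[str], n_shards: int, prefix: str = "shard") -> Dict[str, List[str]]:
    """Assign samples to n_shards shards in round-robin order.

    Returns a mapping from a hypothetical object key to the samples of that
    shard; each entry could then be uploaded as one S3 object.
    """
    if n_shards <= 0:
        raise ValueError("n_shards must be positive")
    shards: Dict[str, List[str]] = {f"{prefix}_{i}": [] for i in range(n_shards)}
    for idx, sample in enumerate(samples):
        shards[f"{prefix}_{idx % n_shards}"].append(sample)
    return shards
```

Round-robin assignment keeps shard sizes within one sample of each other, so each execution function would receive a similar amount of work.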
If you have any questions or suggestions, feel free to contact jiawei.jiang@inf.ethz.ch and ce.zhang@inf.ethz.ch.
Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, Ce Zhang. Towards Demystifying Serverless Machine Learning Training. SIGMOD 2021.