This repository contains two examples of training on Amazon SageMaker using SageMaker's script mode and debugging with Amazon SageMaker Debugger. Each example includes training scripts for both the zero-script-change and the with-script-change scenarios.
Amazon SageMaker is a fully managed service that provides every developer and data scientist with the ability to build, train, and deploy machine learning (ML) models quickly. With SageMaker, you can use the built-in algorithms or bring your own algorithms and frameworks. One such framework is TensorFlow 2.x. Amazon SageMaker Debugger debugs, monitors, and profiles training jobs in real time. This helps you detect non-converging conditions, optimize resource utilization by eliminating bottlenecks, improve training time, and reduce the cost of your machine learning models.
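To give a feel for what "detecting non-converging conditions" means, here is a minimal, framework-free sketch of a check that flags a loss curve that has stopped improving. It is only an illustration: the function name `loss_not_decreasing`, the `window` and `min_delta` parameters, and the comparison logic are assumptions made for this sketch, not the implementation of any Debugger built-in rule.

```python
def loss_not_decreasing(losses, window=3, min_delta=0.0):
    """Return True if the last `window` losses show no improvement.

    Hypothetical helper for illustration only. A run is flagged when the
    best loss in the most recent `window` steps is not lower than the
    best loss seen before that window by more than `min_delta`.
    """
    # Not enough history to compare a recent window against earlier steps.
    if len(losses) <= window:
        return False

    best_before = min(losses[:-window])   # best loss before the window
    best_recent = min(losses[-window:])   # best loss inside the window
    return best_recent >= best_before - min_delta


if __name__ == "__main__":
    healthy = [1.0, 0.8, 0.6, 0.5, 0.4]     # keeps improving
    stalled = [1.0, 0.5, 0.5, 0.6, 0.55]    # plateaus after step 2
    print(loss_not_decreasing(healthy))
    print(loss_not_decreasing(stalled))
```

In a real training job, Debugger evaluates conditions like this against the tensors it collects, so you do not need to write such a check yourself.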
This example contains a Jupyter Notebook that demonstrates how to use a SageMaker-optimized TensorFlow 2.x container to train a model on the Fashion MNIST dataset, debug it with SageMaker Debugger, and then analyze the Debugger's output. It runs your training script on SageMaker in script mode with the default training loop.
This repository contains:
- A Jupyter Notebook to get started
- A training script in Python for the zero-script-change scenario that is passed to the training job
- A training script in Python for the with-script-change scenario that is passed to the training job
This example contains a Jupyter Notebook that demonstrates how to use a SageMaker-optimized TensorFlow 2.x container to train a model on the Fashion MNIST dataset, debug it with SageMaker Debugger, and then analyze the Debugger's output. It runs your training script on SageMaker in script mode with a custom training loop, i.e. one that customizes what happens inside the fit() loop.
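As a rough illustration of what a custom training loop involves, the sketch below writes out the steps that a default `fit()` call would otherwise handle: forward pass, loss, gradient, and weight update. It is deliberately framework-free so it runs with the standard library alone. In TensorFlow 2.x the gradient step would typically be recorded with `tf.GradientTape` rather than derived by hand. The function name `train_custom_loop`, the one-weight linear model, and the toy data are all assumptions for this sketch and are not taken from the training scripts in this repository.

```python
def train_custom_loop(xs, ys, epochs=50, lr=0.05):
    """Fit y = w * x with a hand-written loop (hypothetical example).

    Returns the learned weight and the per-epoch loss history.
    """
    w = 0.0
    history = []
    n = len(xs)

    for _ in range(epochs):
        # Forward pass: compute predictions with the current weight.
        preds = [w * x for x in xs]

        # Loss: mean squared error over the batch.
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n

        # Gradient of the loss with respect to w, derived by hand:
        # dL/dw = (2 / n) * sum((pred - y) * x)
        grad = 2.0 * sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / n

        # Optimizer step: plain gradient descent.
        w -= lr * grad

        # Owning the loop makes it easy to record values at each step.
        history.append(loss)

    return w, history


if __name__ == "__main__":
    weight, losses = train_custom_loop([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
    print(f"learned weight: {weight:.4f}, final loss: {losses[-1]:.6f}")
```

Because the loop is written out explicitly, each step is a natural place to inspect or record intermediate values during training.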
This repository contains:
- A Jupyter Notebook to get started
- A training script in Python for the zero-script-change scenario that is passed to the training job
- A training script in Python for the with-script-change scenario that is passed to the training job
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.