Commit

Merge pull request #52 from awslabs/develop
Release v0.0.4
jfuss authored Dec 11, 2018
2 parents 140fd87 + 20a80f5 commit 65a2643
Showing 35 changed files with 894 additions and 5 deletions.
3 changes: 3 additions & 0 deletions .appveyor.yml
@@ -13,6 +13,9 @@ environment:
build: off

install:
# To run Nodejs workflow integ tests
- ps: Install-Product node 8.10

- "set PATH=%PYTHON%\\Scripts;%PYTHON%\\bin;%PATH%"
- "%PYTHON%\\python.exe -m pip install -r requirements/dev.txt"
- "%PYTHON%\\python.exe -m pip install -e ."
5 changes: 5 additions & 0 deletions .travis.yml
@@ -11,6 +11,11 @@ matrix:
dist: xenial
sudo: true
install:

# To run Nodejs workflow integ tests
- nvm install 8.10.0
- nvm use 8.10.0

# Install the code requirements
- make init
script:
2 changes: 1 addition & 1 deletion aws_lambda_builders/__init__.py
@@ -1,5 +1,5 @@
"""
AWS Lambda Builder Library
"""
-__version__ = '0.0.3'
+__version__ = '0.0.4'
RPC_PROTOCOL_VERSION = "0.1"
3 changes: 3 additions & 0 deletions aws_lambda_builders/builder.py
@@ -3,6 +3,7 @@
"""

import importlib
import os
import logging

from aws_lambda_builders.registry import get_workflow, DEFAULT_REGISTRY
@@ -93,6 +94,8 @@ def build(self, source_dir, artifacts_dir, scratch_dir, manifest_path,
        if runtime:
            self._validate_runtime(runtime)

        if not os.path.exists(scratch_dir):
            os.makedirs(scratch_dir)

        workflow = self.selected_workflow_cls(source_dir,
                                              artifacts_dir,
1 change: 1 addition & 0 deletions aws_lambda_builders/workflows/__init__.py
@@ -3,3 +3,4 @@
"""

import aws_lambda_builders.workflows.python_pip
import aws_lambda_builders.workflows.nodejs_npm
137 changes: 137 additions & 0 deletions aws_lambda_builders/workflows/nodejs_npm/DESIGN.md
@@ -0,0 +1,137 @@
## NodeJS - NPM Lambda Builder

### Scope

This package is an effort to port the Claudia.JS packager to a library that can
be used to handle the dependency resolution portion of packaging NodeJS code
for use in AWS Lambda. The scope for this builder is to take an existing
directory containing customer code, including a valid `package.json` manifest
specifying third-party dependencies. The builder will use NPM to include
production dependencies and exclude test resources in a way that makes them
deployable to AWS Lambda.

### Challenges

NPM normally stores all dependencies in a `node_modules` subdirectory. It
supports several dependency categories, such as development dependencies
(usually third-party build utilities and test resources), optional dependencies
(usually required for local execution but already available on the production
environment, or peer-dependencies for optional third-party packages) and
production dependencies (normally the minimum required for correct execution).
All these dependency types are mixed in the same directory.

To speed up Lambda startup time and optimise usage costs, the correct thing to
do in most cases is just to package up production dependencies. During development
work we can expect that the local `node_modules` directory contains all the
various dependency types, and NPM does not provide a way to directly identify
just the ones relevant for production. To identify production dependencies,
this packager needs to copy the source to a clean temporary directory and re-run
dependency installation there.

A frequently used trick to speed up NodeJS Lambda deployment is to avoid
bundling the `aws-sdk`, since it is already available on the Lambda VM.
This makes deployment significantly faster for single-file lambdas, for
example. Although this is not good from a consistency and compatibility
perspective (as the version of the API used in production might be different
from what was used during testing), people do this frequently enough that the
packager should handle it in some way. A common way of marking this with ClaudiaJS
is to include `aws-sdk` as an optional dependency, then deploy without optional
dependencies.

Other runtimes do not have this flexibility, so instead of adding a specific
parameter to the SAM CLI, the packager should support a flag to include or
exclude optional dependencies through environment variables.

NPM also provides support for running user-defined scripts as part of the build
process, so this packager needs to support standard NPM script execution.

NPM, since version 5, uses symbolic links to optimise disk space usage, so
cross-project dependencies will just be linked to elsewhere on the local disk
instead of included in the `node_modules` directory. This means that just copying
the `node_modules` directory (even if symlinks were resolved to actual paths) is
far from optimal for creating a stand-alone module. Copying would lead to significantly
larger packages than necessary, as sub-modules might still have test resources, and
common references from multiple projects would be duplicated.

NPM also uses a locking mechanism (`package-lock.json`) that's in many ways more
broken than functional, as it in some cases hard-codes locks to local disk
paths, and gets confused by including the same package as a dependency
throughout the project tree in different dependency categories
(development/optional/production). Although the official tool recommends
including this file in version control as a way to pin down dependency
versions, when used on several machines with different project layouts it can
lead to uninstallable dependencies.

NPM dependencies are usually plain JavaScript libraries, but they may include
native binaries precompiled for a particular platform, or require some system
libraries to be installed. A notable example is `sharp`, a popular image
manipulation library, that uses symbolic links to system libraries. Another
notable example is `puppeteer`, a library to control a headless Chrome browser,
that downloads a Chromium binary for the target platform during installation.

To fully deal with those cases, this packager may need to execute the
dependency installation step on a Docker image compatible with the target
Lambda environment.

### Implementation

The general algorithm for preparing a node package for use on AWS Lambda
is as follows.

#### Step 1: Prepare a clean copy of the project source files

Execute `npm pack` to perform project-specific packaging using the supplied
`package.json` manifest, which will automatically exclude temporary files,
test resources and other source files unnecessary for running in a production
environment.

This will produce a `tar` archive that needs to be unpacked into the artifacts
directory. Note that the archive will actually contain a `package`
subdirectory containing the files, so it's not enough to just directly unpack
files.
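
As a rough illustration of the unpacking caveat above, the sketch below extracts an archive while dropping the leading `package/` directory, so that files land directly in the target directory. The helper name `extract_package_contents` is hypothetical and not part of this library, and the code assumes Python 3. It is one possible way to handle the layout, not necessarily what the builder itself does.

```python
import os
import tarfile


def extract_package_contents(tarfile_path, target_dir, prefix="package/"):
    """Unpack an archive, dropping the leading ``package/`` directory.

    Hypothetical helper for illustration only.
    """
    target_root = os.path.realpath(target_dir)
    with tarfile.open(tarfile_path, "r:*") as archive:
        for member in archive.getmembers():
            # Only regular files under the expected prefix are copied.
            if not member.isfile() or not member.name.startswith(prefix):
                continue
            relative_name = member.name[len(prefix):]
            destination = os.path.realpath(os.path.join(target_root, relative_name))
            # Refuse entries that would escape the target directory.
            if not destination.startswith(target_root + os.sep):
                raise ValueError("unsafe path in archive: " + member.name)
            os.makedirs(os.path.dirname(destination), exist_ok=True)
            source = archive.extractfile(member)
            with open(destination, "wb") as handle:
                handle.write(source.read())
```

Extracting the archive as-is and then treating the `package` subdirectory as the result would work equally well; the sketch only shows how the prefix can be stripped during extraction.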

#### Step 2: Rewrite local dependencies

_(out of scope for the current version)_

To optimise disk space and avoid including development dependencies from other
locally linked packages, inspect the `package.json` manifest looking for dependencies
referring to local file paths (these can be identified because they start with `.` or `file:`),
then recursively execute the packaging process for each such dependency.

Local dependencies may include other local dependencies themselves; this is a very
common way of sharing configuration or development utilities such as linting or testing
tools. This means that for each packaged local dependency this packager needs to
recursively apply the packaging process. It also means that the packager needs to
track local paths and avoid re-packaging directories it already visited.

NPM produces a `tar` archive while packaging that can be directly included as a
dependency. This will make NPM unpack and install a copy correctly. Once the
packager produces all `tar` archives required by local dependencies, rewrite
the manifest to point to `tar` files instead of the original location.

If the project contains a package lock file, this will cause NPM to ignore changes
to the package.json manifest. In this case, the packager will need to remove
`package-lock.json` so that dependency rewrites take effect.
_(out of scope for the current version)_
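
Since this step is out of scope for the current version, the following is only a sketch of how the detection and traversal described above might look. The function names are hypothetical, and the choice to follow only `dependencies` and `optionalDependencies` (skipping development dependencies) is an assumption based on the goal stated above. It assumes Python 3.

```python
import json
import os

# Assumption: development dependencies of linked packages are not followed.
DEPENDENCY_SECTIONS = ("dependencies", "optionalDependencies")


def local_dependency_path(spec):
    """Return the path for a local dependency spec, or None for registry specs."""
    if spec.startswith("file:"):
        return spec[len("file:"):]
    if spec.startswith("."):
        return spec
    return None


def collect_local_packages(project_dir, visited=None):
    """Recursively collect directories of locally linked packages.

    Each package appears after the local packages it depends on, and
    ``visited`` prevents re-processing a directory (including cycles).
    """
    visited = set() if visited is None else visited
    project_dir = os.path.realpath(project_dir)
    if project_dir in visited:
        return []
    visited.add(project_dir)

    with open(os.path.join(project_dir, "package.json")) as handle:
        manifest = json.load(handle)

    ordered = []
    for section in DEPENDENCY_SECTIONS:
        for spec in manifest.get(section, {}).values():
            path = local_dependency_path(spec)
            if path is not None:
                ordered.extend(
                    collect_local_packages(os.path.join(project_dir, path), visited)
                )
    ordered.append(project_dir)
    return ordered
```

Packing each returned directory in order would mean that by the time a package is packed, the archives for its local dependencies already exist and its manifest can be rewritten to point at them.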

#### Step 3: Install dependencies

The packager should then run `npm install` to download and expand all dependencies to
the local `node_modules` subdirectory. This has to be executed in the directory with
a clean copy of the source files.

Note that NPM can be configured to use proxies or local company repositories using
a local file, `.npmrc`. The packaging process from step 1 normally excludes this file, so it may
need to be copied additionally before dependency installation, and then removed.
_(out of scope for the current version)_
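
The copy-then-remove handling of `.npmrc` could be sketched as a context manager, so the file is cleaned up even if installation fails. This is an illustration under assumptions: the name `temporary_npmrc` is hypothetical and nothing like it exists in the current version.

```python
import contextlib
import os
import shutil


@contextlib.contextmanager
def temporary_npmrc(source_dir, install_dir):
    """Copy ``.npmrc`` into ``install_dir`` for the duration of the block.

    Hypothetical helper; yields True if a copy was made.
    """
    source = os.path.join(source_dir, ".npmrc")
    target = os.path.join(install_dir, ".npmrc")
    # Never overwrite a file that is already present in the install directory.
    copied = os.path.isfile(source) and not os.path.exists(target)
    if copied:
        shutil.copyfile(source, target)
    try:
        yield copied
    finally:
        # Remove only the copy made here, so the file never ships in the artifact.
        if copied and os.path.exists(target):
            os.remove(target)
```

The dependency installation would then run inside a `with temporary_npmrc(source_dir, artifacts_dir):` block.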

Some users may want to exclude optional dependencies, or even include development dependencies.
To avoid incompatible flags in the `sam` CLI, the packager should allow users to specify
options for the `npm install` command using an environment variable.
_(out of scope for the current version)_
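
One way the environment-variable approach might look is sketched below. The variable name `NPM_INSTALL_EXTRA_ARGS` is purely hypothetical (this document does not define one), and the base arguments mirror those used by the install action in this change.

```python
import os
import shlex

# Arguments used by the install action in this change.
BASE_INSTALL_ARGS = ["install", "-q", "--no-audit", "--no-save", "--production"]

# Hypothetical variable name, chosen for illustration only.
EXTRA_ARGS_ENV_VAR = "NPM_INSTALL_EXTRA_ARGS"


def build_install_args(environ=None):
    """Return the ``npm`` argument list, extended with user-supplied options."""
    environ = os.environ if environ is None else environ
    # shlex.split honours shell-style quoting, so quoted values stay intact.
    extra_args = shlex.split(environ.get(EXTRA_ARGS_ENV_VAR, ""))
    return BASE_INSTALL_ARGS + extra_args
```

For example, setting the variable to `--no-optional` would append that flag to the base arguments and skip optional dependencies such as `aws-sdk`.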

To fully support dependencies that download or compile binaries for a target platform, this step
needs to be executed inside a Docker image compatible with AWS Lambda.
_(out of scope for the current version)_

5 changes: 5 additions & 0 deletions aws_lambda_builders/workflows/nodejs_npm/__init__.py
@@ -0,0 +1,5 @@
"""
Builds NodeJS Lambda functions using NPM dependency manager
"""

from .workflow import NodejsNpmWorkflow
113 changes: 113 additions & 0 deletions aws_lambda_builders/workflows/nodejs_npm/actions.py
@@ -0,0 +1,113 @@
"""
Action to resolve NodeJS dependencies using NPM
"""

import logging

from aws_lambda_builders.actions import BaseAction, Purpose, ActionFailedError
from .npm import NpmExecutionError

LOG = logging.getLogger(__name__)


class NodejsNpmPackAction(BaseAction):

    """
    A Lambda Builder Action that packages a Node.js package using NPM to extract the source and remove test resources
    """

    NAME = 'NpmPack'
    DESCRIPTION = "Packaging source using NPM"
    PURPOSE = Purpose.COPY_SOURCE

    def __init__(self, artifacts_dir, scratch_dir, manifest_path, osutils, subprocess_npm):
        """
        :type artifacts_dir: str
        :param artifacts_dir: an existing (writable) directory where to store the output.
            Note that the actual result will be in the 'package' subdirectory here.
        :type scratch_dir: str
        :param scratch_dir: an existing (writable) directory for temporary files
        :type manifest_path: str
        :param manifest_path: path to package.json of an NPM project with the source to pack
        :type osutils: aws_lambda_builders.workflows.nodejs_npm.utils.OSUtils
        :param osutils: An instance of OS Utilities for file manipulation
        :type subprocess_npm: aws_lambda_builders.workflows.nodejs_npm.npm.SubprocessNpm
        :param subprocess_npm: An instance of the NPM process wrapper
        """
        super(NodejsNpmPackAction, self).__init__()
        self.artifacts_dir = artifacts_dir
        self.manifest_path = manifest_path
        self.scratch_dir = scratch_dir
        self.osutils = osutils
        self.subprocess_npm = subprocess_npm

    def execute(self):
        """
        Runs the action.
        :raises aws_lambda_builders.actions.ActionFailedError: when NPM packaging fails
        """
        try:
            package_path = "file:{}".format(self.osutils.abspath(self.osutils.dirname(self.manifest_path)))

            LOG.debug("NODEJS packaging %s to %s", package_path, self.scratch_dir)

            tarfile_name = self.subprocess_npm.run(['pack', '-q', package_path], cwd=self.scratch_dir)

            LOG.debug("NODEJS packed to %s", tarfile_name)

            tarfile_path = self.osutils.joinpath(self.scratch_dir, tarfile_name)

            LOG.debug("NODEJS extracting to %s", self.artifacts_dir)

            self.osutils.extract_tarfile(tarfile_path, self.artifacts_dir)

        except NpmExecutionError as ex:
            raise ActionFailedError(str(ex))


class NodejsNpmInstallAction(BaseAction):

    """
    A Lambda Builder Action that installs NPM project dependencies
    """

    NAME = 'NpmInstall'
    DESCRIPTION = "Installing dependencies from NPM"
    PURPOSE = Purpose.RESOLVE_DEPENDENCIES

    def __init__(self, artifacts_dir, subprocess_npm):
        """
        :type artifacts_dir: str
        :param artifacts_dir: an existing (writable) directory with project source files.
            Dependencies will be installed in this directory.
        :type subprocess_npm: aws_lambda_builders.workflows.nodejs_npm.npm.SubprocessNpm
        :param subprocess_npm: An instance of the NPM process wrapper
        """

        super(NodejsNpmInstallAction, self).__init__()
        self.artifacts_dir = artifacts_dir
        self.subprocess_npm = subprocess_npm

    def execute(self):
        """
        Runs the action.
        :raises aws_lambda_builders.actions.ActionFailedError: when NPM execution fails
        """

        try:
            LOG.debug("NODEJS installing in: %s", self.artifacts_dir)

            self.subprocess_npm.run(
                ['install', '-q', '--no-audit', '--no-save', '--production'],
                cwd=self.artifacts_dir
            )

        except NpmExecutionError as ex:
            raise ActionFailedError(str(ex))
90 changes: 90 additions & 0 deletions aws_lambda_builders/workflows/nodejs_npm/npm.py
@@ -0,0 +1,90 @@
"""
Wrapper around calling npm through a subprocess.
"""

import logging

LOG = logging.getLogger(__name__)


class NpmExecutionError(Exception):

    """
    Exception raised in case NPM execution fails.
    It will pass on the standard error output from the NPM console.
    """

    MESSAGE = "NPM Failed: {message}"

    def __init__(self, **kwargs):
        Exception.__init__(self, self.MESSAGE.format(**kwargs))


class SubprocessNpm(object):

    """
    Wrapper around the NPM command line utility, making it
    easy to consume execution results.
    """

    def __init__(self, osutils, npm_exe=None):
        """
        :type osutils: aws_lambda_builders.workflows.nodejs_npm.utils.OSUtils
        :param osutils: An instance of OS Utilities for file manipulation
        :type npm_exe: str
        :param npm_exe: Path to the NPM binary. If not set,
            the default executable path npm will be used
        """
        self.osutils = osutils

        if npm_exe is None:
            if osutils.is_windows():
                npm_exe = 'npm.cmd'
            else:
                npm_exe = 'npm'

        self.npm_exe = npm_exe

    def run(self, args, cwd=None):

        """
        Runs the action.
        :type args: list
        :param args: Command line arguments to pass to NPM
        :type cwd: str
        :param cwd: Directory where to execute the command (defaults to current dir)
        :rtype: str
        :return: text of the standard output from the command
        :raises aws_lambda_builders.workflows.nodejs_npm.npm.NpmExecutionError:
            when the command executes with a non-zero return code. The exception will
            contain the text of the standard error output from the command.
        :raises ValueError: if arguments are not provided, or not a list
        """

        if not isinstance(args, list):
            raise ValueError('args must be a list')

        if not args:
            raise ValueError('requires at least one arg')

        invoke_npm = [self.npm_exe] + args

        LOG.debug("executing NPM: %s", invoke_npm)

        p = self.osutils.popen(invoke_npm,
                               stdout=self.osutils.pipe,
                               stderr=self.osutils.pipe,
                               cwd=cwd)

        out, err = p.communicate()

        if p.returncode != 0:
            raise NpmExecutionError(message=err.decode('utf8').strip())

        return out.decode('utf8').strip()
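
The `utils.OSUtils` class referenced in the docstrings is not shown in this part of the diff. The sketch below is an assumption about what a minimal object satisfying the calls made above (`popen`, `pipe`, `is_windows`, `abspath`, `dirname`, `joinpath`, `extract_tarfile`) might look like; it is inferred from usage only and may differ from the real class.

```python
import os
import platform
import subprocess
import tarfile


class OSUtils(object):
    """Thin wrapper over OS functions, so callers can be tested with a stub.

    Sketch inferred from how the actions and SubprocessNpm use ``osutils``.
    """

    def abspath(self, path):
        return os.path.abspath(path)

    def dirname(self, path):
        return os.path.dirname(path)

    def joinpath(self, *parts):
        return os.path.join(*parts)

    def extract_tarfile(self, tarfile_path, unpack_dir):
        # Extracts the archive as-is, keeping any top-level directory.
        with tarfile.open(tarfile_path, 'r:*') as archive:
            archive.extractall(unpack_dir)

    def popen(self, command, stdout=None, stderr=None, env=None, cwd=None):
        return subprocess.Popen(command, stdout=stdout, stderr=stderr, env=env, cwd=cwd)

    @property
    def pipe(self):
        return subprocess.PIPE

    def is_windows(self):
        return platform.system().lower() == 'windows'
```

Routing process and file-system calls through one injected object is what lets `SubprocessNpm` and the actions be unit tested with a fake `osutils` instead of a real NPM installation.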