Skip to content

Latest commit

 

History

History
300 lines (251 loc) · 10.6 KB

README.md

File metadata and controls

300 lines (251 loc) · 10.6 KB

Autoscaling GitLab Runner on AWS

If you want to have a cost-effective and scalable solution for you CI/CD builds, it can be suitable to use Gitlab Runner with it's autoscaling feature. In this example, we'll show how to configure a Gitlab Runner in AWS that will serve as the bastion where it will spawn new spot instances whenever you will need.

Prerequisites

We will this tools:

Prepare the bastion instance

The first step is to install GitLab Runner in an EC2 instance that will serve as the bastion that spawns new machines. This doesn’t have to be a powerful machine since it will not run any jobs itself, a t3.micro instance will do. This machine will be a dedicated host since we need it to always be up and running, thus it will be the only standard monthly cost.

Install the prerequisites:

After the Runner is registered, we need to edit it's configuration file and add the required options for the AWS docker machine driver.

The Gitlab Runner configuration

The main section

In the main section, we can define the limit of the jobs that can be run in parallel across all Runners (concurrent). We can start with something low like 2, and increase or decrease when we need.

The check_interval option defines how often the Runner should check GitLab for new jobs (in seconds).

Example:

concurrent = 2
check_interval = 0

Read more about all the options we can use.

The runners section

From the [[runners]] section, the most important part is the executor which must be set to docker+machine. Most of those settings are taken care of when you register the Runner for the first time.

limit sets the maximum number of machines (running and idle) that this Runner will spawn. For more info check the relationship between limit, concurrent and IdleCount.

Example:

[[runners]]
  name = "gitlab-aws-autoscaler"
  url = "<URL of your GitLab instance>"
  token = "<Runner's token>"
  executor = "docker+machine"
  limit = 20

Read more about all the options we can use.

The runners.docker section

In this section, we can define the default Docker image to be used by the child Runners if it’s not defined in .gitlab-ci.yml. To build our apps we use Kaniko executor and we don't need to use privileged mode. Read more why priviliged is dangerous.

Next, we use disable_cache = true to disable the Docker executor’s inner cache mechanism since we will use the distributed cache mode as described in the following section.

Example:

  [runners.docker]
    tls_verify = false
    image = "alpine"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = true
    shm_size = 0

Read more about all the options we can use under [runners.docker].

The runners.cache section

To speed up our jobs, GitLab Runner provides a cache mechanism. It is recommended to use the distributed cache mechanism that GitLab Runner provides. Since new instances will be created on demand, it is essential to have a common place where the cache is stored. In next example we don't use ACCESS_KEY or SECRET_KEY. Gitlab-runner will look on our IAM Role on our bastion. In the following example, we use Amazon S3:

 [runners.cache]
    Type = "s3"
    Shared = true
    [runners.cache.s3]
      BucketName = "<the bucket where your cache should be kept>"
      BucketLocation = "us-east-1"

The runners.machine section

This is the most important part of the configuration and it’s the one that tells GitLab Runner how and when to spawn new or remove old Docker Machine instances.

  [runners.machine]
    IdleCount = 0
    IdleTime = 120
    MaxBuilds = 100
    MachineDriver = "amazonec2"
    MachineName = "gitlab-docker-machine-%s"
    MachineOptions = [
      "amazonec2-iam-instance-profile=nameOfRole", # Better security features. We can use IAM Roles instead of AWS keys
      "amazonec2-region=XXXXX", 
      "amazonec2-vpc-id=vpc-XXXXX", 
      "amazonec2-subnet-id=subnet-XXXXX", 
      "amazonec2-zone=a", 
      "amazonec2-tags=runner-manager-name,gitlab-aws-autoscaler,gitlab,true,gitlab-runner-autoscale,true", 
      "amazonec2-security-group=XXXXX", 
      "amazonec2-instance-type=t3.medium", 
      "amazonec2-request-spot-instance=true", 
      "amazonec2-spot-price=0.03"
      ]
    OffPeakPeriods = ["* * 0-8,22-23 * * mon-fri *", "* * * * * sat,sun *"]
    OffPeakTimezone = "Europe/Kiev"
    OffPeakIdleCount = 0
    OffPeakIdleTime = 120

Read more about MachineOptions.

Use IAM Roles instead of AWS_SECRET_KEY and AWS_ACCESS_KEY

First of all, we need to create IAM Role for out bastion server. Below you can see which permissions for IAM Role you need.
Read more about IAM Roles . Example for creating EC2-spot instances(it uses on bastion server):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "ec2:DescribeKeyPairs",
                "ec2:TerminateInstances",
                "ec2:StopInstances",
                "ec2:StartInstances",
                "ec2:RunInstances",
                "ec2:RebootInstances",
                "ec2:CreateKeyPair",
                "ec2:DeleteKeyPair",
                "ec2:ImportKeyPair",
                "ec2:Describe*",
                "ec2:CreateTags",
                "ec2:RequestSpotInstances",
                "ec2:CancelSpotInstanceRequests",
                "ec2:DescribeSubnets",
                "ec2:AssociateIamInstanceProfile"
            ],
            "Effect": "Allow",
            "Resource": "*"
        },
        {
            "Action": [
                "iam:PassRole"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "<use your EIP on bastion>" # It means that you can attach role to resource only from IP that belongs to bastion instance.
                }
            }
        }
    ]
}

Next step, you need to create IAM Role for your runners. Your permissions will depend on your CI/CD process.
Below, permission for our runners with our CI/CD workflow:

{
  # This permission we are using for UpdateService in deploy stage in our CI/CD process.
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecs:UpdateService"
            ],
            "Resource": [
                "arn:aws:ecs:region:account-id:service/name_of_cluster/name_of_service",
                "arn:aws:ecs:region:account-id:service/name_of_cluster/name_of_service"
            ]
        }
    ]
}

Policy to push images to ECR push our image to ECR repositories:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ecr:CompleteLayerUpload",
                "ecr:UploadLayerPart",
                "ecr:InitiateLayerUpload",
                "ecr:BatchCheckLayerAvailability",
                "ecr:PutImage",
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage"
            ],
            "Resource": [
                "arn:aws:ecr:region:account-id:repository/name_of_repo",
                "arn:aws:ecr:region:account-id:repository/name_of_repo"
            ]
        },
        {
            "Action": [
                "ecr:GetAuthorizationToken"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

Policy to Describe Target Health on our ELB:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "elasticloadbalancing:DescribeTargetHealth",
            "Resource": "*"
        }
    ]
}

Getting it all together

Here’s the full example of /etc/gitlab-runner/config.toml:

concurrent = 10
check_interval = 0

[[runners]]
  name = "gitlab-aws-autoscaler"
  url = "<URL of your GitLab instance>"
  token = "<Runner's token>"
  executor = "docker+machine"
  limit = 20
  [runners.cache]
    [runners.cache.s3]
      BucketName = "<the bucket where your cache should be kept>"
      BucketLocation = "us-east-1"
  [runners.docker]
    tls_verify = false
    image = "alpine"
    privileged = false
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = true
    shm_size = 0
  [runners.machine]
    IdleCount = 0
    IdleTime = 120
    MaxBuilds = 100
    MachineDriver = "amazonec2"
    MachineName = "gitlab-docker-machine-%s"
    MachineOptions = [
      "amazonec2-iam-instance-profile=nameOfRole",
      "amazonec2-region=XXXXX", 
      "amazonec2-vpc-id=vpc-XXXXX", 
      "amazonec2-subnet-id=subnet-XXXXX", 
      "amazonec2-zone=a", 
      "amazonec2-tags=runner-manager-name,gitlab-aws-autoscaler,gitlab,true,gitlab-runner-autoscale,true", 
      "amazonec2-security-group=XXXXX", 
      "amazonec2-instance-type=t3.medium", 
      "amazonec2-request-spot-instance=true", 
      "amazonec2-spot-price=0.03"
      ]

Read the official documentation about Autoscaling GitLab Runner on AWS.

License

Copyright © 2015-2022 Codica. It is released under the MIT License.

About Codica

Codica logo

The names and logos for Codica are trademarks of Codica.

We love open source software! See our other projects or hire us to design, develop, and grow your product.