This module creates resources required to run the GitHub action runner on AWS EC2 spot instances. The life cycle of the runners on AWS is managed by two lambda functions. One function will handle scaling up, the other scaling down.
The action runners are created via a launch template; at launch time only the subnet needs to be provided. During launch the installation is handled by a user data script. The configuration is fetched from the SSM Parameter Store.
The scale up lambda is triggered by events on an SQS queue. Events on this queue are delayed, which gives the workflow some time to start running on available runners. For each event the lambda checks that the workflow is still queued and that no other limits are reached. In that case the lambda creates a new EC2 instance. The lambda only needs to know which launch template to use and which subnets are available. From the available subnets a random one is chosen. Once the instance is created the event is considered handled, and we assume the workflow will start at some point once the created instance is ready.
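The scale up decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the module's actual lambda code: the names `ScaleUpFacts`, `ScaleUpConfig`, `pickRandomSubnet`, and `decideScaleUp` are hypothetical, and the facts (whether the workflow is still queued, how many runners already exist) are assumed to have been fetched from GitHub and AWS beforehand.

```typescript
// Hypothetical types for illustration; the real lambda's types are not shown in this README.
interface ScaleUpFacts {
  workflowQueued: boolean; // is the workflow still waiting for a runner?
  currentRunners: number; // runners already managed by this deployment
}

interface ScaleUpConfig {
  launchTemplateName: string;
  subnetIds: string[];
  runnersMaximumCount: number;
}

type ScaleUpDecision =
  | { launch: true; launchTemplateName: string; subnetId: string }
  | { launch: false; reason: string };

// Pick one subnet at random; `random` is injectable so the choice can be tested.
function pickRandomSubnet(subnetIds: string[], random: () => number = Math.random): string {
  if (subnetIds.length === 0) {
    throw new Error("at least one subnet is required");
  }
  return subnetIds[Math.floor(random() * subnetIds.length)];
}

// Decide whether a new instance should be launched for one queue event.
function decideScaleUp(
  facts: ScaleUpFacts,
  config: ScaleUpConfig,
  random: () => number = Math.random
): ScaleUpDecision {
  if (!facts.workflowQueued) {
    return { launch: false, reason: "workflow is no longer queued" };
  }
  if (facts.currentRunners >= config.runnersMaximumCount) {
    return { launch: false, reason: "maximum number of runners reached" };
  }
  return {
    launch: true,
    launchTemplateName: config.launchTemplateName,
    subnetId: pickRandomSubnet(config.subnetIds, random),
  };
}
```

Keeping the decision free of AWS and GitHub calls is a design choice of this sketch: the caller gathers the facts, and the function itself can be checked without any mocks.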
The scale down lambda is triggered via a CloudWatch event. The event is triggered by a cron expression defined in the variable scale_down_schedule_expression (https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/ScheduledEvents.html). GitHub does not yet provide a good API for scaling down, so scaling down runs on this event every x minutes. Each time the lambda is triggered it tries to remove all runners managed in this deployment that are older than x minutes (configurable). If a runner can be removed from GitHub, which means it is not executing a workflow, the lambda terminates the EC2 instance.
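A minimal sketch of one scale down pass, assuming the calls to GitHub and AWS are injected as functions. The names `RunnerInstance`, `ScaleDownDeps`, `isOlderThan`, and `scaleDownOnce` are hypothetical and chosen for illustration; the actual lambda may be structured differently.

```typescript
// Hypothetical runner record; field names are assumptions for this sketch.
interface RunnerInstance {
  instanceId: string;
  launchTime: Date;
}

// Calls the sketch needs from GitHub and AWS, injected so they can be faked in tests.
interface ScaleDownDeps {
  // Resolves to true when GitHub removed the runner, i.e. it was not executing a workflow.
  removeFromGitHub(instanceId: string): Promise<boolean>;
  terminateInstance(instanceId: string): Promise<void>;
}

// True when the runner has been running longer than the given number of minutes.
function isOlderThan(runner: RunnerInstance, minutes: number, now: Date): boolean {
  const ageInMs = now.getTime() - runner.launchTime.getTime();
  return ageInMs > minutes * 60 * 1000;
}

// One pass: try to deregister every old enough runner, terminate only those GitHub released.
async function scaleDownOnce(
  runners: RunnerInstance[],
  minimumRunningTimeInMinutes: number,
  deps: ScaleDownDeps,
  now: Date = new Date()
): Promise<string[]> {
  const terminated: string[] = [];
  for (const runner of runners) {
    if (!isOlderThan(runner, minimumRunningTimeInMinutes, now)) {
      continue; // too young, keep it
    }
    if (await deps.removeFromGitHub(runner.instanceId)) {
      await deps.terminateInstance(runner.instanceId);
      terminated.push(runner.instanceId);
    }
  }
  return terminated;
}
```

Removing the runner from GitHub before terminating the instance mirrors the order described above: a busy runner cannot be removed, so its instance is left running until a later pass.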
Usage examples are available in the root module. By default the root module will assume local zip files containing the lambda distribution are available. See the download lambda module for more information.
The Lambda functions are written in TypeScript and require Node 12.x and yarn. Sources are located in [./lambdas/runners]. The two lambda functions share the same sources; there is one entry point for scaleDown and another one for scaleUp.
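As a rough sketch of what "shared sources, two entry points" can look like: one module defines both handlers, and each Lambda is configured to call one of them. Only the names `scaleUp` and `scaleDown` come from the text above; `SqsLikeEvent` and `parseMessages` are hypothetical, and the handler bodies are deliberately left out.

```typescript
// Trimmed stand-in for the SQS event a Lambda receives; only `body` is used here.
interface SqsLikeEvent {
  Records: { body: string }[];
}

// Shared helper (hypothetical): decode the JSON body of each queued message.
function parseMessages(event: SqsLikeEvent): unknown[] {
  return event.Records.map((record) => JSON.parse(record.body));
}

// Entry point for the scale up lambda, triggered by the SQS queue.
async function scaleUp(event: SqsLikeEvent): Promise<number> {
  const messages = parseMessages(event);
  // ... decide per message whether to launch an instance (omitted in this sketch)
  return messages.length;
}

// Entry point for the scale down lambda, triggered on a schedule.
async function scaleDown(): Promise<void> {
  // ... look up runners and remove idle ones (omitted in this sketch)
}
```

In a real module both functions would be exported, and each Lambda's handler setting would point at one of them.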
cd lambdas/runners
yarn install
Tests are implemented with Jest; calls to AWS and GitHub are mocked.
yarn run test
ncc is used to compile all TypeScript/JavaScript sources into a single file.
yarn run dist
No requirements.
Name | Version |
---|---|
aws | n/a |
Name | Description | Type | Default | Required |
---|---|---|---|---|
ami_filter | List of maps used to create the AMI filter for the action runner AMI. | map(list(string)) | { | no |
ami_owners | The list of owners used to select the AMI of action runner instances. | list(string) | [ | no |
aws_region | AWS region. | string | n/a | yes |
block_device_mappings | The EC2 instance block device configuration. Takes the following keys: delete_on_termination, volume_type, volume_size, encrypted, iops. | map(string) | {} | no |
enable_organization_runners | n/a | bool | n/a | yes |
encryption | KMS key used to encrypt lambda environment secrets. Either provide a key and set encrypt to true, or set the key to null and encrypt to false. | object({ | n/a | yes |
environment | A name that identifies the environment, used as prefix and for tagging. | string | n/a | yes |
github_app | GitHub app parameters, see your GitHub app. Ensure the key is base64 encoded. | object({ | n/a | yes |
idle_config | List of time periods, defined as cron expressions, during which a minimum number of runners is kept active instead of scaling down to 0. By defining this list you can ensure that in time periods that match the cron expression within 5 seconds a runner is kept idle. | list(object({ | [] | no |
instance_profile_path | The path that will be added to the instance_profile; if not set the environment name will be used. | string | null | no |
instance_type | Default instance type for the action runner. | string | "m5.large" | no |
lambda_timeout_scale_down | Timeout for the scale down lambda in seconds. | number | 60 | no |
lambda_timeout_scale_up | Timeout for the scale up lambda in seconds. | number | 60 | no |
lambda_zip | File location of the lambda zip file. | string | null | no |
logging_retention_in_days | Specifies the number of days you want to retain log events for the lambda log group. Possible values are: 0, 1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545, 731, 1827, and 3653. | number | 7 | no |
market_options | Market options for the action runner instances. | string | "spot" | no |
minimum_running_time_in_minutes | The minimum time an EC2 action runner should be running before it is terminated if not busy. | number | 5 | no |
overrides | This map provides the possibility to override some defaults. The following attributes are supported: name_sg overrides the Name tag for all security groups created by this module. name_runner_agent_instance overrides the Name tag for the EC2 instance defined in the auto launch configuration. name_docker_machine_runners overrides the Name tag of spot instances created by the runner agent. | map(string) | { | no |
role_path | The path that will be added to the role; if not set the environment name will be used. | string | null | no |
role_permissions_boundary | Permissions boundary that will be added to the created role for the lambda. | string | null | no |
runner_architecture | The platform architecture of the runner instance_type. | string | "x64" | no |
runner_as_root | Run the action runner under the root user. | bool | false | no |
runner_extra_labels | Extra labels for the runners (GitHub). Separate each label by a comma. | string | "" | no |
runners_maximum_count | The maximum number of runners that will be created. | number | 3 | no |
s3_bucket_runner_binaries | n/a | object({ | n/a | yes |
s3_location_runner_binaries | S3 location of runner distribution. | string | n/a | yes |
scale_down_schedule_expression | Scheduler expression to check every x for scale down. | string | "cron(*/5 * * * ? *)" | no |
sqs_build_queue | SQS queue to consume accepted build events. | object({ | n/a | yes |
subnet_ids | List of subnets in which the action runners will be launched; the subnets need to be subnets in the vpc_id. | list(string) | n/a | yes |
tags | Map of tags that will be added to created resources. By default resources will be tagged with name and environment. | map(string) | {} | no |
userdata_post_install | User-data script snippet to insert after the GitHub action runner install. | string | "" | no |
userdata_pre_install | User-data script snippet to insert before the GitHub action runner install. | string | "" | no |
vpc_id | The VPC for the security groups. | string | n/a | yes |
Name | Description |
---|---|
lambda_scale_down | n/a |
lambda_scale_up | n/a |
launch_template | n/a |
role_runner | n/a |
role_scale_down | n/a |
role_scale_up | n/a |
This module is part of the Philips Forest.
Talk to the forestkeepers in the forest-channel on Slack.