This repository has been archived by the owner on Aug 27, 2024. It is now read-only.


snowplow-terraform-google-bigquery-loader-ce

NOTE: This module has been adopted by the official Snowplow DevOps organization and is now archived. Please use the official module.

A Terraform module which deploys the Snowplow BigQuery Loader, Mutator and Repeater apps on Google Cloud Compute Engine. To use a custom image for this deployment, ensure it is based on Ubuntu 20.04.
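
As a rough illustration of how the module might be called, the sketch below sets only the inputs marked as required in the Inputs table. The `source` path, the module label `bigquery_loader` and every value are hypothetical placeholders; in particular, the `projects/<project>/topics/<name>` form of `enriched_topic_id` is an assumption about what the module expects.

```hcl
# Minimal sketch: only the required inputs are set; everything else uses its default.
module "bigquery_loader" {
  # Hypothetical local path; point this at wherever the module source actually lives.
  source = "./modules/bigquery-loader-ce"

  # Placeholder values, not real resources.
  project_id = "my-gcp-project"
  region     = "europe-west2"
  zone       = "europe-west2-a"
  network    = "default"

  # Assumed to be a full Pub/Sub topic ID; adjust to match your enriched topic.
  enriched_topic_id = "projects/my-gcp-project/topics/enriched"
}
```

With the defaults left in place, this would launch all three apps (Stream Loader, Repeater and Mutator) on an `e2-micro` instance with resources prefixed `loader`.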

Requirements

| Name | Version |
|------|---------|
| terraform | >= 0.15 |
| google | >= 3.50.0 |

Providers

| Name | Version |
|------|---------|
| google | >= 3.50.0 |
| local | n/a |

Modules

| Name | Source | Version |
|------|--------|---------|
| telemetry | snowplow-devops/telemetry/snowplow | 0.2.0 |

Resources

| Name | Type |
|------|------|
| google_bigquery_dataset.snowplow | resource |
| google_bigquery_dataset_iam_member.sa_bigquery_dataset_editor | resource |
| google_compute_instance_from_template.snowplow_bq_app | resource |
| google_compute_instance_template.tpl | resource |
| google_project_iam_member.sa_logging_log_writer | resource |
| google_project_iam_member.sa_pubsub_publisher | resource |
| google_project_iam_member.sa_pubsub_subscriber | resource |
| google_project_iam_member.sa_pubsub_viewer | resource |
| google_project_iam_member.sa_storage_object_viewer | resource |
| google_pubsub_subscription.failed_insert_subscription | resource |
| google_pubsub_subscription.input_subscription | resource |
| google_pubsub_subscription.types_subscription | resource |
| google_pubsub_topic.bad_types_topic | resource |
| google_pubsub_topic.failed_insert_topic | resource |
| google_pubsub_topic.types_topic | resource |
| google_service_account.sa | resource |
| google_storage_bucket.dead_letter | resource |
| google_storage_bucket_iam_binding.binding | resource |
| local_file.config | resource |
| local_file.resolver | resource |
| local_file.startup_scripts | resource |
| google_compute_image.ubuntu_20_04 | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| associate_public_ip_address | Whether to assign a public IP address to this instance; if false, this instance must be behind a Cloud NAT to connect to the internet | `bool` | `true` | no |
| dataset_config | The dataset in which to load the Snowplow events. Created by default. | `object({ name = string, create = bool })` | `{ "create": true, "name": "snowplow" }` | no |
| enriched_topic_id | The Pub/Sub topic to read enriched messages from. | `string` | n/a | yes |
| gcp_logs_enabled | Whether application logs should be reported to GCP Logging | `bool` | `true` | no |
| images | The Docker images, with version tag, to deploy on Compute Engine instances. See here for details: https://docs.snowplowanalytics.com/docs/pipeline-components-and-applications/loaders-storage-targets/bigquery-loader/ The default is to launch all three apps: 'Stream Loader', 'Mutator' and 'Repeater'. | `list(string)` | `["snowplow/snowplow-bigquery-streamloader:1.3.2", "snowplow/snowplow-bigquery-repeater:1.3.2", "snowplow/snowplow-bigquery-mutator:1.3.2"]` | no |
| labels | The labels to append to this resource | `map(string)` | `{}` | no |
| machine_type | The machine type to use | `string` | `"e2-micro"` | no |
| name | Will be prefixed to all resource names. Use to easily identify the resources created. | `string` | `"loader"` | no |
| network | The name of the network to deploy within. | `string` | n/a | yes |
| project_id | The GCP project ID. | `string` | n/a | yes |
| region | The name of the region to deploy within. | `string` | n/a | yes |
| ssh_block_project_keys | Whether to block project-wide SSH keys | `bool` | `true` | no |
| ssh_ip_allowlist | The list of CIDR ranges to allow SSH traffic from | `list(any)` | `[""]` | no |
| ssh_key_pairs | The list of SSH key-pairs to add to the servers | `list(object({ user_name = string, public_key = string }))` | `[]` | no |
| subnetwork | The name of the sub-network to deploy within; if populated, it overrides the 'network' setting. | `string` | `""` | no |
| table_config | The table in which to load the Snowplow events. Created by default. | `object({ name = string, load_timestamp_column = string, load_timestamp_column_partition = string })` | `{ "load_timestamp_column": "load_tstamp", "load_timestamp_column_partition": null, "name": "events" }` | no |
| tags | The tags to apply to the created resources. | `list(string)` | `[]` | no |
| telemetry_enabled | Whether or not to send telemetry information back to Snowplow Analytics Ltd | `bool` | `true` | no |
| ubuntu_20_04_source_image | The source image to use, which must be based on Ubuntu 20.04; by default the latest community version is used | `string` | `""` | no |
| user_provided_id | An optional unique identifier to identify the telemetry events emitted by this stack | `string` | `""` | no |
| zone | The zone in which to deploy the instances. | `any` | n/a | yes |
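
The optional inputs above can be overridden in the same module block. The following is a sketch only: the `source` path, the module label and all values are hypothetical. It assumes that `create = false` in `dataset_config` makes the module use an existing dataset rather than create one, which is inferred from the input description and the `created_or_provided_dataset_id` output name, not confirmed here.

```hcl
# Sketch of overriding optional inputs; all names and values are placeholders.
module "bigquery_loader_custom" {
  source = "./modules/bigquery-loader-ce" # hypothetical path

  # Required inputs.
  project_id        = "my-gcp-project"
  region            = "europe-west2"
  zone              = "europe-west2-a"
  network           = "default"
  enriched_topic_id = "projects/my-gcp-project/topics/enriched" # assumed ID format

  # Simple overrides of scalar defaults.
  name         = "bq-loader"
  machine_type = "e2-small"

  # Object-typed inputs: every attribute of the declared object type is supplied.
  dataset_config = {
    name   = "snowplow_events"
    create = false # assumption: load into a dataset that already exists
  }

  table_config = {
    name                            = "events"
    load_timestamp_column           = "load_tstamp"
    load_timestamp_column_partition = null
  }

  # Restrict SSH to a documentation-only CIDR range and add one key pair.
  ssh_ip_allowlist = ["203.0.113.0/24"]
  ssh_key_pairs = [
    {
      user_name  = "ubuntu"
      public_key = "REPLACE_WITH_PUBLIC_KEY"
    }
  ]

  labels            = { team = "data" }
  telemetry_enabled = false
}
```

Because `dataset_config` and `table_config` are declared as plain object types, partially specified objects would likely be rejected, so the sketch repeats the default attributes it does not change.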

Outputs

| Name | Description |
|------|-------------|
| bad_types_topic_name | n/a |
| bq_loader_apps | The compute instances created. |
| created_or_provided_dataset_id | n/a |
| created_or_provided_table_id | n/a |
| enriched_topic_name | n/a |
| failed_inserts_sub_name | n/a |
| failed_inserts_topic_name | n/a |
| input_sub_name | n/a |
| types_sub_name | n/a |
| types_topic_name | n/a |
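
These outputs can be re-exported from the calling configuration or passed to other resources. The sketch below assumes the module has been instantiated under the hypothetical label `bigquery_loader`; the root-level output names are illustrative.

```hcl
# Re-export selected module outputs from the root configuration.
# Assumes a module block labelled "bigquery_loader" exists in this configuration.
output "snowplow_dataset_id" {
  value = module.bigquery_loader.created_or_provided_dataset_id
}

output "snowplow_table_id" {
  value = module.bigquery_loader.created_or_provided_table_id
}

output "snowplow_failed_inserts_topic" {
  value = module.bigquery_loader.failed_inserts_topic_name
}
```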

Copyright and license

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this software except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.