AquaCulture Data Platform

Setup Guide

1. Set up a resource group for the Terraform state

To manage the Terraform state file properly, create a resource group in Azure with a storage account to hold the state. This storage account is then referenced by the backend configuration in terraform/backend.tf. If you have created a new resource group and storage account, you must change the names in terraform/backend.tf to match the names of the resources you created.

2. Run the Terraform configuration to populate the cloud with the required resources

az login
terraform init
terraform plan
terraform apply

3. Add deployment secrets to GitHub Actions

For the API Docker image to be published to the Azure Container Registry (ACR), three variables must be added to the GitHub repository. These variables are listed below and can be found in the Azure portal after navigating to the container registry. In addition, one publish profile from the Azure Function is required in order to publish the function.

ACR_PASSWORD
ACR_USERNAME
ARC_SERVER (the full path to the ACR, including the .azurecr.io suffix)
AZURE_FUNCTIONAPP_PUBLISH_PROFILE (found by navigating to the Azure Function in the portal and clicking 'publish profile')

4. Run the GitHub Actions to publish

Navigate to the Actions tab in GitHub and check whether the workflow that publishes the image to Azure Container Registry has run successfully. If it has not, rerun it to publish the image to the registry.

Also run the action that publishes the Azure Function.

Local Development

Create a file called api/AquaApi/appsettings.Development.json. This file must be listed in .gitignore. Add the storage connection string to it under the key ConnectionStrings:BlobStorage.
The connection string must have the format: DefaultEndpointsProtocol=https;AccountName=<your-storage-account>;AccountKey=<storage-key>;EndpointSuffix=core.windows.net
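
A misspelled key in the connection string is easy to miss until the API fails at startup. The Python sketch below is not part of the repository; it only checks that a string contains the four expected keys (DefaultEndpointsProtocol, AccountName, AccountKey, EndpointSuffix). The names REQUIRED_KEYS, parse_connection_string, and missing_keys are hypothetical helpers introduced for illustration.

```python
# Hypothetical helper, not part of this repository.
REQUIRED_KEYS = (
    "DefaultEndpointsProtocol",
    "AccountName",
    "AccountKey",
    "EndpointSuffix",
)


def parse_connection_string(value):
    """Split 'Key=Value;Key=Value' into a dict.

    Each pair is split on the first '=' only, because storage account
    keys are base64 strings that may themselves end in '='.
    """
    parts = {}
    for segment in value.strip().strip('"').split(";"):
        if not segment:
            continue  # tolerate a trailing ';'
        key, sep, val = segment.partition("=")
        if not sep:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip()] = val.strip()
    return parts


def missing_keys(value):
    """Return the required keys that are absent or empty."""
    parsed = parse_connection_string(value)
    return [key for key in REQUIRED_KEYS if not parsed.get(key)]
```

Calling missing_keys on the string from appsettings.Development.json returns an empty list when all four keys are present, and otherwise lists the ones to fix.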

Overview

This project is a template for a data platform designed for small and medium-sized aquaculture companies in Norway. It uses Azure, Databricks, and Terraform to manage and transform data in a scalable way. The platform follows the Medallion Architecture and fetches data daily from api.havvarsel.no.

Architecture

Azure Databricks Architecture: This platform is built on the Azure Databricks architecture, providing a unified analytics framework that integrates with Azure services. See the Azure Databricks architecture documentation to learn more.
Medallion Architecture: The platform transforms data following the Medallion Architecture, which processes data in successive stages. See the Medallion Architecture documentation to learn more.

Data Flow

Data Source: Every day, new data is fetched from the public API at api.havvarsel.no.
Transformation: The raw data is transformed using Databricks' Medallion Architecture, processing it in stages (Bronze, Silver, Gold).
Reporting: The transformed data is available through a Power BI report, which can be accessed here. (Add link to your Power BI report)
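
As a rough illustration of the three stages, the Python sketch below passes a few in-memory records through Bronze, Silver, and Gold steps. It is not the platform's Databricks code: the field names station, date, and temperature are assumptions and do not reflect the actual api.havvarsel.no schema, the function names are hypothetical, and plain lists stand in for tables.

```python
from collections import defaultdict
from statistics import mean


def to_bronze(raw_records):
    """Bronze: keep raw records unchanged, only tagging their source."""
    return [{**record, "_source": "api.havvarsel.no"} for record in raw_records]


def to_silver(bronze):
    """Silver: drop rows without a temperature and cast fields to fixed types."""
    silver = []
    for row in bronze:
        if row.get("temperature") in (None, ""):
            continue  # incomplete measurement, excluded from cleaned data
        silver.append(
            {
                "station": str(row["station"]),
                "date": str(row["date"]),
                "temperature": float(row["temperature"]),
            }
        )
    return silver


def to_gold(silver):
    """Gold: average temperature per (station, date), ready for reporting."""
    groups = defaultdict(list)
    for row in silver:
        groups[(row["station"], row["date"])].append(row["temperature"])
    return {key: mean(values) for key, values in groups.items()}


# Assumed sample records; the real API payload will look different.
raw = [
    {"station": "A", "date": "2024-01-01", "temperature": "4.0"},
    {"station": "A", "date": "2024-01-01", "temperature": "6.0"},
    {"station": "B", "date": "2024-01-01", "temperature": None},
]
gold = to_gold(to_silver(to_bronze(raw)))
```

Each stage only reads the output of the previous one, so raw data is preserved in Bronze while cleaning rules and aggregations can change independently.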

Deployment

This project uses Terraform to automate the deployment of infrastructure and resources on Azure. To get started with setting up the platform, follow the official Terraform documentation for setting up a new project.

Prerequisites

Azure Subscription
Databricks Workspace
Terraform Installed
Power BI for reporting

How to Use

Clone this repository.
Follow the Terraform setup instructions.
Use Databricks to manage data transformations.
Access the Power BI report to view insights.

Contact

For any questions, please contact [Your Name/Your Company].

