To manage the Terraform state file properly, create a resource group in Azure with a storage account to hold the state. The Terraform configuration references this storage account in `terraform/backend.tf`. If you have created a new resource group and storage account, you must change the names in `terraform/backend.tf` to match the names of the resources you created.
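As a quick sanity check before running Terraform, the names in the backend file can be compared against the resources you created. The sketch below is only an illustration: the sample backend block and every name in it (`rg-tfstate-example`, `sttfstateexample`, `tfstate`) are hypothetical, and the regular expression only handles simple `name = "value"` lines, not full HCL.

```python
import re


def backend_settings(hcl_text: str) -> dict[str, str]:
    """Collect simple `name = "value"` assignments from backend.tf text."""
    pairs = re.findall(r'^\s*(\w+)\s*=\s*"([^"]*)"', hcl_text, flags=re.MULTILINE)
    return dict(pairs)


def mismatches(settings: dict[str, str], expected: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (found, expected)} for every value that differs."""
    return {
        key: (settings.get(key), value)
        for key, value in expected.items()
        if settings.get(key) != value
    }


# Hypothetical backend.tf content; replace with the text of your own file.
SAMPLE_BACKEND = '''
terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate-example"
    storage_account_name = "sttfstateexample"
    container_name       = "tfstate"
    key                  = "terraform.tfstate"
  }
}
'''

settings = backend_settings(SAMPLE_BACKEND)
# Hypothetical names of the resources created in Azure.
expected = {
    "resource_group_name": "rg-tfstate-example",
    "storage_account_name": "sttfstateexample",
}
print(mismatches(settings, expected) or "backend.tf matches the created resources")
```

To check a real checkout, read the file with `pathlib.Path("terraform/backend.tf").read_text()` and pass the result to `backend_settings`.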
- az login
- terraform init
- terraform plan
- terraform apply
To publish the API Docker image to the Azure Container Registry (ACR), add the three ACR variables listed below to the GitHub repository. Their values can be found in the Azure portal by navigating to the container registry. A publish profile from the Azure Function is also required in order to publish the function.
- ACR_PASSWORD
- ACR_USERNAME
- ARC_SERVER (the full server name of the ACR, including the .azurecr.io suffix)
- AZURE_FUNCTIONAPP_PUBLISH_PROFILE (found by navigating to the Azure Function in the portal and clicking 'publish profile')
Navigate to the Actions tab in GitHub and check whether the workflow that publishes an image to the Azure Container Registry has run successfully. If it has not, rerun it to publish the image to the registry. Also run the action that publishes the Azure Function.
- Create a file called `api/AquaApi/appsettings.Development.json`. This file must be listed in .gitignore.
- Add a `ConnectionStrings:BlobStorage` variable containing the storage connection string to the appsettings file.
- The connection string must be in the format: `DefaultEndpointsProtocol=https;AccountName=<your-storage-account>;AccountKey=<storage-key>;EndpointSuffix=core.windows.net`
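A misspelled key in the connection string is easy to miss, so it can help to check the string before pasting it into the appsettings file. The following is a minimal sketch, not part of the project: the helper names are hypothetical, and it assumes the four keys Azure Storage uses in this format (`DefaultEndpointsProtocol`, `AccountName`, `AccountKey`, `EndpointSuffix`).

```python
REQUIRED_KEYS = {
    "DefaultEndpointsProtocol",
    "AccountName",
    "AccountKey",
    "EndpointSuffix",
}


def parse_connection_string(value: str) -> dict[str, str]:
    """Split 'Key=Value;Key=Value' into a dict, ignoring stray quotes."""
    parts = {}
    for segment in value.strip().strip('"').split(";"):
        if not segment:
            continue
        # Split on the first '=' only: account keys are base64 and may end in '='.
        key, separator, val = segment.partition("=")
        if not separator:
            raise ValueError(f"segment without '=': {segment!r}")
        parts[key.strip()] = val
    return parts


def missing_keys(value: str) -> set[str]:
    """Return the required keys that are absent from the connection string."""
    return REQUIRED_KEYS - set(parse_connection_string(value))


# Placeholder values only; never commit a real account key.
example = (
    "DefaultEndpointsProtocol=https;AccountName=examplestorage;"
    "AccountKey=abc123==;EndpointSuffix=core.windows.net"
)
print(missing_keys(example) or "all required keys present")
```

An empty result from `missing_keys` only confirms the shape of the string; it does not verify that the account name or key is valid.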
This project is a template for a data platform designed for small and medium-large aquaculture companies in Norway. It leverages the power of Azure, Databricks, and Terraform to provide a scalable and efficient solution for managing and transforming data. The platform follows the Medallion Architecture and fetches data daily from api.havvarsel.no.
- Azure Databricks Architecture: This platform is built on the Azure Databricks architecture, providing a unified analytics framework that integrates with Azure services. You can learn more in the official Azure Databricks architecture documentation.
- Medallion Architecture: The platform transforms data following the Medallion Architecture, enabling efficient and scalable data processing. You can learn more in the Databricks documentation on the Medallion Architecture.
- Data Source: Every day, new data is fetched from the public API at api.havvarsel.no.
- Transformation: The raw data is transformed using Databricks' Medallion Architecture, processing it in stages (Bronze, Silver, Gold).
- Reporting: The transformed data is available through a Power BI report, which can be accessed here. (Add link to your Power BI report)
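The Bronze, Silver, and Gold stages can be illustrated with a small plain-Python sketch. This is not the project's Databricks code: the record fields (`site`, `temperature`), the sample values, and the function names are all hypothetical and do not reflect the actual response format of api.havvarsel.no.

```python
from collections import defaultdict
from statistics import mean


def bronze(raw_rows, load_date):
    # Bronze: keep rows unchanged, adding only ingestion metadata.
    return [{**row, "load_date": load_date} for row in raw_rows]


def silver(bronze_rows):
    # Silver: drop rows with missing measurements and cast values to float.
    cleaned = []
    for row in bronze_rows:
        if row.get("temperature") is None:
            continue
        cleaned.append(
            {
                "site": row["site"],
                "temperature": float(row["temperature"]),
                "load_date": row["load_date"],
            }
        )
    return cleaned


def gold(silver_rows):
    # Gold: aggregate to one average temperature per site for reporting.
    by_site = defaultdict(list)
    for row in silver_rows:
        by_site[row["site"]].append(row["temperature"])
    return {site: mean(values) for site, values in by_site.items()}


# Hypothetical raw records standing in for one daily fetch.
raw = [
    {"site": "A", "temperature": "7.5"},
    {"site": "A", "temperature": "8.5"},
    {"site": "B", "temperature": None},
    {"site": "B", "temperature": "6.0"},
]

report = gold(silver(bronze(raw, "2024-01-01")))
print(report)  # {'A': 8.0, 'B': 6.0}
```

Each stage reads only the output of the previous one, so the raw data stays available in Bronze if the cleaning or aggregation rules change later.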
This project uses Terraform to automate the deployment of infrastructure and resources on Azure. To get started with setting up the platform, follow the official Terraform documentation for setting up a new project.
- Azure Subscription
- Databricks Workspace
- Terraform Installed
- Power BI for reporting
- Clone this repository.
- Follow the Terraform setup instructions.
- Use Databricks to manage data transformations.
- Access the Power BI report to view insights.
For any questions, please contact [Your Name/Your Company].