dataproc
Here are 88 public repositories matching this topic...
Performance Observability for Apache Spark
Updated Dec 17, 2024 - TypeScript
Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning
Updated Dec 10, 2024 - HTML
Ephemeral Hadoop clusters using Google Cloud Platform
Updated Mar 31, 2022 - Java
Solution Accelerators for Serverless Spark on GCP, the industry's first auto-scaling and serverless Spark as a service
Updated May 3, 2024 - Python
EtlFlow is an ecosystem of functional libraries in Scala, based on ZIO, for running complex, auditable workflows that can interact with Google Cloud Platform, AWS, Kubernetes, databases, SFTP servers, on-prem systems, and more.
Updated Aug 26, 2024 - Scala
Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and pipelines.
Updated Mar 20, 2023 - Python
Data pipeline for the Global Historical Climatology Network dataset
Updated Dec 20, 2022 - Jupyter Notebook
Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag
Updated Sep 19, 2022 - Python
Creating an Inverted Index of words occurring in a large set of documents extracted from web pages using Hadoop MapReduce and Google Dataproc
Updated Oct 28, 2019 - Java
E-commerce GCP streaming pipeline: Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery, and Tableau. GCP batch pipeline: Cloud Storage, Dataproc, PySpark, Cloud Spanner, and Tableau.
Updated Mar 9, 2022 - Python
A search engine for querying politically themed social media insights
Updated Sep 22, 2021 - Jupyter Notebook
GCP_Data_Enginner
Updated Sep 4, 2021 - Shell
An educational project to build an end-to-end pipeline for near real-time and batch processing of data, which is then used for visualisation and a machine learning model.
Updated May 19, 2023 - Python