A repository for reinventing a Data Engineering career
Comments from @Sumit Mittal on what to cover in a Data Engineering career:
- Distributed Storage Fundamentals
- Distributed Processing Fundamentals
- Apache Spark
- Azure Databricks
- Azure Data Factory (for ingestion)
- Azure Synapse
- Data Modeling
- System design
- Deployment (CI/CD)
- Extensive performance tuning
- Multi-cloud (Azure and AWS are good options)
- AWS services: EMR, Redshift, Athena, Glue, Lambda, S3
- Spark Structured Streaming
- Kafka
- Lakehouse
- Open file formats