Data Engineering Roadmap

Below you’ll find the Data Engineer roadmap, a step-by-step guide to becoming a Data Engineer. The roadmap covers topics such as Data Architecture, Data Processing, and Deployment.

When you click on any of the boxes on the map, a sidebar will open showing you a curated list of relevant learning resources. Here you can bookmark your favorite resources, mark articles as complete and add (study) notes. 

⚙️ Data Engineer
Architectural Patterns
Horizontal vs vertical scaling
Map Reduce
Data Replication
Job & Task Tracker
Name & Data Nodes
Hadoop (large data)
Spark (in memory)
Data Architectures
💠 Principles
🔧 Tools
RAPIDS (on GPU)
Hive (Data Warehouse)
Elastic
Google BigQuery
Flink
MLFlow
Kafka
Databases
Cassandra
MongoDB
Scalability
ZooKeeper
Kubernetes
Cloud Services
AWS SageMaker
Google AutoML
Microsoft Azure
Dask
Apache Airflow
Snowflake
Amazon Redshift
Data Formats
Data Discovery
Data Source & Acquisition
Data Integration
Data Fusion
Transformation & Enrichment
OpenRefine
Data Survey
Using ETL
Data Lake
Data Mesh
Neo4j

🔎 Legend

Yellow boxes are subjects to study. Blue boxes are tools to master. Boxes contain links to relevant content.

🎚 Data Processing
Data Warehousing
📦 Deployment