Data Engineering on Microsoft Azure Cloud Platform (3 Months)
Course Overview: This course provides hands-on training in data engineering using Microsoft Azure. You'll learn key tools and technologies, including SQL, Python, Azure Data Factory, Azure Databricks, Spark, and PySpark. The course is project-driven and prepares you for real-world data engineering roles.
Course Structure:
- Duration: 1.5-hour sessions on Saturdays and Sundays
- Total Sessions: 12 sessions per month (including a weekly 1.5-hour doubt-clearing session on Fridays)
- Project-Based Learning: All sessions include practical, project-driven activities.
Modules:
- SQL & Python (Basic to Advanced) – 4 Sessions
  Learn SQL queries, advanced concepts, and Python for data engineering.
- Azure & Big Data Services – 1 Session
  Introduction to Azure and key Big Data services (Azure Data Lake, Synapse).
- Azure DevOps Setup – 1 Session
  Overview of Azure DevOps tools for CI/CD and project management.
- Azure Data Factory (ADF) – 2 Sessions
  Set up ADF pipelines, data flows, and triggers for ETL tasks.
- Azure Databricks – 2 Sessions
  Set up and use Databricks for data processing and analytics.
- Spark Architecture & APIs – 1 Session
  Understand Spark's architecture and APIs for big data processing.
- PySpark – 4 Sessions
  Work with RDDs, DataFrames, and optimization techniques in PySpark.
- Final Project & Interview Readiness – 2 Sessions
  Complete a capstone project and prepare for data engineering interviews.
Additional Details:
- Doubt-Clearing Sessions: Weekly 1.5-hour sessions on Fridays.
- Azure Subscription: Students need their own Azure subscription for hands-on labs (costs borne by the student).
- Project-Driven Learning: From the second session onward, all activities focus on practical, real-world projects.