Azure Databricks and Data Factory Course Content including Python:
Demo:
Understanding Big Data and Data Lakes
- Course Introduction
- What are Big Data and Hadoop?
- What is a Data Lake and how does it work?
Introduction to Data Engineering and Databricks
- Introduction to Data Engineering
- From Apache Spark to a Data Engineering Platform
- Introduction to Apache Spark and Databricks Cloud
Python Course:
Course Content:
- Introduction
- Variables & Data Types
- Control Flow & Loops
- Functions
- Python Lists, Tuples and Sets
- Python Dictionaries
- Comprehensions
- More On Functions
- Exception Handling
- Modules and Packages
- Object-Oriented Programming (OOP) in Python
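Sample exercise (illustrative only, not taken from the course material): a minimal sketch of the kind of code the Functions, Exception Handling, Dictionaries and Comprehensions topics lead up to. The data and the `parse_score` helper are hypothetical names chosen for this example.

```python
# Hypothetical sample data; names and values are made up for illustration.
raw_scores = {"asha": "82", "ben": "n/a", "chen": "95"}


def parse_score(text):
    """Convert a score string to int, returning None when it is not numeric."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for non-numeric strings such as "n/a".
        return None


# Dictionary comprehension: parse every raw value.
parsed_scores = {name: parse_score(text) for name, text in raw_scores.items()}

# Dictionary comprehension with a filter: drop entries that failed to parse.
valid_scores = {
    name: score for name, score in parsed_scores.items() if score is not None
}

# List comprehension: names of students scoring above 90.
top_students = [name for name, score in valid_scores.items() if score > 90]

print(valid_scores)   # {'asha': 82, 'chen': 95}
print(top_students)   # ['chen']
```

Returning `None` for bad input keeps the comprehension simple; raising a custom exception instead would be an equally reasonable design for stricter pipelines.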
Spark SQL Course:
Course Content:
- Introduction
- SQL DDL, DRL and DML Operations
- SQL Aggregate Functions
- GROUP BY, HAVING, ORDER BY and Related Clauses
- SQL Joins
- Window Functions
PySpark Programming and Databricks Course:
Getting Started with Spark
- Creating Your First Spark Application on Databricks Cloud
- Introduction to Spark DataFrames and Spark Tables
- Creating Spark DataFrames
- Creating Spark Tables
- Working with Spark SQL
- DataFrame Transformations and Actions
- Applying Transformations
- Querying Spark DataFrames
- More DataFrame Transformations
Spark Execution Model and Architecture
- Spark Transformations and Actions
- Spark Jobs, Stages and Tasks
- Understanding your Execution Plan
Spark Data Sources, Sinks and Transformations
- Spark Data Sources and Sinks
- Reading CSV, JSON and Parquet Files
- Creating a Spark DataFrame Schema
- Writing Your Data and Managing Layout
- Working with Spark SQL Tables
- Introduction to Data Transformations
- Working with DataFrame Rows and Columns
- Creating and Using UDFs
- Miscellaneous Transformations
Aggregations and Joins in Spark
- Aggregating DataFrames
- Grouping Aggregations
- Windowing Aggregations
- DataFrame Joins and Column Name Ambiguity
- Optimizing your Joins
Getting Started with Databricks Premium Account
- Create Azure Cloud Account and Portal Overview
- Create Azure Databricks Workspace Service
- Introduction to Databricks Workspace
- Azure Databricks Platform Architecture
Working in Databricks Workspace
- How to Create a Spark Cluster
- Working with Databricks Notebooks
- Notebook Magic Commands
- Databricks Utilities Package
Working with Databricks File System – DBFS
- Introduction to DBFS
- Working with DBFS Root
- Mounting ADLS to DBFS
Working with Unity Catalog
- Introduction to Unity Catalog
- Setting Up Unity Catalog
- Unity Catalog User Provisioning
- Working with Securable Objects
Working with Delta Lake and Delta Tables
- Introduction to Delta Lake
- Creating a Delta Table
- Reading from and Operating on Delta Tables
- Delta Table Time Travel
- Converting Parquet to Delta
- Delta Table Schema Validation and Evolution
- A Look Inside a Delta Table
- Delta Table Utilities and Optimizations
Working with Databricks Incremental Ingestion Tools
- Architecture and Need for Incremental Ingestion
- COPY INTO Ingestion
- Streaming Ingestion
- Auto Loader Ingestion
Databricks Project and Automation Features
- Working with Databricks Repos
- Working with Databricks Workflows
- Working with the Databricks REST API
Azure Data Factory Course:
- Introduction
- Environment Setup
- Data Activities
- Control Flow Activities
- Data Flows
- Triggers
- Monitoring
Mini Project
Databricks Performance Optimization Techniques
Commonly Asked Interview Questions