This course will help you understand the role and responsibilities of a Data Engineer.
It covers Azure Data Factory, Databricks, ADLS, Spark, SQL, and PySpark.
Welcome to the world of Data Engineering!
Contents of the course:
Introduction to Data and Big Data
Distributed Data Processing
Overview of Hadoop and its demerits
Azure Cloud Intro
Azure Account setup
Create ADLS, Subscription, RG, ADF, Databricks
ADF walkthrough
Create IR, Linked Service, Dataset, Pipeline
Usage of different pipeline activities
Intro to data flows
Monitor the pipeline runs
Global parameters, parameterisation
Intro to Databricks
Walkthrough of Databricks
Create a mount point in Databricks using a secret scope
Ingest, transform, and write data into different formats and tables
What are Delta and Parquet
Introduction to Unity Catalog
Create notebooks
Run notebooks from other notebooks
Transformations and actions
RDD, DataFrame, Dataset
Cache vs persist
Shuffle vs partition
Joins, predicate pushdown, broadcast joins
Spark architecture
Submitting Spark jobs
Thank you!