Azure Databricks - Data Engineering With Real-Time Project
Real-Time Project on Retail Data - PySpark, SQL, Delta / Delta Live Tables, Unity Catalog, Auto Loader & Performance Tuning

By completing this course, you will be equipped for the following Data Engineer roles and responsibilities in a real-time project:
• Designing and Configuring Unity Catalog for Better Access Control and Connecting to External Data Stores
• Designing and Developing Databricks (PySpark) Notebooks to Ingest Data from Web (HTTP) Services
• Designing and Developing Databricks (PySpark) Notebooks to Ingest Data from SQL Databases
• Designing and Developing Databricks (PySpark) Notebooks to Ingest Data from API Source Systems
• Designing and Developing Spark SQL External and Managed Tables
• Developing Reusable Databricks Spark SQL Notebooks to Create and Populate Delta Lake Tables
• Developing Databricks SQL Code to Populate Reporting Dimension Tables
• Developing Databricks SQL Code to Populate Reporting SCD Type 2 Dimension Tables
• Developing Databricks SQL Code to Populate Reporting Fact Tables
• Designing and Developing Databricks (PySpark) Notebooks to Process and Flatten Semi-Structured JSON Data Using the EXPLODE Function
• Designing and Developing Databricks (PySpark) Notebooks to Integrate (JOIN) Data and Load It into the Data Lake Gold Layer
• Designing and Developing Databricks (PySpark) Notebooks to Process Semi-Structured JSON Data in the Data Lake Silver Layer
• Designing and Developing Databricks (SQL) Notebooks to Integrate Data and Load It into the Data Lake Gold Layer
• Developing Databricks Jobs to Schedule the Data Ingestion and Transformation Notebooks
• Designing and Configuring Delta Live Tables in All Layers for Seamless Data Integration
• Setting Up Azure Monitor and Log Analytics for Automated Monitoring of Job Runs and Storing Extended Log Details
• Setting Up Azure Key Vault and Configuring Key Vault-Backed Secret Scopes in the Databricks Workspace
• Configuring a GitHub Repository and Creating Git Repo Folders in the Databricks Workspace
• Designing and Configuring CI/CD Pipelines to Release Code into Multiple Environments
• Identifying Performance Bottlenecks and Performing Performance Tuning Using ZORDER BY, BROADCAST JOIN, ADAPTIVE QUERY EXECUTION, DATA SALTING, and LIQUID CLUSTERING