Intermediate Databricks Course Outline
Length: 3 Days
Module 1: Advanced Data Engineering with Delta Lake
Delta Lake architecture deep dive: transaction log, time travel, and ACID guarantees
Schema enforcement vs. schema evolution strategies
Optimizing Delta tables: OPTIMIZE, ZORDER, and VACUUM operations (see the maintenance sketch after this module)
Handling slowly changing dimensions (SCD Types 1, 2, and 3); a MERGE sketch follows below
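A minimal sketch of the maintenance and time-travel commands above, as run from a Databricks notebook; the table name events and the column event_date are illustrative placeholders.

    # Assumes a notebook attached to a cluster with an existing Delta table named `events`
    spark.sql("OPTIMIZE events ZORDER BY (event_date)")         # compact small files, co-locate by a common filter column
    spark.sql("VACUUM events RETAIN 168 HOURS")                  # remove unreferenced files past the 7-day retention window
    df_v0 = spark.sql("SELECT * FROM events VERSION AS OF 0")    # time travel via the transaction log
    display(df_v0)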
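And a sketch of a MERGE-based Type 1 upsert with the DeltaTable API; table and column names are placeholders. Type 2 additionally closes out the current row (end date or current flag) before inserting the new version.

    from delta.tables import DeltaTable

    target = DeltaTable.forName(spark, "dim_customer")           # placeholder dimension table
    updates = spark.table("staging_customer_updates")            # placeholder staging table

    (target.alias("t")
       .merge(updates.alias("s"), "t.customer_id = s.customer_id")
       .whenMatchedUpdate(set={"email": "s.email", "segment": "s.segment"})  # Type 1: overwrite in place
       .whenNotMatchedInsertAll()
       .execute())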
Module 2: Delta Live Tables & Pipeline Architecture
Change Data Capture (CDC) with Delta Lake
Delta Live Tables: declarative pipelines and expectations
Medallion architecture implementation (bronze/silver/gold)
Lab: Build an end-to-end Delta Live Tables pipeline with data quality constraints
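A minimal sketch of the kind of pipeline the lab builds, assuming it runs inside a Delta Live Tables pipeline rather than an interactive cluster; the storage path, table names, and expectation rules are illustrative placeholders.

    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Bronze: raw orders ingested as-is")
    def orders_bronze():
        # Auto Loader incrementally picks up new files from cloud storage
        return (spark.readStream.format("cloudFiles")
                .option("cloudFiles.format", "json")
                .load("/Volumes/demo/raw/orders"))               # placeholder path

    @dlt.table(comment="Silver: validated orders")
    @dlt.expect_or_drop("valid_amount", "amount > 0")            # rows failing this are dropped
    @dlt.expect("has_customer", "customer_id IS NOT NULL")       # tracked in metrics, not enforced
    def orders_silver():
        return dlt.read_stream("orders_bronze").withColumn("ingested_at", F.current_timestamp())

    @dlt.table(comment="Gold: daily revenue")
    def daily_revenue_gold():
        return (dlt.read("orders_silver")
                .groupBy("order_date")
                .agg(F.sum("amount").alias("revenue")))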
Module 3: Spark Performance Fundamentals
Spark UI deep dive: understanding jobs, stages, and tasks
Identifying and resolving data skew
Broadcast joins vs. shuffle hash joins vs. sort-merge joins
Adaptive Query Execution (AQE) tuning
Partition pruning and predicate pushdown
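A sketch of the join-strategy and AQE knobs above, run from a notebook; the table names and the 64 MB broadcast threshold are arbitrary choices for illustration.

    from pyspark.sql import functions as F

    # Adaptive Query Execution: re-plan at runtime using shuffle statistics
    spark.conf.set("spark.sql.adaptive.enabled", "true")
    spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")                   # split skewed shuffle partitions
    spark.conf.set("spark.sql.autoBroadcastJoinThreshold", str(64 * 1024 * 1024))   # broadcast small tables up to 64 MB

    facts = spark.table("sales_facts")    # large fact table (placeholder)
    dims = spark.table("store_dim")       # small dimension table (placeholder)

    # Broadcast hint: ship the small side to every executor, avoiding a shuffle of the fact table
    joined = facts.join(F.broadcast(dims), "store_id")

    # Filtering on the partition column lets Spark prune files before reading them
    recent = joined.where(F.col("sale_date") >= "2024-01-01")
    recent.explain(True)   # look for BroadcastHashJoin and PartitionFilters in the plan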
Module 4: Cluster & Code Optimization
Cluster configuration: worker types, autoscaling policies, spot instances
Caching strategies and when to use them
UDFs: performance implications and alternatives (pandas UDFs, vectorized operations)
Lab: Performance tune a slow-running job using Spark UI diagnostics
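A sketch contrasting a row-at-a-time Python UDF with a vectorized pandas UDF, plus the built-in expression that usually beats both, and a cache that only pays off when the result is reused; the column names and sizes are placeholders.

    import pandas as pd
    from pyspark.sql import functions as F
    from pyspark.sql.functions import pandas_udf, udf
    from pyspark.sql.types import DoubleType

    df = spark.range(1_000_000).withColumn("amount", F.rand() * 100)

    @udf(returnType=DoubleType())            # row-at-a-time: one Python call per row
    def add_tax_slow(amount):
        return amount * 1.2

    @pandas_udf(DoubleType())                # vectorized: whole Arrow batches as pandas Series
    def add_tax_fast(amount: pd.Series) -> pd.Series:
        return amount * 1.2

    # Built-in column expressions avoid Python entirely and are usually fastest
    with_tax = df.withColumn("with_tax", F.col("amount") * 1.2)

    # Cache only when the same intermediate result is reused several times
    with_tax.cache()
    with_tax.count()   # first action materializes the cache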
Module 5: Orchestration & DevOps
Databricks Workflows: jobs, tasks, and dependencies
Parameterized notebooks (see the widgets sketch after this module); job clusters vs. all-purpose clusters
CI/CD patterns: Repos integration, testing strategies, promotion workflows
Unity Catalog fundamentals: metastore, catalogs, schemas, and governance
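A sketch of a parameterized notebook using widgets; the parameter names and the environment-specific catalog naming are placeholders, and in a Workflows job the same values can be supplied as task parameters.

    from pyspark.sql import functions as F

    dbutils.widgets.text("run_date", "2024-01-01", "Run date")
    dbutils.widgets.dropdown("env", "dev", ["dev", "staging", "prod"], "Environment")

    run_date = dbutils.widgets.get("run_date")
    env = dbutils.widgets.get("env")

    # Unity Catalog three-level namespace, switched per environment (placeholder convention)
    source_table = f"{env}_catalog.sales.orders"
    df = spark.table(source_table).where(F.col("order_date") == run_date)
    print(f"{df.count()} rows for {run_date} from {source_table}")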
Module 6: Security, Governance & Production Readiness
Row-level and column-level security with Unity Catalog
Secret management and secure credential handling
Monitoring and alerting: job notifications, query history, audit logs
Lab: Deploy a production-ready pipeline with Unity Catalog governance and scheduled workflows
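A sketch of the secret handling and Unity Catalog governance pieces the lab touches; the secret scope, JDBC endpoint, group, and table names are all placeholders, and the dynamic view is one common row-level pattern rather than the only option.

    # Secrets resolve at runtime and are redacted in notebook output
    jdbc_password = dbutils.secrets.get(scope="prod-kv", key="warehouse-password")   # placeholder scope/key

    external = (spark.read.format("jdbc")
                .option("url", "jdbc:postgresql://db-host:5432/analytics")           # placeholder endpoint
                .option("dbtable", "public.customers")
                .option("user", "etl_user")
                .option("password", jdbc_password)
                .load())

    # Unity Catalog governance: least-privilege grants expressed in SQL
    spark.sql("GRANT SELECT ON TABLE main.sales.orders TO `data_analysts`")

    # Row-level filtering via a dynamic view keyed on group membership
    spark.sql("""
        CREATE OR REPLACE VIEW main.sales.orders_emea AS
        SELECT * FROM main.sales.orders
        WHERE is_account_group_member('admins') OR region = 'EMEA'
    """)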
Prerequisites
Basic Spark/PySpark experience
SQL proficiency
Familiarity with Databricks notebooks and basic cluster operations