Data Engineering with Databricks v2  [draft]

00 - General
  Databricks documentation, link
  Data Engineering With Databricks, GitHub

01 - Databricks Workspace and Services
  Databricks Architecture and Services
    Databricks Control Plane
      Web Application
      Databricks SQL
      Databricks Machine Learning
      Databricks Data Science and Engineering
      Repos / Notebooks
      Job Scheduling
      Cluster Management
  Cluster
    Clusters are made up of one or more virtual machine (VM) instances.
    Driver node. Coordinates the activities of the executors, aka master node in EMR.
    Executor node. Runs the tasks composing a Spark job, aka run node in EMR....

March 7, 2023 · 30 min · 6268 words · Eric