Effective Retrospective [draft]
Reference How to do effective retrospective
I will use it to explain some of the fundamentals we have been discussing and eventually bring them to life in a tutorial series. I will also extend the template with the missing MLOps parts, so tune in! Recap: Data Producers - Python applications that extract data from chosen Data Sources and push it to the Collector via REST or gRPC API calls. Collector - a REST or gRPC server written in Python that takes a payload (JSON or protobuf), validates that the top-level fields exist and are correct, adds additional metadata, and pushes the data into the Raw Events Topic if validation passes or into a Dead Letter Queue if the top-level fields are invalid....
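The Collector flow in the recap above can be sketched roughly as follows. This is a minimal illustration, not the template's actual code: the required field names, the topic names, the metadata keys, and the `publish` callback (a stand-in for a real producer client) are all assumptions made for the example, and only the JSON path is shown.

```python
import json
import time
import uuid

# Hypothetical top-level schema: field name -> expected type (not from the original post).
REQUIRED_FIELDS = {"event_type": str, "schema_version": str, "payload": dict}

RAW_EVENTS_TOPIC = "raw_events"    # assumed topic name
DEAD_LETTER_QUEUE = "dead_letter"  # assumed topic name


def validate_top_level(event):
    """Return a list of problems with the top-level fields (empty if valid)."""
    if not isinstance(event, dict):
        return ["event is not a JSON object"]
    errors = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in event:
            errors.append(f"missing field: {name}")
        elif not isinstance(event[name], expected_type):
            errors.append(f"wrong type for field: {name}")
    return errors


def collect(raw_body, publish):
    """Validate one JSON payload, enrich it, and route it to a topic.

    `publish(topic, message)` is a hypothetical stand-in for a producer client.
    Returns the name of the topic the message was sent to.
    """
    try:
        event = json.loads(raw_body)
        errors = validate_top_level(event)
    except json.JSONDecodeError as exc:
        event, errors = None, [f"invalid JSON: {exc.msg}"]

    # Metadata added by the collector (keys are illustrative assumptions).
    metadata = {"collector_id": str(uuid.uuid4()), "collected_at": time.time()}
    if errors:
        # Keep the raw body so the failed event can be inspected or replayed.
        message = {"raw": raw_body, "errors": errors, "metadata": metadata}
        topic = DEAD_LETTER_QUEUE
    else:
        message = {**event, "metadata": metadata}
        topic = RAW_EVENTS_TOPIC
    publish(topic, message)
    return topic
```

Passing a plain function as `publish`, for example one that appends `(topic, message)` to a list, is enough to try the routing locally; in a real deployment it would wrap whichever message-broker client the platform uses.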
What is DataOps?
DataOps is a methodology that combines technology, processes, principles, and personnel to automate data orchestration throughout an organization.
Data Platform Design
Data Model: Kimball Model.
Data File Format Comparison: Apache Parquet, Avro, ORC, and Arrow.
Open Table Formats: Delta Table, Apache Iceberg, Hudi, and Hive.
Data Governance & Management
Data Governance and Trust establishes the rules of engagement for the organisation. This includes how data will be managed across roles, responsibilities, decision rights, policies, and standards....
Reference Wikipedia: Shared-nothing Architecture
Reference
https://aws.amazon.com/blogs/architecture/lets-architect-modern-data-architectures/
https://garystafford.medium.com/building-a-simple-data-lake-on-aws-df21ca092e32
https://medium.com/pythonistas/complete-guide-to-aws-data-lake-4cc85259deb0