What is DataOps?

DataOps is a methodology that combines technology, processes, principles, and people to automate data orchestration throughout an organisation.

Data Platform Design

  • Data Model: Kimball dimensional model.
  • Data File Format Comparison: Apache Parquet, Avro, ORC, and Arrow.
  • Open Table Formats: Delta Lake, Apache Iceberg, Apache Hudi, and Apache Hive.

Data Governance & Management

Data Governance and Trust establishes the rules of engagement for the organisation.

This includes how data will be managed across roles, responsibilities, decision rights, policies, and standards.

Data Discovery & Curation

Data Sourcing and Discovery maps the legacy data landscape within the organisation: it identifies and acquires the data sets relevant to the customer, transaction, and product data sets defined in the rules framework.

Data Quality & Assurance

Data Quality and Assurance establishes the fitness for use of the data sets by identifying and resolving gaps, inconsistencies, and errors before the data sets are either shared with market participants or merged with market data and used for analytics, automation, or pricing.
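
As a rough illustration of what such checks can look like, the sketch below scans a batch of records for gaps (missing values), inconsistencies (duplicate identifiers), and errors (out-of-range values). The field names, the `REQUIRED_FIELDS` tuple, and the `find_quality_issues` function are hypothetical and chosen only for this example; real rules would come from the organisation's own standards.

```python
from typing import Any

# Hypothetical required fields for a customer record (an assumption for illustration).
REQUIRED_FIELDS = ("customer_id", "email", "balance")


def find_quality_issues(records: list[dict[str, Any]]) -> list[str]:
    """Return human-readable descriptions of gaps, duplicates, and invalid values."""
    issues: list[str] = []
    seen_ids: set[Any] = set()
    for index, record in enumerate(records):
        # Gap: a required field is missing or empty.
        for field in REQUIRED_FIELDS:
            if record.get(field) in (None, ""):
                issues.append(f"row {index}: missing {field}")
        # Inconsistency: the same identifier appears more than once.
        customer_id = record.get("customer_id")
        if customer_id is not None:
            if customer_id in seen_ids:
                issues.append(f"row {index}: duplicate customer_id {customer_id}")
            seen_ids.add(customer_id)
        # Error: a value outside its valid range (assumed here: balance >= 0).
        balance = record.get("balance")
        if isinstance(balance, (int, float)) and balance < 0:
            issues.append(f"row {index}: negative balance {balance}")
    return issues
```

A batch would typically be shared or merged only when the returned list is empty; otherwise the listed issues are resolved first.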

Data Observability

Five key pillars of Data Observability

  • Freshness (recency)
  • Volume
  • Schema
  • Distribution
  • Lineage
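
The first four pillars can each be approximated by a simple automated check. The sketch below is one possible shape for such checks, using only the Python standard library; the function names, the 20% volume tolerance, and the representation of a schema as a column-to-type mapping are assumptions made for this example. Lineage is not covered because it usually depends on metadata from the orchestration or catalogue tooling.

```python
from datetime import datetime, timedelta


def check_freshness(last_loaded_at: datetime, now: datetime, max_age: timedelta) -> bool:
    """Freshness: the latest load is no older than max_age."""
    return now - last_loaded_at <= max_age


def check_volume(row_count: int, expected: int, tolerance: float = 0.2) -> bool:
    """Volume: the row count is within +/- tolerance of the expected count."""
    return abs(row_count - expected) <= tolerance * expected


def check_schema(actual: dict[str, str], expected: dict[str, str]) -> list[str]:
    """Schema: list columns that are missing, unexpected, or have changed type."""
    problems = []
    for column, dtype in expected.items():
        if column not in actual:
            problems.append(f"missing column {column}")
        elif actual[column] != dtype:
            problems.append(f"{column}: expected {dtype}, got {actual[column]}")
    for column in actual:
        if column not in expected:
            problems.append(f"unexpected column {column}")
    return problems


def null_rate(values: list) -> float:
    """Distribution: share of null values in a column (0.0 for an empty column)."""
    return sum(v is None for v in values) / len(values) if values else 0.0
```

In practice each check would run after every load and feed the monitoring and alerting layer, with thresholds tuned per data set.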

Data Sharing & Architecture

Data Sharing and Architecture delivers the infrastructure and mechanics for consolidating and mastering data, and for securely administering data requests from customers, accredited data recipients, or teams within the organisation.

Data Lifecycle Management

Data retention, disposal, and decommissioning ensure that the conditions of customer consent are adhered to, and that data is de-identified and/or deleted in line with the conditions under which consent was given.
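
A minimal sketch of how such a rule might be applied per record is shown below. The record layout (`consent_expires_at`, `on_expiry`), the `IDENTIFYING_FIELDS` tuple, and both function names are hypothetical. Salted hashing is used here only as a stand-in: it is a form of pseudonymisation and may not meet a given regulatory definition of de-identification.

```python
import hashlib
from datetime import datetime
from typing import Any, Optional

# Hypothetical set of directly identifying fields (an assumption for illustration).
IDENTIFYING_FIELDS = ("name", "email")


def deidentify(record: dict[str, Any], salt: str) -> dict[str, Any]:
    """Return a copy with identifying fields replaced by a salted SHA-256 digest."""
    cleaned = dict(record)
    for field in IDENTIFYING_FIELDS:
        if cleaned.get(field) is not None:
            digest = hashlib.sha256((salt + str(cleaned[field])).encode("utf-8"))
            cleaned[field] = digest.hexdigest()
    return cleaned


def apply_consent_policy(
    record: dict[str, Any], now: datetime, salt: str
) -> Optional[dict[str, Any]]:
    """Keep the record while consent is valid; after expiry, de-identify it
    or delete it (return None) according to the record's own consent terms."""
    if now < record["consent_expires_at"]:
        return record
    if record.get("on_expiry") == "deidentify":
        return deidentify(record, salt)
    return None  # assumed default: delete when the terms do not say otherwise
```

Running this over a data set on a schedule and dropping every `None` result gives a simple retention sweep; decommissioning of whole systems would need a separate process.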

Master Data Management

Technical Capability

Data Architecture

Code Packaging

Integration Test

Monitoring & Alerting

Release Management

Data Lineage