What Why Data Vault vs Data Warehouse?

Introduction to Data Vault

By SETHA THAY, 2 weeks ago

What is Data Vault 2.0

Data Vault 2.0 is a modern data modeling approach and methodology for building an enterprise data warehouse that can handle complex and changing data structures. Introduced in 2013, it extends and refines Data Vault 1.0. It keeps the hub-and-spoke architecture and supports history tracking, integrity, quality, agility, and adaptability.

Data Vault 2.0 adds support for big data platforms, cloud computing, and automation tools to enhance data management. It uses hash keys as identifiers for hubs, links, and satellites, which improves the performance, scalability, and traceability of the data. The new architecture includes a staging area, a presentation layer of marts, and data quality services. Its layers include:

  • Raw Vault: contains the data as it arrives from the source
  • Business Vault: contains the results of business rules and transformations applied to data from the raw vault
  • Information mart: presentation layer that provides analytical capabilities
  • Data mart: presentation layer that provides reports for end users to consume
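The hash keys mentioned above can be illustrated with a minimal Python sketch. It assumes MD5 over trimmed, upper-cased business key parts joined by a `||` delimiter. The function name `hash_key`, the delimiter, and the sample keys are hypothetical choices for illustration, not something this article prescribes.

```python
import hashlib


def hash_key(*business_keys: str, delimiter: str = "||") -> str:
    """Return a deterministic hash key for one or more business key parts."""
    # Normalize each part so that " c-1001 " and "C-1001" yield the same key.
    normalized = [str(part).strip().upper() for part in business_keys]
    # The delimiter keeps ("A", "B") distinct from ("AB",).
    return hashlib.md5(delimiter.join(normalized).encode("utf-8")).hexdigest()


# A hub key is derived from a single business key (hypothetical sample values).
customer_hk = hash_key("C-1001")

# A link key is derived from the business keys of all hubs it connects.
order_link_hk = hash_key("C-1001", "O-77")
```

Because the key depends only on the business key itself, any loader can compute it without looking up a sequence number in another table.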

A Data Vault model consists of three components:

  • Hubs: represent core business entities such as customers, products, or stores. A hub contains the business key and a few necessary fields; it does not hold descriptive information about the entity
  • Links: establish relationships between hubs
  • Satellites: store descriptive attributes related to hubs. These attributes change over time, similar to a Type 2 slowly changing dimension (SCD Type 2)

Satellites also contain additional information such as timestamps, status, and flags, and they provide the history of an entity over time.
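The three components can be sketched as plain Python records. This is a minimal illustration under stated assumptions: the class names, the field names (`load_date`, `record_source`, `hash_diff`), and the sample values are hypothetical, and comparing a hash of the attributes to detect changes is one common pattern rather than the only option.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubCustomer:
    customer_hk: str      # hash of the business key
    customer_id: str      # business key from the source system
    load_date: datetime
    record_source: str


@dataclass(frozen=True)
class LinkCustomerStore:
    link_hk: str          # hash of both business keys
    customer_hk: str
    store_hk: str
    load_date: datetime
    record_source: str


@dataclass(frozen=True)
class SatCustomerDetails:
    customer_hk: str      # parent hub
    load_date: datetime
    hash_diff: str        # hash of the descriptive attributes
    name: str
    city: str
    record_source: str


def add_satellite_row(history, customer_hk, name, city, load_date, record_source):
    """Append a new version only when the descriptive attributes changed.

    Assumes rows are appended in load order, so the last match is the latest.
    """
    hash_diff = md5_hex(f"{name}||{city}")
    versions = [row for row in history if row.customer_hk == customer_hk]
    if versions and versions[-1].hash_diff == hash_diff:
        return False  # nothing changed, history stays as it is
    history.append(
        SatCustomerDetails(customer_hk, load_date, hash_diff, name, city, record_source)
    )
    return True


hub = HubCustomer(md5_hex("C-1001"), "C-1001", datetime(2024, 1, 1), "crm")
link = LinkCustomerStore(
    md5_hex("C-1001||S-01"), hub.customer_hk, md5_hex("S-01"), datetime(2024, 1, 1), "pos"
)

history = []
add_satellite_row(history, hub.customer_hk, "Dara", "Phnom Penh", datetime(2024, 1, 1), "crm")
add_satellite_row(history, hub.customer_hk, "Dara", "Phnom Penh", datetime(2024, 2, 1), "crm")  # unchanged
add_satellite_row(history, hub.customer_hk, "Dara", "Siem Reap", datetime(2024, 3, 1), "crm")   # city changed
```

Existing satellite rows are never updated; a change produces a new row, which is what gives the satellite its history.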

Why Data Vault 2.0

  • Flexibility: handles multiple sources and frequently changing relationships, which reduces rework and the impact on existing structures
  • Scalability: supports incremental updates, copes with datasets that grow large over time, and is convenient to expand
  • Consistency: hash values allow data to be loaded in parallel, with faster data access and support for automation tools
  • Repeatability: a new modification follows the same pattern as the already planned process, so it is easy to repeat
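The parallel loading mentioned in the list above can be sketched as follows. The sketch assumes in-memory dictionaries in place of real tables, and the loader functions, the `hk` helper, and the sample rows are all hypothetical. The point it illustrates is that each loader derives its keys from the business keys alone, so no loader has to wait for another one to finish.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hk(*parts: str) -> str:
    """Hash one or more business key parts into a single key."""
    joined = "||".join(p.strip().upper() for p in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Hypothetical staged rows from one source system.
staged_rows = [
    {"customer_id": "C-1", "store_id": "S-1"},
    {"customer_id": "C-2", "store_id": "S-1"},
]


def load_hub_customer(rows):
    return {hk(r["customer_id"]): r["customer_id"] for r in rows}


def load_hub_store(rows):
    return {hk(r["store_id"]): r["store_id"] for r in rows}


def load_link_customer_store(rows):
    # The link computes the hub keys itself instead of looking them up.
    return {
        hk(r["customer_id"], r["store_id"]): (hk(r["customer_id"]), hk(r["store_id"]))
        for r in rows
    }


loaders = [load_hub_customer, load_hub_store, load_link_customer_store]
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = {fn.__name__: pool.submit(fn, staged_rows) for fn in loaders}
    vault = {name: future.result() for name, future in futures.items()}
```

With sequence-based surrogate keys, the link loader would need the hub rows to exist first; with hash keys that dependency disappears.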

Further benefits:

  • Agility: fits short, agile work cycles
  • Adaptability: accommodates frequent changes in business needs
  • Auditability: satellites keep historical data, so changes can be tracked

Architecture

[Figure 1: Data Vault architecture diagram]

[Figure 2: Data Vault architecture diagram]

Data Warehouse vs Data Vault

Purpose

  • DWH: Stores structured data for analytics
  • DV: Stores all data in a flexible way for agile analytics and data integration

Schema

  • DWH: Fact and dimension tables
  • DV: Hub-and-spoke model built from hubs, links, and satellites

Flexibility

  • DWH: Suitable for well-defined reporting and analytics requirements
  • DV: More flexible and adaptable to changes in data sources, schema, and business needs

Normalization

  • DWH: Uses denormalization to simplify queries
  • DV: Keeps data normalized to preserve how it is represented

Loading Strategy

  • DWH: ETL transforms the data and then loads it into the DWH
  • DV: Data is first loaded into the raw vault; business rules are applied afterwards
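The Data Vault loading order can be sketched in a few lines of Python. This is a simplified illustration: the country-code rule, the function names, and the sample rows are hypothetical, and plain lists stand in for the raw vault and business vault tables.

```python
# Hypothetical source rows, including untidy values as delivered.
source_rows = [
    {"customer_id": "C-1", "country": " kh "},
    {"customer_id": "C-2", "country": "Cambodia"},
]

# Hypothetical business rule: map free-text country values to a code.
COUNTRY_CODES = {"KH": "KH", "CAMBODIA": "KH"}


def load_raw_vault(rows, record_source):
    """Store rows exactly as delivered, adding only load metadata."""
    return [{**row, "record_source": record_source} for row in rows]


def build_business_vault(raw_rows):
    """Apply business rules on top of the raw vault without changing it."""
    return [
        {**row, "country_code": COUNTRY_CODES.get(row["country"].strip().upper(), "UNKNOWN")}
        for row in raw_rows
    ]


raw_vault = load_raw_vault(source_rows, "crm")
business_vault = build_business_vault(raw_vault)
```

Because the raw vault keeps the source values untouched, a changed business rule only requires rebuilding the business vault, not reloading the source.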

Scalability

  • DWH: More challenging due to predefined schema
  • DV: Easy to expand as new data sources are added

Performance

  • DWH: Optimized for query performance on predefined queries
  • DV: Slightly lower query performance due to normalization, but excels in agility and scale

Sample Implementation

[Figure: sample implementation diagram]
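As a minimal, hypothetical sketch of what such an implementation might look like, the following Python script builds one hub and one satellite in an in-memory SQLite database and reads back the current state of a customer. The table names, column names, and sample rows are assumptions made for illustration, and the hash key is shortened to a placeholder value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,      -- hash of the business key
    customer_id   TEXT NOT NULL UNIQUE,  -- business key
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer_details (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_date     TEXT NOT NULL,
    name          TEXT,
    city          TEXT,
    record_source TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_date)
);
""")

# "hk1" is a placeholder; a real load would store a computed hash key.
conn.execute(
    "INSERT INTO hub_customer VALUES (?, ?, ?, ?)",
    ("hk1", "C-1001", "2024-01-01", "crm"),
)
# Two satellite versions: the customer moved to another city.
conn.executemany(
    "INSERT INTO sat_customer_details VALUES (?, ?, ?, ?, ?)",
    [
        ("hk1", "2024-01-01", "Dara", "Phnom Penh", "crm"),
        ("hk1", "2024-03-01", "Dara", "Siem Reap", "crm"),
    ],
)

# Current view: join the hub to the latest satellite row per customer.
CURRENT_CUSTOMER_SQL = """
SELECT h.customer_id, s.name, s.city
FROM hub_customer AS h
JOIN sat_customer_details AS s ON s.customer_hk = h.customer_hk
WHERE s.load_date = (
    SELECT MAX(s2.load_date)
    FROM sat_customer_details AS s2
    WHERE s2.customer_hk = h.customer_hk
)
"""
current = conn.execute(CURRENT_CUSTOMER_SQL).fetchall()
history_count = conn.execute("SELECT COUNT(*) FROM sat_customer_details").fetchone()[0]
```

The join in the query also shows where the normalization cost from the comparison above comes from: even a simple current view needs the hub and the satellite together.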

Conclusion

In conclusion, Data Vault 2.0 offers a scalable and flexible methodology for managing ever-growing business data. It is not just a data warehousing technique but also a comprehensive approach to business intelligence and analytics on modern platforms.

About author

SETHA THAY

Software Engineer & Project Manager. I am willing to share IT knowledge, technical experiences and investment to financial freedom. Feel free to ask and contact me.


