What is a Persistent Layer?

By Adam Gilmore. Updated on November 1, 2023

The persistent layer contains a set of persistent tables that record the full history of changes to the data in the table or query that is the source of each persistent table.

The source could be a source table/file, a source query, another staging table, or a view/materialized view in the transform layer.

In a persistent table there may be multiple row versions for each row found in the source. Each row version has an effective date and end date marking the date range of when that row version was effective (or in existence).
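
The row versioning described above can be sketched in a few lines of Python. This is only an illustration, not Dimodelo's implementation: the names RowVersion, apply_source_row, version_as_of and HIGH_DATE are hypothetical, and the sketch assumes that an open-ended version carries a far-future end date and that a date range includes its effective date but excludes its end date.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: the current (open-ended) version is marked with a far-future end date.
HIGH_DATE = date(9999, 12, 31)


@dataclass
class RowVersion:
    business_key: str
    attributes: dict
    effective_date: date
    end_date: date = HIGH_DATE


def apply_source_row(history, business_key, attributes, as_of):
    """Record a source row, adding a new version only if its attributes changed."""
    current = next(
        (v for v in history
         if v.business_key == business_key and v.end_date == HIGH_DATE),
        None,
    )
    if current is not None:
        if current.attributes == attributes:
            return  # unchanged in the source, so nothing new to record
        current.end_date = as_of  # close the previous version
    history.append(RowVersion(business_key, dict(attributes), as_of))


def version_as_of(history, business_key, when):
    """Return the version effective on `when` (effective_date <= when < end_date)."""
    for v in history:
        if v.business_key == business_key and v.effective_date <= when < v.end_date:
            return v
    return None


history = []
apply_source_row(history, "C1", {"city": "Perth"}, date(2023, 1, 1))
apply_source_row(history, "C1", {"city": "Sydney"}, date(2023, 6, 1))
```

After the two calls, history holds two versions of the same source row: the first one closed on the date of the change, and the second one still open.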

Technically speaking, a persistent table is a bi-temporal table. A bi-temporal table permits queries over two timelines: valid time and transaction time. Valid time is the time when a row is effective (i.e. the row's effective and end date-times). Transaction time denotes the time when the row version was recorded in the database. The persistent table supports transaction time by tagging each row version with an inserted and an updated batch execution Id. Each batch execution is associated with a date-time in the batch database. Note that the Dimodelo Data Warehouse Studio version of the bi-temporal table goes one step further by also identifying a last updated transaction date-time.
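
To make the two timelines concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than the actual Dimodelo schema: the field names, the batch_executions lookup, and the helper functions valid_on and recorded_by are all hypothetical, and end date-times are treated as exclusive.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

OPEN_END = datetime(9999, 12, 31)  # assumed marker for a still-current version


@dataclass
class RowVersion:
    business_key: str
    city: str
    effective: datetime                      # valid time: start
    end: datetime                            # valid time: end (exclusive, assumed)
    inserted_batch_id: int                   # batch that inserted this version
    updated_batch_id: Optional[int] = None   # batch that last updated it, if any


# Hypothetical stand-in for the batch database: batch execution id -> date-time.
batch_executions = {
    101: datetime(2023, 1, 2, 1, 0),
    102: datetime(2023, 6, 2, 1, 0),
}

rows = [
    # Inserted by batch 101, then end-dated by batch 102 when the source changed.
    RowVersion("C1", "Perth", datetime(2023, 1, 1), datetime(2023, 6, 1), 101, 102),
    RowVersion("C1", "Sydney", datetime(2023, 6, 1), OPEN_END, 102),
]


def valid_on(rows, when):
    """Valid-time query: versions that were effective at `when`."""
    return [r for r in rows if r.effective <= when < r.end]


def recorded_by(rows, tx_time):
    """Transaction-time query: versions already inserted by `tx_time`."""
    return [r for r in rows if batch_executions[r.inserted_batch_id] <= tx_time]
```

For example, valid_on(rows, datetime(2023, 7, 1)) returns only the Sydney version, while recorded_by(rows, datetime(2023, 3, 1)) returns only the Perth version, because the second batch had not yet run. The sketch does not reconstruct what the end date looked like at that earlier transaction time; that would require using the updated batch Id as well.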

How is a Persistent layer different to an ODS, Inmon EDW, Data Vault?

A Persistent Layer has some similarities to all of these techniques in that it is a precursor to a Dimensional layer. It could be described as a temporal ODS. However, there are some key differences:

The temporal nature of a Persistent layer is different. A Data Vault will contain some temporal information, but not effective dates. The other techniques don’t traditionally hold history.
An ODS and EDW, and to a certain extent a Data Vault, are designed to be queried by end users. A Persistent layer’s primary purpose is to support the higher layers in the Data Warehouse. You could query a Persistent layer directly, but because of the temporal nature of the data, it becomes difficult to write the temporal joins. Luckily, Dimodelo also sets an Is_Latest flag for every row version, so it’s easy to query current state, and thus emulate an ODS.
The extent to which these layers are modeled is different. The EDW and Data Vault techniques expect to be extensively modeled. This takes considerable effort on the part of the developer, first to model the layer, and second to transform the data from the source into the new model. In a persistent layer, you can model as much or as little as you like. Again, its purpose is to support higher layers. If you need to, you can create entities representing aggregations, allocations, new relationships, etc. in this layer. You might do this to support the Dimensional layer, or for a specific reporting purpose. We do recommend that the persistent layer be at least organised into source systems, or entity domains, or both.
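
The difference between querying current state and writing a temporal join can be sketched as follows. This is a simplified illustration under stated assumptions: the customer and orders data, the column names other than the Is_Latest flag (written is_latest here), and the helper functions are hypothetical, the orders table is left unversioned to keep the example short, and end dates are treated as exclusive.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed end date for the current version

# Hypothetical persistent table with two versions of one customer.
customer = [
    {"customer_id": 1, "city": "Perth", "effective": date(2023, 1, 1),
     "end": date(2023, 6, 1), "is_latest": False},
    {"customer_id": 1, "city": "Sydney", "effective": date(2023, 6, 1),
     "end": HIGH_DATE, "is_latest": True},
]

# Hypothetical orders, kept unversioned for brevity.
orders = [
    {"order_id": 10, "customer_id": 1, "order_date": date(2023, 3, 15)},
    {"order_id": 11, "customer_id": 1, "order_date": date(2023, 7, 20)},
]


def current_state(rows):
    """ODS-style view: keep only the latest version of each row."""
    return [r for r in rows if r["is_latest"]]


def temporal_join(orders, customer):
    """Match each order to the customer version effective on the order date."""
    result = []
    for o in orders:
        for c in customer:
            if (c["customer_id"] == o["customer_id"]
                    and c["effective"] <= o["order_date"] < c["end"]):
                result.append((o["order_id"], c["city"]))
    return result
```

Here current_state(customer) needs only a single flag filter, whereas temporal_join has to compare every order date against each version's date range. When both sides of a join are versioned, the condition becomes a range-overlap test, which is where hand-written temporal joins get difficult.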

Adam Gilmore

Adam Gilmore is a Data Architect with 19 years' experience in the Data field. He writes for Dimodelo and develops courses covering Data Analysis to Data Architecture.

