
About the Forum
This is intended as a professional site to discuss data modeling and its effects on data availability, security, and society. You can view the blog as a guest, but you must register before you can post or comment in the Forum.
While the focus is on Dimensional Modeling, it does not preclude the discussion of other methodologies. Feel free to create new comments on all aspects of building and maintaining an analytic environment: its challenges, successes, and failures.
Copyrights and Ads
This site has no advertising. Vendor contributions should avoid any marketing slant and should contribute to the art and practice of data analytics. The Admin (me) has final say as to what appears on this site.
The only commercial reference is the green link at the bottom of the screen which takes you to my business site.
Any original material is copyright of its Author (any Member). All images (except the guy on the ‘About the Author’ page) are AI generated. Last I heard, they cannot be copyrighted. Have fun.
Why Dimensional?
Since the inception of commerce, successful entrepreneurs have learned by examining history and trends in their respective businesses. The fundamental issue was how to analyze those business events in their true context. This need for context evolved into Accounting Systems and eventually led to where we are today. Context is not a new idea; we simply call it ‘Dimension’ in Data Modeling terms. A Dimension represents a category of context, and its relationships to a fact table place each business event in the context in which it occurred.
Dimensional Modeling has deep roots in centuries of business management practice. Dr. Kimball formalized the concept of organizing data for optimal effect in an analytic environment. He authored numerous books on the subject and established the methodology for building and maintaining complex analytic environments.
Thinking Dimensionally
In a basic Dimensional Model, data is organized into two categories: Attributes and Measures. An Attribute provides contextual information about an event, while a Measure indicates the size or value of the event. A Dimension is a grouping of Attributes, ideally organized in a way that makes sense to the business. The methodology further states that dimensions evolve into new dimensions as data is aggregated, providing a method to integrate disparate data sets into a cohesive whole.
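To make the Attribute and Measure split concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_date, dim_product, fact_sales) and the sample rows are made up for illustration; they are not taken from any particular system.

```python
import sqlite3

# In-memory database; all names and rows below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: groupings of Attributes (context)
CREATE TABLE dim_date (
    date_key      INTEGER PRIMARY KEY,
    calendar_date TEXT,
    month_name    TEXT,
    year          INTEGER
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
-- Fact: one row per business event, with keys to its context plus Measures
CREATE TABLE fact_sales (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    quantity     INTEGER,   -- Measure
    sales_amount REAL       -- Measure
);
""")

conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)", [
    (20240105, "2024-01-05", "January", 2024),
    (20240210, "2024-02-10", "February", 2024),
])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Widget", "Hardware"),
    (2, "Gadget", "Hardware"),
    (3, "Manual", "Books"),
])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (20240105, 1, 2, 20.0),
    (20240105, 3, 1, 15.0),
    (20240210, 2, 1, 30.0),
    (20240210, 1, 3, 30.0),
])

# Group by Attributes, aggregate the Measures.
rows = conn.execute("""
    SELECT p.category, d.month_name,
           SUM(f.quantity), SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d    ON d.date_key = f.date_key
    GROUP BY p.category, d.month_name
    ORDER BY p.category, d.month_name
""").fetchall()

for row in rows:
    print(row)
```

Every analytic query against this shape follows the same pattern: pick the Attributes you want to slice by, then sum or count the Measures.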
Load and Query Performance
The physical data structure, the Star Schema, is the most performant data structure to support analytics on any contemporary data platform. If you are heavily invested in a Cloud platform, a proper data model can reduce compute costs dramatically and provide faster query performance, particularly with very large data sets (hundreds of millions of facts).
Being able to collect, store, and analyze high granularity data in a timely manner is the hallmark of a properly executed dimensional model.
Simplicity
You can use a View to present a single, flat image of the Data Mart. You perform all joins in the View, and the Query Optimizer will eliminate any joins that a given query does not need. This can be done with Hierarchies as well. These Views can be tailored for specific Departments or User Groups to manage some Security and Privacy issues, for example by obfuscating sensitive data.
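A rough sketch of the flat View idea, again with Python's sqlite3 module. The names (v_sales_flat, v_sales_flat_masked, dim_customer) and the data are hypothetical. Whether the optimizer actually drops unused joins depends on the database engine and on the keys you declare, so this example only shows the shape of the Views and does not inspect the query plan.

```python
import sqlite3

# In-memory database; all names and rows below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT,
    region        TEXT
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);

INSERT INTO dim_customer VALUES (1, 'Alice Smith', 'East'), (2, 'Bob Jones', 'West');
INSERT INTO dim_product  VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO fact_sales   VALUES (1, 1, 20.0), (1, 2, 15.0), (2, 1, 30.0);

-- One flat image of the Data Mart: all joins live in the View.
CREATE VIEW v_sales_flat AS
SELECT c.customer_name, c.region, p.product_name, p.category, f.sales_amount
FROM fact_sales f
LEFT JOIN dim_customer c ON c.customer_key = f.customer_key
LEFT JOIN dim_product  p ON p.product_key  = f.product_key;

-- Same image for a restricted audience, with the customer name obfuscated.
CREATE VIEW v_sales_flat_masked AS
SELECT 'Customer ' || f.customer_key AS customer_name,
       c.region, p.product_name, p.category, f.sales_amount
FROM fact_sales f
LEFT JOIN dim_customer c ON c.customer_key = f.customer_key
LEFT JOIN dim_product  p ON p.product_key  = f.product_key;
""")

# Users query the View as if it were one wide table; no joins to write.
region_totals = conn.execute("""
    SELECT region, SUM(sales_amount)
    FROM v_sales_flat
    GROUP BY region
    ORDER BY region
""").fetchall()

# The masked View exposes the same columns without real names.
masked_names = [r[0] for r in conn.execute("""
    SELECT DISTINCT customer_name
    FROM v_sales_flat_masked
    ORDER BY customer_name
""")]

print(region_totals)
print(masked_names)
```

In a real platform you would grant each User Group access to its own View rather than to the underlying tables.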
Fit to Purpose
The concepts and principles of the model exist solely for the purpose of performing analytics on relational data. The model is easily understood by the business, analysts, and technicians. It is so well established that ALL relational database vendors look for, and optimize query plans specifically for, Star Schema based queries.
The focus of Dimensional Design is the ability to integrate different data sets, giving a truly seamless view across the enterprise.
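One common way this integration shows up in practice is two fact tables sharing the same Dimension (Kimball's term is a conformed dimension). The sketch below, with hypothetical names and data, aggregates each fact table to a shared Attribute and then combines the results. It illustrates the idea and is not a full integration design.

```python
import sqlite3

# In-memory database; all names and rows below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One shared Dimension used by both subject areas
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
-- Two separate business processes, each with its own fact table
CREATE TABLE fact_sales   (product_key INTEGER, sales_amount  REAL);
CREATE TABLE fact_returns (product_key INTEGER, return_amount REAL);

INSERT INTO dim_product  VALUES (1, 'Widget', 'Hardware'),
                                (2, 'Gadget', 'Hardware'),
                                (3, 'Manual', 'Books');
INSERT INTO fact_sales   VALUES (1, 20.0), (2, 30.0), (3, 15.0), (1, 30.0);
INSERT INTO fact_returns VALUES (1, 10.0), (3, 5.0);
""")

# Aggregate each fact table to the shared Attribute first, then combine.
# Joining the two fact tables directly would double count rows.
drill_across = conn.execute("""
    WITH sales_by_cat AS (
        SELECT p.category, SUM(f.sales_amount) AS sales_amount
        FROM fact_sales f
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY p.category
    ),
    returns_by_cat AS (
        SELECT p.category, SUM(f.return_amount) AS return_amount
        FROM fact_returns f
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY p.category
    )
    SELECT s.category, s.sales_amount, COALESCE(r.return_amount, 0.0)
    FROM sales_by_cat s
    LEFT JOIN returns_by_cat r ON r.category = s.category
    ORDER BY s.category
""").fetchall()

for row in drill_across:
    print(row)
```

Because both fact tables use the same product keys and the same category values, the two data sets line up without any mapping logic in the query.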