Data is Dimensional

Dimensional Modelling for Advanced Data Analytics and Cloud Solutions

  • Natural keys can be complex. They may span multiple columns, which makes them error-prone to use: a column may be omitted from a join, causing runaway queries. They are also less efficient, since comparing two strings takes significantly more work than comparing two binary integers. A surrogate key is simple and easy to understand: The key…


  • The optimal data schema for parallelization is a star schema. Normalized data models perform very poorly on such systems because all tables are based on a unique primary key. Vendors that encouraged such modeling, such as Teradata, included an extensive array of bizarre indexing strategies to overcome the issue. So, the machine required a lot of hand-holding…

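The point about natural keys in the first item above can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module and a hypothetical two-table layout (`dim_product`, `fact_sales`, and every column name are assumptions made for illustration, not taken from the posts). It compares a join on the full two-column natural key, the same join with one key column accidentally omitted, and a join on a single integer surrogate key.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,  -- surrogate key
    vendor_code TEXT NOT NULL,        -- natural key, part 1
    item_code   TEXT NOT NULL,        -- natural key, part 2
    UNIQUE (vendor_code, item_code)
);
CREATE TABLE fact_sales (
    product_key INTEGER NOT NULL,     -- surrogate key reference
    vendor_code TEXT NOT NULL,
    item_code   TEXT NOT NULL,
    quantity    INTEGER NOT NULL
);
""")

# The same item codes ("A", "B") are reused by two vendors,
# so item_code alone does not identify a product.
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "V1", "A"), (2, "V1", "B"), (3, "V2", "A"), (4, "V2", "B")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, "V1", "A", 10), (2, "V1", "B", 5), (3, "V2", "A", 7), (4, "V2", "B", 3)],
)


def count_rows(join_condition: str) -> int:
    """Return the number of rows produced by joining facts to the dimension."""
    sql = (
        "SELECT COUNT(*) FROM fact_sales AS f "
        "JOIN dim_product AS d ON " + join_condition
    )
    return conn.execute(sql).fetchone()[0]


# Correct join on the complete natural key: one dimension row per fact row.
rows_full_natural = count_rows(
    "f.vendor_code = d.vendor_code AND f.item_code = d.item_code"
)

# One key column forgotten: every fact row matches several dimension rows.
rows_partial_natural = count_rows("f.item_code = d.item_code")

# Surrogate key: a single integer comparison, with no column to forget.
rows_surrogate = count_rows("f.product_key = d.product_key")

print(rows_full_natural, rows_partial_natural, rows_surrogate)
```

With four fact rows, the join that drops `vendor_code` returns twice as many rows as it should. On large tables the same mistake multiplies the result set by far more, which is the runaway-query risk described above.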