Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Introduction
Online transaction processing (OLTP) database contains huge transaction record.some time it has million and billions of rows.these data is stored in relational database.when business require analysis of these transaction, it takes huge amount of time to process these data. business user wants to analyse these data in small time period. they dont want to have complexity in schema too. Sometimes business user wants to analyse data from other source too. these other source data could be from different business sources. So database designer comes with a solution to implement Datawarehouse system for business data. this system is separate from OLTP database. So analysis process becomes easier than before.
Background
Datawarehouse system consist of Fact table , dimension table and Auxiliary table. Datawarehouse system can have multiple OLTP system as data source. it can be CRM system,ERP system, CMS system or Flat file data. Tables in Datawarehouse system are organised in specific schema. this schema can be either star schema or snowflake schema. we are going to cover each topic of datawarehouse system in same article.
Using the code
CREATE TABLE SalesRegion (
SalesRegionId INTEGER NOT NULL,
RegionName VARCHAR(50) NOT NULL,
CONSTRAINT SalesRegion_pk PRIMARY KEY (SalesRegionId)
);
CREATE TABLE DimSalesMan (
SalesRegionId INTEGER NOT NULL,
SalesManId INTEGER NOT NULL,
SallesmanName VARCHAR(50) NOT NULL,
CONSTRAINT DimSalesMan_pk PRIMARY KEY (SalesRegionId, SalesManId)
);
CREATE TABLE DimDate (
DateId INTEGER NOT NULL,
DateValue DATE NOT NULL,
MonthValue INTEGER NOT NULL,
YearValue INTEGER NOT NULL,
DateQuarter INTEGER NOT NULL,
DateHalfYear INTEGER NOT NULL,
CONSTRAINT DimDate_pk PRIMARY KEY (DateId)
);
CREATE TABLE Currency (
CurrencyId INTEGER NOT NULL,
CurrencyName VARCHAR(50) NOT NULL,
CurrencyFxRate DECIMAL(18) NOT NULL,
CONSTRAINT Currency_pk PRIMARY KEY (CurrencyId)
);
CREATE TABLE DimProduct (
ProductId INTEGER NOT NULL,
CurrencyID INTEGER NOT NULL,
ProductName VARCHAR(50) NOT NULL,
ProductCost NUMERIC(18,2) NOT NULL,
CONSTRAINT DimProduct_pk PRIMARY KEY (ProductId, CurrencyID)
);
CREATE TABLE FactSales (
SalesRegionId INTEGER NOT NULL,
SalesManId INTEGER NOT NULL,
ProductId INTEGER NOT NULL,
CurrencyID INTEGER NOT NULL,
DateId INTEGER NOT NULL,
SalesId INTEGER NOT NULL,
SalesDescription VARCHAR(100) NOT NULL,
CONSTRAINT FactSales_pk PRIMARY KEY (SalesRegionId, SalesManId, ProductId, CurrencyID, DateId)
);
ALTER TABLE DimSalesMan ADD CONSTRAINT SalesRegion_DimSalesMan_fk
FOREIGN KEY (SalesRegionId)
REFERENCES SalesRegion (SalesRegionId)
ALTER TABLE FactSales ADD CONSTRAINT FactSales_DimSalesMan_fk
FOREIGN KEY (SalesRegionId, SalesManId)
REFERENCES DimSalesMan (SalesRegionId, SalesManId)
ALTER TABLE FactSales ADD CONSTRAINT FactSales_DimDate_fk
FOREIGN KEY (DateId)
REFERENCES DimDate (DateId)
ALTER TABLE DimProduct ADD CONSTRAINT Currency_DimProduct_fk
FOREIGN KEY (CurrencyID)
REFERENCES Currency (CurrencyId)
ALTER TABLE FactSales ADD CONSTRAINT FactSales_DimProduct_fk
FOREIGN KEY (ProductId, CurrencyID)
REFERENCES DimProduct (ProductId, CurrencyID)
Fact Table
Table which contains fact information is known as fact table. these infomation can be measure in analysis. such as Total Sales Amount,Total Order Count,Total Product Sales etc are different measures of business. these measure information are stored in Fact table. Normally Fact table contains transaction of business. eq: FactSales table is fact table. it contains business transaction.
Dimension Table
Table which contain dimension for analysis of data is known as Dimension table. it can be Date Dimension,Product Dimension,Region Dimension,Currency Dimension etc. Analysis of Fact table are process by Dimension table. one can analyse business transaction on the basis of period wise,region wise,product wise. Data in dimension table are normally fixed nature.these data normally does not modify or modify in some condition when updated set is required in business. it may be modify in a month or in a year. eq. DimDate,DimProduct,DimRegion are dimension table.
Attribute of Dimension Table
attribute of dimension table could be product name,product cost,product color,product category, product weight etc
Natural Key
The default key assigned to attribute is known as Natural key. eq.ProductCode is natural key for any product.
Surrogate Key
Numeric value used as key which is used to join dimension table with fact table is known as surrogate key eq.ProductId is surrogate key for product dimension table. it joins with FactSales fact table to get product sales information.
Database Schema
Datawarehouse is a relational database where Fact and Dimension table are related with each other. before defining datawarehouse sytem, its structure must be define.All table in datawarehouse are structured in specific schema. it can be Start Schema or Snowflake schema
Star Schema
Database schema in which dimension table are directly connected with fact table is known as Star Schema. it is most popular schema used in desinging datawarehouse system. it is easy to implement and relate dimension tables with fact tables.
Snowflake Schema
Database schema in which dimension table are connected with fact table but dimension table itself related with another dimension table to get data is known as Snowflake Schema. snowflake schema makes datawarehouse designing complex. if possible we should avoid snowflake schema using view where multiple dimension table are joined together and used as simple dimension table.
Junk Dimension
Some value in data warehouse system are very less in number of rows. we can not include them as dimension table and also can not avoid in system.such as Global setting value for datawarehouse system. we can put all require global setting in separate setting table but it is not considered as dimension table. it is termed as Junk Dimension.
Degenerate Dimension
Some tables in datawarehouse system contains atribute of fact table. these attribute are used in analysis of business. such as Transaction number in fact table is maintained as Degenrate dimension.
Slowly changing Dimension
It is process mananging changes of data in dimension table.when data in dimension table is modified, it is required to maintain change history of data. it gives accurate picture to business when analysis is depend on different duration of time. eq.Global business sales is depend on currency rate. each country has their different currency and their exchange rate gets updated time to time. if some sales is occured in Jan 2014 with XXX currency exchange rate and it is modified in July 2014. correct analysis is only possible when we can evaluate sales value exactly based on that duration exchange rate. if we calculate sales amount on current exchange rate value , then sales figure will not be accurate.
SCD type1
In SCD type1 ,dimension table overwrite existing value of dimension attribute with new value. So no change history is maintained for any changed attribute value.dimension table will contain only last updated value. eq.In dimcustomer table , customer address can be maintain as SCD type1. so that it will hold only current address of customer.
SCD type2
In SCD type2, complete track of change history is maintained. so for every changes in attribute value insert new record in dimension table. history is track through date range given with record. eq:DimCurrencyFxRate should be of type SCD type2. because currecny exchange rate is valid from specific date to certain date range only.
SCD type3
In SCD type3, only last few change history is maintain. number of changes tracking is based on business requirement. so such dimension table contains multiple column for same attribute. eq:DimProduct table can contain three column name for Productname naming Productname1, Productname2, Productname3. this will help when business team ask to change the name of any product.Productname3 will contain latest change name , so on Productname2 and Productname1 can contain previous 2 name of product.
Bridge table or Factless fact table
it is intermediate table which allows to join dimension table with another dimension table. it does not contain transaction detail rather it contains key attribute with other columns.
Snapshot fact table
Table which contains aggregate of transaction of Fact table is known as snapshot fact table.when business transaction is huge,entry in fact table will be huge. this will make analysis process time consuming. so that fact table transaction are aggregated and stored in separate snapshot table.
Transaction fact table
Table which contains complete details of transaction is known as transaction fact table. attribute in transaction fact table contains each details of transaction. eq: transaction number,productid,transaction date,transaction user etc.
Points of Interest
This article contains information on terminologies used in datawarehouse. this will help user to understand concepts before designing new datawarehouse system.
History
No updates available


