As I dive into the world of data warehousing, I'm finding that designing the schema is much different from designing one for a transactional database. In a standard transactional database you would first identify all your entities and relationships, then normalize to 3NF. In a data warehousing environment you are actually denormalizing the tables.
Follow up:
All of the information is stored in a way that optimizes for query speed, not storage size, so redundant information is likely. One example is storing ad statistics separately from the main statistics. The repeated information may include the time, which application the ad is on, the user name, etc. The result is a table you can search that is much smaller than the main table and therefore much quicker to query. Even though the information is duplicated, the speed gain makes the trade-off worthwhile.
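To make the ad statistics example concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (main_stats, ad_stats), the columns, and the sample rows are all made up for illustration; a real warehouse would have its own schema and load process. The idea is only to show the ad rows being copied, redundant columns and all, into a smaller table that queries can hit directly.

```python
import sqlite3

# Hypothetical sample events: (event_time, application, user_name, event_type, ad_id).
# Rows with ad_id = None are non-ad events that only live in the main table.
SAMPLE_EVENTS = [
    ("2024-01-01 10:00", "app_a", "alice", "page_view", None),
    ("2024-01-01 10:01", "app_a", "alice", "ad_click", 7),
    ("2024-01-01 10:02", "app_b", "bob", "login", None),
    ("2024-01-01 10:03", "app_b", "bob", "ad_click", 7),
    ("2024-01-01 10:04", "app_b", "carol", "ad_view", 9),
]


def build_warehouse(events):
    """Load events into main_stats, then duplicate the ad rows into ad_stats."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE main_stats ("
        "event_time TEXT, application TEXT, user_name TEXT, "
        "event_type TEXT, ad_id INTEGER)"
    )
    conn.executemany("INSERT INTO main_stats VALUES (?, ?, ?, ?, ?)", events)

    # Denormalized copy: time, application, and user name are repeated here
    # on purpose so ad queries never have to scan the larger main table.
    conn.execute(
        "CREATE TABLE ad_stats AS "
        "SELECT event_time, application, user_name, event_type, ad_id "
        "FROM main_stats WHERE ad_id IS NOT NULL"
    )
    conn.commit()
    return conn


def clicks_per_application(conn):
    """Count ad clicks per application using only the smaller ad_stats table."""
    rows = conn.execute(
        "SELECT application, COUNT(*) FROM ad_stats "
        "WHERE event_type = 'ad_click' "
        "GROUP BY application ORDER BY application"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_warehouse(SAMPLE_EVENTS)
    print(clicks_per_application(conn))
```

With the sample data, ad_stats holds three of the five rows. On a real system the gap between the two tables would presumably be far larger, which is where the speedup comes from. The cost is that ad_stats has to be refreshed whenever main_stats changes.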