Data Analytics
What Is a Data Warehouse
Understanding the concept of data warehouses and how they differ from data lakes and databases

Introduction
Data and analytics have become crucial for businesses because they enable people across the organization to make better-informed decisions and monitor business performance.
Today, we will discuss Data Warehouses, which are essentially the main stores of information that make this data accessible. We will also look at the bigger picture of data storage and management, and see how Data Warehouses work and interact with Data Lakes and Databases.
What is a Data Warehouse and how does it work
A Data Warehouse is a central data repository that holds information derived from multiple systems, such as relational databases and transactional systems, as well as from other sources such as application log files.
Users who consume data from Data Warehouses include Data Scientists and Engineers, Business and Insights Analysts, and Decision Makers. They access the data through various mechanisms, including (but not limited to) SQL clients, analytics tools, and Business Intelligence (BI) tools.
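To make the idea of a central repository concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for the warehouse, and the two sources, the table names, and the sample rows are all hypothetical. A real Data Warehouse would run on a dedicated platform and load far more data.

```python
import sqlite3

# Hypothetical source 1: rows exported from a transactional system.
orders = [(1, "alice", 30.0), (2, "bob", 12.5), (3, "alice", 7.5)]

# Hypothetical source 2: lines from an application log file.
log_lines = [
    "2024-01-01 alice login",
    "2024-01-01 bob login",
    "2024-01-02 alice login",
]


def build_warehouse(orders, log_lines):
    """Load both sources into a single in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)")
    conn.execute("CREATE TABLE logins (login_date TEXT, customer TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    # Parse each log line into (date, customer) before loading it.
    parsed = [tuple(line.split()[:2]) for line in log_lines]
    conn.executemany("INSERT INTO logins VALUES (?, ?)", parsed)
    return conn


def spend_and_logins(conn):
    """Answer a question that needs both sources: spend and logins per customer."""
    query = """
        SELECT o.customer, o.total_spent, l.login_count
        FROM (SELECT customer, SUM(amount) AS total_spent
              FROM orders GROUP BY customer) AS o
        JOIN (SELECT customer, COUNT(*) AS login_count
              FROM logins GROUP BY customer) AS l
          ON o.customer = l.customer
        ORDER BY o.customer
    """
    return conn.execute(query).fetchall()


warehouse = build_warehouse(orders, log_lines)
report = spend_and_logins(warehouse)
print(report)
```

Once both sources sit in one place, a single SQL query can combine them. That is the kind of access an analyst would typically get through a SQL client or a BI tool.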
The basic architecture of a modern Data Warehouse is split into three tiers. The bottom tier consists of the database server (usually a relational database system), where the data is extracted, loaded, and transformed using various backend tools. The middle tier sits between the end user and the database, and consists of an engine that enables access to, and analysis of, the data. Finally, the top tier is the front-end client: the APIs and the reporting, analysis, query, and data mining tools through which users access results.
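The three tiers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SQLite plays the role of the bottom-tier database server, and `bottom_tier`, `AnalyticsEngine`, and `render_report` are hypothetical names chosen for this example, not part of any real warehouse product.

```python
import sqlite3


# Bottom tier: the database server where the data is loaded and stored.
def bottom_tier():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("north", 100.0), ("south", 50.0), ("north", 25.0)],
    )
    return conn


# Middle tier: an engine that exposes analysis over the stored data.
class AnalyticsEngine:
    def __init__(self, conn):
        self.conn = conn

    def total_by_region(self):
        rows = self.conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return dict(rows.fetchall())


# Top tier: a front-end client that turns results into a report.
def render_report(engine):
    totals = engine.total_by_region()
    return [f"{region}: {total:.2f}" for region, total in totals.items()]


report_lines = render_report(AnalyticsEngine(bottom_tier()))
print("\n".join(report_lines))
```

Note that the front end never touches the database directly. It only talks to the engine, which mirrors the role the middle tier plays between the end user and the database.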
Note that the bottom tier (and thus the Data Warehouse itself) may consist of multiple databases containing structured data (i.e. data organized into tables and columns). Every table has a schema that specifies the data type and description of each column. These schemas enable query tools to easily determine which tables to access and analyze when querying the data.
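As a rough illustration of how a tool might use schemas to decide which tables to read, the sketch below inspects table schemas in an in-memory SQLite database. The tables and the helper functions `table_schema` and `tables_with_column` are hypothetical. SQLite's `PRAGMA table_info` reports column names and declared types but not column descriptions, so descriptions are left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)"
)


def table_schema(conn, table):
    """Return {column_name: declared_type} for one table."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {name: col_type for _, name, col_type, *_ in rows}


def tables_with_column(conn, column):
    """List the tables whose schema contains the given column."""
    names = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in names if column in table_schema(conn, name)]


orders_schema = table_schema(conn, "orders")
amount_tables = tables_with_column(conn, "amount")
shared_key_tables = tables_with_column(conn, "customer_id")
print(orders_schema, amount_tables, shared_key_tables)
```

Here a lookup for `amount` points to the `orders` table only, while `customer_id` appears in both tables, which hints that they can be joined on that column.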