Towards AI

The leading AI community and content platform focused on making AI accessible to all. Check out our new course platform: https://academy.towardsai.net/courses/beginner-to-advanced-llm-dev

Data Analytics

What Is a Data Warehouse

Understanding the concept of data warehouses and how they differ from data lakes and databases

Giorgos Myrianthous
Published in Towards AI
4 min read · Feb 17, 2022

Photo by S Migaj on Unsplash

Introduction

Data and analytics have become crucial for businesses because they enable people across the organization to make more informed decisions and monitor business performance.

Today, we will discuss data warehouses, which are essentially the main store of information that enables access to this data. We will also look at the bigger picture of data storage and management, and see how Data Warehouses work and interact with Data Lakes and Databases.

What is a Data Warehouse and how does it work

A Data Warehouse is a central data repository that holds information derived from multiple systems, such as relational databases and transactional systems, as well as other sources, such as application log files.
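As a rough illustration of this idea, the sketch below loads rows from two made-up sources into one store and then queries across them. It is a minimal sketch under stated assumptions: an in-memory SQLite database stands in for the warehouse, and the `orders` rows, the log line format, and the names `parse_log_line` and `build_warehouse` are all hypothetical.

```python
import sqlite3

# Hypothetical source 1: rows exported from a transactional system.
orders = [
    (1, "alice", 30.0),
    (2, "bob", 12.5),
]

# Hypothetical source 2: raw application log lines (assumed format).
log_lines = [
    "2022-02-17T10:00:00 INFO user=alice action=login",
    "2022-02-17T10:05:00 ERROR user=bob action=checkout",
]


def parse_log_line(line):
    """Split one log line into (timestamp, level, user, action)."""
    timestamp, level, user_part, action_part = line.split()
    return (
        timestamp,
        level,
        user_part.split("=", 1)[1],
        action_part.split("=", 1)[1],
    )


def build_warehouse(orders, log_lines):
    """Load both sources into one central store (in-memory SQLite here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.execute(
        "CREATE TABLE app_events (ts TEXT, level TEXT, customer TEXT, action TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    conn.executemany(
        "INSERT INTO app_events VALUES (?, ?, ?, ?)",
        [parse_log_line(line) for line in log_lines],
    )
    conn.commit()
    return conn


warehouse = build_warehouse(orders, log_lines)

# With both sources in one place, a single query can combine them.
rows = warehouse.execute(
    """
    SELECT o.customer, o.amount, e.level
    FROM orders AS o
    JOIN app_events AS e ON e.customer = o.customer
    ORDER BY o.customer
    """
).fetchall()
print(rows)
```

A real warehouse would typically receive this data through scheduled pipelines rather than inline lists, but the central point is the same: once the sources share one repository, they can be analyzed together.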

Users who consume data from Data Warehouses include Data Scientists and Engineers, Business and Insights Analysts, and Decision Makers. They access the data through various mechanisms, including (but not limited to) SQL clients, analytics tools, and Business Intelligence (BI) tools.

The basic architecture of a modern Data Warehouse is split into three tiers. The bottom tier consists of the database server (usually a relational database system), where data is extracted, loaded, and transformed using various backend tools. The middle tier sits between the end user and the database and consists of an engine that enables access to, and analysis of, the data. Finally, the top tier is the front-end client, made up of APIs and tools that expose results through reporting, analysis, query, and data mining.
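The three tiers can be sketched in a few lines of Python. This is only a toy model under stated assumptions: SQLite plays the role of the bottom-tier database server, and the function names `bottom_tier`, `total_by_region`, and `render_report`, along with the sample sales figures, are hypothetical and chosen purely for illustration.

```python
import sqlite3


# Bottom tier: the database server where data is loaded and transformed.
def bottom_tier():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_sales (region TEXT, amount_cents INTEGER)")
    conn.executemany(
        "INSERT INTO raw_sales VALUES (?, ?)",
        [("EU", 1000), ("EU", 2500), ("US", 4000)],
    )
    # Transform after loading: derive a cleaned table inside the database.
    conn.execute(
        "CREATE TABLE sales AS "
        "SELECT region, amount_cents / 100.0 AS amount FROM raw_sales"
    )
    return conn


# Middle tier: an engine that exposes analysis over the stored data.
def total_by_region(conn):
    query = (
        "SELECT region, SUM(amount) FROM sales "
        "GROUP BY region ORDER BY region"
    )
    return dict(conn.execute(query).fetchall())


# Top tier: a front-end client that presents results to the end user.
def render_report(totals):
    return "\n".join(
        f"{region}: {amount:.2f}" for region, amount in totals.items()
    )


conn = bottom_tier()
totals = total_by_region(conn)
print(render_report(totals))
```

Keeping the tiers as separate functions mirrors the separation in the architecture: the reporting code never touches SQL, and the storage code knows nothing about how results are displayed.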

Note that the bottom tier (and thus the data warehouse itself) may consist of multiple databases containing structured data (i.e., data organized into tables and columns). Every table has a schema that specifies the data type and description of each column. These schemas let query tools determine which tables to access and analyze when ingesting data.
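To make the notion of a schema more concrete, here is a minimal sketch of how a tool might read column names and types from a table. It assumes SQLite as a stand-in for the warehouse database; the `customers` table, the `descriptions` dictionary, and the `describe_table` helper are hypothetical. SQLite does not store column descriptions itself, so they are kept in a plain dictionary here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        full_name   TEXT NOT NULL,
        signup_date TEXT
    )
    """
)

# Hypothetical column descriptions, stored separately because SQLite
# has no built-in column comments.
descriptions = {
    "customer_id": "Unique identifier of the customer",
    "full_name": "Customer's full name",
    "signup_date": "ISO 8601 date on which the customer signed up",
}


def describe_table(conn, table):
    """Return the name, declared type, and description of each column.

    The table name is interpolated directly, so it must come from
    trusted code, never from user input.
    """
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {
            "column": name,
            "type": col_type,
            "description": descriptions.get(name, ""),
        }
        for _cid, name, col_type, *_rest in rows
    ]


schema = describe_table(conn, "customers")
for column in schema:
    print(column)
```

Dedicated warehouse systems usually expose this kind of metadata through their own catalog or information schema, which is what query and BI tools rely on to discover tables and columns.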
