Databases: OLTP and OLAP

Thursday, August 1, 2024

Understanding databases remains crucial, especially for anyone working in tech or computer engineering. This article gives a concise overview of the two main database workloads, OLTP and OLAP, and of the storage models that suit each.

Let’s start:

What you see on your screen is the “frontend,” the part of a system that interacts with users. However, the actual data comes from the “backend,” a collection of systems including databases, servers, load balancers, and message queues. The backend is where the raw data, such as this blog post, is stored, and it is often sent to the frontend in formats like JSON objects.

Data is central to all computer systems and software. For example, platforms like Facebook store your messages, posts, and demographic details, while LinkedIn and Twitter (now X) store similar user data. These platforms use various methods to retrieve and display this data, depending on their specific needs.

Databases play a vital role in these processes. They can be broadly categorized into two types:

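1. OLTP (Online Transaction Processing): These workloads consist of many short transactions that insert, update, and look up individual records, such as placing an order or posting a message. They are write-intensive compared with analytics, and relational databases are the classic choice for them.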

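2. OLAP (Online Analytical Processing): These are read-intensive workloads, optimized for analyzing large volumes of data, typically by scanning and aggregating many records at once. Column-oriented databases and data warehouses usually fall into this category. Document stores, key-value databases, and graph databases are better described by their data model than by this split, and most of them serve operational rather than analytical workloads.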
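To make the two workload shapes concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the sample values are hypothetical, and a single in-memory SQLite database stands in for what would normally be two separate systems.

```python
import sqlite3


def run_demo():
    # In-memory database; the "orders" table and its data are made up.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )

    # OLTP-style work: many small writes, each in its own short transaction.
    for customer, amount in [("alice", 20.0), ("bob", 35.0), ("alice", 15.0)]:
        with conn:  # commits when the block exits without an error
            conn.execute(
                "INSERT INTO orders (customer, amount) VALUES (?, ?)",
                (customer, amount),
            )

    # OLTP-style read: fetch one record by its key.
    one_order = conn.execute(
        "SELECT customer, amount FROM orders WHERE id = ?", (2,)
    ).fetchone()

    # OLAP-style read: scan the whole table and aggregate it.
    totals = conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()

    conn.close()
    return one_order, totals
```

The point lookup touches a single row, while the aggregate has to read every order. On a table with billions of rows, that difference is what drives the choice of storage engine.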

Relational databases are traditionally used for transactional, write-heavy applications that require strict adherence to ACID properties (Atomicity, Consistency, Isolation, Durability). In contrast, document and key-value databases are typically optimized for fast reads and writes of individual records by key, which suits simple retrieval tasks more than large-scale analytics. Graph databases are particularly useful when dealing with many-to-many relationships, as they allow for efficient querying of highly interconnected data.
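As an illustration of atomicity, the "A" in ACID, the sketch below moves money between two accounts with Python's built-in sqlite3 module. The `accounts` table, the names, and the balances are hypothetical. The credit is deliberately applied before the debit so that a failed debit shows the earlier credit being rolled back.

```python
import sqlite3


def make_bank():
    # Hypothetical schema: the CHECK constraint forbids negative balances.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A transfer of 30 from alice to bob succeeds and changes both balances. A transfer of 500 fails and leaves both balances exactly as they were, which is the all-or-nothing behavior that transactional applications rely on.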

Businesses use vast amounts of data to make strategic decisions. During this process, most queries are read-intensive, as they involve analyzing existing data rather than writing new data. Row-oriented relational databases can be inefficient for these tasks because they store each record's attributes together and therefore read entire rows from disk, even when a query needs only one or two attributes. This is where column-based databases come in handy. They store the values of each attribute together, often in separate files, so an aggregate function like an average or a sum reads only the columns it needs.
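The difference between the two layouts can be sketched in plain Python. This is a toy model rather than how any real engine stores data: the records and field names are made up, a list of dictionaries stands in for row storage, and a dictionary of lists stands in for column storage.

```python
# Row layout: each record keeps all of its attributes together.
ROWS = [
    {"id": 1, "customer": "alice", "amount": 20.0},
    {"id": 2, "customer": "bob", "amount": 40.0},
    {"id": 3, "customer": "carol", "amount": 30.0},
]


def to_columns(rows):
    """Convert the row layout into a column layout: one list per attribute."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def average_from_rows(rows, attribute):
    # Every full record is visited, even though only one field is used.
    return sum(row[attribute] for row in rows) / len(rows)


def average_from_columns(columns, attribute):
    # Only the one list holding the requested attribute is read.
    values = columns[attribute]
    return sum(values) / len(values)
```

Both functions return the same average. The column version never looks at `id` or `customer`, which on disk means far less data read for wide tables.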

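However, neither row-based nor column-based databases are ideal for all scenarios, particularly when the data is highly interconnected, as in a social network of friends and followers. Following several hops of relationships in those databases usually means chaining many joins. In such cases, graph databases offer a more efficient solution, because they store relationships directly and can traverse them without repeated joins.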
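A relationship traversal of this kind can be sketched with an adjacency list and a breadth-first search. This is only an in-memory illustration with a hypothetical friendship graph. Real graph databases use their own storage formats and query languages.

```python
from collections import deque

# Hypothetical friendship graph stored as an adjacency list.
FRIENDS = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice"],
    "dave": ["bob", "erin"],
    "erin": ["dave"],
}


def reachable_within(graph, start, max_hops):
    """Return {person: hops} for everyone within max_hops of start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(person, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[person] + 1
                queue.append(neighbor)
    del hops[start]  # the starting person is not part of the result
    return hops
```

Calling `reachable_within(FRIENDS, "alice", 2)` finds alice's friends and friends of friends by following stored links. The equivalent query against a relational table would need one self-join per hop.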

In conclusion, databases are a core component of every system, and understanding their various types and trade-offs is crucial for developing efficient systems. Whether you’re dealing with OLTP or OLAP workloads, selecting the right database type can make all the difference.

Thank you for reading.