Understanding Buffer vs. Cache in Linux Memory

Linux TLDR
🚀
Quick Overview

The two are hard to sum up in one sentence, but for simplicity, you can think of buffers as memory that holds raw disk blocks, including file metadata (permissions, location, etc.), during I/O operations for efficient transfer, while cache stores actual file contents in memory to speed up future retrieval.

buffer vs cache

What are Buffers and Caches in Linux Memory?

You have probably come across the terms “buffer” and “cache” at some point, and you might be wondering what the difference between the two is.

If you look at their definitions, you’ll notice they share a similar philosophy but differ in functionality.

Definition of 🧾 Buffer: A buffer is a temporary storage area that holds data while it is being moved from one place to another. In Linux memory accounting, buffers hold raw disk blocks, including file metadata such as permissions and location, during I/O operations. Buffers are used throughout computing, for example in networking, I/O operations, and visual processing. [1]

Definition of 📦 Cache: A cache temporarily holds frequently used data so that it can be served faster than from its original source. CPU caches do this for RAM, which is slower and located further away from the CPU; the Linux page cache does the same for disks by keeping file contents in RAM. A cache hit speeds up data retrieval, enhancing system efficiency.

As you can see, both store temporary but different types of data to speed things up, whether for I/O operations or for CPU optimization.

But what exactly is the difference between them? Let’s dig deeper.

Difference Between Buffer and Cache in Linux Memory

To understand the difference between these terms, let’s first look at what exactly buffer and cache [2] are.

What is 🧾 Buffer?

Buffers are the in-memory representation of disk blocks. They hold raw block data, including the metadata of files, while that data is transferred from one place to another, and they are stored within the page cache.

When data is requested through the page cache, the kernel first consults the buffer, whose metadata points to the actual file data in the page cache, so the buffer serves as an intermediary.

Buffering usually follows simple management strategies, such as first in, first out (FIFO) or first come, first served (FCFS).

What is 📦 Cache?

A cache stores frequently used data or instructions to speed up future requests. In Linux, the kernel keeps copies of disk data in RAM, which improves I/O performance because memory provides much faster access than the original source.

The data to be cached is chosen using complex algorithms: the kernel manages the page cache, and modern CPUs manage their own hardware caches with built-in mechanisms. Of course, you do not have to enable either of them separately; they’re preconfigured for your system.

Caches usually follow sophisticated management strategies, such as least recently used (LRU) or least frequently used (LFU), to decide which data to keep in the cache and which data to evict when the cache is full.

Comparison of Buffer and Cache

Here’s a side-by-side comparison of cache and buffer in tabular format to give you more insights:

| Feature | Buffer | Cache |
| --- | --- | --- |
| Purpose | Temporarily holds data during transfer between two processes or components to facilitate efficient data transfer. | Stores frequently accessed data or instructions to improve data retrieval speed. |
| Content | Holds data exactly as it is received. | Stores frequently accessed data or instructions. |
| Management | Typically follows a FIFO or FCFS strategy. | Uses complex algorithms (e.g., LRU, LFU) to decide what to cache. |
| Access Speed | Does not necessarily provide faster access than the source. | Provides faster access to data than the original source. |
| Usage Examples | Data transferred between a hard disk and memory may be buffered. | Web browsers cache web pages for quick loading. |
| Storage Location | Temporary storage in RAM or other memory locations | Typically in high-speed memory (RAM or SSD) |
| Access Control | Managed by the system and sometimes by applications | Managed by the system or applications |
| Size | Buffer sizes vary depending on the use case but can also be smaller than main memory or storage. | Cache sizes are relatively small compared to main memory or storage. |
| Latency Reduction | It helps mitigate latency by optimizing data transfer. | Reduces latency by storing frequently accessed data closer to the CPU. |
| Common Usage in Linux | Used in I/O operations such as reading and writing to and from storage devices. | File system caches and page caches are examples. |

I hope the terminology and the differences are clear to you now. Let’s check out some practical examples on a Linux system to understand them better.

Understanding Buffer and Cache with Examples

You can monitor buffer and cache usage in Linux using various tools like “free”, “top”, and “vmstat”. These tools provide information about memory usage, including buffers and caches.

To get a better estimate of how much memory is really free and how much is used by buffers and cache, run:

$ free -mh

Output:

free command output with data shown in megabytes with human-readable format

Command breakdown:

  • -m: show output in mebibytes (when combined with -h, the unit is picked automatically)
  • -h: show human-readable output

If you look at the above picture, you will find the system has around 8 GB of RAM, of which 1.1 GB is used by running processes, 46 MB is shared memory, and 2.7 GB is used for buffers and cache.
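If you are curious where the buff/cache figure comes from, here is a minimal sketch. It assumes a recent procps version of “free”, whose manual page describes buff/cache as the sum of the Buffers, Cached, and SReclaimable fields of “/proc/meminfo”; the function name buff_cache_kib is made up for this example.

```shell
# Sum the fields that free(1) documents as "buff/cache".
# Reads meminfo-style text from the file given as $1 (default: /proc/meminfo).
# buff_cache_kib is a hypothetical helper name, not part of any tool.
buff_cache_kib() {
    awk '$1 == "Buffers:" || $1 == "Cached:" || $1 == "SReclaimable:" { sum += $2 }
         END { printf "%d\n", sum }' "${1:-/proc/meminfo}"
}

# On a live system, print the result in MiB.
if [ -r /proc/meminfo ]; then
    echo "buff/cache: $(( $(buff_cache_kib) / 1024 )) MiB"
fi
```

The number should land close to the buff/cache column of “free”, though small differences are normal because the values keep changing between reads.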

Linux uses so much memory for disk cache to prevent RAM from going to waste. Keeping the cache means that if something needs the same data again, there’s a good chance it will still be in the cache in memory. [3]

Accessing data from memory is roughly 1,000 times faster than retrieving it from the hard disk. If the data is not in the cache, the hard disk has to be read anyway, so nothing is lost by keeping the cache.

Q.1: What if you want to run more applications?

When you need to run resource-intensive applications that require more memory, your system will dynamically allocate the required memory to them by reclaiming it from already cached resources.

Q.2: Why do top and free show that most RAM is already consumed if it isn’t?

top and free report the full amount of memory holding buffer and cache data so that users can see exactly how memory is being consumed in their system [4].

To properly understand this, you need to become familiar with terms like occupied, used, free, and available in Linux, which are shown in the following table:

| Memory that is used by | You’d call it | Linux calls it |
| --- | --- | --- |
| Running applications | Occupied | Occupied |
| Buffer and Cached | Used | Used (and Available) |
| Rest of the Memory | Free | Free |

So, in layman’s terms, memory used by buffers and cache shows up as used in the free command output, but behind the scenes it is also available and is handed over whenever there is demand.
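A quick way to see the gap between free and available memory on your own machine is sketched below. It assumes a kernel that exposes the MemAvailable field in “/proc/meminfo” (added in Linux 3.14); meminfo_field is a hypothetical helper name used only for illustration.

```shell
# Print one meminfo field (value in kB).
# $1: field name without the colon, $2: file to read (default: /proc/meminfo).
# meminfo_field is a hypothetical helper name.
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# MemAvailable is the kernel's estimate of memory that applications can
# still get without swapping; it is usually larger than MemFree because
# it counts buffer and cache pages that can be reclaimed.
if [ -r /proc/meminfo ]; then
    echo "free:      $(meminfo_field MemFree) kB"
    echo "available: $(meminfo_field MemAvailable) kB"
fi
```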

Q.3: Where does the data shown by the top and free commands come from?

The actual data that top and free commands show comes from the “/proc/meminfo” file, a special kind of file in Linux that provides detailed information about the system’s memory usage and statistics.

💡
/proc is a virtual file system through which the kernel exposes information about running processes and the system itself.

You can directly read this file using the cat command to get information about your system memory.

$ cat /proc/meminfo

Output:

reading the meminfo file in proc directory using cat command

The values in “/proc/meminfo” are reported in kilobytes (labelled kB), and you may need to interpret them alongside other system monitoring tools and commands to get a comprehensive view of memory usage.

This data is often used by system administrators and developers to check the status of memory resources and identify potential issues related to memory utilization for diagnosis.

Demonstration of Buffer and Cache Working in Linux

At this point, you might conclude that a buffer holds data that is about to be written, while cache holds data that has already been read from files and kept in memory.

But there is a twist: buffer can also be used for reading, and cache can also be used for writing. Let me explain with an example.

📝
Before conducting this experiment, ensure you have a Linux system with appropriate permissions to create and manipulate files.

Experimenting with Buffer

To begin, open your terminal and start with an empty cache. You can clear the filesystem cache using the following command:

$ free -mh
$ sync
$ echo 3 | sudo tee /proc/sys/vm/drop_caches
$ free -mh

Output:

clearing the filesystem cache using the drop_caches file

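Writing “3” to “/proc/sys/vm/drop_caches” tells the kernel to drop clean cached data: “1” frees the page cache, “2” frees reclaimable slab objects such as dentries and inodes, and “3” frees both. The file is writable only by root, so a plain “echo 3 > /proc/sys/vm/drop_caches” works only in a root shell.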
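The accepted values can be wrapped in a small helper, sketched below. The function name drop_caches and its optional second argument (a stand-in file for dry runs) are assumptions made for this example; the meaning of 1, 2, and 3 follows the kernel documentation for this file.

```shell
# Flush dirty pages, then ask the kernel to drop clean caches.
# $1: 1 = page cache, 2 = reclaimable slab objects (dentries, inodes), 3 = both.
# $2: control file; defaults to the real one, overridable for a dry run
#     (this second argument is an assumption added for illustration).
drop_caches() {
    mode="${1:-3}"
    target="${2:-/proc/sys/vm/drop_caches}"
    case "$mode" in
        1|2|3) ;;
        *) echo "mode must be 1, 2, or 3" >&2; return 1 ;;
    esac
    sync                        # write dirty pages out first; only clean pages are dropped
    echo "$mode" > "$target"    # needs root when the target is the real control file
}
```

The function has to run in a root shell when it targets the real control file, because writing there requires root privileges.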

Now, run the following “vmstat 2” command in the same terminal window to monitor and display system performance statistics related to virtual memory at regular intervals (2 seconds in this case).

$ vmstat 2

Output:

vmstat command output

In the above picture, you only have to focus on the “buff” column, which stands for buffer, and the “cache” column, which stands for cache. Both are measured in kibibytes (KiB) by default.
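If the full vmstat table feels noisy, the two columns of interest can be filtered out with awk. This is a minimal sketch that assumes the usual vmstat header row (starting with “r” and containing “buff” and “cache”); the function name buff_cache_columns is hypothetical.

```shell
# Print only the buff and cache columns from vmstat output on stdin.
# Column positions are looked up from the header row instead of being
# hard-coded. buff_cache_columns is a hypothetical helper name.
buff_cache_columns() {
    awk '
        $1 == "r" {                      # header row: remember column positions
            for (i = 1; i <= NF; i++) {
                if ($i == "buff")  b = i
                if ($i == "cache") c = i
            }
            next
        }
        b && c && $1 ~ /^[0-9]+$/ {      # data rows start with a number
            print "buff=" $b " KiB, cache=" $c " KiB"
        }
    '
}

# Usage on a live system (five samples, two seconds apart):
#   vmstat 2 5 | buff_cache_columns
```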

Now launch another terminal window or tab and run the following dd command, which reads directly from a disk partition and discards the data. Replace “/dev/sda3” with a partition that exists on your system; reading a raw device requires root privileges:

$ sudo dd if=/dev/sda3 of=/dev/null bs=100M count=500

Output:

reading the disk partition using the dd command for showcasing the buffer example

Now, immediately switch back to the previous terminal window, where the “vmstat 2” command is running, and observe the changes in the “buff” column.

buffer size is increasing in the vmstat command output

When you examine the “buff” and “cache” columns, you’ll notice that while using the dd command to read the disk, both the buffer and cache sizes increase, but the buffer experiences significantly faster growth.

Experimenting with Cache

For this experiment, again open a new terminal window or use the existing one to clear the filesystem cache using the following command:

$ free -mh
$ sync
$ echo 3 | sudo tee /proc/sys/vm/drop_caches
$ free -mh

Output:

clearing the buffer and cache from the system using special drop_caches file

Now, run the following “vmstat 2” command in the same terminal window.

$ vmstat 2

Output:

vmstat command output

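In the above picture, you only have to focus on the “cache” column. Then launch another terminal window or tab and run the following command, which writes a large file of zeros (roughly 49 GiB with these values, so lower “count” if disk space is limited):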

$ dd if=/dev/zero of=testfile bs=100M count=500

Output:

Creating a sample file using the dd command

Now, immediately switch back to the previous terminal window, where the “vmstat 2” command is running, and observe the changes in the “cache” column.

Monitoring the vmstat 2 output

Watching the “cache” column, you will find that it keeps growing as the dd command writes the file, while “buff” remains almost unchanged.

In other words, reading directly from the disk device fills the buffer, whereas reading or writing a file fills the cache.
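Both experiments can be condensed into a small helper that snapshots Buffers and Cached before and after a command. This is a rough sketch: the function name cache_growth and the MEMINFO override are assumptions for illustration, and any other activity on the system will also move the numbers.

```shell
# Run a command and report how much Buffers and Cached changed (in kB).
# MEMINFO may point at another meminfo-style file; it defaults to
# /proc/meminfo. cache_growth and MEMINFO are hypothetical names.
cache_growth() {
    src="${MEMINFO:-/proc/meminfo}"
    b0=$(awk '$1 == "Buffers:" { print $2 }' "$src")
    c0=$(awk '$1 == "Cached:"  { print $2 }' "$src")
    "$@" > /dev/null 2>&1          # the command under observation
    b1=$(awk '$1 == "Buffers:" { print $2 }' "$src")
    c1=$(awk '$1 == "Cached:"  { print $2 }' "$src")
    echo "Buffers: $(( b1 - b0 )) kB, Cached: $(( c1 - c0 )) kB"
}

# Example (writes a 200 MiB file, so Cached should grow the most):
#   cache_growth dd if=/dev/zero of=testfile bs=1M count=200
```

Using a smaller file than in the experiment above keeps the run short while still showing which counter moves.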

Final Word

In simple words, a buffer caches raw disk data that is about to be written to the disk or has been read from it, while cache holds the contents of files that are being read or written.

I hope this difference is clear to you. If you have any questions or spot anything that needs correcting, let us know in the comment section.

Till then, peace!
