Well, the slowest steps in the fetch-execute cycle are those that require accessing memory (Englander, 2007). CPU caching is a technique developed to minimize the impact that accessing memory has on the overall processing performance of a CPU.
The technique involves placing a small amount (or several small amounts) of high-speed memory between the CPU and main memory. This memory is referred to as cache memory, and it holds copies of data stored in main memory. Because cache memory is generally located on the CPU itself, or in locations that are quicker to access than main memory, the technique improves processing performance by reducing the number of trips across the memory bus needed to reach data in main memory. Briefly, this is how it works:
Cache memory differs from regular memory in that it is organized into blocks. Each block provides a relatively small amount of storage (generally 8 or 16 bytes) and contains copies of data from the most frequently used main memory locations. Each block also carries a tag recording the main memory address of the data held in the block.
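To make the idea concrete, one block might be modeled roughly as follows. This is a minimal Python sketch; the 16-byte block size and the field names are assumptions chosen for illustration, not a description of any particular CPU.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 16  # bytes per block; the text above describes blocks of 8 or 16 bytes

@dataclass
class CacheBlock:
    valid: bool = False   # has this block been filled with data yet?
    tag: int = 0          # main memory address the cached data was copied from
    data: bytearray = field(default_factory=lambda: bytearray(BLOCK_SIZE))
```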
For each step in the fetch-execute cycle that requires accessing memory, the CPU first checks whether the data already exists in the cache before referencing main memory. It does this through a component called the cache controller, which examines the tags to determine whether the requested address is already held in the cache.
If the data is found in the cache (a cache hit), the CPU uses it as though it were stored in main memory, saving the performance cost of a trip across the memory bus. If the data is not in the cache (a cache miss), it is copied from main memory into the cache for potential later reuse.
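One simple way to picture the tag check and the miss handling together is a toy direct-mapped cache, where each address can live in exactly one block. The sketch below is purely illustrative: the DirectMappedCache class, the 16-byte blocks, the 64-block capacity, and the use of a block's full starting address as its tag are all simplifying assumptions.

```python
BLOCK_SIZE = 16    # bytes per cache block (assumed)
NUM_BLOCKS = 64    # number of blocks in this toy cache (assumed)

class DirectMappedCache:
    """Toy direct-mapped cache: each address maps to exactly one block."""
    def __init__(self, main_memory: bytearray):
        self.memory = main_memory
        self.blocks = [{"valid": False, "tag": None, "data": bytearray(BLOCK_SIZE)}
                       for _ in range(NUM_BLOCKS)]

    def read_byte(self, address: int) -> int:
        block_start = (address // BLOCK_SIZE) * BLOCK_SIZE  # where the block begins in memory
        index = (address // BLOCK_SIZE) % NUM_BLOCKS        # which cache block to check
        tag = block_start   # simplified tag; real hardware stores only the upper address bits
        block = self.blocks[index]

        if block["valid"] and block["tag"] == tag:
            # Cache hit: serve the byte without touching main memory.
            return block["data"][address % BLOCK_SIZE]

        # Cache miss: copy the whole block from main memory for potential later reuse.
        block["data"][:] = self.memory[block_start:block_start + BLOCK_SIZE]
        block["tag"] = tag
        block["valid"] = True
        return block["data"][address % BLOCK_SIZE]

# Example: repeated reads to nearby addresses reuse the same cached block.
ram = bytearray(range(256)) * 4          # 1 KiB of pretend main memory
cache = DirectMappedCache(ram)
print(cache.read_byte(40))               # miss: block is copied in from "main memory"
print(cache.read_byte(41))               # hit: served from the cache
```

In real hardware the tag holds only the address bits not implied by the block index and offset, and the copy on a miss is done by the cache hardware rather than by software; the point of the sketch is only the hit-versus-miss decision.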
When multiple separate amounts of cache memory are implemented, they are referred to as levels of cache. The level closest to the cache controller, which is also generally the fastest to access, is referred to as L1 (for level 1). Subsequent levels are referred to as L2 (level 2) and L3 (level 3), although more than three levels are rarely implemented.
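Conceptually, a read then checks L1 first and falls through to L2, L3, and finally main memory only when each level misses. The sketch below shows just that fall-through order; representing each level as a plain dictionary and filling every level on a miss are simplifications assumed for illustration.

```python
# Toy illustration of lookup order across cache levels (not real hardware behavior).
# Each "level" is just a dict mapping block addresses to data.

def read(address, l1, l2, l3, main_memory):
    for name, level in (("L1", l1), ("L2", l2), ("L3", l3)):
        if address in level:
            print(f"hit in {name}")
            return level[address]
    # Missed every level: fetch from main memory and fill the caches on the way back.
    print("miss in all levels; reading main memory")
    value = main_memory[address]
    l3[address] = l2[address] = l1[address] = value
    return value

ram = {0x40: "data at 0x40"}
l1, l2, l3 = {}, {}, {}
print(read(0x40, l1, l2, l3, ram))   # misses everywhere, filled from main memory
print(read(0x40, l1, l2, l3, ram))   # now hits in L1, the fastest level
```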