The L1 cache is split into separate instruction and data caches and is controlled directly by the processor. The L2 cache is a unified cache and is controlled by the L2C cache controller. The L1 data cache can only be used when the memory management unit (MMU) is on.
The L2 cache can be enabled by programming the L2C controller using memory-mapped registers. To see how the processor performs in various configurations, see the benchmarking results at ARM Benchmark Results.
Enabling the MMU

The MMU translates virtual addresses used by the processor into physical addresses that correspond to actual memory locations.
It also controls the caching behaviour of, and access to, different sections of the memory space. Several steps are involved when turning on the MMU:

- Invalidate the instruction, data, and unified TLBs.
- Invalidate the L1 instruction and data caches.
- Invalidate the branch predictor array.

For simplicity, only Level 1 translation tables are used. A flat one-to-one mapping is used, where each virtual address is mapped to the same physical address. This is done using 1MB 'sections'. Since the address space is 4GB, this requires 4096 translation table entries.
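The flat one-to-one section mapping can be sketched as follows. This is a hypothetical illustration, not LegUp's actual code: descriptor bit positions follow the ARMv7-A short-descriptor section format, but the attribute choices (domain 0, full access, write-back cacheable) are assumptions you would tailor to your system.

```c
#include <stdint.h>

#define NUM_SECTIONS 4096u            /* 4GB address space / 1MB sections */

/* Build one 1MB section descriptor mapping virtual == physical.
 * 'phys_base_mb' is the section index (address >> 20). */
uint32_t section_descriptor(uint32_t phys_base_mb)
{
    uint32_t d = phys_base_mb << 20;  /* section base address, bits [31:20] */
    d |= (3u << 10);                  /* AP[1:0] = 0b11: full access (assumed) */
    d |= (0u << 5);                   /* domain 0 (assumed) */
    d |= (1u << 3) | (1u << 2);      /* C, B set: write-back cacheable (assumed) */
    d |= 2u;                          /* descriptor type bits [1:0] = 0b10: section */
    return d;
}

/* Fill the Level 1 table with a flat mapping. On real hardware, 'table'
 * must be 16KB-aligned before its address is written to the TTBR. */
void build_flat_table(uint32_t *table)
{
    for (uint32_t i = 0; i < NUM_SECTIONS; i++)
        table[i] = section_descriptor(i);
}
```

Each of the 4096 entries covers 1MB, so entry i maps virtual addresses [i MB, i+1 MB) to the identical physical range.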
The L1 translation table must be 16KB-aligned in memory.

Set the domain access control register (DACR) to client or manager mode for the domain(s) used in the translation table entries.

The following steps are taken to enable the L2C cache controller:

- Set the way size.
- Set the read, write, and hold delays for the Tag RAM.
- Set the read, write, and hold delays for the Data RAM.
- Set the prefetching behaviour.
- Invalidate the cache.
- Enable the L2C cache controller.

The L2C also includes event counting registers that can be used to monitor hit and miss rates, as well as events related to speculative reads and prefetching.
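The L2C enable sequence above can be sketched as a series of memory-mapped register writes. The offsets below are those of the ARM L2C-310 (PL310) controller; the base address, way size, latency values, and prefetch setting are illustrative assumptions, so check your SoC's manual. The function takes the register block as a pointer, so the sequence can also be exercised against an ordinary array off-target.

```c
#include <stdint.h>

/* L2C-310 (PL310) register offsets, expressed as uint32_t indices. */
#define L2C_CTRL        (0x100u / 4)  /* control: bit 0 enables the cache */
#define L2C_AUX_CTRL    (0x104u / 4)  /* way size, prefetch behaviour */
#define L2C_TAG_LAT     (0x108u / 4)  /* Tag RAM read/write/setup latency */
#define L2C_DATA_LAT    (0x10Cu / 4)  /* Data RAM read/write/setup latency */
#define L2C_INV_WAY     (0x77Cu / 4)  /* invalidate by way (bitmask) */

void l2c_enable(volatile uint32_t *regs)
{
    /* Way size 16KB (aux ctrl bits [19:17] = 0b001) and data prefetch
     * (bit 28) are assumed values for this sketch. */
    regs[L2C_AUX_CTRL] = (1u << 17) | (1u << 28);
    regs[L2C_TAG_LAT]  = 0x111;       /* latencies: assumed values */
    regs[L2C_DATA_LAT] = 0x121;
    regs[L2C_INV_WAY]  = 0xFFFF;      /* invalidate all 16 ways */
    /* On real hardware, poll L2C_INV_WAY here until it reads 0. */
    regs[L2C_CTRL]     = 1u;          /* finally, enable the controller */
}
```

Note the ordering: the controller configuration and invalidation are completed before the enable bit is set, matching the step list above.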
If the L2 cache is enabled but the I and C bits (in the system control register) are cleared, the processor cannot take advantage of the L2 cache.

Memory Performance Optimizations

The following additional settings greatly enhance memory performance. The relevant ACE write transactions (see the ARM Cortex-A35 Processor Technical Reference Manual, revision r0p2) are:

- WriteBack: evictions of dirty lines from the L1 or L2 cache, or streaming writes that are not allocating into the cache.
- WriteClean: evictions of dirty lines from the L2 cache, when the line is still present in an L1 cache.
Exploring the Design of the Cortex-A15 Processor, ARM's next generation mobile applications processor (Travis Lanier): in the comparison of the Cortex-A9, Cortex-A5, and Cortex-A15, the processors use 32KB L1 caches and a 1MB L2 cache with a common memory system.
From the introduction to the ARM Cortex-A9 MPCore documentation: if an access misses in the L1 cache, there is still the opportunity to hit in the L2 cache before the request is finally forwarded to main memory. For write transactions to any coherent memory region, the SCU enforces coherence before the write is forwarded to the memory system.
This isn't a direct answer to your question, but I have a correction: on the Cortex-A9, the L1 caches are 4-way set associative and cache lines are only 32 bytes.
Getting that correct will probably have a larger impact on your modeling. Your specifications are correct for the Cortex-A. Specifically, I am interested in the Cortex-A8, Cortex-A9, and Cortex-A. My blind guess is that a Cortex-A9 processor has one L1 read port and one L1 write port, each 64 bits wide.
My other guess is that it has a single shared read/write port. ARM Cortex-A8 L1 data cache: 32 KB, 4-way, 64 B/line.
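For modeling purposes, the set count and index width follow directly from the figures quoted in this thread. A small helper, using the standard formula sets = size / (ways × line size); the specific numbers in the comment are the ones given above, not values I have independently verified:

```c
/* Number of sets in a set-associative cache. */
unsigned cache_sets(unsigned size_bytes, unsigned ways, unsigned line_bytes)
{
    return size_bytes / (ways * line_bytes);
}

/* Cortex-A8 L1D per the figures above: 32KB, 4-way, 64B lines -> 128 sets.
 * Cortex-A9 L1 per the correction above: 32KB, 4-way, 32B lines -> 256 sets. */
```

The difference in line size therefore doubles the number of sets (and adds one index bit) on the A9 relative to the A8, which matters for address-to-set mapping in a cache model.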
HVAB: an additional 6-bit tag per cache line (a hashed value produced from the VA and ASID/PID) used to speed up data cache way selection on a hit, so the critical read path does not need a TLB lookup and a full tag read/compare.
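The HVAB idea can be modeled as follows. The 6-bit hash function here is invented purely for illustration (the real hardware hash is not architecturally specified); what the sketch shows is the mechanism: each way stores a small hashed tag, and only a way whose hash matches needs a full tag compare.

```c
#include <stdint.h>

#define WAYS 4

/* Toy 6-bit hash of virtual address and ASID. The mixing steps are
 * arbitrary; any cheap hash with good spread would do. */
uint8_t hvab_hash(uint32_t va, uint8_t asid)
{
    uint32_t x = (va >> 5) ^ ((uint32_t)asid << 3);
    x ^= x >> 11;
    x ^= x >> 6;
    return (uint8_t)(x & 0x3Fu);      /* 6-bit hashed tag */
}

/* Return the way whose stored hash matches, or -1 if none does
 * (in which case a full tag compare across all ways is needed). */
int hvab_select_way(const uint8_t stored[WAYS], uint32_t va, uint8_t asid)
{
    uint8_t h = hvab_hash(va, asid);
    for (int w = 0; w < WAYS; w++)
        if (stored[w] == h)
            return w;                 /* only this way needs a full tag check */
    return -1;
}
```

Because the hash is only 6 bits, a match is a strong hint rather than a guarantee, which is why the selected way still undergoes a full tag compare; the win is that the TLB and full tag array are kept out of the common-case critical path.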