Speed or security? Speculative execution in Apple silicon

Making a CPU do more work requires more than increasing its frequency, it needs removal of obstacles that can prevent it from making best use of those cycles. Among the most important of those is memory access. High-speed local caches, L1 and L2, can be a great help, but in the worst case fetching data from memory can still take hundreds of CPU core cycles, and that memory latency may then delay a running process. This article explains some techniques that are used in the CPU cores of Apple silicon chips, to improve processing speed by making execution more efficient and less likely to be delayed.

Out-of-order execution

No matter how well a compiler and build system might try to optimise the instructions they assemble into executable code, when it comes to running that code there are ways to improve its efficiency. Modern CPU cores use a pipeline architecture for processing instructions, and can reorder them to maintain optimum instruction throughput. This uses a re-order buffer (ROB), which can be large to allow for greatest optimisation. All Apple silicon CPU cores, from the M1 onwards, use out-of-order execution with ROBs, and more recent families appear to have undergone further improvement.

In addition to executing instructions out of order, many modern processors perform speculative execution. For example, when code is running a loop to perform repeated series of operations, the core will speculate that it will keep running that loop, so rather than wait to work out whether it should loop again, it presses on. If it then turns out that it had reached the end of the loop phase, the core rolls back to where it entered the loop and follows the correct branch.

Although this wastes a little time on the last run of each loop, if it’s had to loop a million times before that, accumulated time savings can be considerable. However, on its own speculative execution can be limited by data that has to be loaded from memory in each loop, so more recently CPU cores have tried to speculate on the data they require.

Load address prediction

One common pattern of data access within code loops is in their addresses in memory. This occurs when the loop is working through a stored array of data, where the address of each item is at a constant address increment. For this, the core watches the series of addresses being accessed, and once it detects that they follow a regular pattern, it performs Load Address Prediction (LAP) to guess the next address to be used.

The core then performs two functions simultaneously: it proceeds to execute the loop using the guessed address, while continuing to load the actual address. Once it can, it then compares the predicted and actual addresses. If it guessed correctly, it continues execution; if it guessed wrong, then it rolls back in the code, uses the actual address, and resumes execution with that instead.

As with speculative execution, this pays off when there are a million addresses in a strict pattern, but loses out when a pattern breaks.

Load value prediction

LAP only handles addresses in memory, whose contents may differ. In other cases, values fetched from memory can be identical. To take advantage of that, the core can watch the value being loaded each time the code passes through the loop. This might represent a constant being used in a calculation, for example.

When the core sees that the same value is being used each time, it performs Load Value Prediction (LVP) to guess the next value to be loaded. This works essentially the same as LAP, with comparison between the predicted and actual values used to determine whether to proceed or to roll back and use the correct value.

This diagram summarises the three types of speculative execution now used in Apple silicon CPU cores, and identifies which families in the M-series use each.

Vulnerabilities

Speculative execution was first discovered to be vulnerable in 2017, and this was made public seven years ago, in early 2018, in a class of attack techniques known as Spectre. LAP and LVP were demonstrated and exploited in SLAP and FLOP in 2024-25.

Mechanisms for exploiting speculative designs are complex, and rely on a combination of training and misprediction to give an attacker access to the memory of other processes. The only robust protection is to disable speculation altogether, although various types of mitigation have also been developed for Spectre. The impact of disabling speculative execution, LAP or LVP greatly impairs performance in many situations, and isn’t generally considered commercially feasible.

Risks

The existence of vulnerabilities that can be exploited might appear worrying, particularly as their demonstrations use JavaScript running in crafted websites. But translating those into a significant risk is more challenging, and a task for Apple and those who develop browsers to run in macOS. It’s also a challenge to third-parties who develop security software, as detecting attempts to exploit vulnerabilities in speculative behaviour is relatively novel.

One reason we haven’t seen many (if any) attacks using the Spectre family of vulnerabilities is that they’re hardware specific. For an attacker to use them successfully on a worthwhile proportion of computers, they would need to detect the CPU and run code developed specifically for that. SLAP and FLOP are similar, in that neither would succeed on Intel or M1 Macs, and FLOP requires the LVP support of an M3 or M4. They’re also reliant on locating valuable secrets in memory. If you never open potentially malicious web pages when your browser already has exploitable pages loaded, then they’re unlikely to be able to take advantage of the opportunity.

Where these vulnerabilities are more likely to be exploited is in more sophisticated, targeted attacks that succeed most when undetected for long periods, those more typical of surveillance by affiliates of nation-states.

In the longer term, as more advanced CPU cores become commonplace, risks inherent in speculative execution can only grow, unless Apple and other core designers address these vulnerabilities effectively. What today is impressive leading-edge security research will help change tomorrow’s processor designs.

Speed or security? Speculative execution in Apple silicon

Out-of-order execution

Load address prediction

Load value prediction

Vulnerabilities

Risks

Further reading

Solutions to Saturday Mac riddles 308

Resolve time better in LogUI build 48

HBO’s The Last of Us S2E6 recap: Look who’s back!

Solutions to Saturday Mac riddles 308

Resolve time better in LogUI build 48

HBO’s The Last of Us S2E6 recap: Look who’s back!

Apple’s AI Crisis: WWDC 2025 Nears Without Any Siri Upgrade

Transparent Powerbeats Pro 2 prototype appears in newly leaked image

Out-of-order execution

Load address prediction

Load value prediction

Vulnerabilities

Risks

Further reading

More Stories

You may have missed