How checking hashes can slow your Mac

Comparing Macs by their clock speed, the frequency of their CPU, can mislead badly. The fact that my Intel Mac has 8 cores running at 3.2 GHz doesn’t bring its performance close to that of the 6 P-cores running at 4.1 GHz in my M3 Pro, or even its 6 E-cores ambling along at no more than 2.7 GHz. One everyday function where you may notice this most is the computation of hashes.

Although hashing isn’t something we often choose to do ourselves, parts of macOS rely on it, and make the difference apparent in everyday tasks like opening an app. This article explains how older Macs can perform poorly relative to Apple silicon models, and how that can affect you. It also brings an update to my integrity-checking utility Dintch that you may find interesting.

Checking hashes

Although we associate code signing with security certificates, as I pointed out in my brief history, in recent years its use has come to focus more on CDHashes within the signature, as a means of identification and verification of executable code.

Checking the integrity of files such as those containing executable code is most simply performed using a checksum, by adding all the data together as if it were mere numbers, using modular arithmetic, and comparing that sum with what’s expected. The snag with checksums is that they can all too easily result in false negatives, and fail to detect changes. More sophisticated checksums (Fletcher 64) are used to verify file system metadata in APFS because of their ease and speed of computation, but they’re not strong enough to use for security.
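To illustrate how a simple checksum can miss a change, here is a minimal sketch. The function names are my own for this example, and the Fletcher-16 shown is only a small relative of Fletcher 64, not the APFS implementation. It sums bytes with modular arithmetic, then shows that reordering the same bytes goes undetected by the plain sum but is caught by the position-sensitive Fletcher variant.

```swift
// Simple additive checksum: sum all bytes modulo 2^32.
func additiveChecksum(_ bytes: [UInt8]) -> UInt32 {
    var sum: UInt32 = 0
    for byte in bytes {
        sum = sum &+ UInt32(byte)   // wrapping add gives arithmetic modulo 2^32
    }
    return sum
}

// Fletcher-16, for illustration only (hypothetical stand-in, not APFS code).
// The second running sum makes the result depend on byte order.
func fletcher16(_ bytes: [UInt8]) -> UInt16 {
    var sum1: UInt16 = 0
    var sum2: UInt16 = 0
    for byte in bytes {
        sum1 = (sum1 + UInt16(byte)) % 255
        sum2 = (sum2 + sum1) % 255
    }
    return (sum2 << 8) | sum1
}

let originalBytes: [UInt8] = [0x10, 0x20, 0x30, 0x40]
let reorderedBytes: [UInt8] = [0x40, 0x30, 0x20, 0x10]   // same bytes, different order

// The plain sum cannot tell these apart; the Fletcher variant can.
print(additiveChecksum(originalBytes) == additiveChecksum(reorderedBytes))   // true
print(fletcher16(originalBytes) == fletcher16(reorderedBytes))               // false
```

Neither is hard to defeat deliberately, which is why they’re unsuitable for security.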

Their replacement for security purposes is the family of hash functions known as Secure Hash Algorithms. SHA-1 produces a 160-bit hash, but was withdrawn from use after weaknesses were discovered, and replaced in most uses by SHA-256. That uses a more complex method to generate a 256-bit number using a one-way process, so there’s no way you can tell what the original contained from its hash, and each hash can fairly safely be assumed to be unique.
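As a small sketch of those properties using CryptoKit, the following hashes two strings that differ in a single character. The helper name sha256Hex is my own for this example. The two hashes bear no obvious relation to one another, and each is 256 bits long.

```swift
import Foundation
import CryptoKit

// Hypothetical helper: SHA-256 of a string's UTF-8 bytes, as lowercase hex.
func sha256Hex(_ text: String) -> String {
    let digest = SHA256.hash(data: Data(text.utf8))
    return digest.map { String(format: "%02x", $0) }.joined()
}

let firstHash = sha256Hex("abc")
let secondHash = sha256Hex("abd")   // one character changed

print(firstHash)
print(secondHash)
print(SHA256Digest.byteCount * 8)   // digest length in bits
```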

Inside each macOS code signature is a data structure containing SHA-256 hashes for all the code and other data protected by that signature, and that code directory is itself hashed to produce Code Directory Hashes, CDHashes (or cdhashes if you prefer). macOS security systems check the integrity of the code directory by comparing its saved CDHashes with fresh hashes of its contents, and can check the integrity of the code itself by comparing its hashes in the code directory with fresh hashes of the contents. There’s a lot of hashing going on.
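The two levels of hashing can be sketched with a toy model. This is a deliberate simplification: a real code directory has a header and other slots, and the names pageHashes and toyCDHash, and the page size of 4096 bytes, are assumptions made for this illustration only. The point is that changing a single byte of code changes one hash in the directory, and therefore the hash of the directory itself.

```swift
import Foundation
import CryptoKit

// Toy model only; not the real code directory layout.
let toyPageSize = 4096   // assumed page size for this illustration

// Hash the code one page at a time.
func pageHashes(of code: Data) -> [Data] {
    var hashes: [Data] = []
    var offset = 0
    while offset < code.count {
        let end = min(offset + toyPageSize, code.count)
        hashes.append(Data(SHA256.hash(data: code.subdata(in: offset..<end))))
        offset = end
    }
    return hashes
}

// Concatenate the page hashes into a toy "directory", then hash that.
func toyCDHash(of code: Data) -> Data {
    let directory = pageHashes(of: code).reduce(Data(), +)
    return Data(SHA256.hash(data: directory))
}

let cleanCode = Data(repeating: 0x90, count: 3 * toyPageSize)
var tamperedCode = cleanCode
tamperedCode[toyPageSize + 7] = 0x00   // alter one byte in the second page

print(toyCDHash(of: cleanCode) == toyCDHash(of: tamperedCode))   // false
```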

Hashing performance

We tend to take hashing for granted and assume that it just happens almost instantly. In fact, even when optimised in the likes of CryptoKit in macOS it’s computationally intensive, and throughput can be significantly slower than that of reading data from a fast SSD. To compare SHA-256 hashing performance on a range of Macs, I have modified my free file integrity utility Dintch to report the time taken by hashing operations. Details of this new version are given below.

I created files of standard sizes between 1 MB and 10 GB, tagged them with SHA-256 hashes using Dintch, then checked those hashes with time measurements. To check hashes, Dintch streams a file from disk while CryptoKit computes the hash on that stream, as shown in the source code in the Appendix at the end. Another advantage of using Dintch is that it has control over which CPU cores it’s run on, in Apple silicon Macs, by setting the QoS for that thread.
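For those who want to try something similar, here is a minimal sketch of timing a hash at a chosen QoS. It isn’t how Dintch does it: it assumes a Dispatch global queue rather than setting QoS on a thread directly, hashes data already in memory rather than a file stream, and the names TimingBox and secondsToHash are hypothetical.

```swift
import Foundation
import CryptoKit

// Holds the result; access is ordered by the semaphore below.
final class TimingBox: @unchecked Sendable {
    var seconds: Double = 0
}

// Hash data on a global queue at the given QoS, returning the elapsed time.
func secondsToHash(_ data: Data, qos: DispatchQoS.QoSClass) -> Double {
    let box = TimingBox()
    let done = DispatchSemaphore(value: 0)
    DispatchQueue.global(qos: qos).async {
        let start = DispatchTime.now().uptimeNanoseconds
        _ = SHA256.hash(data: data)
        let end = DispatchTime.now().uptimeNanoseconds
        box.seconds = Double(end - start) / 1_000_000_000
        done.signal()
    }
    done.wait()
    return box.seconds
}

let sampleData = Data(repeating: 0xA5, count: 16 * 1024 * 1024)   // 16 MB of test data
let highQoSTime = secondsToHash(sampleData, qos: .userInteractive)
let lowQoSTime = secondsToHash(sampleData, qos: .background)
print("high QoS: \(highQoSTime) s, low QoS: \(lowQoSTime) s")
```

On an Intel Mac there are no E cores to run on, so any difference between the two times there reflects scheduling priority alone.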

My iMac Pro proved far slower than either a MacBook Pro M3 Pro or Mac mini M4 Pro, even when the hashing was performed at minimum QoS and run on the E cores of the latter two Macs. At its fastest, with a high QoS, the iMac Pro took 2.4 seconds to hash a 1 GB file, which took only 2.1 seconds on the M3 Pro’s E cores, or 0.34 seconds on the M4 Pro’s P cores. Other results are illustrated in the chart below.

This shows times required to hash files of between 1 and 100 MB size, with linear regressions fitted with additional values for 1 and 10 GB. Points and lines in red are those at high QoS, and those in blue at low QoS. The iMac Pro points are filled circles, those for the M3 Pro crosses, and the M4 Pro open diamonds. The upper pair of lines are from the iMac Pro, and slower than even the E cores in the M3 Pro. The almost coincident lines closest to the X axis are those for P cores in the M3 Pro and M4 Pro.

Converting those regression coefficients into hashing rates gives the following:

iMac Pro 0.41 / 0.18 GB/s (fast / slow)
M3 Pro 2.71 / 0.52 GB/s
M4 Pro 2.92 / 0.64 GB/s.

All of these rates are slower than the read speeds of the SSD containing the files being hashed, indicating that performance is limited by the hash computation rather than by reading from disk.
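The conversion from a regression to a rate can be sketched as follows, assuming an ordinary least-squares fit of time against file size, with the rate taken as the reciprocal of the slope. The points used here are synthetic, generated from an assumed rate of 0.5 GB/s with a fixed overhead of 0.01 seconds, and are not my measurements; fitLine is a name of my own.

```swift
// Least-squares fit of seconds = intercept + slope * sizeGB.
func fitLine(sizesGB: [Double], seconds: [Double]) -> (slope: Double, intercept: Double) {
    let n = Double(sizesGB.count)
    let meanX = sizesGB.reduce(0, +) / n
    let meanY = seconds.reduce(0, +) / n
    var sxy = 0.0
    var sxx = 0.0
    for (x, y) in zip(sizesGB, seconds) {
        sxy += (x - meanX) * (y - meanY)
        sxx += (x - meanX) * (x - meanX)
    }
    let slope = sxy / sxx
    return (slope, meanY - slope * meanX)
}

// Synthetic points (NOT measured): assumed 0.5 GB/s plus 0.01 s fixed overhead.
let syntheticSizesGB = [0.001, 0.01, 0.1, 1.0, 10.0]
let syntheticSeconds = syntheticSizesGB.map { 0.01 + $0 / 0.5 }

let fitted = fitLine(sizesGB: syntheticSizesGB, seconds: syntheticSeconds)
let fittedRateGBps = 1.0 / fitted.slope   // slope is seconds per GB, so invert it
print(fittedRateGBps)
```

The intercept absorbs any fixed cost per file, so the slope reflects hashing throughput alone.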

Mitigations

When looking at slow launches of apps, I recently used Pages and Calibre as examples. If they were required to undergo full verification of their stored CDHashes, including the whole protected contents, that would take significant time. Considering just the Frameworks folder in Pages’ app bundle, that amounts to nearly 250 MB, taking well over 0.5 seconds running at high QoS on an iMac Pro, but that could be less than 0.1 seconds on an M4 Pro.

Although the Mach-O binaries in Calibre’s MacOS folder are only just over 2 MB in size, its Frameworks folder is almost 1 GB, so would take around 2.4 seconds on the iMac Pro, but only around 0.4 seconds on the M4 Pro.
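These estimates are simple arithmetic: size divided by hashing rate. The sketch below uses the high QoS rates given earlier, and rounds the folder sizes up to 0.25 GB and 1 GB, so its results are rough upper bounds that may differ slightly from individual measurements. The names are my own.

```swift
import Foundation

// Hashing rates at high QoS, in GB/s, from the figures given earlier.
let highQoSRates: [(mac: String, gbPerSecond: Double)] = [
    ("iMac Pro", 0.41), ("M3 Pro", 2.71), ("M4 Pro", 2.92)
]
// Folder sizes rounded up from "nearly 250 MB" and "almost 1 GB".
let frameworkFolders: [(name: String, sizeGB: Double)] = [
    ("Pages Frameworks", 0.25), ("Calibre Frameworks", 1.0)
]

// Time to hash = size / rate.
func estimatedSeconds(sizeGB: Double, rateGBps: Double) -> Double {
    sizeGB / rateGBps
}

for folder in frameworkFolders {
    for rate in highQoSRates {
        let seconds = estimatedSeconds(sizeGB: folder.sizeGB, rateGBps: rate.gbPerSecond)
        print("\(folder.name) on \(rate.mac): " + String(format: "%.2f", seconds) + " s")
    }
}
```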

In practice such hefty computational tasks are avoided. macOS uses cached values for hashes as much as possible, and doesn’t appear to verify the whole protected contents of its CDHashes even during first run security checks. Hashes for protected contents are also saved in the code directory in per-page form, so that each page can be verified individually, without having to compute the hash for an entire Mach-O binary. As explained in Apple’s Tech Note, “This allows the system to run a code-signed executable and check its code signature lazily (in the computer science sense of that word).” It adds that “macOS doesn’t always check code as it’s paged in.”
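Lazy per-page checking with caching can be sketched as a toy model. This isn’t how macOS implements it: the LazyVerifier type, its in-memory cache, and the page size of 4096 bytes are all assumptions made for illustration. Each page is hashed only when it’s first asked for, and a page that has already passed isn’t hashed again.

```swift
import Foundation
import CryptoKit

// Toy model only; names and page size are assumptions.
struct LazyVerifier {
    let pageSize: Int
    let expected: [SHA256Digest]       // per-page hashes saved at signing time
    var verifiedPages: Set<Int> = []   // cache of pages that have already passed

    // Verify one page on demand, hashing only that page's bytes.
    mutating func verify(page index: Int, in code: Data) -> Bool {
        if verifiedPages.contains(index) { return true }
        let start = index * pageSize
        let end = min(start + pageSize, code.count)
        let ok = SHA256.hash(data: code.subdata(in: start..<end)) == expected[index]
        if ok { verifiedPages.insert(index) }
        return ok
    }
}

let lazyPageSize = 4096
// Four pages, each filled with its own page number.
let signedCode = Data((0..<(4 * lazyPageSize)).map { UInt8($0 / lazyPageSize) })
let savedHashes = stride(from: 0, to: signedCode.count, by: lazyPageSize).map {
    SHA256.hash(data: signedCode.subdata(in: $0..<($0 + lazyPageSize)))
}

var verifier = LazyVerifier(pageSize: lazyPageSize, expected: savedHashes)
var codeOnDisk = signedCode
codeOnDisk[2 * lazyPageSize + 100] ^= 0xFF   // damage one byte in page 2

let pageZeroOK = verifier.verify(page: 0, in: codeOnDisk)
let pageTwoOK = verifier.verify(page: 2, in: codeOnDisk)
print(pageZeroOK, pageTwoOK)   // true false
```

Only two of the four pages were hashed here, which is the saving that per-page hashes make possible.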

But when a hash does need to be verified, expect substantial performance differences between Macs that might appear to have similar numbers of cores and clock frequencies.

Dintch 1.7

Dintch is one of three components of my file integrity system. This new version is recommended for those running Big Sur and later, as it optimises code for those more recent versions of macOS, and for Apple silicon Macs.

In addition, this new version 1.7 shows the time taken to check each file that has already been tagged with a SHA-256 hash. Even if you don’t want to use the app to protect file integrity, it can provide you with SHA-256 performance figures. To do that, click on the Tag button and select the folder whose files you want to use in the test. Once they have been tagged, tick the Verbose option, click the Check button and select that folder. The app’s window will show the time taken to check the SHA-256 hash on each tagged file in that folder. You can then compare the performance of your Mac against the figures I have given above for my Macs.

Dintch 1.7 is now available for Big Sur and later from here: dintch17, from Downloads above, from its Product Page, and via its auto-update mechanism.

Its siblings are:
Fintch 1.3 (Universal App for High Sierra to Sequoia) for drag-and-drop use on small folders and individual files, and
cintch 3 (Universal binary for Big Sur to Sequoia), a command tool with similar features.
Fuller details are on their Product Page.

Enjoy!

Reference

Apple TN3126: Inside Code Signing: Hashes

Appendix: Source code

This uses CryptoKit to hash a file stream using a buffer of size theBufferSize.

import Foundation
import CryptoKit

// theBufferSize is an Int defined elsewhere in the app.
func getHash(thePath: String) -> SHA256Digest? {
    guard let theFileStream = InputStream(fileAtPath: thePath) else { return nil }
    theFileStream.open()
    defer { theFileStream.close() }
    // Reusable read buffer, freed on every exit path.
    let theBuffer = UnsafeMutablePointer<UInt8>.allocate(capacity: theBufferSize)
    defer { theBuffer.deallocate() }
    var hasher = SHA256()
    while theFileStream.hasBytesAvailable {
        let read = theFileStream.read(theBuffer, maxLength: theBufferSize)
        if read < 0 { return nil }   // read error: don't return a hash of partial data
        if read == 0 { break }       // end of stream
        // Feed only the bytes actually read into the running hash.
        hasher.update(bufferPointer: UnsafeRawBufferPointer(start: theBuffer, count: read))
    }
    return hasher.finalize()
}