Disk Images: How read-write disk images have gone sparse

Until about three years ago, most types of disk image had fixed size. When you created a read-write disk image (UDRW) of 10 GB, it occupied the same 10 GB on disk whether it was full or empty. The only two that could grow and shrink in size were sparse bundles and sparse disk images, with the former generally preferred for its better resizing and performance.

This changed silently in Monterey. Since then, macOS has automatically resized read-write disk images, and saved them in APFS sparse file format. As this remains undocumented, this article explains how it works, and how and where you can use it to your advantage.

Sparse

In this context, the word sparse refers to two very different properties:

Sparse bundles and sparse (disk) images are types of disk image that can change in size, and can be stored on a wide range of file systems and storage, including on NAS and other networked storage.
Sparse files are stored in a highly efficient file format available in modern file systems, where only the data in a file is stored, and empty space within that file doesn’t waste storage space. It’s available in APFS, but not in HFS+.
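
The second of those can be seen with any file, not just a disk image. As a minimal sketch, the following Python writes a single byte at the end of an otherwise empty file, then compares its nominal size with the space allocated to it on disk. The function names are only illustrative, and whether the file ends up sparse depends on the file system it's written to.

```python
import os
import tempfile

def make_file_with_hole(path, nominal_size):
    """Create a file of nominal_size bytes with data only in its last byte."""
    with open(path, "wb") as f:
        f.seek(nominal_size - 1)   # skip over the empty region without writing it
        f.write(b"\x01")

def nominal_and_on_disk(path):
    """Return (nominal size, bytes allocated on disk) for a file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units, whatever the file system block size
    return st.st_size, st.st_blocks * 512

with tempfile.TemporaryDirectory() as folder:
    demo = os.path.join(folder, "hole.bin")
    make_file_with_hole(demo, 100 * 1024 * 1024)
    nominal, on_disk = nominal_and_on_disk(demo)
    print(f"nominal {nominal:,} bytes, on disk {on_disk:,} bytes")
```

On a file system that supports sparse files, the second figure should be far smaller than the first. On one that doesn't, such as HFS+, the two should be similar.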

Requirements

For read-write disk images to be sparse, the following are required:

The disk image must be saved to, and remain on, an APFS volume.
The file system within the disk image can be either APFS or HFS+, but not FAT or ExFAT.
The disk image must be created and unmounted first. For that initial mount, the disk image isn’t a sparse file, so occupies its full size on disk.
Whenever that disk image is mounted again, and has sufficient free space within its set limit, it will be saved in sparse file format, and occupy less than its full size on disk.

Demonstration

Create a new read-write disk image of at least 1 GB size with an internal file system of APFS on an APFS volume, using DropDMG, Disk Utility, or an alternative.
Once it has been created and mounted, unmount it in the Finder.
Select the disk image in the Finder and open the Get Info dialog for it. Confirm that its size on disk is the same as that set.
(Optional) Open the disk image using Precize, and confirm that it’s not a sparse file.
Double-click the disk image to mount it in the Finder, and wait at least 10 seconds before unmounting it.
Select the disk image in the Finder and open the Get Info dialog for it. Confirm that its size on disk is now significantly less than that set.
(Optional) Open the disk image using Precize, and confirm that it has now become a sparse file.
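
If you prefer the command line, the same steps might be scripted along the following lines. This is a sketch under several assumptions: that an image created by hdiutil behaves in the same way as one made in Disk Utility, that hdiutil leaves the new image unmounted after creating it, and that the volume mounts at /Volumes/SparseDemo. The function names and the volume name are mine, and run_demo() is defined but deliberately not called, as it creates a real disk image.

```python
import os
import subprocess
import time

def demo_commands(image_path, volume_name="SparseDemo", size="1g"):
    """Build the hdiutil command lines for creating, mounting and unmounting."""
    return {
        # -type UDIF creates a read-write (UDRW) image; -fs sets its internal file system
        "create": ["hdiutil", "create", "-size", size, "-fs", "APFS",
                   "-type", "UDIF", "-volname", volume_name, image_path],
        "attach": ["hdiutil", "attach", image_path],
        # Assumption: the volume mounts at /Volumes/<name>, which fails if that name is taken
        "detach": ["hdiutil", "detach", f"/Volumes/{volume_name}"],
    }

def size_on_disk(path):
    """Space allocated to a file in bytes (st_blocks is in 512-byte units)."""
    return os.stat(path).st_blocks * 512

def run_demo(image_path, wait_seconds=10):
    """Run the demonstration on macOS; image_path should end in .dmg."""
    commands = demo_commands(image_path)
    subprocess.run(commands["create"], check=True)
    before = size_on_disk(image_path)      # expected to match the size set
    subprocess.run(commands["attach"], check=True)
    time.sleep(wait_seconds)               # wait at least 10 seconds, as in the steps above
    subprocess.run(commands["detach"], check=True)
    after = size_on_disk(image_path)       # expected to be significantly smaller
    return before, after
```

Calling run_demo() with a path on an APFS volume should return two sizes, the second significantly smaller than the first if the disk image has been saved as a sparse file.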

How it works

When you create the disk image, macOS creates and attaches its container, and creates and mounts the file system within that. This is then saved to disk as a regular file occupying the full size of the disk image, plus the overhead incurred by the disk image container itself. No sparse files are involved at this stage.

When that disk image is mounted next, its container is attached through diskarbitrationd, then its file system is mounted. If that’s APFS (or HFS+), it undergoes Trimming, as with other mounts. That coalesces free storage blocks within the image to form one contiguous free space. The disk image is then saved in APFS sparse file format, skipping that contiguous free space. When the file system has been unmounted and the container detached, the space used to store the disk image has shrunk to the space actually used within the disk image, plus the container overhead. Unless the disk image is almost full, the amount of space required to store it on disk will be smaller than the full size of the disk image.

This is summarised in the diagram below.

The size of read-write disk images is therefore variable depending on the contents, the effectiveness of Trimming in coalescing free space, and the efficiency of APFS sparse file format.
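
To see which parts of a file are stored and which are skipped, you can ask the file system for its data regions. The following is a rough sketch using the SEEK_DATA and SEEK_HOLE options to lseek, which Python exposes on platforms that support them. It isn't how macOS itself works on disk images, and the granularity of the regions reported depends on the file system.

```python
import os
import tempfile

def data_extents(path):
    """Return a list of (offset, length) for the regions of a file that hold data."""
    extents = []
    with open(path, "rb") as f:
        fd = f.fileno()
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError:
                break                      # no more data beyond this offset
            end = os.lseek(fd, start, os.SEEK_HOLE)
            if end <= start:
                break                      # guard against a file system that makes no progress
            extents.append((start, end - start))
            offset = end
    return extents

with tempfile.TemporaryDirectory() as folder:
    demo = os.path.join(folder, "extents.bin")
    with open(demo, "wb") as f:
        f.write(b"\x01" * 4096)            # data at the start
        f.seek(8 * 1024 * 1024)            # leave a gap that may become a hole
        f.write(b"\x02" * 4096)            # data at the end
    size = os.path.getsize(demo)
    extents = data_extents(demo)
    print(size, extents)
```

Where the gap has been stored as a hole, the extents listed should add up to much less than the size of the file. Otherwise they should cover the whole file.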

Conversion to sparse file

When mounting an APFS file system in a read-write disk image, APFS tests whether the container backing store is a sparse format, or a flat file. In the case of a newly created read-write disk image that hasn’t yet been converted into a sparse file, that’s detected prior to Spaceman (the APFS Space Manager) scanning for free blocks within its file system. When free blocks are found, APFS sets the type of backing store to sparse, gathers the sparse bytes and ‘punches a hole’ in the disk image’s file extents to convert the container file into sparse format. That appears in the log as:
handle_apfs_set_backingstore:6172: disk5s1 Set backing store as sparse
handle_apfs_set_backingstore:6205: disk5 Backing storage is a raw file
_punch_hole_cb:37665: disk3s5 Accumulated 4294967296 sparse bytes for inode 30473932 in transaction 3246918, pausing hole punching
where disk5s1 is the disk image’s mounted volume, and disk3s5 is the volume in which the disk image container is stored.
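
If you want to pull the figures out of entries like these, a regular expression is sufficient. The pattern below is based solely on the single _punch_hole_cb entry shown above. As these log messages are undocumented, their format is an assumption and could change in any version of macOS.

```python
import re

# Pattern inferred from the one example entry above; not a documented format
PUNCH_HOLE = re.compile(
    r"_punch_hole_cb:\d+: (?P<volume>disk\d+s\d+) "
    r"Accumulated (?P<bytes>\d+) sparse bytes for inode (?P<inode>\d+)"
)

def parse_punch_hole(line):
    """Extract the volume, sparse byte count and inode from a log entry, or None."""
    match = PUNCH_HOLE.search(line)
    if match is None:
        return None
    return {
        "volume": match["volume"],
        "sparse_bytes": int(match["bytes"]),
        "inode": int(match["inode"]),
    }

entry = ("_punch_hole_cb:37665: disk3s5 Accumulated 4294967296 sparse bytes "
         "for inode 30473932 in transaction 3246918, pausing hole punching")
info = parse_punch_hole(entry)
print(info["volume"], info["sparse_bytes"] / 2**30, "GiB")
```

For the entry above, the 4294967296 sparse bytes accumulated are exactly 4 GiB.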

Trimming for efficient use of space

That conversion to sparse format is normally only performed once, but from then on, each time that disk image is mounted it’s recognised as having a sparse backing store, and Spaceman performs a Trim to coalesce free blocks and optimise on-disk storage requirements. For an empty read-write disk image of 2,390,202 blocks of 4,096 bytes each, as created in a 10 GB disk image, log entries are:
spaceman_scan_free_blocks:4106: disk5 scan took 0.000722 s, trims took 0.000643 s
spaceman_scan_free_blocks:4110: disk5 2382929 blocks free in 7 extents, avg 340418.42
spaceman_scan_free_blocks:4119: disk5 2382929 blocks trimmed in 7 extents (91 us/trim, 10886 trims/s)
spaceman_scan_free_blocks:4122: disk5 trim distribution 1:0 2+:0 4+:0 16+:0 64+:0 256+:7
accounting for a total of 9.8 GB.

Changes made to the contents of the disk image lead to a gradual reduction in Trim yield. For example, after adding files to the disk image and deleting them, instead of yielding the full 9.8 GB, only 2,319,074 blocks remain free, yielding a total of 9.5 GB.
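
Those totals are simply block counts multiplied by the block size. As a quick check of the arithmetic, assuming decimal gigabytes of 1,000,000,000 bytes:

```python
BLOCK_SIZE = 4096  # bytes per block, as given above

def blocks_to_gb(blocks, block_size=BLOCK_SIZE):
    """Convert a block count to decimal gigabytes (1 GB = 10**9 bytes)."""
    return blocks * block_size / 1e9

total = blocks_to_gb(2_390_202)   # all blocks in the disk image
empty = blocks_to_gb(2_382_929)   # blocks trimmed when empty
used = blocks_to_gb(2_319_074)    # blocks free after adding and deleting files
print(f"{total:.2f} {empty:.2f} {used:.2f}")   # prints 9.79 9.76 9.50
```

Rounded to one decimal place, those give the 9.8 GB and 9.5 GB quoted.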

For comparison, initial Trimming on a matching empty sparse bundle yields the same 9.8 GB. After file copying and deletion, and compaction of the sparse bundle, Trimming performs slightly better, yielding 2,382,929 free blocks for a total of 9.6 GB. Note that Trimming of sparse bundles is performed by APFS Spaceman separately from management of bands in backing storage, which isn’t a function of the file system.

Size efficiency

Although read-write disk images stored as sparse files are efficient in their use of disk space, they’re still not as compact as sparse bundles. For an empty 10 GB image, the read-write type requires 240 MB on disk, but a sparse bundle only needs 13.9 MB. After light use storing files, then deleting the whole contents, a 10 GB read-write disk image grows to occupy 501 MB, but following compaction a sparse bundle only takes 150 MB. That difference may not remain consistent over more prolonged use, though, and ultimately compacting sparse bundles may cease freeing any space at all.

It’s also important to remember that sparse bundles need to be compacted periodically, if any of their contents are deleted, or they may not reduce in size after deletions. Read-write disk images can’t be compacted, but reclaim disk space automatically.

Benefits and penalties

Read-write disk images saved as sparse files are different from sparse bundles in many ways. Like any sparse file, the disk image still has the same nominal size as its full size, and differs in the space taken on disk. Sparse bundles should normally only have band files sufficient to accommodate their current size, so their nominal size remains similar to the space they take on disk. The result is that, while read-write disk images in sparse file format will help increase free disk space, their major benefit is in reducing ‘wear’ in SSDs by not wasting erase-write cycles storing empty data.

Unlike the band file structure in sparse bundles, which can be stored on almost any disk, APFS sparse files have to be treated carefully if they are to remain compact. Moving them to another file system or over a network is likely to result in their being exploded to full size, and I have explained those limitations recently.

Both read-write disk images and sparse bundles deliver good read performance, but write performance is significantly impaired in read-write disk images and not in sparse bundles. Encryption of disk images and sparse bundles also has significant effects on their performance, and in some cases write performance is badly affected by encryption. I have previously documented their performance in macOS Monterey, and will be updating those figures for Sequoia shortly.

Summary

Read-write disk images saved on APFS storage in Monterey and later are no longer of fixed size, and should use significantly less disk space unless full, provided the disk image has been unmounted at least once since creation, and the file system in the disk image is APFS or HFS+.
This is because macOS now saves the disk image in sparse file format.
To retain the disk image in sparse file format, it needs to remain in APFS volumes, and normal precautions are required to maintain its efficient use of space. The disk image can’t be manually compacted, in the way that sparse bundles need to be.
The main benefit of this strategy is to minimise erase-write cycles, so reducing ‘wear’ on SSDs.
Sparse bundles are still more efficient in their use of disk space, and have higher write speeds, but read-write disk images are now closer in their efficiency and performance.

Previous articles

Introduction
Tools
