APFS: Snapshots

In APFS, snapshots are intended to be quick and simple to create, and provide a complete mountable copy of a volume at an instant in time. They thus consist of the required file system metadata, the complete set of extents for stored data that would be required to reinstate that volume as it was then, together with a copy of the volume superblock at the time the snapshot was created.

As further changes are made to an active volume, differences increase between the current file system and its extents, and those of all snapshots of that volume. Ultimately, if a snapshot were left until everything on the active version of that volume was different, that snapshot would effectively occupy the entire space used by that volume at the time the snapshot was made.
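That growth can be illustrated with a toy copy-on-write model. This is only a sketch under simplifying assumptions, not APFS code: each file is reduced to a single block number, and the function name space_held_by_snapshot and the block numbers are invented for the illustration.

```python
def space_held_by_snapshot(snapshot_blocks, live_blocks):
    """Return the blocks kept on disk only because the snapshot still needs them."""
    return set(snapshot_blocks.values()) - set(live_blocks.values())


# At the instant it's made, the snapshot shares every block with the live volume.
live = {"a.txt": 101, "b.txt": 102, "c.txt": 103}
snapshot = dict(live)  # frozen copy of the block map
shared_at_creation = space_held_by_snapshot(snapshot, live)

# Copy-on-write: a changed file gets new blocks, the old ones stay for the snapshot.
live["a.txt"] = 201
after_one_change = space_held_by_snapshot(snapshot, live)

# Once everything has changed, the snapshot holds the whole original volume.
live["b.txt"] = 202
live["c.txt"] = 203
after_all_changes = space_held_by_snapshot(snapshot, live)

print(len(shared_at_creation), len(after_one_change), len(after_all_changes))
```

The snapshot costs nothing extra when it's made, and its share of storage rises only as the live volume diverges from it.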

Because of this growth potential, Apple restricts access to snapshot features to code to which it grants the entitlements com.apple.developer.vfs.snapshot and com.apple.private.apfs.revert-to-snapshot. So far, it appears to have approved only apps that make backups and automatically delete their old snapshots to prevent them from overwhelming storage space. Occasionally, snapshots can become orphaned, in that the app that created them loses track of their existence, or is unable to delete them as intended. In those circumstances, manual deletion using an app with the required entitlements can free up a lot of space.

Metadata and objects

Each snapshot has its own Snapshot Metadata together with its extensions, including:

the object ID (OID) of the Extent Reference tree
the Volume Superblock OID
datestamps of the snapshot’s creation and last modification
an inum, which isn’t explained
flags, including one to indicate that there are dataless files pending, and another for a merge in progress
the snapshot’s name
the last transaction ID (XID) included in that snapshot
its UUID.

These are summarised in the diagram below.

Snapshots are declared as being read-only, and none of the operations performed on them in normal use appears to change their contents. Apple doesn’t explain why they contain their date of last modification, as that should always be the same as that of their creation.
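The fields listed above can be modelled as a simple record. This is a minimal sketch for orientation, not the on-disk layout: the field names, the flag bit values and every value in the example are assumptions made for illustration, and should be checked against Apple's APFS Reference before being relied on.

```python
from dataclasses import dataclass
from enum import IntFlag
from uuid import UUID


class SnapshotFlags(IntFlag):
    # Bit values are assumed for illustration only.
    PENDING_DATALESS = 0x1
    MERGE_IN_PROGRESS = 0x2


@dataclass(frozen=True)  # frozen mirrors the read-only nature of a snapshot
class SnapshotMetadata:
    extent_ref_tree_oid: int    # OID of the Extent Reference tree
    volume_superblock_oid: int  # OID of the copied Volume Superblock
    create_time: int            # datestamp of creation
    change_time: int            # datestamp of last modification
    inum: int                   # unexplained in the reference
    flags: SnapshotFlags
    name: str
    xid: int                    # last transaction included in the snapshot
    uuid: UUID


# Hypothetical values throughout.
example = SnapshotMetadata(
    extent_ref_tree_oid=1042,
    volume_superblock_oid=1043,
    create_time=1_700_000_000_000_000_000,
    change_time=1_700_000_000_000_000_000,
    inum=0,
    flags=SnapshotFlags(0),
    name="example-snapshot",
    xid=5678,
    uuid=UUID("00000000-0000-0000-0000-000000000001"),
)
print(example.name, example.xid, example.create_time == example.change_time)
```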

Transaction IDs, or XIDs, are important throughout APFS, and of critical importance in snapshots, as they keep track of the order of events within the file system. Each transaction in a file system is uniquely identified by this number, a 64-bit unsigned integer that increases monotonically from a value of 1 (zero isn’t a valid XID on disk). Because of its type, APFS should never run out of XIDs: Apple cites an example of creating a million transactions every second, in which case it would take over half a million years before the highest permissible XID was reached. If you could wait that long, you’d be disappointed to discover that the XID can’t roll over, and just becomes invalid.
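Apple's figure is easy to check. The sketch below assumes that the highest permissible XID is the largest 64-bit unsigned value, and uses a year of 365.25 days; next_xid is a hypothetical helper showing what it means for an XID not to roll over, not an APFS function.

```python
MAX_XID = 2**64 - 1       # assumed ceiling: largest 64-bit unsigned integer
RATE = 1_000_000          # transactions per second, as in Apple's example
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

years_to_exhaust = MAX_XID / RATE / SECONDS_PER_YEAR
print(f"{years_to_exhaust:,.0f} years")


def next_xid(current):
    """Return the XID after `current`, refusing to wrap around to zero."""
    if current < 1 or current >= MAX_XID:
        raise OverflowError("no valid XID follows this one")
    return current + 1
```

The result comes to a little over 584,000 years, consistent with Apple's claim.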

A snapshot life-cycle consists of creation, mounting and unmounting, and finally deletion.

Creation

Snapshot creation typically takes around 0.01 second. The first action recorded in the log is the locking of container global extents, to freeze them while the snapshot is created. Snapshot creation then waits for the volume purgatory cleaner to finish any cleaning it’s doing, then the log records
fs_insert_snapshot_metadata (249)
where 249 is the internal ID of that APFS operation. This log entry gives the snapshot name, the XID of the snapshot to be used as its identifier in future log entries, and the OIDs of its Extent Reference tree and Volume Superblock. The current volume’s extent reference tree is moved to the snapshot, a fresh, empty extent reference tree is created, and its OID becomes the new extent reference tree for the active volume.

Once that’s complete, container global extents are unlocked, and APFS resumes normal service.
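The order of events during creation can be sketched as a toy in-memory model. It isn't APFS code: the Volume class, the dictionary standing in for an extent reference tree and the lock are all stand-ins invented for illustration, and the wait for the purgatory cleaner and the copy of the volume superblock are left out.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Volume:
    xid: int = 1
    extent_ref_tree: dict = field(default_factory=dict)
    snapshots: list = field(default_factory=list)


# Stand-in for the lock on container global extents.
container_extents_lock = threading.Lock()


def create_snapshot(volume, name):
    with container_extents_lock:  # freeze extents while the snapshot is made
        snapshot = {
            "name": name,
            "xid": volume.xid,  # identifies the snapshot from now on
            "extent_ref_tree": volume.extent_ref_tree,  # tree moves to the snapshot
        }
        volume.snapshots.append(snapshot)
        volume.extent_ref_tree = {}  # fresh, empty tree for the active volume
    return snapshot  # lock released, normal service resumes


vol = Volume(xid=5678, extent_ref_tree={1000: 4, 2000: 8})
snap = create_snapshot(vol, "example-snapshot")
print(snap["xid"], snap["extent_ref_tree"], vol.extent_ref_tree)
```

Nothing is copied here beyond a reference to the existing tree, which is in keeping with creation being so quick.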

Mount

Snapshots can be mounted like any other volume, and their contents accessed. This is a brief process, marked by the log entry handle_snapshot_mount (1144) giving the volume, with its snapshot XID and superblock OID. Following that entry is handle_mount (817), which gives the UUID, block size, block count, encryption and features of the volume. Finally, nx_volume_group_update (704) gives the name and role of the freshly mounted snapshot.

Unmount

The log sequence for unmounting a snapshot is lengthier, as it has to ensure the number of mounted volumes is adjusted, as well as removing the volume from the virtual file system (VFS) used at a kernel level:

apfs_log_op_with_proc (3091) gives the volume name
gbitmap_update_thread_terminate (6110) is called but skipped
apfs_vfsop_unmount (3501) reports that snapshot ‘deletion’ has been completed on the live file system
apfs_vfsop_unmount (3567) reports the number of mounted volumes for that container
apfs: total mem allocated gives the amount of memory allocated to APFS
apfs_vfsop_unmount (3580) then reports ‘all done. going home’ followed by the total number of mounted APFS volumes.

Deletion

While snapshots are made in a fraction of a second, deleting them is a painstaking job for the container’s single Reaper task. First, container global extents are locked, just as for snapshot creation. Deletion starts with the log entry apfs_snap_vnop_remove (1107) and the named snapshot with its XID. The snapshot is then apparently renamed for the Reaper’s attention, to a name like com.apple.apfs.purgatory.3abc9, in an entry of fs_insert_snapshot_metadata (249) similar to that made during creation. Container global extents are then unlocked again.

The Reaper then starts to process the deleted snapshot, in the following log entry sequence:

cleanup_snapshot_purgatory (1805) announces the start of the process
delete_clone_fs (2915) is a volume backward merge from the following snapshot (with a higher XID) to the one being deleted; this gives current and next tree counts
cleanup_snapshot_purgatory (1826) announces that processing is complete

The volume backward merge is performed to reconcile extents between snapshots, and determine which extents are freed for reuse as a result of the deletion.
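The outcome of that reconciliation can be shown with a deliberately simplified model. It doesn't reproduce how APFS merges extent reference trees; it only assumes that an extent becomes free when neither the live volume nor any remaining snapshot still refers to it. The function name, XIDs and extent labels are all hypothetical.

```python
def extents_freed_by_deleting(target_xid, snapshots, live_extents):
    """Return extents referenced by the target snapshot and by nothing else."""
    still_needed = set(live_extents)
    for xid, extents in snapshots.items():
        if xid != target_xid:
            still_needed |= extents  # kept alive by another snapshot
    return snapshots[target_xid] - still_needed


# Three snapshots, keyed by XID, each holding a set of extent labels.
snapshots = {
    10: {"A", "B"},
    20: {"B", "C", "X"},
    30: {"C", "D"},
}
live = {"D", "E"}

freed = extents_freed_by_deleting(20, snapshots, live)
print(sorted(freed))
```

Only the extent unique to the deleted snapshot is released, as the others are still needed by its neighbours.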

The final phase is omap_cleanup (1544), the start of cleaning up the object map of the deleted snapshot. When that’s complete, the log entry reports the total snapshot count for that volume, and the XID of the most recent remaining snapshot.

For the removal of a fairly minimal snapshot, this process may take around 0.2 seconds, most of which is spent cleaning up the object map.
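When browsing log extracts, it can help to pick out the entries that mark each phase. The sketch below assumes only that entries contain a function name followed by its number in parentheses, as quoted in this article; the real layout of log lines isn't reproduced, the sample text is made up from those names alone, and the numbers may not hold for other versions of macOS. fs_insert_snapshot_metadata (249) is omitted as a marker because it appears in both creation and deletion.

```python
import re

# Function names and numbers exactly as quoted in this article.
PHASE_MARKERS = {
    ("handle_snapshot_mount", 1144): "mount",
    ("apfs_vfsop_unmount", 3580): "unmount",
    ("apfs_snap_vnop_remove", 1107): "deletion",
    ("cleanup_snapshot_purgatory", 1805): "deletion (Reaper)",
    ("omap_cleanup", 1544): "deletion (object map)",
}

# Matches text such as: omap_cleanup (1544)
ENTRY = re.compile(r"\b([A-Za-z_]\w*) \((\d+)\)")


def phases(text):
    """List the life-cycle phases whose marker entries appear in the text, in order."""
    found = []
    for name, number in ENTRY.findall(text):
        phase = PHASE_MARKERS.get((name, int(number)))
        if phase is not None:
            found.append(phase)
    return found


# Hypothetical extract built only from names given in this article.
sample = "\n".join([
    "handle_snapshot_mount (1144)",
    "handle_mount (817)",
    "apfs_snap_vnop_remove (1107)",
    "omap_cleanup (1544)",
])
print(phases(sample))
```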

Summary

Snapshots are quick and simple to create, and provide a complete mountable copy of a volume at an instant in time.
A snapshot consists of Snapshot Metadata and its extensions, extents required to reinstate that volume, and a copy of the volume superblock.
Snapshots can only be created and maintained by code granted restricted entitlements by Apple.
Transaction IDs, or XIDs, are of critical importance, and are used to identify snapshots in log entries.
During snapshot creation, container global extents are locked, the snapshot metadata created, the current extent reference tree moved to it, and a new tree created for the active volume.
Mounting a snapshot is also quick and simple, with a short series of log entries identifying it by XID.
Unmounting is slightly more complicated, as the snapshot has to be removed from the live file system, and mounted volume counts adjusted.
Deletion is slower and more complex still, including renaming for the Reaper, a backward merge from the following snapshot, then cleaning up its object map.

Articles in this series

1. Files and clones
2. Directories and names
3. Containers and volumes

Reference

Apple’s APFS Reference (PDF), last revised 22 June 2020.