ext4 jbd2 buffer-head migrate corruption

Filesystems that use buffer heads and cannot guarantee that there are no other references to the folio (for example, by holding the folio lock) must use buffer_migrate_folio_norefs() as the migrate_folio() callback of their address space mapping. Only three filesystems use this callback:

- the block device cache
- ext4, for its ext4_journalled_aops, i.e. jbd2
- nilfs2

jbd2's use of this callback, however, is very race prone. Consider folio migration while reviewing jbd2_journal_write_metadata_buffer() and the fact that jbd2:

- does not hold the folio lock
- does not have the page writeback bit set
- does not lock the buffer

And so it can race with folio_set_bh() during folio migration. Commit ebdf4de5642fb6 ("mm: migrate: fix reference check race between __find_get_block() and migration") added a spin lock to prevent races with page migration, which ext4 users were reporting through SUSE bugzilla bnc#1137609.
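
The reference check that this commit protects can be modeled in user space. The sketch below is not kernel code: `Buffer`, `Mapping`, `find_get_block()`, `put_block()` and `migrate_norefs()` are hypothetical stand-ins, and a Python lock plays the role of the mapping's spin lock. It only illustrates the general idea that migration checks the buffer reference count and moves the data under the same lock that lookups take, so a user that reaches the buffer without going through that path is invisible to the check. It makes no claim about which references jbd2 actually holds.

```python
import threading


class Buffer:
    """Hypothetical stand-in for a buffer head: a refcount plus a folio."""

    def __init__(self, folio):
        self.count = 0        # models the buffer reference count
        self.folio = folio    # models the folio backing the buffer's data


class Mapping:
    """Hypothetical stand-in for an address space with a single buffer."""

    def __init__(self):
        self.private_lock = threading.Lock()  # models the mapping's spin lock
        self.buffer = Buffer(folio="old-folio")

    def find_get_block(self):
        # Lookup side: take a reference while holding the lock.
        with self.private_lock:
            self.buffer.count += 1
            return self.buffer

    def put_block(self, bh):
        # Drop the reference taken by find_get_block().
        with self.private_lock:
            bh.count -= 1

    def migrate_norefs(self, new_folio):
        # Migration side: check references and move the data without
        # dropping the lock in between, so no lookup can slip in.
        with self.private_lock:
            if self.buffer.count != 0:
                return False  # a reference is held: refuse to migrate
            self.buffer.folio = new_folio
            return True


mapping = Mapping()

# A user that goes through find_get_block() is visible to migration.
bh = mapping.find_get_block()
refused = not mapping.migrate_norefs("new-folio")
mapping.put_block(bh)

# A user that remembers the folio without the lock or a reference is not:
# migration succeeds and the remembered folio goes stale.
cached_folio = mapping.buffer.folio
migrated = mapping.migrate_norefs("new-folio")
stale = cached_folio != mapping.buffer.folio

print(f"refused={refused} migrated={migrated} stale={stale}")
```

In this model the first migration attempt is refused because a reference is held, while the second succeeds and leaves `cached_folio` pointing at the old folio, which is the shape of problem the three missing protections listed above leave open.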

Although we don't have exact traces of the original filesystem corruption, we can reproduce filesystem corruption on ext4 on Linus' tree today, on v6.15-rc2, which has commit ebdf4de5642fb6 merged, by running generic/750 for about 20 hours on the ext4 2k block size filesystem profile.

Reproducing with kdevops

This is easily reproducible with kdevops using:

make defconfig-ext4_2k SOAK_DURATION=432000
make -j128
make bringup
make linux
make fstests
make fstests-baseline TESTS="generic/750"
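
For a sense of scale, the arithmetic below converts the SOAK_DURATION value used above into hours and days and compares it with the roughly 20 hours the corruption takes to appear. It assumes the value is expressed in seconds, which this document does not state explicitly.

```python
SOAK_DURATION = 432000   # value passed to make defconfig-ext4_2k; assumed to be seconds
REPRO_HOURS = 20         # approximate time to reproduce, as reported above

soak_hours = SOAK_DURATION / 3600      # seconds -> hours
soak_days = soak_hours / 24            # hours -> days
headroom = soak_hours / REPRO_HOURS    # how many times the reproduction window fits

print(f"soak: {soak_hours:.0f} h ({soak_days:.0f} days), "
      f"{headroom:.0f}x the ~{REPRO_HOURS} h needed to reproduce")
```

Under that assumption the soak window is 120 hours (5 days), about six times the reported reproduction time.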

Traces

See the traces/ directory.

General pattern

We now have a slew of traces collected for the possible ext4 corruptions. We used ChatGPT to provide a summary of them:

do_writepages() # write back -->
   ext4_map_blocks() # performs logical to physical block mapping -->
     ext4_ext_insert_extent() # updates extent tree -->
       jbd2_journal_dirty_metadata()  # marks metadata as dirty for
                                      # journaling. This can lead
                                      # to any of the following hints
                                      # as to what happened from
                                      # ext4 / jbd2

         - Directory and extent metadata corruption splats

         - Failure to handle out-of-space conditions gracefully, with
           cascading metadata errors and eventual filesystem shutdown
           to prevent further damage.

         - Failure to journal new extent metadata during extent tree
           growth, triggered under memory pressure or heavy writeback.
           Commonly results in ENOSPC, journal abort, and read-only
           fallback.

         - Journal metadata failure during extent tree growth causes
           read-only fallback. Seen repeatedly on small-block (2k)
           filesystems under stress (e.g. fsstress). Triggers errors in
           bitmap and inode updates, and persists in journal replay logs.
           "Error count since last fsck" shows long-term corruption
           footprint.

Call trace (ENOSPC journal failure):
  do_writepages()
    → ext4_do_writepages()
      → ext4_map_blocks()
        → ext4_ext_map_blocks()
          → ext4_ext_insert_extent()
            → __ext4_handle_dirty_metadata()
              → jbd2_journal_dirty_metadata() → ERROR -28 (ENOSPC)
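
The -28 in the trace above is a negated errno value. A minimal way to confirm the mapping on the test system is Python's standard errno module. This only decodes the number and says nothing about why jbd2 returned it.

```python
import errno
import os

ret = -28  # return value reported for jbd2_journal_dirty_metadata() in the trace

name = errno.errorcode[-ret]   # symbolic name for the errno value
message = os.strerror(-ret)    # human-readable description from the C library

print(f"{ret} -> {name}: {message}")
```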

And so jbd2 still needs more work to avoid races with folio migration.
