Current Challenges in UBIFS – Richard Weinberger, Sigma Star GmbH

UBI is … a kind of LVM for flash devices. Static volumes are checksummed (i.e. corruption of their contents is detected), dynamic volumes aren’t. UBIFS doesn’t use OOB, because you don’t know how much is available (e.g. it could all be used for ECC).
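
The difference between checksummed and non-checksummed volumes can be sketched in a few lines. This is only an illustration of the idea, not UBI's on-flash format: zlib.crc32 stands in for the checksum, and store_static and read_static are hypothetical names.

```python
import zlib


def store_static(data: bytes) -> dict:
    """Keep the payload together with a CRC-32 computed at write time."""
    return {"data": data, "crc": zlib.crc32(data) & 0xFFFFFFFF}


def read_static(record: dict) -> bytes:
    """Return the payload, or raise if it no longer matches the stored CRC."""
    if zlib.crc32(record["data"]) & 0xFFFFFFFF != record["crc"]:
        raise IOError("static volume data failed its checksum")
    return record["data"]
```

A dynamic volume corresponds to storing only the data: a flipped bit that gets past ECC would be returned to the reader unnoticed.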

UBIFS is powercut tolerant, as opposed to almost all FTL devices (the device itself is often not powercut tolerant).

Flash chips are getting cheaper and bigger by using lower quality techniques, so even the reliability of SLC is not so great anymore.

Fastmap keeps a fixed pool of “open” blocks which will need to be scanned while attaching. When all of these have been written, a new fastmap checkpoint is written. The blocks containing the fastmap itself are pointed to by a superblock, which must be within the first 64 erase blocks (EB) of the partition (an arbitrary number). Usually the fastmap is a single block and is merged with the superblock. To enable fastmap, set CONFIG_MTD_UBI_FASTMAP=y and (once) set the fm_autoconvert UBI module parameter to 1; once the fastmap has been created it will be used. ubinize doesn’t create the fastmap, because it doesn’t know which blocks are bad and so can’t predict the exact location of each block.
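
The checkpoint-plus-pool idea can be modelled roughly as follows. This is a toy sketch, not the kernel implementation: FastmapModel and all of its fields are made up for illustration, and block contents are just strings.

```python
class FastmapModel:
    """Toy model of a fastmap checkpoint plus a fixed pool of open blocks."""

    def __init__(self, total_pebs: int, pool_size: int):
        self.flash = {}                       # peb -> contents really on flash
        self.free = list(range(total_pebs))   # blocks not yet handed out
        self.pool_size = pool_size
        self.checkpoints_written = 0
        self._write_checkpoint()

    def _write_checkpoint(self) -> None:
        # Snapshot everything known so far, then open a fresh pool.
        self.checkpoint = dict(self.flash)
        count = min(self.pool_size, len(self.free))
        self.pool = [self.free.pop(0) for _ in range(count)]
        self.unused = list(self.pool)
        self.checkpoints_written += 1

    def write(self, contents: str) -> int:
        if not self.unused:
            raise RuntimeError("no free blocks left in this toy model")
        peb = self.unused.pop(0)
        self.flash[peb] = contents
        if not self.unused:
            # Every pool block has been written: checkpoint again.
            self._write_checkpoint()
        return peb

    def attach(self):
        # Start from the checkpoint and scan only the pool blocks.
        state = dict(self.checkpoint)
        scanned = 0
        for peb in self.pool:
            scanned += 1
            if peb in self.flash:
                state[peb] = self.flash[peb]
        return state, scanned
```

With FastmapModel(total_pebs=10, pool_size=3), attach() always scans three blocks instead of ten, yet still recovers every block written since the last checkpoint.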

Since 3.15, ubiblock (= UBI volume as a read-only block device, cf. mtdblock, ideal for squashfs) is mainlined.

Since 4.0, online renaming of volumes is mainlined. This makes dual-image upgrades much easier.
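
Why renaming helps can be sketched with a toy volume table. The upgrade flow assumed here (write the new image to a spare volume, then swap names in one step so that whatever mounts by name picks up the new image) was not spelled out in the talk, and rename_volumes is a hypothetical helper, not a UBI interface.

```python
def rename_volumes(table: dict, renames: dict) -> dict:
    """Apply all renames in one step; return the new name -> volume-id table."""
    new_table = {}
    for name, vol_id in table.items():
        new_table[renames.get(name, name)] = vol_id
    if len(new_table) != len(table):
        raise ValueError("renames would leave two volumes with the same name")
    return new_table
```

Because all renames are applied together, the old table stays valid until the complete new one exists, so there is never a moment with no volume called rootfs.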

Data retention was not a big deal when UBI was created. Nowadays, both for MLC and SLC, read disturb is important: when you read a page very frequently, nearby pages may get corrupted (note: this gets worse when the flash has been erased a lot). Solution: read everything occasionally, so ECC detects bitflips and the block gets scrubbed. It’s not sufficient to just read /dev/ubi0_X, since then you don’t read the metadata. So instead, the approach is a new ioctl to get usage statistics, plus a userspace daemon, ubi-healthd, which decides when it’s time to do scrubbing. Since only relative info is needed, there’s no need to store the statistics on flash.
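
The kind of decision such a daemon has to make could look like the sketch below. Everything in it is an assumption for illustration: BlockStats, blocks_to_scrub, the thresholds and the wear adjustment are invented, and the statistics are passed in as plain objects rather than fetched through the ioctl.

```python
from dataclasses import dataclass


@dataclass
class BlockStats:
    peb: int
    reads_since_scrub: int   # relative counter, reset when the block is scrubbed
    erase_count: int


def blocks_to_scrub(stats, base_read_limit=100_000, max_erase_count=3_000):
    """Return the PEB numbers whose read counter reached a wear-adjusted limit."""
    due = []
    for s in stats:
        wear = min(s.erase_count / max_erase_count, 1.0)
        # Worn blocks get a lower limit: down to half the base at full wear.
        limit = base_read_limit * (1.0 - 0.5 * wear)
        if s.reads_since_scrub >= limit:
            due.append(s.peb)
    return due
```

The counters only need to be meaningful relative to the last scrub, which is why losing them at reboot costs at most an early or a late scrub.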

Paired pages: if a powercut happens while a page is being written, another page (the one it is paired with) may get corrupted as well. Several solutions are being evaluated, see the ELC presentation by Boris Brezillon.

Bitflips on empty space: UBIFS detects empty space by comparing with 0xff, so if there’s a bitflip in empty space, UBIFS assumes that a powercut has happened there. If there is ECC on erased blocks that’s OK, but not all NAND flash controllers (NFCs) support that.
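
A short sketch shows why a single bitflip is enough to confuse the check. is_empty_strict mirrors the 0xff comparison described above. is_empty_tolerant is an assumed alternative showing what bitflip handling on erased pages effectively buys you. Neither is UBIFS code, and the max_bitflips threshold is invented.

```python
def is_empty_strict(buf: bytes) -> bool:
    """Empty means every byte is exactly 0xff."""
    return all(b == 0xFF for b in buf)


def zero_bits(buf: bytes) -> int:
    """Count the bits that are 0, i.e. that differ from the erased state."""
    return sum(8 - bin(b).count("1") for b in buf)


def is_empty_tolerant(buf: bytes, max_bitflips: int = 1) -> bool:
    """Treat the buffer as erased if only a few bits have flipped to 0."""
    return zero_bits(buf) <= max_bitflips
```

For a 2048-byte erased page with one flipped bit, the strict check reports “not empty” while the tolerant one still reports “empty”. A page that was really written has far more zero bits than any sensible threshold.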

Recent patches add quota and access time support to UBIFS, so it can also be used for large data storage.

An unpacker is being created to be able to read a UBIFS image offline. This can be used to do an offline fsck and find out what went wrong.

Common UBIFS misuses: A separate UBI image for each UBIFS volume… this reduces the amount of space available for wear levelling. UBIFS in the bootloader: that’s way too much overhead; just support a static read-only volume for the kernel, which is much easier (a few hundred lines of code). There’s an implementation of fastmap in U-Boot, but it’s currently broken since it’s based on the old experimental code.

Although MLC has already been around for five years, people only care about it now because in the past we just used SLC or eMMC.

Unstable bit problem: a powercut during a block erase may create unstable bits which can never be read reliably anymore. However, this issue has never been demonstrated in practice.

