UBIFS file system for OLPC
Artem Bityutskiy
Artem.Bityutskiy at nokia.com
Wed Dec 12 05:26:24 EST 2007
Dear OLPC community,
we are developing a new flash file system and we have the first stable alpha
version of it. The name of the file system is UBIFS. It is designed to work on
top of UBI which is a wear-leveling and volume management system for Flash
devices. For a little more information, please refer to:
UBIFS: http://www.linux-mtd.infradead.org/doc/ubifs.html
UBI: http://www.linux-mtd.infradead.org/doc/ubi.html
UBI FAQ: http://www.linux-mtd.infradead.org/faq/ubi.html
========================
How to run UBIFS on OLPC
========================
1. Download the USB stick image:
http://bombadil.infradead.org/~dedekind/olpc-redhat-stream-development-devel_ubifs.img.bz2
http://bombadil.infradead.org/~dedekind/olpc-redhat-stream-development-devel_ubifs.img.bz2.md5
2. bunzip2 olpc-redhat-stream-development-devel_ubifs.img.bz2
3. This image is an ext3 image, which is mostly identical to the Build 625
olpc-redhat-stream-development-devel_ext3.img.bz2. It contains a partition
table, so you should write it to the whole USB stick device with something like
dd if=olpc-redhat-stream-development-devel_ubifs.img of=/dev/sdb
4. Boot OLPC from the USB stick, open the terminal, become root, and go to
the /ubifs_stuff directory.
5. Run install_ubifs.sh and it will install UBIFS on the internal NAND. It will
basically copy the contents of ext3 to the internal NAND, after formatting it
as UBIFS. Glance into the script - it is trivial.
WARNING: The script will wipe out all the data from your internal NAND flash.
6. Reboot, but do not remove the USB stick. Since the OLPC bootloader does not
understand UBI/UBIFS, it will load the kernel and the initrd image from the USB
stick, which in turn will mount the NAND as the root file system. The USB stick
would not be needed otherwise.
===
A few notes: yes, for now we have to use hacks to put UBIFS on OLPC. The
image is identical to the Build 625 ext3 image, but we put a UBIFS-aware kernel
on it and regenerated the initrd image. The init script in the initrd image
was also hacked a little; here is the patch:
http://bombadil.infradead.org/~dedekind/initrd.diff
The kernel we used is available at:
git://git.infradead.org/~dedekind/olpc-ubifs-2.6.git on the stable-ubifs branch.
It is based on the stable branch, but we back-ported some stuff from 2.6.23.
==================
Some UBIFS details
==================
We do not have a UBIFS design document or a white paper yet, but you are
welcome to ask questions.
UBIFS is not finished yet. Some features like xattr/ACL have to be implemented.
It needs a lot of optimizations. It needs more testing. It needs profiling. It
needs code-review and clean-ups. And so on. But the most difficult stuff is
already done.
We did test it to some extent, including recovery, which is needed if you
reboot uncleanly.
Here are some unsorted points about UBI/UBIFS without details.
1. At the moment we use LZO compression by default. We have zlib support as
well as "no compression" support, and this may be selected on a per-inode basis.
2. UBIFS has write-back support which makes it fast on re-writes. But this area
needs improvements.
3. UBIFS does not scan the whole media on mount and does not check it after
mount (JFFS2 does both). Note, however, that UBI does scan the media, but it
reads only 1-2 pages from each eraseblock, which is reasonably fast, at least
for OLPC's 1GiB flash. Currently the OLPC NAND driver does not support multiple
writes per NAND page, which makes UBI take longer to attach the flash (~2
seconds now, but could be reduced to ~1.5 I think) and waste more space (4KiB
per 128KiB eraseblock, i.e., ~3.1%, but could be reduced to ~1.6%).
4. UBIFS is designed to be tolerant of unclean reboots (as is UBI). When
recovering, UBIFS does not need to scan/read the whole media, just the journal,
which has a configurable size (~32MiB in OLPC by default).
5. The file system becomes much slower when it is close to full, but this is
normal for any flash FS. Indeed, we have to do a lot of GC work to turn small
pieces of dirty space scattered all over the place into free space.
6. Predicting the available space is _hard_, and this is not ideal now in
UBIFS. JFFS2 also has difficulties, but UBIFS has more, because we support
write-back. Indeed, if you have 10MiB of dirty pages in the page cache, how do
you know if they will be compressed or not, and how much space they will take
on flash? And this is not the only factor actually. Most of the time, the df
command will report _less_ available space than you will actually be able to
put on UBIFS. But it should never report more (the prediction is pessimistic).
This needs improvement, and we will work further on it.
7. We do not have mkfs.ubifs yet.
8. All the on-flash and in-ram data structures in UBIFS are trees, so it is
logarithmically scalable.
9. Because 2.6.22 does not have the register_shrinker() interface, we disabled
the TNC shrinker for the OLPC kernel so far, but I think this may be fixed.
The shrinker is responsible for freeing clean znodes when the system needs
memory. Znodes are elements (nodes) of the TNC. TNC is the Tree Node Cache,
which is the in-memory cache of the on-flash indexing B-tree. The indexing
B-tree indexes the whole contents of the FS.
10. With the UBI/UBIFS stack, OLPC does not have to store everything on one
partition. UBI gives you flexibility - you may have a separate UBI volume for
/home or /boot, etc. UBI takes care of wear leveling across the whole chip.
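The space figures in point 3 can be reproduced with a little arithmetic. The
Python sketch below assumes a 2KiB NAND page. That value is not stated above,
but it fits the numbers: 4KiB would be two pages per eraseblock and ~1.6% would
be one page. The function name is illustrative, and bad blocks and any other
reserved space are ignored.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def overhead_percent(overhead_bytes, eraseblock_bytes):
    """Share of each eraseblock used for per-eraseblock overhead, in percent."""
    return 100.0 * overhead_bytes / eraseblock_bytes


eraseblock = 128 * KIB   # eraseblock size given in point 3
page = 2 * KIB           # ASSUMPTION: NAND page size, inferred from the figures
flash = 1 * GIB          # OLPC flash size given in point 3

current = overhead_percent(2 * page, eraseblock)   # 4KiB per eraseblock today
reduced = overhead_percent(1 * page, eraseblock)   # if one page were enough

eraseblocks = flash // eraseblock
current_total_mib = eraseblocks * 2 * page / MIB
reduced_total_mib = eraseblocks * 1 * page / MIB

print(f"now:     {current}% per eraseblock, {current_total_mib} MiB in total")
print(f"reduced: {reduced}% per eraseblock, {reduced_total_mib} MiB in total")
```

Under these assumptions the difference between the two cases is one page per
eraseblock over the whole chip.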
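The pessimistic prediction in point 6 can be illustrated with a toy model. This
is not the UBIFS budgeting code. It is a Python sketch which assumes that data
is stored compressed only when that makes it smaller, and it uses zlib because
LZO is not in the Python standard library. Node headers, the index and other
factors are ignored, and the function names are made up.

```python
import os
import zlib


def stored_size(data):
    """Bytes the toy model puts on flash: compressed if that helps, else raw."""
    return min(len(zlib.compress(data)), len(data))


def pessimistic_budget(data):
    """Space reserved before write-back: assume the data will not compress."""
    return len(data)


dirty_pages = {
    "compressible": b"A" * (64 * KIB if (KIB := 1024) else 0),
    "incompressible": os.urandom(64 * 1024),
}

for name, buf in dirty_pages.items():
    reserved = pessimistic_budget(buf)
    used = stored_size(buf)
    # The reservation is never smaller than what is finally written.
    print(f"{name}: reserved {reserved} bytes, stored {used} bytes")
```

Until the dirty pages are actually written back, only the reserved figure is
known, which is why df tends to under-report the available space rather than
over-report it.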
We are very interested in your feedback and, of course, in getting help
from you.
Adrian Hunter,
Artem Bityutskiy,
University of Szeged team