risk factor - ext4 and disk corruption

Martin Langhoff martin at laptop.org
Wed Sep 7 11:06:19 EDT 2011


A quick heads up -- Jon Nettleton has repeatedly hit disk corruption
on his development/test machine. Diagnosing the problem led
us to http://dev.laptop.org/ticket/11210 (erase-blocks doesn't work
with Toshiba eMMC devices, which is what we have on SKU 198, the
membrane kb units), but it is unclear whether that is the cause.

This is one of our current risks. It is a concern because we need to
know real soon whether or not this is a hw issue related to the eMMC
parts. We have hit corruption issues with ext4 in the past (#9513)
and fell back to ext3. It is not clear, however, that ext4 is at fault:
the whole 11.2.0 dev cycle was done under ext4 and no disk corruption
incidents were reported AFAIK.

From IRC discussion -- he seemed to hit it while:
 - developing, compiling,
 - using an ext SD card
 - running a patched kernel and xorg
 - presumably crashing a lot

We need to consider action around this:

 - try to force the error -- I'll set up a test rig for this, forcing
unclean shutdowns on a couple of SKU 198 units
 - keep our eyes open for disk corruption reports, especially in builds
including the new gfx code
 - potentially switch back to ext3 in OOB
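
For the test rig, one possible shape of the check is sketched below in
Python. This is only an illustration under assumptions, not what the rig
actually runs: the function names, the payload file names and the
manifest.json format are all hypothetical. The idea is to write files and
fsync them before the unclean shutdown, then compare checksums after
reboot. Anything that was fsync'd and still comes back missing or changed
is a candidate corruption report.

```python
import hashlib
import json
import os


def _fsync_dir(path):
    """Best-effort fsync of a directory so new entries reach the disk."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return  # platform does not allow opening directories
    try:
        os.fsync(fd)
    except OSError:
        pass  # some platforms/filesystems refuse fsync on a directory
    finally:
        os.close(fd)


def write_payloads(root, count=8, size=4096):
    """Write random files plus an fsync'd manifest of SHA-256 digests."""
    os.makedirs(root, exist_ok=True)
    manifest = {}
    for i in range(count):
        name = "payload-%03d.bin" % i
        data = os.urandom(size)
        with open(os.path.join(root, name), "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force data out before the power cut
        manifest[name] = hashlib.sha256(data).hexdigest()
    with open(os.path.join(root, "manifest.json"), "w") as f:
        json.dump(manifest, f)
        f.flush()
        os.fsync(f.fileno())
    _fsync_dir(root)
    return manifest


def verify_payloads(root):
    """Return sorted names of payloads that are missing or do not match."""
    with open(os.path.join(root, "manifest.json")) as f:
        manifest = json.load(f)
    bad = []
    for name, digest in manifest.items():
        try:
            with open(os.path.join(root, name), "rb") as fh:
                actual = hashlib.sha256(fh.read()).hexdigest()
        except OSError:
            bad.append(name)  # missing or unreadable after reboot
            continue
        if actual != digest:
            bad.append(name)  # content changed on disk
    return sorted(bad)
```

Usage would be to call write_payloads() on the internal storage, cut
power once it returns, and run verify_payloads() on the next boot. An
empty list means nothing that was fsync'd went bad on that cycle, which
of course still proves nothing about the next one.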

It is very hard to prove a negative. I am tracking this at
http://dev.laptop.org/ticket/11220

cheers,



m
-- 
 martin at laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff


