[Server-devel] XS server: raid 6 configuration with build 160, 161, 162 not ok

John Watlington wad at laptop.org
Tue Apr 8 10:46:43 EDT 2008


On Apr 7, 2008, at 12:24 AM, Stefan Reitz wrote:

> Something is fishy.
>
> The below described symptoms can all be seen with build 160.
>
> For a little while I thought I might have been alarming everyone
> without properly checking possible hardware causes. I ran drive
> tests, and one SATA cable came under suspicion. It turned out to be
> unreliable: it caused an intermittent contact problem that had
> previously been masked by a running (CentOS) RAID 6.
> But replacing the SATA cable and running the HDD manufacturer's
> utilities didn't fix it (low-level format of all 4 drives, then
> diagnostics on them; no errors).
> Removing any one drive seems to make the install work.

I've lost more than one day due to flaky SATA cables.

> I am confused by the XS setup messages:
> After the line: Starting HAL daemon: [ok]
> comes: FATAL: module md not found.
> then come several raid6 lines - 1st one: raid6: int 32x1 738MB/s
> [...]
> last one: raid6: using algorithm sse2x2 (3511 MB/s)
>
> Raid6 without module md and with only 3 hdds?

No, that code runs even if there is only a single disk (no RAID).
It appears to be a check for the fastest way to do the XOR on that
particular processor.
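
As a rough illustration of that point, the following Python sketch separates
the per-algorithm benchmark lines from the final "using algorithm" line in a
boot log. The function name parse_raid6_log and the tolerant regular
expressions are assumptions made for this example; the sample lines are the
ones quoted above, and neither kind of line says anything about whether an
array was assembled.

```python
import re

# "raid6: <name> <rate> MB/s" benchmark lines. The optional spaces are an
# assumption made to cope with the hand-retyped log quoted above.
BENCH_RE = re.compile(r"raid6:\s+([a-z0-9 ]+?)\s+(\d+)\s*MB/s")
# "raid6: using algorithm <name> (<rate> MB/s)" summary line.
CHOSEN_RE = re.compile(r"raid6:\s+using algorithm\s+(\S+)\s+\((\d+)\s*MB/s\)")


def parse_raid6_log(lines):
    """Return (benchmarks, chosen) parsed from boot log lines.

    benchmarks maps algorithm name to MB/s; chosen is a (name, MB/s)
    tuple taken from the summary line, or None if no such line is present.
    """
    benchmarks = {}
    chosen = None
    for line in lines:
        m = CHOSEN_RE.search(line)
        if m:
            chosen = (m.group(1), int(m.group(2)))
            continue
        m = BENCH_RE.search(line)
        if m:
            # Drop stray spaces inside the name ("int 32x1" -> "int32x1").
            benchmarks[m.group(1).replace(" ", "")] = int(m.group(2))
    return benchmarks, chosen


log = [
    "Starting HAL daemon: [ok]",
    "raid6: int 32x1 738MB/s",
    "raid6: using algorithm sse2x2 (3511 MB/s)",
]
benchmarks, chosen = parse_raid6_log(log)
print(benchmarks)
print(chosen)
```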

> The setup with four HDDs produces (mostly Cyrillic) character soup
> and then crashes, offering an anaconda dump (see attached file).
>
> With three hdds the setup finishes (with the same confusing raid6  
> message lines quoted above)
> and df -a -h reports the following:
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sda2             7.9G  1.4G  6.5G  18% /
> proc                     0     0     0   -  /proc
> sysfs                    0     0     0   -  /sys
> devpts                   0     0     0   -  /dev/pts
> /dev/sda1              99M   12M   82M  13% /boot
> tmpfs                1014M     0 1014M   0% /dev/shm
> /dev/mapper/VolGroup00-LogVol00
>                       442G  199M  419G   1% /library
> none                     0     0     0   -  /proc/sys/fs/binfmt_misc
>
> This makes me think there is no raid and only sda is being used.  
> With module md not found this is only to be expected. But little do  
> I know and my expectations are pretty non-consequential in this  
> realm ;-)
> The hardware provided three 500 GB HDDs (actually four, but I had to
> unplug one to get a setup to finish).
>
> What do you think is going on?

No clue.  Was the RAID properly discovered and partitioned/formatted
under build 150?
I'm surprised that it worked in 150!
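
One way to settle whether any software RAID array is running is to look at
/proc/mdstat, which lists the md devices the kernel has assembled. The sketch
below parses that text in Python. The function name parse_mdstat is made up
for this example, and the sample text is an illustrative layout, not output
from the machine discussed here.

```python
import re

# An array line looks like "md0 : active raid6 sdd1[3] sdc1[2] ...".
ARRAY_RE = re.compile(r"^(md\S*)\s*:\s*(active|inactive)\b(.*)$")


def parse_mdstat(text):
    """Return {array_name: {"state", "level", "members"}} from mdstat text."""
    arrays = {}
    for line in text.splitlines():
        m = ARRAY_RE.match(line)
        if not m:
            continue
        name, state, rest = m.groups()
        tokens = rest.split()
        # Member devices carry a role number in brackets, e.g. "sda1[0]".
        members = [t.split("[")[0] for t in tokens if "[" in t]
        # The level (e.g. "raid6") is the first plain token; inactive arrays
        # may not report one, and "(read-only)" style flags are skipped.
        levels = [t for t in tokens if "[" not in t and not t.startswith("(")]
        arrays[name] = {
            "state": state,
            "level": levels[0] if levels else None,
            "members": members,
        }
    return arrays


# Illustrative sample only (assumed layout, not taken from the report above).
sample = """Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sdd1[3] sdc1[2] sdb1[1] sda1[0]
      976767872 blocks level 6, 64k chunk, algorithm 2 [4/4] [UUUU]

unused devices: <none>
"""
print(parse_mdstat(sample))
```

On a live system the same function could be fed the contents of /proc/mdstat.
An empty result means no md array is running, which would be consistent with
/library sitting on plain LVM as the df output suggests.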

> Anybody ever tried this using a different mobo / sata controller  
> (mine is the on-board nVidia MCP61)?

Not at this end.

wad


