Odd python proc control / buffering behaviour

Martin Langhoff martin.langhoff at gmail.com
Fri Aug 1 23:29:49 EDT 2008


Something very basic is not working well with Python: reading 1MB
from a process and writing it to a file gets truncated at random
points. I am not a native Python speaker, so review and comments are
welcome. Hopefully I'm not losing my mind just yet.

Summary:
 - The script untars an XO image under fakeroot, saving a 1MB "state"
file from fakeroot.
 - It then concatenates the "state" files from all the builds into a master one.
 - The concatenation ends up truncated (due to buffering issues?)

Some interesting aspects
 - If I comment out the untarring of the image just before we hit this
code, the bug disappears.
 - If we sleep for 1s, the bug disappears!
 - output redirection under os.system() also suffers the problem
 - os.fdatasync(), proc.wait() don't seem to help
 - the file is not corrupted, just truncated
 - the truncation is not at a newline
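
Since the sleep makes the bug disappear, one check that might narrow
this down is whether the "state" files are still changing size at the
moment we read them. This is only a minimal sketch: state_sizes() and
still_growing() are hypothetical helper names that are not part of the
script, and I am assuming the files all live under options.statedir.

```python
import fnmatch
import os
import time


def state_sizes(statedir):
    """Map each *.state path under statedir to its current size in bytes."""
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(statedir):
        for name in fnmatch.filter(filenames, '*.state'):
            path = os.path.join(dirpath, name)
            sizes[path] = os.path.getsize(path)
    return sizes


def still_growing(statedir, delay=1.0):
    """Return the paths whose size changed (or that appeared) during delay."""
    before = state_sizes(statedir)
    time.sleep(delay)
    after = state_sizes(statedir)
    return sorted(p for p in after if before.get(p) != after[p])
```

Calling still_growing(options.statedir) right after the untar step and
getting a non-empty list would suggest the inputs are incomplete when
cat runs, rather than the output being cut short.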

   Code affected:
   ... just after untarring the image -

    # (cat *.state > .tmpstate) && mv .tmpstate rsyncd.all
    (tmpfh, tmpfpath) = tempfile.mkstemp()

    # Uncomment the line below and the truncation goes away!
    # time.sleep(1)

    # Using os.system() shows truncation problems
    #os.system("find %s -type f -name '*.state' -print0 | "
    #          "xargs -0 --no-run-if-empty cat > %s"
    #          % (options.statedir, tmpfpath))

    #
    # The Python way - shows truncation problems
    #
    pfind  = Popen(['find', options.statedir, '-type', 'f',
                    '-name', '*.state', '-print0'], stdout=PIPE)
    pxargs = Popen(['xargs', '-0', '--no-run-if-empty', 'cat'],
                   stdin=pfind.stdout, stdout=tmpfh)

    # Neither wait() nor a small loop checking poll() is not None
    # makes a difference
    pxargs.wait()
    # fdatasync() does not make a difference
    os.fdatasync(tmpfh)
    os.close(tmpfh)
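
For comparison, the same concatenation can be done without any child
processes, which would take the pipes and process control out of the
picture. This is a rough sketch and not what the script currently does;
concat_state_files() is a hypothetical name, and placing the temp file
next to the destination is my assumption so that the final rename stays
on one filesystem.

```python
import fnmatch
import os
import shutil
import tempfile


def concat_state_files(statedir, destpath):
    """Concatenate every *.state file under statedir into destpath.

    Returns the number of bytes written.
    """
    # Create the temp file in the destination directory so that
    # os.rename() does not cross filesystems.
    destdir = os.path.dirname(os.path.abspath(destpath))
    tmpfd, tmppath = tempfile.mkstemp(dir=destdir)
    out = os.fdopen(tmpfd, 'wb')
    try:
        for dirpath, dirnames, filenames in os.walk(statedir):
            for name in sorted(fnmatch.filter(filenames, '*.state')):
                src = open(os.path.join(dirpath, name), 'rb')
                try:
                    shutil.copyfileobj(src, out)
                finally:
                    src.close()
        # Push Python's buffer to the OS, then the OS cache to disk
        out.flush()
        os.fsync(out.fileno())
        written = out.tell()
    finally:
        out.close()
    # Equivalent of the "mv .tmpstate rsyncd.all" step
    os.rename(tmppath, destpath)
    return written
```

If this version also comes up short of the expected size, the problem
is in the input files and not in how the pipeline output is collected.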

Changing the stdout of pxargs to PIPE and reading it explicitly with a loop like

    while True:
        buf = pxargs.stdout.read(4096)
        if not buf:
            # empty read: retry while xargs is still running
            if pxargs.poll() is None:
                continue
            break
        os.write(tmpfh, buf)

does not make any difference either.
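
A tighter form of that read loop, for anyone who wants to reproduce
this: read until EOF, handle short writes from os.write(), and only
then reap the child. This is a sketch under my own assumptions;
drain_to_fd() is a hypothetical helper name, and proc must have been
started with stdout=PIPE.

```python
import os
import subprocess
import sys


def drain_to_fd(proc, fd, bufsize=4096):
    """Copy proc.stdout to fd until EOF, then wait for the child.

    Returns (bytes_copied, exit_status).
    """
    total = 0
    while True:
        buf = proc.stdout.read(bufsize)
        if not buf:
            break  # empty read on a blocking pipe means EOF
        # os.write() may write fewer bytes than requested, so loop
        while buf:
            n = os.write(fd, buf)
            total += n
            buf = buf[n:]
    return total, proc.wait()
```

Comparing the returned byte count against the sum of the input file
sizes would show whether any data goes missing between the pipe and
the file.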

The truncation is not stable: the file should end up at 1036681 bytes,
but it gets anywhere from 300KB to 700KB. I suspect that Python is
forgetting to flush the buffers when the process finishes. This is
Python 2.5-15.fc7 on the XS image.


Any hints? Ideas?




m
-- 
 martin.langhoff at gmail.com
 martin at laptop.org -- School Server Architect
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff

