[Etoys] [Vm-dev] hampering the desire of the VM and image to visit every object at startup time (multiple times)
John M McIntosh
johnmci at smalltalkconsulting.com
Tue Apr 14 13:04:20 EDT 2009
Yes, so WikiServer ( http://www.mobilewikiserver.com ) is an image
file of 10.5 MB.
At startup only 4.5 MB of OOPS memory is faulted in, and about 700K of
memory is altered, which reduces initial memory use by 6MB and reduces
the startup time by 3+ seconds.
Given the slow speed of the iPhone, and the fact I've a 64MB limit,
6MB is a lot, and 3 seconds is welcome. Unfortunately a full GC will
fault all that 10.5 MB in, however by doing some GC tuning one
can avoid a full GC until things are quite stressed.
I note on os-x desktop machines the entire 10.5MB is read in and the
pages marked as non-referenced, but obviously the rules for the
virtual memory subsystem are different.
First let me suggest we change
writeImageFileIO
/* header size in bytes; do not change! */
headerSize = 64;
f = sqImageFileOpen(imageName, "wb");
from 64 bytes to 4096 bytes, if possible.
Now let's explore why, and what is going on.
On unix, if you decide to use mmap rather than malloc to allocate storage
for oops space, it does
#define MAP_PROT (PROT_READ | PROT_WRITE)
#define MAP_FLAGS (MAP_ANON | MAP_PRIVATE)
mmap(0, heapLimit, MAP_PROT, MAP_FLAGS, devZero, 0)
where heapLimit is usually 1GB, start location zero.
This returns a start location somewhere in memory, never zero, and
generally we have to swizzle all the object pointers. On a save and
restart of the squeak app the address you get back *may be* the same
address; if it is the same we don't need to swizzle pointers. However,
OpenBSD based systems likely will always give a different location for
security reasons.
I had then set a start location of 128MB, but found on os-x as the
number of apps goes up you don't get 128MB, so I settled on 500MB which
seems ok. 8GB pro macs running 52 applications fail at 500MB, but the
failure is just that the kernel chooses its own address, so we don't care...
Well yes that limits your squeak image to 3.5GB but it's doubtful that
a 32bit system will let you allocate a contiguous chunk of memory > 2
GB anyway.
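As a rough illustration of the address-hint approach described above, here
is a minimal C sketch. The function names allocateHeapNear and needsSwizzle
are hypothetical, not taken from the VM source, and the sketch passes -1 as
the file descriptor for the anonymous mapping instead of a /dev/zero
descriptor. Without MAP_FIXED the address is only a hint, so the kernel may
return a different one, which is the case where pointers must be swizzled.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/* Ask for `size` bytes of anonymous memory near `hint`.
   Because MAP_FIXED is not used, the kernel may pick another address.
   Returns NULL on failure. (Hypothetical helper, for illustration only.) */
void *allocateHeapNear(uintptr_t hint, size_t size) {
    assert(size > 0);
    void *p = mmap((void *)hint, size, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Pointers need swizzling only when the heap did not land at the base
   address recorded when the image was saved. (Hypothetical helper.) */
int needsSwizzle(const void *actualBase, uintptr_t savedBase) {
    return (uintptr_t)actualBase != savedBase;
}
```

A caller would request the heap at the 500MB hint, then compare the returned
address with the base stored in the image header to decide whether the
swizzle pass can be skipped.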
Now the next issue was that the original memory allocation logic would
give you the 1GB, and you would read the entire image into that memory area.
In thinking about this I thought, why can't you mmap the image file
for the size of the file rounded up to the page size, then mmap the
memory after that to anonymous memory up to the desired heap size?
So two mmaps, one for the file, followed by another for young space.
I implemented this for the os-x vm and the iPhone VM.
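The two-mapping idea can be sketched in C roughly as follows. This is not
the code from the os-x or iPhone VM; mapImageThenHeap and roundUpToPage are
hypothetical names. It is also a variant of the order described above: it
first reserves the whole heap range as anonymous memory and then overlays
the image file on the front of that range with MAP_FIXED, so the fixed
mapping can only land on memory the process already owns.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/* Round n up to the next multiple of pageSize. */
size_t roundUpToPage(size_t n, size_t pageSize) {
    assert(pageSize > 0);
    return (n + pageSize - 1) / pageSize * pageSize;
}

/* Returns the base of a heapSize-byte region whose first pages are backed
   by the image file (private, copy-on-write) and whose remainder is
   anonymous memory. Returns NULL on failure. Hypothetical helper. */
char *mapImageThenHeap(const char *imagePath, void *hint,
                       size_t heapSize, size_t *fileBytesOut) {
    struct stat st;
    int fd = open(imagePath, O_RDONLY);
    if (fd < 0) return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) { close(fd); return NULL; }

    size_t fileRounded = roundUpToPage((size_t)st.st_size,
                                       (size_t)sysconf(_SC_PAGESIZE));
    if (fileRounded > heapSize) { close(fd); return NULL; }

    /* Mapping 1: anonymous memory for the whole heap; hint is only a hint. */
    void *base = mmap(hint, heapSize, PROT_READ | PROT_WRITE,
                      MAP_ANON | MAP_PRIVATE, -1, 0);
    if (base == MAP_FAILED) { close(fd); return NULL; }

    /* Mapping 2: the image file over the front of that range. Pages are
       only read from disk when they are first touched. */
    void *img = mmap(base, fileRounded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_FIXED, fd, 0);
    close(fd);
    if (img == MAP_FAILED) { munmap(base, heapSize); return NULL; }

    if (fileBytesOut) *fileBytesOut = fileRounded;
    return (char *)base;
}
```

Young space would then begin at base + *fileBytesOut. Because the file
mapping is MAP_PRIVATE, writes to image pages stay in memory and never
reach the image file.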
In testing with a 500 MHz powerpc laptop, I found the startup time was
reduced by 30%, because it would fault in, say, a 20MB image page by page
as it did the needless flush primitive calls logic, versus reading the
20MB into memory up front. The virtual memory pager was just more
efficient at pulling in the data, either through better I/O processing
or faster logic in finding the free pages.
Problems:
It turns out there is a flaw in the OS-X BSD mmap logic: when you mmap
files on NFS drives, it hangs. Some people with, I think, overstressed
systems also reported issues with the first mmap failing.
Because of this I reverted back to the old logic by default, and put a
flag in the info.plist SqueakUseFileMappedMMAP to enable the new logic.
Obviously for Linux you have to determine whether this flaw exists and,
if so, whether it has been fixed.
Now the problem with headerSize
In the file mmap case the entire file is mapped into memory at 500MB,
but the oops space starts at 64 bytes, so memory is at 500MB+64
In the anonymous mmap case memory starts at 500MB, but the oops space
starts at 0, so memory is at 500MB.
If the headerSize was 4096 we could mmap the file at 500MB-4096, so the
oops space would start at 500MB in both cases.
Or we could alter the anonymous case: allocate at 500MB but put the
oops space at 500MB+64 (the header size).
However by using a headerSize of 4096 we can then get the oops space
to start on a page boundary, which may or may not have implications.
Anyway it would be good to resolve this bit of tricky logic.
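To make the address arithmetic concrete, here is a small C sketch. The
constant and function names are hypothetical, and the 4096-byte page size is
an assumption (real code would ask the OS). It computes where the file
mapping must start for the oops space to land at a given address, and checks
page alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values: the 500MB base discussed above and an assumed
   4096-byte page size. */
#define BASE_ADDR         ((uintptr_t)500 * 1024 * 1024)
#define ASSUMED_PAGE_SIZE ((uintptr_t)4096)

/* The file must be mapped headerSize bytes below the address where the
   oops space should begin. */
uintptr_t fileMapAddressFor(uintptr_t oopsStart, size_t headerSize) {
    assert(oopsStart >= headerSize);
    return oopsStart - headerSize;
}

int isPageAligned(uintptr_t addr) {
    return (addr % ASSUMED_PAGE_SIZE) == 0;
}
```

With a 4096-byte header both the file mapping (500MB-4096) and the oops
space (500MB) sit on page boundaries. With the current 64-byte header, a
file mapped at 500MB puts the oops space at 500MB+64, which is not page
aligned, and mmap cannot place the file at 500MB-64 because file mappings
must begin on a page boundary.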
So that I could determine how each page was viewed by the virtual memory
subsystem, I stuck the following code into
ioRelinquishProcessorForMicroseconds(), since
ioRelinquishProcessorForMicroseconds will only get triggered once the
image finishes all its startup logic and becomes *idle*.
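extern unsigned char *memory;
extern usqInt sqGetAvailableMemory();
extern size_t fileRoundedUpToPageSize;
size_t pageSize = getpagesize();
size_t vmpagesize = sqGetAvailableMemory()/pageSize + 1;
/* mincore() fills in one status byte per page */
char *what = malloc(vmpagesize);
int err = mincore(memory, sqGetAvailableMemory(), what);
int countRef=0, countMod=0, countZero=0, countOne=0, i;
/* only examine the pages backed by the image file */
/* 3 and 7 presumably combine the OS-X in-core, referenced and modified bits */
for (i=0; i<fileRoundedUpToPageSize/pageSize; i++) {
    if (what[i] == 0) countZero++;  /* not resident */
    if (what[i] == 1) countOne++;   /* resident */
    if (what[i] == 3) countRef++;   /* resident and referenced */
    if (what[i] == 7) countMod++;   /* resident, referenced and modified */
    /* break for debugging here */
}
free(what);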
On 14-Apr-09, at 1:58 AM, Bert Freudenberg wrote:
>
> On 14.04.2009, at 07:26, John M McIntosh wrote:
>
>> I created a pharo entry to track the problem the VM & image has in
>> wanting to visit every smalltalk object multiple times at startup
>> time.
>> Although this behavior is masked by Gigahertz processors, it's very
>> evident as a problem on the iPhone. Fixing it results in reducing
>> MB of RAM memory usage and saves actual *seconds* of clock time at
>> startup.
>>
>> http://code.google.com/p/pharo/issues/detail?id=737&colspec=ID%20Type%20Status%20Summary%20Milestone&start=200
>
> Very nice. We experimented in that direction for OLPC which also is
> comparatively slow CPU wise, and even slower loading the whole image
> from the flash disk (which involves decompressing). Mmapping only
> the pages needed should give a considerable boost.
>
> Do we have evidence that an mmap base address of 500 MB works across
> platforms?
>
> - Bert -
>
--
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================