[linux-mm-cc] I guess you have been following ksm.

Matthew Toseland toad at amphibian.dyndns.org
Mon Apr 20 07:48:45 EDT 2009


Most cheap SSDs have *very* slow random write; keeping it all compressed in 
RAM will likely be very much faster.

On Sunday 19 April 2009 06:12:16 John McCabe-Dansted wrote:
> On Fri, Apr 17, 2009 at 2:45 PM, Nitin Gupta <ngupta at vflare.org> wrote:
> > On Fri, Apr 17, 2009 at 4:02 AM, Peter Dolding <oiaohm at gmail.com> wrote:
> >> The copy-on-write system also appears to provide something else
> >> interesting.  ksm and compcache both act after allocation.  The
> >> interesting question is whether the Linux kernel should provide a
> >> calloc function, so that on commit it is automatically stacked.  This
> >> would massively reduce the number of blank matching pages.  The Linux
> >> system already has something that deals with malloc, allowing
> >> overcommit until pages are accessed.
> >>
> >
> > Not sure if I understand you here. You mean all new allocation should
> > be zeroed to play better with KSM?
> 
> Not sure either, but it seems similar to my suggestion that we could
> use existing techniques to zero garbage. The suggested purpose of
> these techniques was security, but this would presumably also improve
> the compression ratio of compcache.  Apparently they require only ~1%
> overhead, and we may be able to do even better than this if the goal
> is performance rather than security:
> http://www.usenix.org/events/sec05/tech/full_papers/chow/chow_html/index.html
> 
> Unfortunately they have lost the code, so we would have to reimplement
> it from scratch.
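[A rough illustration of why zeroing garbage would help the compression ratio: a zero-filled page compresses to almost nothing, while the same page full of stale data barely compresses at all. zlib here is purely a stand-in compressor, not what compcache actually uses, and the names are invented for the sketch.]

```python
# Illustrative only: zlib stands in for the swap cache's compressor.
import os
import zlib

PAGE_SIZE = 4096

def compressed_size(page: bytes) -> int:
    """Size of a page after compression."""
    return len(zlib.compress(page))

zero_page = bytes(PAGE_SIZE)        # freshly zeroed memory
stale_page = os.urandom(PAGE_SIZE)  # stands in for leftover garbage
```

On a typical run the zero page shrinks to a few dozen bytes while the garbage page stays near 4 KiB, which is the effect the ~1% zeroing overhead would buy.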
> 
> > For simplicity, code currently in SVN
> > decompresses individual objects in a page before writing out to backing
> > swap device.
> 
> One complexity is that compressed pages could get fragmented. I am not
> sure whether pages being adjacent on the swap device means they are
> related, but even if not, there would be some bookkeeping regarding
> free-space fragmentation.
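[A minimal sketch of the bookkeeping this implies, assuming first-fit placement of variable-size compressed objects into fixed-size on-disk pages; this is an invented illustration, not compcache's actual allocator.]

```python
# Hypothetical free-space tracking for packing compressed objects
# into fixed-size swap pages; all names invented for illustration.
PAGE_SIZE = 4096

class SwapLayout:
    def __init__(self, npages: int):
        self.free = [PAGE_SIZE] * npages  # free bytes per on-disk page

    def place(self, size: int):
        """Return the index of the first page with room, or None.

        None can mean plenty of free space exists in total but no
        single page has a large enough hole -- i.e. fragmentation.
        """
        for i, f in enumerate(self.free):
            if f >= size:
                self.free[i] -= size
                return i
        return None
```

With two pages and three 3000-byte objects, the third placement fails even though over 2 KiB remains free in total, which is exactly the fragmentation the bookkeeping has to manage.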
> 
> As an aside, with decent wear leveling, swap on SSD is feasible, but
> compressing pages first would seem a good idea to reduce wear of the
> SSD device. I was thinking of some algorithms to write out pages to
> SSD in an optimized way. One obvious technique would be to write
> pages out in round-robin fashion, consolidating free space as we go.
> This would theoretically lead to perfect wear leveling, although
> skipping over sectors that have little free space would seem a good
> idea. However most PCs have SSD devices that do their own wear
> leveling. I am not sure what the best strategy for these devices would
> be.
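[The round-robin scheme with skipping, sketched; block counts and names are invented for illustration, and real wear leveling on raw flash would also track erase counts.]

```python
# Hypothetical round-robin block selection for wear leveling;
# not based on any actual kernel or FTL code.

NBLOCKS = 8   # erase blocks available to the swap device
MIN_FREE = 2  # skip blocks with too little free space

def pick_block(free_slots, cursor):
    """Return (block, new_cursor): next block to write to.

    Scans from the cursor so successive writes spread evenly over all
    blocks; blocks with fewer than MIN_FREE free slots are skipped,
    rather than churned while nearly full.
    """
    for i in range(NBLOCKS):
        b = (cursor + i) % NBLOCKS
        if free_slots[b] >= MIN_FREE:
            return b, (b + 1) % NBLOCKS
    return None, cursor  # nowhere has enough room
```

With all blocks empty, successive calls visit blocks 0, 1, 2, ... and wrap around, which is the (theoretically) perfect wear leveling described above; a nearly full block is stepped over.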
> 
> However, it does seem to me that as SSD devices become more common,
> SSD-optimised swap would be useful: despite the obvious disadvantages,
> SSD does have the advantage of fast random reads, so swapping to SSD
> is less likely to kill performance than swapping to HDD. Modern SSD drives can
> survive years of properly wear levelled writes and although SSD is
> reasonably expensive per MB, a particular machine may well happen to
> have substantial free SSD space but limited memory. Perhaps I should
> find some SSD related people to ask about SSD optimized writes?
> 
> > For duplicate page removal COW has lower space/time overhead than
> > compressing all individual copies with ramzswap approach. But this
> 
> If we wanted, we could keep only a single copy of duplicated pages in
> compcache. Since we compress the pages anyway, we may be able to
> assume that the non-duplicated pages are fairly random, allowing us to
> implement a hash table with minor overhead. This may be worthwhile if
> we have many VMs of the same OS.
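[The core of single-copy storage is a hash table keyed on page contents with a reference count per unique page. A minimal sketch, with invented names; real kernel code would use a cheaper hash, handle collisions by comparing contents, and break sharing on write.]

```python
# Hypothetical content-addressed store for duplicate pages in a
# compressed cache; SHA-1 is used only to keep the sketch short.
import hashlib

class DedupStore:
    def __init__(self):
        self.pages = {}  # digest -> (data, refcount)

    def store(self, page: bytes) -> str:
        """Store a page, returning a handle; duplicates share one copy."""
        h = hashlib.sha1(page).hexdigest()
        data, refs = self.pages.get(h, (page, 0))
        self.pages[h] = (data, refs + 1)
        return h

    def load(self, handle: str) -> bytes:
        return self.pages[handle][0]
```

Storing the same page from two VMs yields the same handle and keeps one physical copy, which is where the many-identical-VMs win would come from.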
> 
> However maybe it would be better to just run compcache and KSM
> together and let each one handle its own strengths. (Hopefully this
> would also mean less work for Nitin :)
> 
> > virtual block device approach has big advantages - we need not patch
> > kernel and keeps things simple. Going forward, we can better take
> > advantage of stacked block devices to provide compressed disk based
> > swapping, generic R/W compressed caches for arbitrary block devices
> > (compressed cached /tmpfs comes to mind) etc.
> 
> I understand tmpfs will swap out unused pages to disk, so we already
> have a compressed cached tmpfs of sorts. I can see a number of
> advantages to an explicitly cc'd tmpfs, e.g. the option of larger
> blocks with a better compression ratio though, and smarter decisions
> as to when to compress pages. However the current set up has the
> advantage that it is very simple, as compcache doesn't have to worry
> about any of this, and presumably tmpfs is optimized to be very fast
> when it does not need to be swapped out. I am not sure if a block
> device can hold onto pages the kernel hands to it without needing to
> memcpy (mostly because I know very little about the Linux kernel
> internals).