[linux-mm-cc] Compression structure implementation

Nitin Gupta nitingupta.mail at gmail.com
Fri Jul 7 09:10:19 EDT 2006


Hi All,

I've just finished the implementation of the compression structure (as
described on the Wiki). Below is a short description, a demo, notes and a TODO
list. (The code is at CompressedCaching/Code.)

-- The storage begins as a single page. As you add pages to it, it expands until
it reaches its maximum limit. As you take pages out, it shrinks, freeing backing
pages once they have no used chunks left.
-- Adjacent free chunks are merged together.
-- Each page can be compressed using a different algorithm (a rough sketch of
the data structures this implies follows).
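
Roughly, the structures behind this look like the following. This is only a
sketch -- the field and type names are mine, and the actual definitions in
ccache.h may differ:

#include <linux/list.h>

/* Rough sketch only -- actual definitions in ccache.h may differ. */
struct chunk {
	unsigned short size;          /* bytes covered by this chunk         */
	unsigned short offset;        /* offset within the backing page      */
	int id;                       /* compressed page owning this chunk,
	                               * or -1 if the chunk is free          */
	struct list_head page_chunks; /* chunks within the same backing page */
	struct list_head master;      /* chunks of the same compressed page  */
};

struct chunk_head {                   /* one per stored compressed page      */
	int id;                       /* handle written to 'readpage'        */
	int algo;                     /* WKdm, WK4x4 or LZO                  */
	unsigned int comp_size;       /* total compressed size, in bytes     */
	struct list_head chunks;      /* list of chunks holding the data     */
};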


Again, the interface is via proc (registration sketched below):
/proc/storage-test/{readpage, writepage, show_structure}

- writepage: write a page to this file to have it compressed and stored in ccache.
- readpage: write the 'id' (described later) of the page you want taken back out.
- show_structure: read this file to get a snapshot of the current ccache storage.
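
For orientation, here is a minimal sketch of how these three proc files can be
wired up with the 2.6-era procfs API. This is not the actual storage-test.c --
error handling, cleanup on module exit and the real compress/decompress calls
are omitted, and the module in the attached tarball may organise it differently:

#include <linux/module.h>
#include <linux/proc_fs.h>

static struct proc_dir_entry *st_dir;

/* /proc/storage-test/writepage: compress the written page and store it */
static int st_writepage(struct file *file, const char __user *buf,
			unsigned long count, void *data)
{
	/* copy_from_user() the page, pick an algorithm, compress, add chunks */
	return count;
}

/* /proc/storage-test/readpage: take the page with the given 'id' back out */
static int st_readpage(struct file *file, const char __user *buf,
		       unsigned long count, void *data)
{
	/* parse the id, decompress its chunks, free them, merge neighbours */
	return count;
}

/* /proc/storage-test/show_structure: dump one line per chunk */
static int st_show_structure(char *page, char **start, off_t off,
			     int count, int *eof, void *data)
{
	/* walk the pages and chunks, snprintf() one line per chunk into 'page' */
	*eof = 1;
	return 0;
}

static int __init st_init(void)
{
	struct proc_dir_entry *e;

	st_dir = proc_mkdir("storage-test", NULL);

	e = create_proc_entry("writepage", 0200, st_dir);
	e->write_proc = st_writepage;

	e = create_proc_entry("readpage", 0200, st_dir);
	e->write_proc = st_readpage;

	e = create_proc_entry("show_structure", 0444, st_dir);
	e->read_proc = st_show_structure;

	return 0;
}
module_init(st_init);
MODULE_LICENSE("GPL");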

Here's a short demo:

[storage-test]# insmod storage-test.ko
[storage-test]# cat /proc/storage-test/show_structure

Page: 1
        Chunk: 1, Size: 4096, Free

* The storage begins as a single page (one chunk of size PAGE_SIZE).
* Now add a page to it:

[storage-test]# dd if=/lib/libpthread.so.0 of=/proc/storage-test/writepage bs=4096 count=1

[storage-test]# cat /proc/storage-test/show_structure

Page: 1
        Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
        Chunk: 2, Size: 3076, Free

* The initial chunk is 'broken' to store the compressed page. The page is given
'id'=0 and was compressed using WKdm (see the sketch below).
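
A sketch of what 'breaking' a chunk amounts to, building on the struct chunk
sketched earlier (alloc_chunk() is a hypothetical helper; the real allocation
path in ccache.c may differ). The front of a free chunk becomes the busy chunk
for the new compressed page and the remainder stays free -- which is how the
4096-byte chunk above became 1020 bytes busy + 3076 bytes free:

/* Carve 'len' bytes out of free chunk 'c' for compressed page 'id'. */
static void chunk_carve(struct chunk *c, unsigned short len, int id)
{
	if (len < c->size) {
		/* the tail remains a free chunk in the same backing page */
		struct chunk *tail = alloc_chunk();

		tail->offset = c->offset + len;
		tail->size   = c->size - len;
		tail->id     = -1;				/* free */
		list_add(&tail->page_chunks, &c->page_chunks);
	}
	c->size = len;
	c->id   = id;		/* the front is now owned by page 'id' */
}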

* Now, fill it with more data:

[storage-test]# a=1
[storage-test]# while [ "$a" -le 15 ]; do dd if=/lib/libpthread.so.0 of=/proc/storage-test/writepage bs=4096 count=1; let "a+=1"; done

[storage-test]# cat /proc/storage-test/show_structure

Page: 1
        Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
        Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
        Chunk: 3, Size: 1566, Busy, [id: 2: LZO]
        Chunk: 4, Size: 538, Busy, [id: 3: WKdm]

Page: 2
        Chunk: 5, Size: 482, Busy, [id: 3: WKdm]
        Chunk: 6, Size: 972, Busy, [id: 4: WK4x4]
        Chunk: 7, Size: 1566, Busy, [id: 5: LZO]
        Chunk: 8, Size: 1020, Busy, [id: 6: WKdm]
        Chunk: 9, Size: 56, Busy, [id: 7: WK4x4]

Page: 3
        Chunk: 10, Size: 916, Busy, [id: 7: WK4x4]
        Chunk: 11, Size: 1566, Busy, [id: 8: LZO]
        Chunk: 12, Size: 1020, Busy, [id: 9: WKdm]
        Chunk: 13, Size: 594, Busy, [id: 10: WK4x4]

Page: 4
        Chunk: 14, Size: 378, Busy, [id: 10: WK4x4]
        Chunk: 15, Size: 3718, Free

* The ccache expands to hold more compressed pages.
* Algorithms are selected cyclically for now -- the code to guess which
algorithm should be used to compress a page goes in guess_algo() [ccache.c]
(sketched below).
* A compressed page can take more than one chunk. Chunks with the same 'id'
belong to the same compressed page.
* The ccache now contains 11 compressed pages. It didn't store all 15
additional pages because the maximum ccache size is set to 4 (uncompressed)
pages. You can change the maximum by editing the MAX_CC_SIZE #define in
ccache.h -- in the real implementation this will be a sysctl.
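
A sketch of the current placeholder behaviour of guess_algo() (the real
signature and constants in ccache.c may differ): it simply cycles through the
available algorithms; the interesting heuristic is still to be written.

#include <linux/mm.h>

enum cc_algo { CC_WKDM, CC_WK4X4, CC_LZO, CC_NR_ALGOS };

/* Placeholder: cycle through the algorithms. The eventual version should
 * look at the page contents and guess which algorithm will do best. */
static enum cc_algo guess_algo(struct page *page)
{
	static unsigned int next;

	return next++ % CC_NR_ALGOS;
}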

* Now, take a page back out ('id'=3):

[storage-test]# echo "3" > /proc/storage-test/readpage
[storage-test]# cat /proc/storage-test/show_structure

Page: 1
        Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
        Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
        Chunk: 3, Size: 1566, Busy, [id: 2: LZO]
        Chunk: 4, Size: 538, Free	<---------

Page: 2
        Chunk: 5, Size: 482, Free	<---------
        Chunk: 6, Size: 972, Busy, [id: 4: WK4x4]
        Chunk: 7, Size: 1566, Busy, [id: 5: LZO]
        Chunk: 8, Size: 1020, Busy, [id: 6: WKdm]
        Chunk: 9, Size: 56, Busy, [id: 7: WK4x4]

Page: 3
        Chunk: 10, Size: 916, Busy, [id: 7: WK4x4]
        Chunk: 11, Size: 1566, Busy, [id: 8: LZO]
        Chunk: 12, Size: 1020, Busy, [id: 9: WKdm]
        Chunk: 13, Size: 594, Busy, [id: 10: WK4x4]

Page: 4
        Chunk: 14, Size: 378, Busy, [id: 10: WK4x4]
        Chunk: 15, Size: 3718, Free

* These free chunks are not merged since they are in different pages. Chunks
never cross a page boundary (see the merge sketch below).
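
A sketch of that merge rule, again using the struct chunk from earlier
(chunk_is_free(), chunk_page() and free_chunk() are hypothetical helpers; the
real code in ccache.c may differ): a freed chunk absorbs a neighbour only when
that neighbour is also free and shares the same backing page.

/* Try to merge free chunk 'c' with its neighbour 'next' in the chunk list. */
static void chunk_try_merge(struct chunk *c, struct chunk *next)
{
	if (!next || !chunk_is_free(next))
		return;				/* neighbour busy or absent */
	if (chunk_page(c) != chunk_page(next))
		return;				/* never cross a page boundary */

	c->size += next->size;			/* absorb the neighbour */
	list_del(&next->page_chunks);
	free_chunk(next);
}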

* Now, take out another page:
[storage-test]# echo "2" > /proc/storage-test/readpage
[storage-test]# cat /proc/storage-test/show_structure

Page: 1
        Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
        Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
        Chunk: 3, Size: 2104, Free	<---------

Page: 2
        Chunk: 4, Size: 482, Free
        Chunk: 6, Size: 972, Busy, [id: 4: WK4x4]
        Chunk: 7, Size: 1566, Busy, [id: 5: LZO]
        Chunk: 8, Size: 1020, Busy, [id: 6: WKdm]
        Chunk: 9, Size: 56, Busy, [id: 7: WK4x4]

Page: 3
        Chunk: 10, Size: 916, Busy, [id: 7: WK4x4]
        Chunk: 11, Size: 1566, Busy, [id: 8: LZO]
        Chunk: 12, Size: 1020, Busy, [id: 9: WKdm]
        Chunk: 13, Size: 594, Busy, [id: 10: WK4x4]

Page: 4
        Chunk: 14, Size: 378, Busy, [id: 10: WK4x4]
        Chunk: 15, Size: 3718, Free

* The adjacent free chunks in page 1 are merged into a single free chunk of
2104 bytes.
* If you now freed the page with 'id'=0, chunks 1 and 3 still could not be
merged -- busy chunk 2 lies in between.

* Now free another page:

[storage-test]# echo "10" > /proc/storage-test/readpage
[storage-test]# cat /proc/storage-test/show_structure

Page: 1
        Chunk: 1, Size: 1020, Busy, [id: 0: WKdm]
        Chunk: 2, Size: 972, Busy, [id: 1: WK4x4]
        Chunk: 3, Size: 2104, Free

Page: 2
        Chunk: 4, Size: 482, Free
        Chunk: 5, Size: 972, Busy, [id: 4: WK4x4]
        Chunk: 6, Size: 1566, Busy, [id: 5: LZO]
        Chunk: 7, Size: 1020, Busy, [id: 6: WKdm]
        Chunk: 8, Size: 56, Busy, [id: 7: WK4x4]

Page: 3
        Chunk: 9, Size: 916, Busy, [id: 7: WK4x4]
        Chunk: 10, Size: 1566, Busy, [id: 8: LZO]
        Chunk: 11, Size: 1020, Busy, [id: 9: WKdm]
        Chunk: 12, Size: 594, Free

* The page with 'id'=10 was read out, so chunks 13 and 14 were freed. Chunks 14
and 15 were then adjacent and in the same page, so they were merged. The merged
chunk spanned the entire page, so that page was freed (along with the related
data structures) -- sketched below.
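
A sketch of that final step (struct cc_page, first_chunk() and free_chunk()
are hypothetical names; __free_page() and kfree() are the usual kernel calls):
once merging leaves a backing page with a single free chunk of PAGE_SIZE, the
page is handed back to the kernel.

/* Release a backing page whose only chunk is free and spans PAGE_SIZE. */
static void maybe_release_page(struct cc_page *p)
{
	struct chunk *c = first_chunk(p);

	if (chunk_is_free(c) && c->size == PAGE_SIZE) {
		list_del(&p->list);		/* unlink from the ccache */
		free_chunk(c);
		__free_page(p->page);		/* return memory to the kernel */
		kfree(p);
	}
}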


* NOTES

-- Incompressible pages (compressed size >= PAGE_SIZE) are never stored.
-- No locking has been done yet -- it will be added when integrating with the
main compressed caching code.
-- Shrinking is not done -- pages are freed as chunks are taken out, but
'voluntary' shrinking has problems. It requires (a mix of) two things:
	1. To free a page, move all the chunks it holds to other pages, then free it.
	2. Move pages to the swap disk (as compressed data, or decompress and then swap?)
I've worked these out on paper but not in code. For (2), different things have
to be done for anonymous, clean page-cache and dirty page-cache pages. It's a
bit of a mess but seems feasible :)
-- struct chunk's 'master chunk list' should be singly linked. Again, I have
this on paper but the code uses a doubly linked list. This would save 4 bytes
(on 32-bit systems) per chunk! (and there can be many chunks per compressed
page). A sketch of the idea follows.
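
Sketch of that saving (field names are mine, not the ones in ccache.h): the
master list only ever needs forward traversal, so a single forward pointer is
enough.

struct chunk {
	unsigned short size;
	unsigned short offset;
	int id;
	struct list_head page_chunks;	/* per-page list: stays doubly linked */
	struct chunk *master_next;	/* singly linked master list: one
					 * pointer (4 bytes on 32-bit) instead
					 * of a struct list_head (8 bytes)    */
};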


* TODO

-- Where are the stats?? Export stats, in particular:
	1. total space taken by metadata
	2. average number of chunks per chunk_head
	3. time to compress, decompress, and read/write a compressed page from/to ccache.
-- Implement shrinking (part 1 above).
-- Test it... crash it... fix it... crash... fix... :)


Cheers,
Nitin Gupta
-------------- next part --------------
A non-text attachment was scrubbed...
Name: storage-test.tar.gz
Type: application/x-gzip
Size: 50712 bytes
Desc: not available
Url : http://mailman.laptop.org/pipermail/linux-mm-cc/attachments/20060707/bc37b6ba/storage-test.tar-0001.gz

