[OLPC-devel] [nitingupta.mail at gmail.com: CCache work status]

Marcelo Tosatti marcelo at kvack.org
Mon Jun 5 10:41:35 EDT 2006


FYI

----- Forwarded message from Nitin Gupta <nitingupta.mail at gmail.com> -----

From: Nitin Gupta <nitingupta.mail at gmail.com>
Date: Mon, 05 Jun 2006 00:05:10 +0530
To: linuxcompressed-devel at lists.sourceforge.net
CC: mtosatti at redhat.com, Rik van Riel <riel at redhat.com>,
   Marcelo Tosatti <marcelo at kvack.org>
Subject: CCache work status

Hi,

This is a kind of weekly summary of the ccache work... :)

I originally started this project with a focus on "general" desktop systems
(OpenOffice + KDE/Gnome on 128MB RAM?). But now I'll instead focus on a
much more exciting case -- the OLPC systems. This should also make it
applicable to a wide range of Linux-based embedded devices. This does not
drop the original goal; it just changes some priorities
- most things are the same for both.

The most important things I learnt about OLPC systems w.r.t. ccaching are:

-- They DO NOT want on-disk compressed swapping (problems: slow writes to
flash, wear leveling).
-- For in-memory compression, compressing anonymous memory pages has much
higher priority than compressing filesystem-backed (page-cache) pages.

I originally started out to handle page-cache pages. Since I'm now focusing
on OLPC systems, I'll change gears to anon pages now. This has slowed
development for the time being, as I still have only a half-baked roadmap
for anon pages, but the design work has now progressed nicely and I have
started implementing it :)
--------------------------------------------------------------------

The approach I've taken is this:

* General
It creates a virtual swap device (another swap_info_struct in swap_info[]).
Its size is set through a /proc entry created by the 'cc_control' module
(attached). This swap device is given the highest priority and is identified
in other code paths by the SWP_COMPRESSED flag. So, when a page is added to
the swap cache, it is given the 'type' no. of this virtual swap device until
the device reaches its max capacity (as set through the /proc interface).
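
To illustrate the priority rule above, here is a minimal userspace sketch
(not kernel code, and not part of the attached patch). All model_* names and
the MODEL_COMPRESSED flag are made up for this example; the real allocator
walks swap_list under swap_lock and is more involved. The sketch only shows
the intended effect: pages get the 'type' of the highest-priority area that
still has room, and fall through to a lower-priority area once it is full.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; these names are not kernel API. */
#define MODEL_COMPRESSED 0x1	/* stands in for SWP_COMPRESSED */

struct model_swap_area {
	int prio;		/* higher value is tried first */
	unsigned int flags;	/* e.g. MODEL_COMPRESSED for the virtual device */
	unsigned long max;	/* capacity in pages */
	unsigned long inuse;	/* pages currently allocated */
};

/*
 * Return the index ('type') of the highest-priority area that still has a
 * free slot and account one page to it, or -1 if every area is full.
 */
int model_get_swap_type(struct model_swap_area *areas, size_t n)
{
	int best = -1;
	size_t i;

	assert(areas != NULL);
	for (i = 0; i < n; i++) {
		if (areas[i].inuse >= areas[i].max)
			continue;	/* full: leave it to lower priorities */
		if (best < 0 || areas[i].prio > areas[best].prio)
			best = (int)i;
	}
	if (best >= 0)
		areas[best].inuse++;
	return best;
}
```

With a 2-page virtual area at priority 100 and a disk area at priority 1,
the first two calls would return the virtual area's index and the third
would return the disk area's index.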

* Storing page to ccache
When swap-out to disk would occur in pageout(), the same things happen as
for page-cache pages -- it doesn't need to know whether it's an anon or
fs-backed page. That is, the corresponding swapper radix tree node is
replaced with a 'chunk_head', which then points to the location of the
'chunks' in the ccache storage structure.

* Restoring an anon page from ccache
When a page fault occurs, handle_pte_fault() first checks whether the swap
cache already has the required page, which calls find_get_page() on the
swapper space radix tree. Here, a compressed page is handled exactly as a
page-cache page is, i.e. if the looked-up page has PG_compressed set, it is
decompressed and the radix node is made to point back to the uncompressed
page.
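
The store and restore paths described above can be sketched in userspace as
follows. This is only a toy model under loud assumptions: every model_* name
is hypothetical, a single slot stands in for a radix tree node, a flag
stands in for PG_compressed, pages are 64 bytes, and the "compressor" merely
drops trailing zero bytes. The actual ccache structures and compression
algorithms are not shown in this mail.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE  64	/* tiny "page" keeps the example readable */
#define MODEL_CHUNK_SIZE 16

struct model_chunk {			/* one piece of compressed data */
	struct model_chunk *next;
	size_t len;
	unsigned char data[MODEL_CHUNK_SIZE];
};

struct model_chunk_head {		/* what the slot points to when compressed */
	struct model_chunk *chunks;
};

struct model_slot {			/* stands in for a radix tree node */
	int compressed;			/* stands in for PG_compressed */
	void *ptr;			/* malloc'd page, or a model_chunk_head */
};

/* Free a chunk_head together with all of its chunks. */
void model_free_head(struct model_chunk_head *head)
{
	while (head->chunks) {
		struct model_chunk *c = head->chunks;

		head->chunks = c->next;
		free(c);
	}
	free(head);
}

/* Store path: "compress" the page, then point the slot at a chunk_head. */
int model_store(struct model_slot *slot)
{
	const unsigned char *page = slot->ptr;
	struct model_chunk_head *head;
	struct model_chunk **tail;
	size_t len = MODEL_PAGE_SIZE, off;

	assert(!slot->compressed);
	while (len > 0 && page[len - 1] == 0)
		len--;			/* toy compressor: drop trailing zeros */
	head = calloc(1, sizeof(*head));
	if (!head)
		return -1;
	tail = &head->chunks;
	for (off = 0; off < len; off += MODEL_CHUNK_SIZE) {
		struct model_chunk *c = calloc(1, sizeof(*c));

		if (!c) {
			model_free_head(head);
			return -1;	/* slot still points at the page */
		}
		c->len = len - off < MODEL_CHUNK_SIZE ? len - off : MODEL_CHUNK_SIZE;
		memcpy(c->data, page + off, c->len);
		*tail = c;		/* append, so chunk order is preserved */
		tail = &c->next;
	}
	free(slot->ptr);		/* the uncompressed page is released */
	slot->ptr = head;
	slot->compressed = 1;
	return 0;
}

/* Lookup path: decompress on demand and point the slot back at a page. */
unsigned char *model_lookup(struct model_slot *slot)
{
	struct model_chunk_head *head = slot->ptr;
	struct model_chunk *c;
	unsigned char *page;
	size_t off = 0;

	if (!slot->compressed)
		return slot->ptr;
	page = calloc(1, MODEL_PAGE_SIZE);	/* zeroed, so the tail is restored */
	if (!page)
		return NULL;
	for (c = head->chunks; c; c = c->next) {
		memcpy(page + off, c->data, c->len);
		off += c->len;
	}
	model_free_head(head);
	slot->ptr = page;
	slot->compressed = 0;
	return page;
}
```

The point of the sketch is that the caller of model_lookup() never needs to
know whether the slot held a compressed page; the same would hold for anon
and fs-backed pages behind find_get_page().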

* Notes
-- This approach deals easily with the case where no real swap devices
exist. This virtual swap is a separate entity with its own data structures,
like swap_map[], just as for a normal swap device.
-- This only requires working with the arch-independent swp_entry_t instead
of the arch-dependent representation of it in PTEs.
-- More code sharing for handling page-cache and swap-cache pages - much of
the code, like add_to_swap_cache(), find_get_page() (and friends), wouldn't
know if it's working with an anon or fs-backed page.
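
For readers unfamiliar with swp_entry_t, the sketch below shows the general
idea of an arch-independent entry that packs a swap 'type' and a page offset
into one word. It is loosely modelled on the kernel's swp_entry(),
swp_type() and swp_offset() helpers, but it is a userspace illustration: the
model_* names are hypothetical and the 5-bit type field is an assumption
made for the example.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace model of an arch-independent swap entry. */
#define MODEL_TYPE_BITS   5	/* assumed width of the 'type' field */
#define MODEL_TYPE_SHIFT  (sizeof(unsigned long) * CHAR_BIT - MODEL_TYPE_BITS)
#define MODEL_OFFSET_MASK ((1UL << MODEL_TYPE_SHIFT) - 1)

typedef struct {
	unsigned long val;
} model_swp_entry_t;

/* Pack a swap area index and a page offset into one entry. */
model_swp_entry_t model_swp_entry(unsigned long type, unsigned long offset)
{
	model_swp_entry_t e;

	assert(type < (1UL << MODEL_TYPE_BITS));
	e.val = (type << MODEL_TYPE_SHIFT) | (offset & MODEL_OFFSET_MASK);
	return e;
}

/* Extract the swap area index from the top bits. */
unsigned long model_swp_type(model_swp_entry_t e)
{
	return e.val >> MODEL_TYPE_SHIFT;
}

/* Extract the page offset within that swap area from the low bits. */
unsigned long model_swp_offset(model_swp_entry_t e)
{
	return e.val & MODEL_OFFSET_MASK;
}
```

Code that only looks at the type and offset this way can treat the virtual
swap device like any other swap area, without touching the PTE encoding.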

* About files attached
-- cc_control module: This creates two proc entries:
/proc/cc_control/{max_anon_cc_size, max_fs_backed_cc_size}. Write the size
of the ccache for anon pages, in units of pages, to max_anon_cc_size.
Writing this value calls set_anon_cc_size() in mm/swapfile.c, which creates
the virtual swap device with the size passed and the highest priority
(hard-coded to 100). (max_fs_backed_cc_size is a dummy for now.)
-- patch for 2.6.17-rc5 that adds set_anon_cc_size() along with some small
misc handling.
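
As a companion to the description of set_anon_cc_size(), here is a small
userspace sketch of how an area can be linked into a list kept in
descending priority order, using array indices as links in the way
swap_info[].next does. The model_* names are hypothetical and locking is
left out entirely; it is meant only to show why a priority of 100 puts the
virtual device at the head of the list.

```c
#include <assert.h>

#define MODEL_MAX_AREAS 8	/* hypothetical table size for the example */

struct model_area {
	int prio;	/* higher value is used first */
	int next;	/* index of the next area in priority order, -1 at end */
};

struct model_list {
	int head;	/* index of the highest-priority area, -1 if empty */
};

/* Link areas[idx] into the list, keeping it sorted by descending priority. */
void model_insert_by_prio(struct model_area *areas, struct model_list *list,
			  int idx)
{
	int i, prev = -1;

	assert(idx >= 0 && idx < MODEL_MAX_AREAS);
	for (i = list->head; i >= 0; i = areas[i].next) {
		if (areas[idx].prio >= areas[i].prio)
			break;		/* insert in front of area i */
		prev = i;
	}
	areas[idx].next = i;
	if (prev < 0)
		list->head = idx;	/* new highest-priority area */
	else
		areas[prev].next = idx;
}
```

Inserting areas with priorities 5, 1 and 100 in that order would leave the
priority-100 area at the head, followed by 5 and then 1.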

TODO Next: merge work from the previous patches (as on
CompressedCaching/Code) with this one to get on with the anon page ccache.

(I'll try to update the CompressedCaching page soon for anon page handling)


PS: Marcelo, I'm unable to send mail to marcelo AT kvack DOT org (I just get
a mail delivery failure notice), so I'm sending to mtosatti AT redhat DOT
com instead. Anyway, I've added your kvack address too, to see if it gets
through this time... :)



Cheers,
Nitin Gupta


diff -urN orig/include/linux/cc.h cc-devel/include/linux/cc.h
--- orig/include/linux/cc.h	1970-01-01 05:30:00.000000000 +0530
+++ cc-devel/include/linux/cc.h	2006-06-03 18:26:28.000000000 +0530
@@ -0,0 +1,6 @@
+#ifndef _CCACHE_H
+#define _CCACHE_H
+
+int set_anon_cc_size(unsigned long size);
+
+#endif
diff -urN orig/include/linux/swap.h cc-devel/include/linux/swap.h
--- orig/include/linux/swap.h	2006-06-03 18:24:31.000000000 +0530
+++ cc-devel/include/linux/swap.h	2006-06-03 18:25:40.000000000 +0530
@@ -109,6 +109,7 @@
 	SWP_ACTIVE	= (SWP_USED | SWP_WRITEOK),
 					/* add others here before... */
 	SWP_SCANNING	= (1 << 8),	/* refcount in scan_swap_map */
+	SWP_COMPRESSED	= (1 << 10),	/* it's compressed cache for anon pages */
 };
 
 #define SWAP_CLUSTER_MAX 32
diff -urN orig/mm/swapfile.c cc-devel/mm/swapfile.c
--- orig/mm/swapfile.c	2006-06-03 18:24:14.000000000 +0530
+++ cc-devel/mm/swapfile.c	2006-06-03 18:25:58.000000000 +0530
@@ -630,7 +630,7 @@
  */
 static unsigned int find_next_to_unuse(struct swap_info_struct *si,
 					unsigned int prev)
-{
+{	
 	unsigned int max = si->max;
 	unsigned int i = prev;
 	int count;
@@ -1160,6 +1160,8 @@
 	char * pathname;
 	int i, type, prev;
 	int err;
+
+	printk(KERN_INFO "sys_swapoff called.\n");
 	
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
@@ -1180,7 +1182,8 @@
 	spin_lock(&swap_lock);
 	for (type = swap_list.head; type >= 0; type = swap_info[type].next) {
 		p = swap_info + type;
-		if ((p->flags & SWP_ACTIVE) == SWP_ACTIVE) {
+		if (((p->flags & SWP_ACTIVE) == SWP_ACTIVE)
+			&& !(p->flags & SWP_COMPRESSED)) {
 			if (p->swap_file->f_mapping == mapping)
 				break;
 		}
@@ -1371,6 +1374,96 @@
 __initcall(procswaps_init);
 #endif /* CONFIG_PROC_FS */
 
+
+int set_anon_cc_size(unsigned long num_pages)
+{
+	int i, error, prev;
+	unsigned int type;
+	unsigned long maxpages;
+	struct swap_info_struct *p;
+	struct swap_extent *new_se;
+
+	printk(KERN_INFO "set_anon_cc_size called\n");
+
+	error=0;
+	spin_lock(&swap_lock);
+	p = swap_info;
+	for (type = 0 ; type < nr_swapfiles ; type++,p++)
+		if (!(p->flags & SWP_USED)) break;
+
+	maxpages = num_pages;
+
+	INIT_LIST_HEAD(&p->extent_list);
+	p->flags = SWP_USED | SWP_COMPRESSED;
+	p->swap_file = NULL;
+	p->bdev = 0;
+	p->old_block_size = 0;
+	p->lowest_bit = 1;
+	p->highest_bit = maxpages - 1;
+	p->cluster_nr = 0;
+	p->inuse_pages = 0;
+	p->max = maxpages;
+	p->pages = maxpages;
+	p->cluster_next = 1;
+	p->prio = 100;
+	p->next = -1;
+	spin_unlock(&swap_lock);
+
+
+	/* initialize swap map */
+	if (!(p->swap_map = vmalloc(maxpages * sizeof(short)))) {
+		error = -ENOMEM;
+		goto out;
+	}
+	memset(p->swap_map, 0, maxpages * sizeof(short));
+	p->swap_map[0] = SWAP_MAP_BAD;
+
+	/* initialize swap extents */
+	new_se = kmalloc(sizeof(struct swap_extent), GFP_KERNEL);
+	if (new_se == NULL) {
+		error = -ENOMEM;
+		goto out;
+	}
+	new_se->start_page = 0;
+	new_se->nr_pages = maxpages;
+	new_se->start_block = 0;
+	list_add_tail(&new_se->list, &p->extent_list);
+	
+	p->curr_swap_extent = new_se;
+
+
+	mutex_lock(&swapon_mutex);
+	spin_lock(&swap_lock);
+	p->flags = SWP_ACTIVE | SWP_COMPRESSED;
+	nr_swap_pages += maxpages;
+	total_swap_pages += maxpages;
+
+	/* insert swap space into swap_list */
+	prev = -1;
+	for (i = swap_list.head; i >= 0; i = swap_info[i].next) {
+		if (p->prio >= swap_info[i].prio) {
+			break;
+		}
+		prev = i;
+	}
+	p->next = i;
+	if (prev < 0) {
+		swap_list.head = swap_list.next = p - swap_info;
+	} else {
+		swap_info[prev].next = p - swap_info;
+	}
+	spin_unlock(&swap_lock);
+	mutex_unlock(&swapon_mutex);
+
+	return error;	/* can only be 0 here now... */
+
+out:
+	printk(KERN_ERR "Error initializing anon compressed swap.\n");
+	return error;
+}
+
+EXPORT_SYMBOL(set_anon_cc_size);
+
 /*
  * Written 01/25/92 by Simmule Turner, heavily changed by Linus.
  *


----- End forwarded message -----


