Announce: Compressed cache alpha-007 (kernel 2.6.18)
Peter Zijlstra
a.p.zijlstra at chello.nl
Wed Sep 20 11:26:25 EDT 2006
On Wed, 2006-09-20 at 17:12 +0200, Nick Piggin wrote:
> On Wed, Sep 20, 2006 at 04:39:45PM +0200, Peter Zijlstra wrote:
> > On Wed, 2006-09-20 at 19:34 +0530, Nitin Gupta wrote:
> > > Peter Zijlstra wrote:
> > > > On Wed, 2006-09-20 at 18:23 +0530, Nitin Gupta wrote:
> > > >
> > > >>> Don't like the bit_spin_trylock/bit_spin_unlock in handle_ccache_fault;
> > > >>> what's wrong with TestSetPageLocked() and unlock_page() ?
> > > >>>
> > > >> If (PageCompressed(page)) is true then 'page' is 'struct chunk_head'
> > > >> not 'struct page' and you cannot use unlock_page() on chunk_head.
> > > >
> > > > ClearPageLocked()
> > > >
> > >
> > > But then how will you do equivalent of wait_on_chunk_head() ?
> >
> > Doh, now I see, it's not &page->flags but &ch->flags.
> >
> > Humm, you'd need to open-code this, I think; bit_spin_trylock() may not
> > be ordered strongly enough (on !x86). Got my head in a twist, but I think
> > this is correct:
> >
> > static inline
> > int chunk_head_trylock(struct chunk_head *ch)
> > {
> > 	int locked = test_and_set_bit(PG_locked, &ch->flags);
> > 	if (!locked)
> > 		smp_wmb();
> > 	return !locked;
> > }
> >
> > static inline
> > void chunk_head_unlock(struct chunk_head *ch)
> > {
> > 	int locked;
> > 	smp_wmb();
> > 	locked = test_and_clear_bit(PG_locked, &ch->flags);
> > 	BUG_ON(!locked);
> > }
>
> A plain clear_bit(PG_locked, ...) here will leak, but all atomic and
> bitop RMW ops are defined to be strongly ordered before and after (by
> Documentation/atomic_ops.txt), so I think we're OK?
Ah, yes, I read it like so:
  They must execute atomically, yet there are no implicit memory barrier
  semantics required of these interfaces.

  int test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
  int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
  int test_and_change_bit(unsigned long nr, volatile unsigned long *addr);
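(Purely as an illustration of that reading, not part of the patch: if the
test_and_* ops carried no implicit barriers, the quoted helpers would need
full barriers, roughly like the sketch below. struct chunk_head is reduced
to its flags word, and the helper names just mirror the quoted mail.)

#include <linux/bitops.h>	/* test_and_set_bit(), test_and_clear_bit() */
#include <linux/page-flags.h>	/* PG_locked */
#include <linux/kernel.h>	/* BUG_ON() */
#include <asm/system.h>		/* smp_mb() */

/* Sketch only: struct chunk_head comes from the ccache patch, reduced
 * here to the flags word that is used as a bit lock. */
struct chunk_head {
	unsigned long flags;
	/* ... */
};

static inline int chunk_head_trylock(struct chunk_head *ch)
{
	if (test_and_set_bit(PG_locked, &ch->flags))
		return 0;	/* already locked */
	smp_mb();		/* keep the critical section after the bit is set */
	return 1;
}

static inline void chunk_head_unlock(struct chunk_head *ch)
{
	int locked;

	smp_mb();		/* order the critical section before the release */
	locked = test_and_clear_bit(PG_locked, &ch->flags);
	BUG_ON(!locked);
}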
> In your above examples, I think you probably want full smp_mb() there
> because you do want to prevent loads inside the critical section from
> being completed before the bit is visible or after the bit is cleared.
bit_spin_unlock() does smp_mb__before_clear_bit(), which would also be
enough, since on the wacky platforms that equals smp_mb().
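
(For completeness, the pattern in question is roughly the sketch below.
struct chunk_head, wait_on_chunk_head() and the decompress step are stubbed
from Nitin's patch; the function name ccache_fault_sketch is just
illustrative. The ordering comes from bit_spin_unlock() itself.)

#include <linux/bit_spinlock.h>	/* bit_spin_trylock(), bit_spin_unlock() */
#include <linux/page-flags.h>	/* PG_locked */

/* Sketch only: the real definitions live in the ccache patch. */
struct chunk_head {
	unsigned long flags;
	/* ... compressed chunk list, etc. ... */
};

static inline void wait_on_chunk_head(struct chunk_head *ch) { /* patch */ }
static inline void decompress_chunks(struct chunk_head *ch)  { /* patch */ }

static void ccache_fault_sketch(struct chunk_head *ch)
{
retry:
	if (!bit_spin_trylock(PG_locked, &ch->flags)) {
		/* Someone else holds the chunk_head; wait and try again. */
		wait_on_chunk_head(ch);
		goto retry;
	}

	decompress_chunks(ch);

	/*
	 * bit_spin_unlock() does smp_mb__before_clear_bit() before clearing
	 * the bit, so the critical section cannot leak past the release even
	 * on weakly ordered architectures.
	 */
	bit_spin_unlock(PG_locked, &ch->flags);
}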
Nitin, it looks like the current code would indeed be correct; sorry for
the noise.
Thanks Nick.