MMU vs GPU (Was: The "iGoogle bug")

Fri Sep 21 09:50:44 EDT 2007

Right now, the X kernel driver infrastructure and memory management
aren't adequate, and are being worked on actively elsewhere.

Until that is done (or nearly done, to where the feedback to those
people would have value), work in this area isn't worth our time.

But someday.
                             - Jim

On Fri, 2007-09-21 at 13:31 +0200, NoiseEHC wrote:
> According to the databook, GP_BASE_OFFSET (page 270) is included in the 
> command buffer (page 239). If you push the command buffer into the 
> kernel then implementing that should be trivial. (Now I have realized 
> that X runs in user mode and this amd driver is not a kernel driver but 
> an X driver. What a stupid architecture...) Since GP_BASE_OFFSET defines 
> a 16MB long buffer on a 4MB boundary, if the driver rejects bitmaps 
> larger than 12MB then you do not even have to split at 16MB boundary 
> (only at every 4KB). Of course, you have to split at page boundary 
> anyway so you can just calculate GP_BASE_OFFSET every time...
> However, it is not clear whether it would increase speed enough to be 
> worth implementing; after all, one line of data will be held in the L1/L2 
> cache so the video processor does not need to fetch memory. Keeping 
> buffers in the X client's memory space would speed up things but I think 
> that it would break some X semantics (for example if you push something 
> to the X server, it would become a NOP but after that the client should 
> not modify the bitmap). Now this would take a looong time to implement.
> I wanted to look at X.org's implementation but their servers seem to be 
> down.
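
A minimal sketch of the GP_BASE_OFFSET window arithmetic described above,
assuming only the figures quoted in this message (a 16MB window aligned on a
4MB boundary, 4KB pages). The helper names are hypothetical, the sizes have
not been checked against the databook, and the actual register encoding is
not modelled:

```c
#include <stdint.h>

/* Sizes as quoted in the message; not verified against the databook. */
#define WINDOW_SIZE   (16u << 20)                    /* 16MB addressable window */
#define WINDOW_ALIGN  (4u << 20)                     /* window base granularity */
#define PAGE_BYTES    4096u                          /* 4KB page                */
#define MAX_BITMAP    (WINDOW_SIZE - WINDOW_ALIGN)   /* 12MB                    */

/* Window base: the physical address rounded down to a 4MB boundary. */
static uint32_t window_base(uint32_t phys)
{
    return phys & ~(WINDOW_ALIGN - 1u);
}

/* Offset of the address inside its window; always below 4MB. */
static uint32_t window_offset(uint32_t phys)
{
    return phys & (WINDOW_ALIGN - 1u);
}

/* True if len bytes starting at phys stay inside the 16MB window.
 * Because the offset is below 4MB, any len <= 12MB always fits. */
static int fits_in_window(uint32_t phys, uint32_t len)
{
    return len <= WINDOW_SIZE - window_offset(phys);
}

/* Bytes that can be transferred from addr before crossing a page
 * boundary, capped at the bytes remaining in the transfer. */
static uint32_t chunk_len(uint32_t addr, uint32_t remaining)
{
    uint32_t to_page_end = PAGE_BYTES - (addr & (PAGE_BYTES - 1u));
    return remaining < to_page_end ? remaining : to_page_end;
}
```

Since the pages of a user-space bitmap are generally not physically
contiguous, a driver would call chunk_len() once per page and recompute
window_base() for each physical page, which is why the 12MB limit makes the
16MB split unnecessary.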
> 
> Bernardo Innocenti wrote:
> > NoiseEHC wrote:
> >
> >   
> >>>  - Seeing if we can get the blitter to read source data directly from system
> >>>    memory.  I'd be very surprised if there was no way to make it work
> >>>    with virtual memory enabled, because, without such a mechanism, the
> >>>    blitter would be less than fully useful.
> >>>   
> >>>       
> >> Could somebody shed some light on this, please?
> >>
> >> I think the Linux kernel probably has some page-locking function 
> >> which returns a list of physical addresses for a virtual address, 
> >> doesn't it?
> >>     
> >
> > That's virt_to_phys(), yes... but it's not available in userspace.
> > All the people I've consulted agreed it's not easy to translate
> > virtual addresses from within a process.
> >
> >
> >   
> >> The Channel 3 DMA can be programmed to read from any 16MB block 
> >> in the 32-bit address space. Why is it hard to combine the two?
> >> Why is it even necessary to upload bitmaps to "video memory"?
> >>     
> >
> > Yes... UMA systems already pay a price in terms of memory bandwidth,
> > they should at least be compensated with the advantage of not having
> > to do the migration crap.
> >
> > It's very likely a leftover from the original PC architecture with
> > separate CGA/EGA/VGA cards.  Even now that GPUs are being integrated
> > on the same physical die as the CPU, they still look and act like
> > external PCI devices :-)
> >
> > DRM is supposed to help solve the virt_to_phys() problem.
> > But if we could do as you suggest and just use bitmaps scattered
> > through memory pages, we'd be *much* faster.
> >
> >   
-- 
Jim Gettys
One Laptop Per Child