Audio sample bank on the XO.

Jean Piché jean at piche.com
Sun Jul 15 08:40:06 EDT 2007



Hello all,

After trial-2 madness has passed, we have to make a serious audio  
decision: we really need a system-wide audio sample library. I  
suspect many activities that use sounds (including our own) will be  
blocked if a decision is not made. Here are some reasons:

• Presently, activities wanting to share audio resources cannot do so
through a shared directory. Containerizing activities may actually
aggravate that situation, but I am presuming this can and will be fixed
regardless. I will enter this on trac. When an activity supplies a
resource for its own private use even though the resource is of general
use, the result is wasteful duplication of data.

• We need a sample bank to play Standard MIDI Files off the internet.
I have a few misgivings about the quality of the music available for  
download in this format, but the SMF is a very compact and useful  
music storage technology. I am certain all will agree it should be  
supported on the machine.

• Last but not least: this resource needs to live locally, not on a
remote server. In the vast majority of cases, kids will use the  
resources that are immediately available and that do not require  
installation. There is no reason an appropriate system-wide resource  
cannot be put together for the XO.

Here is a proposal:

A default library:

There are compelling reasons to make the XO audio sample bank conform
to the General MIDI spec ( http://en.wikipedia.org/wiki/General_Midi ).
The specification contains 128 sounds plus a number
of drum/percussion sounds for a total of roughly 180 individual
sounds, many of them quite short. GM is biased towards western
instruments but it provides everything needed to correctly play MIDI
files off the internet. A GM sample bank could be put together for the XO
in a relatively modest space, leaving room for a number of other
sounds needed by individual activities such as TamTam, eToys or other
Csound-based activities.

Allocation on disk

A figure of 25MB was discussed at OLPC headquarters last year as a
disk allotment for sound file resources. I am assuming this is still
the case. The standard GM1 set can take a lot of disk space, a moderate
amount, or very little, depending on how it is planned. I propose that
the set be given 10MB of the available space. The rest can be made
available to activities needing special sounds. TamTam, for
instance, would use many sounds from the GM1 set and many custom sounds.

Location and privileges:

The audio sample bank should be located in the system tree
(/usr/share/sounds ?) where it is readily readable by any activity. A
policy is needed for write access, but activities would need to be able
to copy audio resources into this location at install time, up to a
limit of 25MB. To take the example of TamTam again, any
special sounds needed by TamTam would be copied into the designated
location at install time. The location would not be writable by
individual users. If kids wish to use their own sounds, these would
go into their home directory and activities would have to provide
ways of integrating those sounds.
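
To make the install-time copy with a size cap concrete, here is a
minimal sketch. It is only an illustration: install_sound and
bank_usage are hypothetical names, not existing OLPC installer code,
and I am assuming 25MB means 25,000,000 bytes and that the bank is a
plain directory tree.

```python
import os
import shutil

# Assumption: the 25MB allotment is counted as 25,000,000 bytes.
BANK_LIMIT_BYTES = 25 * 1000 * 1000


def bank_usage(bank_dir):
    """Return the total size in bytes of all files under bank_dir."""
    total = 0
    for root, _dirs, files in os.walk(bank_dir):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def install_sound(src_path, bank_dir, limit=BANK_LIMIT_BYTES):
    """Copy src_path into bank_dir unless the bank would exceed limit.

    Hypothetical helper: meant to be called by an activity installer,
    not by individual users (who would not have write access).
    """
    needed = os.path.getsize(src_path)
    if bank_usage(bank_dir) + needed > limit:
        raise OSError("sample bank quota exceeded")
    return shutil.copy(src_path, bank_dir)
```

An installer would call install_sound once per file and stop at the
first OSError. How the remaining space is shared between activities is
exactly the policy question raised above, and this sketch does not
settle it.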

Sampling rate and format:

Sampling rate has a major impact on the amount of storage required and
on audio quality. It also affects performance: a higher sampling rate
has a large performance cost. Here is a comparison of sampling rate (sr)
vs. duration for 25MB of storage of 16bit linear PCM monophonic audio
files:

sr				duration
32kHz		=	390 seconds
22.05kHz	=	566 seconds
16kHz		=	781 seconds
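
For anyone who wants to check the figures above or try other budgets,
here is a small sketch of the arithmetic. It assumes 25MB means
25,000,000 bytes and 2 bytes per sample (16bit mono);
seconds_of_audio is just a name I made up for the illustration.

```python
# Assumption: 25MB is counted as 25,000,000 bytes.
BUDGET_BYTES = 25 * 1000 * 1000
BYTES_PER_SAMPLE = 2  # 16bit linear PCM, one channel


def seconds_of_audio(sample_rate_hz, budget_bytes=BUDGET_BYTES):
    """Seconds of mono 16bit audio that fit in budget_bytes."""
    return budget_bytes / (sample_rate_hz * BYTES_PER_SAMPLE)


for sr in (32000, 22050, 16000):
    # Durations are truncated to whole seconds, as in the table.
    print(f"{sr / 1000:g}kHz = {int(seconds_of_audio(sr))} seconds")
```

Under the same assumptions, seconds_of_audio(32000, 10 * 1000 * 1000)
gives about 156 seconds for a 10MB GM1 set at 32kHz.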

Consider that the small speakers on the XO are not capable of
rendering anything above a 22kHz sr. On the other hand, earphones may
be available, giving a vastly improved bandwidth. We should also
consider the possibility of connecting the audio output of the XO to
external sound systems. Barry suggested that a 32kHz sr is desirable
to cover higher audio quality applications and I would agree with
that. Even at 32k, we have enough space for a good set of sounds in
addition to the GM1 set. It is also easy to resample from 32k down to
16k and 8k, so activities can compensate for the performance
loss at higher sr. The proposed format for individual sounds is:
16bit linear PCM single channel @ 32kHz.
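
To illustrate why going from 32k to 16k or 8k is cheap, here is a
deliberately crude sketch that halves the sampling rate by averaging
neighbouring samples. This is my own simplification for illustration
(halve_sample_rate is a made-up name): averaging pairs is only a very
rough low-pass filter, and a real resampler would use a proper
anti-aliasing filter.

```python
def halve_sample_rate(samples):
    """Halve the sampling rate of a list of integer PCM samples.

    Each output sample is the average of two neighbouring input
    samples. A trailing unpaired sample is dropped.
    """
    return [
        (samples[i] + samples[i + 1]) // 2
        for i in range(0, len(samples) - 1, 2)
    ]


# 32k -> 16k -> 8k by applying the same step twice.
at_32k = [0, 10, 20, 30, 40, 50, 60, 70]
at_16k = halve_sample_rate(at_32k)
at_8k = halve_sample_rate(at_16k)
```

Because 16k and 8k are integer divisions of 32k, no interpolation
between samples is needed, which is part of what makes 32k a
convenient base rate.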

Keep in mind that this concerns audio samples only, not audio  
soundfiles in general. Playing mp3 and wave files is an entirely  
different problem which is not concerned with sampling rate issues.


Sound names:

The GM1 sample set proposes standard names for sounds along with
their MIDI Program Change number from 0 to 127:
http://www.midi.org/about-midi/gm/gm1sound.shtml . I propose that the
sounds simply be stored under these names in WAV or AIFF format with,
possibly, a number at the end of the name. Any other slots above the GM
set would start numbering from, say, 256, with the name of the sound
file determined by the activity that puts it in the bank.
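
One possible reading of this naming scheme, as a sketch only: the
underscore separator, the zero-padded number, and the function names
are my assumptions for illustration, and GM1_NAMES is a three-entry
excerpt of the GM1 list using the 0 to 127 numbering.

```python
import re

# Assumption: custom sounds start at 256, as suggested above.
CUSTOM_SLOT_START = 256

# Excerpt only; the full GM1 list has one name per program number.
GM1_NAMES = {
    0: "Acoustic Grand Piano",
    40: "Violin",
    73: "Flute",
}


def _slug(name):
    """Replace runs of non-alphanumeric characters with underscores."""
    return re.sub(r"[^A-Za-z0-9]+", "_", name).strip("_")


def gm_filename(program, ext="wav"):
    """File name for a GM1 sound: GM name plus program number."""
    return f"{_slug(GM1_NAMES[program])}_{program:03d}.{ext}"


def custom_filename(name, slot, ext="wav"):
    """File name for an activity-supplied sound in a slot >= 256."""
    if slot < CUSTOM_SLOT_START:
        raise ValueError("custom slots start at 256")
    return f"{_slug(name)}_{slot:03d}.{ext}"
```

With this scheme gm_filename(0) would give
"Acoustic_Grand_Piano_000.wav". Whether the number belongs at the end
of the name at all is still open, as noted above.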

Availability and Licensing:

A good sample set is not trivial to put together. The problem of
licensing is also especially thorny in our case. Some of the public
domain sounds used in TamTam may have slight restrictions and we will
need to address this before FRS. Whatever the situation, and if
resources are not readily available to develop a sample set of our
own, I strongly suggest that OLPC start looking at companies that
would be willing to license one of their existing sample sets. Rick
Boulanger did say last month that negotiations were taking place with
M-Audio to adapt one of their sample libraries. Are there
developments in that direction? I would hazard that OLPC will have no
choice but to negotiate a deal with an existing sample company
or, alternatively, record its own bank. GPL sounds do exist out there
but the quality is spotty and not all categories of sounds are
readily available.


Comments sought and welcome!


Jean Piché
TamTam




