[Sugar-devel] The quest for data

Martin Dluhos martin at gnu.org
Fri Jan 10 06:37:02 EST 2014

On 10.1.2014 11:55, Anish Mangal wrote:
> Sorry for being late to the party. Clearly the "quest for data" is a commonly
> shared one, with many different approaches, questions, and reporting/results.
> One of the already mentioned solutions is the sugar-stats package, originally
> developed by Aleksey, which has now been part of dextrose-sugar builds for over
> a year, and of the server side (xsce).
> http://wiki.sugarlabs.org/go/Platform_Team/Usage_Statistics
> The approach we followed was to collect as much data as possible without
> interfering with sugar-apis or code. The project has made slow progress on the
> visualization front, but the data collection front has already been field tested.
> I for one think there are a few technical trade-offs, which lead to larger
> strategy decisions:
> * Context vs. Universality ... Ideally we'd like to collect (activity) context
> specific data, but that requires tinkering with the sugar api itself and each
> activity. On the other hand, we might be ignoring the other types of data a
> server might be collecting ... internet usage and the various other logfiles in
> /var/log
> * Static vs. Dynamic ... Analyzing journal backups is great, but they are
> ultimately limited in time resolution due to the datastore's design itself. So
> the key question is "what's valuable?" ... a) Frequency counts of activities?
> b) Data such as up-to-the-minute resolution of what activities are running, which
> activity is active (visible, and when), collaborators over time ... etc ...
> In my humble opinion, the next steps could be: 
> 1 Get better on the visualization front. 
> 2 Search for more context. Maybe arm the sugar-datastore to collect higher
> resolution data. 

I think that you are absolutely right, Anish. In my project, I am currently
focused on the former point, but I am running into limitations regarding the
data stored in the datastore. As Sameer suggested, let's create a wiki page with
a list of the data that the community finds important and then compare that
list with what's currently collected in the datastore.
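
As a rough sketch of what that comparison could look like once both lists
exist: the field names below are hypothetical placeholders, not the actual
datastore metadata keys or anything agreed on by the community. The real lists
would come from the wiki page and from inspecting the datastore.

```python
# Hypothetical field names, for illustration only.
wanted = {"activity", "title", "mtime", "time_active", "collaborators"}
collected = {"activity", "title", "mtime", "mime_type"}


def compare_fields(wanted, collected):
    """Return (missing, covered, unused) as sorted lists of field names."""
    missing = sorted(wanted - collected)   # requested, but not stored today
    covered = sorted(wanted & collected)   # requested and already stored
    unused = sorted(collected - wanted)    # stored, but nobody asked for it
    return missing, covered, unused


missing, covered, unused = compare_fields(wanted, collected)
print("wanted but not collected:", missing)
print("wanted and collected:", covered)
print("collected but not requested:", unused)
```

The "wanted but not collected" list is the interesting one, since it shows
where the datastore would need to be extended.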
