very simple datastore reimplementation

Tomeu Vizoso tomeu at tomeuvizoso.net
Wed May 7 06:42:49 EDT 2008


Hi all,

as you know, the DS is currently one of the weakest points in Sugar.
In my opinion, the main cause is that the implementation was
started with very ambitious goals and little actual knowledge of what
would be needed in the short and medium term.

The current codebase is quite a bit more flexible than what we
actually need, and this has a cost in maintainability (which in turn
affects stability) and performance. We now have much more information
about what's needed than back then, so perhaps we should rethink some
of the tradeoffs we made.

I have lately been thinking about what a new implementation that
provided just what we need would look like, and reached the
conclusion that it would be quite easy to write and could bring us the
stability and performance that we need, along with a more adequate
foundation on which to build the new features that we now know are
urgently needed.

It turned out to be quite easy, as all the hardest problems were
already solved in the current implementation. In little more than 500
lines of code, we get a useful replacement that lacks:

- Support for mounting removable datastores. We have agreed on moving
to just listing the files on removable devices, without the DS being
involved. Although extending the DS capacity with SD cards is an
interesting feature, it brings many non-trivial issues that make it
a longer-term feature.

- Filtering by arbitrary metadata properties. If only the journal is
allowed to execute find() calls, then we can restrict the properties
that need to be stored in the index, gaining disk space and speed when
searching. Do we really need this functionality?

And we gain:

- Small and simple codebase.

- More robustness in the storage of metadata, as it is stored in plain
text files in the file system.

- More robustness regarding the use of the full text engine, as it
is only used by the journal, which could fall back to a plain list
of the files in case the index got corrupted somehow.

- Storage of custom metadata properties such as the current page
number in Read (this important feature isn't working right now due to
some fragility in how we interact with the full text index).
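
To illustrate the two metadata points above, here is a minimal sketch of
how a custom property could be stored in a plain text metadata file. It
is not the code in the repository: the function name update_metadata and
the "page" property are made up for the example, and writing through a
temporary file followed by a rename is my own assumption about how to
keep the file safe from partial writes. The JSON format matches the
metadata file shown further down.

```python
import json
import os
import tempfile


def update_metadata(entry_dir, **properties):
    """Merge properties into the JSON 'metadata' file inside entry_dir."""
    path = os.path.join(entry_dir, "metadata")
    metadata = {}
    if os.path.exists(path):
        with open(path) as f:
            metadata = json.load(f)
    metadata.update(properties)

    # Assumption: write to a temporary file in the same directory and then
    # rename it over the old one, so that a crash in the middle of a write
    # cannot leave a truncated metadata file behind.
    fd, tmp_path = tempfile.mkstemp(dir=entry_dir, prefix=".metadata-")
    with os.fdopen(fd, "w") as f:
        json.dump(metadata, f)
    os.replace(tmp_path, path)
    return metadata
```

An activity such as Read could then call something like
update_metadata(entry_dir, page=12) when it is closed.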

Work still left to do:

- Complete the query support, basically copying code from the older DS.

- Measure performance and possibly trade some of the current
simplicity for faster operation.

- Expose the files with a human readable name, for legacy apps and
maybe for backups? Using a FUSE plugin?

- Monitor one dir that legacy apps will be allowed to write files to,
and move new files to the datastore along with some default metadata.

- Delta compression and version tracking.

- Index rebuilding after corruption.

- Full text indexing of the text content of files.
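
As a rough idea of what the legacy directory item above could look like,
here is a sketch of a single pass over an incoming directory. It is only
an illustration: import_new_files is a hypothetical name, the default
metadata fields are guesses based on the metadata file shown further
down, and a real implementation would probably watch the directory
instead of scanning it. It assumes the file layout described later in
this mail (two-character prefix directory, then one directory per uid).

```python
import json
import os
import shutil
import time
import uuid


def import_new_files(incoming_dir, store_root):
    """Move each regular file in incoming_dir into the store.

    Returns the list of uids that were created.
    """
    imported = []
    for name in sorted(os.listdir(incoming_dir)):
        source = os.path.join(incoming_dir, name)
        if not os.path.isfile(source):
            continue

        # Layout assumed from this mail: <root>/<first 2 chars>/<uid>/<uid>
        uid = str(uuid.uuid4())
        entry_dir = os.path.join(store_root, uid[:2], uid)
        os.makedirs(entry_dir)
        shutil.move(source, os.path.join(entry_dir, uid))

        # Hypothetical default metadata; the real set of fields is undecided.
        metadata = {"uid": uid, "title": name, "timestamp": int(time.time())}
        with open(os.path.join(entry_dir, "metadata"), "w") as f:
            json.dump(metadata, f)
        imported.append(uid)
    return imported
```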

Here is the code:

http://dev.laptop.org/git?p=users/tomeu/datastore;a=summary

Rough guide for trying on the xo:

- yum install git
- git-clone git://dev.laptop.org/users/tomeu/datastore datastore
- cd datastore
- ./autogen.sh --prefix=/usr
- make
- sudo make install
- ctrl-alt-del

The current file layout is as follows:

[olpc@xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/
01  0e  1d  2a  43  59  60  6a  76  87  93  a4  ac  b4  bf  d4  e7  f4  ff
02  11  20  33  44  5a  65  6d  78  8a  95  a5  af  b5  c4  d6  e8  f6  index
05  13  21  37  48  5c  66  72  7e  8d  96  a6  b1  b8  ce  d8  e9  f8
06  18  28  3f  49  5d  68  74  81  91  97  a7  b2  bb  d1  de  ec  fc
0b  1b  29  40  4e  5f  69  75  85  92  9d  a9  b3  bc  d2  e5  ed  fd

[olpc@xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/01
012beac5-9d4e-477e-848d-d7ef6a731fca

[olpc@xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/
012beac5-9d4e-477e-848d-d7ef6a731fca  extra_metadata  metadata

[olpc@xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/extra_metadata/
preview

[olpc@xo-0C-D0-FF ~]$ file ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/012beac5-9d4e-477e-848d-d7ef6a731fca
/home/olpc/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/012beac5-9d4e-477e-848d-d7ef6a731fca: OpenDocument Text

[olpc@xo-0C-D0-FF ~]$ cat ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/metadata
{"activity_id": "e8594bea74faa80539d93ef1a10de3c712bb2eac",
"title_set_by_user": "0", "uid":
"012beac5-9d4e-477e-848d-d7ef6a731fca", "title": "Write Activity",
"timestamp": 1210006266, "mtime": "2008-05-05T16:51:06.143085",
"fulltext": "mec mac", "keep": "0", "icon-color": "#00588C,#00EA11",
"mime_type": "application/vnd.oasis.opendocument.text", "activity":
"org.laptop.AbiWordActivity", "share-scope": "private"}
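
For readers who prefer code to directory listings, the layout above can
be summarized in a few lines. This is a sketch derived from the listings
only, not an excerpt from the repository; the function names entry_dir
and load_metadata are made up for the example.

```python
import json
import os


def entry_dir(store_root, uid):
    """Directory of an entry: <root>/<first two chars of uid>/<uid>."""
    return os.path.join(store_root, uid[:2], uid)


def load_metadata(store_root, uid):
    """Read the entry's plain text 'metadata' file as a dict."""
    with open(os.path.join(entry_dir(store_root, uid), "metadata")) as f:
        return json.load(f)
```

The file with the entry's data lives in the same directory and is named
after the uid, that is, os.path.join(entry_dir(store_root, uid), uid).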

[olpc@xo-0C-D0-FF ~]$ ls -l ~/.sugar/default/datastore2/index/
total 101
-rw------- 1 olpc olpc     0 2008-05-07 07:54 flintlock
-rw-rw-r-- 1 olpc olpc    12 2008-05-05 16:48 iamflint
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 postlist.baseA
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 postlist.baseB
-rw-rw-r-- 1 olpc olpc 16384 2008-05-07 10:36 postlist.DB
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 record.baseA
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 record.baseB
-rw-rw-r-- 1 olpc olpc 16384 2008-05-07 10:36 record.DB
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 termlist.baseA
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 termlist.baseB
-rw-rw-r-- 1 olpc olpc 16384 2008-05-07 10:36 termlist.DB
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 value.baseA
-rw-r--r-- 1 olpc olpc    17 2008-05-07 10:36 value.baseB
-rw-rw-r-- 1 olpc olpc 49152 2008-05-07 10:36 value.DB
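
Since the metadata lives outside the index, the journal could still show
something when the index directory is unusable. A minimal sketch of such
a fallback, assuming only the layout shown above (list_entries is a
hypothetical name, and sorting by "timestamp" is my guess at a sensible
default order):

```python
import json
import os


def list_entries(store_root):
    """Return the metadata of every entry, newest first, without the index."""
    entries = []
    for prefix in sorted(os.listdir(store_root)):
        prefix_dir = os.path.join(store_root, prefix)
        # Skip the full text index and anything that is not a directory.
        if prefix == "index" or not os.path.isdir(prefix_dir):
            continue
        for uid in sorted(os.listdir(prefix_dir)):
            metadata_path = os.path.join(prefix_dir, uid, "metadata")
            try:
                with open(metadata_path) as f:
                    entries.append(json.load(f))
            except (OSError, ValueError):
                # Missing or unparseable metadata: skip this entry.
                continue
    return sorted(entries, key=lambda m: m.get("timestamp", 0), reverse=True)
```

The same walk could also serve as the starting point for rebuilding the
index after corruption.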

Opinions?

Thanks,

Tomeu


