very simple datastore reimplementation
Tomeu Vizoso
tomeu at tomeuvizoso.net
Wed May 7 06:42:49 EDT 2008
Hi all,
as you know, the DS is currently one of the weakest points in Sugar.
In my opinion, the main cause is that the implementation was started
with very ambitious goals and with little actual knowledge of what
would be needed in the short and medium term.
The current codebase is quite a bit more flexible than what we
actually need, and this comes at a cost in maintainability (which in
turn affects stability) and performance. We now have much more
information about what's needed than back then, so perhaps we should
rethink the trade-offs we made at the time.
I have lately been thinking about what a new implementation that
provided just what we need would look like, and reached the
conclusion that it would be quite easy to write and could bring us the
stability and performance that we need, along with a more adequate
base on which to build the new features that we now know are
urgently needed.
It turned out to be quite easy, as all the hardest problems were
already solved in the current implementation. In little more than 500
lines of code, we get a useful replacement that lacks:
- Support for mounting removable datastores. We have agreed on moving
to just listing the files in removable devices, without the DS being
involved. Although extending the DS capacity with SD cards is an
interesting feature, it brings many non-trivial issues that make this
a longer term feature.
- Filtering by arbitrary metadata properties. If only the journal is
allowed to execute find() calls, then we can restrict the properties
that need to be stored in the index, gaining disk space and speed when
searching. Do we really need this functionality?
And we gain:
- Small and simple codebase.
- More robustness in the storage of metadata, as it is stored in plain
text files in the file system.
- More robustness regarding the use of the full text engine, as it
is only used by the journal, which could fall back to a plain list
of the files in case the index got corrupted.
- Storage of custom metadata properties such as the current page
number in Read (this important feature isn't working right now due to
some fragility in how we interact with the full text index).
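
To illustrate the fallback mentioned above, here is a rough sketch of
how the journal could list entries without touching the index. It is
not taken from the actual code: the function name is made up, and it
assumes the layout listed later in this mail (one directory per entry
under a two-character prefix, each holding a "metadata" file) and that
the metadata file parses as JSON.

```python
import json
import os


def list_entries_without_index(root):
    """Return metadata dicts for every entry found by walking the store.

    Hypothetical helper: assumes root/<first two uid chars>/<uid>/metadata
    and that each metadata file is valid JSON.
    """
    entries = []
    for prefix in sorted(os.listdir(root)):
        prefix_dir = os.path.join(root, prefix)
        # Skip the full text index directory and any stray files.
        if prefix == 'index' or not os.path.isdir(prefix_dir):
            continue
        for uid in sorted(os.listdir(prefix_dir)):
            metadata_path = os.path.join(prefix_dir, uid, 'metadata')
            try:
                with open(metadata_path) as f:
                    entries.append(json.load(f))
            except (OSError, ValueError):
                # Missing or malformed metadata: still list the entry,
                # with only the uid taken from the directory name.
                entries.append({'uid': uid})
    return entries
```

Such a listing would be slower than a query against the index and
offers no searching, but it only depends on the plain files.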
Work still left to do:
- Complete the query support, basically copying code from the older DS.
- Measure performance and possibly trade some of the current
simplicity for faster operation.
- Expose the files with a human readable name, for legacy apps and
maybe for backups? Using a FUSE plugin?
- Monitor one dir where legacy apps will be allowed to write files to,
and move new files to the datastore along with some default metadata.
- Delta compression and version tracking.
- Index rebuilding after corruption.
- Full text indexing of the text content of files.
Here is the code:
http://dev.laptop.org/git?p=users/tomeu/datastore;a=summary
Rough guide for trying on the xo:
- yum install git
- git-clone git://dev.laptop.org/users/tomeu/datastore datastore
- cd datastore
- ./autogen.sh --prefix=/usr
- make
- sudo make install
- ctrl-alt-del
The current file layout is as follows:
[olpc at xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/
01 0e 1d 2a 43 59 60 6a 76 87 93 a4 ac b4 bf d4 e7 f4 ff
02 11 20 33 44 5a 65 6d 78 8a 95 a5 af b5 c4 d6 e8 f6 index
05 13 21 37 48 5c 66 72 7e 8d 96 a6 b1 b8 ce d8 e9 f8
06 18 28 3f 49 5d 68 74 81 91 97 a7 b2 bb d1 de ec fc
0b 1b 29 40 4e 5f 69 75 85 92 9d a9 b3 bc d2 e5 ed fd
[olpc at xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/01
012beac5-9d4e-477e-848d-d7ef6a731fca
[olpc at xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/
012beac5-9d4e-477e-848d-d7ef6a731fca extra_metadata metadata
[olpc at xo-0C-D0-FF ~]$ ls ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/extra_metadata/
preview
[olpc at xo-0C-D0-FF ~]$ file ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/012beac5-9d4e-477e-848d-d7ef6a731fca
/home/olpc/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/012beac5-9d4e-477e-848d-d7ef6a731fca: OpenDocument Text
[olpc at xo-0C-D0-FF ~]$ cat ~/.sugar/default/datastore2/01/012beac5-9d4e-477e-848d-d7ef6a731fca/metadata
{"activity_id": "e8594bea74faa80539d93ef1a10de3c712bb2eac",
"title_set_by_user": "0", "uid":
"012beac5-9d4e-477e-848d-d7ef6a731fca", "title": "Write Activity",
"timestamp": 1210006266, "mtime": "2008-05-05T16:51:06.143085",
"fulltext": "mec mac", "keep": "0", "icon-color": "#00588C,#00EA11",
"mime_type": "application/vnd.oasis.opendocument.text", "activity":
"org.laptop.AbiWordActivity", "share-scope": "private"}
[olpc at xo-0C-D0-FF ~]$ ls -l ~/.sugar/default/datastore2/index/
total 101
-rw------- 1 olpc olpc 0 2008-05-07 07:54 flintlock
-rw-rw-r-- 1 olpc olpc 12 2008-05-05 16:48 iamflint
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 postlist.baseA
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 postlist.baseB
-rw-rw-r-- 1 olpc olpc 16384 2008-05-07 10:36 postlist.DB
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 record.baseA
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 record.baseB
-rw-rw-r-- 1 olpc olpc 16384 2008-05-07 10:36 record.DB
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 termlist.baseA
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 termlist.baseB
-rw-rw-r-- 1 olpc olpc 16384 2008-05-07 10:36 termlist.DB
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 value.baseA
-rw-r--r-- 1 olpc olpc 17 2008-05-07 10:36 value.baseB
-rw-rw-r-- 1 olpc olpc 49152 2008-05-07 10:36 value.DB
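
The layout above can be summarized in a few lines of Python. This is
only a sketch and not the code in the repository: the function names
are invented for illustration, and it assumes that the metadata file
is JSON (as the output of cat suggests) and that the first two
characters of the uid name the top-level directory.

```python
import json
import os


def entry_dir(root, uid):
    """Directory for an entry: root/<first two characters of uid>/<uid>."""
    return os.path.join(root, uid[:2], uid)


def write_metadata(root, uid, metadata):
    """Write the metadata dict as JSON into the entry's metadata file."""
    path = entry_dir(root, uid)
    if not os.path.isdir(path):
        os.makedirs(path)
    # Write to a temporary file first and then rename it, so that an
    # interrupted write does not leave a truncated metadata file behind.
    tmp_path = os.path.join(path, 'metadata.tmp')
    with open(tmp_path, 'w') as f:
        json.dump(metadata, f)
    os.replace(tmp_path, os.path.join(path, 'metadata'))


def read_metadata(root, uid):
    """Load the metadata dict of an entry from its metadata file."""
    with open(os.path.join(entry_dir(root, uid), 'metadata')) as f:
        return json.load(f)
```

Because nothing here depends on the index, custom properties (for
example a page number written by Read) would round-trip through the
metadata file regardless of the state of the full text engine.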
Opinions?
Thanks,
Tomeu