Summary
This documentation describes the internal technical processes of the Receipts Space app, to make its data easier to understand and work with outside the app. The architecture follows the principles of “Local First” and “File Over App”.
File format - Sync
Log
Changes to data are written to a continuous “log”. Each installation writes its own log, and installations are distinguished by a unique clientId. The entries in each log are numbered consecutively with integers, starting at 0. In the file system, each entry is stored as a separate file for technical reasons (conflict avoidance, integrity checking, and simpler handling by external file sync services). These files are distributed across directories of 1000 files each.
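A minimal sketch of how an entry index could map to a directory, assuming the distribution simply groups consecutive indices in blocks of 1000. The function name and the grouping rule are assumptions for illustration; the actual directory and file naming is not specified in this document.

```javascript
// Hypothetical helper: which block of 1000 entries a log index falls into.
// Assumes consecutive indices are grouped (0-999, 1000-1999, ...).
const FILES_PER_DIRECTORY = 1000;

function bucketForIndex(index) {
  if (!Number.isInteger(index) || index < 0) {
    throw new RangeError('log index must be a non-negative integer');
  }
  return Math.floor(index / FILES_PER_DIRECTORY);
}

console.log(bucketForIndex(0));    // 0
console.log(bucketForIndex(999));  // 0
console.log(bucketForIndex(1000)); // 1
```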
Changes
The changes to the local database are collected as a list and grouped into a transaction (transactions). Each database object is uniquely identified by the value of _id. The integer value of _v is used for the Last-Write-Wins (LWW) method (Lamport clock) to ensure that the database is eventually consistent when it is restored. During sync, a new value is applied only if its _v is greater than the value already present. If there is a conflict, the entry with the higher timestamp wins. As a result, the order in which the change entries are applied is irrelevant (CRDT).
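The rule above can be sketched as follows. This is an illustration under stated assumptions, not the app's implementation: each change is assumed to replace the whole object, a Map stands in for the local database, and the timestamp tie-break is left out because the source of that timestamp is not described here. The names applyChange and applyAll and the second title value are hypothetical.

```javascript
// Apply one change only if it is newer than what is stored (Last-Write-Wins).
function applyChange(db, change) {
  const existing = db.get(change._id);
  if (existing === undefined || change._v > existing._v) {
    db.set(change._id, change);
  }
}

// Replay a list of changes into a fresh in-memory "database".
function applyAll(changes) {
  const db = new Map();
  for (const change of changes) {
    applyChange(db, change);
  }
  return db;
}

const older = { _id: 'x', _type: 'category', _v: 1, title: 'Telekommunikation' };
const newer = { _id: 'x', _type: 'category', _v: 2, title: 'Telecommunications' };

// Both orders converge to the same state.
console.log(applyAll([older, newer]).get('x').title); // Telecommunications
console.log(applyAll([newer, older]).get('x').title); // Telecommunications
```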
File format
In its simplest form, the file is stored as a JSONL file. The first line is the header; everything after the first line break, in its exact binary representation, is the content.
The header contains the following check elements:
- s: The size of the content in bytes.
- c: The SHA-256 checksum of the content, encoded as Base64 with the URL-safe alphabet (- instead of + and _ instead of /) and without padding.
- t: The time of creation as a timestamp in seconds.
- v: The version number of the format. It is currently 1; the field allows for future changes to the format.
Example:
{"s":1473,"v":1,"c":"QHqyEU4WJOFsnxitlmsXFmpCXV2kZCCctzvO50_3IgM","t":1732311704}
The individual changes follow, one JSON object per line, as defined by JSONL (see Changes above). Example:
{"_id":"47855a70de36449f821d40b45f8c170a","_type":"category","_v":1,"title":"Telekommunikation"}
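A minimal sketch of reading and verifying one such file in Node.js, based on the header fields described above. The function names contentChecksum and parseSyncFile are hypothetical, and the sketch assumes the content is UTF-8 encoded JSONL; it is not taken from the app's own code.

```javascript
const crypto = require('node:crypto');

// SHA-256 of the content as Base64 with the URL-safe alphabet, without padding.
function contentChecksum(content) {
  return crypto.createHash('sha256').update(content).digest('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
}

// Split a file buffer into header and content, verify it, and parse the changes.
function parseSyncFile(fileBuffer) {
  const lineBreak = fileBuffer.indexOf(0x0a); // first "\n" ends the header
  if (lineBreak === -1) {
    throw new Error('missing header line');
  }
  const header = JSON.parse(fileBuffer.subarray(0, lineBreak).toString('utf8'));
  const content = fileBuffer.subarray(lineBreak + 1);

  if (header.v !== 1) {
    throw new Error(`unsupported format version: ${header.v}`);
  }
  if (content.length !== header.s) {
    throw new Error(`size mismatch: expected ${header.s}, got ${content.length}`);
  }
  if (contentChecksum(content) !== header.c) {
    throw new Error('checksum mismatch');
  }

  // Assumption: the content is UTF-8 JSONL, one change object per non-empty line.
  const changes = content.toString('utf8')
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));

  return { header, changes };
}
```

The size and checksum are computed over the raw bytes before any decoding, so the check still holds if the content contains multi-byte characters.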
File attachments
Files are stored outside the database as assets (assets). Like the transactions, they are stored per client with a consecutive index. Database entries refer to assets with a special URL that contains the following data:
- cid: The client ID.
- idx: The index within the asset repository.
Metadata is not stored separately; it is part of the reference, which in turn allows the consistency of the data to be checked. The reference also carries:
- checksum: SHA-256, encoded as Base64.
- size: Size in bytes.
- type: MIME type.
- name: File name.
The link to the asset is encoded as a URL:
asset:///cid/idx/name?t=type&s=size&d=checksum
Example:
asset:///1EH7BEtuL9xOz5aTpEyI4K/466/unnamed?s=6284&t=application%2Fpdf&d=JAd0HmXcSIVVdYMmDBjfVZeTvAyXQ94GmjA6CwSwOYU
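A minimal sketch of splitting such a reference back into its parts. The function name parseAssetUrl is hypothetical, and the sketch assumes that the file name in the path is percent-encoded in the same way as the query values.

```javascript
// Parse "asset:///cid/idx/name?t=type&s=size&d=checksum" into its components.
function parseAssetUrl(url) {
  const prefix = 'asset:///';
  if (!url.startsWith(prefix)) {
    throw new Error('not an asset URL');
  }
  const rest = url.slice(prefix.length);
  const queryStart = rest.indexOf('?');
  const path = queryStart === -1 ? rest : rest.slice(0, queryStart);
  const query = new URLSearchParams(queryStart === -1 ? '' : rest.slice(queryStart + 1));

  const [cid, idx, ...nameParts] = path.split('/');
  return {
    cid,
    idx: Number.parseInt(idx, 10),
    // Assumption: the name is percent-encoded like the query values.
    name: decodeURIComponent(nameParts.join('/')),
    type: query.get('t'),
    size: Number.parseInt(query.get('s'), 10),
    checksum: query.get('d'),
  };
}

const asset = parseAssetUrl(
  'asset:///1EH7BEtuL9xOz5aTpEyI4K/466/unnamed?s=6284&t=application%2Fpdf&d=JAd0HmXcSIVVdYMmDBjfVZeTvAyXQ94GmjA6CwSwOYU'
);
console.log(asset.cid, asset.idx, asset.name, asset.type, asset.size);
// 1EH7BEtuL9xOz5aTpEyI4K 466 unnamed application/pdf 6284
```

The query parameters are looked up by name, so their order in the URL does not matter; the example above lists s before t, while the template lists t first.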
Special data types
- A reference to another data set is realized by the _id of the target entry. The name of the property usually corresponds to the type of the target object (its _type).
- A date without time is realized by an integer value with the format YYYYMMDD. December 1, 2024 is therefore represented as 20241201.
- An amount is represented as a floating point number, e.g. 1.23.
- Timestamps are usually accurate to the second.
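The integer date format can be converted with plain arithmetic. The following is a small sketch; the helper names parseDateInt and toDateInt are hypothetical and not part of the app.

```javascript
// Split a YYYYMMDD integer into its calendar parts.
function parseDateInt(value) {
  return {
    year: Math.floor(value / 10000),
    month: Math.floor(value / 100) % 100,
    day: value % 100,
  };
}

// Build a YYYYMMDD integer from calendar parts.
function toDateInt({ year, month, day }) {
  return year * 10000 + month * 100 + day;
}

console.log(parseDateInt(20241201)); // { year: 2024, month: 12, day: 1 }
console.log(toDateInt({ year: 2024, month: 12, day: 1 })); // 20241201
```

Because the most significant digits hold the year, these integers sort in chronological order without any parsing.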
Example
Node.js project that reads the Receipts Sync file, creates the database and writes all data (JSON) and assets to the export folder: