This documentation describes the technical architecture of the data format for use outside the Receipts Space app. The design follows the principles of “Local First” and “File Over App”.
Highlights
- Offline-First: Fully usable without a server
- Conflict-Free: CRDT-based synchronization across arbitrary storage media
- Tamper-Resistant: Chained transactions with optional signing
- Durable: Simple, documented format (JSON/JSONL)
- Portable: Data belongs to the user, not the app
Example code on GitHub
Directory Structure
A workspace is organized as follows:
workspace/
├── info.json → Metadata
├── transactions/<clientId>/<...>/<idx>.dat → Change log
└── assets/<clientId>/<...>/<idx>.dat → File attachments
Workspace Metadata (info.json)
| Field | Description |
|---|---|
| apiVersion | Format version, currently 3 |
| workspaceType | Always receipts2 for Receipts Space |
| workspaceId | Unique workspace ID |
| createDate | Creation time (Unix timestamp in seconds) |
| encryption | (optional) Encryption configuration → Encryption |
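
As a rough illustration of how a reader might consume these fields, the sketch below loads info.json and checks the two fixed values from the table. The function names are hypothetical, and refusing unknown versions is an assumption rather than documented behavior.

```python
import json
from pathlib import Path

SUPPORTED_API_VERSION = 3     # "currently 3" per the table
WORKSPACE_TYPE = "receipts2"  # fixed value for Receipts Space


def load_workspace_info(workspace: Path) -> dict:
    """Read <workspace>/info.json and check the fixed fields."""
    info = json.loads((workspace / "info.json").read_text(encoding="utf-8"))
    if info.get("workspaceType") != WORKSPACE_TYPE:
        raise ValueError(f"unexpected workspaceType: {info.get('workspaceType')!r}")
    if info.get("apiVersion") != SUPPORTED_API_VERSION:
        # Assumption: a reader refuses versions it does not know.
        raise ValueError(f"unsupported apiVersion: {info.get('apiVersion')!r}")
    return info


def is_encrypted(info: dict) -> bool:
    """The optional 'encryption' object marks an encrypted workspace."""
    return "encryption" in info
```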
Transactions
Changes are stored in an append-only log. Each client maintains its own log identified by a unique clientId. Entries are numbered sequentially (0.dat, 1.dat, …) and, to avoid filesystem conflicts, are split into subdirectories using a distribution algorithm (max. 1000 files per folder).
Conflict-Free Synchronization (CRDT)
Each database record contains:
| Field | Description |
|---|---|
| _id | Unique ID of the record |
| _type | Data type (e.g., category, receipt) |
| _v | Version counter (Lamport clock) |
Synchronization uses Last-Write-Wins (LWW): a record is adopted only if its _v is greater than the existing value. If _v is equal, the timestamp decides. This makes application order irrelevant (CRDT).
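
The merge rule can be sketched in a few lines of Python. This is a minimal illustration, not the app's implementation: the in-memory store and the function name are hypothetical, and it assumes the deciding timestamp is the t value of the transaction that carried the record. The text does not say what happens when both _v and the timestamp are equal; this sketch keeps the existing record in that case.

```python
from typing import Dict, Tuple

# Maps _id -> (record, timestamp of the transaction that carried it).
Store = Dict[str, Tuple[dict, int]]


def apply_record(store: Store, record: dict, timestamp: int) -> bool:
    """Apply one incoming record with Last-Write-Wins; return True if adopted."""
    current = store.get(record["_id"])
    if current is not None:
        existing, existing_ts = current
        # Compare (_v, timestamp) as a pair: _v first, timestamp as tie-breaker.
        if (record["_v"], timestamp) <= (existing["_v"], existing_ts):
            return False
    store[record["_id"]] = (record, timestamp)
    return True
```

Because the comparison depends only on the (_v, timestamp) pair, applying the same set of records in any order leaves the store in the same state, as long as no two competing records share both values.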
File Format
Each transaction file consists of two parts separated by a newline:
- Header (first line): JSON object with metadata
- Content (rest): JSONL (newline-delimited JSON) formatted changes (UTF-8)
Header fields:
| Field | Description |
|---|---|
| s | Content size in bytes |
| c | SHA256 checksum of the content (URL-safe Base64, no padding) |
| t | Creation time (Unix timestamp in seconds) |
| v | Format version, currently 1 |
| p | (optional) Hash of the previous transaction → Chaining |
| did | (optional) Unique Device ID of the installation |
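
A file in this layout could be read roughly as follows. The sketch assumes that s and c cover exactly the bytes after the first newline; the function names are hypothetical.

```python
import base64
import hashlib
import json
from typing import List, Tuple


def b64url_nopad(digest: bytes) -> str:
    """URL-safe Base64 without '=' padding, as described for the c field."""
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def parse_transaction(raw: bytes) -> Tuple[dict, List[dict]]:
    """Split a transaction file into header and records, checking s and c."""
    header_line, sep, content = raw.partition(b"\n")
    if not sep:
        raise ValueError("missing newline between header and content")
    header = json.loads(header_line)
    if header.get("v") != 1:
        raise ValueError(f"unsupported format version: {header.get('v')!r}")
    # Assumption: s and c refer to everything after the first newline.
    if header["s"] != len(content):
        raise ValueError("content size does not match header field s")
    if header["c"] != b64url_nopad(hashlib.sha256(content).digest()):
        raise ValueError("content checksum does not match header field c")
    # Content is JSONL: one JSON object per non-empty line.
    records = [json.loads(line) for line in content.split(b"\n") if line.strip()]
    return header, records
```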
Example:
{"c":"Sg96RWfbFLN6_d3gsG65IJjJVgM9Jw9yFbxSzaYrxv8","did":"8rc37rrax970579qkvra65cke4","p":"za_9iYEZLZtTmdv0MMaV_IwD_tF17Sx-imb31KhkgUY","s":4134,"t":1768137121,"v":1}
{"_id":"47855a70de36449f821d40b45f8c170a","_type":"category","_v":1,"title":"Telekommunikation"}
Device ID
The Device ID did is a unique identifier per installation (device or user). It is typically set in the first transaction of a client and remains the same even when different Client IDs are used. The same did can be shared by multiple clients, but each client always has only one did.
Transaction Chaining
Using the optional p (previous) field, each transaction can be linked to its predecessor. The value is the SHA256 hash of the complete previous file (header + content).
┌────────────────────┐ ┌──────────────┐ ┌──────────────┐
│ 0.dat │◄────│ 1.dat │◄────│ 2.dat │
│ p: hash(info.json) │ p │ p: hash(0) │ p │ p: hash(1) │
└────────────────────┘ └──────────────┘ └──────────────┘
Guarantees:
- Integrity: Tampering changes the hash → broken chain detectable
- Completeness: Missing transactions break the chain
- Order: Chronology is cryptographically secured
Verification: Verify from the last transaction backwards that each p value matches the computed hash of the previous file.
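
The backward check might look like the sketch below. It assumes that p uses the same encoding as c (URL-safe Base64 without padding), which matches the look of the example header but is not stated explicitly. Following the diagram, the first transaction is expected to point at the hash of info.json. Function names are hypothetical, and a missing p is treated as a broken chain here even though the field is optional.

```python
import base64
import hashlib
import json
from typing import List, Optional


def file_hash(raw: bytes) -> str:
    """SHA256 over a complete file (header + content), encoded like c (assumption)."""
    digest = hashlib.sha256(raw).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def read_p(raw: bytes) -> Optional[str]:
    """Return the p field from the header line, or None if it is absent."""
    header = json.loads(raw.split(b"\n", 1)[0])
    return header.get("p")


def verify_chain(info_json: bytes, transactions: List[bytes]) -> bool:
    """transactions[i] holds the raw bytes of <i>.dat for a single client."""
    for idx in range(len(transactions) - 1, -1, -1):
        predecessor = transactions[idx - 1] if idx > 0 else info_json
        # A missing p yields None, which never equals a hash string.
        if read_p(transactions[idx]) != file_hash(predecessor):
            return False
    return True
```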
File Attachments (Assets)
Binary files are stored separately as assets, following the same per-client sequential indexing as transactions.
Asset URL format: asset:///<clientId>/<index>/<filename>?s=<size>&t=<mimeType>&d=<checksum>
| Parameter | Description |
|---|---|
| clientId | Client ID |
| index | Sequential index |
| filename | Original filename |
| s | Size in bytes |
| t | MIME type |
| d | SHA256 checksum (Base64) |
Example: asset:///1EH7BEtuL9xOz5aTpEyI4K/466/invoice.pdf?s=6284&t=application%2Fpdf&d=JAd0HmXcSIVVdYMmDBjfVZeTvAyXQ94GmjA6CwSwOYU
Metadata is included in the URL so integrity can be checked when loading the asset.
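
As an illustration, the URL can be taken apart with Python's standard urllib.parse and the result compared against the loaded bytes. The AssetRef type and function names are hypothetical, and the checksum comparison assumes the d parameter uses unpadded URL-safe Base64 like the c header field.

```python
import base64
import hashlib
from dataclasses import dataclass
from urllib.parse import parse_qs, unquote, urlsplit


@dataclass
class AssetRef:
    client_id: str
    index: int
    filename: str
    size: int
    mime_type: str
    checksum: str


def parse_asset_url(url: str) -> AssetRef:
    """Split asset:///<clientId>/<index>/<filename>?s=..&t=..&d=.. into fields."""
    parts = urlsplit(url)
    if parts.scheme != "asset":
        raise ValueError("not an asset URL")
    client_id, index, filename = parts.path.lstrip("/").split("/", 2)
    query = parse_qs(parts.query)  # also percent-decodes the values
    return AssetRef(client_id, int(index), unquote(filename),
                    int(query["s"][0]), query["t"][0], query["d"][0])


def asset_matches(ref: AssetRef, data: bytes) -> bool:
    """Check size and SHA256 of loaded bytes against the URL metadata."""
    digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest())
    return len(data) == ref.size and digest.rstrip(b"=").decode("ascii") == ref.checksum
```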
Data Types
| Type | Format | Example |
|---|---|---|
| Reference | _id of the target object | "category": "abc123" |
| Date | Integer YYYYMMDD | 20241201 (Dec 1, 2024) |
| Amount | Floating point number | 1.23 |
| Timestamp | Unix seconds | 1732311704 |
Some fields have special behavior to be aware of:
- The document type is stored in doctype. Currently there is only one document type with the ID d0c5d0c5d0c5d0c5d0c5d0c5d0c5d0c5. If this is not set, the credit value determines whether the record is an income (true) or an expense (false).
- Contact (contact) and category (category) are referenced by their ID.
- Tags in tags are stored as a dictionary where the keys are the reference/ID and the values indicate whether the tag is set (truthy) or removed (falsy).
- The tax total is stored in tax. The tax details are stored in taxDetails. This is also a dictionary where the keys represent the tax rate as a string (e.g. "19.0") and the values represent the amount.
- The confirmation (which in revision-safety mode also makes the record read-only) is stored in confirmed.
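
A few helpers show how these conventions might be decoded on the reading side. The function names are hypothetical. Returning None from is_income when doctype is set is one reading of the rule above, since the text only defines the credit flag for records without a document type.

```python
from datetime import date
from typing import Dict, List, Optional


def parse_date(value: int) -> date:
    """Decode an integer date in YYYYMMDD form, e.g. 20241201."""
    return date(value // 10000, value // 100 % 100, value % 100)


def active_tags(record: dict) -> List[str]:
    """Tag IDs whose value is truthy; falsy entries mean the tag was removed."""
    return [tag_id for tag_id, is_set in record.get("tags", {}).items() if is_set]


def tax_by_rate(record: dict) -> Dict[float, float]:
    """Convert taxDetails keys such as "19.0" into numeric rates."""
    return {float(rate): amount for rate, amount in record.get("taxDetails", {}).items()}


def is_income(record: dict) -> Optional[bool]:
    """True for income, False for expense, None when doctype is set."""
    if record.get("doctype"):
        return None
    return bool(record.get("credit"))
```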
Encryption (optional)
For sensitive data, transactions and assets can be fully encrypted (header + content). Encryption is configured workspace-wide in info.json — a workspace is either fully encrypted or not encrypted at all.
Configuration in info.json
{
"apiVersion": 3,
"workspaceType": "receipts2",
"workspaceId": "...",
"createDate": 1732311704,
"encryption": {
"algorithm": "aes-256-gcm",
"kdf": "pbkdf2",
"kdfHash": "sha256",
"kdfIterations": 100000,
"salt": "base64-random-salt-16bytes",
"verify": "base64-encrypted-test-string"
}
}
| Field | Description |
|---|---|
| algorithm | Encryption algorithm: aes-256-gcm |
| kdf | Key derivation function: pbkdf2 |
| kdfHash | Hash function for PBKDF2: sha256 |
| kdfIterations | Number of iterations (min. 100,000 recommended) |
| salt | Random salt (Base64, min. 16 bytes) |
| verify | Encrypted test string for password verification |
Key Derivation
Password + Salt → PBKDF2-SHA256 (kdfIterations) → Workspace Key (256 bit)
Password Verification
The verify field contains the string receipts2 encrypted with the workspace key:
verify = [IV (12B)][Ciphertext + AuthTag (16B)]
Check: Decrypting verify → Result = receipts2? → Password correct ✓
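
Key derivation and the password check could be sketched as follows. PBKDF2 is available in Python's hashlib; AES-GCM is not part of the standard library, so the decryption primitive is passed in as a callable (with the cryptography package, AESGCM(key).decrypt(iv, encrypted, None) has this shape). Assumptions in this sketch: the password is encoded as UTF-8, and Base64 values may use either alphabet with or without padding. The function names are hypothetical.

```python
import base64
import hashlib
from typing import Callable

# (key, iv, ciphertext_with_tag) -> plaintext; raises on a failed tag check.
AesGcmDecrypt = Callable[[bytes, bytes, bytes], bytes]


def b64decode_any(text: str) -> bytes:
    """Decode Base64 in either alphabet, padded or not (assumption)."""
    normalized = text.replace("-", "+").replace("_", "/")
    return base64.b64decode(normalized + "=" * (-len(normalized) % 4))


def derive_workspace_key(password: str, encryption: dict) -> bytes:
    """Password + salt -> PBKDF2-SHA256 -> 256-bit workspace key."""
    if encryption["kdf"] != "pbkdf2" or encryption["kdfHash"] != "sha256":
        raise ValueError("unsupported key derivation settings")
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),  # assumption: UTF-8 password bytes
        b64decode_any(encryption["salt"]),
        encryption["kdfIterations"],
        dklen=32,  # 32 bytes = 256 bit
    )


def password_is_correct(key: bytes, verify: str, aes_gcm_decrypt: AesGcmDecrypt) -> bool:
    """Decrypt the verify value and compare it with the string receipts2."""
    blob = b64decode_any(verify)
    iv, encrypted = blob[:12], blob[12:]
    try:
        return aes_gcm_decrypt(key, iv, encrypted) == b"receipts2"
    except Exception:
        return False  # a wrong key makes the AES-GCM tag check fail
```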
File Layout (encrypted)
Transactions and assets are stored fully encrypted:
[IV (12B)][Ciphertext + AuthTag (16B)]
Ciphertext = AES-GCM(Header + "\n" + Content)
Encryption Flow (AES-256-GCM)
Encrypting:
- Create the normal transaction file (Header + Content)
- Generate random IV (12 bytes)
- Encrypt: encrypted = AES-GCM(key, iv, plaintext)
- Store: [IV (12B)][encrypted] (contains Ciphertext + AuthTag)
Decrypting:
- Extract IV (first 12 bytes)
- Extract encrypted (rest, contains Ciphertext + AuthTag)
- Decrypt: plaintext = AES-GCM-Decrypt(key, iv, encrypted)
- Parse Header (first line) and Content (rest)
- Verify checksum c against Content
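
Both flows can be sketched around an injected AES-GCM primitive. The callables are assumptions of this sketch, since Python's standard library has no AES-GCM; any implementation that appends the 16-byte tag to the ciphertext fits, for example AESGCM from the cryptography package. Function names are hypothetical.

```python
import base64
import hashlib
import json
import os
from typing import Callable, Tuple

# (key, iv, data) -> bytes. Encrypt returns ciphertext + 16-byte tag;
# decrypt takes ciphertext + tag and raises if the tag check fails.
AesGcm = Callable[[bytes, bytes, bytes], bytes]

IV_LEN = 12
TAG_LEN = 16


def content_checksum(content: bytes) -> str:
    """SHA256 as URL-safe Base64 without padding (the c header field)."""
    return base64.urlsafe_b64encode(hashlib.sha256(content).digest()).rstrip(b"=").decode("ascii")


def encrypt_transaction(plaintext: bytes, key: bytes, aes_gcm_encrypt: AesGcm) -> bytes:
    """plaintext is the normal file: header line + newline + content."""
    iv = os.urandom(IV_LEN)  # fresh random IV for every file
    return iv + aes_gcm_encrypt(key, iv, plaintext)


def decrypt_transaction(blob: bytes, key: bytes, aes_gcm_decrypt: AesGcm) -> Tuple[dict, bytes]:
    """Return (header, content) after decrypting and checking c."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("file too short to hold IV and AuthTag")
    iv, encrypted = blob[:IV_LEN], blob[IV_LEN:]
    plaintext = aes_gcm_decrypt(key, iv, encrypted)
    header_line, _, content = plaintext.partition(b"\n")
    header = json.loads(header_line)
    if header["c"] != content_checksum(content):
        raise ValueError("checksum c does not match content")
    return header, content
```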
Security Notes
| Aspect | Recommendation |
|---|---|
| Algorithm | AES-256-GCM (authenticated encryption) |
| IV | Never reuse IVs, always generate randomly (12 bytes) |
| Key Derivation | PBKDF2 with at least 100,000 iterations |
| Salt | Generate once per workspace, store in info.json |
| Password Verification | AES-GCM AuthTag fails on wrong password |
Limitations
When encryption is enabled:
- Sync requires password: Without the password no metadata is readable
- No chain verification without decryption: p is encrypted
- Filename remains visible: Only content is protected
Note
With encryption enabled, the entire transaction is unreadable without the correct password. Integrity verification via c is performed after decryption.