Technical Documentation

This documentation describes the technical architecture of the data format for use outside the Receipts Space app. The design follows the principles of “Local First” and “File Over App”.

Highlights

  • Offline-First: Fully usable without a server
  • Conflict-Free: CRDT-based synchronization across arbitrary storage media
  • Tamper-Resistant: Chained transactions with optional signing
  • Durable: Simple, documented format (JSON/JSONL)
  • Portable: Data belongs to the user, not the app

Example code on GitHub

Directory Structure

A workspace is organized as follows:

workspace/
├── info.json                                    → Metadata
├── transactions/<clientId>/<...>/<idx>.dat      → Change log
└── assets/<clientId>/<...>/<idx>.dat            → File attachments

Workspace Metadata (info.json)

Field          Description
apiVersion     Format version, currently 3
workspaceType  Always receipts2 for Receipts Space
workspaceId    Unique workspace ID
createDate     Creation time (Unix timestamp in seconds)
encryption     (optional) Encryption configuration → Encryption

Transactions

Changes are stored in an append-only log. Each client maintains its own log identified by a unique clientId. Entries are numbered sequentially (0.dat, 1.dat, …) and, to avoid filesystem conflicts, are split into subdirectories using a distribution algorithm (max. 1000 files per folder).

Conflict-Free Synchronization (CRDT)

Each database record contains:

Field  Description
_id    Unique ID of the record
_type  Data type (e.g., category, receipt)
_v     Version counter (Lamport clock)

Synchronization uses Last-Write-Wins (LWW): an incoming record is adopted only if its _v is greater than the _v of the stored record. If both values are equal, the timestamp decides. Because the outcome depends only on these values, the order in which transactions are applied is irrelevant (CRDT).
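The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the app's implementation: the in-memory store, the function name apply_record, and the choice to pass the timestamp alongside each record are assumptions. The source does not say which timestamp breaks ties, so the caller supplies it.

```python
from typing import Dict, Tuple

# Hypothetical in-memory store: _id -> (record, timestamp)
Store = Dict[str, Tuple[dict, int]]


def apply_record(store: Store, record: dict, timestamp: int) -> bool:
    """Apply one incoming record under Last-Write-Wins.

    Returns True if the record was adopted, False if the stored one wins.
    """
    current = store.get(record["_id"])
    if current is not None:
        cur_record, cur_ts = current
        # Compare (_v, timestamp) as a pair: _v decides first,
        # the timestamp only breaks ties between equal versions.
        if (record["_v"], timestamp) <= (cur_record["_v"], cur_ts):
            return False
    store[record["_id"]] = (record, timestamp)
    return True
```

Applying the same set of records in any order yields the same store, as long as no two different records share both _v and timestamp. That remaining case is not covered by the description above.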

File Format

Each transaction file consists of two parts separated by a newline:

  1. Header (first line): JSON object with metadata
  2. Content (rest): JSONL (newline-delimited JSON) formatted changes (UTF-8)

Header fields:

Field  Description
s      Content size in bytes
c      SHA256 checksum of the content (URL-safe Base64, no padding)
t      Creation time (Unix timestamp in seconds)
v      Format version, currently 1
p      (optional) Hash of the previous transaction → Chaining
did    (optional) Unique Device ID of the installation

Example:

{"c":"Sg96RWfbFLN6_d3gsG65IJjJVgM9Jw9yFbxSzaYrxv8","did":"8rc37rrax970579qkvra65cke4","p":"za_9iYEZLZtTmdv0MMaV_IwD_tF17Sx-imb31KhkgUY","s":4134,"t":1768137121,"v":1}
{"_id":"47855a70de36449f821d40b45f8c170a","_type":"category","_v":1,"title":"Telekommunikation"}

Device ID

The Device ID did is a unique identifier per installation (device or user). It is typically set in the first transaction of a client and remains the same even when different Client IDs are used. The same did can be shared by multiple clients, but each client always has only one did.

Transaction Chaining

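Using the optional p (previous) field, each transaction can be linked to its predecessor. The value is the SHA256 hash of the complete previous file (header + content). As the diagram shows, the first transaction (0.dat) has no predecessor and links to info.json instead.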

┌────────────────────┐     ┌──────────────┐     ┌──────────────┐
│       0.dat        │◄────│    1.dat     │◄────│    2.dat     │
│ p: hash(info.json) │  p  │  p: hash(0)  │  p  │  p: hash(1)  │
└────────────────────┘     └──────────────┘     └──────────────┘

Guarantees:

  • Integrity: Tampering changes the hash → broken chain detectable
  • Completeness: Missing transactions break the chain
  • Order: Chronology is cryptographically secured

Verification: Verify from the last transaction backwards that each p value matches the computed hash of the previous file.
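
The backward check can be sketched as follows. Two assumptions are made here: p uses the same encoding as c (URL-safe Base64 without padding, which matches the shape of the example value), and a file without p counts as a broken link. Since p is optional, a real verifier may want to treat that case differently. The function names are illustrative.

```python
import base64
import hashlib
import json
from typing import List, Optional


def file_hash(raw: bytes) -> str:
    """SHA256 of a complete file as URL-safe Base64 without padding."""
    digest = hashlib.sha256(raw).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def first_broken_link(info_json: bytes, files: List[bytes]) -> Optional[int]:
    """Walk the chain from the last transaction backwards.

    files[i] holds the raw bytes of <i>.dat for one client. Returns the
    highest index whose p does not match its predecessor, or None if the
    whole chain is intact.
    """
    for idx in range(len(files) - 1, -1, -1):
        header = json.loads(files[idx].split(b"\n", 1)[0])
        # 0.dat links to info.json; every other file links to idx - 1.
        previous = files[idx - 1] if idx > 0 else info_json
        if header.get("p") != file_hash(previous):
            return idx
    return None
```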

File Attachments (Assets)

Binary files are stored separately as assets, following the same per-client sequential indexing as transactions.

Asset URL format: asset:///<clientId>/<index>/<filename>?s=<size>&t=<mimeType>&d=<checksum>

Parameter  Description
clientId   Client ID
index      Sequential index
filename   Original filename
s          Size in bytes
t          MIME type
d          SHA256 checksum (Base64)

Example: asset:///1EH7BEtuL9xOz5aTpEyI4K/466/invoice.pdf?s=6284&t=application%2Fpdf&d=JAd0HmXcSIVVdYMmDBjfVZeTvAyXQ94GmjA6CwSwOYU

Metadata is included in the URL so integrity can be checked when loading the asset.
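
As an illustration, an asset URL can be taken apart with Python's standard urllib.parse module. The function name and the keys of the returned dictionary are hypothetical and chosen only for this sketch.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def parse_asset_url(url: str) -> dict:
    """Split an asset:/// URL into its path components and metadata."""
    parts = urlsplit(url)
    if parts.scheme != "asset":
        raise ValueError("not an asset URL")
    # Path has the form /<clientId>/<index>/<filename>
    client_id, index, filename = parts.path.lstrip("/").split("/", 2)
    query = parse_qs(parts.query)  # percent-decodes values, e.g. %2F -> /
    return {
        "client_id": client_id,
        "index": int(index),
        "filename": unquote(filename),
        "size": int(query["s"][0]),
        "mime_type": query["t"][0],
        "checksum": query["d"][0],
    }
```

After loading the asset bytes, the size and checksum values can be compared against the file to detect truncation or tampering.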

Data Types

Type       Format                    Example
Reference  _id of the target object  "category": "abc123"
Date       Integer YYYYMMDD          20241201 (Dec 1, 2024)
Amount     Floating point number     1.23
Timestamp  Unix seconds              1732311704

Some fields have special behavior to be aware of:

  • The document type is stored in doctype. Currently there is only one document type with the ID d0c5d0c5d0c5d0c5d0c5d0c5d0c5d0c5. If this is not set, the credit value determines whether the record is an income (true) or an expense (false).
  • Contact (contact) and category (category) are referenced by their ID.
  • Tags in tags are stored as a dictionary where the keys are the reference/ID and the values indicate whether the tag is set (truthy) or removed (falsy).
  • The tax total is stored in tax. The tax details are stored in taxDetails. This is also a dictionary where the keys represent the tax rate as a string (e.g. "19.0") and the values represent the amount.
  • The confirmation (which in revision-safety mode also makes the record read-only) is stored in confirmed.
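
Two of these conventions are easy to get wrong when reading records, so here is a small sketch for the YYYYMMDD date integer and the tags dictionary. Both helper names are hypothetical.

```python
from datetime import date
from typing import List


def parse_date(value: int) -> date:
    """Convert an integer in YYYYMMDD form into a date."""
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


def active_tags(record: dict) -> List[str]:
    """Return the IDs of all tags whose value is truthy (tag is set)."""
    tags = record.get("tags", {})
    return sorted(tag_id for tag_id, is_set in tags.items() if is_set)
```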

Encryption (optional)

For sensitive data, transactions and assets can be fully encrypted (header + content). Encryption is configured workspace-wide in info.json — a workspace is either fully encrypted or not encrypted at all.

Configuration in info.json

{
  "apiVersion": 3,
  "workspaceType": "receipts2",
  "workspaceId": "...",
  "createDate": 1732311704,
  "encryption": {
    "algorithm": "aes-256-gcm",
    "kdf": "pbkdf2",
    "kdfHash": "sha256",
    "kdfIterations": 100000,
    "salt": "base64-random-salt-16bytes",
    "verify": "base64-encrypted-test-string"
  }
}

Field          Description
algorithm      Encryption algorithm: aes-256-gcm
kdf            Key derivation function: pbkdf2
kdfHash        Hash function for PBKDF2: sha256
kdfIterations  Number of iterations (min. 100,000 recommended)
salt           Random salt (Base64, min. 16 bytes)
verify         Encrypted test string for password verification

Key Derivation

Password + Salt → PBKDF2-SHA256 (kdfIterations) → Workspace Key (256 bit)
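
With Python's standard library, the derivation could look like the sketch below. It assumes the password is encoded as UTF-8 and the salt is standard Base64; neither detail is stated above. The function name is illustrative.

```python
import base64
import hashlib


def derive_workspace_key(password: str, encryption: dict) -> bytes:
    """Derive the 256-bit workspace key from the info.json encryption block."""
    if encryption["kdf"] != "pbkdf2" or encryption["kdfHash"] != "sha256":
        raise ValueError("unsupported key derivation settings")
    salt = base64.b64decode(encryption["salt"])  # assumption: standard Base64
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),  # assumption: UTF-8 password encoding
        salt,
        encryption["kdfIterations"],
        dklen=32,  # 32 bytes = 256 bit
    )
```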

Password Verification

The verify field contains the string receipts2 encrypted with the workspace key:

verify = [IV (12B)][Ciphertext + AuthTag (16B)]

Check: Decrypting verify → Result = receipts2? → Password correct ✓

File Layout (encrypted)

Transactions and assets are stored fully encrypted:

[IV (12B)][Ciphertext + AuthTag (16B)]

Ciphertext = AES-GCM(Header + "\n" + Content)

Encryption Flow (AES-256-GCM)

Encrypting:

  1. Create the normal transaction file (Header + Content)
  2. Generate random IV (12 bytes)
  3. Encrypt: encrypted = AES-GCM(key, iv, plaintext)
  4. Store: [IV (12B)][encrypted] (contains Ciphertext + AuthTag)

Decrypting:

  1. Extract IV (first 12 bytes)
  2. Extract encrypted (rest, contains Ciphertext + AuthTag)
  3. Decrypt: plaintext = AES-GCM-Decrypt(key, iv, encrypted)
  4. Parse Header (first line) and Content (rest)
  5. Verify checksum c against Content
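
The decryption steps can be sketched as follows. To keep the example free of third-party dependencies, the AES-GCM step itself is passed in as a callable (aead_decrypt, a hypothetical parameter); everything else uses the standard library. The sketch also assumes the content is every byte after the first newline of the plaintext.

```python
import base64
import hashlib
import json
from typing import Callable, Tuple

IV_LEN = 12   # bytes, as in the file layout
TAG_LEN = 16  # bytes, appended to the ciphertext


def split_encrypted(blob: bytes) -> Tuple[bytes, bytes]:
    """Steps 1 and 2: separate the IV from ciphertext + AuthTag."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("file too short to hold IV and AuthTag")
    return blob[:IV_LEN], blob[IV_LEN:]


def open_transaction(
    blob: bytes, aead_decrypt: Callable[[bytes, bytes], bytes]
) -> Tuple[dict, bytes]:
    """Steps 3 to 5: decrypt, split header from content, verify c."""
    iv, encrypted = split_encrypted(blob)
    plaintext = aead_decrypt(iv, encrypted)  # caller-supplied AES-GCM step
    header_line, _, content = plaintext.partition(b"\n")
    header = json.loads(header_line)
    digest = hashlib.sha256(content).digest()
    checksum = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    if header["c"] != checksum:
        raise ValueError("checksum c does not match decrypted content")
    return header, content
```

With the third-party cryptography package, for example, aead_decrypt could be lambda iv, data: AESGCM(key).decrypt(iv, data, None). That call raises an exception if the AuthTag does not verify. Passing None assumes that no additional authenticated data is used, which the description above does not specify.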

Security Notes

Aspect                 Recommendation
Algorithm              AES-256-GCM (authenticated encryption)
IV                     Never reuse IVs, always generate randomly (12 bytes)
Key Derivation         PBKDF2 with at least 100,000 iterations
Salt                   Generate once per workspace, store in info.json
Password Verification  AES-GCM AuthTag fails on wrong password

Limitations

When encryption is enabled:

  • Sync requires password: Without the password no metadata is readable
  • No chain verification without decryption: p is encrypted
  • Filename remains visible: Only content is protected

Note

With encryption enabled, the entire transaction is unreadable without the correct password. Integrity verification via c is performed after decryption.
