vsync v0.11.0 — Spec

Status: design · scope = test-vector format for cross-language conformance · purely additive — no wire-format change, no CLI change.

One theme: three runtime libraries in three languages must produce identical bytes. Python, TypeScript, and Go each ship a vsync runtime (the lib API surface is v0.12; the matching CLI surface is v0.10). Cross-language byte compatibility of the RQE1 / RQEM0001 / VSYNC_CONFIG envelopes is the load-bearing guarantee of the whole project — if one lib decrypts a blob the other two can't, the product is broken. This spec defines the shared test-vector format that all three libs (plus the Bun CLI itself) run their conformance suites against.

For envelope wire details, see v0.2-secret-lib.md (RQE1, RQEM0001). For audit-log format, see v0.4-audit-log.md. For the runtime API the vectors validate, see v0.12. For the CLI subcommand that ultimately generates vectors, see v0.10.

This spec defines the format of a vector. The vectors themselves, and the generator that produces them, are out of scope (§9, §10).


1. Location

Test vectors live at docs/specs/test-vectors/ in this repo. They're co-located with the spec docs because they are part of the spec. The bytes on disk are the normative definition of "what a conforming impl must produce / accept" — prose in the wire spec is the description; the vector is the fixture.

```text
docs/specs/
  v0.2-secret-lib.md
  v0.4-audit-log.md
  v0.9-…
  v0.10-…
  v0.11-conformance-test-vectors.md      ← this file
  test-vectors/
    rqe1-decrypt/
      hello-world.bin
      hello-world.json
      empty-plaintext.bin
      empty-plaintext.json
    rqe1-decrypt-error/
      wrong-passphrase.bin
      wrong-passphrase.json
      corrupt-magic.bin
      corrupt-magic.json
      truncated-ciphertext.bin
      truncated-ciphertext.json
    rqem0001-manifest/
    config-blob/
    fallback-chain/
    asset-path/
    error-taxonomy/
```

One subdirectory per category (§4). Vectors are not nested deeper than one level under test-vectors/.


2. File pairs

Each vector is a pair: a .bin carrying raw bytes and a .json carrying metadata. The pairing is by basename within a category directory.

```text
rqe1-decrypt/hello-world.bin     ← the raw bytes a lib must operate on
rqe1-decrypt/hello-world.json    ← the inputs + expected outputs
```

  • .bin is the byte payload exactly as it would appear on the wire (RQE1 envelope, RQEM0001 manifest, VSYNC_CONFIG blob, etc.). No base64, no hex — raw bytes.
  • .json is UTF-8, LF, 2-space indent, trailing newline. No comments.
  • A .bin with no matching .json (or vice versa) is a lint error in CI.
  • Some categories (fallback-chain, error-taxonomy for purely-API negative cases) don't need bytes; there the .bin is omitted and the loader (§7) treats the vector as JSON-only. The .json field inputs.bin is null in that case.
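
The pairing rule above could be linted with something like the following. This is a minimal sketch, not the project's CI check: the function name `find_unpaired` and the `JSON_ONLY_OK` set are hypothetical, and it assumes that only the two categories named in the last bullet may omit the .bin.

```python
from pathlib import Path

# Assumption: only these categories may ship JSON-only vectors (see the last bullet).
JSON_ONLY_OK = {"fallback-chain", "error-taxonomy"}


def find_unpaired(root: Path) -> list[str]:
    """Return relative paths of vector files that lack their sibling."""
    problems = []
    for category_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        bins = {p.stem for p in category_dir.glob("*.bin")}
        jsons = {p.stem for p in category_dir.glob("*.json")}
        # A .bin without metadata is always an error.
        for stem in sorted(bins - jsons):
            problems.append(f"{category_dir.name}/{stem}.bin")
        # A .json without bytes is tolerated only in the JSON-only categories.
        if category_dir.name not in JSON_ONLY_OK:
            for stem in sorted(jsons - bins):
                problems.append(f"{category_dir.name}/{stem}.json")
    return problems
```

A CI step would call `find_unpaired(Path("docs/specs/test-vectors"))` and fail when the returned list is non-empty.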

3. Metadata JSON schema

The .json is a single object. The schema is small on purpose; growth happens by adding categories, not by adding free-form fields.

Positive case:

```json
{
  "category": "rqe1-decrypt",
  "description": "Decrypt a passphrase-derived RQE1 envelope; expect 'hello world' UTF-8",
  "inputs": {
    "passphrase": "correct horse battery staple"
  },
  "expected": {
    "plaintext_hex": "68656c6c6f20776f726c64",
    "plaintext_utf8": "hello world",
    "error": null
  },
  "generated_by": "vsync@<commit-sha>",
  "spec_version": "v0.12"
}
```

Negative case:

```json
{
  "category": "rqe1-decrypt-error",
  "description": "Wrong passphrase yields WrongPassphraseError",
  "inputs": {
    "passphrase": "wrong-passphrase"
  },
  "expected": {
    "plaintext_hex": null,
    "error": "WrongPassphraseError"
  },
  "generated_by": "vsync@<commit-sha>",
  "spec_version": "v0.12"
}
```

Required top-level fields on every vector:

| Field | Type | Notes |
| --- | --- | --- |
| category | string | One of the names in §4. Must match the parent directory. |
| description | string | One-line human summary. Shown in failure output. |
| inputs | object | Category-specific; see §4. |
| expected | object | Category-specific. Always contains error: string \| null. |
| generated_by | string | vsync@<sha> of the generator commit. (§6) |
| spec_version | string | The wire-spec tag this vector conforms to, e.g. v0.12, v0.13. (§8) |

expected.error is a string class name from the v0.12 error taxonomy when the case is negative, and null when positive. Mixing the two (a non-null error together with a non-null plaintext_hex) is a lint error.

Hex encoding (plaintext_hex) is lowercase, unspaced, even-length. Binary plaintexts that aren't valid UTF-8 omit plaintext_utf8 (the loader compares on plaintext_hex only).
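
The rules in this section are small enough to check mechanically. The sketch below is one possible lint, assuming the metadata has already been parsed into a dict; `lint_vector`, `REQUIRED_FIELDS`, and `HEX_RE` are hypothetical names, not part of any vsync lib.

```python
import re

# Required top-level fields and their JSON types, per the table above.
REQUIRED_FIELDS = {
    "category": str,
    "description": str,
    "inputs": dict,
    "expected": dict,
    "generated_by": str,
    "spec_version": str,
}
# Lowercase, unspaced, even-length hex; the empty string is allowed.
HEX_RE = re.compile(r"(?:[0-9a-f]{2})*")


def lint_vector(meta: dict, parent_dir: str) -> list[str]:
    """Return a list of human-readable problems; empty means the vector is clean."""
    problems = [
        f"{field}: missing or not a {typ.__name__}"
        for field, typ in REQUIRED_FIELDS.items()
        if not isinstance(meta.get(field), typ)
    ]
    if problems:
        return problems
    if meta["category"] != parent_dir:
        problems.append("category does not match parent directory")
    expected = meta["expected"]
    if "error" not in expected:
        problems.append("expected.error is required (string or null)")
    error = expected.get("error")
    if error is not None and not isinstance(error, str):
        problems.append("expected.error must be a string or null")
    plaintext_hex = expected.get("plaintext_hex")
    if error is not None and plaintext_hex is not None:
        problems.append("non-null error mixed with non-null plaintext_hex")
    if plaintext_hex is not None and not (
        isinstance(plaintext_hex, str) and HEX_RE.fullmatch(plaintext_hex)
    ):
        problems.append("plaintext_hex must be lowercase, unspaced, even-length hex")
    return problems
```

Returning a list rather than raising on the first problem lets a lint run report every issue in a vector at once.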


4. Categories

Each lib's CI runs the full corpus. No category is optional. A new category requires a spec PR; a new vector inside an existing category does not.

| Category | What it validates | inputs keys | expected keys (in addition to error) |
| --- | --- | --- | --- |
| rqe1-decrypt | Positive RQE1 decryption | passphrase | plaintext_hex, optionally plaintext_utf8 |
| rqe1-decrypt-error | Negative RQE1: wrong passphrase, corrupt magic, truncated ciphertext, bad GCM tag | passphrase | — (only error) |
| rqem0001-manifest | Pointer-seal positive + negative (embedded vs remote ts mismatch, wrong magic) | passphrase, remote_ts | embedded_ts, payload_hex (positive) |
| config-blob | VSYNC_CONFIG decode (magic-prefix + base64 + gzip → JSON); positive + negative (wrong magic, malformed gzip, unknown v:) | — (bytes carry it all) | config_json (positive) |
| fallback-chain | Given an in-memory vault state + simulated env, assert get() / source() / has() for a list of keys | vault (object), env (object), queries (array of keys) | results (array of {key, value, source, has}) |
| asset-path | assetPath() materialization: file exists, perms 0600, content matches assetBytes(), cleaned up on close() | key, vault | bytes_hex, mode_octal ("0600") |
| error-taxonomy | Class identity across libs: same wrong-passphrase blob raises WrongPassphraseError, not a generic error | varies by case | — (only error) |

For fallback-chain and error-taxonomy the JSON may stand alone (.bin absent — see §2).
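
The per-category key requirements in the table can be transcribed into a lookup that a lint or loader consults. This is an illustrative sketch: `CATEGORY_KEYS` and `missing_keys` are hypothetical names, and it assumes the expected keys marked "(positive)" are required only when expected.error is null.

```python
# category -> (required inputs keys, expected keys required on positive cases),
# transcribed from the table above. "varies by case" is modelled as no requirement.
CATEGORY_KEYS = {
    "rqe1-decrypt": ({"passphrase"}, {"plaintext_hex"}),
    "rqe1-decrypt-error": ({"passphrase"}, set()),
    "rqem0001-manifest": ({"passphrase", "remote_ts"}, {"embedded_ts", "payload_hex"}),
    "config-blob": (set(), {"config_json"}),
    "fallback-chain": ({"vault", "env", "queries"}, {"results"}),
    "asset-path": ({"key", "vault"}, {"bytes_hex", "mode_octal"}),
    "error-taxonomy": (set(), set()),
}


def missing_keys(meta: dict) -> list[str]:
    """List the keys a vector lacks for its category, as dotted paths."""
    category = meta["category"]
    if category not in CATEGORY_KEYS:
        return [f"unknown category: {category}"]
    inputs_keys, expected_keys = CATEGORY_KEYS[category]
    missing = [f"inputs.{k}" for k in sorted(inputs_keys - set(meta["inputs"]))]
    # Output keys are only meaningful when the case is positive.
    if meta["expected"].get("error") is None:
        missing += [f"expected.{k}" for k in sorted(expected_keys - set(meta["expected"]))]
    return missing
```

Because a new category requires a spec PR, a table like this would change only alongside this document.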


5. Negative-case discipline

The whole point of the corpus is that a generic try / except Exception pass is insufficient. Libs must distinguish errors by class.

  • expected.error is the canonical class name from v0.12's error taxonomy (WrongPassphraseError, CorruptEnvelopeError, TruncatedEnvelopeError, ManifestTimestampMismatchError, UnknownConfigVersionError, …).
  • A lib passes a negative vector iff it raises an error whose class name matches expected.error:
    • Python: isinstance(e, vsync.WrongPassphraseError)
    • TypeScript: e instanceof WrongPassphraseError (or e.name === "WrongPassphraseError" on a discriminated union)
    • Go: errors.Is(err, vsync.ErrWrongPassphrase) against sentinel errors

A lib that catches Exception / Error / error and reports "the vector failed correctly" is not conforming. The conformance loader (§7) enforces class identity, not just truthiness.

If a lib's idiomatic error type is named differently (Go's ErrWrongPassphrase vs Python's WrongPassphraseError), the lib's loader maps the canonical name to the local sentinel — the mapping table is per-lib code, the canonical name in the vector is fixed.
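
As an illustration of the class-identity rule and the per-lib mapping table, a Python loader might look roughly like this. The exception classes are declared locally as stand-ins (the real ones would come from the vsync package), and `ERROR_CLASSES` and `assert_raises_canonical` are hypothetical names.

```python
class VsyncError(Exception):
    """Stand-in base class; the real lib's hierarchy may differ."""


class WrongPassphraseError(VsyncError):
    pass


class CorruptEnvelopeError(VsyncError):
    pass


# Canonical name in the vector -> local class. This table is per-lib code.
ERROR_CLASSES = {
    "WrongPassphraseError": WrongPassphraseError,
    "CorruptEnvelopeError": CorruptEnvelopeError,
}


def assert_raises_canonical(expected_error: str, operation) -> None:
    """Pass only if operation() raises the class named by expected_error."""
    expected_cls = ERROR_CLASSES[expected_error]
    try:
        operation()
    except expected_cls:
        return  # the right class (or a subclass of it) was raised
    except Exception as exc:
        # Some other error: failing "for any reason" is not conformance.
        raise AssertionError(
            f"expected {expected_error}, got {type(exc).__name__}"
        ) from exc
    raise AssertionError(f"expected {expected_error}, but no error was raised")
```

The ordering of the except clauses matters: the specific class is matched first, so the generic branch only ever sees errors of the wrong class.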


6. Generation responsibility (during v0.x)

  • Until Python catches up to the spec, the Bun CLI is the canonical generator. It's the only impl that passes its own conformance suite today, so it's the only impl entitled to mint new vectors.
  • Once Python stabilises and passes the full corpus, Python takes over as canonical generator (decision K). The reference impl and the generator should be the same codebase — that's how you keep "the spec is what the reference impl does" honest.
  • Vector generation tooling is a separate work item — this spec defines the format, not the generator. The future CLI surface is vsync test-vectors generate (TODO, out of scope here — to be specified alongside v0.10 or later).
  • For early development, vectors may be generated by running known inputs through the CLI by hand and committing the resulting .bin + a hand-written .json. Every .json carries generated_by: "vsync@<sha>" for traceability — so when a vector turns out to be wrong, you can find the commit that minted it.
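
For the hand-written .json mentioned in the last bullet, a small helper can keep the file within the §2 formatting rules (UTF-8, LF, 2-space indent, trailing newline). This is a sketch only; `write_metadata` is a hypothetical helper, and the caller is assumed to supply the real commit SHA in generated_by.

```python
import json
from pathlib import Path


def write_metadata(path: Path, meta: dict) -> None:
    """Serialize vector metadata as UTF-8, LF, 2-space indent, trailing newline."""
    text = json.dumps(meta, indent=2, ensure_ascii=False) + "\n"
    # Write bytes rather than text so no platform rewrites LF as CRLF.
    path.write_bytes(text.encode("utf-8"))
```

Writing through one helper avoids formatting-only diffs when different people mint vectors on different platforms.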

7. Conformance test protocol per lib

```text
libraries/
  python/tests/conformance/
    loader.py
    test_conformance.py
  typescript/tests/conformance/
    loader.ts
    conformance.test.ts
  go/tests/conformance/
    loader_test.go
    conformance_test.go
```

Each lib ships a thin loader that:

  1. Walks docs/specs/test-vectors/<category>/*.json.
  2. For each JSON, loads the sibling .bin if present.
  3. Dispatches to a category-specific assertion (positive: decode-and-compare; negative: assert-class).
  4. Reports per-vector pass/fail with the vector's description and generated_by.

CI fails if any vector fails. Each lib's suite covers 100% of categories — no lib opts out. If a lib can't yet pass a category, it's not ready to ship.

The loader is per-lib (idiomatic — pytest, bun test, go test). The vector corpus is shared; the harness is not.


8. Versioning of test vectors (pre-1.0)

  • Each vector's spec_version field names the wire-spec tag it conforms to: v0.9, v0.10, etc.
  • When the wire spec evolves pre-1.0, old vectors are deleted, not preserved (decision I). The corpus on main always describes one wire spec — the current one. There is no test-vectors/v0.12/ and test-vectors/v0.13/ side-by-side.
  • Vectors live in git. External references use a repo SHA (e.g. an upstream consumer pins docs/specs/test-vectors/ at SHA abc123).
  • The atomic-PR rule (decision H) is what makes this safe: a wire-spec change lands in one PR that simultaneously updates the spec doc, all three lib impls, and the vector corpus. No lib gets stranded on stale vectors because no PR is allowed to update one without the others.

Post-1.0 versioning (preserving old vectors for migration testing) is out of scope here — revisit when 1.0 is in sight.
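
The rule that the corpus on main describes exactly one wire spec lends itself to a cheap CI guard. The sketch below assumes the directory layout from §1; `spec_versions` and `assert_single_spec_version` are hypothetical names.

```python
import json
from pathlib import Path


def spec_versions(root: Path) -> set[str]:
    """Collect every distinct spec_version tag found in the corpus."""
    return {
        json.loads(p.read_text(encoding="utf-8"))["spec_version"]
        for p in root.glob("*/*.json")
    }


def assert_single_spec_version(root: Path) -> None:
    """Fail if vectors from more than one wire spec sit side by side."""
    versions = spec_versions(root)
    if len(versions) > 1:
        raise AssertionError(f"corpus mixes wire-spec tags: {sorted(versions)}")
```

A check like this would catch a wire-spec PR that updated some vectors but left stale ones behind.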


9. What test vectors do NOT cover

  • Live S3 / GCS / Azure / AWS interaction — integration tests, lib-local, not part of the shared corpus. Mocking a backend reliably across three languages costs more than it returns.
  • Performance benchmarks — separate concern, per-lib. The corpus is a correctness gate, not a perf gate.
  • SDK ergonomics — idiomatic API differences across languages are not byte-format concerns. Whether Python uses a context manager or explicit .close(), whether TypeScript returns a Result<T, E> or throws, whether Go uses functional options — all lib choices, all out of scope. The vector pins the bytes and the error class; the lib decides how to spell the API.

10. Explicitly out of scope for this spec

  • The lib API surface — that's v0.12.
  • CLI subcommands — that's v0.10.
  • Actual test-vector contents — the .bin / .json files themselves are an output of building the system. They land in the same PR as the impls that produce/consume them; they aren't specified ahead of time in this doc.
  • A formal test runner / harness implementation — each lib is idiomatic; the harness is not shared.
  • Benchmarks.
  • The vsync test-vectors generate subcommand — mentioned in §6 as the future entry point; full surface to be specified alongside v0.10 or later.

11. Non-goals

  • Hex / base64 encoding of .bin inside the JSON — out. The .bin is raw bytes on disk so impls can mmap / fs.readFile / os.ReadFile without a decode pass. Hex appears only in expected.plaintext_hex for the comparison side.
  • Per-vector RNG seeds for "fuzzy" cases — out. Every vector is a fixed bytes-in / bytes-out pair. Fuzz testing is a per-lib concern.
  • A JSON Schema file — out for now. The schema is small enough that prose + examples in §3 carry it. Add a vector.schema.json if the corpus grows past ~50 vectors and drift becomes a problem.
  • Vector signing / checksums — out. The repo SHA is the integrity anchor.
  • Cross-version migration vectors (encrypted with v0.9 spec, decrypted by v0.10 reader) — out pre-1.0 by decision I.
  • Vectors for the audit-log CSV format (v0.4) — out for v0.11. The append-only protocol is bucket-side and tested in integration; if a cross-lib divergence ever shows up there, add an audit-log category in a later spec.

Released under the MIT License.