Ingest

Ingest is the single composed call plugins use to write to the corpus. ctx.ingest({ source, envelope, parties }) find-or-creates a Source, writes its Envelope and party rows, stamps plugin provenance, and emits a core.ingested outbox event — all in one atomic Store batch.

The corpus is an observation log: rows assert “this plugin observed this thing at this moment”. Ingest is the only sanctioned way for a plugin to add to that log. Before ingest existed, plugins reached the corpus through a ctx.source.write kludge that issued two non-atomic Store batches, never composed find-or-create, and wrote envelopes with empty parties. Ingest replaces that with one call the framework guarantees the shape of: source + envelope + parties + outbox event commit together or not at all.
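As a rough illustration of the call shape, the sketch below drives an in-memory stand-in for ctx.ingest. Everything here is an assumption made for the example: the field names inside source, envelope, and parties, the RuntimeContextLike interface, and createFakeContext are hypothetical, and the real types live in @inseam/plugin-contract. The stand-in only mimics find-or-create and idempotent collapse; it performs no Store batch and emits no event.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface IngestInput {
  source: { locationKind: string; pathId: string };
  envelope: { transport: string; receivedAt: Date; rawPayload: Uint8Array };
  parties: { role: string; address: string }[];
}

interface IngestResult {
  source: { id: string; wasCreated: boolean };
  envelope: { id: string; wasCreated: boolean };
}

interface RuntimeContextLike {
  ingest(input: IngestInput): IngestResult;
}

// In-memory stand-in: find-or-creates a source keyed on (locationKind, pathId)
// and collapses repeated envelopes. It is not the framework implementation.
function createFakeContext(): RuntimeContextLike {
  const sources = new Map<string, string>();
  const envelopes = new Set<string>();
  return {
    ingest(input: IngestInput): IngestResult {
      const sourceKey = `${input.source.locationKind}:${input.source.pathId}`;
      const sourceCreated = !sources.has(sourceKey);
      if (sourceCreated) sources.set(sourceKey, `src_${sources.size + 1}`);
      const sourceId = sources.get(sourceKey)!;

      const envelopeId =
        `env_${sourceId}_${input.envelope.transport}_${input.envelope.receivedAt.toISOString()}`;
      const envelopeCreated = !envelopes.has(envelopeId);
      envelopes.add(envelopeId);

      return {
        source: { id: sourceId, wasCreated: sourceCreated },
        envelope: { id: envelopeId, wasCreated: envelopeCreated },
      };
    },
  };
}

const ctx = createFakeContext();
const observation: IngestInput = {
  source: { locationKind: "mailbox", pathId: "msg-123" },
  envelope: {
    transport: "email",
    receivedAt: new Date("2024-01-01T00:00:00Z"),
    rawPayload: new Uint8Array([1, 2, 3]),
  },
  parties: [{ role: "sender", address: "alice@example.com" }],
};

const first = ctx.ingest(observation);  // brand-new source and envelope
const second = ctx.ingest(observation); // same observation again: collapses
```

The second call returns the same ids with both wasCreated flags false, which is the signal a caller can use to skip logging or cursor work for an observation it has already asserted.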

Three surfaces converge here:

  • Source contributes a new core-internal findOrAppendCreate(builder, input) that appends find-or-create statements to a caller-owned BatchBuilder instead of opening its own batch.
  • Access contributes a core-internal appendEnvelope(builder, input) that appends envelope + party + core.envelope_indexed statements. The prior dynamically-typed access.writeEnvelope reach-around is gone; appendEnvelope is the only path that writes envelope rows.
  • Plugin system owns the RuntimeContext.ingest field, the new "ingest" capability that gates it, and the loader-side migration bridge that auto-upgrades plugins still asking for the legacy ["source_write", "emit_envelopes"] pair.
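The loader-side migration bridge can be pictured as a small pure function over the requested capability list. This is a minimal sketch under stated assumptions: the Capability type below lists only the three literals named on this page (the real union presumably has more), and bridgeLegacyCapabilities is a hypothetical name, not the loader's actual function.

```typescript
// Subset of the Capability union, limited to the literals mentioned in the text.
type Capability = "ingest" | "source_write" | "emit_envelopes";

// Hypothetical helper: grants "ingest" only when BOTH legacy capabilities
// were requested, and leaves every other request list unchanged.
function bridgeLegacyCapabilities(requested: readonly Capability[]): Capability[] {
  const result = new Set<Capability>(requested);
  if (result.has("source_write") && result.has("emit_envelopes")) {
    result.add("ingest");
  }
  return Array.from(result);
}
```

Requiring both legacy literals mirrors the text: a plugin that asked for only one of the pair never had the full write path, so it is not upgraded.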

ctx.ingest runs synchronously: capability check → payload-size check → find-or-append-create → append-envelope → conditional core.ingested emit → store.batch. TOCTOU collisions on (locationKind, pathId) are retried transparently within tuning.findOrCreateMaxRetries.
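The step order above can be sketched as a single function. Only the ordering of the steps and the two tuning names come from this page; SketchError, IngestDeps, appendWrites, commit, and the string "statements" are hypothetical stand-ins for the real BatchBuilder and Store, and the retry condition is one plausible reading of "retried within findOrCreateMaxRetries".

```typescript
type SketchErrorCode = "capability_denied" | "payload_too_large" | "collision";

// Illustrative error type; the real framework exposes CapabilityDeniedError.
class SketchError extends Error {
  constructor(readonly code: SketchErrorCode, message: string) {
    super(message);
  }
}

interface Tuning {
  maxEnvelopePayloadBytes: number;
  findOrCreateMaxRetries: number;
}

interface IngestDeps {
  granted: ReadonlySet<string>;
  tuning: Tuning;
  payloadBytes: number;
  // Stands in for findOrAppendCreate + appendEnvelope on a shared builder.
  appendWrites(): { statements: string[]; anythingCreated: boolean };
  // Stands in for store.batch; throws a "collision" SketchError on a race.
  commit(statements: string[]): void;
}

function isCollision(err: unknown): boolean {
  return typeof err === "object" && err !== null &&
    (err as { code?: unknown }).code === "collision";
}

function ingestSketch(deps: IngestDeps): { attempts: number; committed: string[] } {
  // 1. Capability check.
  if (!deps.granted.has("ingest")) {
    throw new SketchError("capability_denied", "missing capability: ingest");
  }
  // 2. Payload-size check.
  if (deps.payloadBytes > deps.tuning.maxEnvelopePayloadBytes) {
    throw new SketchError("payload_too_large", "rawPayload exceeds maxEnvelopePayloadBytes");
  }
  for (let attempt = 0; ; attempt++) {
    // 3 and 4. Append find-or-create and envelope statements.
    const { statements, anythingCreated } = deps.appendWrites();
    // 5. Emit core.ingested only when something was newly created.
    const batch = anythingCreated ? [...statements, "emit core.ingested"] : statements;
    try {
      // 6. Commit everything as one batch.
      deps.commit(batch);
      return { attempts: attempt + 1, committed: batch };
    } catch (err) {
      // Rebuild and retry on a collision until the retry budget is spent.
      if (!isCollision(err) || attempt >= deps.tuning.findOrCreateMaxRetries) throw err;
    }
  }
}
```

Rebuilding the statements on each attempt matters in this sketch: after a collision the competing row exists, so the next pass would find it rather than try to create it again.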

  • ctx.ingest(input) — the plugin-facing method on RuntimeContext. Returns { source: { id, wasCreated }, envelope: { id, wasCreated } }. The two wasCreated booleans are how callers distinguish a brand-new write from idempotent collapse without a re-read — useful for cursor advancement and logging.
  • IngestInput: { source, envelope, parties }. Provenance fields are absent by design: the runtime context closure has already captured the calling plugin’s identity and ingest stamps it on every row. Plugins cannot supply or override assertedByPluginPackage / assertedByPluginVersion.
  • Content-derived envelope id — derived from (sourceId, transport, receivedAt, package, version). The same five-tuple twice produces the same id; the second call’s pre-batch SELECT finds the row, writes nothing, emits nothing, and returns wasCreated: false. rawPayload bytes are deliberately not in the hash so byte-noisy variants of one logical assertion collapse.
  • core.ingested event — emitted exactly once per call where at least one of source/envelope was newly created. Payload { sourceId, envelopeId, sourceCreated }, schemaVersion: 1. Fires in the same batch as the writes, so subscribers see it only after the rows are durable. The absence of an event is itself the auditable signal that a re-ingest collapsed.
  • "ingest" capability — net-new literal on the Capability union. Plugins declare it in manifest.capabilities.requested. The deprecated "source_write" and "emit_envelopes" literals remain in the vocabulary so existing host configs still parse; a loader-side migration bridge auto-grants "ingest" to any plugin that requested both legacy capabilities. The bridge is removed in a follow-up concept once first-party plugins migrate.
  • Host tuning: maxEnvelopePayloadBytes (default 64 KiB), findOrCreateMaxRetries (default 1), and parseIngestInput?: StandardSchemaV1<IngestInput> (optional runtime validator threaded through createCore). Per AGENTS.md, the framework boundary accepts any Standard-Schema-compatible validator (Zod, Valibot, ArkType, Effect Schema); when omitted, TypeScript is the only gate.
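To make the content-derived envelope id concrete, here is a minimal sketch that hashes the five-tuple and nothing else. The hash algorithm (SHA-256), the JSON-array encoding, and the names EnvelopeIdParts and deriveEnvelopeId are assumptions for illustration; this page does not state how the framework actually encodes or hashes the tuple. It also assumes that "package, version" refers to the calling plugin's package and version.

```typescript
import { createHash } from "node:crypto";

// The five inputs named in the text. rawPayload is intentionally absent.
interface EnvelopeIdParts {
  sourceId: string;
  transport: string;
  receivedAt: Date;
  pluginPackage: string;
  pluginVersion: string;
}

// Hypothetical derivation: a JSON array keeps field boundaries unambiguous,
// so ("ab", "c") and ("a", "bc") cannot produce the same input string.
function deriveEnvelopeId(parts: EnvelopeIdParts): string {
  const canonical = JSON.stringify([
    parts.sourceId,
    parts.transport,
    parts.receivedAt.toISOString(),
    parts.pluginPackage,
    parts.pluginVersion,
  ]);
  return createHash("sha256").update(canonical).digest("hex");
}

const base: EnvelopeIdParts = {
  sourceId: "src_1",
  transport: "email",
  receivedAt: new Date("2024-01-01T00:00:00Z"),
  pluginPackage: "example-plugin",
  pluginVersion: "1.0.0",
};

const idA = deriveEnvelopeId(base);
const idB = deriveEnvelopeId({ ...base });
const idLater = deriveEnvelopeId({ ...base, receivedAt: new Date("2024-01-01T00:05:00Z") });
```

Here idA and idB are equal because the tuple is identical, while idLater differs because receivedAt changed. Since the payload bytes never enter the hash, two byte-noisy captures of the same logical assertion would map to the same id.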

Exact signatures and acceptance criteria: @inseam/plugin-contract API · arch/ingest/spec.md.

Use ctx.ingest whenever a plugin observes a thing it wants the corpus to remember. There is no other write path — ctx.source.write is gone, and the underlying appendEnvelope / findOrAppendCreate helpers are core-internal precisely so plugins cannot bypass ingest to write partial state.

Things ingest deliberately is not:

  • Not a fetch path. Ingest is an assertion path. The plugin supplies whatever retrieval metadata it already knows from the observation that triggered the call. Content bytes never live on the source row — Connection’s fetch capability is the lazy retrieval path, and it can fail at any time (deletion, token expiry, revoked share) regardless of ingest having succeeded. Plugins consuming source bytes MUST treat fetch failure as an expected branch.
  • Not a reachability check. Ingest does not probe the upstream. The corpus stays correct even when sources later become unfetchable. Periodic sampled reachability is the concern of a future connection-health concept, not ingest.
  • Not a retrieval refresher. Re-ingest with fresher retrieval fields silently ignores them; the existing row is preserved. A future refreshRetrieval(sourceId, retrieval) call owns that path.
  • Not a batched-write API. One envelope per call. ctx.ingestMany([...]) is deferred until single-shot stabilizes against a real plugin.
  • Not an eviction surface. Provenance stamping makes plugin-scoped eviction possible later; the actual delete path is the concern of a future corpus-eviction concept.
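Because ingest asserts rather than fetches, consumers of source bytes need a failure branch of their own. The sketch below shows one way to shape that branch. FetchOutcome, Fetcher, and describeSource are hypothetical; the page does not specify the signature of Connection's fetch capability, and the failure reasons are simply the three examples given above.

```typescript
// Hypothetical result type for a lazy fetch; not the real Connection API.
type FetchOutcome =
  | { ok: true; bytes: Uint8Array }
  | { ok: false; reason: "deleted" | "token_expired" | "share_revoked" };

type Fetcher = (sourceId: string) => FetchOutcome;

// Treats an unfetchable source as a normal outcome, not an exception:
// the corpus row remains a valid record of what was observed.
function describeSource(sourceId: string, fetchBytes: Fetcher): string {
  const outcome = fetchBytes(sourceId);
  if (outcome.ok === false) {
    return `${sourceId}: content unavailable (${outcome.reason})`;
  }
  return `${sourceId}: ${outcome.bytes.byteLength} bytes`;
}
```

Modeling failure as a value in the return type, rather than a thrown error, makes it harder for a caller to forget the branch.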

receivedAt is upstream time, not now-on-this-host

This is the single most common footgun. IngestEnvelopeInput.receivedAt is the time the upstream attached to the assertion: Gmail’s internalDate, the Date header on a webhook, the filesystem mtime. Fall back to ctx.clock.now() only when the upstream genuinely supplies no timestamp. Because receivedAt participates in envelope-id derivation, a plugin that passes ctx.clock.now() on every poll cycle produces a fresh envelope per cycle even when the upstream is unchanged, fragmenting what should have been one logical event into N. The framework does not police receivedAt; the contract is “plugin owns it, framework trusts it.”
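One way to keep this rule in a single place is a small helper that prefers any upstream timestamp and touches the clock only as a last resort. This is a sketch under assumptions: chooseReceivedAt, UpstreamTimestamps, and the Clock interface are hypothetical names, and the precedence order among the three upstream fields is arbitrary, chosen only for the example.

```typescript
interface Clock {
  now(): Date;
}

// Hypothetical bag of timestamps a plugin might have from its observation.
interface UpstreamTimestamps {
  internalDateMs?: number; // e.g. Gmail internalDate, in epoch milliseconds
  dateHeader?: string;     // e.g. the Date header on a webhook
  mtimeMs?: number;        // e.g. filesystem mtime, in epoch milliseconds
}

function chooseReceivedAt(upstream: UpstreamTimestamps, clock: Clock): Date {
  if (upstream.internalDateMs !== undefined) return new Date(upstream.internalDateMs);
  if (upstream.dateHeader !== undefined) {
    const parsed = new Date(upstream.dateHeader);
    // Ignore an unparseable header instead of ingesting an invalid date.
    if (!isNaN(parsed.getTime())) return parsed;
  }
  if (upstream.mtimeMs !== undefined) return new Date(upstream.mtimeMs);
  // Last resort only: the upstream supplied no usable timestamp.
  return clock.now();
}
```

With a stable upstream timestamp, every poll cycle passes the same receivedAt and therefore re-derives the same envelope id; the clock fallback gives up that stability, which is why it belongs at the very end.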

  • LLM summary — dense reference for agents.
  • Source — owns the rows ingest find-or-creates.
  • Access — owns the Envelope, party rows, and the core.envelope_indexed event ingest appends.
  • Events — the outbox the core.ingested event rides; namespace guard rejects plugin-side emits of core.ingested.
  • Store: BatchBuilder + store.batch is the atomicity primitive every ingest call commits through.
  • Plugin system: RuntimeContext, the Capability union, CapabilityDeniedError, the loader-side migration bridge.