capa.storage
The run bundle: writer thread, per-sink writers, finalize-in-place, integrity hashing. The runtime emits; this layer is the only place that touches disk in the data path.
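The split between "the runtime emits" and "this layer touches disk" can be pictured as a queue drained by a single thread. The sketch below is illustrative only: `WriterThread`, `emit`, and `close` are hypothetical names, not the `capa.storage` API, and the real writer presumably fans out to several sinks rather than one JSON-lines file.

```python
import json
import queue
import tempfile
import threading
from pathlib import Path

_STOP = object()  # sentinel telling the writer thread to drain and exit


class WriterThread:
    """Hypothetical single writer: producers enqueue, one thread touches disk."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._queue: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def emit(self, record: dict) -> None:
        # Called by the runtime; only enqueues, so it never waits on disk I/O.
        self._queue.put(record)

    def close(self) -> None:
        # Records queued before the sentinel are written before the thread exits.
        self._queue.put(_STOP)
        self._thread.join()

    def _run(self) -> None:
        # The only code path in this sketch that opens or writes a file.
        with self._path.open("a", encoding="utf-8") as handle:
            while True:
                item = self._queue.get()
                if item is _STOP:
                    break
                handle.write(json.dumps(item) + "\n")


def demo() -> list[dict]:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "run.log"
        writer = WriterThread(path)
        for i in range(3):
            writer.emit({"seq": i})
        writer.close()
        return [json.loads(line) for line in path.read_text().splitlines()]
```

Because a single thread drains a FIFO queue, records reach disk in the order they were emitted, and `close()` returns only after everything queued so far has been written.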
Narrative guides:
- What's in a bundle: the on-disk tour.
- Manifest and schema.
- Bundle write path: the Arrow-IPC-then-rewrite finalize protocol.
- Integrity and sealing: `sha256sum`-compatible artifact hashing and the `bundle_status` state machine.
- Reading a bundle: analyst recipes.
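As a rough illustration of what `sha256sum`-compatible hashing means, the sketch below emits one `<hex digest>  <relative path>` line per file, which is the layout `sha256sum -c` reads. The function names are hypothetical and are not the `capa.storage.integrity` API; which files the real layer hashes, and where it stores the result, is not assumed here.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_lines(root: Path) -> str:
    """Return sha256sum-style lines for every file under root, sorted by path.

    Each line is the hex digest, two spaces, then the POSIX-style relative
    path. File names containing backslashes or newlines need the escaping
    that sha256sum applies; this sketch does not handle them.
    """
    lines = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        lines.append(f"{sha256_file(path)}  {rel}\n")
    return "".join(lines)


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "run.log").write_bytes(b"abc")
        return checksum_lines(root)
```

Saving the output of `checksum_lines` to a file inside the directory and running `sha256sum -c <that file>` from the same directory should verify each listed artifact with stock tooling.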
Storage layer — the run bundle.
A run produces one directory containing every artifact needed to interpret it later. The storage layer provides:
- `capa.storage.bundle.RunBundleWriter`: opens a run dir, owns sink lifecycle, drives the `bundle_status` state machine, writes `manifest.json` at start and finalize.
- `capa.storage.channel_samples_sink`: normalized `scalars.parquet` long-table writer.
- `capa.storage.device_records_sink`: library-native `device_records/<adapter>.parquet` sidecars (one writer per family).
- `capa.storage.events_sink` / `capa.storage.status_sink`: transactional SQLite sinks for events and device snapshots.
- `capa.storage.log_sink`: JSON-lines append handle to `run.log`.
- `capa.storage.finalize`: pure-function finalize-in-place, which rewrites in-flight Arrow IPC streams to final Parquet with large row groups, computes `manifest.sha256`, and sets `bundle_status`.
- `capa.storage.integrity`: sha256 over every artifact, in `sha256sum`-compatible format.
- `capa.storage.manifest`: Pydantic `BundleManifest` model.
- `capa.storage.schema`: `BUNDLE_SCHEMA_VERSION` and migration registry.
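To make "transactional SQLite sink" concrete, here is a minimal sketch using only the standard library. `EventsSink`, its table layout, and the `events.sqlite` file name are assumptions for illustration, not the actual `capa.storage.events_sink` interface. The point it shows is that a batch either commits as a whole or leaves the database unchanged.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


class EventsSink:
    """Hypothetical events sink: each batch commits atomically or not at all."""

    def __init__(self, path: Path) -> None:
        self._conn = sqlite3.connect(path)
        with self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS events ("
                "seq INTEGER PRIMARY KEY, kind TEXT NOT NULL, payload TEXT NOT NULL)"
            )

    def write_batch(self, events: list[tuple[int, str, dict]]) -> None:
        # `with conn` commits on success and rolls back if any insert raises.
        with self._conn:
            self._conn.executemany(
                "INSERT INTO events (seq, kind, payload) VALUES (?, ?, ?)",
                [(seq, kind, json.dumps(payload)) for seq, kind, payload in events],
            )

    def count(self) -> int:
        return self._conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    def close(self) -> None:
        self._conn.close()


def demo() -> tuple[int, int]:
    with tempfile.TemporaryDirectory() as tmp:
        sink = EventsSink(Path(tmp) / "events.sqlite")
        sink.write_batch([(1, "start", {}), (2, "stop", {"reason": "done"})])
        after_good = sink.count()
        try:
            # seq 3 is new, but the duplicate seq 1 fails the whole batch.
            sink.write_batch([(3, "late", {}), (1, "dup", {})])
        except sqlite3.IntegrityError:
            pass
        after_bad = sink.count()
        sink.close()
        return after_good, after_bad
```

In `demo()`, the second batch violates the primary key, so the row with `seq` 3 is rolled back along with the duplicate and the row count stays at 2.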