Testing¶

95%+ of the test suite runs with no hardware attached — FakeTransport replays scripted byte exchanges, the fixture-file format lets captured hardware sessions round-trip through code review, and the device_matrix.yaml fixture is the empirical source of truth for per-firmware command availability. See Design §6 for the full strategy.

Running tests¶

uv run pytest                              # fast suite, no hardware
uv run pytest -m hardware                  # read-only hardware tests (needs ALICATLIB_TEST_*_PORT)
uv run pytest -m hardware_stateful         # requires ALICATLIB_ENABLE_STATEFUL_TESTS=1
uv run pytest -m hardware_destructive      # requires ALICATLIB_ENABLE_DESTRUCTIVE_TESTS=1

The default pytest run excludes every hardware* marker, so the fast suite is always hermetic. Coverage runs via uv run pytest --cov --cov-report=xml.

Hardware test tiers¶

Marker	What it does	Opt-in
`hardware`	Read-only — identify, poll, query commands. Never changes device state.	`ALICATLIB_TEST_*_PORT` env vars set
`hardware_stateful`	Changes device state (gas, setpoint, tare, unit-id). Reverts before exit where possible.	`ALICATLIB_ENABLE_STATEFUL_TESTS=1`
`hardware_destructive`	Factory reset, baud change, valve exhaust. No automatic revert.	`ALICATLIB_ENABLE_DESTRUCTIVE_TESTS=1`
`slow`	Excluded from the fast CI run; used for latency / soak benchmarks.	`-m slow` explicit pass

Markers are defined in pyproject.toml under [tool.pytest.ini_options]; opt-in env vars are read by tests/conftest.py.
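As a rough illustration of how marker opt-ins like these can be enforced at collection time, here is a minimal sketch. It is not the project's actual tests/conftest.py: the helper name skip_reason, the dictionary layout, and the choice to gate hardware_stateful and hardware_destructive on their flag alone are assumptions made for the example. Only the marker names and environment variable names come from the table above.

```python
from __future__ import annotations

import fnmatch
import os
from collections.abc import Iterable, Mapping

# Marker -> env var that must equal "1" (names taken from the tiers table).
FLAG_GATES = {
    "hardware_stateful": "ALICATLIB_ENABLE_STATEFUL_TESTS",
    "hardware_destructive": "ALICATLIB_ENABLE_DESTRUCTIVE_TESTS",
}
PORT_PATTERN = "ALICATLIB_TEST_*_PORT"


def skip_reason(markers: Iterable[str], env: Mapping[str, str]) -> str | None:
    """Return why a test should be skipped, or None if it may run."""
    names = set(markers)
    for marker, flag in FLAG_GATES.items():
        if marker in names and env.get(flag) != "1":
            return f"{marker} tests need {flag}=1"
    if "hardware" in names:
        # Any non-empty variable matching the port pattern counts as opt-in.
        has_port = any(
            fnmatch.fnmatchcase(key, PORT_PATTERN) and value
            for key, value in env.items()
        )
        if not has_port:
            return f"hardware tests need a {PORT_PATTERN} variable"
    return None


def pytest_collection_modifyitems(config, items):
    """Hypothetical conftest hook: skip gated tests that were not opted into."""
    import pytest  # imported lazily so skip_reason stays usable without pytest

    for item in items:
        reason = skip_reason((m.name for m in item.iter_markers()), os.environ)
        if reason is not None:
            item.add_marker(pytest.mark.skip(reason=reason))
```

Keeping the decision in a pure function makes the gating logic itself testable in the fast suite, with a plain dictionary standing in for os.environ.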

FakeTransport¶

FakeTransport satisfies the Transport Protocol with a scripted reply table. Writes are recorded; reads drain from the scripted replies for that write. Forced-timeout and short-read knobs let tests exercise the protocol client's error paths deterministically.
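The record-writes, drain-replies behaviour can be pictured with a small stand-in. The class below is an illustrative sketch only: ScriptedTransport, its write / read method names, and its timeout behaviour are assumptions for the example and are not alicatlib's FakeTransport or its Transport Protocol.

```python
from __future__ import annotations

import asyncio


class ScriptedTransport:
    """Illustrative stand-in, not alicatlib's FakeTransport."""

    def __init__(self, script: dict[bytes, list[bytes]]) -> None:
        # Copy the reply lists so the caller's script is not mutated.
        self._script = {key: list(replies) for key, replies in script.items()}
        self._pending: list[bytes] = []
        self.writes: list[bytes] = []

    async def write(self, data: bytes) -> None:
        self.writes.append(data)
        # Queue the replies scripted for exactly these bytes (none if unscripted).
        self._pending = list(self._script.get(data, []))

    async def read(self) -> bytes:
        if not self._pending:
            raise TimeoutError("no scripted reply left for the last write")
        return self._pending.pop(0)


async def demo() -> tuple[list[bytes], bytes]:
    fake = ScriptedTransport({b"AGS 8\r": [b"A 8 N2 Nitrogen\r"]})
    await fake.write(b"AGS 8\r")
    reply = await fake.read()
    return fake.writes, reply


writes, reply = asyncio.run(demo())
```

In this sketch an unscripted write simply leaves nothing to read, so the next read times out; the real fake may treat that case differently.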

Inline script¶

import pytest
from alicatlib.testing import FakeTransport

@pytest.mark.anyio
async def test_gas_select_encode() -> None:
    # N2 is gas code 8, matching the captured fixture under "Fixture format".
    fake = FakeTransport({b"AGS 8\r": [b"A 8 N2 Nitrogen\r"]})
    # drive a Session through the fake; assert fake.writes, assert decoded reply
    ...

Each unique write gets one key, and each key maps to one list of reply chunks. The lines of a multiline reply concatenate inside a single list entry, so a ??M* script looks like:

FakeTransport({
    b"A??M*\r": [
        # Adjacent bytes literals (no commas) concatenate into one list entry.
        b"A M01 Alicat Scientific\r"
        b"A M02 www.example.com\r"
        b"A M03 +1 555-0000\r"
        # ...
    ],
})
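The single-entry shape relies on Python joining adjacent bytes literals at compile time. A quick check with a plain dictionary (placeholder reply text, no library imports) shows the effect:

```python
script = {
    b"A??M*\r": [
        b"A M01 first line\r"
        b"A M02 second line\r"  # no comma above: still the same list entry
    ],
}
replies = script[b"A??M*\r"]
print(len(replies))  # 1
print(replies[0])    # b'A M01 first line\rA M02 second line\r'
```

A stray comma between the literals would instead produce two list entries, which is a different script.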

Forced errors¶

FakeTransport(..., fail=FailPlan(read_timeout_at_call=2)) fires a TimeoutError on the second read call; useful for testing retry / recovery paths. See the FakeTransport docstring for the full knob list.
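To make the "timeout on the Nth read call" idea concrete without depending on the library, the sketch below uses a hypothetical FlakyReader stub and a hypothetical read_with_retry helper. Neither name exists in alicatlib; they only show the kind of recovery path such a knob lets a test exercise deterministically.

```python
from __future__ import annotations

import asyncio


class FlakyReader:
    """Hypothetical stub: raises TimeoutError on one chosen read call (1-based)."""

    def __init__(self, replies: list[bytes], timeout_at_call: int) -> None:
        self._replies = list(replies)
        self._timeout_at_call = timeout_at_call
        self.calls = 0

    async def read(self) -> bytes:
        self.calls += 1
        if self.calls == self._timeout_at_call:
            raise TimeoutError(f"forced timeout on read call {self.calls}")
        return self._replies.pop(0)


async def read_with_retry(reader: FlakyReader, attempts: int = 2) -> bytes:
    """Hypothetical recovery path: retry a read that timed out."""
    for attempt in range(1, attempts + 1):
        try:
            return await reader.read()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: surface the timeout
    raise AssertionError("attempts must be at least 1")


async def demo() -> tuple[bytes, bytes, int]:
    reader = FlakyReader([b"A 1\r", b"A 2\r"], timeout_at_call=2)
    first = await read_with_retry(reader)   # read call 1 succeeds
    second = await read_with_retry(reader)  # call 2 times out, call 3 succeeds
    return first, second, reader.calls


first, second, calls = asyncio.run(demo())
```

Because the failing call index is fixed, the test can assert the exact number of reads the recovery path performed.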

`@pytest.mark.anyio`¶

Async tests use the AnyIO pytest plugin, not pytest-asyncio. The two plugins' auto-modes disagree, and pytest-asyncio wraps fixtures in fresh tasks that break cancel scopes. tests/conftest.py wires up a parametrised anyio_backend fixture that runs every async test against both asyncio and trio for cross-backend coverage.

Fixture format¶

Captured hardware traffic lives in tests/fixtures/responses/ as plaintext .txt files. The format is deliberately skimmable:

# scenario: Set active gas to N2 (code 8) via GS command
# Response is "<unit_id> <code> <short> <long_name>".

> AGS 8
< A 8 N2 Nitrogen

Rules (testing.py):

Lines starting with # are comments.
Blank lines are ignored.
> introduces a send. The carriage-return terminator is appended automatically so the fixture stays human-readable.
< introduces one reply line (\r-terminated).
Multiple < lines after a single > concatenate into one scripted reply — the right shape for multiline commands.
Duplicate > entries are a file-format error, not a silent overwrite. Two writes of the same bytes must use two separate > blocks in order.
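The rules above are small enough to sketch as a parser. This is a minimal illustration, not the code in testing.py: the function name parse_fixture, the whitespace stripping after the > / < markers, and the error messages are assumptions. It treats a repeated send as a format error, following the first sentence of the last rule, and does not model how repeated writes of the same bytes are represented.

```python
from __future__ import annotations


def parse_fixture(text: str) -> dict[bytes, list[bytes]]:
    """Parse '>' / '<' fixture text into a {write: [reply]} script."""
    script: dict[bytes, list[bytes]] = {}
    current: bytes | None = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no traffic
        # The carriage-return terminator is appended here, not stored in the file.
        payload = line[1:].strip().encode("ascii") + b"\r"
        if line.startswith(">"):
            if payload in script:
                raise ValueError(f"line {lineno}: duplicate send {payload!r}")
            script[payload] = []
            current = payload
        elif line.startswith("<"):
            if current is None:
                raise ValueError(f"line {lineno}: reply before any send")
            if script[current]:
                script[current][0] += payload  # later '<' lines join the same reply
            else:
                script[current].append(payload)
        else:
            raise ValueError(f"line {lineno}: expected '#', '>' or '<'")
    return script


SAMPLE = """\
# scenario: Set active gas to N2 (code 8) via GS command

> AGS 8
< A 8 N2 Nitrogen
"""
script = parse_fixture(SAMPLE)
print(script)  # {b'AGS 8\r': [b'A 8 N2 Nitrogen\r']}
```

The resulting dictionary has the same shape as the inline scripts shown earlier: one key per write and one concatenated reply per key.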

Loading a fixture¶

from alicatlib.testing import FakeTransportFromFixture

fake = FakeTransportFromFixture("tests/fixtures/responses/gas_select_n2.txt")

FakeTransportFromFixture is a drop-in replacement for the dictionary-constructed FakeTransport; the file is parsed once at construction and the scripted replies are populated from the > / < pairs.

Capturing new fixtures¶

An automated record_session(device, scenario) helper is planned but not shipped yet — the docstring in testing.py:33-35 notes it lands with the hardware integration suite.

For now, capture fixtures by hand or by pasting from a --log-level=DEBUG transcript. The protocol client emits one tx / rx DEBUG event per write / read on the alicatlib.protocol logger; translate the structured {direction, raw, len} extras into > / < lines. See troubleshooting.md §Getting raw wire bytes.
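One way to automate the transcript-to-fixture step is a logging handler that renders wire events as > / < lines. The sketch below is a hedged illustration: the FixtureCapture class is hypothetical, and it assumes the direction extra holds "tx" or "rx" and the raw extra holds bytes, which the description above implies but does not spell out. The two records are simulated here rather than produced by the protocol client.

```python
from __future__ import annotations

import logging


class FixtureCapture(logging.Handler):
    """Hypothetical handler: renders tx / rx DEBUG records as fixture lines."""

    def __init__(self) -> None:
        super().__init__(level=logging.DEBUG)
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        direction = getattr(record, "direction", None)
        raw = getattr(record, "raw", None)
        if direction not in ("tx", "rx") or not isinstance(raw, bytes):
            return  # not a wire event
        prefix = ">" if direction == "tx" else "<"
        # One fixture line per \r-terminated chunk; the terminator is dropped.
        for chunk in raw.split(b"\r"):
            if chunk:
                text = chunk.decode("ascii", errors="backslashreplace")
                self.lines.append(f"{prefix} {text}")


# Simulated records stand in for what the protocol client would log.
logger = logging.getLogger("alicatlib.protocol")
logger.setLevel(logging.DEBUG)
capture = FixtureCapture()
logger.addHandler(capture)
logger.debug("tx", extra={"direction": "tx", "raw": b"AGS 8\r", "len": 6})
logger.debug("rx", extra={"direction": "rx", "raw": b"A 8 N2 Nitrogen\r", "len": 16})
logger.removeHandler(capture)

print("\n".join(capture.lines))
```

The captured lines still need a # scenario comment and a human review before they are committed as a fixture.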

`device_matrix.yaml`¶

tests/fixtures/device_matrix.yaml is the empirical behaviour matrix — one (device_model, firmware, captured_at) triple per entry, with per-command status across the whole catalog:

- model: MC-100SCCM-D
  firmware: GP07R100
  family: GP
  captured_at: "2026-04-17"
  prefix:
    reads: none
    writes: dollar
  dialects:
    mm: gp_short_code_backspace_padded
    dd: legacy_backspace_padded
  commands:
    poll:          supported
    ve:            silent
    mm:            supported
    dv:            rejected        # firmware-gate (GP family)

Status taxonomy (defined at the top of the YAML):

Status	Meaning
`supported`	Device responds usefully; reply parses per the command spec.
`rejected`	Device returns `?` or a library gate blocks pre-I/O (firmware / kind / capability / media).
`silent`	Device returns nothing within a reasonable timeout.
`fallback`	Device returns a data frame instead of the proper reply (e.g. pre-10v05 with display lock).
`placeholder`	Degraded reply (`A 1 ---`, `A +0 1 ---`, `A Feature Not Enabled`).
`adc_counts`	Pre-10v05 `DCU` returns raw ADC counts — a different meaning entirely.
`untested`	No capture yet — default.

The matrix is validated against every command spec's firmware_families declaration by tests/unit/test_device_matrix.py: an entry marked supported on a family the spec doesn't allow fails CI, and vice versa. This is the load-bearing cross-check that keeps the command catalog honest about real-hardware capture evidence.
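The shape of that cross-check can be sketched with plain data. This is not the code in tests/unit/test_device_matrix.py: the spec_families mapping, the "OTHER" family name, and the function matrix_violations are hypothetical. Reading "vice versa" as "rejected on a family the spec allows" is also an assumption; the real test may distinguish firmware gates from kind, capability, or media gates.

```python
from __future__ import annotations

# One matrix entry, mirroring part of the YAML example as plain Python data.
entry = {
    "model": "MC-100SCCM-D",
    "family": "GP",
    "commands": {"poll": "supported", "mm": "supported", "dv": "rejected"},
}

# Hypothetical stand-in for each command spec's firmware_families declaration.
# "OTHER" is a placeholder family name, not one taken from the library.
spec_families = {
    "poll": {"GP", "OTHER"},
    "mm": {"GP", "OTHER"},
    "dv": {"OTHER"},
}


def matrix_violations(entry: dict, spec_families: dict[str, set[str]]) -> list[str]:
    """List disagreements between one matrix entry and the spec declarations."""
    problems = []
    family = entry["family"]
    for command, status in entry["commands"].items():
        allowed = family in spec_families[command]
        if status == "supported" and not allowed:
            problems.append(f"{command}: supported on {family}, but spec excludes it")
        elif status == "rejected" and allowed:
            problems.append(f"{command}: rejected on {family}, but spec allows it")
    return problems


print(matrix_violations(entry, spec_families))  # []
```

Statuses such as silent or untested are deliberately ignored in this sketch, since the text above only describes the supported case and its converse.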

New captures are appended to the file. Edit an existing entry only if the original capture was mis-transcribed, and document the re-transcription in the commit message.

Coverage-layer guidance¶

Layer	Primary test strategy
Parsers (`protocol/parser.py`, `devices/data_frame.py`)	Pure-function unit tests against raw fixture bytes. Clock-free; no `FakeTransport` needed.
Commands (`commands/*`)	`encode` / `decode` round-trip against fixture replies; one test per spec covers the happy path and each gate.
Session gates	`FakeTransport` + `Session.execute`; assert the right typed exception and zero tx when a gate fires.
Factory + discovery	`FakeTransportFromFixture` against captured identification traces.
Manager + recorder	Lightweight `PollSource` stub (design §5.14); no full transport stack.
Sinks	Per-backend fixtures; `InMemorySink` as the oracle for `sample_to_row`.
Sync parity	Dedicated parity test compares every async / sync method pair by parameter name, kind, and default (design §5.16).
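The sync-parity row describes a comparison by parameter name, kind, and default, which maps naturally onto inspect.signature. The sketch below shows one way such a check could look. AsyncDevice, SyncDevice, shape, and parity_mismatches are hypothetical names for illustration, not the project's facades or its parity test.

```python
from __future__ import annotations

import inspect


class AsyncDevice:  # hypothetical async facade
    async def poll(self, *, timeout: float = 1.0) -> str:
        return "frame"

    async def set_gas(self, gas: int, save: bool = False) -> None:
        return None


class SyncDevice:  # hypothetical sync facade
    def poll(self, *, timeout: float = 1.0) -> str:
        return "frame"

    def set_gas(self, gas: int, save: bool = False) -> None:
        return None


def shape(func) -> list[tuple]:
    """Reduce a callable to (name, kind, default) per parameter."""
    params = inspect.signature(func).parameters.values()
    return [(p.name, p.kind, p.default) for p in params]


def parity_mismatches(async_cls: type, sync_cls: type) -> list[str]:
    """Names of public async methods whose sync twin is missing or differs."""
    mismatched = []
    for name, method in inspect.getmembers(async_cls, inspect.iscoroutinefunction):
        if name.startswith("_"):
            continue
        twin = getattr(sync_cls, name, None)
        if twin is None or shape(method) != shape(twin):
            mismatched.append(name)
    return mismatched


print(parity_mismatches(AsyncDevice, SyncDevice))  # []
```

Comparing the parameter kind catches cases where one side accepts an argument positionally and the other only by keyword, which a name-only comparison would miss.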

Hypothesis¶

Property-based tests use Hypothesis, configured via tests/conftest.py. They are useful for encoder invariants (0, False, and None stay distinct; values round-trip through parse; no payload bytes fall outside the ASCII range), data-frame parser robustness, and sink row-layout stability.

Pre-push + CI¶

The pre-push hook runs mypy via uv run --frozen against the same dep groups CI uses, so "works locally" matches "works in CI" by construction. See .pre-commit-config.yaml.

CI runs ruff (format + lint), mypy + pyright, the test suite across Python 3.13/3.14 × Linux/macOS/Windows, uv build + twine check --strict, and a codegen idempotency guard on the generated registry. See .github/workflows/ci.yml.