Testing¶
95%+ of the test suite runs with no hardware attached —
FakeTransport replays scripted
byte exchanges, the fixture-file format lets captured hardware
sessions round-trip through code review, and the
device_matrix.yaml fixture
is the empirical source of truth for per-firmware command
availability. See Design §6 for the full strategy.
Running tests¶
uv run pytest # fast suite, no hardware
uv run pytest -m hardware # read-only hardware tests (needs ALICATLIB_TEST_*_PORT)
uv run pytest -m hardware_stateful # requires ALICATLIB_ENABLE_STATEFUL_TESTS=1
uv run pytest -m hardware_destructive # requires ALICATLIB_ENABLE_DESTRUCTIVE_TESTS=1
Default pytest run excludes every hardware* marker, so the fast
suite is always hermetic. Coverage runs via
uv run pytest --cov --cov-report=xml.
Hardware test tiers¶
| Marker | What it does | Opt-in |
|---|---|---|
hardware |
Read-only — identify, poll, query commands. Never changes device state. | ALICATLIB_TEST_*_PORT env vars set |
hardware_stateful |
Changes device state (gas, setpoint, tare, unit-id). Reverts before exit where possible. | ALICATLIB_ENABLE_STATEFUL_TESTS=1 |
hardware_destructive |
Factory reset, baud change, valve exhaust. No automatic revert. | ALICATLIB_ENABLE_DESTRUCTIVE_TESTS=1 |
slow |
Excluded from the fast CI run; used for latency / soak benchmarks. | -m slow explicit pass |
Markers are defined in
pyproject.toml under [tool.pytest.ini_options];
opt-in env vars are read by tests/conftest.py.
FakeTransport¶
FakeTransport satisfies the
Transport Protocol with a scripted
reply table. Writes are recorded; reads drain from the scripted
replies for that write. Forced-timeout and short-read knobs let tests
exercise the protocol client's error paths deterministically.
Inline script¶
import pytest
from alicatlib.testing import FakeTransport
@pytest.mark.anyio
async def test_gas_select_encode() -> None:
fake = FakeTransport({b"AGS 5\r": [b"A 5 N2 Nitrogen\r"]})
# drive a Session through the fake; assert fake.writes, assert decoded reply
One key per unique write; one list of reply chunks per key. Multiple
lines for multiline commands concatenate inside one list entry —
??M* scripts look like:
FakeTransport({
b"A??M*\r": [
b"A M01 Alicat Scientific\r"
b"A M02 www.example.com\r"
b"A M03 +1 555-0000\r"
# ...
],
})
Forced errors¶
FakeTransport(..., fail=FailPlan(read_timeout_at_call=2)) fires a
TimeoutError on the second read call; useful for testing retry /
recovery paths. See the FakeTransport docstring for the full knob
list.
@pytest.mark.anyio¶
Async tests use the AnyIO pytest plugin — not pytest-asyncio.
The two auto-modes disagree and pytest-asyncio wraps fixtures in
fresh tasks that break cancel scopes. tests/conftest.py wires up a
parametrised anyio_backend fixture that runs every async test
against both asyncio and trio for cross-backend coverage.
Fixture format¶
Captured hardware traffic lives in
tests/fixtures/responses/ as plaintext
.txt files. The format is deliberately skimmable:
# scenario: Set active gas to N2 (code 8) via GS command
# Response is "<unit_id> <code> <short> <long_name>".
> AGS 8
< A 8 N2 Nitrogen
Rules (testing.py):
- Lines starting with
#are comments. - Blank lines are ignored.
>introduces a send. The carriage-return terminator is appended automatically so the fixture stays human-readable.<introduces one reply line (\r-terminated).- Multiple
<lines after a single>concatenate into one scripted reply — the right shape for multiline commands. - Duplicate
>entries are a file-format error, not a silent overwrite. Two writes of the same bytes must use two separate>blocks in order.
Loading a fixture¶
from alicatlib.testing import FakeTransportFromFixture
fake = FakeTransportFromFixture("tests/fixtures/responses/gas_select_n2.txt")
FakeTransportFromFixture is a drop-in replacement for the
dictionary-constructed FakeTransport; the file is parsed once at
construction and the scripted replies are populated from the > /
< pairs.
Capturing new fixtures¶
An automated record_session(device, scenario) helper is planned
but not shipped yet — the docstring in
testing.py:33-35 notes it lands
with the hardware integration suite.
For now, capture fixtures by hand or by pasting from a --log-level=DEBUG
transcript. The protocol client emits one tx / rx DEBUG event per
write / read on the alicatlib.protocol logger; translate the
structured {direction, raw, len} extras into > / < lines. See
troubleshooting.md §Getting raw wire bytes.
device_matrix.yaml¶
tests/fixtures/device_matrix.yaml
is the empirical behaviour matrix — one (device_model, firmware,
captured_at) triple per entry, with per-command status across the
whole catalog:
- model: MC-100SCCM-D
firmware: GP07R100
family: GP
captured_at: "2026-04-17"
prefix:
reads: none
writes: dollar
dialects:
mm: gp_short_code_backspace_padded
dd: legacy_backspace_padded
commands:
poll: supported
ve: silent
mm: supported
dv: rejected # firmware-gate (GP family)
Status taxonomy (defined at the top of the YAML):
| Status | Meaning |
|---|---|
supported |
Device responds usefully; reply parses per the command spec. |
rejected |
Device returns ? or a library gate blocks pre-I/O (firmware / kind / capability / media). |
silent |
Device returns nothing within a reasonable timeout. |
fallback |
Device returns a data frame instead of the proper reply (e.g. pre-10v05 with display lock). |
placeholder |
Degraded reply (A 1 ---, A +0 1 ---, A Feature Not Enabled). |
adc_counts |
Pre-10v05 DCU returns raw ADC counts — a different meaning entirely. |
untested |
No capture yet — default. |
The matrix is validated against every command spec's
firmware_families declaration by
tests/unit/test_device_matrix.py:
an entry marked supported on a family the spec doesn't allow fails
CI, and vice versa. This is the load-bearing cross-check that keeps
the command catalog honest about real-hardware capture evidence.
New captures append to the file — never silently edit an existing entry unless the original capture was mis-transcribed (and in that case, the commit message must document the re-transcription).
Coverage-layer guidance¶
| Layer | Primary test strategy |
|---|---|
Parsers (protocol/parser.py, devices/data_frame.py) |
Pure-function unit tests against raw fixture bytes. Clock-free; no FakeTransport needed. |
Commands (commands/*) |
encode / decode round-trip against fixture replies; one test per spec covers the happy path and each gate. |
| Session gates | FakeTransport + Session.execute; assert the right typed exception and zero tx when a gate fires. |
| Factory + discovery | FakeTransportFromFixture against captured identification traces. |
| Manager + recorder | Lightweight PollSource stub (design §5.14); no full transport stack. |
| Sinks | Per-backend fixtures; InMemorySink as the oracle for sample_to_row. |
| Sync parity | Dedicated parity test compares every async / sync method pair by parameter name, kind, and default (design §5.16). |
Hypothesis¶
Property-based tests use Hypothesis —
configured via
tests/conftest.py. Useful for encoder invariants (0 / False / None
distinct, round-trip through parse, no ASCII-outside-range payloads),
data-frame parser robustness, and sink row-layout stability.
Pre-push + CI¶
The pre-push hook runs mypy via uv run --frozen against the same
dep groups CI uses, so "works locally" matches "works in CI" by
construction. See .pre-commit-config.yaml.
CI runs ruff (format + lint), mypy + pyright, the test suite across
Python 3.13/3.14 × Linux/macOS/Windows, uv build +
twine check --strict, and a codegen idempotency guard on the
generated registry. See
.github/workflows/ci.yml.