USB webcams

Audience: operators with one or more USB cameras on the rig (typically a visible-light camera mounted above the sample). Scope: capa's webcam adapter — the PyAV → MKV pipeline, the always-on preview pump, encoder choice and its saturation implications, UVC controls, and recovery from disconnect mid-run.


At a glance

Adapter id: capa.devices.camera.webcam
Adapter family: camera_visible
Real adapter: capa.devices.camera.webcam
Sim adapter: none; visible cameras have no sim counterpart today
Resource scheme: webcam:<selector>
Container: Matroska (.mkv)
Default codec: libx264
Default frame size / rate: 1280×720 @ 30 fps
Preview cadence: 2 Hz (PREVIEW_INTERVAL_NS = 500_000_000)

Cameras are peers of devices, not subtypes — they implement Camera, not DeviceAdapter. Emissions are FrameReceipt + CameraHealth records (one per frame, posted from the worker thread), not ChannelSamples. See Devices overview § "The four sibling libraries" for the placement.

Supported hardware

Any camera the OS exposes through its native capture API and that PyAV can open via the per-OS demuxer:

| OS | Demuxer | URL form |
| --- | --- | --- |
| Linux | v4l2 | /dev/video0 |
| macOS | avfoundation | default or 0 |
| Windows | dshow | video=<friendly-name> |

from_params picks the default for the current OS; input_format and input_url override.
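The override rule can be sketched as a small lookup. This is a minimal illustration built from the table's values, not capa's actual from_params; the function name resolve_input, the _PLATFORM_DEFAULTS mapping, and the return shape are all hypothetical.

```python
import sys

# Per-OS defaults copied from the table above. The mapping and the
# function below are illustrative stand-ins, not capa's from_params.
_PLATFORM_DEFAULTS = {
    "linux": ("v4l2", "/dev/video0"),
    "darwin": ("avfoundation", "default"),
    "win32": ("dshow", "video=<friendly-name>"),
}


def resolve_input(platform=None, input_format=None, input_url=None):
    """Return (format, url); explicit values override the OS default."""
    platform = sys.platform if platform is None else platform
    for prefix, (fmt, url) in _PLATFORM_DEFAULTS.items():
        if platform.startswith(prefix):
            return (input_format or fmt, input_url or url)
    raise ValueError(f"no default demuxer for platform {platform!r}")
```

On Windows the default URL is only a placeholder, so in practice input_url has to name the device, for example resolve_input("win32", input_url="video=Logitech Webcam C930e").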

Configuration

configs/hardware/webcam_real.toml
[[cameras]]
name = "visible_cam0"
adapter = "capa.devices.camera.webcam"
kind = "visible"
estimated_bps = 1_500_000        # 30 fps × 720p H.264 ≈ 2 Mbps; gives headroom
on_failure = "warn"

[cameras.params]
fps = 30
width = 1280
height = 720
codec = "libx264"
pix_fmt = "yuv420p"
input_format = "dshow"
input_url = "video=Logitech Webcam C930e"

Note: cameras live under [[cameras]], not [[devices]].

WebcamParams is the Setup-editor view; the adapter itself validates kwargs the existing way.

| Key | Default | Notes |
| --- | --- | --- |
| fps | 30 (DEFAULT_FPS) | Encoder target. Must be > 0. |
| width / height | 1280 / 720 | Frame size. The dshow list_options probe (Windows) enumerates the device's supported sizes; see Quirks. |
| codec | "libx264" (DEFAULT_CODEC) | See Encoder choice. |
| pix_fmt | "yuv420p" (DEFAULT_PIX_FMT) | H.264-compatible default. |
| input_url | platform default | OS-specific demuxer URL. |
| input_format | platform default | PyAV format name. |

Encoder choice

The encoder is the dominant CPU cost of recording and the most common cause of the sat (saturation) pill turning yellow. Choices that ship:

| codec | Hardware | When to use | Saturation risk |
| --- | --- | --- | --- |
| libx264 | CPU only | Default. Available everywhere. | Highest CPU; watch the sat pill on dense rigs. |
| h264_qsv | Intel Quick Sync | Intel iGPU available. | Low CPU; lowest sustained risk. |
| h264_nvenc | NVIDIA NVENC | Discrete NVIDIA GPU available. | Very low CPU. |
| mjpeg | CPU | Diagnostic / max-fidelity at small frame sizes. | Very large output files; disk I/O becomes the limit before CPU does. |

If the sat pill goes yellow on a rig that's idle elsewhere, switch from libx264 to one of the hardware encoders before trying to lower fps or resolution. See saturation deadline for the escalation contract.
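When choosing a value for codec, it can help to check which encoders the local build offers before editing the config. The sketch below is a hypothetical helper, not part of capa; the preference order (hardware first, libx264 as fallback) is an assumption that follows the advice above, and the caller supplies the set of available codec names.

```python
# Hypothetical preference order: hardware encoders first, libx264 as the
# fallback. This is an illustration, not capa's selection logic.
HARDWARE_FIRST = ("h264_nvenc", "h264_qsv", "libx264")


def pick_codec(available, preference=HARDWARE_FIRST):
    """Return the first preferred codec name that appears in `available`."""
    for name in preference:
        if name in available:
            return name
    raise LookupError("none of the preferred H.264 encoders is available")
```

With PyAV installed, av.codecs_available should provide a suitable set of names. Treat the result as a first filter only: an encoder that is compiled in can still fail to open on a machine without the matching GPU.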

How the pipeline is wired

One long-lived input pump opens the camera once at start_input_pump and closes it once at stop_input_pump, not once per recording. The pump unconditionally emits 2 Hz preview JPEGs onto preview_stream and, while _recording=True, additionally encodes each frame to the output container and emits a FrameReceipt.

This is the design point that fixed the multi-second freeze of the live preview tile after every run-stop on Windows. The DirectShow filter graph hold-time used to be paid every time the camera was opened; opening once per WorkerPool instead of once per recording moves the cost to config load.

MKV container metadata carries the run-start UTC anchor so an external tool can re-correlate by absolute time without parsing capa's manifest.

Recording vs preview — the mental model

                     +------+
   Camera frame ---> | pump | --(every frame, while recording)--> encoder --> .mkv
                     |      | --(every 500 ms, always)----------> preview JPEG
                     +------+

The live tile reads preview_stream — it is always fed, between runs and during runs. The encoder reads the same frames but only activates when start_recording(path) is called. The two streams share one PyAV input but diverge after that.

What this means in practice:

  • Preview never goes blank because the camera was idle between runs.
  • Recording badge on the Run tab is the only reliable signal that the encoder is also consuming the stream. Trust the badge, not the preview, to confirm capture is happening. See Camera preview.
  • Disconnect mid-run — the pump emits a CameraEvent; the encoder closes the output container; the preview also stops. The bundle keeps whatever was recorded; the operator sees the camera badge go red.
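The split between the two streams can be simulated without a camera. This is a minimal sketch of the routing rule only, assuming the preview is throttled by comparing frame timestamps against PREVIEW_INTERVAL_NS; the function name route_frames and its inputs are hypothetical, and the real pump does the encode and JPEG work that this sketch reduces to index lists.

```python
PREVIEW_INTERVAL_NS = 500_000_000  # 2 Hz, as documented above


def route_frames(frame_times_ns, recording_flags):
    """Simulate the pump and return (encoded_indices, preview_indices)."""
    encoded, previews = [], []
    last_preview_ns = None
    for i, (t_ns, recording) in enumerate(zip(frame_times_ns, recording_flags)):
        # Every frame goes to the encoder, but only while recording.
        if recording:
            encoded.append(i)
        # The preview is fed regardless of recording state, throttled by time.
        if last_preview_ns is None or t_ns - last_preview_ns >= PREVIEW_INTERVAL_NS:
            previews.append(i)
            last_preview_ns = t_ns
    return encoded, previews
```

Feeding it two seconds of 30 fps timestamps with recording switched on halfway through gives previews across the whole span and encoded frames only for the second half, which is why the preview alone does not confirm capture.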

UVC controls (Windows only)

When the duvc-ctl wheel is installed, the adapter probes the camera for UVC properties (brightness, contrast, exposure, focus, zoom, pan, tilt) and adds the corresponding CameraCapability flags after open(). Operators can set these from the manual-control card.

Off Windows, UVC controls are not exposed. The flags stay absent and the UI does not render the widgets.

Capabilities

Always declared (_BASE_CAPABILITIES):

| Flag | Notes |
| --- | --- |
| SUPPORTS_DISCOVERY | The per-OS discover_cameras() hook. |
| SERIAL_SELECT | Honors CameraSpec.serial for exact-match selection. |
| MODEL_HINT | Honors CameraSpec.model_hint for friendly-name matching. |
| LIVE_PREVIEW | Pumps frames onto preview_stream. |
| STREAM_FORMAT | PyAV reopens the input with new resolution/framerate on the next start_recording. |

Added after open() (Windows, depending on what duvc-ctl probes): the UVC-control flags listed in base.py.

Discovery

discover_cameras() walks the per-OS enumeration API:

| OS | Path |
| --- | --- |
| Linux | /sys/class/video4linux/video* + sysfs metadata via _probe_v4l2_info. Dedups multiple /dev/videoN nodes per physical camera by bus_info. |
| Windows | duvc_ctl.list_devices() (requires the wheel). |
| macOS | Returns []. AVFoundation enumeration is future work; add macOS cameras by hand. |

Each row carries selector, model, serial, and transport.
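The Linux dedup step can be pictured as follows. This is a sketch under assumptions: each probed node is modeled as a dict with selector and bus_info keys, and the first node seen for a given bus_info wins. The real row type and tie-breaking rule in capa may differ.

```python
def dedupe_by_bus_info(nodes):
    """Keep the first node seen for each bus_info value, preserving order."""
    seen = set()
    unique = []
    for node in nodes:
        key = node["bus_info"]
        if key in seen:
            continue  # another /dev/videoN node of a camera already listed
        seen.add(key)
        unique.append(node)
    return unique
```

A single UVC camera commonly exposes more than one /dev/videoN node (for example a capture node plus a metadata node), which is why the list needs collapsing before it reaches the operator.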

See Discovery for the cross-cutting UX.

Quirks

dshow list_options resolution probe

On Windows, the adapter calls av.open(input_url, format="dshow", options={"list_options": "true"}) to enumerate the device's pin formats — (width, height) pairs and per-resolution max fps caps. The call always fails with "Immediate exit requested" (intentional), but the format dump lands on the libav log channel first; the adapter captures it via av.logging.Capture and parses the max s=WxH fps=NN.NNN lines.

The result populates _supported_resolutions and _resolution_fps_caps so the Setup tab's resolution spinbox can be capped at the device's actual capability. On parse failure (or on non-Windows paths) the lists stay empty and the spinbox falls back to a static set.
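The parsing half of the probe can be sketched with a regular expression. This is not the adapter's parser: the function name parse_max_modes is hypothetical, and the sample text in the usage note is shaped after typical FFmpeg dshow output rather than copied from a real device.

```python
import re

# Matches the "max s=WxH fps=NN.NNN" part of each pin-format line.
_MAX_RE = re.compile(r"max\s+s=(\d+)x(\d+)\s+fps=(\d+(?:\.\d+)?)")


def parse_max_modes(log_text):
    """Return {(width, height): highest max fps seen} from a format dump."""
    caps = {}
    for match in _MAX_RE.finditer(log_text):
        size = (int(match.group(1)), int(match.group(2)))
        fps = float(match.group(3))
        # The same size can appear once per pixel format; keep the best cap.
        caps[size] = max(fps, caps.get(size, 0.0))
    return caps
```

Given a dump containing "min s=1280x720 fps=5 max s=1280x720 fps=30" for one pixel format and "max s=1280x720 fps=10" for another, the function keeps the 30 fps cap for 1280×720. An empty or unparseable dump yields an empty dict, which corresponds to the static fallback described above.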

Open-retry backoff

OPEN_RETRY_DELAYS_S = (0.25, 0.5, 1.0, 2.0, 2.0, 2.0) covers the DirectShow hold-time after a previous cam.close() — cumulative ≈ 7.75 s, which matches the worst observed Logitech C930e release latency on Windows 11. POSIX paths typically open first try and the retries are dormant.

OPEN_RETRY_DEADLINE_S = 8.0 is the hard ceiling; past that, the underlying problem isn't transient and the original error is surfaced.
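The two constants combine roughly as below. This is a sketch, not the adapter's code: open_with_retry and its injectable sleep and clock parameters are hypothetical, and OSError stands in for whatever exception the real open raises.

```python
import time

OPEN_RETRY_DELAYS_S = (0.25, 0.5, 1.0, 2.0, 2.0, 2.0)  # sums to 7.75 s
OPEN_RETRY_DEADLINE_S = 8.0


def open_with_retry(open_fn, sleep=time.sleep, clock=time.monotonic):
    """Call open_fn until it succeeds, delays run out, or the deadline passes."""
    start = clock()
    first_error = None
    # One immediate attempt, then one attempt after each backoff delay.
    for delay in (0.0, *OPEN_RETRY_DELAYS_S):
        if delay:
            sleep(delay)
        try:
            return open_fn()
        except OSError as exc:  # stand-in for the real open error type
            if first_error is None:
                first_error = exc
            if clock() - start >= OPEN_RETRY_DEADLINE_S:
                break
    # Surface the original error rather than the last one.
    raise first_error
```

The delays alone add up to 7.75 s, so the 8.0 s deadline only cuts the loop short when the open attempts themselves are slow.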

pix_fmt mismatches

PyAV will silently re-format frames if the requested pix_fmt doesn't match the device's native format. The _reformat_to_rgb24 helper handles the preview path; the encoder takes the requested pix_fmt verbatim. If the device cannot deliver the requested format the open will fail loudly — but if it can deliver a similar one, you may not notice the reformat cost until the saturation pill goes yellow.

Preview throttling and CPU impact

The 2 Hz preview encodes one JPEG every 15 frames (at 30 fps). PREVIEW_MAX_WIDTH = 320 keeps the payload well under 30 kB at PREVIEW_JPEG_QUALITY = 70. The encode itself runs in the worker thread already used for H.264, so the dominant cost is the recording encoder, not the preview.
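The arithmetic behind those figures is short enough to write out. The helpers below are hypothetical, and preserving the aspect ratio when scaling to PREVIEW_MAX_WIDTH is an assumption about the adapter's behavior.

```python
PREVIEW_INTERVAL_NS = 500_000_000
PREVIEW_MAX_WIDTH = 320


def preview_stride(fps):
    """Frames between previews at a given capture rate (at least 1)."""
    return max(1, round(fps * PREVIEW_INTERVAL_NS / 1_000_000_000))


def preview_size(width, height):
    """Scale down to PREVIEW_MAX_WIDTH keeping aspect ratio; never upscale."""
    if width <= PREVIEW_MAX_WIDTH:
        return width, height
    return PREVIEW_MAX_WIDTH, round(height * PREVIEW_MAX_WIDTH / width)
```

At the defaults, preview_stride(30) is 15 and preview_size(1280, 720) is (320, 180), so the preview JPEG covers one sixteenth of the source frame's pixels and is produced for one frame in fifteen.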

No sim equivalent

Visible cameras do not have a sim adapter. UI iteration on the camera tile uses flir_ir_sim (IR) or a real webcam.

See also