Real-time vision for worker-safety monitoring

The problem

Industrial monitoring needs to catch a worker falling as it happens, on hardware a site can actually afford: one mid-range GPU, not a cluster. The hard part isn't a model that classifies falls offline; it's a pipeline that ingests a live stream, keeps latency bounded, and reports clearly which outputs come from real inference and which from a fallback.

Approach & tradeoffs

Sentinel Vision is a single observable FastAPI service, not a notebook. Each frame flows through YOLO26-Pose → ByteTrack identity tracking → SAM 2.1 masks (requested only for new, stale, or uncertain tracks, so the expensive call is skipped when nothing has changed) → a per-person skeleton transformer that classifies the action over a temporal window.
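
The mask-gating rule in that pipeline can be sketched as a small predicate. This is a minimal illustration, not the service's actual code: the TrackState fields, the 30-frame staleness window, and the 0.5 confidence threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackState:
    """Per-track bookkeeping. Every field name here is hypothetical."""
    track_id: int
    confidence: float                      # pose/detector confidence on this frame
    last_mask_frame: Optional[int] = None  # frame index of the last mask, None if never masked


def needs_mask(track: TrackState, frame_idx: int,
               max_age_frames: int = 30, min_confidence: float = 0.5) -> bool:
    """Return True when the expensive segmentation call is worth making."""
    if track.last_mask_frame is None:
        return True   # new track: no mask exists yet
    if frame_idx - track.last_mask_frame >= max_age_frames:
        return True   # stale: the cached mask is too old
    return track.confidence < min_confidence  # uncertain: refresh the mask
```

A confident track that was masked recently returns False, so the cached mask is reused and the segmentation model is not called for that person on that frame.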

The engineering decisions are about keeping a live system honest:

Bounded latest-frame queues, so live latency can't drift under load: the pipeline drops old frames instead of falling behind.
Fail-closed production config that refuses to start on demo backends, and a /health/ready endpoint that reports degradations instead of hiding them.
Models as swappable adapters (PyTorch / ONNX / TensorRT), with an auditable kinematic fallback for the temporal head.
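
The first decision above, a bounded latest-frame queue, can be sketched with a fixed-length deque. This is one possible shape under stated assumptions: the class name, the dropped counter, and the default size of one frame are illustrative, not taken from the service.

```python
import threading
from collections import deque


class LatestFrameQueue:
    """Bounded queue that evicts the oldest frame instead of blocking the producer."""

    def __init__(self, maxsize: int = 1):
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self._frames = deque(maxlen=maxsize)
        self._lock = threading.Lock()
        self.dropped = 0  # number of frames evicted before being consumed

    def put(self, frame) -> None:
        with self._lock:
            if len(self._frames) == self._frames.maxlen:
                self.dropped += 1      # append below evicts the oldest frame
            self._frames.append(frame)

    def get(self):
        """Return the oldest frame still held, or None if the queue is empty."""
        with self._lock:
            return self._frames.popleft() if self._frames else None
```

Because the producer never blocks, queueing delay is capped at maxsize frames no matter how slow the consumer is, and the dropped counter is the kind of value that could feed a drop-rate metric.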

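The temporal head is trained on the UR Fall Detection Dataset with a by-sequence split: every window from a given recording goes to either training or validation, never both, so no validation fall is seen during training and the score reflects generalization to unseen falls.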
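
A by-sequence split can be sketched as follows. The (sequence_id, window) sample layout, the 20% validation fraction, and the fixed seed are assumptions for illustration; this simple version also does not stratify by class, which a real split over fall and non-fall recordings might want.

```python
import random
from collections import defaultdict


def split_by_sequence(samples, val_fraction=0.2, seed=0):
    """Split (sequence_id, window) pairs so no sequence appears on both sides."""
    by_seq = defaultdict(list)
    for seq_id, window in samples:
        by_seq[seq_id].append(window)

    seq_ids = sorted(by_seq)                 # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(seq_ids)
    n_val = max(1, round(len(seq_ids) * val_fraction))
    val_ids = set(seq_ids[:n_val])

    train = [(s, w) for s in seq_ids if s not in val_ids for w in by_seq[s]]
    val = [(s, w) for s in seq_ids if s in val_ids for w in by_seq[s]]
    return train, val
```

Splitting at the window level instead would scatter overlapping windows from the same fall across both sets, which is the leak the by-sequence split avoids.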

Results

Validation macro-F1 = 0.90 on held-out fall sequences: a by-sequence score, not a synthetic 100%.
27.98 FPS, p95 49 ms end-to-end, ~1990 MiB VRAM on an RTX 2070, with 0% frame drops over the benchmark and every throughput / latency / drop SLO passing.
Full observability: Prometheus metrics, a live FPS/latency dashboard, and a WebSocket telemetry stream.
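
For readers who want to see how throughput, p95 latency, and drop-rate SLOs like the ones above can be evaluated from a benchmark run, here is a minimal sketch. The function names and every threshold are hypothetical; the actual SLO limits are not stated here, and the nearest-rank percentile is one common definition rather than necessarily the one the benchmark uses.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def check_slos(latencies_ms, frames_in, frames_out, duration_s,
               min_fps, max_p95_ms, max_drop_rate):
    """Summarize a benchmark run and report whether all three SLOs hold."""
    fps = frames_out / duration_s
    p95 = percentile(latencies_ms, 95)
    drop_rate = (frames_in - frames_out) / frames_in
    passed = fps >= min_fps and p95 <= max_p95_ms and drop_rate <= max_drop_rate
    return {"fps": fps, "p95_ms": p95, "drop_rate": drop_rate, "passed": passed}
```

Reporting p95 rather than the mean matters for a live alerting system: a few slow frames barely move the average but are exactly the ones that delay a fall alert.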

What I'd flag

Every FPS and VRAM figure is measured on one named GPU (the RTX 2070); nothing is extrapolated from a bigger card. Genuine zero-copy decode-to-inference is scoped to a DeepStream boundary, and the portable build does not claim it; the readiness endpoint reports the active decode path instead of implying that frames never cross PCIe.
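
The fail-closed startup check and the self-reporting readiness payload can be sketched in plain Python. Everything named here is an assumption for illustration: the backend labels ("demo", "kinematic"), the payload keys, and the decode-path value are hypothetical, and treating the kinematic fallback as a reported degradation is my reading rather than a documented schema.

```python
DEMO_BACKENDS = {"demo"}           # hypothetical label for placeholder models
FALLBACK_BACKENDS = {"kinematic"}  # hypothetical label for the rule-based temporal fallback


def validate_startup(backends, production):
    """Fail closed: refuse to start in production if any stage uses a demo backend."""
    demo_stages = sorted(s for s, b in backends.items() if b in DEMO_BACKENDS)
    if production and demo_stages:
        raise RuntimeError("demo backends not allowed in production: "
                           + ", ".join(demo_stages))


def readiness_report(backends, decode_path):
    """Build a readiness payload that lists degradations instead of hiding them."""
    degraded = sorted(s for s, b in backends.items() if b in FALLBACK_BACKENDS)
    return {"ready": True, "degraded": degraded, "decode_path": decode_path}
```

A dictionary like this could be returned directly from a readiness route, so an operator sees both that the service is up and which stages are running on a fallback.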