Projects

2026 · Solo · computer-vision system

Real-time vision for worker-safety monitoring

Industrial fall detection that runs live on a single mid-range GPU — pose, tracking, segmentation and a temporal model behind one observable service, with no fabricated numbers.

The problem

Industrial monitoring needs to catch a worker falling as it happens, on hardware a site can actually afford: one mid-range GPU, not a cluster. The hard part isn't a model that classifies falls offline; it's a pipeline that ingests a live stream, keeps latency under control, and stays honest about what is real inference and what is a fallback.

Approach & tradeoffs

Sentinel Vision is a single observable FastAPI service, not a notebook. A frame flows through YOLO26-Pose → ByteTrack identity tracking → SAM 2.1 masks (requested only for new, stale or uncertain tracks, which skips the expensive call when nothing has changed) → a per-person skeleton transformer that reads the action over a temporal window.
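The mask gating described above can be sketched as a small per-track decision. This is a minimal illustration, not the service's actual code: `TrackState`, `needs_mask`, and the `max_age` and `min_conf` thresholds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackState:
    """Hypothetical per-person bookkeeping kept alongside the tracker output."""
    track_id: int
    last_mask_frame: Optional[int] = None  # None means no mask has been computed yet
    confidence: float = 1.0                # latest detection confidence for this track


def needs_mask(track: TrackState, frame_idx: int,
               max_age: int = 30, min_conf: float = 0.5) -> bool:
    """Return True when the expensive segmentation call should run for this track.

    The thresholds are illustrative assumptions, not measured settings.
    """
    if track.last_mask_frame is None:
        return True                        # new track: no mask exists
    if frame_idx - track.last_mask_frame >= max_age:
        return True                        # stale: the cached mask is too old
    return track.confidence < min_conf     # uncertain: refresh on low confidence
```

A track that already has a recent mask and a confident detection returns `False`, so the segmentation model is skipped for that person on that frame.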

The engineering decisions are about keeping a live system honest:

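The temporal head is trained on the UR Fall Detection Dataset with a by-sequence split: every clip from a given recording lands entirely in training or entirely in validation, so falls in validation are never seen in training and the score reflects generalization rather than memorized footage.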
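A by-sequence split can be sketched with the standard library alone. This is an assumed, simplified version: `split_by_sequence`, the `val_fraction` default, and the sequence ID strings in the usage lines are hypothetical and are not taken from the project or the dataset.

```python
import random
from typing import List, Sequence, Tuple


def split_by_sequence(sequence_ids: Sequence[str],
                      val_fraction: float = 0.2,
                      seed: int = 0) -> Tuple[List[int], List[int]]:
    """Split clip indices so that no recording contributes to both sides.

    sequence_ids[i] names the recording that clip i was cut from.
    """
    unique = sorted(set(sequence_ids))      # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(unique)
    n_val = max(1, round(len(unique) * val_fraction))
    val_sequences = set(unique[:n_val])     # whole recordings held out for validation

    train_idx = [i for i, s in enumerate(sequence_ids) if s not in val_sequences]
    val_idx = [i for i, s in enumerate(sequence_ids) if s in val_sequences]
    return train_idx, val_idx


# Usage with made-up sequence IDs: five recordings, two held out.
ids = ["seq-a"] * 3 + ["seq-b"] * 2 + ["seq-c"] * 4 + ["seq-d"] + ["seq-e"] * 2
train_idx, val_idx = split_by_sequence(ids, val_fraction=0.4)
```

Splitting on the recording rather than on individual clips is what prevents overlapping windows from the same fall from appearing on both sides.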

Results

What I'd flag

Every FPS and VRAM figure is measured on one named GPU; nothing is extrapolated from a bigger card. Genuine zero-copy decode-to-inference is scoped to a DeepStream boundary, and the portable build does not claim it; readiness reports the active decode path instead of pretending it's PCIe-free.
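One way a readiness report could state the decode path explicitly is sketched below. The `DecodePath` values, the `readiness_payload` function, and the field names are all assumptions made for illustration; the service's real readiness schema is not shown in this write-up.

```python
from enum import Enum


class DecodePath(str, Enum):
    """Hypothetical labels for how frames reach the model."""
    CPU = "cpu"               # decoded on the CPU, then uploaded to the GPU
    GPU_COPY = "gpu-copy"     # decoded on the GPU, but with an intermediate copy
    ZERO_COPY = "zero-copy"   # decode output handed to inference with no copy


def readiness_payload(active_path: DecodePath, models_loaded: bool) -> dict:
    """Build a readiness body that names the decode path actually in use."""
    return {
        "ready": models_loaded,
        "decode_path": active_path.value,
        # Only true when the zero-copy path is genuinely active.
        "zero_copy": active_path is DecodePath.ZERO_COPY,
    }
```

A dictionary like this could be returned from a readiness route, so a portable build running the CPU path reports `"zero_copy": False` rather than implying otherwise.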