
Our inference observability stack: telemetry, benchmarks, and model cards
How do we keep tabs on our growing network of edge inference servers with a small team of engineers? Three systems: per-request telemetry, synthetic benchmarks, and model cards generated from benchmark output.





