Case Study: Latent achieves 99.999% uptime and 600 ms P90 latency with Baseten

A Baseten Case Study

Preview of the Latent Case Study

Latent delivers pharmaceutical search with 99.999% uptime on Baseten

Latent Health uses Baseten to power AI-native medical search and clinical question answering for major U.S. health systems. As its multimodal workflows expanded across notes, labs, medications, documents, and media, Latent ran into growing infrastructure complexity, the overhead of managing many models, reliability risks, performance pressure, and slower experimentation, all while trying to serve millions of documents daily.

Baseten helped Latent deploy and orchestrate its compound AI system with Baseten Chains and improved runtime performance through the Baseten Inference Stack. By splitting workflows into independently scalable Chainlets and optimizing inference, Baseten enabled 99.999% uptime, 600 ms P90 end-to-end latency, and a 6x improvement in GPU utilization, along with simpler deployment, observability, and experimentation.


Latent

Allan Bishop

Head of Engineering
