Case Study: LMArena achieves secure, fast LLM web development evals with E2B

An E2B Case Study

Preview of the LMArena Case Study

How LMArena Collaborated with E2B to Build LLM Web Development Evals

LMArena, a UC Berkeley research team, developed WebDev Arena but faced significant challenges in securely and efficiently evaluating web applications generated by large language models. They needed to run and compare LLM-generated code in real time, with strict isolation, at high speed, and at scale, which led them to partner with E2B.

E2B provided its secure sandboxes, which are isolated cloud environments designed for running AI-generated code. This enabled LMArena to execute code from multiple LLMs simultaneously, with quick startup times and strict isolation. As a result, the platform ran over 50,000 model comparisons and started over 230,000 E2B sandboxes, allowing the team to reliably benchmark the performance of different models.



LMArena

Aryan Vichare

Member of Technical Staff


E2B
