Alluxio
An Alluxio Case Study
Tencent, one of the world’s largest technology companies and a leader in gaming, needed a better way to support large-scale offline training and feature extraction for game AI. Its workloads ran across thousands of containers, with tens of thousands of processes reading large, version-specific game dependency files in parallel. This put heavy metadata pressure on CephFS and caused high latency, which hurt both performance and job reliability. Tencent turned to Alluxio to improve data locality and accelerate access without changing the POSIX-based workflow used by its AI applications.
Using Alluxio, Tencent built a 1,000-node cache layer on top of CephFS, with Alluxio FUSE sidecars, high-availability (HA) masters, co-located workers, and preloading of hot data through distributedLoad. Tencent also tuned the deployment by disabling passive caching and audit logging, optimizing JVM settings, and adding dynamic configuration and observability improvements. As a result, Tencent sharply reduced metadata pressure on the under storage, cut job failure rates from 2.8% to 0.73% in benchmark testing, and supported 4,000 concurrent CPU cores for game AI feature extraction with stable performance.
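To put the reported failure-rate figures in perspective, the short Python sketch below works out the relative improvement implied by the drop from 2.8% to 0.73%. Only the two percentages come from the case study; the function names are illustrative, and the derived ratios are simple arithmetic on those two numbers rather than additional results reported by Tencent.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


def improvement_factor(before: float, after: float) -> float:
    """Return how many times smaller `after` is compared with `before`."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


# Job failure rates from the benchmark, in percent (from the case study).
FAILURE_RATE_BEFORE = 2.8
FAILURE_RATE_AFTER = 0.73

reduction = relative_reduction(FAILURE_RATE_BEFORE, FAILURE_RATE_AFTER)
factor = improvement_factor(FAILURE_RATE_BEFORE, FAILURE_RATE_AFTER)

print(f"Relative reduction in failures: {reduction:.1%}")
print(f"Failures are about {factor:.1f}x less frequent")
```

In other words, the benchmark figures correspond to roughly a 74% relative reduction in failed jobs, or a little under one quarter of the original failure rate.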