Alluxio
20 Case Studies
An Alluxio Case Study
Amazon Web Services (AWS), a major cloud services provider, sought to improve the performance and manageability of analytics and AI workloads running on its Amazon EMR service, particularly for data stored in Amazon S3. The challenge was twofold: removing the complexity of running HDFS in the cloud, and enabling efficient data access in hybrid environments where the data remains on-premises. Alluxio's data orchestration platform was used to address both.
Alluxio's solution provided a tiered caching layer that co-locates data with compute on Amazon EMR and synchronizes automatically with S3. This removed the need to run a complex HDFS layer and let analytics frameworks such as Spark, Presto, and Hive access data at memory speed. As a result, Alluxio helped customers achieve 5-10x faster Hive and Spark queries on data in S3 and enabled them to burst compute workloads to the cloud without first copying on-premises data.
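
The idea of a tiered cache sitting between compute and an object store can be illustrated with a small sketch. The Python below is a toy model written for this summary, not Alluxio's implementation or API: the class name TieredReadCache, the two tiers, the capacities, and the dictionary standing in for S3 are all hypothetical. It shows only the general read-through pattern, where repeated reads are served from a fast local tier and the backing store is contacted only on a miss.

```python
from collections import OrderedDict


class TieredReadCache:
    """Toy read-through cache with a small fast tier and a larger slow tier.

    Hypothetical illustration only; it is not Alluxio's implementation or API.
    """

    def __init__(self, under_store, mem_capacity=2, ssd_capacity=4):
        self.under_store = under_store   # stands in for S3: key -> bytes
        self.mem = OrderedDict()         # fast tier, kept in LRU order
        self.ssd = OrderedDict()         # slower tier, kept in LRU order
        self.mem_capacity = mem_capacity
        self.ssd_capacity = ssd_capacity
        self.under_store_reads = 0       # counts trips to the backing store

    def _put_mem(self, key, value):
        self.mem[key] = value
        self.mem.move_to_end(key)
        if len(self.mem) > self.mem_capacity:
            # Demote the least recently used entry to the slower tier.
            old_key, old_value = self.mem.popitem(last=False)
            self.ssd[old_key] = old_value
            self.ssd.move_to_end(old_key)
            if len(self.ssd) > self.ssd_capacity:
                # Evict from the cache; the data still lives in the under store.
                self.ssd.popitem(last=False)

    def read(self, key):
        if key in self.mem:
            self.mem.move_to_end(key)
            return self.mem[key]
        if key in self.ssd:
            value = self.ssd.pop(key)    # promote back to the fast tier
        else:
            self.under_store_reads += 1  # cache miss: fetch from the backing store
            value = self.under_store[key]
        self._put_mem(key, value)
        return value


if __name__ == "__main__":
    s3_stand_in = {"a": b"1", "b": b"2", "c": b"3"}
    cache = TieredReadCache(s3_stand_in)
    for key in ("a", "b", "c", "a"):
        cache.read(key)
    print("under-store reads:", cache.under_store_reads)
```

In this sketch the fourth read is served from the slower local tier rather than the backing store, which is the effect the case study attributes to co-locating cached data with compute.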