Case Study: Barclays achieves Spark job acceleration from hours to seconds with Alluxio

An Alluxio Case Study

Preview of the Barclays Case Study

Making the Impossible Possible with Alluxio: Accelerate Spark Jobs from Hours to Seconds

Barclays needed a faster, more flexible way to work with large datasets in Spark. Their existing process loaded data from a relational database into Spark for analysis, but Spark’s in-memory cache did not survive job restarts. Each restart forced a reload that could take half an hour or more, which slowed iterative data science work.

Barclays implemented Alluxio as an in-memory storage layer integrated with Spark, using it to keep raw and processed data available across iterations without reloading from the RDBMS. With Alluxio, the team could reuse data for ETL, model training, and evaluation, cutting workflow iteration time from hours to seconds and dramatically reducing waiting time, network traffic, and database load.
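
The reuse pattern described above can be sketched as follows. This is a minimal illustration, not Barclays' actual code: the Alluxio master host, the directory, the dataset name, and the helper functions are all hypothetical, and it assumes a Spark deployment with the Alluxio client on its classpath so that `alluxio://` paths resolve. The idea is to read a dataset from Alluxio when it is already there, and to fall back to the slow RDBMS load only on the first run.

```python
# Hypothetical Alluxio location; 19998 is Alluxio's default master RPC port.
ALLUXIO_ROOT = "alluxio://alluxio-master:19998/demo"


def alluxio_path(name: str) -> str:
    """Build the Alluxio URI for a named dataset (hypothetical layout)."""
    return f"{ALLUXIO_ROOT}/{name.strip('/')}"


def read_from_rdbms(spark, jdbc_url: str, table: str):
    """Slow path: load a table from the relational database over JDBC."""
    return (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", table)
        .load()
    )


def load_or_build(spark, name: str, build):
    """Return the dataset stored in Alluxio, building and storing it if absent.

    `build` is a zero-argument callable that returns a DataFrame; it is
    invoked only when the Alluxio copy cannot be read.
    """
    path = alluxio_path(name)
    try:
        return spark.read.parquet(path)
    except Exception:
        # PySpark raises AnalysisException for a missing path; the broad
        # catch keeps this sketch free of a pyspark import.
        build().write.mode("overwrite").parquet(path)
        return spark.read.parquet(path)


# Example use inside a Spark job (assumed names, not from the case study):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   raw = load_or_build(
#       spark, "raw/trades",
#       lambda: read_from_rdbms(spark, "jdbc:postgresql://db-host/bank", "trades"),
#   )
```

Because the data lives in Alluxio rather than in the Spark application's own cache, a restarted job takes the fast read path instead of going back to the database, which is the source of the reduced waiting time and database load the case study reports.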
