Case Study: Pagely achieves fast, cost-effective serverless data lake analytics with AWS Glue

An AWS Glue Case Study

Preview of the Pagely Case Study

How Pagely implemented a serverless data lake in AWS to facilitate customer support analytics

Pagely, an AWS Advanced Technology Partner that provides managed WordPress hosting, needed a faster, more cost‑effective way to analyze application logs for customer support and billing. Its legacy shell‑script reporting and queries against raw JSON were slow and brittle: reports for large customers could take more than 8 hours or hit Athena timeouts. Pagely therefore built a serverless data lake on AWS, using Amazon S3, Amazon Athena, and AWS Glue to make log data queryable and maintainable.

Pagely worked with Beyondsoft and used ConvergDB to generate ETL workflows that run on AWS Glue. The workflows convert compressed JSON into partitioned Parquet files, consolidating 29.5M small files into ~14,000 files, and the jobs are automated via Terraform. The result: medium reports dropped from 91 seconds to 5 seconds (≈18x faster), and the largest customer analysis now completes in 24 seconds instead of hours. The small‑file consolidation cost only $27. With AWS Glue, Pagely reduced query cost and time, and its engineers can now run analytics from lightweight machines while Athena does the heavy lifting.


