MirrorWeb
A MirrorWeb Case Study
The National Archives faced a rapidly growing and increasingly complex UK Government Web Archive (UKGWA) and needed to modernise how web and social media content was captured, stored and made searchable. After a procurement process, it chose MirrorWeb, and its UKGWA service on AWS, for the company's cloud expertise and social media archiving capabilities, with the aim of delivering a reliable, comprehensive public search and replay service.
MirrorWeb migrated the legacy archive to AWS in two weeks, using Snowball devices and custom ingest hardware. It then built a new public site and a full-text, faceted search stack using Elasticsearch and WarpPipe, its cloud data pipeline. MirrorWeb indexed the entire collection at scale, spinning up a cluster of more than 1,000 nodes to process some 120 TB and index 14 billion documents in about 10 hours. The project also enabled near-real-time capture of hundreds of social media accounts, improved deduplication and search accuracy, and delivered fast replay and high-traffic capacity for the UKGWA.
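To put the indexing figures in perspective, the short Python sketch below works out the throughput they imply. It is a back-of-the-envelope calculation only, not a description of MirrorWeb's pipeline: it assumes decimal terabytes, exactly 1,000 nodes, exactly 10 hours, and every node busy for the whole run, whereas the source gives these as approximate ("1000+", "about 10 hours"). The function name `implied_throughput` is hypothetical.

```python
def implied_throughput(total_bytes, total_docs, hours, nodes):
    """Derive rough average rates from the headline figures.

    All inputs are assumptions taken from rounded numbers in the case study.
    """
    seconds = hours * 3600
    docs_per_sec = total_docs / seconds
    bytes_per_sec = total_bytes / seconds
    return {
        "docs_per_sec": docs_per_sec,
        "docs_per_sec_per_node": docs_per_sec / nodes,
        "bytes_per_sec": bytes_per_sec,
        "bytes_per_sec_per_node": bytes_per_sec / nodes,
        "avg_doc_bytes": total_bytes / total_docs,
    }


# Assumed inputs: 120 TB (decimal), 14 billion documents, 10 hours, 1,000 nodes.
stats = implied_throughput(
    total_bytes=120e12, total_docs=14e9, hours=10, nodes=1000
)

print(f"Cluster: {stats['docs_per_sec']:,.0f} docs/s, "
      f"{stats['bytes_per_sec'] / 1e9:.2f} GB/s")
print(f"Per node: {stats['docs_per_sec_per_node']:,.0f} docs/s, "
      f"{stats['bytes_per_sec_per_node'] / 1e6:.2f} MB/s")
print(f"Average document size: {stats['avg_doc_bytes'] / 1e3:.1f} KB")
```

Under these assumptions the cluster averages roughly 390,000 documents per second overall, or a few hundred per node, which suggests the job's speed came from wide parallelism rather than from unusually high per-node rates.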
John Sheridan
Digital Director