ABBYY
An ABBYY Case Study
SEMA Group, an Australian IT services company specializing in document‑centric solutions, partnered with the Department of Parliamentary Services to digitize all pre‑1980 original Australian Hansard transcripts: hundreds of volumes dating back to 1901. The main challenges were fragile, discolored paper; variable print quality; typefaces and layouts that changed over the decades; and the complex sequencing between the integrated volumes published before 1953 and the separate series published after it.
SEMA implemented an end‑to‑end workflow that combined Kodak scanners, dual OCR engines (ABBYY and RecoStar), image cleanup, validation, metadata tagging, and generation of PDF/A and XML output for ingestion into the ParlInfo search system. As a result, all pre‑1980 Hansard originals were digitized, classified and validated, and the complete collection is now searchable and available to the public via the ParlInfo website.
Tony Smith
Software Product Manager