Cloud Native Computing Foundation
A Cloud Native Computing Foundation Case Study
IBM faced challenges managing machine learning model training across its private, public, and hybrid cloud environments. Its existing infrastructure, originally designed for the public cloud, incurred high operational costs and could not easily be optimized to reduce training times beyond the improvements already achieved. IBM needed a highly portable, efficient, and maintainable solution for its watsonx services. The Cloud Native Computing Foundation (CNCF) offered one through its Knative Eventing project.
IBM implemented a new ML training infrastructure using Knative Eventing with a Kafka broker. This CNCF technology provided a serverless setup that simplified the architecture, eliminated more than 40,000 lines of code, and cut average training time by 60%, from 50-90 seconds to 15-35 seconds. The migration improved the user experience, delivered significant cost benefits, and enabled IBM to handle millions of training jobs per month on its public cloud.
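For readers unfamiliar with Knative Eventing, a broker accepts events in the CloudEvents format over HTTP. The sketch below shows roughly what a "start training" event could look like in CloudEvents HTTP binary content mode. It is illustrative only and not taken from IBM's implementation: the function `build_training_event`, the event type `com.example.training.requested`, the source `/example/training-api`, and the payload fields are all hypothetical.

```python
import json
import uuid
from typing import Dict, Tuple


def build_training_event(job_id: str, model_name: str) -> Tuple[Dict[str, str], bytes]:
    """Build the HTTP headers and body for a CloudEvent (binary content mode).

    The ce-* header names follow the CloudEvents HTTP protocol binding.
    The event type, source, and payload fields are hypothetical examples.
    """
    headers = {
        "ce-specversion": "1.0",                       # CloudEvents spec version
        "ce-id": str(uuid.uuid4()),                    # unique ID for this event
        "ce-type": "com.example.training.requested",   # hypothetical event type
        "ce-source": "/example/training-api",          # hypothetical event source
        "content-type": "application/json",
    }
    body = json.dumps({"job_id": job_id, "model": model_name}).encode("utf-8")
    return headers, body


headers, body = build_training_event("job-123", "demo-model")
```

In a setup like the one described, a producer would POST these headers and this body to the broker's ingress URL, and the Kafka-backed broker would deliver the event to whichever subscribers match its type through triggers.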