IBM announced two new managed services ? Red Hat AI Inference on IBM Cloud and Red Hat OpenShift Virtualization Service on IBM Cloud ? to help enterprises accelerate AI adoption and run security-forward, scalable and predictable virtualization environments.
Red Hat AI Inference on IBM Cloud, with built-in governance controls, is designed to help clients reliably integrate real-time AI inferencing directly into their production workflows across hybrid cloud environments. Red Hat OpenShift Virtualization Service on IBM Cloud provides a managed path to help clients migrate and run virtual machines (VMs) securely and at scale. Red Hat AI Inference on IBM Cloud is delivered as a managed service designed to span developer teams and agents.
It is built to enable organizations to standardize the orchestration, performance and governance of AI models across the enterprise while freeing developers and platform teams to focus on delivering the value-added applications and services their clients need. Red Hat OpenShift Virtualization Service on IBM Cloud is a managed virtualization service that can help enterprises migrate and operate VM-based workloads on Red Hat OpenShift with Kubernetes-based infrastructure, automated lifecycle management and a consistent foundation toward containerization and application modernization. These new services build on IBM's existing managed offerings across Red Hat Enterprise Linux, Red Hat OpenShift, Red Hat Ansible Automation Platform and Red Hat AI.
Red Hat AI Inference on IBM Cloud is an enterprise-ready, fully managed inference service designed to empower clients to run production-grade AI models without the complexity of managing GPUs, infrastructure or AI platforms. It brings together Red Hat AI's high-performance inference engine with IBM Cloud's enterprise-grade capabilities to help enterprises deploy AI models built for consistent performance and predictable cost. Operated and maintained by IBM Cloud, the service demonstrates how Red Hat AI can operate at enterprise scale with the security capabilities, reliability, and performance required for production workloads. IBM Cloud is the only cloud that provides a fully managed Red Hat AI add-on providing access to the full capabilities of Red Hat AI.
Key features include: Production grade performance at enterprise scale: The service is powered by vLLM and Red Hat AI's inference engine, optimized for high throughput and low latency, and designed to enable agents and applications to deliver consistent real-time performance. The model catalog includes Granite 4.0 H Small (IBM), Mistral-Small-3.2-24B-Instruct, Llama 3.3 70B Instruct, GPT-OSS-120B, and Nemotron-3-Nano-30B-FP8 with more open models and custom models planned starting in May 2026. Accelerated time to production: Designed to allow developers to integrate quickly using familiar OpenAI compatible APIs and without the need to manage GPUs or tuning runtimes, helping accelerate time-to-value.
Built-in security capabilities and governance: Integration with IBM Cloud IAM, audit logging, privacy controls, and SLA backed reliability is designed to give enterprises full visibility and governance over model use and to support mission critical applications. Models-as-a-Service: Red Hat AI Inference enables organizations to set up AI models as API-accessible, shared resources, promoting rapid AI adoption while reducing infrastructure burden. Red Hat AI Inference on IBM Cloud will be generally available on May 22, 2026.
Red Hat OpenShift Virtualization Service on IBM Cloud is in limited availability and is expected to be generally available in June 2026. Product features and timelines are subject to change at IBM's sole discretion and may not be available in all countries. Nothing in this release creates any warranties or alters applicable license terms. Statements regarding IBM's future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.


















