IBM is equipping its upcoming z and LinuxONE mainframes with the next-generation Telum II processor and a new accelerator to boost performance for AI and other data-intensive workloads.
The newly introduced IBM Telum II processor offers more memory and cache capacity than its predecessor, adds a data processing unit (DPU) for I/O acceleration, and improves on-chip AI acceleration.
Made with Samsung’s 5nm technology, the Telum II incorporates eight high-performance cores running at 5.5GHz, according to IBM, and expands on-chip cache capacity by 40%, with the virtual L3 and L4 caches reaching 360MB and 2.88GB, respectively.
“The performance of each accelerator is anticipated to quadruple, achieving 24 trillion operations per second (TOPS). However, TOPS alone do not convey the entire impact,” wrote IBM Fellows Christian Jacobi, CTO of IBM Systems Development, and Elpida Tzortzatos, CTO of z/OS and AI on IBM Z and LinuxONE, in a blog post discussing the new processor.
“The effectiveness of an AI accelerator greatly depends on both its architectural design and the AI ecosystem optimization it supports. A tailored architecture is crucial for AI acceleration within production enterprise environments. Telum II is crafted to allow model runtimes to operate in conjunction with demanding enterprise tasks, ensuring high throughput and low-latency inferencing.”
In its top configuration, a future IBM Z system could include as many as 32 Telum II processors and 12 I/O cages, with each cage holding 16 PCIe slots, for support of up to 192 PCIe cards. Enhanced custom I/O protocols are set to improve availability, error detection, and virtualization, meeting extensive bandwidth requirements while offering redundancy and multi-pathing to protect against multiple simultaneous failures.
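The maximum card count follows directly from the cage and slot figures IBM cites; as a quick sanity check, here is the back-of-the-envelope arithmetic using only values from the announcement:

```python
# Sanity check on the I/O capacity figures IBM cites for a fully
# configured next-generation IBM Z system.
io_cages = 12            # maximum I/O cages per system
pcie_slots_per_cage = 16  # PCIe slots per cage

max_pcie_cards = io_cages * pcie_slots_per_cage
print(max_pcie_cards)  # 192, matching IBM's stated maximum
```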
“Enhancements in computing primitives are implemented to augment support for sizable language models within the accelerator. These advancements are intended to handle a wider variety of AI models, thus facilitating a more exhaustive analysis of both structured and textual data,” said Jacobi and Tzortzatos.
IBM introduced its first Telum processor in 2021, featuring an on-chip AI accelerator tailored for inferencing. With Telum II, IBM has notably advanced those AI capabilities: in addition to an upgraded AI accelerator, the chip now includes a specialized DPU for I/O acceleration, which streamlines system management and boosts the performance of key components, according to IBM.
From a networking and I/O perspective, one benefit of this approach is the move from a two-port Fibre Connection (FICON) card to a four-port card, along with system-level consolidation of the Open Systems Adapter (OSA) Express – the mainframe’s package for networking over a variety of protocols – and the RDMA over Converged Ethernet (RoCE) Express offerings, according to Michael Becht, chief engineer and architect for IBM Z I/O channels, and Susan M. Eickhoff, director of IBM Z processor development.
“This change, available beginning with the next-generation IBM Z in the first half of 2025, will allow clients to maintain the same I/O configuration in a smaller footprint, to reduce data center floorspace as they upgrade and modernize their infrastructure,” Becht and Eickhoff wrote in a blog.
A complement to the Telum II processor is the new Spyre Accelerator, which provides additional AI compute capabilities.
The Spyre Accelerator will contain 1TB of memory and 32 AI accelerator cores that will share a similar architecture to the AI accelerator integrated into the Telum II chip, according to Jacobi and Tzortzatos: “Multiple IBM Spyre Accelerators can be connected into the I/O Subsystem of IBM Z via PCIe. Combining these two technologies can result in a substantial increase in the amount of available acceleration.”
Taken together, the IBM Telum II and the Spyre Accelerator mark a significant milestone in mainframe technology, as pointed out by Steven Dickens, chief technology advisor at The Futurum Group.
“The introduction of such advanced chip and AI technology by IBM for use in mainframes is incredibly significant and groundbreaking for enterprise customers,” stated Dickens.
The new advancements in processor technology primarily target AI development and complex workloads, but they also improve performance and energy efficiency for other transaction-heavy applications, explained Tina Tarquinio, vice president of product management for IBM Z and LinuxONE.
“The Spyre Accelerator covers a wide range of business applications,” Tarquinio said. “For instance, IBM leverages it to enhance and support AI-driven tasks within our internal HR operations. The upcoming generation of IBM Z is set to continue leading in system resilience, featuring eight nines of availability and security that is uniquely quantum-safe.”
Analysts believe the latest Telum II-based mainframes will significantly influence not only enterprise AI development but also the efficiency of other operations, including database management and distributed or hybrid cloud services.
“These servers are designed to be incredibly powerful in handling input/output operations,” Dickens explained. “This means that they can run large databases like Oracle or MongoDB, or other critical applications much more effectively.”
The new approach allows customers to offload transactional workloads from the main CPU to an accelerator. This facilitates further evaluation and processing for machine learning, AI, and generative AI tasks, thereby optimizing operational scalability, according to Dickens.
“The scalability of this AI-enhanced mainframe platform, which includes chips, cards, and software, proves valuable for a variety of applications. These range from credit rating assessments to fraud detection, compliance, financial transactions, and the processing and simulation of documents,” Patrick Moorhead, founder, CEO, and chief analyst at Moor Insights & Strategy commented.
“Enterprises with mainframes typically use them for crucial applications that demand top-tier resilience and security. Traditionally, AI processes involved transferring data from the mainframe to a GPU server for processing and then returning it, which was neither efficient nor secure, particularly for operations like credit scoring, fraud detection, and compliance,” explained Moorhead.
IBM’s Jacobi emphasized the advantages of enhanced AI support for code security and compliance.
“Clients managing massive amounts of code, from tens to hundreds of millions of lines, are deeply concerned about security. This code represents complex business processes essential for running organizations such as insurance companies and banks, constituting valuable intellectual property,” Jacobi noted.
“Enabling AI analysis directly on the mainframe allows businesses to maintain that analysis within a secure environment, which is preferable to external processing,” added Jacobi.
“With the Spyre, we can cluster up to eight cards together to get to the memory size and compute capacity to run generative workloads on that code. And we’ll be integrating that with our higher-level stack products like Watson Code Assistant for Z with optimized models that are trained and tuned to have the knowledge that is necessary to do kind of mainframe code refactoring and mainframe code explanation,” Jacobi said.
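Combining the per-card figures Jacobi and Tzortzatos cite (1TB of memory and 32 AI accelerator cores per Spyre card) with the eight-card cluster Jacobi describes gives the raw aggregate capacity below; note this is simple arithmetic from the stated numbers, and how much of it a single generative workload can actually use is not specified:

```python
# Raw aggregate capacity of an eight-card Spyre cluster, computed only
# from the per-card figures in IBM's announcement (1TB memory, 32 cores).
cards_in_cluster = 8
memory_per_card_tb = 1
cores_per_card = 32

total_memory_tb = cards_in_cluster * memory_per_card_tb
total_cores = cards_in_cluster * cores_per_card
print(total_memory_tb, total_cores)  # 8 TB of memory, 256 accelerator cores
```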
The Telum II processor will be the central processor powering IBM’s next-generation IBM Z and IBM LinuxONE platforms and is expected to be available in 2025, according to IBM. The Spyre Accelerator, currently in tech preview, is also expected to be available in 2025.