The Big Iron evolution continues. IBM has rolled out the latest iteration of its mainframe, replete with AI technology designed to take data-intensive application support well into the future.
At the heart of the new z17 mainframe, available in June, is the 5.5 GHz IBM Telum II processor, which includes a built-in AI accelerator that IBM says will let customers run more than 450 billion inferencing operations in a day with one millisecond response time. The processor supports eight CPU cores per chip, 32 cores per system, and 36MB L2 cache memory, and it can run 24 trillion operations per second – a 40% increase in system throughput and fourfold reduction in overall latency compared to the existing Telum, IBM stated.
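To put the headline figure in per-second terms, here is a rough back-of-the-envelope sketch. It assumes the 450 billion daily inferences are spread evenly across 24 hours and that every request takes the full one millisecond; both are simplifications for illustration, not IBM's stated methodology. The in-flight estimate uses Little's law (average concurrency = arrival rate × latency).

```python
# Back-of-the-envelope conversion of IBM's stated daily inference figure.
# Assumption (not from IBM): load is perfectly uniform over the day.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

inferences_per_day = 450e9       # "more than 450 billion" per IBM
latency_seconds = 0.001          # one millisecond response time per IBM

# Average sustained rate needed to reach the daily total.
inferences_per_second = inferences_per_day / SECONDS_PER_DAY

# Little's law: average requests in flight = rate * latency.
requests_in_flight = inferences_per_second * latency_seconds

print(f"{inferences_per_second:,.0f} inferences per second")
print(f"{requests_in_flight:,.0f} requests in flight on average")
```

Under those assumptions the daily total works out to roughly 5.2 million inferences per second, with about 5,200 requests in flight at any moment.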
In addition, a 32-core AI accelerator called Spyre will be available in the fourth quarter as an optional PCIe card, and additional cards can be added depending on requirements. The Spyre accelerator is designed to handle emerging AI workloads such as generative and agentic AI.
“While AI models have mostly gotten larger over the past decade, the world is now also moving toward smaller, fit-for-purpose models. At the same time, the industry is seeing a rise in mixture of expert models and state space models, whose ideal uses and full capabilities are still being explored. Spyre has these capabilities baked in,” IBM stated. AI use cases are growing, says IBM, which counts more than 250 for IBM Z including financial fraud detection, medical image analysis, and credit risk scoring.
[Image: IBM Telum II processor with on-chip AI acceleration. Credit: IBM]
“Our customers who have high volume transactional workloads were very interested in being able, in real time… to score their transactions for fraud, for example, whether those were debit card transactions or credit card transactions or core payments. They wanted to be able to embed AI in each transaction without slowing down those transactions,” said Elpida Tzortzatos, IBM Fellow and CTO of AI on IBM Z. “So what that translated into, from an AI infrastructure perspective, was having the ability to have hardware acceleration that can deliver, in the single-digit millisecond response times, a very high throughput.”
IBM has seen customers struggle to easily integrate AI into their existing environments, Tzortzatos said. “So we made sure that we not only delivered hardware acceleration, but also we built a very robust AI ecosystem on top of that hardware acceleration to help our clients really embed AI into their existing workloads and applications.”
Both predictive AI and generative AI are going to play a critical role in enterprise use cases and the type of AI models clients use, Tzortzatos said. Predictive AI models will continue to be the best fit for implementing use cases such as demand forecasting and anti-money laundering and fraud detection.
“Gen AI opens up the apertures for a whole set of new use cases around providing assistance, around being able to do document summarization, around being able to extract key insights of unstructured data,” Tzortzatos said.
Industry analysts weigh in on z17
Tech industry analysts say the z17’s ability to handle very high-volume workloads, such as AI inferencing, highly specific AI applications, and traditional transactional workloads, will allow the new Big Iron to play an important role in enterprise computing.
“This is cutting-edge server technology, kind of at its best, and I hope they get the credit for it,” said Steven Dickens, CEO and principal analyst with HyperFRAME Research. “At 5.5GHz, when the rest of the industry is around 3GHz, combined with huge cache, it’s just an absolute beast of a machine, obviously specifically designed for the types of AI or heavy transactional applications and workloads that need it.”
“I think obvious use cases for AI, given the transactional nature of the workloads on the platform, [include] fraud management, for example, and being able to run IBM Granite AI, small language models in transactions. [That] is the interesting story, and that’s going to unlock applications at some of the biggest banks, telcos, retailers, government departments,” Dickens said.
The new system will be a draw for some specific AI use cases, notes Patrick Moorhead, founder, CEO and chief analyst with Moor Insights & Strategy.
“For the kind of customers that Z attracts – the banks, governments, and manufacturers – the AI will become important. Today, a lot of AI is offloaded off the mainframe, which is slow, costly and adds security risks. Applying AI at the point of data origin just makes sense,” Moorhead said.
“Many IBM customers are already doing this, but off the mainframe, which, for the reasons stated, aren’t optimal. I’m not saying all AI inference should be run on the mainframe, but very specific AI-enhanced use cases like fraud detection,” Moorhead said.
[Image: IBM engineer in Poughkeepsie, N.Y., tests components on the new z17 mainframe. Credit: IBM]
z/OS 3.2 preview and watsonx code assistance
In addition to the hardware, IBM previewed z/OS 3.2, the next version of its flagship IBM Z operating system, expected in the third quarter. z/OS 3.2 is planned to provide support for the hardware-accelerated AI capabilities delivered with IBM z17’s full stack optimization across the Telum II Data Processing Unit (DPU), Artificial Intelligence Unit (AIU), and IBM Spyre AI accelerator, IBM stated.
This next release will offer more support for industry-standard technologies, languages, and application workloads so clients can grow and enhance their mission critical, core business applications while still retaining the cyber-resiliency, data locality, and the unique hardware benefits of IBM Z, the vendor says. Improvements are planned to enhance out-of-the-box support for Linux and z/OS container-based applications, as well as IBM Open Enterprise SDK for Python and hybrid cloud data processing, according to IBM.
IBM will also add a new version of its watsonx Code Assistant for Z to help developers modernize mainframe applications. Watsonx is IBM’s AI development studio and platform. New enhancements will include chat-style explanations, the ability to improve code understanding and business agility for PL/I applications, and AI code optimization support for COBOL to improve application performance, IBM stated.
Other notable new features of the z17 package include:
- A new mainframe observability package called IBM Z Operations Unite, which from a single interface will let customers collect events and metrics from various IBM Z data sources to provide a complete view of the infrastructure and more easily isolate and diagnose operational issues. According to IBM, the package, which reports data in the OpenTelemetry standard format, is designed to shorten the time needed to detect anomalies and promises to reduce alert investigations.
- New capabilities from IBM’s recent purchase of HashiCorp will help standardize secrets management across hybrid cloud, IBM stated. The features will be part of IBM Vault, which offers identity-based security to manage access to secrets and help protect sensitive data.
- Tools to discover and classify sensitive data on the Z platform. When available, this capability will tap into Telum II for natural language processing and other newly created AI techniques so crown jewel data can be identified and protected before it is used in the AI data pipeline, IBM stated.