By David Linthicum, Founder, Linthicum Research; Enterprise Technology Analyst, SiliconANGLE & theCUBE

On-prem or public cloud? These 12 considerations can help you optimize your deployment strategy for long-term AI success.
Enterprises looking to move to AI have a few core questions to answer: Should our AI run on public cloud providers or on-premise? What about co-location and managed services providers?
Many consider public clouds the ideal IT platforms, yet the cloud operating costs for many enterprises are 2 to 3 times what was expected. It’s no wonder that IT managers are gun-shy about adding AI infrastructure to their list of public cloud services.
Here’s some good news. Traditionally owned hardware systems can be less expensive than the cloud when projecting costs over a five-to-ten-year horizon. IT projects that once went straight to the cloud now get their numbers run against more traditional platforms, such as mainframes and other advanced on-premise platforms.
Moreover, there is little difference between the AI ecosystems offered by public cloud providers and those offered by traditional enterprise hardware players. In other words, the comparisons are more apples-to-apples than ever before. Looking beyond the hype and the billion-dollar cloud computing marketing budgets, what are the larger opportunities for enterprises to accelerate their AI deployment for strategic purposes? Let’s explore some of the core considerations.
The cloud is no longer the go-to solution for new system deployments. Current data shows an even 50-50 split between public cloud and on-premise/edge infrastructure deployments for AI workloads.1
Many factors affect the business value and the total cost of ownership (TCO) for public cloud-based AI solutions. Most enterprises found that the TCO of their cloud systems, AI and non-AI, over a five-to-seven-year horizon was about 2.5 times higher than initially expected when they adopted cloud computing. This disparity and its root causes are explained in “An Insider’s Guide to Cloud Computing,” published in 2023. It pays to thoroughly understand public cloud-based solutions for AI platforms before committing to one.
The challenges of adopting a public cloud AI infrastructure include:
Hosting sensitive data in the cloud can raise privacy and compliance concerns. Most countries, along with many US states, have adopted privacy regulations that mandate where and how data is stored, and cloud-based data breaches are becoming more common.
For example, the Snowflake data breach affected over 100 of its corporate customers,2 including major companies such as Ticketmaster, Santander Bank, Pure Storage, Neiman Marcus Group and Advance Auto Parts. While breaches also occur within on-premise deployments, they are often easier to contain and have a much smaller “blast radius.”
Expenses can become unpredictable. Initial public cloud AI infrastructure costs are low, but spending can become unpredictable with increased usage or premium services, including AI services. The average enterprise sees its AWS cloud costs fluctuate by 20% to 35% month over month,3 driven primarily by variable workload demands and elastic resource usage.
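To see what that volatility means for budgeting, here is a minimal sketch. The $10,000 baseline is a hypothetical placeholder of my own; only the 20-35% swing range comes from the cited figure.

```python
# Translate a month-over-month cost swing into a budget band.
# The $10,000 baseline is a hypothetical placeholder; the 20-35%
# swing range is the figure cited in the article.
base = 10_000.0

for swing in (0.20, 0.35):
    lo, hi = base * (1 - swing), base * (1 + swing)
    print(f"{swing:.0%} swing: next month's bill falls between "
          f"${lo:,.0f} and ${hi:,.0f}")
```

Even at the low end of the cited range, a team budgeting $10,000 a month has to plan for anywhere from $8,000 to $12,000, which is why cloud cost forecasting is its own discipline.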
Dependence on a single provider may limit flexibility and complicate transitions to other services. Netflix’s transition from a monolithic architecture to a microservices-based system on AWS4 demonstrates one of the largest known cases of potential cloud vendor lock-in, as their entire streaming service now runs on AWS infrastructure with annual cloud costs exceeding $1 billion. This makes any future migration to another cloud provider highly complex and costly.
Of course, cloud computing would not be as widely adopted without upsides. A few that are often cited as reasons to use public cloud providers for AI include:
Complete AI ecosystem on demand. There is no need to piece together AI training, inference, development and deployment systems; they are provided as managed services, an arrangement often called the “AI Easy Button.”
The ability to scale at will. While public cloud often costs more than traditional on-prem AI solutions, it does provide the ability to auto-provision resources, such as GPUs and storage, on demand.
1Dave Vellante, “Cloud vs. on-premises showdown: The future battlefield for generative AI dominance,” SiliconANGLE.com, August 2023.
2Aaron Drapkin, “Data Breaches That Have Happened in 2022, 2023, 2024, and 2025 So Far,” Tech.co, January 2025.
3Bernard Marr, “The 10 Biggest Cloud Computing Trends In 2024 Everyone Must Be Ready For Now,” Forbes.com, October 2023.
4“The Story of Netflix and Microservices,” GeeksforGeeks.com, May 2024.
Those who promote the use of traditional on-premise AI systems will quickly point to these key advantages:
On-premise platforms provide greater governance over data. This is crucial for industries that must follow stringent compliance rules and regulations. Take the European Union, for instance. The EU’s digital sovereignty is protected through four major regulations: GDPR for data privacy rights and consent, DSA for online platform accountability, DMA for fair digital market competition and the AI Act for artificial intelligence governance and risk management. Healthcare and finance are also prime examples, given the stringent and still-emerging privacy laws that govern them.
Ongoing costs remain predictable. On-premise systems offer more predictability than public cloud systems because organizations have complete control over their infrastructure, costs and resource allocations. This results in fixed capital expenses rather than variable usage-based pricing, dedicated hardware performance without multi-tenant variability and direct management of compliance and security measures without depending on third-party providers. Although initial investments are high, the predictable ongoing costs avoid the variable and often hard-to-anticipate expenses incurred with cloud services.
On-premise systems can offer superior performance with lower latency when tailored for specific workloads. An on-premise system does not need to transmit requests and responses over a public or private WAN, and the absence of multi-tenant processing and storage management means direct access to system resources such as processors, memory and I/O.
A higher initial expenditure on hardware and maintenance can be a barrier to adoption. The initial investment for on-premise systems includes hardware, software licenses, installation and ongoing maintenance costs. For example, each server can cost around $6,000, with additional expenses for server refresh cycles, power, IT support and backup software, leading to a total capital expenditure (CapEx) of approximately $92,000,5 plus operating expenses. And while GPU shortages come and go in the news, enterprises cannot yet assume that the number of GPUs available will always meet demand.
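To make that trade-off concrete, a back-of-the-envelope projection like the one below can be run before committing either way. The $92,000 CapEx figure is the cited estimate; the annual opex, cloud monthly rate and growth rate are placeholders of my own, not vendor quotes.

```python
# Back-of-the-envelope on-prem vs. cloud cost projection.
# The $92,000 CapEx comes from the cited estimate; all other
# figures are illustrative placeholders, not vendor quotes.

def on_prem_tco(capex: float, annual_opex: float, years: int) -> float:
    """Fixed capital outlay plus predictable yearly operating expense."""
    return capex + annual_opex * years

def cloud_tco(monthly_spend: float, annual_growth: float, years: int) -> float:
    """Usage-based spend that compounds as AI workloads grow."""
    total, spend = 0.0, monthly_spend
    for _ in range(years):
        total += spend * 12
        spend *= 1 + annual_growth  # variable costs tend to creep upward
    return total

years = 5
print(f"on-prem : ${on_prem_tco(92_000, 15_000, years):,.0f}")
print(f"cloud   : ${cloud_tco(4_000, 0.20, years):,.0f}")
```

The point of the sketch is not the specific numbers but the shape of the curves: on-prem costs are front-loaded and flat, while cloud costs start small and compound with workload growth.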
Data center power and cooling requirements for GPUs are significant. It helps to factor GPU availability into cost comparisons, along with the cost of adding or removing capacity, operations, physical security and business continuity and disaster recovery processes and systems.
5Dmytro Sosnovyk, “Cloud vs On Premise Cost Comparison: A Comprehensive Guide [2025],” s-pro.io, May 2024.
6Laura DiDio, “IBM z16 and Power10 Deliver Highest Reliability Among Mainstream Servers for 15th Consecutive Year,” TechChannel.com, July 2023.
7Alejandro J. Calderón et al., “GMAI: Understanding and Exploiting the Internals of GPU Resource Allocation in Critical Systems,” ACM Digital Library, September 2020.
8Luboslava Uram, “How Do Mainframes Fit In The Cloud Era: A Challenge Or An Opportunity?” Forbes.com, July 2024.
When looking specifically at the mainframe as an AI platform option on-premise, here are several items to consider:
Mainframes offer robust and dependable operations, minimizing downtime. According to recent studies and surveys, mainframes demonstrate exceptional uptime statistics: The IBM z16 mainframe delivers nine nines (99.9999999%) of uptime.6 This translates to just over 30 milliseconds of annual downtime per server, while IBM Power10 servers achieve eight nines (99.999999%) of uptime.
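As a sanity check on those figures, annual downtime can be derived directly from an availability percentage. The short sketch below is my own illustration of that arithmetic, not from the cited study:

```python
# Convert an availability percentage into expected annual downtime.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # ~31,557,600 seconds

def annual_downtime_seconds(availability_pct: float) -> float:
    """Expected downtime per year for a given availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_pct / 100)

# Nine nines (the z16 claim): roughly 32 ms of downtime per year
print(f"{annual_downtime_seconds(99.9999999) * 1000:.1f} ms")
# Eight nines (the Power10 claim): roughly a third of a second per year
print(f"{annual_downtime_seconds(99.999999):.2f} s")
```

Running the numbers confirms the article's claim: nine nines works out to about 31.6 milliseconds of downtime per year.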
Mainframes do not require extraordinary power or cooling systems, streamlining extensive AI workload management. With AI-focused data centers now demanding gigawatts of power, mainframes remain reasonable in their power consumption and thus, generally speaking, have a reduced carbon footprint.
Mainframes provide direct access to computational resources, bypassing queues for GPU allocations. This direct access ensures consistent performance and immediate resource availability for critical workloads7, contrasting with the shared-resource model of cloud platforms where users might face delays during peak demand periods.
Mainframes prove more cost-effective for AI systems due to their predictable fixed costs8 versus the cloud’s variable usage-based pricing. Using dedicated hardware and direct access to computational resources, mainframes eliminate the expensive data transfer fees and GPU queue wait times standard in cloud environments. Additionally, companies avoid the compounding costs of cloud services that can escalate unexpectedly with increased AI model training and inference demands.
I do not promote one platform over another; each has advantages and disadvantages that determine its best fit. This article aims to open up discussions about selecting the right platform for AI, considering all requirements such as data privacy, performance and, most importantly, cost.
Mainframes are usually underrated when considering cost-effective and highly performing AI systems. Many enterprises bypass this consideration altogether, even though these resources already exist within most enterprises. This underestimation stems from outdated perceptions of mainframes and their value, and from IT organizations that default to public cloud providers without thorough analysis.
It’s time for enterprises to consider options for on-site AI infrastructures along with public AI platforms. An accurate evaluation requires a thorough understanding of your AI requirements and the creation of an architecture that will bring the most long-term and short-term value back to the business.