- Disclaimers / Fine Print
- The usage of the term “Cult” (go here if you want to skip the fine print)
- You might be in the Cult of the Cloud if …
- Don’t care about the money…until they are forced to
- Some Public Cloud Myths
- Myth: Public Cloud providers are required to build a new global footprint
- Myth: The only alternative to public cloud is to build your own data centers
- Myth: The only way to protect myself against another failure in us-east-1 is to go multi region or multi cloud
- Myth: Public cloud is just a bunch of computers operated by someone else
- Myth: Hyperscale data centers are the same as any other data center
- Myth: You can’t run data centers better than the Hyperscale companies
- Myth: You need more staff to manage on prem environments
- Myth: You only pay for what you use with cloud
- Myth: I want to “burst” my workload to public cloud
- For a good laugh
- Lastly
Revealing the staggering level of (often times wilful) ignorance regarding hyperscale public cloud IaaS adoption
(the above line is for LinkedIn previews, as it doesn’t seem to handle the metadata right for some reason)
Before I begin..
I want to say up front that I don’t have any problem with “Cloud” as a concept, or with “Cloud” offerings from service providers, even hyperscale providers. That is not what this site is about.
This site is about the staggering level of (oftentimes wilful) ignorance, grown over the past fifteen years, among the vast majority of the people leveraging these services. It is also about the massive amount of misinformation spread by many sources that look like they otherwise should know what they are talking about.
This is also specifically about IaaS as offered by the hyperscale providers; it has nothing to do with PaaS or SaaS.
This site is also specifically targeting those that have an IaaS bill of at least $50,000 per month (many may be surprised by the insane number of organizations spending that or more with absolutely no idea what is going on; even small companies can be spending ten times that!). If your bill is a few grand a month or less, you may still find this informative, but it’s really not targeted towards your use case.
Best case scenario, I manage to educate some people with insight into how they can save their organizations potentially millions of dollars per year (if not much more), which is not a bad thing to think about in a deteriorating global economy.
Lastly, before we really get going, I wanted to say that not a single word of this website was created, nor inspired, by “AI”. This is entirely me. I have never yet had a need to use any LLM, chat bot, or similar tool to date. I accept that such tools can provide great value to many people in some situations; I just haven’t come across such a situation yet. I spent roughly 40 hours writing, reviewing, and updating this website before launch (which ironically came to about 1 hour per 1,000 words; I’m unsure how that measures up to any standard). I don’t intend it to be much of an ongoing blog, more of a single large brain dump from my perspective of working in IT over the last twenty-five years. Worst case, I can use this website to point people to for more information rather than duplicate my efforts every time I go to write a comment in response to cloud stuff.
Introduction
This is a term I came up with in late 2025, probably around the time I was engaged with folks writing messages about a large AWS outage, and seeing the massive amount of misinformation being distributed as if it were fact. The misinformation wasn’t really new, but the sheer volume of it was rather alarming.
Those of you that have been around the industry a while may recall the hacker group “The Cult of the Dead Cow”. I have no opinion on the group or their actions (I know almost nothing of them other than the fact that they exist). But in case it wasn’t obvious, my “Cult of the Cloud” is named after their group; the phrase just made sense, so I leveraged that brand(?) idea to create this new term.
While I target the hyperscale providers specifically (what I say applies pretty much universally to all of them), there is technically no hard restriction on what this term applies to. Subjective, I know! It can be complicated (more below).
My first exposure to this cult was actually way back in early 2010, which I documented as part of “My cloud journey”; ironically, it wasn’t even hyperscale at that point.
When I say Cloud, I mean IaaS
There are three major types of “cloud” things, for the purposes of this blog I am referring solely to being a customer of IaaS. You can see a bit more on what I am talking about in this dedicated post.
While I may “pick on” AWS, I am using them only as an example (in part because that is where my IaaS experience comes from), not singling them out as the only player. All other hyperscale cloud companies have adopted similar or identical models to AWS, so everything I talk about applies to all of them.
So for the purposes of simplicity, on this website, when I refer to the term Cloud, I usually mean “IaaS Cloud” (most commonly as implemented by the Hyperscale providers). I say usually because I’m not a professional writer (and I am the only one doing this), and I may not be able to catch and clarify every single use of the term. In part that is because “Cult of the IaaS Cloud” is a strange term by contrast, as is “Cult of IaaS” (and I am not going after IaaS as a concept; IaaS is fine as a concept, by itself). I have confidence that most technical people at least will not have a problem following along. If you are not technical, I still hope you can follow along, but if there is something that requires additional clarification, let me know.
I believe hyperscale public cloud usage dwarfs all other forms of public cloud by a large margin, so I believe when most people think of public cloud IaaS they probably think of these providers.
The usage of the term “Cult”
For most of my life, the idea of a cult was always a distant concept. Something I saw in a TV show or movie, or in a news story such as the Waco, Texas siege in 1993.
I see people falling into one of the following buckets (almost like two sides of the same coin): either in this “Cult of the Cloud”, or heavily influenced by the cult.
- When presented with overwhelming evidence (yes, I know that is subjective) as to why they should not use hyperscale IaaS cloud, their responses have no substance. One response may be nothing more than “I don’t believe you”. They have no actual evidence to refute anything, nor do they express any interest in obtaining such evidence to justify their position. They don’t care; you are just wrong (that doesn’t stop them from initially disputing your claims, because if they did not dispute in some form, you would have no idea what their position is). They aren’t even required to respond: they can hear your arguments, say “ok”, and immediately discard them in their mind, without any justification or response back.
- When asked a question regarding hosting, infrastructure, or building applications, their only responses are cloud related. So much so that perhaps they can’t even imagine how or why someone would do anything but cloud. Many more recent entrants into the workforce were raised on nothing but cloud; that’s all they know (I absolutely saw this part coming fifteen years ago). But many folks that have been around a lot longer are also in the cult to some degree (this continues to surprise me even in 2025).
I have worked directly with such people, who seemed perfectly happy with the way things were, then wanted to add in stuff like IaaC, Kubernetes, and other technologies most often associated with public cloud deployments, really for no apparent reason other than “it sounded cool” and was “on trend”. Eventually they went hard core, wanting to do cloud stuff. Some ended up at a company that attempted a data center migration, but from what I was told that migration was something of a failure: they spent several years on it and never managed to fully move to cloud before the org ran out of money and was acquired, and most employees were axed.
(Side note: there really seems to be a strong “Cult of AI” out there in 2025, from what I have read in articles and comments (especially on LinkedIn). I feel bad for those being forced to use the stuff when they aren’t interested, since the benefits seem mixed at best in most situations. Maybe things will cool down once the bubble bursts. I haven’t had any exposure to this situation myself with “AI”. Maybe someday someone will register the “cultof.ai” domain and write about it.)
You might be in the Cult of the Cloud if …
- You say something like
- “Running a global platform requires infrastructure only three companies can provide. That’s not competition. That’s dependency. And dependency just became dangerous.” (LinkedIn post)
- React as if the only solution to preventing impact from another AWS outage in a given region is either to go multi region, and/or multi cloud (MANY LinkedIn posts/comments)
- Public cloud really shines when you are at a large scale (LinkedIn comment)
- The only way I can scale my application to suit my users is “auto scaling” in public cloud (LinkedIn comment)
- “Our company needs to be in the cloud because we want to focus on the business not on managing infrastructure” (couldn’t find the exact quote on LinkedIn.)
- “We have no choice but to use AWS, and that’s a problem” (President of Signal chat platform, link is to news article)
- Really think these things are cool (the more matches, the more likely you may be in the cult) – I don’t believe any of these things are bad (on their own); they all have their use cases. I just see them as indicators at the same time.
- You do something like
- Commit to spending $400 Million per year on Google cloud
- Commit to spending $300 BILLION on Oracle Cloud over a few years
- Anything that is NOT in one of these use case examples which are optimized for IaaS.
- You are
- Broadcom (who seems to be axing all of their non cloud VMware product SKUs, because profit!)
Don’t care about the money…until they are forced to
A sure sign of being in the “Cult of the Cloud” is an organization that invests heavily to move to public cloud, perhaps stays there for a while, then moves out again. You can ask yourself: why did they move to begin with, and why did it sometimes take several years for them to recognize reality and change course?
IMO, there is only one answer. There were people, almost certainly in high positions at the organization, who were (perhaps unknowingly) members of the “Cult of the Cloud”, convinced without a doubt that Cloud is the future and they have to do it. So they spend a lot of money making it happen. They see it’s super expensive, but they keep at it, convinced they can make it work, maybe hiring people that specialize in optimizing cloud spend. They keep at it… until someone else forces their hand on the costs (or perhaps some other reason, but I’d bet 95% of the time it is cost), and they decide to move out.
Some Public Cloud Myths
(Note: this content is duplicated on this new dedicated page as of 12/6/2025. I am retaining this duplication for a few months as I have already sent out many links directly to these topics, and it’s impossible to auto redirect because the links were sent with HTML anchors. At some point, maybe in six months, I will remove the myths from this home page and just direct people to the dedicated page.)
Myth: Public Cloud providers are required to build a new global footprint
It really surprises me how often I see something like this written out, when it’s so easy to disprove with real world examples. If you need a new global footprint spun up in the next 24 hours, then yes, you probably do need a public cloud provider, for that initial deployment anyway. Most serious organizations don’t need to make such decisions on such tight time frames. One organization I was at needed a location spun up in Amsterdam in 2012, for example; we spent a good six months thinking about it before actually pulling the trigger (in that case it was a data center with a tiny footprint, not even one full rack).
Most Global CDNs do not leverage public cloud for their edge services
I can’t say for certain that no independent CDN out there uses some cloud for their edge (though I can’t recall the last time I came across a CDN endpoint with a cloud IP). Obviously I am referring to non hyperscale branded CDNs (which excludes Azure Front Door and Amazon Cloudfront, both operated by said cloud providers). Look at major CDNs like Akamai or Cloudflare, for example: all you need to do is find the IP of one of their endpoints and use the WHOIS infrastructure to see who owns that IP. Even smaller CDNs like Fastly, CacheFly (which I have used for the past year), and Imperva (which I have used for the past 3 years) are similar.
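That WHOIS spot-check is easy to script. Here is a minimal sketch in Python; the hostname in the usage comment is purely illustrative, and the sketch assumes the standard `whois` command-line tool is installed (field names in WHOIS output vary by regional registry, so the filter below is a best-effort guess, not an exhaustive list).

```python
import socket
import subprocess

def resolve(hostname: str) -> str:
    """Return the first IPv4 address the hostname resolves to."""
    return socket.gethostbyname(hostname)

def whois_owner(ip: str) -> str:
    """Shell out to the system `whois` tool and pull the owner-ish
    lines. Field names vary by regional internet registry."""
    out = subprocess.run(["whois", ip], capture_output=True, text=True).stdout
    return "; ".join(
        line.strip() for line in out.splitlines()
        if line.lower().startswith(("orgname:", "netname:", "org-name:"))
    )

# Example usage (needs network access and the whois binary installed);
# the hostname is an illustrative example, not an endorsement:
#   ip = resolve("www.akamai.com")
#   print(ip, whois_owner(ip))
```

If the owner lines come back as the CDN itself (or a transit provider) rather than an AWS, Azure, or GCP netblock, that edge is not running on hyperscale cloud.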
At my previous company, for quite a while we used a CDN named Instart Logic. They were founded in 2010, right during the initial public cloud boom, so they were by no means a major player; they were tiny by comparison. I specifically recall, in a conversation with their senior folks, them mentioning somewhat proudly that they could and did offer pricing significantly less than what Amazon Cloudfront cost. Several years later Instart Logic ran out of money and their assets were acquired by Akamai. In fact, Instart Logic may be the only independent CDN I came across that had one or more public cloud endpoints (in distant countries).
It would not surprise me if, in SOME MARKETS, especially remote markets, CDNs leverage public cloud (assuming cloud providers exist there) in their very early days; if their traffic is very low and they don’t have a lot of customers yet, that would make some sense. Just as it makes sense for public cloud providers to leverage co-location services in those same kinds of markets to spin up their services faster than building (or buying) their own data center in that region.
Why do CDNs build infrastructure themselves? For much the same reasons I do: cost and control. But CDNs have another thing to worry about, which most customers don’t ever think about: network routing. They need multiple internet providers, as well as control of their own BGP routes, to maximize network performance for their customers.
I believe the same holds true for most global DNS providers as well(for the same reasons).
I actually had an exchange recently on LinkedIn with an employee of Cloudflare. Initially I joked and said something like “You mean you don’t need to use public cloud for building a global network? That’s not what I heard..” The response was that Cloudflare does not leverage hyperscale services for any CDN services or CDN workers (relatively speaking, CDN workers are still a very new concept, with Cloudflare leading the industry in that area I believe, though I could be wrong). The latter bit actually kind of surprised me. I had assumed, until that point, that Cloudflare might leverage hyperscale IaaS to handle “burst” workloads. But they clarified again that they do not; everything is “on prem”.
Myth: The only alternative to public cloud is to build your own data centers
This myth was added in December 2025, about a week after the site launched; if you saw the site before and didn’t notice this, that’s because it wasn’t here yet.
I was thinking a lot last night, and thought this would be a good myth to have as well. I cover it less directly in other spots on the site, but here I will be direct. This myth is inspired by a LinkedIn post I replied to a few weeks ago from someone who claimed to be a former AWS employee, separated from the company just a couple of months prior (so they quit, were fired, or were laid off around August 2025). I don’t know this person, but he put a pretty blunt rant in a LinkedIn post, and their algorithm pushed it to me a few days after he wrote it. I say rant because the post literally started with the word “ENOUGH!” (all in caps).
Anyway, he went on to suggest customers keep using public cloud (in response to the big AWS October 2025 outage) and not try to build their own cloud, because it’s really difficult to do and you aren’t smart enough to do it. That part is fine, and I actually agree (my argument is that most orgs don’t need their own cloud; utility computing is more than sufficient for 90%+ of workloads, and for super large organizations, OpenStack or VMware Cloud Foundation may be good solutions if they “require” IaaS on prem). But then he went into the myth: he started talking about needing to build data centers (actual facilities), acquire the land, and go through the construction process. Why do that when you can just use public cloud?
The myth, of course, is that you don’t need to build data centers. For the vast majority of organizations (95%+), the co-location service has been around for at least twenty-five years, much longer than public cloud was even a twinkle in anyone’s eye. There are thousands of service providers around the world, in almost every conceivable market, for customers to rent data center floor space or rack space for their equipment. In fact, hyperscale companies themselves often leverage co-location in new markets to get up to speed quicker. Even in existing markets: I recall touring a small co-location data center (link to the actual tier 3 facility, though it had a different owner at the time) north of Seattle fifteen years ago where literally half of the facility was Amazon gear, and that was in their home market! The server hosting this very site is at a co-location facility in Fremont, CA (running on servers owned by me personally, and I pay the co-location bill). This server also runs email, DNS, and other services for me.
Back to the person I responded to on LinkedIn. I gave them some info (this was before this site was an idea in my head), but their responses were somewhat nonsensical; we weren’t speaking the same language. This person was obviously pretty deep in the cult, as they could not form a coherent argument. Eventually they just stopped replying (which isn’t unusual when engaging with such people).
Myth: The only way to protect myself against another failure in us-east-1 is to go multi region or multi cloud
Another example where I’m consistently surprised by the sheer volume of people and organizations echoing this nonsense, especially in 2025. I could see people saying this back when us-east-1 had a power outage and was down for multiple days for several customers in 2012; the cloud market was still evolving then, whereas today I’d say it’s quite mature. The point is we got a very important data point back in 2012 (and more, smaller data points in the years since): you should not rely on a single cloud region for important stuff. I have very little doubt that any serious organization leveraging IaaS has technical people inside, actually doing the work, who have thought about this. Maybe they’ve even had formal discussions within their organization on this very topic. For larger customers especially, I have no doubt such conversations have taken place at all of them at some point.
So then why do so many organizations have serious outages when us-east-1 goes down if this situation is common knowledge?
The answer is simple and obvious, something I raised fifteen years ago in my original blog post. Saying we need to go multi region or multi cloud is really easy. Doing it is not. It’s also extremely expensive (as if public cloud wasn’t already extremely expensive). I am sure there are software packages that can make such availability across cloud providers and regions easier, however those will absolutely increase the costs even higher (if such tools worked well and were cost effective, they would be in much wider use).
So these organizations are making a conscious decision not to go multi region or multi cloud, because it is too difficult, too complex, and too expensive. You really can’t blame them, given that some of Amazon’s own services, like Alexa, Amazon Ring doorbells, and I believe Amazon Prime, had issues during the us-east-1 problems in October 2025. If Amazon itself doesn’t do this for everything, then you know there are reasons all these other customers are not doing it either.
For me the answer is simple. As I said on my blog fifteen years ago, all you have to do is look at their SLA to rule them out for running anything that is mission critical to your organization.
I moved my last organization out of AWS nearly fourteen years ago (with a 7-month ROI), and we have had 100% data center availability in the time since. In fact, the only time in my career I did not have 100% data center availability was in 2007; that facility was poorly designed and I moved my organization out within a couple of months.
Myth: Public cloud is just a bunch of computers operated by someone else
I see people say “cloud is someone else’s computer” or similar quite often. Technically that is true; the myth is really in the implied simplicity of the system. Running a computer system is easy. Running a cloud platform is hard. It is an enormous amount of complexity, with an enormous amount of automation (which is yet more complexity) riding on top to try to tame the complexity below: a massive software code base that is constantly evolving and will obviously have bugs like anything else.
Just be careful not to confuse Public Cloud IaaS as being in the same league of reliability as, for example, a simpler managed hosting operation (see “You can’t run data centers better…” below for more clarification on the term reliability).
Myth: Hyperscale data centers are the same as any other data center
I can see how less technical folks can be ignorant on this topic. They hear the term data center, and well that’s a pretty generic term. Even the U.S. Government has an absurd definition of what makes a data center a data center.
“This has been a point of contention within the nation’s bureaucracy itself, with several redefinitions of the term changing the number of data centers calculated. Back in 2010, it only meant sites that were 500 square feet or more, and met stringent availability requirements – criteria that covered 2,094 data centers.
Over the following years, that definition was expanded to include almost anything with a server that provides services (whether in a production, test, staging, development, or any other environment). A room containing only print servers, routing equipment, switches, security devices (such as firewalls), or other telecommunication components, was not a data center.
By May 2018, the government said it had 12,062 data centers (as of August 2017) – although it did not create nearly 10,000 data centers in seven years, most were already there, just not covered by the previous definition. Others were there previously, but had not been included by the 24 government agencies subject to FITARA and DCOI, due to poor accounting practices. It is therefore not clear how many data centers the US government has actually added over the past decade.”
www.datacenterdynamics.com article from 2019
IMO – anything less than say ~5,000 square feet of space used to host IT equipment is better referred to as a “server room” (regardless of whatever type of services are hosted there, I suppose one exception may be if the facility is built to N+1 availability standards which for such a small place I think would be very unlikely).
The point is (and sorry, because I raise this elsewhere on this same page) that true hyperscale data centers are intentionally built to lower quality standards for cost purposes, with the intention that you work around those lower quality conditions by distributing your data and applications across multiple data centers (availability zones) and regions. Losing a data center is like losing a rack for a smaller customer (my stuff would not survive a rack failure, though I’ve never had a rack failure that wasn’t also a data center failure; the last one was in 2007), or like losing a single server for a really small customer. Compare that to a world class tier 4 facility, which is built from the ground up to never go down (not that an outage is impossible, just extraordinarily unlikely; in my case, during the past fourteen years, not even the slightest blip indicating any service impact whatsoever).
Myth: You can’t run data centers better than the Hyperscale companies
I had people tell me that fifteen years ago. I didn’t believe it then, but to be completely honest, fifteen years ago I lacked the data on both sides of that equation to properly answer it. I’ll also say I don’t run data centers, never have, never will. But people often use the term data center as a layman’s term for server/network/storage infrastructure, rather than being specific about operating the facility such infrastructure is installed in. So when I use the term data center, I am in fact NOT referring to the actual facility operations of a data center.
Now that fifteen years have elapsed since my first blog post (and since the first year I heard comments such as those), I can say today, in 2025, with absolute confidence: I can, and have, operated data centers far better than even the public track record of any public cloud provider in the same period of time.
But this shouldn’t be a surprise to anyone technical really. If it is then you are missing the point about the architecture of hyperscale providers. In an ideal world, a customer would be distributed amongst several different regions (at a minimum) of a given hyperscale provider. Availability of that customer would not be dependent upon a single region(any more than my availability is dependent upon a single server), but on the entire system as a whole.
But we do not live in an ideal world, and we have fifteen years of hard evidence(customers going offline when us-east-1 goes down) now that most organizations do not operate in that ideal world either.
My systems are not built to survive a data center failure; they never have been. In large part because doing that is, again, expensive. But more importantly, if you are deployed to a world class tier 4 data center, the likelihood of a full facility failure is exceptionally remote (I have twenty-two years of data center experience), so much so that for most organizations (including every one I have worked for in the past 25 years) it is a waste of money to go that route. Now if you have the money and want to do it anyway, more power to you; that is great.
In my opinion, the whole concept of “Disaster Recovery” was really born out of “data centers” (aka “server rooms”) located on site in corporate and government office buildings, where availability is far lower than a world class tier 4 data center. If your critical servers are in your office, well, that is not a great place for them. One company I was at, ironically enough fifteen years ago, was in this situation with their Microsoft Exchange system. Before I was even hired, they had Microsoft Exchange hosted on site in their office (despite having other critical systems in data centers). There was a giant wind storm, and the electricity for the entire city of Bellevue, WA (and others, I think) went out for more than 24 hours. I was living there at the time, and it happened to be the same city this company’s HQ was in. They learned then, and moved their Exchange server(s) to their data center following the outage (or maybe during the outage, I am not sure).
I’ll also toss this out there, a realization I came to about a decade ago:
“For the vast majority of organizations, a disaster means only one thing – your data is gone and is not coming back. Outages, even extended multi day outages are a PITA for sure(I have personally worked through several), but not a disaster. It would certainly be nice to have protection against such extended outages, but reality is most organizations are unwilling to invest for that(in cloud or on prem).“
cultofthe.cloud quote
Look at the recent (as of this post) massive, near PB scale data loss in South Korea. I saw so many comments somewhat attacking them for not having proper backups after officials explained in the original articles why they didn’t.
“The G-Drive couldn’t have a backup system due to its large capacity”
South Korea officials on why there was no backup
There was no backup because it was too big. What does that mean? It means they wanted to back it up but there was no budget allocated to do so.
As I mentioned in my blog post fifteen years ago, true hyperscale data centers are not built to the same standards as the facilities I use. They do this intentionally in order to save costs (and their decision is a sound one), because availability is not determined by a single data center, or even a region; it is the system as a whole (all regions). Of course, you have to have your applications deployed to the “system”, rather than just a region or a single availability zone. That is where the costs and complexity skyrocket (even further).
Another realization I had recently, which it seems nobody else has recognized yet (at least I haven’t seen signs of recognition), regards the large AWS outage in October 2025. Bottom line: a single entry (technically, the lack of an entry in that case) in a DNS zone hamstrung six data centers, rendering many services unavailable for an extended period of time. That is not a bug; that is a critical design flaw. How many other design flaws like that are in their infrastructure? The same question goes for Google, Microsoft, and the other hyperscale companies.
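To make that failure mode concrete: a required DNS record simply ceasing to exist is trivial to probe for from the outside. Here is a minimal, generic monitoring sketch (nothing here is AWS-specific, and the `.invalid` name below is a reserved TLD used purely to demonstrate the failure path):

```python
import socket

def record_resolves(hostname: str) -> bool:
    """Return True if the hostname currently resolves to at least
    one address; False if the record is missing or resolution fails."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False

# A critical endpoint that silently stops resolving is exactly the
# kind of single point of failure that can hamstring everything
# depending on it; ".invalid" is reserved and never resolves.
print(record_resolves("localhost"))               # True
print(record_resolves("no-such-host.invalid"))    # False
```

A periodic check like this, run from outside the environment it monitors, would catch a missing record before customers do; the hard part, as the outage showed, is catching it inside an automated system that writes the records itself.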
Such design flaws don’t exist in my environment because I don’t put such complexity into my systems. In fact, about a decade ago I had a VMware vCenter control plane that managed ~800 VMs fail and stay offline for over a week. The effect: zero impact to operations. It was a bit annoying, but nobody outside of my team knew or cared, because it had no impact on anyone but my team (and me specifically). Recovery of the system was quite simple, but I spent a lot of time working with support out of paranoia regarding what might happen when I recovered it (in the end, nothing bad happened; it was just paranoia, but I am a pessimist by nature).
You personally may not be able to run a data center as well as I can, but you don’t need to be me to get good results.
Myth: You need more staff to manage on prem environments
I cover this to some extent here, but will repeat myself to call it out in the “myths” section of this website. This myth was added in December 2025, about a week after the site launched; if you saw the site before and didn’t notice this, that’s because it wasn’t here yet.
I have seen on occasion, going back to the origins of “My cloud journey” as far back as 2010, and many times in comments since, less technical people showing ignorance around staffing for on prem vs public cloud. To be clear, I do believe the two situations (if done to their fullest potential) involve very different skill sets.
My personal cloud journey had no staffing changes for on prem vs public cloud. But I don’t expect you to believe my story alone, so as an alternative I can only offer that the CEO of 37 Signals specifically stated on his blog that they did not have to make any staffing changes at all as a result of moving out of cloud.
“[..]And crucially, we’ve been able to do this without changing the size of the operations team at all. Running our applications in the cloud just never provided the promised productivity gains to do with any smaller of a team anyway.”
quote from the CTO of 37signals
I firmly believe that if an organization has moderately competent staff operating public cloud IaaS (done to its fullest potential, or close to it), that same staff will have the brainpower to adapt pretty easily to an on prem environment, if they choose to. On prem enterprise offerings are built from the ground up to be easy to use, and high availability is often built in. You don’t need to design your application stack in a special way to account for unreliable infrastructure (you still can, as nothing is perfect). On prem enterprise offerings usually come with nice user interfaces as well; they may also have APIs, though those would be much less frequently used in most deployments. I’m sure staff with zero experience in basic networking concepts, servers, etc. may well make some mistakes initially, just as they would make mistakes initially doing things in public cloud. Vendors are often more than willing to provide free guidance, or paid consulting if needed. Documentation is also generally better, the quality of the products is higher, and the rate of change is lower.
I’d also argue that for most organizations (90%+), on prem would really be a small environment. IT equipment is extraordinarily powerful these days, and you don’t need much of it to accomplish a lot. What used to take forty racks two decades ago can be done in less than half a rack today. In most situations, you can easily run two thousand virtual systems in a single rack without much effort. I believe that covers a large portion of the world’s organizations.
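To make the density claim concrete, here is a back-of-the-envelope sketch; every figure (servers per rack, core counts, memory, overcommit ratio, VM size) is an illustrative assumption of mine, not a vendor spec:

```python
# Back-of-the-envelope rack density estimate. All inputs are
# illustrative assumptions, not measurements or vendor specs.

SERVERS_PER_RACK = 20      # 2U servers in a 42U rack, leaving room for switches
CORES_PER_SERVER = 128     # dual socket, 64 cores per socket
RAM_GB_PER_SERVER = 2048   # 2 TB of RAM per server

VCPU_OVERCOMMIT = 4        # typical virtualization overcommit ratio
VM_VCPUS = 4               # an "average" VM, assumed
VM_RAM_GB = 16

# Capacity is bounded by whichever resource runs out first.
vms_by_cpu = SERVERS_PER_RACK * CORES_PER_SERVER * VCPU_OVERCOMMIT // VM_VCPUS
vms_by_ram = SERVERS_PER_RACK * RAM_GB_PER_SERVER // VM_RAM_GB
vms_per_rack = min(vms_by_cpu, vms_by_ram)

print(f"CPU-bound capacity: {vms_by_cpu} VMs")
print(f"RAM-bound capacity: {vms_by_ram} VMs")
print(f"One rack supports roughly {vms_per_rack} VMs")
```

Even with these conservative-ish assumptions, a single rack clears the two-thousand-VM mark with room to spare.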
Public cloud IaaS, on the other hand, if done to its fullest potential, almost universally requires additional complex automation that runs on top of the system to try to tame it (such as Terraform or other IaC offerings, things cloud people would immediately recognize), for example to recover from system failures (in on prem environments, recovery from server failures is automatic and seamless). Server lifetimes in public cloud are often measured in days or months, whereas on prem lifetimes can easily stretch to years, or even a decade if desired.
You can try to operate public cloud like you would on prem, and I believe most orgs do that, at least initially (mine certainly did, and I have read many stories over the years of other orgs doing the same). That will usually result in even higher costs and more frustration, as your expectations will not be met (“you’re doing it wrong” – which I am happy to admit, though I was only following the lead of people with years more AWS experience than me). My main point on this site is that regardless of whether you are “doing it right”, you can’t work around the fact that IaaS is broken by design (at least for cost and efficiency reasons).
Myth: You only pay for what you use with cloud
This myth was added in December 2025, about a week after the site launched; if you saw the site before and didn’t notice it, that’s because it wasn’t here yet.
This is a commonly repeated myth, though unlike most of the others it’s not ENTIRELY a myth, just mostly; I will clarify that shortly. The point is that people saying “pay for what you use” generally refer to “cloud” as a general-purpose, all-encompassing platform, without being specific, and in the vast majority of cases they aren’t specific because of their ignorance of the situation.
Paying for what you use
Taking AWS specifically as an example, there are some services that are truly “pay for what you use”. The biggest one is their S3 object storage: you pay for the bytes you store, for the number of times they are replicated, and for the bytes going into and out of the platform. That is one of the reasons their costs are so high relative to some alternatives (disclaimer: I have never used that service). Another is probably SQS, their queuing platform.
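As a sketch of what genuine pay-for-what-you-use pricing looks like, here is an object-storage-style monthly bill in Python; the prices and billing dimensions are illustrative assumptions of mine, not actual AWS list prices:

```python
# Sketch of a true "pay for what you use" bill, object-storage style.
# All prices are illustrative assumptions, not real list prices.

PRICE_PER_GB_MONTH = 0.023     # storage (assumed)
PRICE_PER_GB_EGRESS = 0.09     # data transfer out (assumed)
PRICE_PER_1K_REQUESTS = 0.005  # write requests (assumed)

def monthly_object_storage_cost(stored_gb, egress_gb, put_requests):
    """Cost scales with actual consumption: store less, pay less."""
    return (stored_gb * PRICE_PER_GB_MONTH
            + egress_gb * PRICE_PER_GB_EGRESS
            + put_requests / 1000 * PRICE_PER_1K_REQUESTS)

# 1 TB stored, 200 GB transferred out, 50k writes:
print(f"${monthly_object_storage_cost(1000, 200, 50000):.2f}")
# Halve the usage and the bill halves with it -- the defining
# property of pay-for-what-you-use pricing:
print(f"${monthly_object_storage_cost(500, 100, 25000):.2f}")
```

The bill tracks consumption directly, which is exactly what the provisioned services below do not do.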
Paying for what you provision
For most orgs, I believe the bulk of their costs will fall here. That includes things like EC2 (virtual machines) and RDS (their managed database offering), which are some of the biggest, as well as block storage such as EBS (which is tied into EC2), and probably other things (it has been so long since I used these services, and I don’t feel a need to cite every one anyway).
Taking a super simple situation: you have a workload that can benefit from 8 CPU cores for a few seconds at a time, a dozen times per hour. Most customers would just make a VM with 8 CPUs and be done with it. This is where the trap is. Your 8 CPUs are idle 90%+ of the time, yet you are paying for those 8 CPU cores regardless of their utilization. The capacity of those cores cannot be allocated to other resources in the meantime, leading to stranded capacity, which is the key point regarding hyperscale IaaS being broken by design, something I first raised fifteen years ago.
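The arithmetic behind that trap is simple; assuming each burst lasts about five seconds (my assumption for the “few seconds” above), the provisioned cores sit stranded nearly all the time:

```python
# How much of a provisioned 8-core VM does this bursty workload
# actually use? Burst duration is an assumption; burst frequency
# comes from the example ("a dozen times per hour").

CORES = 8
BURST_SECONDS = 5       # "a few seconds at a time" (assumed)
BURSTS_PER_HOUR = 12    # "a dozen times per hour"

busy_seconds = BURST_SECONDS * BURSTS_PER_HOUR   # seconds busy per hour
utilization = busy_seconds / 3600                # fraction of the hour

print(f"CPU busy {busy_seconds}s per hour -> {utilization:.1%} utilization")
print(f"{1 - utilization:.1%} of the billed capacity is stranded")
```

Roughly 98% of what you are billed for does nothing, and no other workload can touch it.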
The same goes for disk space: any disk space that is not being used, you are charged for regardless (disk space is especially wasted on the “instance level” disks attached to the VMs). In a utility computing model, you allocate space, but that space is not consumed until it is written to, and it is freed automatically when the data is deleted.
Not only disk space, but disk I/O is wasted as well (for non-technical readers, I/O signifies the actual performance you get out of the storage). When the storage for that VM is idle, you can’t leverage that capacity for any other systems; it is stranded.
The same goes for memory: due to fixed instance sizes, you are forced into cookie-cutter VM designs that do not offer precise control over capacity. The result is that you must choose the closest-sized system to your requirements and simply eat the wasted resources that come along with it, because almost certainly most workloads will not fully leverage the CPU/memory/disk of every VM in the environment. With utility computing, resources are shared, so what one VM isn’t using, another VM can use in a safe manner.
Put most simply: go provision an 8-CPU VM on any cloud service, fire it up, and don’t “use” it. Let it sit doing nothing. In a perfect world you’d be charged perhaps a few cents for consuming a couple GB of disk space, plus some tiny amount for consuming less than 1% of the CPU capacity and maybe 2-3% of the memory capacity. But in the hyperscale IaaS world, you are charged the full price whether or not you are using it; just having it on is considered using it. That may sound strange, but I’m sure many admins can relate to having servers “on” that nobody is “using” (perhaps those who were using them forgot to tell IT they no longer need the system, or left the organization). “Usage” for me implies actual resource utilization. Hence you are paying for what you provision, not for what you use.
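A quick sketch of the gap between the two billing models for that idle VM; the hourly rate, the CPU/memory split of that rate, and the idle utilization figures are all assumptions for illustration:

```python
# Contrast "pay for what you provision" with a hypothetical
# "pay for what you use" bill for an idle 8-CPU VM left running
# for a month. Every number here is an illustrative assumption.

HOURS_PER_MONTH = 730
HOURLY_RATE = 0.38     # assumed on-demand rate for an 8-vCPU VM

CPU_UTIL = 0.01        # ~1% CPU while idle (assumed)
MEM_UTIL = 0.03        # ~3% memory while idle (assumed)

# What the provider actually bills: the full provisioned rate.
provisioned_bill = HOURLY_RATE * HOURS_PER_MONTH

# A naive usage-weighted bill: charge each resource by what it
# consumed, splitting the rate 50/50 between CPU and memory (assumed).
usage_bill = HOURLY_RATE * HOURS_PER_MONTH * (0.5 * CPU_UTIL + 0.5 * MEM_UTIL)

print(f"Billed as provisioned: ${provisioned_bill:.2f}/month")
print(f"Billed by actual use:  ${usage_bill:.2f}/month")
```

Under these assumptions the idle VM is billed roughly fifty times what a true usage-based model would charge.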
Myth: I want to “burst” my workload to public cloud
This myth was added in December 2025, about a week after the site launched; if you saw the site before and didn’t notice it, that’s because it wasn’t here yet.
I’ve heard this many times over the last decade and a half, even from technology leaders I have worked for personally. They want their steady state “on prem”, with the ability to “burst” to a public cloud for large traffic events. On paper it sounds nice. But this strategy will not work in the vast majority of cases, and it is not public cloud specific: it often will not work for bursting to another “on prem” data center either, for the exact same reason – latency.
Latency is a performance killer
When considering this idea, people never account for latency, which kills such burst plans nearly dead in their tracks 90%+ of the time (there are exceptions to everything, of course). One way to work around it is to locate your “on prem” resources geographically close to your public cloud provider (or the other “on prem” data center) – say, under 10ms of network latency, though your stack’s tolerance may be higher or lower – which in practice means within maybe 15-30 miles of the public cloud servers you intend to burst to (the fiber between your two facilities may travel some distance to a central location, so you can’t rely on a basic map to determine distance). In my case, for example, being hosted in Atlanta, GA, it would be ludicrous to expect to burst to a public cloud region located in Virginia; that is way too far away.
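A rough latency model shows why distance matters so much. The fiber speed below is a standard physical approximation, and the round-trip counts per request are my assumptions, but the multiplication effect is the point: application requests rarely make one network hop, so even a modest round-trip time gets paid many times over.

```python
# Why even "small" inter-site latency kills bursting: a chatty
# request making many backend round trips (DB queries, cache
# lookups) multiplies the site-to-site RTT. Fiber speed is a
# physical approximation; round-trip counts are assumptions.

FIBER_KM_PER_MS = 200   # light in fiber travels ~2/3 of c, one-way

def round_trip_ms(fiber_route_km):
    """Best-case RTT over a fiber route of the given length."""
    return 2 * fiber_route_km / FIBER_KM_PER_MS

def added_request_latency_ms(fiber_route_km, backend_round_trips):
    """Extra latency when the app tier bursts to a remote site but
    still talks to databases/caches back home."""
    return round_trip_ms(fiber_route_km) * backend_round_trips

# ~15-30 miles away (call it a 40 km fiber route), 50 DB calls/page:
print(added_request_latency_ms(40, 50))   # tolerable
# Atlanta to northern Virginia (~900 km of fiber), same workload:
print(added_request_latency_ms(900, 50))  # painful
```

A page that makes fifty backend calls picks up roughly 20ms over a 40km route, versus roughly 450ms over a 900km route – which is why the Atlanta-to-Virginia burst is a non-starter.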
People almost universally overestimate their resource requirements
I’ve seen it time and again, at least with people I have worked with: big cloud aspirations, “need cloud to scale”, this and that – when they literally have no idea how many resources are required to run their system. They just think “I want to scale to 10-20X traffic, and to do that it would be best to burst to a cloud.” In reality, in most cases (all cases in my career, actually), such 10-20X bursts in traffic amount to very little additional infrastructure – maybe, at worst, a doubling of server capacity. Yes, double the capacity to handle 10-20X the traffic, because most of the time the servers are badly underutilized on normal traffic days. People seem to be thinking in CPU/memory capacities from two decades ago (single-core CPUs, 2-4GB of memory per server); that’s the only way I can explain why they think the way they do.
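The capacity math can be sketched in a few lines; the fleet size, normal utilization, and target peak headroom are assumptions of mine, but they show why 10-20X traffic rarely means 10-20X servers:

```python
# How much extra capacity does a 10-20X traffic burst really need?
# The key input is how idle the fleet is on a normal day. The fleet
# size, utilization, and headroom figures are illustrative assumptions.
import math

def servers_needed(current_servers, normal_utilization,
                   traffic_multiplier, target_utilization=0.75):
    """Servers required to absorb the burst while keeping peak
    utilization below the target."""
    load = current_servers * normal_utilization * traffic_multiplier
    return math.ceil(load / target_utilization)

# A 6-server fleet idling at 7% CPU on a normal day:
for multiplier in (10, 20):
    need = servers_needed(6, 0.07, multiplier)
    print(f"{multiplier}X traffic -> {need} servers (vs 6 today)")
```

With the fleet at 7% on a normal day, 10X traffic needs no extra servers at all, and 20X needs only a doubling – nowhere near “burst to the cloud” territory.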
Modern systems, even systems from a decade ago, are obviously far more powerful than that. At my last org, in 2014, I deployed a trio of 24-core/96GB systems running LXC on bare metal for their main e-commerce stack. It was overkill for sure, but it was also super cheap, especially taking into account the licensing for the e-commerce platform itself. Under normal workload the systems ran well under 10% CPU/memory; under the highest load ever they probably topped out at 70-80% CPU. They ran for about six years without a single issue, until the app stack was replaced with a new one.
If you had a stack with far more traffic, you can get servers today with literally up to 384 cores in two sockets (sixteen times the core count I had in my dual-socket systems in 2014), though I would go for fewer cores distributed among a few additional systems, depending on your overall capacity requirements. Web server software has had “auto scaling” of web workers for literally over twenty-five years now. No need for fancy Kubernetes stuff – just set the settings in the container in LXC and let it go. I have had near-flawless operation of LXC on bare metal for eleven years now: simple, fast, easy to manage.

Certainly avoid web server software that doesn’t have dynamic worker support (such as Unicorn, which is for Ruby). Speaking of Ruby, I remember deploying my first production Ruby stack in 2007. Originally I believe it used Apache and FastCGI, which had some weaknesses; I came across mod_fcgid and deployed that instead, with great results. Not only could the workers scale up and down, they would also auto-restart after some time, which really helped work around memory leaks. More recently, a Ruby stack I deal with uses Puma, after half a decade of complaints about how bad Unicorn was.

I’d also suggest using server software that can report internal metrics on worker utilization. Apache has had this for well over two decades with mod_status, and Puma has it as well (Unicorn did not, last I checked). I mention this because I was shocked at how poor the internal metrics from nginx are by contrast – at least in the open source version; the commercial version may be much better, but it isn’t cheap!
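As an example of the kind of worker metrics I mean, Apache’s mod_status exposes a machine-readable view at its “?auto” endpoint (assuming the module is enabled and reachable); a short Python sketch can turn that output into a utilization figure. The sample text below stands in for a live HTTP fetch:

```python
# Parse Apache mod_status machine-readable output (the "?auto"
# endpoint) into a worker-utilization figure. Assumes mod_status is
# enabled; the sample string stands in for a live fetch.

def worker_utilization(status_text):
    """Return (busy, idle, utilization) from mod_status ?auto output."""
    fields = {}
    for line in status_text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    busy = int(fields["BusyWorkers"])
    idle = int(fields["IdleWorkers"])
    return busy, idle, busy / (busy + idle)

# Sample ?auto output (abridged):
sample = """\
ServerVersion: Apache/2.4.58
BusyWorkers: 3
IdleWorkers: 7
"""
busy, idle, util = worker_utilization(sample)
print(f"{busy} busy / {idle} idle -> {util:.0%} worker utilization")
```

Feed a reading like this into your monitoring and you know exactly how close your worker pool is to saturation, which is the whole point of preferring server software that reports such metrics.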
So the next time you hear someone suggest bursting to cloud (or anywhere else, really) – or if you consider it yourself – think hard about how many resources your app really uses. If you don’t need to get to thousands of cores for short periods of time, then bursting to cloud is not something to consider, IMO.
For a good laugh
I had completely forgotten about this, but if you haven’t seen this video from 2014, it is great. The whole thing is worth watching, but he talks directly about cloud starting at 5min36s in. Quite hilarious, and so true.
Lastly
If you are in the group of super happy cloud customers paying $50,000+/mo and think that is a fair price, I’d really be interested to know what you actually have provisioned and how much you are paying for it. I’m honestly not expecting anyone to come forward, but I’d mainly be interested to be proven “wrong” in some scenario. Such a scenario would not include being happy with cloud while overpaying by millions per year and not caring about it; I’m perfectly happy to accept there are many such situations out there (which goes back to the tagline of the site), having been close to such situations myself during “My Cloud Journey”.
(Side note: I’d really like to see what Atlassian pays for their cloud stuff, I’m guessing it would be entertaining to see the real numbers..)