The Good Tech Companies - The Economics of Public Cloud Repatriation and Why It Is Cost-prohibitive at Scale

Episode Date: September 16, 2024

This story was originally published on HackerNoon at: https://hackernoon.com/the-economics-of-public-cloud-repatriation-and-why-it-is-cost-prohibitive-at-scale. The public cloud doesn't deliver cost savings at scale. It delivers productivity gains, to a point, but it will not reduce your costs. Check more stories related to cloud at: https://hackernoon.com/c/cloud. You can also check exclusive content about #minio, #minio-blog, #public-cloud, #public-cloud-repatriation, #cloud-computing, #modern-datalake, #kubernetes, #good-company, and more. This story was written by: @minio. Learn more about this writer by checking @minio's about page, and for more stories, please visit hackernoon.com.

Transcript
Starting point is 00:00:00 This audio is presented by Hacker Noon, where anyone can learn anything about any technology. The Economics of Public Cloud Repatriation and Why It Is Cost-Prohibitive at Scale, by MinIO. What has become clear over the past couple of years is that the public cloud, for all of its benefits, doesn't deliver cost savings at scale. It delivers productivity gains, to a point, but it will not reduce your costs. There is goodness in the public cloud, as it offers an incredibly powerful value proposition: infrastructure available immediately, at exactly the scale needed by the business, driving efficiencies both in operations and economics.
Starting point is 00:00:36 The cloud also helps cultivate innovation, as company resources are freed up to focus on new products and growth. However, the mere act of interacting with your data generates egress costs, which have been shown to be egregiously predatory. This is particularly true when the applications and workloads are persistent, consistent, and data-intensive (a high volume, velocity, and variety of read and write calls) or involve high-performance analytics. They just are not sustainable in the public cloud as they grow. As industry experience with the cloud matures, and we see a more complete picture of the cloud lifecycle's effect on a company's economics, it's becoming evident that while cloud clearly
Starting point is 00:01:14 delivers on its promise early on in a company's journey, the pressure it puts on margins can start to outweigh the benefits as a company scales and growth slows. Sarah Wang and Martin Casado, Andreessen Horowitz, 2021. That take, while incredibly prescient, was from 2021. By 2024, data has grown an average of approximately 20% per year, according to an IDC study from 2022. The workloads have gotten bigger, and scale has become the problem. Not the technology of scaling, but the cost, specifically, of scaling in the public cloud. According to David Linthicum, there are three main reasons the public cloud is being kicked to the curb. 1. Cost: for certain workloads, it's just too expensive
Starting point is 00:01:57 to run them in the cloud. Commodity hardware prices have fallen so far in the last few years that hardware isn't the huge capex that it used to be. 2. Failed migrations: workloads that have not been refactored optimally or adjusted to be cloud-native have ended up costing approximately 2.5x what they were originally projected to cost. Inefficient apps on-premises turned out to be inefficient in the cloud, and making them more efficient is costing too much and ending up not being worth it. 3. Diminishing need: applications that originally needed to be spun up quickly and efficiently, as well as able to scale, have scaled in the cloud but are now just machines of repetitive tasks and data storage. These apps no longer benefit from the fast scalability the cloud can provide and are now just utilizing a lot of expensive storage. The need is no longer there
Starting point is 00:02:45 for a flexible, quickly scalable model. The commoditization of hardware has presented a new, cost-efficient way to run these workloads. According to a recent Barclays CIO poll, many CIOs agree. From that same a16z article: in 2017, Dropbox detailed in its S-1 a whopping $75 million in cumulative savings over the two years prior to IPO due to its infrastructure optimization overhaul, the majority of which entailed repatriating workloads from the public cloud. When your cloud costs start to hover around 50% or more of your cost of revenue, like Asana, Datadog, Prerender.io, and others, it's time to start looking at what your workloads are
Starting point is 00:03:25 doing in the public cloud. Organizational and business leadership need to be aware of this so they can pivot. Certain workloads, such as running a data analytics cube, an in-memory database, or a data analytics cluster, are better fits for on-prem infrastructure. But these are just a few examples. To focus in on a particular trend that will be impacted by this scale problem, let's look at AI, ML, and specifically LLMs, large language models. If your current AI initiative has you building your own LLM or foundation model, consider the cons of doing it in the public cloud. 1. High costs of scale: training and running LLMs at scale is expensive, and as the LLM gets bigger, so do the costs of the public cloud. 2. Loss of control: you have less control
Starting point is 00:04:09 and visibility over implementation, infrastructure, and performance. 3. Vendor lock-in: if you have trained LLMs on one cloud platform, it will be difficult to port to a different platform. Furthermore, depending solely on a single cloud provider entails inherent risks, particularly concerning policy and price fluctuations. 4. Data privacy and security: I would also mention data sovereignty here. The bottom line is that you are trusting your data to a provider with servers spread across worldwide regions. If your enterprise is dealing with petabytes or
Starting point is 00:04:45 trending to that kind of scale, the economics favor the private cloud. Yes, that means building out the infrastructure or leasing it from someone like Equinix, including real estate, hardware, power, and cooling, but the economics are still highly favorable. The public cloud is an amazing place to learn the cloud-native way and to get access to a portfolio of cloud-native applications, but it is not an amazing place to scale. An example of the economics. So, what are the economics? For illustration, let's take a 10-petabyte modern data lake that uses Kubernetes to manage Apache Spark and Dremio for persistent and consistent analytics workloads.
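As a hedged illustration of the kind of workload being modeled here, the sketch below shows a PySpark job reading from and writing to S3-compatible object storage. The endpoint, bucket, paths, and credentials are hypothetical placeholders rather than details from the article, and a real deployment also needs the hadoop-aws (s3a) connector available to Spark.

```python
# A minimal sketch, assuming a hypothetical S3-compatible endpoint and bucket.
# Requires the hadoop-aws (s3a) connector jars on Spark's classpath.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("datalake-analytics")
    # s3a connector settings; all values are illustrative placeholders
    .config("spark.hadoop.fs.s3a.endpoint", "https://object-store.example.com")
    .config("spark.hadoop.fs.s3a.access.key", "EXAMPLE_ACCESS_KEY")
    .config("spark.hadoop.fs.s3a.secret.key", "EXAMPLE_SECRET_KEY")
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

# Reads like this fan out into many GET/HEAD requests against object storage --
# the request volume that drives the per-request costs discussed below.
events = spark.read.parquet("s3a://datalake/events/")
daily_counts = events.groupBy("event_date").count()

# Writing results back issues PUTs (multipart uploads for large outputs).
daily_counts.write.mode("overwrite").parquet("s3a://datalake/aggregates/daily/")
```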
Starting point is 00:05:23 These types of workloads require frequent data reads and writes from object storage for analysis, updating and refreshing, and presentation. From a cost structure perspective, we will use some assumptions for the main cost drivers. These data lakes and workloads have limited utility if we can't use the data: the data provides insights, serves other applications, and may need to be processed outside of the storage environment. This requires the data to be transferred out of storage. If we assume 500 terabytes per month being accessed, that represents only 5% of the data being accessed per month. For object requests (PUTs, GETs, HEADs, etc.):
Starting point is 00:06:02 We have worked with customers running similar consistent and persistent workloads that see over 10B object requests per month, so we can use 10B as a conservative assumption for this type of workload. Similarly, those same customers see around the same number of encryption requests for those objects, so again we use 10B as a conservative assumption for our example. With those assumptions, the cost of the public cloud could look something like this: annual public cloud costs for 10 petabytes = $7.3M, or $0.061 per GB per month. The assumptions above are just that, and the fact that there are so many tells you how variable the
Starting point is 00:06:43 costs can be, depending on the particular usage and workload factors. This creates significant challenges in trying to budget. In addition, having no tiering or any data lifecycle activity is also somewhat rare, as organizations usually move data to colder tiers when the data becomes less active. But all of that just adds to the cost, as different tiers have different prices per gigabyte per month, as well as a cost for automatically moving objects into those tiers. MinIO allows you to scale on the private cloud, in a colo or a data center, using the same technologies that are used in the public cloud: S3 API-compatible object storage, dense compute, high-speed networking, Kubernetes, containers, and microservices.
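To make that variability concrete, here is a minimal back-of-the-envelope calculator for the public cloud side of the model. Every unit price in it is an illustrative assumption rather than a quoted rate, so its output will match the $7.3M figure above only if you plug in the same rates the article assumed.

```python
# A minimal cost-model sketch for the 10 PB scenario. All unit prices are
# placeholder assumptions, not quotes from any provider; substitute your
# provider's current list prices for your region and tiers.

GB_PER_PB = 1_000_000            # decimal gigabytes per petabyte

stored_gb    = 10 * GB_PER_PB    # 10 PB at rest
storage_rate = 0.021             # $/GB-month (assumed)
egress_gb    = 500_000           # 500 TB accessed/transferred out per month
egress_rate  = 0.09              # $/GB egress (assumed)
requests     = 10_000_000_000    # 10B object requests per month (GET/PUT/HEAD)
request_rate = 0.0000005         # $/request (assumed blended GET/PUT rate)
kms_requests = 10_000_000_000    # ~10B encryption requests per month
kms_rate     = 0.000003          # $/request (assumed)

monthly = (stored_gb * storage_rate
           + egress_gb * egress_rate
           + requests * request_rate
           + kms_requests * kms_rate)

print(f"monthly ≈ ${monthly:,.0f}")
print(f"annual  ≈ ${12 * monthly / 1e6:.2f}M")
print(f"blended ≈ ${monthly / stored_gb:.3f}/GB-month")
```

Tiering and lifecycle transitions would add further line items to this model (per-tier storage rates plus per-object transition fees), which is exactly the budgeting variability described above.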
Starting point is 00:07:26 One major difference is that there are no costs for object requests, GETs, PUTs, etc., nor are there any limits on the number of requests, as long as the infrastructure supports it. In addition, encryption is included with the MinIO Enterprise and Community versions, and there are no limits on the number of encrypted objects requested. This optionality offers the ideal mix of operational costs, flexibility, and control. It is true that you will take on capex for hardware, but by starting small and taking advantage of key cloud lessons (elasticity, scaling by component, decoupling compute from storage), enterprises can minimize the initial outlay and maximize the operational savings. When paired with commodity hardware and operating in a colo or proprietary data center,
Starting point is 00:08:10 MinIO can reduce those public cloud costs, as well as the costs associated with managing those cloud costs, by anywhere between 50% and 70%, and in some cases higher. Annual colo MinIO costs for 10 petabytes = $1.7M per year, or $0.014 per GB per month. That equates to approximately a 77% reduction in storage costs for 10 petabytes of storage compared to the public cloud. Even for smaller storage capacity needs, 200 terabytes to 2 petabytes, the savings are worth exploring. Not to mention you get the industry's best storage performance, a built-in firewall for bucket-level security, observability that is specifically designed for object storage, and many other value-added features that would cost you extra in a public cloud.
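Because MinIO exposes the same S3 API, application code barely changes when a workload moves off the public cloud. Here is a minimal sketch using the MinIO Python SDK; the endpoint, credentials, bucket, and object names are hypothetical placeholders for a private deployment.

```python
# A minimal sketch against a hypothetical private MinIO endpoint.
# pip install minio
from minio import Minio

client = Minio(
    "minio.internal.example.com:9000",   # assumed colo/data-center endpoint
    access_key="EXAMPLE_ACCESS_KEY",
    secret_key="EXAMPLE_SECRET_KEY",
    secure=True,
)

if not client.bucket_exists("datalake"):
    client.make_bucket("datalake")

# On your own infrastructure these PUTs and GETs carry no per-request charge;
# throughput is bounded only by the hardware and network underneath.
client.fput_object("datalake", "events/2024/09/part-0000.parquet",
                   "/tmp/part-0000.parquet")

response = client.get_object("datalake", "events/2024/09/part-0000.parquet")
try:
    data = response.read()
finally:
    response.close()
    response.release_conn()
```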
Starting point is 00:09:01 The resource factor. One additional element that is worth a quick analysis is resources, the human kind. We have heard from our customers that the number of resources required to manage public cloud infrastructures can range from 5 to 10 FTEs, depending on the size of the cloud infrastructure. That includes cloud engineers, cloud team leads, DevOps engineers, and cloud PMs. Using salary ranges and medians from Glassdoor, those FTE costs can range from $700,000 to $1.5M per year, fully loaded. We also hear from our customers, 76% of them in a recent survey, that one of MinIO's key value drivers is its ease of use and manageability. That same survey found that 60% of them cited MinIO's ability to deliver improved operational efficiency. "MinIO has reduced the cost of support and maintenance for us," says a
Starting point is 00:09:51 professional services company. "MinIO as a product is a very good storage solution. It has reduced the cost of resources by more than 50%," says a leading technology solution provider specializing in end-to-end DevOps offerings. Internally, we use MinIO for lots of different workloads, storage needs, testing, etc., and our estimates are that MinIO can be managed by 1 to 3 FTEs for PB+ infrastructures. That allows for massive infrastructure at scale with minimal resources. Getting started. Now that you've seen how and why the economics work for private cloud, I am sure you are wondering what the steps are to begin down this path. My colleagues have written about this here and here, and I suggest your cloud
Starting point is 00:10:34 teams and DevOps teams look at these blogs for the details on migrating away from the public cloud. We have seen dozens of our customers repatriate their data using commodity hardware and either their own data centers or a colo, and realize some real savings and benefits from MinIO's high-performing, simple object storage solution. As the above analysis demonstrates, businesses can realize significant cost savings, above 50% of their existing implied annual public cloud S3 bill, by repatriating data to their own hardware in a data center or a co-location service. In the above scenario, with only 10 petabytes, your business could save about $6.5 million over the next five years. The truth of the matter is that the public cloud is cost-prohibitive at scale.
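Mechanically, the migration can be as plain as streaming objects from a public cloud bucket into a private MinIO endpoint over the same S3 API. The sketch below is one hedged illustration of that pattern, not the procedure from the blogs referenced above; the bucket names, endpoint, and credentials are placeholders, and at petabyte scale you would parallelize this or use a purpose-built tool such as MinIO's mc mirror.

```python
# A minimal repatriation sketch: copy objects from public cloud S3 into a
# private MinIO deployment. All names and credentials are placeholders.
import boto3

# Source: public cloud S3, credentials resolved via the default chain.
src = boto3.client("s3")

# Destination: private MinIO, reachable through the same S3 API.
dst = boto3.client(
    "s3",
    endpoint_url="https://minio.internal.example.com:9000",  # assumed endpoint
    aws_access_key_id="EXAMPLE_ACCESS_KEY",
    aws_secret_access_key="EXAMPLE_SECRET_KEY",
)

paginator = src.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="public-cloud-datalake"):
    for obj in page.get("Contents", []):
        # Each GET here incurs one final round of egress charges on the way out.
        body = src.get_object(Bucket="public-cloud-datalake",
                              Key=obj["Key"])["Body"]
        # upload_fileobj streams the object, using multipart for large files.
        dst.upload_fileobj(body, "datalake", obj["Key"])
```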
Starting point is 00:11:18 The inherently elastic nature of the public cloud makes scaling there appear attractive, but it is almost always the wrong choice from an economic perspective. This is particularly true for data-intensive tasks like AI and ML, where the costs and loss of control in the public cloud can be substantial. As data scales, private cloud solutions with MinIO become economically superior, offering equivalent, arguably better, technologies at reduced costs. By leveraging commodity hardware and private cloud infrastructure, companies can achieve significant cost savings and performance benefits compared to the public cloud, sometimes as much as 70%. We suggest exploring migration away from the public cloud
Starting point is 00:11:58 for your workloads and using MinIO to modernize and scale your critical business applications. If you want to learn more and take advantage of our value engineering function to run your own models, please reach out to us at hello@min.io, and we can start the conversation. Thank you for listening to this HackerNoon story, read by Artificial Intelligence. Visit hackernoon.com to read, write, learn, and publish.
