Part 1: To Cloud or Not to Cloud
Blog posts, subscriptions, and services around reducing cloud costs have become legion in the last few years. Most people reading this likely have an email in their inbox right now advertising how to cut cloud spend. That’s not necessarily a bad thing on its own. The problem with most of these approaches, however, is that they tend to be myopic in multiple ways:
- Coherency. Often, the advice amounts to ad hoc line items that aren’t part of a larger enterprise strategy.
- Profitability. Specifically, the lack of focus on getting to positive EBITA (Earnings Before Interest, Taxes, and Amortization) and continually improving it. For example, what good is cutting costs on a single service when that cost is simply shifted to another area of the business, or, even worse, pushed onto your customers?
- Predictability. Too often, companies go through cost-cutting seasons and then move on to other priorities, invariably seeing regressions in their cost-reduction work. Profitability, especially for publicly traded companies, absolutely must be predictable and projectable.
At Edgio, our approach to cloud workloads is crystal clear: ensure the platforms return positive Earnings Before Interest, Taxes, and Amortization (EBITA) so that we can deliver value to our:
- Customers by providing a low-cost, rock-solid solution that solves their problems now and can affordably adapt to their needs in the future.
- Shareholders by providing a profitable and predictable return on investment.
- Employees by having a sustainable, coherent platform to build on, support, and sell.
Before we dive in, however, I want to take a moment to touch on the current economic context and why EBITA is so important.
Why EBITA Matters So Much Now
While interest, along with taxes and amortization, is outside the scope of an EBITA discussion (and this article), any enterprise approach to cloud spend must absolutely recognize the economic context in which those decisions are made. This is especially true when, as in the current environment, we’re dealing with a significant inflection point.
No small amount of ink has been spilled, even in the last six months of 2023, about how the sudden increase in interest rates has impacted the tech industry. To summarize: from January 2009 to January 2022, the Federal Funds rate averaged 0.52%. From January 2023 to December 2023, it averaged 5.02%.
An increase in interest rates by close to an order of magnitude, in such a short period of time, has dramatically focused investors and the industry on the importance of profitability. With interest rates so low for over a decade, money managers had to reach for growth (risk) to improve returns. However, with interest rates at their current levels, the appetite for risk, and with it the mandate for growth at all costs, has significantly diminished. Unless you’re a deeply disruptive AI startup, the princely pursuits of burning cash to buy market share are over.
Profitability is king.
Three Cloud Pillars to Positive EBITA
So how exactly does one take on cloud, or even SaaS, spending in an EBITA-positive way (that’s not myopic)?
We take a three-pronged approach that we’ll cover in this series of articles.
First, in “To Cloud or Not to Cloud,” we will review an approach for analyzing workloads to determine whether they actually make sense to run in the cloud. In this piece, we use the “SAAS” Framework (covered below) to identify which workloads actually belong in the cloud versus which we run in our own data centers, or even with other SaaS providers.
Second, in “re:Architect via FinOps,” we will look towards the workloads themselves to make sure they are architected for the cloud, optimized for cost control, and accountable to the bottom line.
Third, in “All the Cloud Coupons,” we will review the options to ensure every discount mechanism available is being leveraged. If we’re aiming for profitability, then let’s not leave a single red cent on the table. In this section, we will look at the various discount programs available for all clouds, though we will dive a little deeper into AWS.
To Cloud or Not to Cloud
Long before we suffer the slings and arrows of discount options, or take arms optimizing our architecture for the cloud and its costs, we need to first determine if the workload even belongs there.
To illustrate what I’m talking about, let me share an experience I’ve encountered multiple times throughout my career. The example below is about git; however, I’ve seen it with all forms of infrastructure: observability, CI/CD, widget servers, etc.
Git Thee to a DC Nunnery
Let’s say you’re a platform engineer and you need an enterprise-grade git server (GitHub, GitLab, Bitbucket, etc.). You want to keep an eye on EBITA, so you opt for the cheaper on-prem instance, with licensing at $1,000 a month, instead of shelling out the money for the SaaS option at $5,000 a month. You install your server on one of the boxes in your data center (DC) and everything is working great.
Fast forward a few years, and now you have a new CFO, or your organization has changed ownership. Either way, your company has decided to move to an OpEx model, and everything in the data center needs to be moved to the cloud.
Next thing you know, your git server is costing you almost $10k a month: the $1k for licensing plus another $8k for running that server in the cloud (instance costs, data transfer, IPs, compliance monitoring, etc.). Your annual costs have just jumped from $12k to $108k without any additional value delivered or offsetting revenue generated.
That is not how you become EBITA positive.
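To put rough numbers on that story, here is a quick sketch using only the illustrative figures above (none of these are real vendor quotes):

```python
# Hypothetical annual cost comparison for the git server example.
# All figures are the illustrative numbers from the text, not real quotes.

MONTHS = 12

# Option 1: on-prem license on hardware you already own
on_prem_monthly = 1_000

# Option 2: same license, "lifted and shifted" into the cloud
# (license + instances, data transfer, IPs, compliance monitoring)
cloud_monthly = 1_000 + 8_000

# Option 3: the vendor-hosted SaaS tier
saas_monthly = 5_000

for name, monthly in [("on-prem", on_prem_monthly),
                      ("lift-and-shift cloud", cloud_monthly),
                      ("SaaS", saas_monthly)]:
    print(f"{name:>20}: ${monthly * MONTHS:,}/yr")
# on-prem $12,000/yr, lift-and-shift cloud $108,000/yr, SaaS $60,000/yr
```

Note that under these illustrative numbers, even the SaaS tier that looked expensive at the start would have been far cheaper than lifting the on-prem server into the cloud.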
Let’s look at how to avoid this and other, even worse, decisions.
The “SAAS” Framework
The SAAS Framework isn’t an over-engineered decision tree aiming to solve all things for all use cases. Rather, it is a simple four-category mnemonic lens through which to view your business needs and whether the workloads that support them should be in the cloud. It’s about what you see in your cloud needs, with the end goal, obviously, being cash.
Scale

When analyzing workloads for whether they belong in the cloud, on-premise, or even on another vendor’s cloud (SaaS), your first pass should be around capacity planning and the application’s scaling needs.

The primary question: do you have enough spare capacity (server, network, rack, power, etc.) in your own data centers/colos (colocation facilities) to handle the peak of your workload’s traffic/demand…without impacting margin or revenue?
If you have the spare capacity and there are no other issues involved (covered below), then this is an easy decision: run those workloads where you have the lowest cost profile, usually in your own data center.
For most situations, however, it is a bit more involved, as the average organization doesn’t have significant spare capacity just sitting in inventory waiting to be used. As such, an opportunity cost analysis is needed to determine how much revenue or margin we’re cannibalizing by allowing the new application to use capacity previously claimed by other workloads. On top of that, you will need to account for the idle cost of the resources that support peak load while they run during non-peak times.
When you have a mature product serving a mature market (or a captured market, like internal platforms), it’s pretty straightforward to project your lifecycle management costs, along with your traffic/utilization peaks, and compare the ROI of cloud versus data center. When either your product or your market is less predictable, the complexity increases.
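The trade-off above can be sketched as a toy model: a data center must be provisioned for peak (so it pays for idle capacity off-peak), while the cloud bills closer to actual utilization but at a higher unit rate. The unit costs and utilization curve below are hypothetical placeholders, not real pricing.

```python
# Toy model: provision-for-peak data center vs pay-per-use cloud.
# Unit costs and the utilization curve are hypothetical illustrations.

dc_cost_per_unit_hour = 0.03     # cheaper per unit, but always on
cloud_cost_per_unit_hour = 0.10  # pricier per unit, but scales with load

peak_units = 100                            # capacity needed at peak
hourly_utilization = [20] * 16 + [100] * 8  # 8 peak hours per day

# The DC must run peak_units around the clock, idle or not.
dc_daily = dc_cost_per_unit_hour * peak_units * 24

# The cloud (ideally) bills only for units actually used each hour.
cloud_daily = sum(cloud_cost_per_unit_hour * u for u in hourly_utilization)

print(f"DC:    ${dc_daily:.2f}/day")     # 0.03 * 100 * 24  = 72.00
print(f"Cloud: ${cloud_daily:.2f}/day")  # 0.10 * 1120 units = 112.00
```

With eight peak hours a day, the data center wins despite the idle capacity; shrink the peak to one spiky hour a day and the cloud side wins. That sensitivity to the peak-to-trough shape is exactly why this analysis has to be modeled per workload rather than decided by slogan.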
To crystallize, let’s go back to the git server example for two contrasting use cases.
- If the git server wasn’t used in production (i.e. didn’t cannibalize revenue traffic).
- If it had stable traffic/usage.
- If you had spare hardware & resources to run it on.
…then it makes absolutely no sense to run it in the cloud.
On the other hand, if:
- It was integrated with production (and a necessary component of COGS).
- Its usage peak was 10x the trough.
- You would need to procure hardware, network, or rack & power to run on-prem.
…then it would be worth running the numbers to see whether the ROI supports cloud or SaaS hosting.
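The two contrasting scenarios can be reduced to a small first-pass screen. This is a deliberately simplified sketch of the checklist above, not a complete decision engine, and the 2x peak-to-trough “stable usage” threshold is an arbitrary placeholder:

```python
def first_pass_placement(in_production: bool,
                         peak_to_trough: float,
                         have_spare_capacity: bool) -> str:
    """First-pass screen from the git-server scenarios; a real analysis
    would follow up with a full opportunity-cost and ROI model."""
    stable_usage = peak_to_trough < 2  # placeholder stability threshold
    if not in_production and stable_usage and have_spare_capacity:
        return "on-prem"           # non-revenue, flat usage, capacity in hand
    return "model cloud/SaaS ROI"  # spiky, production-coupled, or capacity-poor

# Scenario 1: dev-only git server, flat usage, spare hardware available
print(first_pass_placement(False, 1.2, True))   # -> on-prem

# Scenario 2: production-integrated, 10x peak-to-trough, no spare hardware
print(first_pass_placement(True, 10.0, False))  # -> model cloud/SaaS ROI
```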
Scale New Products to the Cloud
At this point, I want to make a specific call out about product maturity. The process of developing a new product, finding the right product market fit, and building it to maturity has enough risk on its own. You don’t want to gamble existing revenue or margin on taking a new product to market. Run those workloads in the cloud until the product is more mature, and then repatriate it into your data center when the ROI justifies it.
Ability

In the previous section, we covered whether the org has enough physical and virtual resources to support workloads profitably. This section is about whether you have the actual ability, in human resources and expertise, to do the same.
For some workloads, there simply is no feasible option other than the cloud. A great example is platforms that require physical proximity to end users, like edge computing and Content Delivery Networks (CDNs). AI, and specifically Large Language Models (LLMs), is another use case where the workload simply isn’t going to be supported by leasing half a rack in your local colo.
Expertise & Engineering Capacity
Along with the geographical limitations covered above, there are certainly situations where your organization doesn’t have the subject matter experts to support running your workloads on-premise.
Going back to our git server example, let’s tweak an assumption: what if it wasn’t git we were talking about, but an entire enterprise logging subsystem? At Edgio, I am blessed with an amazing team; the level of skill we have supporting our observability platforms is some of the best I’ve seen. They can architect, develop, manage, and support logging at hyperscale. I haven’t always been that lucky and, at multiple times, have had to farm logging out to a SaaS vendor or the cloud because, in-house, we didn’t have the experience to support a logging subsystem running at four or five nines (99.99% or 99.999%) of uptime.
Note that last sentence, because it is a critical calculation around which workloads you run in-house and which you send to a SaaS vendor or the cloud. A hard rule of thumb, at least for production workloads: you need the expertise and the bandwidth to run the resource at four or five nines of uptime. If you don’t have the expertise, or enough people to support your workloads at that level, ship it to SaaS or the cloud.
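For context on what that rule of thumb actually demands, the allowed downtime per year shrinks fast with each added nine:

```python
# Allowed downtime per year at a given availability level.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 (ignoring leap years)

for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{label} ({availability:.5f}): {downtime_min:.1f} min/yr")
# three nines -> ~525.6 min/yr; four nines -> ~52.6; five nines -> ~5.3
```

At five nines, your team gets roughly five minutes of total downtime a year, including upgrades and incidents. If that number is not realistically achievable with your current staff, that is the signal to ship the workload to SaaS or the cloud.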
Architecture (and Automation)
First, the architecture of your application/platform/etc. matters tremendously when considering on-prem versus the cloud. AWS, Microsoft, and GCP all have their own take on what cloud-native means, but at its core, it’s a modular, not monolithic, approach to your architecture, so that you are paying only for the resources absolutely necessary for the workload.
If your application architecture is heavily monolithic, just be aware that you will face headwinds moving those workloads to the cloud while maintaining margins. At the same time, moving to a modular architecture can significantly reduce your overall hardware costs, even if you never consider moving to the cloud.
The inverse, however, is the situation with automation. Many companies simply do not have a strong automation/orchestration foundation in their data centers, which makes complex workloads and advanced approaches impossible. Some workloads require a whole other level of automation to run profitably; without it, the number of people you need to throw at the problem makes it far too expensive a solution.
Security (and Compliance)

Security and compliance are foundational to any decision around your workloads, cloud or not. Even with FedRAMP, GovCloud, HIPAA-aligned architectures, and the rest, some organizations’ compliance and regulatory requirements leave them little option to leverage the cloud for certain workloads.
Even when moving to the cloud is possible, as mentioned in the section above, some organizations don’t have the skill set in their workforce to build a compliant solution there. In those cases, the analysis calls for caution.
At the same time, we’ve found that, due to compliance and other non-functional requirements (NFRs), utilizing things like serverless in the cloud can actually be more profitable for some workloads than running them in-house. Architecture and use case requirements make all the difference here, but for the sake of profitability, it is worth modeling out your situation.
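As a rough illustration of that last point, the serverless-versus-in-house question often comes down to a break-even on request volume. Every number below is a hypothetical placeholder, not any provider’s actual rate:

```python
# Break-even sketch: always-on in-house server vs per-invocation serverless.
# All prices are hypothetical placeholders, not real provider pricing.

inhouse_monthly = 2_000.0      # amortized hardware, power, compliance, people
serverless_per_million = 15.0  # blended cost per million invocations

breakeven_millions = inhouse_monthly / serverless_per_million
print(f"Break-even: {breakeven_millions:.0f}M requests/month")  # ~133M
```

Below that break-even volume, the pay-per-use model is the cheaper choice in this toy model, and the reduced in-house compliance surface often tilts it further. Above it, repatriation starts to look attractive, which is exactly the kind of modeling this section is arguing for.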
At this point, we’ve covered why positive EBITA matters and how to approach your cloud decisions in a way that maximizes profitability.
Now that we’ve set the stage on the “why” and its approach, the next article in this series will dig deep into the “how.” We’ll cover the various approaches, reasoning, and pitfalls around using FinOps to re-architect your workloads for the cloud and, more importantly…for profitability.