Is Decentralized Infra a Solution in Search of a Problem?
Four arguments for decentralized infra, 75% of which are tangible & material
Absolutely not. Decentralized Infra brings salient & material benefits compared to centralized cloud providers (e.g. AWS, Azure, GCP, etc.) and will manifest over time.
Let’s first define what Decentralized Infra even means — very simply, de-infra tries to create a decentralized & distributed version of what you see under AWS/GCP/Azure Cloud Offerings — a screenshot from the AWS website below.
So what’s the incremental value of recreating all this? The existing centralized cloud services seem to work great!
This was one of the major focuses of my Primary Research on Decentralized Storage & Infra a couple of weeks back. Below is the side-by-side comparison of the pros and cons of centralized vs. decentralized with a focus on the data storage providers, but the takeaways are largely generalizable to the broader de-infra space.
Let’s TLDR this.
The benefits of De-infra over Ce-infra are can be grouped into four buckets:
Philosophical (won’t deep dive here)
Tangible & material
Lay the bedrock for new possibilities (DataDAO, data NFTs, Self-sovereign data monetization)
Unlock closed doors, so to speak
#1, I won’t waste your time here
We all know a decentralized web3 should not be built on a centralized infrastructure because it introduces single points of failure and censorship risks.
#2, De-infra has tangible & material benefits.
A. it’s cheaper and here’s the math
I will use an example from the de-storage space, but the same logic can be applied to other categories of infra, such as compute, retrieval, CDN, etc.
To understand how, let’s first compare the costs & revenue of storage providers in web2 (AWS/GCP/Azure) vs. web3 (Filecoin storage providers) in the most broad-stroke way possible.
Centralized Storage Provider (e.g.AWS)
Numbers taken from Amazon’s 2021 10k report
$64 Billion in CAPEX (tech infrastructure) allocated to AWS, excluding land lease
$44 Billion in OPEX allocated to AWS operations
payroll costs of ~40k employees (usually 15%-30% of revenue based on industry benchmark)
Overhead
Marketing & more
Revenue Source: charge storage clients (mainly enterprises)
VS. Decentralized Infra Provider
CAPEX: mining hardware equipment
OPEX: just yourself (assuming you are not a mining farm) + electricity + personnel costs if you do run a mining farm
Revenue Sources:
Fees charged to clients: fee clients may pay to incentivize deals to clear — still much cheaper than centralized cloud. E.g. storage fee on Filecoin is 99.998% cheaper than Ce-storage, calculated based on comparing the lowest tier of storage cost (for data archival/cold storage) on Google Cloud ($1/TB/Month) — AWS is more expensive — with Filecoin archival/cold storage ($0.0002/TB/month)
Block Rewards: token rewards for securing the network & providing service
Gas Premium: Fees paid by network users for message prioritization
Data storage on Filecoin will ALWAYS be orders of magnitude cheaper than AWS/GCP because decentralized storage providers
Inherently have fewer costs to cover than the centralized behemoth
Cannot charge you premium by locking you in like AWS does via HTTP1. For example, all data stored with de-storage providers through Filecoin’s coordination layer is IPFS/content-addressed, meaning if your storage provider decides to set a new price way above the market rate, you can take your data with you and go to a different storage provider anytime.
[!!Most important and what truly differentiates the monetization model in web3 from web2] Decentralized infra providers have more revenue sources (#2 + #3) other than charging client service fees (#1).
In fact, when the token price goes up, it can actually get CHEAPER for storage clients. Using the same Filecoin example, the more real data that miners/storage providers host, the more block reward they earn. In turn, to earn more tokens/block rewards, storage providers are incentivized to charge clients less — in some cases even negative fees by paying their clients — to win the auction to store large amount of data→ get more block rewards.
B. De-infra can be faster/more performant (WIP)
This is a harder argument to make at the moment, as decentralized infra is in the beginning of the S curve and not ALL de-infra providers have superior performance than centralized cloud, YET.
But there are instances where de-infra match or outperforms centralized Cloud — for example, Storj claims that their video sharing/streaming speed is comparable or even faster than AWS because of its globally distributed 15,000 nodes, making it faster to reassemble the content and deliver to end customers. Pocket Network also claims to have superior performance at a cheaper cost compared to centralized RPC node providers like Alchemy because of the globally distributed nodes.
#3. De-infra is the bedrock for new categories of innovation
Namely, DataDAO, Self-sovereign Data Monetization, Data NFT, and more.
What can’t the above be achieved today on centralized infra?
Case in point, this article was published yesterday to expose the banning of scientists’ access to important databases if their research topics are deemed “forbidden” or sensitive by National Institutes of Health. As much as I relate to the outrage from the research community, my initial reaction was that this would have never been possible if the databases were built on IPFS & stored via Filecoin. When you store with centralized cloud providers, authorities can simply block off access to the data center location where your data resides. But for data built on content-addressed IPFS and stored in a distributed fashion, as long as there’s a single copy somewhere on the web (searchable by the unique content ID), it can be infinitely and propagated against any censorship efforts.
To take a step further, what if we can tokenize such valuable datasets as data NFTs, form Data DAOs around them, and let the communities vote on
the right to access & privacy
the duration of use
computation policy & requests
Researchers can draft a proposal for a topic & length of access, which will be jointly reviewed and approved by relevant communities. The datasets/data NFTs will be stored via IPFS and assigned a unique CID (CID), which makes any modification or abuse of the dataset detectable, transparent, auditable, and traceable.
A question here is how to make the computation & privacy policies of these dataNFTs enforceable on-chain — which probably warrants a separate discussion. For example, how to balance computability (will data download be allowed?) with privacy (if users can download a local copy, what would be the mechanism to stop them from distributing to unwanted third parties?)
#4. Unlock closed doors, so to speak
Closed doors symbolize censorship & content moderation in this context — and I don’t mean just the governance and political censorship.
Take content moderation as an example. Facebook/Meta says it has spent $13 billion on safety and security efforts since 2016 — sure, good for them. But who’s to say they are the ones making rules about which type of content is suitable for a certain audience group?
What if we can create new category of dapps that can test out a much more transparent and nimble way of moderating content built on a pool of open data, and let dapps or consumers experiment with different models to filter based on their granular needs and wants?
And don’t even get me started on how I lost access to my Facebook account, along with an entire High School worth of pictures & memories, because my account was hacked and there was no place to recover. How will de-infra change this? If my pictures were IPFS based and pinned to the web3, I will always be able to locate and recover them, even if Facebook goes out of business.
However, with the above said, I do recognize and acknowledge the obvious shortfalls of the existing de-infra offerings due to the nascency of the space.
Below is a good overview of the challenges in the de-storage context, and more generalized adoption challenges will be discussed in a separate post.
For those of you unfamiliar with IPFS vs. HTTP, essentially HTTP points to a location or data center that stores your data. You guessed it — access to such location is strictly controlled by Cloud providers like AWS/GCP/Azure — this is why they can lock you in. IPFS instead points directly to a content ID (CID), which means you can take your data on IPFS and switch provider anytime you want — hence SPs on Filecoin can’t charge premium over storing because everything is determined by supply vs. demand.