I am extremely Not A Database Person, but I understand that the rationale for Kubernetes adopting etcd as its preferred data store was more about its distributed consistency guarantees and less about query throughput. etcd is slower because it's doing Raft things: every write has to reach consensus and get flushed to disk.
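To put a toy model on the "Raft things": a committed etcd write costs at least a durable log append (fsync) on the leader plus a wait for a quorum of followers to ack, while a single-node store only pays the local write. A minimal Go sketch of that latency model (node counts and latencies are made-up numbers, not etcd measurements):

```go
// Toy model of why a consensus write costs more than a local write.
// A Raft leader must append the entry to its own log (fsync) and wait
// for a majority of the cluster to acknowledge (followers fsync too,
// folded into their ack latency here). Real Raft pipelines these
// steps; this sketch runs them sequentially for simplicity.
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"time"
)

// one follower's append + fsync + ack round trip (made-up latency)
func followerAck() time.Duration {
	return time.Duration(1+rand.Intn(5)) * time.Millisecond
}

func raftWrite(followers int) time.Duration {
	fsync := 2 * time.Millisecond // leader's durable log append

	acks := make([]time.Duration, followers)
	for i := range acks {
		acks[i] = followerAck()
	}
	sort.Slice(acks, func(i, j int) bool { return acks[i] < acks[j] })

	// a write commits once a majority of (leader + followers) has it;
	// the leader counts toward the quorum, so wait for quorum-1 acks,
	// i.e. the (quorum-2)-indexed entry of the sorted ack times
	quorum := (followers+1)/2 + 1
	return fsync + acks[quorum-2]
}

func main() {
	fmt.Println("single-node write:", 2*time.Millisecond, "(just the fsync)")
	fmt.Println("3-node raft write:", raftWrite(2))
}
```

The point is just that commit latency is bounded below by fsync plus the quorum round trip, which is exactly the cost a non-consensus backend skips.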
Projects like kine allow K8s users to swap sqlite or postgres in place of etcd, which (I assume, please correct me otherwise) would deliver better throughput since those backends don't need to perform consensus operations.
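If it helps to see why no consensus round is involved: kine essentially shims etcd's API onto a single SQL backend, so each write is just one transaction. A minimal Go sketch of that idea, assuming a hypothetical kv(name, value, revision) table and the lib/pq Postgres driver; this is not kine's actual schema or code:

```go
// Sketch of the kine idea: serve etcd-style Put/Get from one SQL
// backend, so every write is a single transaction with no quorum
// round trip. Hypothetical kv(name, value, revision) table; NOT
// kine's real schema, and the gRPC shim and watch support are omitted.
package kvsketch

import (
	"database/sql"

	_ "github.com/lib/pq" // Postgres driver, one of kine's supported backends
)

type KV struct{ DB *sql.DB }

// Put stores a new revision of the key. Keeping every revision is what
// lets a shim like this answer etcd's MVCC-style reads and watches.
// (A real implementation must serialize revision allocation; this
// sketch ignores the race between concurrent writers.)
func (s *KV) Put(key string, value []byte) error {
	_, err := s.DB.Exec(
		`INSERT INTO kv (name, value, revision)
		 VALUES ($1, $2, (SELECT COALESCE(MAX(revision), 0) + 1 FROM kv))`,
		key, value)
	return err
}

// Get returns the newest revision of a key.
func (s *KV) Get(key string) ([]byte, error) {
	var v []byte
	err := s.DB.QueryRow(
		`SELECT value FROM kv WHERE name = $1 ORDER BY revision DESC LIMIT 1`,
		key).Scan(&v)
	return v, err
}
```

The tradeoff, of course, is that you've swapped Raft's replicated durability for whatever HA story the SQL backend gives you.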
There are also distributed databases that use Raft and still scale while delivering distributed consensus, so it's not an unsolvable challenge. For example, TiDB handles millions of QPS while delivering ACID transactions: https://vivekbansal.substack.com/p/system-design-study-how-f...
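TiDB speaks the MySQL wire protocol, so a plain MySQL driver is enough to get a distributed ACID transaction. A minimal Go sketch (the accounts table and connection details are placeholders; 4000 is TiDB's default SQL port):

```go
// Run an ACID transaction against TiDB using a stock MySQL driver.
// The accounts table and credentials are placeholders for illustration.
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "root@tcp(127.0.0.1:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	// Two writes that must commit or fail together.
	if _, err := tx.Exec(`UPDATE accounts SET balance = balance - 100 WHERE id = 1`); err != nil {
		tx.Rollback()
		log.Fatal(err)
	}
	if _, err := tx.Exec(`UPDATE accounts SET balance = balance + 100 WHERE id = 2`); err != nil {
		tx.Rollback()
		log.Fatal(err)
	}
	if err := tx.Commit(); err != nil {
		log.Fatal(err)
	}
}
```

The scaling trick is that Raft runs per region of data rather than once for the whole cluster, so consensus traffic shards along with the data.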
But, and I'm honestly asking: as a GKE user you don't have to manage that Spanner instance, right? So in theory you should be able to just throw higher loads at it and Spanner should autoscale?
> To support the cluster’s massive scale, we relied on a proprietary key-value store based on Google’s Spanner distributed database... We didn’t witness any bottlenecks with respect to the new storage system and it showed no signs of it not being able to support higher scales.
Yeah, I guess my question was a bit more nuanced. What I was curious about was whether they were relying on the normal autoscaling any customer would get, or manually scaling the Spanner instance in anticipation of the load. I guess it's unlikely we'll get that level of detail from this article, though.
> Projects like kine allow K8s users to swap sqlite or postgres in place of etcd, which (I assume, please correct me otherwise) would deliver better throughput since those backends don't need to perform consensus operations.
https://github.com/k3s-io/kine