
For those not aware: if you create too many resources, you can easily use up all of the 8GB hard-coded maximum size in etcd, which causes a cluster failure. Compaction and maintenance mitigate this risk somewhat, but it takes just one misbehaving operator or integration (e.g. hundreds of thousands of Dex session resources created for Pingdom/crawlers) to mess everything up. Backups of etcd are critical. That Dex example is why I stopped using it for my IdP.
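To make the "compaction and maintenance" part concrete: the routine is to compact old revision history and then defragment to reclaim the space on disk. A rough sketch with the official Go client (clientv3), assuming a single local endpoint; the endpoint and timeouts are illustrative:

  package main

  import (
    "context"
    "fmt"
    "log"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
  )

  func main() {
    cli, err := clientv3.New(clientv3.Config{
      Endpoints:   []string{"localhost:2379"}, // assumed local cluster
      DialTimeout: 5 * time.Second,
    })
    if err != nil {
      log.Fatal(err)
    }
    defer cli.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    // See how close the backend is to the quota.
    status, err := cli.Status(ctx, "localhost:2379")
    if err != nil {
      log.Fatal(err)
    }
    fmt.Printf("db size: %d bytes, revision: %d\n", status.DbSize, status.Header.Revision)

    // Compact away revision history up to the current revision...
    if _, err := cli.Compact(ctx, status.Header.Revision); err != nil {
      log.Fatal(err)
    }
    // ...then defragment to actually release the space on disk.
    if _, err := cli.Defragment(ctx, "localhost:2379"); err != nil {
      log.Fatal(err)
    }
  }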




This is why I’ve always thought Tekton was a strange project. It feels inevitable that if you buy into Tekton CI/CD you will hit issues with etcd scaling due to the sheer number of resources you can wind up with.

What boundaries does this 8GB etcd limit cut across? We've been using Tekton for years now, but each pipeline exists in its own namespace and that namespace is deleted after each build. Presumably that kind of wholesale cleanup keeps the DB size in check, because we've never had a problem with etcd size...

We allocate several hundred resources for each build and do hundreds of builds a day. The current cluster has been doing this for a couple of years now.
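For anyone curious how that cleanup looks in practice, here's a rough sketch with client-go, assuming in-cluster credentials and a hypothetical per-build namespace name; deleting the namespace cascades to every namespaced resource the build created:

  package main

  import (
    "context"
    "log"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
  )

  func main() {
    // Assumes this runs in-cluster with RBAC permission to delete namespaces.
    cfg, err := rest.InClusterConfig()
    if err != nil {
      log.Fatal(err)
    }
    client, err := kubernetes.NewForConfig(cfg)
    if err != nil {
      log.Fatal(err)
    }

    // "build-1234" is a hypothetical per-pipeline namespace; deleting it
    // garbage-collects all the per-build resources in one shot.
    err = client.CoreV1().Namespaces().Delete(context.Background(), "build-1234", metav1.DeleteOptions{})
    if err != nil {
      log.Fatal(err)
    }
  }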


Yeah, I mean if you're deleting namespaces after each run then sure, that may solve it. They also have a pruner now that you can enable to set retention periods for pipeline runs.

There are also some issues with large Results, though I think you have to enable that manually. From their site:

> CAUTION: the larger you make the size, more likely will the CRD reach its max limit enforced by the etcd server leading to bad user experience.

And then if you use Chains you’re opening up a whole other can of worms.

I contracted with a large institution that was moving all of their CI/CD to Tekton, and they hit etcd scaling issues pretty early in the process and had to get Red Hat to address some of them. If RH couldn't get them addressed, they were going to scrap the whole project.


Yeah, quite unfortunate. But maybe there is hope: k3s uses Kine, an etcd translation layer for relational databases, and there is another project called Netsy which persists to S3: https://nadrama.com/netsy. Some interesting ideas. Hopefully native Postgres support gets added, since it's so ubiquitous and performant.

It's not hardcoded; you can increase it via the --quota-backend-bytes flag.
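If you run etcd embedded, the same knob is a plain config field. A rough sketch with etcd's embed package; the data dir and the 16GB figure are illustrative, not recommendations:

  package main

  import (
    "log"

    "go.etcd.io/etcd/server/v3/embed"
  )

  func main() {
    cfg := embed.NewConfig()
    cfg.Dir = "/tmp/etcd-data" // assumed scratch data directory
    // Equivalent of the --quota-backend-bytes flag: raise the quota to 16GB.
    cfg.QuotaBackendBytes = 16 * 1024 * 1024 * 1024

    e, err := embed.StartEtcd(cfg)
    if err != nil {
      log.Fatal(err)
    }
    defer e.Close()

    <-e.Server.ReadyNotify() // block until the server is serving
    log.Println("etcd is ready")
    <-e.Err() // run until a fatal error
  }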

There is a hard-coded warning which says safety is not guaranteed above 8GB. I have tried increasing the quota after a database had become full and etcd didn't start. It's definitely not a recovery strategy for a full etcd by itself; maybe it's a way to eke out a slightly larger margin of safety.
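Part of why raising the flag alone doesn't recover a full database: once the quota is exceeded, etcd raises a persistent NOSPACE alarm that keeps rejecting writes until it is explicitly disarmed. A rough sketch of the disarm step with clientv3, assuming a reachable endpoint (normally you'd compact and defragment first):

  package main

  import (
    "context"
    "log"
    "time"

    clientv3 "go.etcd.io/etcd/client/v3"
  )

  func main() {
    cli, err := clientv3.New(clientv3.Config{
      Endpoints:   []string{"localhost:2379"}, // assumed endpoint
      DialTimeout: 5 * time.Second,
    })
    if err != nil {
      log.Fatal(err)
    }
    defer cli.Close()

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // List active alarms (e.g. NOSPACE after the quota was exceeded)...
    alarms, err := cli.AlarmList(ctx)
    if err != nil {
      log.Fatal(err)
    }
    // ...and disarm each one so the cluster accepts writes again.
    for _, a := range alarms.Alarms {
      if _, err := cli.AlarmDisarm(ctx, a); err != nil {
        log.Fatal(err)
      }
    }
  }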

This warning seems to be outdated. We have run etcd at much larger volumes without issues (at least none related to its size). Alibaba has been running 100GB etcd clusters for a while now, and probably others have too.

Thank you for the update


