In my experience, when presented with a failing test it would simply try to make...

0x457 · 2025-05-23T00:42:16 1747960936

I once watched it go through probably 10 iterations trying to fix a broken test, then it decided that we didn't need the test at all and tried to just remove it.

IMO, you either write the tests and let it write the implementation, or write the implementation and let it write the tests. Maybe use something to write the tests, then forbid the "implementor" from modifying them.
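
One rough way to enforce the "forbid modifying tests" part is to hash the test files before the implementor runs and compare afterwards. This is only a sketch: `snapshot_tests` and `changed_tests` are hypothetical helper names, and the `test_*.py` pattern is an assumption about how the test files are named.

```python
import hashlib
from pathlib import Path


def snapshot_tests(root, pattern="test_*.py"):
    """Map each test file (path relative to root) to the SHA-256 of its contents.

    The default pattern is an assumption; adjust it to the project's layout.
    """
    root = Path(root)
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob(pattern))
    }


def changed_tests(before, after):
    """Return the sorted paths of test files that were added, removed, or edited."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Take `before = snapshot_tests("tests")`, let the implementor do its thing, then reject the change if `changed_tests(before, snapshot_tests("tests"))` is non-empty. A deleted test shows up as a changed path too, which covers the "just remove it" case.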