


espadrine on Nov 3, 2022 | on: Adversarial Policies Beat Professional-Level Go AI...

It thought it would win.

KataGo was trained on multiple scoring methods at the same time: the scoring method is an input to the algorithm[0]. The model learnt that it would win when passing, and it seems it never had the opportunity to learn that passing would not win under Tromp-Taylor, because its opponent in self-play (KataGo itself) then either passed and lost (under other rules) or resigned.

[0]: https://github.com/lightvector/KataGo/blob/master/cpp/config...
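
To make the scoring difference concrete, here is a minimal sketch of Tromp-Taylor area scoring as I understand the rules. It is not KataGo's code; the function name `tromp_taylor_score` and the 5x5 toy positions are made up for illustration. The key point is that nothing is removed as "dead" before counting, so a single stray opposing stone stops an empty region from counting for anyone.

```python
from collections import deque

def tromp_taylor_score(board):
    """Score a position by area, with no dead-stone removal.

    board: list of equal-length strings using 'B', 'W' and '.'.
    Returns (black_points, white_points), komi not included.
    Hypothetical helper for illustration, not part of KataGo.
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone on the board counts
            elif (r, c) not in seen:
                # Flood-fill this empty region and record which colors it touches.
                size, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbor = board[ny][nx]
                            if neighbor != ".":
                                borders.add(neighbor)
                            elif (ny, nx) not in seen:
                                seen.add((ny, nx))
                                queue.append((ny, nx))
                # The region scores only if it reaches exactly one color.
                if len(borders) == 1:
                    score[borders.pop()] += size
    return score["B"], score["W"]

# Toy position: Black walls off the left side, White holds the right edge.
clean = ["..B.W"] * 5
# Same position with one White stone left inside Black's area.
with_stray_stone = ["W.B.W"] + ["..B.W"] * 4

print(tromp_taylor_score(clean))             # (15, 5)
print(tromp_taylor_score(with_stray_stone))  # (5, 6)
```

In the second position a human scorer (or a ruleset with dead-stone removal) would still give Black the left side, but under this counting Black's area borders a White stone and scores nothing, so passing there loses.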

pmontra on Nov 3, 2022

So it's a bug in the training method, probably a very minor one because nobody had exploited it before. The only really interesting thing here is that it took another AI to find that bug.
