nl on Jan 18, 2023 | on: Which GPU(s) to get for deep learning

The lack of ECC memory is almost certainly not a factor. If you can train at FP8, your model will recover from a single flipped bit somewhere.
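
To make "a single flipped bit" concrete, here is a minimal sketch of what one flip does to a stored weight. It uses float32 rather than FP8, because the Python standard library has no FP8 type; the helper name `flip_bit` and the weight value 0.5 are made up for illustration and are not from the thread.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as float32, with one bit flipped (0 = least significant)."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 32)")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # Toggle the chosen bit, then reinterpret the result as a float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


weight = 0.5  # hypothetical weight value, chosen only for illustration

# Bits 0-22 are the mantissa, 23-30 the exponent, 31 the sign (IEEE 754 float32).
for bit in (0, 22, 23, 30, 31):
    print(f"bit {bit:2d}: {weight} -> {flip_bit(weight, bit)!r}")
```

In this layout, a flip in a low mantissa bit barely moves the value, while a flip in a high exponent bit changes it by many orders of magnitude, so how large the perturbation is depends on which bit gets hit.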

Loranubi on Jan 19, 2023

I mean you could even view bit flips as a regularization technique like dropout...

dahart on Jan 19, 2023

Yeah I hear it’s common practice now to avoid synchronizing GPU training kernels in order to speed things up, and it has positive regularization benefits and little downside.
