
What? No...

Or, more accurately: if you need "dozens of trillions" that implies a false positive rate so low, it's practically of no concern.

You'd want to look up the Poisson distribution for this. But, to get at this intuitively: say you have a bunch of eggs, some of which may be spoiled. How many would you have to crack open to get a meaningful idea of how many are still fine, and how many are not?

The absolute number depends on the fraction that are off. But independent of that, you'd usually start trusting your sample when you've seen 5 to 10 spoiled ones.
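
To put a rough number on that rule of thumb: if the spoiled count behaves like a Poisson count (my assumption, reasonable when spoiled eggs are a small fraction of the sample), its standard deviation is about sqrt(k), so the relative uncertainty of the estimate is about 1/sqrt(k). A quick sketch, with a function name I made up:

```python
import math

def relative_uncertainty(spoiled_seen):
    """Approximate relative standard error of a Poisson count: 1 / sqrt(k)."""
    return 1.0 / math.sqrt(spoiled_seen)

# How noisy is the estimated spoilage rate after seeing k spoiled eggs?
for k in (1, 5, 10, 100):
    print(f"{k:>3} spoiled seen: about +/- {relative_uncertainty(k):.0%}")
```

That is roughly 45% at 5 spoiled eggs and 32% at 10: still noisy, but enough to get the order of magnitude right.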

So Apple runs the hash algorithm on random photos. They find 20 false positives in the first million. Given that error rate, how many positives would it require for the average photo collection of 10,000 to be certain at the 1-in-a-trillion level that it's not just coincidence?

Throw it into, for example, https://keisan.casio.com/exec/system/1180573179 with lambda = 0.2 (you're expecting one false positive for every 50,000 at the error rate we assumed, or 0.2 for 10,000), and n = 10 (we've found 10 positives in this photo library) to see the chance of getting at least that many by coincidence: 2.35x10^-14, or 2.35 in 100 trillion.
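
If you'd rather not use the web calculator, here's a rough Python sketch of the same calculation. It assumes false positives are independent and Poisson-distributed, uses the made-up rate of one per 50,000 photos from above, and the function names are mine, not anything Apple has published.

```python
import math

def poisson_upper_tail(k, lam, terms=100):
    """P(X >= k) for X ~ Poisson(lam), summed term by term.

    Summing the tail directly (instead of 1 - CDF) keeps tiny
    probabilities from being lost to floating-point rounding.
    """
    term = math.exp(-lam) * lam**k / math.factorial(k)  # P(X = k)
    total = 0.0
    for i in range(k, k + terms):
        total += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return total

def min_matches(lam, target):
    """Smallest k such that P(X >= k) falls below target."""
    k = 0
    while poisson_upper_tail(k, lam) >= target:
        k += 1
    return k

# Assumed numbers from the example: 1 false positive per 50,000 photos,
# and a library of 10,000 photos.
lam = 10_000 / 50_000  # expected false positives per library = 0.2

print(poisson_upper_tail(10, lam))  # chance of 10 or more false positives
print(min_matches(lam, 1e-12))      # matches needed for 1-in-a-trillion
```

With these assumed numbers, the chance of 10 or more false positives comes out to about 2.35e-14, and 10 is also the smallest match count that gets below 1 in a trillion.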


