Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> the search is quadratic because it has to repeat that O(n) work at every position

The problem is that this is one of the regexes that backtracking engines have a bad time with.

With a NFA implementation it can be done in O(regexlen * haystacklen) time, though that only holds for true regular expressions (no backreferences).

https://swtch.com/~rsc/regexp/regexp1.html

And then for re.search, since the NFA wants to just do it once, it should run it with the pattern as

  ^.*?(\d+\s+).*$
(where *? is a non-greedy repeat)
 help



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: