
Ah, this brings back the memory of my O(n^2) fiasco. I wrote a low-level file system storage driver in the past. In the caching layer, there was a need to sort by LRU time for cache eviction. It was a minor thing that didn't run often, and I wanted to move quickly. Also, since it was low-level kernel-mode code, I wanted the code to be simple and correct. The number of cache entries was not big. Bubble sort was adequate for small N, quick to get the job done, and simple enough to ensure correctness. Plus I could replace it later if needed. So it was done and forgotten.

Things were working fine and performance was good. Then one day Windows Explorer suddenly hung with 100% CPU for a couple of seconds. This was one of the worst kinds of bugs. There was no crash to pinpoint the problem. Things still worked most of the time, just slowed down intermittently. Luckily I was able to catch a slowdown and deliberately crashed the process in time. The call trace stopped in the bubble sort function. I immediately kicked myself - it was the classic case of O(n^2) blowup. The cache entries had scaled up to a couple thousand items, and the exponential O(n^2) blowup to tens of millions of iterations was having a real impact. I switched to merge sort and performance was back to normal.

Edit: I picked merge sort because its worst case was O(n log n), unlike quick sort, whose worst case was O(n^2). Once burnt, I needed to be extra careful with edge cases.
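To put rough numbers on it, a back-of-the-envelope sketch (the n = 2000 figure is illustrative, taken loosely from the story above):

    #include <cmath>
    #include <cstdio>

    int main() {
        // Rough comparison counts for n cache entries.
        double n = 2000;  // "couple thousand items", loosely
        std::printf("bubble sort ~ n^2    : %.0f\n", n * n);             // 4,000,000
        std::printf("merge sort  ~ n lg n : %.0f\n", n * std::log2(n));  // ~21,932
        return 0;
    }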




I got the honor of filing a bug against Together's design tool. We had a project that was a monorepo with a few dozen distinct compilation units in it, and some people had complained that the more modules you added to the project, the longer it would take to open. One person was up to half an hour and didn't even have the whole thing.

I had a slow week babysitting something else so I just let it chew away in the background, while doing other things like documentation and requirements work, noting down the time, adding one more module and running it again. The last one I was willing to do took 18 hours to load. It was still running when I got back to the office the following day.

To this day, I can't recall ever seeing any production performance bug with greater than n cubed complexity. This bug progressed at n to the fifth. Truly, a thing of singular beauty.

Thankfully they had a workaround that dropped it to something like n^2 (30 minutes became less than 5) and a bug fix not too long after.


n^2 is polynomial, not exponential. If it was truly exponential (which a sort should never be unless you've decided to solve an arbitrary 3-SAT problem while sorting a list), then it would've blown up in your face much sooner.


> which a sort should never be

There's certainly a most awesome sort algorithm which is exponential...

https://www.dangermouse.net/esoteric/bogobogosort.html

They're not even sure what the complexity is, but it's something like O(n!^(n-k)) or O(n*(n!)^n).


I'm aware of bogosort and its crazy variants, but let's not be ridiculous here. You could make any "sorting" algorithm arbitrarily bad if you'd like by doing such silly but useless things.


I propose Boltzmann sort: wait around until the heat death of the universe, and a Boltzmann brain will emerge from the cosmos, and sort your values while contemplating infinity.


Ah, a fully general O(1) algorithm. Amazing.


The very idea of O(n*(n!)^n) is going to give me nightmares.


You're correct, n^2 is polynomial, not exponential. It was just an exaggerated figure of speech. BTW, O(n) and O(n log n) are also polynomial; they just don't have the same punch to them.


So the more precise term would be quadratic.

"Exponential" is indeed misused.

Half the time I hear it, the reality is lower, like quadratic or cubic. The other half of the time it's higher, like combinatorial.


Combinatorial? Seriously? The difference between exponential and combinatorial is essentially purely theoretical when it comes to computational complexity. In practice, even the difference between n and n log n is barely relevant, and exponential and combinatorial are the exponentiation thereof. Nobody ever, ever deals with processes large enough that the difference both matters and the process completes. Even a tiny scaling factor in the exponent and a moderate base means that the point at which an unscaled combinatorial passes it takes so many steps that you might as well be waiting for the heat death of the universe.

But even without any scaling factors, don't underestimate the size of the numbers involved. If you had two threads, one simply counting up to some n! upper bound, and the other up to 10^n - how long do you think a run at the minimum problem size where the "slow" combinatorial actually becomes slower than the exponential would take? Hint: at 4GHz and 1 op/cycle... let's just say I'm not holding my breath that our species will still be around by then. And that's before adding even a tiny scaling factor to the mix...
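(If you want to see where that crossover sits, a quick sketch counting in doubles - the loop bound and types are mine, but the break-even point it finds is the real one:)

    #include <cstdio>

    int main() {
        // Find the smallest n where n! overtakes 10^n.
        double fact = 1.0, pow10 = 1.0;
        for (int n = 1; n <= 30; ++n) {
            fact *= n;      // n!
            pow10 *= 10.0;  // 10^n
            if (fact > pow10) {
                std::printf("n! first exceeds 10^n at n = %d\n", n);  // prints n = 25
                return 0;
            }
        }
        return 0;
    }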

I mean, if you're trying to estimate computational complexity at least, this nuance seems pointless.

But quadratic vs. exponential really matters, with plausible parameters.


With n = 26, the combinatorial process takes 4 times as long: 4 x 10^26 "counts". Google's SHA-1 collision attack two years ago used roughly 10^19 computations. So we're off by a factor of 10^7.

I'm curious if the rise of quantum computers will make the differences between exponential and combinatorial meaningful in a practical sense.


That's probably why it's misused so much.


What does exponential look like then?


2^n (by virtue of Big-O, m^n is in O(2^n) for any fixed m).


This is untrue: There's no constant k such that k*2^n > 3^n for all n. In general O(a^n) is strictly stronger than O(b^n) if a > b.

This is why you sometimes see complexities that are e.g. O(1.3894732894^n) in wikipedia articles on the best known cases for various algorithms.


> In general O(a^n) is strictly stronger than O(b^n) if a > b.

Typo: O(a^n) is a stronger guarantee than O(b^n) if a < b. It means nothing if a > b.


I took strictly stronger to mean a^n + b^n is O(a^n) so the a^n term holds more weight. The comment didn't mention anything about guarantees.


Ah, that would make sense.

I actually thought the most likely intended meaning was that an algorithm in O(a^n) must take asymptotically longer than one in O(b^n) as n goes to infinity (if a > b), which isn't true. (For example, when a > b, then every algorithm in O(b^n) is also in O(a^n), but obviously no algorithm can asymptotically require more time than itself.)


Thanks for the correction.


Sounds like perhaps you confused it with the other direction: that the base of log(n) doesn't matter.


This isn't quite right. For example, 3^n is not O(2^n) (see, e.g., https://stackoverflow.com/questions/19081673/big-o-notation-...)


This is not true. Consider taking the limit of 3^n/2^n as n goes to infinity: the limit is infinity, hence 3^n is not O(2^n).


More like for any fixed m <= 2 :)


I once coded a bubble sort in an application where I expected the worst case to be n=4. Later that assumption was invalidated, but I had long ago moved on to something else. I hope the person who inherited my code doesn't still hate me.


Why bubble sort? I never understood its popularity. Selection sort feels a lot more intuitive to me and it's the same complexity.


Bubble sort is popular because it's provably optimal... under extremely restrictive circumstances: https://stackoverflow.com/a/3274203 Those circumstances stopped being semi-relevant when even the cheapest computers started using (floppy) disk drives instead of tape drives, but somehow bubble sort is still being taught.


It's probably because the implementation of the algorithm is the simplest, because you don't need to modify the existing data structure. You don't need to merge or create new arrays. You just repeatedly loop and swap two elements when needed. Super easy by all metrics.
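For reference, the whole algorithm fits in a dozen lines; a generic sketch:

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Bubble sort: sweep repeatedly, swapping adjacent out-of-order elements,
    // until a full pass makes no swaps. In-place, stable, O(n^2).
    void bubble_sort(std::vector<int>& a) {
        bool swapped = true;
        while (swapped) {
            swapped = false;
            for (std::size_t i = 1; i < a.size(); ++i) {
                if (a[i - 1] > a[i]) {
                    std::swap(a[i - 1], a[i]);
                    swapped = true;
                }
            }
        }
    }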


You don’t need to modify the data structure for most quadratic sorts though. Or even for quick sort.


Most people get the partition algorithm wrong on their first try if they write it themselves.
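To illustrate, here's a Lomuto-style partition (one common correct version; the boundary handling is exactly where first attempts usually go wrong):

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Lomuto partition: after the call, a[lo..i-1] < pivot <= a[i+1..hi],
    // and the pivot sits at index i. Off-by-one errors here are the classic trap.
    std::size_t partition(std::vector<int>& a, std::size_t lo, std::size_t hi) {
        int pivot = a[hi];  // use the last element as the pivot
        std::size_t i = lo;
        for (std::size_t j = lo; j < hi; ++j) {
            if (a[j] < pivot) {
                std::swap(a[i], a[j]);
                ++i;
            }
        }
        std::swap(a[i], a[hi]);  // move the pivot into its final position
        return i;
    }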


Bubble sort is the first one I learned, so I remember it the best. I never spent any time comparing any of the O(n^2) sorts so it never occurred to me to try a different one.


And here I am, just using what's in the standard library of the language I'm writing in. Never occurred to me to try a different one.


This was quite a while ago, and the library sort was qsort, which had a lot of overhead due to its calling convention. Today I'd have no problem using std::sort instead.


Ha. Have a beer!


Quick sort can be made to be O(n log n) with careful choice of pivot - e.g. https://en.wikipedia.org/wiki/Median_of_medians


Generally it is preferable to choose the pivot randomly.
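A minimal sketch of what that looks like (the generator choice is arbitrary):

    #include <cstddef>
    #include <random>

    // Pick a pivot index uniformly at random in [lo, hi]. With a random pivot,
    // the expected running time is O(n log n) on every input; the O(n^2) case
    // becomes astronomically unlikely rather than triggerable by a bad input.
    std::size_t random_pivot(std::size_t lo, std::size_t hi) {
        static std::mt19937_64 rng{std::random_device{}()};
        std::uniform_int_distribution<std::size_t> dist(lo, hi);
        return dist(rng);
    }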


Sure, and the wikipedia page makes mention of that. But it's not super relevant to the point I'm making about the worst-case complexity of quick sort (which is often misunderstood to be O(n^2)).


Quadratic, not exponential! :)


Thanks for the correct term. The newspeak of exaggerated figures of speech has got to me. :)


Would it be correct to say it's a quadratic complexity causing the run time to grow exponentially? Or is the word exponential still wrong here? My mathematics terminology is rather... dusty.


Quadratic complexity causes the runtime to grow quadratically ;-)

- Quadratic = x^2, e.g. 0,1,4,9,16,25,36,49,64,81

- Exponential = n^x, for example 2^x, e.g. 1,2,4,8,16,32,64,128,256

Already very bad for small x even if n=2, but for n higher than 2 you can imagine you will run out of time very, very quickly ;-P
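You can watch 2^n pull away in a couple of lines; a toy sketch:

    #include <cstdio>

    int main() {
        // n^2 vs 2^n side by side: 2^n pulls ahead for good after n = 4.
        for (int n = 1; n <= 20; ++n)
            std::printf("n=%2d  n^2=%4d  2^n=%8d\n", n, n * n, 1 << n);
        return 0;
    }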


I always thought of quadratic as exponential, but specifically 'N to the 2nd'. I suppose that actually makes no sense whatsoever. Thanks.


That is wrong, because quadratic means that the variable is in the base, while exponential tells you that the variable is in the exponent (just as the name says).


Ah, what you're looking for is "polynomial but specifically 'N to the 2nd'" :)


That makes a lot more sense! Actually rings a bell from when I ran into the same wrongly-remembered wording in uni.

I'm not sure why, but getting back to the right terminology/vocabulary just to get back into the basics of mathematics is a lot harder for me than it was 20 years ago.


If you casually want to get into it without the commitment, I've found that even passively watching the Tim Roughgarden videos from Coursera (probably can be found crossposted to YouTube by third parties) to be stimulating in the past.

If you seek to acquire formality and rigor, I can't help you there; I figure shortcuts will be nonexistent ;)


Yep, still wrong. Exponent != Exponential


I really should re-read some books. Thanks!


Don't fret it. Imo people are being unnecessarily curt when trying to flex their pedantry here, which we all love to do, myself included.

Of course, poly still means there is an exponent in the time complexity.

It's really nothing beyond the main takeaway: the usual norm is to describe something as having "exponential" growth when the number of computations increases by an order of magnitude every time you add but a single item to the list of inputs.


You could also go with radix sort (O(N)), since your insertion times are presumably some kind of integer.
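A sketch of what that could look like, assuming the LRU times fit in 64-bit integers (LSD radix sort, one byte per pass):

    #include <array>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // LSD radix sort on 64-bit keys: 8 stable counting-sort passes, one per byte.
    // O(n) in the number of keys, at the cost of a scratch buffer.
    void radix_sort(std::vector<std::uint64_t>& keys) {
        std::vector<std::uint64_t> scratch(keys.size());
        for (int shift = 0; shift < 64; shift += 8) {
            std::array<std::size_t, 257> count{};  // count[b+1] = size of bucket b
            for (std::uint64_t k : keys) ++count[((k >> shift) & 0xFF) + 1];
            for (std::size_t i = 1; i < 257; ++i) count[i] += count[i - 1];  // offsets
            for (std::uint64_t k : keys) scratch[count[(k >> shift) & 0xFF]++] = k;
            keys.swap(scratch);
        }
    }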


Yes, radix is faster. But at that point I was looking for stability and simplicity over all else. Merge sort was stable and simple to implement, with decent performance.


But merge sort is not in-place (at least no remotely practical variant is); heap sort is simple, in-place, and O(n log n), although it is not a stable sort like merge sort is.
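With the standard heap primitives it's nearly a two-liner; a sketch:

    #include <algorithm>
    #include <vector>

    // Heap sort, in-place and O(n log n) worst case - but not stable:
    // equal elements may be reordered by the heap operations.
    void heap_sort(std::vector<int>& a) {
        std::make_heap(a.begin(), a.end());  // O(n): arrange as a max-heap
        std::sort_heap(a.begin(), a.end());  // O(n log n): pop max to the back
    }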


>and the exponential O(n^2)

You need to look up what exponential means.



