
A gem

> Dawson’s first law of computing: O(n^2) is the sweet spot of badly scaling algorithms: fast enough to make it into production, but slow enough to make things fall down once it gets there




I think the other reason O(n^2) is a "sweet spot" is that it often arises from one O(n) algorithm calling another O(n) algorithm in each iteration, resulting in O(n^2) overall. Very often it's ultimately because the inner O(n) algorithm should have been implemented as O(1), but nobody bothered because it was never intended to be called in a loop.
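A minimal Python sketch of that pattern (hypothetical functions; assumes the items are hashable). The only difference between the two versions is whether the inner lookup is O(n) or O(1):

  # Accidentally O(n^2): `x not in seen` scans the whole list,
  # and that scan runs once per item of the outer loop.
  def dedupe_slow(items):
      seen = []
      out = []
      for x in items:
          if x not in seen:      # O(n) lookup on a list
              seen.append(x)
              out.append(x)
      return out

  # O(n) overall: same loop, but the inner lookup is O(1) on a set.
  def dedupe_fast(items):
      seen = set()
      out = []
      for x in items:
          if x not in seen:      # O(1) average-case lookup
              seen.add(x)
              out.append(x)
      return out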


> it often arises from one O(n) algorithm calling another O(n) algorithm in each iteration, resulting in O(n^2) overall.

https://accidentallyquadratic.tumblr.com was / is pretty much an entire blog dedicated to this.


This is absolutely the better analysis. Many times, especially in agile-world, you get away with doing things as fast and as reasonably as you can. And that usually means O(n), as it is perfectly acceptable for a single call... But when one O(n) calls another O(n) is where you run into trouble.

But at the beginning, nobody was planning for that call to be fast - implicit requirements led it that way.

In some ways, being able to identify and detect resource usage is what is nice about waterfall. Identification of critical API calls and their respective timings would be integral to continuing to build the GUI elements. But we all know how waterfall is poo-pooed these days.


> Identification of critical API calls and their respective timings would be integral to continuing to build the GUI elements. But we all know how waterfall is poo-pooed these days.

If you've worked out which API calls are happening and how often, you've already written most of the application. It's just that it might be on paper or in pseudocode.

Waterfall was abandoned because getting to that level of detail takes far too long for management to accept and it's easier - sometimes orders of magnitude faster - to write the thing as actual code, try it, and then see what needs improving. And see what was wrong with the requirements.


That presupposes that you can start with a big pile of bad APIs and somehow incrementally approach something worth having. That's not consistent with my experience. In my experience the innermost APIs, the ones that get written down first, are effectively cast in stone and dictate the quality of the whole product, forever. The first draft of a system is the one that should have all its APIs worked out with a pencil before any code is written. You get these accidental complexity explosions precisely because someone thought it was easy and obvious to write a function returning std::vector<string> (or whatever) and didn't consider from the caller's point of view that making the function take an output iterator might be less complex.
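A rough Python analogue of that concern (not the C++ example above, just the same "who pays for accumulation" question baked into the innermost API):

  # An API that always materializes a list forces every caller to pay
  # for the whole allocation, whether they need it or not...
  def read_lines_list(path):
      with open(path) as f:
          return f.readlines()

  # ...while a generator leaves the accumulation decision to the caller,
  # who may only want the first match or a running count.
  def read_lines_iter(path):
      with open(path) as f:
          for line in f:
              yield line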


You are right. It is typically more effective to write one implementation as a hacky prototype, play with it a bit, throw it away, and then rewrite a production implementation from scratch, so that the mistakes of a design made by someone inexperienced don’t get baked in forever.

Unfortunately there are cultural/psychological factors which often preclude or discourage this method, even if it would save time and produce better code in the medium term than either exhaustive planning up front uninformed by practice OR just iterating the initial broken version to continually meet new requirements.


Lol, ninjaed, see above.


I was taught waterfall a couple of years ago. We were supposed to do our project using waterfall (but at least with weekly meetings with the "customers").

My suggestion of building a first test version to, amongst other things, at least get our feet wet with the programming language (which most of us had hardly any experience with), then throw it completely away and restart "from scratch" (but with a better idea of what we were supposed to do) was rejected.


Waterfall refers more to the fact that further work (development, testing, or deployment) has to wait for someone else to finish something else first. Multiply that by how many “steps” something has to go through before being released. This implies everyone is taking work in large chunks instead of small iterations, and that too much is being planned too early instead of learning as you go.

Taking things iteratively, small iterations, delaying decisions, learning as you go — that’s what agile development is. And all the stuff you mention above is possible when developing with agility.


> Many times, especially in agile-world, you get away with doing things as fast and as reasonably as you can.

Which has led to the ridiculous software bloat we have today. This always-MVP, break-fast, break-often garbage needs to die.


I feel like the concept of an MVP is often abused. It shouldn't be a thing that somehow works with bubblegum and luck; it should still be "solid". The point ought to be to leave out features, not quality.


When you leave out features and quality, you have a proof of concept.

Those are valuable things to build, IMO. Just make sure you leave out enough features that it cannot be pushed into production once someone else sees it running.


Right, if we consider O(n) "ideal" (for many domains, this is true), we can consider processing per item as a more convenient metric. If you go above, say, O(log n) processing per item, you're in trouble proportional to the unintended success of this portion of the code. It's very easy to write a sloppy utility with a nice API, only for it to be 'the perfect fit' for someone else's tight deadline, become an idiom in the code, and finally see a 10,000-entry input much later.


That's a good point. It explains why you can very often unintentionally end up with an O(n^2) algorithm in the first place, and Dawson's law then explains why no one bothers to fix it until it's too late.


Hey, it's Dawson's first law of computing. I have more planned, and it's important to distinguish them.


I often see O(n^2) algorithms that can be reduced to O(2n) at the very least. One of the best things I gained from school was the red flag that fires off in my mind any time I see a loop nested in a loop.
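A toy Python sketch of that kind of reduction, with duplicate-finding picked arbitrarily as the example:

  from collections import Counter

  # O(n^2): the nested scan hides inside list.count(), called once per item.
  def dupes_quadratic(items):
      return [x for x in items if items.count(x) > 1]

  # "O(2n)", i.e. two un-nested passes: count once, then filter once.
  def dupes_two_passes(items):
      counts = Counter(items)                       # pass 1
      return [x for x in items if counts[x] > 1]    # pass 2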


A little experience gets you that same red flag instinct in a hurry, too.

[EDIT] it also makes you hesitate & second-guess and worry and experiment a bunch when contemplating using a recursive algorithm, which is deeply counterproductive in interviews where the expected behavior is so often "apply recursion, instantly and without hesitation" :-)


Unfortunately interviews seem to be all about dogma, and if you even mention performance you will get dinged for premature optimization.

It’s like a postmodern religion where premature optimization is the cardinal sin, and you are supposed to burn as many cycles as possible to demonstrate that you are of good faith


On the flip side, I've met different sorts of interviewers who ding someone for intentionally writing something unoptimized for algorithmic clarity, in the spirit of avoiding premature optimization, because they have rote knowledge that "x is faster". Dogma on every side of the fence. (Also, continuing proof that technical interviews will never be objective evaluations of skill.)


You shouldn't call it O(2n), as there is already an arbitrary constant inside O(n): the definition only requires 0 <= f(n) <= c*2n for some constant c, and c*2n is just another constant times n.


Yet, we understand exactly what he means. Sometimes it's useful to show an un-reduced intermediate step to facilitate understanding, even if you could reduce it further.


Eh, you're technically correct but when used as a point of relative comparison to an O(n²) algorithm I think there's some merit to it, because it actually gives some sense of constant overhead within the context.


> Eh, you're technically correct [...]

The best kind of correct. Calling it O(n) then is the best way to express the performance improvement - I think that's the whole point of Big-O notation.

However I agree that for smaller n one should not underestimate the constant factor impact.


The point was that you can often convert nested loops into two loops that are not nested. O(2n) conveys that, O(n) does not.


Actually, O(2n) did not convey this to me at all; all it did was cause confusion. If this ("turn it into two non-nested loops") was indeed intended, I'd suggest spelling it out, not abusing the notation.


> If this ("turn it into two non-nested loops") was indeed intended, I'd suggest spelling it out, not abusing the notation

O(2n) is not an abuse of notation. It's just as well defined as O(n). The fact that O(2n) is a subset of O(n) is a theorem, not part of the definition of the notation.
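For reference, a sketch of that theorem using one common textbook formulation of the definition:

  f \in O(g) \iff \exists\, c > 0,\ n_0 \text{ such that } 0 \le f(n) \le c \cdot g(n) \text{ for all } n \ge n_0

  f \in O(2n) \Rightarrow f(n) \le c \cdot 2n = (2c) \cdot n \Rightarrow f \in O(n), \text{ so } O(2n) \subseteq O(n)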


It's a subset, but also a superset.


Yes? But to replace O(2n) with O(n), you only need that it's a subset.


You have a point.


Interesting, I hear people spelling out constants all the time when using the O notation to put emphasis where they want. I guess that's not really as universal as I thought, but I still think it's a good way to express it.


It can cause more confusion, though: if, say, you write O(kn) but the oh-notation hides a factor of k^3, then it would be better to just write O(n). Or write O_k(n) to signify a hidden dependency on k. Or, best of all, just write out k^4 n without any ohs at all.


What if the constant were massive? Then the coefficient might not be significant in practice. O(2n) might be worse than O(n+99999999); they both reduce to O(n), though. It is a classification.

Personally I would prefer to use different semantics to express that. It seems big O notation often carries a separate meaning in parlance.

I thought big O notation was about classifying function growth irrespective of the coefficient/constant. Does it not grow at all with respect to n? Great, you are O(1). Does it grow linearly? Cool, you are O(n). Logarithmically? Quadratically? In my mind, this kind of analysis aligns with big O notation.


If constant factors are significant in practice and the size of the input is not significant, then big O notation is not the right tool for the analysis, and that's perfectly fine. For instance, we generally don't talk about big O notation when comparing the performance of SSDs and spinning HDs.

Big O notation is for describing how the performance of an algorithm changes as the size of its input changes. If the size of the input is not a significant concern, then it's totally fine to not use big O notation for analyzing the problem.


Asymptotically, that's true.

In reality, O(n) and O(2n) can be quite different for small n. As can O(n)+k0 and O(2n)+k1. Or worse, O(n^2)+k2, where sufficiently large k0 and k1 make the quadratic system better because its k2 constant is so much smaller. Setup time matters.

Nowadays, you rarely have enough elements that the asymptotic behavior is the defining performance characteristic.
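A toy illustration with made-up cost constants (nothing measured, just arithmetic):

  # Hypothetical cost models: linear with a big setup cost (say, building
  # an index first) vs. quadratic with almost none.
  def linear(n):
      return 1000 + 2 * n

  def quadratic(n):
      return 1 + n * n

  for n in (10, 30, 100, 1000):
      print(n, linear(n), quadratic(n))
  # Below the crossover (about n = 33 here) the quadratic model is cheaper;
  # past it the asymptotics win and the gap grows without bound.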


O notation is asymptotic by definition. The sentence "O(n) for small n" is completely meaningless. O(n) can never be different from O(2n) because they refer to the exact same set of functions.


Theoretically yes, but practically no. Theoretically, Big O notation is defined by behaviour as n goes to infinity, but in practice there's no such thing as infinity, and saying that an algorithm's runtime is in O(n) implies certain things about its behaviour for finite n; that's kind of the point. And it's perfectly possible for an algorithm to be defined piecewise on the input size, so saying it has runtime O(n) for small n is a statement with practical significance.


For a moment I assumed that this was another of those cases where copy-pasting to HN lowered a superscript.

Then my brain asked why you'd want to reduce a quadratic algorithm to exponential.


O(2n) = O(n)


Sometimes it was correct to choose O(n) instead of O(1). If you know you're only going to have very few elements, a linear search on a vector can be faster than a hashtable lookup or rb-tree traversal and use less memory for the data structure. Later on, some clown decides to expose your private function for their own convenience and starts calling it with thousands of elements.
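A rough CPython sketch of both halves of that tradeoff (exact numbers vary by machine; the point is the scaling, not the constants):

  import timeit

  for n in (4, 10_000):
      data_list = list(range(n))
      data_set = set(data_list)
      probe = n - 1                       # worst case for the linear scan
      t_list = timeit.timeit(lambda: probe in data_list, number=10_000)
      t_set = timeit.timeit(lambda: probe in data_set, number=10_000)
      print(n, t_list, t_set)
  # With a handful of elements the two are in the same ballpark (and the
  # list needs no hashing and less memory); with thousands, the linear
  # scan is orders of magnitude slower.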


You're not wrong, but IMHO the proper place for that sort of optimization is inside the dictionary class/module/whatever. It knows why the optimization is there and the conditions under which it makes sense.

For application code it's probably better to express clearly what you're trying to do, in this case keyed lookup. Throwing in a linear search at that level isn't just dangerous, it's potentially confusing to people reading the code later.


Depends. Sometimes the key isn't easily hashable or comparable. For example, I know of a few cases where the key looks something like "if (a > b and c < d) or (a < b and d == b)".

Now, that doesn't mean you can't hash a, b, c, and d. It just means that the logic around doing the lookups is nontrivial.

When you are talking about < 100 elements, sometimes the consideration is "Welp, this is < 100, so let's just n^2 it".


> where the key looks something like "if (a > b and c < d) or (a < b and d == b)"

Confused, what do you mean when you say the key is an "if" statement?


He probably meant that the lookup in the list is a conditional lookup based on some logic, instead of a simple key string.


I'm confused what the variables are in this example—what's already in the container vs. what is being looked up. If we conceptually don't even have a proper mapping to begin with, then it doesn't make sense to ask whether it's hashable or comparable or not in the first place. If we do—then what is the "key" in this example?


Let me preface this by saying I work in finance and rules are weird.

This comes up frequently when looking for fuzzy "duplicates".

For example, sometimes you get data from 2 sources, one will send it through with 0.01 precision. Another will send it through with 0.001 precision. You can't hash or compare that.

Things get more tricky when different financial institutions use different terms for the same entity.

That doesn't mean you can't avoid n^2 duplicate searches. It just means you have to be smart about how you group things.
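Roughly this kind of shape, say (a hedged sketch with invented field names and rounding rules, not the actual system):

  from collections import defaultdict

  # Group candidate duplicates by a coarse key first, then only compare
  # pairs within each (hopefully small) bucket instead of all n^2 pairs.
  def fuzzy_duplicates(records, tol=0.005):
      buckets = defaultdict(list)
      for rec in records:
          buckets[round(rec["price"], 2)].append(rec)      # coarse key
      dupes = []
      for group in buckets.values():
          for i, a in enumerate(group):
              for b in group[i + 1:]:                      # small groups only
                  if abs(a["price"] - b["price"]) <= tol:
                      dupes.append((a, b))
      return dupes
  # Caveat: values near a bucket boundary can land in adjacent buckets,
  # so a real version would also compare against neighbouring buckets.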


Right, so you have an application where you don't genuinely have a dictionary -- e.g., inserting items in a different order can result in a very different outcome even when all the keys are distinct, or even worse, inserting two keys and then removing a third can give entirely different results depending on the order in which you inserted the first two. (Or other arbitrary things depending on your application.) That can of course come up; it's just been quite confusing for me to see a fuzzy search scenario like this presented as a counterexample when (at least to me) the discussion seemed to be about proper mappings.


You can always assert on insertion that n <= c. Or track the max size and assert on destruction.
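A minimal sketch of the first idea, assuming all insertions go through one wrapper (the cap of 100 is made up):

  MAX_SMALL = 100    # the "this will always stay tiny" claim, made explicit

  class SmallList(list):
      """Fail loudly if the 'always tiny' container stops being tiny."""
      def append(self, item):
          assert len(self) < MAX_SMALL, "linear-scan list grew too large"
          super().append(item)
  # Only guards append; extend/insert/+= would need the same check, or you
  # could instead track the max size ever seen and assert on destruction.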


A useful heuristic from my CS professor: always treat every tab (indentation / complexity) as equivalent to a cart that you are attaching to your horse. Add enough carts (complexity) and you will soon be asking yourself why your horse isn't a nuclear reactor instead.


Programming languages have native support for O(n) algorithms in the form of for loops. You can compose as many of those as you want to get a huge polynomial, but you have to go out of your way and do something a bit weird to get an exponential run time.


Recursion is a pretty good way to go exponential quickly.


What? It’s trivial to go exponential.

  for i in n:
      for j in n:
          whoops(i, j)


That is polynomial time (specifically quadratic).

See this stack overflow on polynomial vs. exponential: https://stackoverflow.com/questions/4317414/polynomial-time-...


Two for loops, one nested under the other, both up to N, is clearly O(N^2) in the worst case. Three for loops (in the same manner) is O(N^3).


Actually, that should be:

  whoops(n):
    for i in n:
      whoops(n-1)
Your version is merely quadratic.


I don't quite understand your code, but if I'm guessing correctly that n is an integer and "for i in n" means in human terms "for all non-negative integers less than n",

that is not exponential, that is actually factorial, which is superexponential. Uh, or close to factorial, I'm not sure exactly. Testing it in Python right now, incrementing a counter each time whoops is called, I'm seeing a relationship that looks like whoops(n) == n * whoops(n-1) + 1. That is interesting.


> that is not exponential, that is actually factorial, which is superexponential

Fair point; I should have given:

  whoops(n):
    if(!n) return
    for i in 2: whoops(n-1)
I think it's clear that it is "trivial to go exponential", though. And for that matter, also trivial to go superexponential apparently.


That's n^2.


I don't think that's a real insight. You could equally say that O(n^3) should be a sweet spot because you can nest a loop 3 deep, and so on for n^4, n^5 and so on. But we don't see these cases making it out into the wild because they're caught in testing.


Or those two should have been combined into O(n log n) via sorting.
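For example (a toy sketch, with duplicate detection standing in for whatever the combined pass actually computes):

  # O(n^2): compare every pair.
  def has_duplicate_quadratic(xs):
      return any(xs[i] == xs[j]
                 for i in range(len(xs))
                 for j in range(i + 1, len(xs)))

  # O(n log n): sort once, then any duplicates are adjacent.
  def has_duplicate_sorted(xs):
      xs = sorted(xs)
      return any(a == b for a, b in zip(xs, xs[1:]))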


Or if you've already made allowances for the memory usage, you can also check to see if you can write your O(n^2) nested loops as O(n) + O(n) depending on the shape of your data.


Speaking of O(n^2) algorithms (if it's OK for me to promote my own recent writing on this?):

I recently illustrated how they can even occur as a result of algorithms being slow by a constant factor.

I thought it might be useful for others, so I posted it here: https://news.ycombinator.com/item?id=21745911





