
Meanwhile, Python has received this same feature request many times over the years, and the answer is always that it would break existing code for little benefit: https://discuss.python.org/t/make-lambdas-proper-closures/10...

Given how much of an uproar there was over changing the string type in the Python 2 -> 3 transition, I can't imagine this change would ever end up in Python before a 4.0.

Cue someone arguing about how bad Python is because it won't fix these things, and then arguing about how bad Python is because their scripts from 2003 stopped working...




It's worth noting that it's much less of a problem in Python due to the lack of ergonomic closures/lambdas. You have to construct rather esoteric-looking code for it to be a problem.

    add_n = []
    for n in range(10):
        add_n.append(lambda x: x + n)
    add_n[9](10)  # 19
    add_n[0](10)  # 19
This isn't to say it's *not* a footgun (it has bitten me in Python before), but it's much worse in Go due to the idiomatic use of goroutines in a loop:

    for i := 0; i < 10; i++ {
        go func() { fmt.Printf("num: %d\n", i) }()
    }
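A common Python fix for the lambda example above is a factory function, which creates a fresh scope per call so each closure captures its own value. (`make_adder` is just an illustrative name, not anything from the standard library.)

```python
# A factory function gives each closure its own binding of n,
# sidestepping the late-binding behavior of the bare lambda-in-a-loop.
def make_adder(n):
    return lambda x: x + n

add_n = [make_adder(n) for n in range(10)]
print(add_n[9](10))  # 19
print(add_n[0](10))  # 10 -- each closure kept its own n
```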


In Python you are much more likely to hit that problem not with closures constructed with an explicit 'lambda', but with generator-comprehension expressions.

    (((i, j) for i in "abc") for j in range(3))
The value of the above depends on the order in which you evaluate the whole thing.

(Do take what I wrote with a grain of salt. Either the above is already a problem, or perhaps you also need to mix in list-comprehension expressions, too, to surface the bug.)


Yeah, this one is weird:

  gs1 = (((i, j) for i in "abc") for j in range(3))
  gs2 = [((i, j) for i in "abc") for j in range(3)]

  print(list(map(list, gs1)))
  print(list(map(list, gs2)))
Results:

  [[('a', 0), ('b', 0), ('c', 0)], [('a', 1), ('b', 1), ('c', 1)], [('a', 2), ('b', 2), ('c', 2)]]
  [[('a', 2), ('b', 2), ('c', 2)], [('a', 2), ('b', 2), ('c', 2)], [('a', 2), ('b', 2), ('c', 2)]]
That's a nice "wat" right there. I believe the explanation is that in gs2, the range() is iterated through immediately, so j is already 2 before you have a chance to access any of the inner generators. Whereas in gs1 the range() is still being iterated over as you access each inner generator, so when you consume the first inner generator j=0, then j=1, then j=2.

Equivalents:

  def make_gs1():
      for j in range(3):
          yield ((i, j) for i in "abc")

  def make_gs2():
      gs = []
      for j in range(3):
          gs.append(((i, j) for i in "abc"))
      return gs
Late binding applies in both cases of course, but in the first case it doesn't matter, whereas in the latter case it matters.

I think early binding would produce the same result in both cases.
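One way to sketch that early-binding claim: routing j through a helper function (a name I'm introducing here, not part of the original snippet) gives each inner generator its own binding, and then both spellings agree.

```python
# Passing j as an argument creates a new scope per call, so each inner
# generator captures its own j (early binding) instead of sharing one.
def make_gen(j):
    return ((i, j) for i in "abc")

gs1 = (make_gen(j) for j in range(3))  # generator of generators
gs2 = [make_gen(j) for j in range(3)]  # list of generators

r1 = list(map(list, gs1))
r2 = list(map(list, gs2))
print(r1)
print(r2)
# Both now produce [[('a', 0), ...], [('a', 1), ...], [('a', 2), ...]]
```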


or you could just be eager and use lists:

    >>> [[(i, j) for i in "abc"] for j in range(3)]
    [[('a', 0), ('b', 0), ('c', 0)], [('a', 1), ('b', 1), ('c', 1)], [('a', 2), ('b', 2), ('c', 2)]]


Right, creating generators in a loop is not usually something you want to do, but it's meant to demonstrate the complexity that arises from late binding rather than demonstrate something you would actually want to do in a real program.


Ignoring the strange nature of this code in the first place the more pythonic way to do it would be

    from functools import partial
    from operator import add

    add_n = [partial(add, n) for n in range(10)]

    assert add_n[5](4) == 9
Look ma, no closures.


'partial' creates a closure for you.


Unless you're talking philosophically about how classes and closures are isomorphic, then no, it doesn't. None of the variables in the outer scope are captured in the class instance.

https://github.com/python/cpython/blob/main/Lib/functools.py...

Here's a simplified version of that code that demonstrates the pattern.

    class partial:
      def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

      def __call__(self, *args, **kwargs):
        return self.func(*self.args, *args, **(self.kwargs | kwargs))

    p = partial(add, 5)  # -> an instance of partial with self.args = (5,)
    res = p(4)  # -> calls __call__, which merges the args and calls add(5, 4)


I was talking 'philosophically' in that sense. The partial object does create a new scope that binds a few of those variables.

But you are also right that the mechanisms in Python are different (on some suitable mid-level of abstraction) for those two.


Everyone else solved this problem by using list comprehensions instead. Rob has surely heard of those.


Of the two comprehension syntaxes in Haskell, Python picked the wrong one. Do notation (or, equivalently, Scala-style for/yield) feels much more consistent and easy to use - in particular the clauses are in the same order as a regular for loop, rather than the middle-endian order used by list comprehensions.


Haskell has both do-notation and list comprehension.

Comprehensions in both Python and Haskell (for lists and other structures) use the same clause order in both languages, as far as I remember.


> Haskell has both do-notation and list comprehension.

Right, and do-notation is the one everyone uses, because it's better. Python picked the wrong one.

> Comprehension in both Python and Haskell (for both lists and other structures) use the same order in both language, as far as I remember.

It may be the same order as Haskell but it's a terrible confusing order. In particular if you want to go from a nested list comprehension to a flat one (or vice versa) then you have to completely rearrange the order it's written in, whereas if you go from nested do-blocks to flat do-blocks then it all makes sense.


I see what you mean, but I don't find the order that confusing in either Haskell or Python.

However, I can imagine a feature that we could add to Python to fix this: make it possible for statements to have a value. Perhaps something like this:

    my_generator = \
      for i in "abc":
        for b in range(3):
          print("foo")
          yield (i, b)
or perhaps have the last statement in block be its value (just like Rust or Ruby or Haskell do with the last statement in a block), and make the value of a for-loop be a generator of the individual values:

    my_list = list(
      for i in "abc":
        for b in range(3):
          (i, b))
Though there's a bit of confusion here whether the latter examples should produce a flat structure or a nested one. You could probably use a similar mechanism to the existing 'yield from' to explicitly ask for the flat version, and otherwise get the nested one:

    my_list = list(
      for i in "abc":
        yield from for b in range(3):
          (i, b))
Making Python statements have values looks to me like a more generally useful change than tweaking comprehensions. You'd probably not need comprehensions at all in that case. Especially since you can already write a loop header and body on a single line like

    for i in range(10): print(i)
if they are short enough.


  for i in range(10): print(i)
But what would that return? [None] * 10?

The limited whitespace-based syntax limits the potential for fun inline statement things, but it also completely dodges the question of what any particular statement should evaluate to when used as an expression.


> But what would that return? [None] * 10?

Yes, I guess something like that. That was just meant as an example of how existing Python allows you to write loops on one line. It's not a good example for a meaningful comprehension in our alternative made-up Python dialect.

> The limited whitespace-based syntax limits the potential for fun inline statement things, [...]

Python already mostly allows you to use parens to override the indentation. They would just need to generalise that a bit. Btw, Haskell already does that:

Officially, Haskell has a syntax with curly braces and semicolons; and they define the indentation based syntax as syntactic sugar that desugars to ; and {}. But almost everyone uses indentation based syntax. The exception are perhaps code generators and when posting on a website that messes with indentation.

(And, because it's Haskell, the {}; syntax is just another layer of syntactic sugar for 'weird-operator'-based syntax like >>=.)


When I was starting in Python years ago I had to turn my brain inside out to learn how to write list comprehensions. Sometimes I wonder what it's like to be a normal person with a normal non-programmer brain, having forgotten it entirely these last many years.


But Python doesn't have any concept of a monad, so what would do-notation even be in Python? And who is the "everyone" using do-notation? I don't see any analogous syntax in Lua, Javascript, Ruby, or Perl.

In Python there is a nice tower of abstractions for iteration, but nothing more general than that, so it makes perfect sense IMO to use the syntax that directly evokes iteration.

The existing syntax is meant to mirror the syntax of a nested for loop. I agree that maybe it's confusing, but if you want to go from a multi-for comprehension to an actual nested for loop, then you don't have to invert the order.
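That "no inversion needed" claim can be sketched directly: the clauses of a multi-for comprehension read in the same order as the equivalent nested loop.

```python
# The for-clauses keep the same left-to-right order as the nested loop below;
# only the yielded expression moves to the front.
flat = [(i, j) for j in range(3) for i in "abc"]

nested = []
for j in range(3):
    for i in "abc":
        nested.append((i, j))

print(flat == nested)  # True
```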


> But Python doesn't have any concept of a monad, so what would do-notation even be in Python?

It could work on the same things that Python's current list comprehensions work on. I'm just suggesting a different syntax. Comprehensions in Haskell originally worked for all monads too.

> And who is the "everyone" using do-notation? I don't see any analogous syntax in Lua, Javascript, Ruby, or Perl.

I meant that within Haskell, everyone uses do notation rather than comprehensions.

> The existing syntax is meant to mirror the syntax of a nested for loop. I agree that maybe it's confusing, but if you want to go from a multi-for comprehension to an actual nested for loop, then you don't have to invert the order.

You have to invert half of it, which I find more confusing than having to completely invert it. do-notation style syntax (e.g. Scala-style for/yield) would keep the order completely aligned.


> Comprehensions in Haskell originally worked for all monads too.

And that behaviour is still accessible via a compiler extension.


Idris 2 still has both "monad comprehensions" and an applicative equivalent called "idiom brackets".

https://idris2.readthedocs.io/en/latest/tutorial/interfaces....

https://idris2.readthedocs.io/en/latest/tutorial/interfaces....



How does list comprehension change anything here? This has the same problem:

    add_n = [lambda x: x + n for n in range(10)]
    add_n[9](10)  # 19
    add_n[0](10)  # 19


I’m not sure what they mean by list comprehensions, either, but for completeness’s sake, I must point out that this is solvable by adding `n` as a keyword argument defaulting to `n`:

    add_n = [lambda x, n=n: x + n for n in range(10)]
    add_n[9](10)  # 19
    add_n[0](10)  # 10


This is the way

Also pylsp warns you about this


I don't think anyone is puzzled by the Go snippet being wrong.

The bigger problem in Go is the for with range loop:

  pointersToV := make([]*val, len(values))
  for i, v := range values {
    go func() { fmt.Printf("num: %v\n", v) } () //race condition
    pointersToV[i] = &v //will contain len(values) copies of a pointer to the last item in values
  }
This is the one they are changing.

Edit: it looks like they're actually changing both of these, which is more unexpected to me. I think the C# behavior makes more sense, where only the foreach loop has a new binding of the variable in each iteration, but the normal for loop has a single lifetime.


It's actually worse in Python since there's no support for variable lifetimes within a function, so the `v2` workaround is still broken. (the default-argument workaround "works" but is scary)

This makes it clear: the underlying problem is NOT about for loops - it's closures that are broken.
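The claim that Go's shadowing workaround has no Python analogue can be sketched like this (a hypothetical translation, using `n2` as the rebound name):

```python
# Rebinding inside the loop body doesn't help in Python: there is no
# per-iteration scope, so n2 is one function/module-level variable that
# every closure shares, exactly like n itself.
funcs = []
for n in range(3):
    n2 = n  # analogous to Go's old `v := v` shadowing trick
    funcs.append(lambda: n2)

print([f() for f in funcs])  # [2, 2, 2] -- still late-bound
```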


It's not broken, it's a different design. Maybe worse in a lot of cases, but it's not broken. It's working as intended.


You could say the design is broken, but the implementation is working as intended by the design.


Doesn’t go vet complain about your code? I’m not at my computer right now so can’t check.


> Tools have been written to identify these mistakes, but it is hard to analyze whether references to a variable outlive its iteration or not. These tools must choose between false negatives and false positives. The loopclosure analyzer used by go vet and gopls opts for false negatives, only reporting when it is sure there is a problem but missing others.

So it will warn in certain situations, but not all of them


Why would it? It's perfectly correct code, it's just not doing what you'd expect.

It might complain about the race condition, to be fair, but the same issue can be reproduced without goroutines and it would be completely correct code per the semantics.


In many languages "if x = 3" is perfectly valid code, but almost certainly not what the person intended, namely "if x == 3". It's very smart to warn someone in a scenario like this.


I don't really write C too much, but I thought `if ((err = functionWithErrorReturn())) { handleError(err); }` was a somewhat common idiom.


It's a common enough idiom from "stone age" bare bones K&R C, absolutely.

It's also one of the great foot-guns of C programming as there are so many other almost but not that idioms and it's never clear on casual inspection whether the result of an assignment was meant to be examined or the result of a comparison.

With the evolution of C and C sanity tools that rightfully flag such statements for double checking and the desire to not have spurious flagging, etc. it's more common in later C code to see (say)

    if ((err = someFunction()) != NOERROR) { errorHandle(err); }
that optimises down to the same intermediate code where NOERROR is 0, sure, but it makes it very clear what is going on: an intended assignment and then an intended comparison.

As with all idioms, the general practice in the larger codebase and house coding standards apply - there are other ways of doing similar things.


I've run into this once. IIRC the workaround was to add a n=n arg to the lambda


Somehow, Go managed to not break old code and also fix the problem.

I think this is a good case of Python not fixing things, given that a fix exists that solves both problems.


> To ensure backwards compatibility with existing code, the new semantics will only apply in packages contained in modules that declare go 1.22 or later in their go.mod files.


Python could very easily have a similar mechanism. Hell even CMake manages to do this right, and they got "if" wrong.

The Python devs sometimes seem stubbornly attached to bugs. Another one: to reliably get Python 3 on Linux and Mac you have to run `python3`. But on Windows there's no `python3.exe`.

Will they add one? Hell no. It might confuse people or something.

Except... if you install Python from the Microsoft Store it does have `python3.exe`.


> Except... if you install Python from the Microsoft Store it does have `python3.exe`.

It's worse. If you don't install Python from the Microsoft Store there will still be a `python3.exe`. But running it just opens Microsoft Store.

Imagine how confused one could be when someone typed `python3 a.py` over an SSH session and nothing happened.


I’ve not run “python3” in years on my Mac, and I’m almost certain I never type it into Linux machines either; either I’m losing my mind, or there are some ludicrous takes in this thread.


You are surely losing your mind then. Python3 isn't something esoteric.


Entirely possible, but my point was I just type “python” and Python 3 happens. Do modern OSes even come with Python 2 anymore?

I’m not claiming any mystery about Python, just disputing how the modern version is invoked.


Just tried "python" and "python3" on various Linux distros, which output respectively:

On an Ubuntu 20.04 desktop VM:

  python  => Python 2.7.18 (default, Jul  1 2022, 12:27:04)
  python3 => Python 3.8.10 (default, May 26 2023, 14:05:08)
On an Ubuntu 19.04 server:

  python  => -bash: python: command not found
  python3 => Python 3.7.5 (default, Apr 19 2020, 20:18:17)
On an Ubuntu 20.10 server:

  python  => -bash: python: command not found
  python3 => Python 3.8.10 (default, Jun  2 2021, 10:49:15)
I no longer have access to some RHEL7 and RHEL8 machines used for work recently, but if I recall correctly they do this by default:

Red Hat Enterprise Linux 7:

  python  => Some version of Python 2
  python3 => Some version of Python 3
Red Hat Enterprise Linux 8:

  python  => -bash: python: command not found # (use "python2" for Python 2)
  python3 => Some version of Python 3
You can change the default behaviour of unversioned "python" to version 2 or 3 on all the above systems, I think, so if you're running a Linux distro when "python" gets you Python 3, that configuration might have been done already.

MacOS 10.15 (Catalina) does something interesting:

  python  => WARNING: Python 2.7 is not recommended.
             This version is included in macOS for compatibility with legacy software.
             Future versions of macOS will not include Python 2.7.
             Instead, it is recommended that you transition to using 'python3' from within Terminal.
             Python 2.7.16 (default, Jun  5 2020, 22:59:21)
  python3 => Python 3.8.2 (default, Jul 14 2020, 05:39:05)


To be fair, few of these would qualify as "modern". Ubuntu 19.04 and 20.10, macOS 10.15 are all out of support, and RHEL 7 is almost ten years old and nearing the end of its support.


> I just type “python” and Python 3 happens.

That was the old way. Python now recommends against installing Python3 in a way that does that, and most modern *nix don't.


13.5.2 has /usr/bin/python3 (it’s 3.9.6) but not python2 or just python. Not sure when they changed, and YMMV with Homebrew.


I suspect my confusion stemmed from mostly invoking `ipython` which doesn't include the 3 suffix (ok, part of the confusion may've been pub-related too :D).


Depending on the package manager / distribution, 'python' might be symlinked to either Python 2 or Python 3. If you don't have Python 3 installed, it might very well point to Python 2. These days it will almost certainly prefer Python 3, but I am also in the habit of actually typing 'python3' instead of 'python' because of what I assume are issues I've had in the past.


> to reliably get Python 3 on Linux and Mac you have to run `python3`.

This is not true on my Fedora 38 system, same with current Kali linux. Although, it is the case with Ubuntu 22.04.3.


_reliably_, as in on the vast majority of machines.


Is that really python’s fault? It seems like it’s the distro making a design decision.


Well, no, not python's fault -- clearly the distros', and they probably should be blamed. But a PEP saying python2 and python3 should invoke the correct interpreter would help motivate the distributions.

(This is isomorphic to the usual victim-blaming discussion. Fault and blame vs some ability to make a difference; it's a shame that correctly pointing out a better strategy is both used to attack victims and attacked for attacking victims in the cases when that wasn't intended.)


In fact there is a PEP: https://peps.python.org/pep-0394/


What do you mean? Fedora 38 doesn't have `python3`? Are you sure?


Right, and Go has the luxury of being a compiler that generates reasonably portable binaries, while Python requires the presence of an interpreter on the system at run time.


> Python requires the presence of an interpreter on the system at run time.

A runtime interpreter does not prevent Perl from doing similar things via `use v5.13`.

Python has `from __future__` with similar abilities; it would absolutely be possible to do the same as Perl and Go and fix what needs to be fixed without breaking old code. One could design an `import 3.22` and `from 3.22 import unbroken_for` and achieve the same thing.
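To make that mechanism concrete, here's a sketch of a `from __future__` import that already changes semantics on an opt-in, per-file basis (PEP 563; the undefined annotation names are deliberate):

```python
# With this future import, annotations are stored as strings and never
# evaluated, so referencing undefined names in them doesn't raise NameError.
from __future__ import annotations

def f(x: SomeUndefinedName) -> AnotherUndefinedName:
    return x

print(f.__annotations__)
# {'x': 'SomeUndefinedName', 'return': 'AnotherUndefinedName'}
```

Without the future import, defining `f` would fail at the `def` statement; with it, the file has opted into the new behavior while neighboring files keep the old one.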


The same trick would work with Python just as well. There's nothing about Python being interpreted that would stop them from adding a Python semantic version somewhere in each program - either in a comment at the top of each source file or in an adjacent config file. The comment could specify the version of Python's semantics to use, which would allow people to opt in to new syntax (or opt out of changes which may break the code in the future).

Eg # py 3.4


Yeah, it would just mean that the interpreter - just like the Go compiler - would need to have the equivalent of "if version > 3.4 do this, else do that". Which is fine for a while, but I can imagine it adds a lot of complexity and edge cases to the interpreter / compiler.

Which makes me think that a Go 2.0 or Python 4 will mainly be about removing branches and edge cases from the compiler more than making backwards-incompatible language changes.


This is the direction multiple languages are moving in. Go and Rust both have something like this. (In Rust they're called "editions".) I think it's inevitable that compilers get larger over time. I hope most of the time they aren't too onerous to maintain - there aren't that many spec-breaking changes between versions. But if need be, I could also imagine the compiler eventually deprecating old versions if it becomes too much of an issue.

Arguably C & C++ compilers do the same thing via -std=c99 and similar flags at compile time.

Anyway, nothing about this is special or different with python. I bet the transition to python 3 would have been much smoother if scripts could have opted in (or opted out) of the new syntax without breaking compatibility.


Python probably could change this with a from __future__ import, i.e. in the same way.


By letting you specify a language version requirement? Not exactly backwards compatible (because it is explicitly not, as per the article).

Python doesn’t make breaking changes in non-major versions, so as mentioned by the upthread comment the appropriate place for this change would be in Python 4.

Given the above, I’m really not sure what point you think you’re making in that final paragraph.


This seems weird to me given the number of breakages and standard library changes I seem to run into every version.


Really? I find that surprising. I don’t write as much code as I used to but I’ve been writing Python for a long time and the only standard library breakages that come to mind were during the infamous 2 -> 3 days.

What sort of problems have you faced upgrading minor versions?


The docs are full of remarks like "removed in 3.0 and reintroduced in 3.4" or "deprecated in 3.10", etc. A big one is the removal of the loop parameter in asyncio, but a lot of asyncio internals are (still?) undergoing significant changes, as getting the shutdown behavior correct is surprisingly difficult. Personally it's never caused me any issues - I'm always on board with the changes.


Asyncio was explicitly marked as provisional for years and most of the incompatible changes happened during that time. Same goes for typing. The rest of the language is very very stable.



They do and have made relatively small ones, e.g. promoting __future__ features to default, etc.


If the change /doesn't/ break old code, it's also poorly justified.

It means code doesn't care about the issue being addressed.

The feature is only justified if it changes existing code, such that bugs you didn't even know about are fixed.

I.e. people read about the issue, investigate their code bases and go, oh hey, we actually have a latent bug here which the change will fix.


There's a second motivation in my opinion. Code might work today without the change, but it could be because the author originally wrote buggy code, caught it in testing, and had to waste time tracking it down and understanding nuances that don't need to be there. Once they figured that out, they implemented an ugly workaround (adding an extra function parameter to a goroutine or shadowing the loop variable with n := n).

Good language designers want to avoid both wasting developer's time and requiring ugly workarounds. Making a change that does both, especially if it doesn't break old code, is great imo.


Whichever way you implement the semantics of the loop variable, the developer has to understand the nuances, don't you think? And those nuances have to be there; all you can do is replace them with other nuances.

If a fresh variable i is bound for each iteration, then an i++ statement in the body will not have the intended effect of generating an extra increment that skips an iteration.

If you want the other semantics, whichever one that is, the workaround is ugly.


I think you can choose nuances that minimize unexpected behavior in practice, and I think the go team did a good job here.


New code written today will use the new version and have the correct behavior from day 1.

Old code that is maintained will eventually be upgraded, which yes does come with work sometimes where you realize your code works on version X but not version X+10 and you do a combination of tests and reading patch notes to see what changed.


There is no "correct" behavior here; either one is a valid choice that can be documented and that programs can rely on and exploit.

Code doesn't care about when it's written, only what you run it on, and with what compatibility options.

E.g. one possibility is that ten-year-old code that wrongly assumed the opposite behavior, and has a bug, will start to work correctly on the altered implementation.


Python is in a larger bind because it only has function scoping and variable declaration is implicit. It does not have sub-function scopes.

So does not really have a good way to fix the issue, even by using a different keyword as JS did.

OTOH default parameters being evaluated at function definition make mitigating it relatively simple.
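That mitigation works because default parameter values are evaluated once, at function definition time, so they freeze the loop variable's current value. A small sketch (using `def` rather than the lambda version shown elsewhere in the thread):

```python
# Each default for n is evaluated at the moment `def` runs, capturing the
# loop variable's value at that iteration rather than looking it up later.
funcs = []
for n in range(3):
    def f(x, n=n):
        return x + n
    funcs.append(f)

print([g(0) for g in funcs])  # [0, 1, 2]
```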


Yeah, block scoping is one of those "weird CS ideas" that I'm sure at some point early in Python's design was deemed too complicated for the intended audience, but is also quite a natural way to prevent some human errors. JavaScript made the same mistake and later fixed it (let/const).


I'm not a computer scientist so I can't rule on whether function scope was a mistake, and can't see how block scoping would be considered too complicated; I personally think it fits much better with my mental model. Then again, Python doesn't have blocks in the traditional sense of the word IIRC, whereas in C-style languages the braces are a pretty clear delineator.

Parts of my previous job were terrible because it had JS functions thousands of lines of code long where variables were constantly reused (and often had to be unset before another block of code). That said, that wasn't the fault of function scope per se, but of a bad but very productive developer.


TBF you can have block scoping in an indentation-based language, though it probably helps to merge the two, as in Haskell: `let…in` defines variables in the `let` clause, and those variables are only accessible in the `in` clause (similarly `case…of`).


I love python, but it's one of the biggest annoyances. Local variables like in Lua make a lot of sense.


Python does actually have a single instance of sub-function scopes: When you say `try: ... except Exception as e: ...` the `e` variable is deleted at the end of the `except` clause. I think this is because the exception object, via the traceback, refers to the handling function's invocation record, which in turn contains a map of all the function's local variables. So if the variable worked like normal variables in Python it'd create a reference cycle and make the Python GC sad. So if you need that behaviour, you need to reassign the exception to a new name [0].

0: https://docs.python.org/3/reference/compound_stmts.html#the-...
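The behavior described above can be demonstrated in a few lines; rebinding to a different name is the documented way to keep the exception past the clause.

```python
try:
    1 / 0
except ZeroDivisionError as e:
    err = e  # rebind to another name to keep a reference to the exception

print(err)  # the exception object survives via err

try:
    e  # `e` itself was deleted when the except clause ended
except NameError:
    print("e is gone")
```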


Is it a bug? I've always depended on late-binding closures and I think even recently in a for loop, not that I'm going to go digging. You can do neat things with multiple functions sharing the same closure. If you don't want the behavior bind the variable to a new name in a new scope. From the post I get the sense that this is more problematic for languages with pointers.


IMO it's a misdesign in the same way as e.g. JavaScript's "this". Most languages figured out 40 or so years ago that scoping should be lexical.


The scope is lexical, the lookup is dynamic. What you want is for each loop iteration to create a new scope, which I would categorize as "not lexical".


By that argument a recursive function shouldn't create a new scope every time it recurses, and a language that fails Knuth's 1964 benchmark of reasonable scoping (the "man or boy test") would be fine. The loop body is lexically a block and like any other block it should have its own scope every time it runs.


Except that the for loop does not create a new scope and is not a block:

https://docs.python.org/3/reference/executionmodel.html#stru...


Also, loop bodies already did have their own scope each iteration.

I wouldn't say either behavior is non-lexical. The only thing changing is which lexical scope these variables go into.


If the loop "variable" (and IMO thinking of it as a variable is halfway to making the mistake) is in a single scope whose lifetime is all passes through the loop body, that's literally non-lexical; there is no block in the program text that corresponds to that scope. Lexically there's the containing function and the loop body, there's no intermediate scope nestled between them.


> and IMO thinking of it as a variable is halfway to making the mistake

I used plural for a reason.

> there is no block in the program text that corresponds to that scope.

The scope starts at the for. There is a bunch of state that is tied to the loop, and if you rewrote it as a less magic kind of loop you'd need to explicitly mark a scope here.

What's non-lexical about it? You could replace "for" with "{ for" to see that a scope of "all passes through the loop body" does not require anything dynamic.

And surely whether a scope is implicit or explicit doesn't change whether a scope is lexical. In C I can write "if (1) int x=2;" and that x is scoped to an implicit block that ends at the semicolon.

Would you say an if with a declaration in it is non-lexical, because both the true block and the else block can access the variable? I would just say the if has a scope, and there are two scopes inside it, all lexical. And the same of a for loop having an outer and inner scope.


The problem isn't with closures, the closure semantics are perfectly fine.

The problem is in the implementation of for-range loops, where the clear expectation is that the loop variable is scoped to each loop iteration, not to the whole loop (in other words, that the loop variable is re-bound to a new value in each iteration). The mental model approximately everyone has for a loop like this:

  for _, v := range values {
    //do stuff with v
  }
is that it is equivalent to the following loop:

  for i := range values {
    v := values[i]
    //do stuff with v
  }
In Go 1.22 and later, that is exactly what the semantics will be.

In Go 1.21 or earlier, the semantics are closer to this (ignoring the empty-list case for brevity; this is pseudocode, since Go's post statement can't actually hold two statements):

  for i := 0, v := values[0]; i < len(values); i++, v=values[i] {
    //do stuff with v
  }
And note that this mis-design has appeared in virtually every language that has loops and closures, and has been either fixed (C# 5.0, Go 1.22) or it keeps being a footgun that people complain about (Python, Common Lisp, C++).


I don't know, my feeling is that the issue really is with how closure capture was interpreted when imperative languages started implementing lambdas. What was happening in Go seems to either amount to default capture by reference rather than value, or to the loop counters in question being unmarked reference types. The former strikes me as unintuitive given that before lambdas, reference-taking in imperative languages was universally marked (ex. &a); the latter strikes me as unintuitive because with some ugly exceptions (Java), reference types should be marked in usage (ex. *a + *b instead of a+b). Compare to C++ lambdas, where reference captures must be announced in the [] preamble with the & sigil associated with reference-taking.

(In functional languages, this problem did not arise, since most variables are immutable and those that are not are syntactically marked and decidedly second-class. In particular, you would probably not implement a loop using a mutable counter or iterator.)


Even if Go allowed both capture-by-value and capture-by-reference, this issue would have arisen when using capture-by-reference.

For example, in the following C++:

  auto v = std::vector<int>{1, 2, 3};
  auto prints = std::vector<std::function<void()>>();
  auto incrs = std::vector<std::function<void()>>();
  for (auto x : v) {
    prints.push_back([&x]()->void { std::cout << x << ", "; });
    incrs.push_back([&x]()->void { ++x; });
  }
  for (auto f : incrs) {
    f();
  }
  for (auto f : prints) {
    f();
  } //expected to print 2, 3, 4; actually prints 6, 6, 6
I would also note that this problem very much arises in functional languages - it exists in the same way in Common Lisp and Scheme, and I believe it very much applies to OCaml as well (though I'm not sure how their loops work).

Tried it out, OCaml does the expected thing:

  open List
  let funs = ref [  ] ;;
  for i = 1 to 3 do
    funs := (fun () -> print_int i) :: !funs
  done ;;

  List.iter (fun f -> f ()) !funs ;;  (* prints 321 *)


> this issue would have arisen when using capture-by-reference

I understand - but in those languages capture-by-reference has to be an explicit choice (by writing the &) rather than the default, which highlights the actual behaviour. The problem with the old Go solution was that it would apparently behave as capture by reference without any explicit syntactic marker that it is so, and without a more natural alternative that captures by value, in a context where from other languages you would expect that the capture would happen by value.

> Common Lisp and Scheme

I have to admit I haven't worked in either outside of a tutorial setting, but my understanding is that they are quite well-known for having design choices in variable scoping that are unusual and frowned upon in modern language design

> Ocaml

Your example shows that it captures by value as I said, right? For it to work as the old Go examples, i would have to be a ref cell whose contents are updated between iterations, which is not how the semantics of for work. If it did, you'd have to use the loop counter as !i.


In Go 1.22 as well, closures still capture-by-reference. The change is that there is now a new loop variable in each loop iteration, just like in OCaml. But two closures that refer the same loop variable (that are created in the same iteration, that is) will still see the changes each makes to that variable.

And what I was trying to show with my example was that this kind of behavior would be observable in OCaml as well, if it were to be implemented like that.


I think that's a C-centric assumption which is moot as Python's "for" does not create any new scopes. Just reading Knuth's man-or-boy test I was struck by the alien nature of the ALGOL 60 execution model, even though to Python it can be considered a distant ancestor.

https://en.m.wikipedia.org/wiki/Man_or_boy_test



