I am against it, because it allows arbitrary Python expressions inside format strings. It's too complicated and lets the user have two different ways of doing things (not Pythonic) - one to calculate an expression inside the string and the other to calculate it outside (which should IMHO be preferred). This should maybe go to the standard library, but please not into the language.
I think a better approach would be to just add special formatter operators (if they aren't already there) that would just call str() or repr() or ascii() to whatever is presented to them (and maybe take some optional arguments such as length or padding).
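For reference, str.format() already has conversion flags along these lines: `!s`, `!r` and `!a` apply str(), repr() and ascii() to the value, and a format spec after the colon handles width and padding. A small sketch (the variable names are made up for illustration):

```python
# Conversion flags in str.format(): !s -> str(), !r -> repr(), !a -> ascii()
name = "caf\u00e9"

as_str = "{!s}".format(name)      # same as str(name)
as_repr = "{!r}".format(name)     # same as repr(name)
as_ascii = "{!a}".format(name)    # same as ascii(name), non-ASCII gets escaped

# A conversion can be combined with a format spec for width and padding.
padded = "{!r:*>10}".format("ab")  # repr of "ab", right-aligned in 10 columns

print(as_str, as_repr, as_ascii, padded)
```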
> I am against it, because it allows arbitrary Python expressions inside format strings.
As well it should. Programming language features should be orthogonal as much as possible.
> It's too complicated and lets user have two different ways of doing things (not Pythonic) - one to calculate expression inside string and the other to calculate it outside (which should IMHO be preferred).
You must hate expression nesting, then. Look at all these ways of doing the same thing:
x = a + b * (c - d)
e = c - d
x = a + b * e
f = b * (c - d)
x = a + f
e = c - d
f = b * e
x = a + f
Clearly, expressions should be restricted to no more than one binary operator. That reduces the number of different ways of computing the expression, and forces the programmer to give a name to each sub-step, which enhances readability, clarity, and debugging-friendliness.
This is a straw man. If you want to evaluate expressions inside a long string, you can as easily write:
"The sum of a and b is " + str(a+b) + "!"
This is completely analogous to your examples. The only disadvantage of this method is the extra quotes, but that's just syntax. You could think that having the literal string split is a disadvantage, but it really isn't - since in the proposal the string has to be literal anyway, so in either case it cannot be a variable.
In general, in what I would consider good language design (syntax-wise), you either interpret expressions by default and then quote string literals (like most languages do), or you interpret literals by default and then quote expressions (this is the method that regular expression languages and templating languages use). But you shouldn't do both; it's just a can of worms, especially for code highlighting tools (although in a language with Lisp-like design philosophy - which Python is not - why not, you can do it today with reader macros and whatnot).
I'm curious if you've ever spent a lot of time using a language that has a solution similar to the proposal, for instance Ruby, or the newer JavaScript versions. All of the theoretical stuff about good syntax design and code highlighting cans of worms rings incredibly hollow to me in the face of how fundamentally nicer this sort of syntax is. It's the first thing I forget Python doesn't have after not using it for awhile, which I then go looking for, and once again find myself surprised at its absence. I also find the "un-Pythonic to have more than one way to do it" argument interesting, as there seem to be three approaches to string formatting in idiomatic use in Python (`.format`, `%`, and string concatenation), whereas in Ruby, the direct interpolation approach is used nearly exclusively.
You do lose type-checking capabilities by giving up format specifiers (which, for instance, Rust checks at compile-time), but Python can't take much advantage of that to begin with.
For all the (little) work I have done with JS I used the same technique of string concatenation, and I don't really see an issue with writing "+ instead of { (respectively +" instead of }).
Ruby has a different philosophy than Python, was originally intended to be a Perl replacement, and is more open to cryptic shortcuts like this.
In Python, by contrast, the supposedly better format() syntax actually made code longer, character-wise, than the old C-style % interpolation. Python simply tries to be more explicit, because it is used in more domains than just text processing; similar arguments can be raised against Python's treatment of regular expressions.
Ah, well, I guess all I can really say then is that it seems like you're knocking it before you've tried it. It's not any more or less cryptic or explicit than the alternatives, it's just a different, and quite convenient, way of expressing the same thing.
I agree. Regardless of the technical merits it's a much nicer experience with languages like Ruby. I like the idea of keeping cognitive overhead at a minimum by reducing the number of methods to accomplish something but at the end of the day practicality should win.
First of all, I don't think you can do {age+1} in a string right now.
But you can use string placeholders with the format function. That should take care of this use case. So there is no need to add more syntactic sugar to make it more convenient.
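To make that concrete, here is a small sketch (the variable name and the message text are invented): str.format() treats the text between the braces as a field name rather than an expression, so the arithmetic has to happen outside the string.

```python
age = 41

# str.format() looks up "age+1" as a keyword name and does not evaluate it,
# so this raises KeyError.
try:
    "Next year you will be {age+1}.".format(age=age)
    lookup_failed = False
except KeyError:
    lookup_failed = True

# The calculation has to be done outside the string.
message = "Next year you will be {}.".format(age + 1)
print(lookup_failed, message)
```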
It is already as simple as possible. Let us not try to simplify it further.
I will be horrified and very worried about the direction that the language has taken if this pep gets accepted.
This perspective is so interesting to me. You say the version using `.format` is "as simple as possible", but the interpolated version seems clearly simpler to me, because it eliminates the redundancy in the placeholders. What's the disconnect here?
> but the interpolated version seems clearly simpler to me. What's the disconnect here?
The disconnect is this. An explicit list of variables to use for interpolation is not redundancy, because the placeholders do not, by themselves, refer to anything. Placeholders are just holes and only have a meaning in the context of formatting functions. And even there, they do not refer to local variables, even when they share a common name. So when you use the .format() function, you are actually saying, "here is a string with some named holes. Fill the hole named 'A' with the value from variable 'A', the hole named 'B' with the value from variable 'B'".
So hole 'A' and variable 'A' are different things even if they share the same name.
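A short sketch of that distinction, using made-up values: the caller decides which value fills which hole, and nothing forces the names to line up.

```python
A = "from variable A"
B = "from variable B"

template = "Hole A: {A}. Hole B: {B}."

# The usual call: each hole is filled from the variable with the same name.
same_names = template.format(A=A, B=B)

# Nothing ties hole 'A' to variable 'A'; the mapping is up to the caller.
swapped = template.format(A=B, B=A)

print(same_names)
print(swapped)
```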
Now, the simplicity argument.
Everything should be made as simple as possible. But not simpler. Why not? Because when you simplify further, you are paying a cost (often not apparent initially), sometimes in clarity, sometimes in correctness and so on. The pigeonhole principle.
In our case, we can further simplify the process by adding an implicit mapping from named placeholders to local variables. When you do so, you are adding something implicit, the costs of which might not be apparent at this point.
So this pep is strapping on something to the whole language, to slightly simplify this one use case. Which is why I said that it is trying to simplify it further than it is possible.
edit: Deleted my response because this is a bikeshed. You prefer placeholders, I prefer interpolations, it's totally subjective and there's little point debating it!
I don't think this is actually a security concern. The only place f-strings are evaluated is where they're directly included in the source; they can't be supplied by a user (unless you're using "eval," in which case the security concern applies with or without f-strings). As the PEP says:
"Because the f-strings are evaluated where the string appears in the source code, there is no additional expressiveness available with f-strings. There are also no additional security concerns: you could have also just written the same expression, not inside of an f-string."
I'd submit the security concern is more that specifying a string interpolation format without first-class thought about how to encode interpolated strings is a terrible idea in 2015, and people need to really stop doing this. See http://www.jerf.org/iri/post/2942 and the example library I use to demonstrate the point, https://github.com/thejerf/strinterp .
Of course all current methods of string interpolation in Python have that problem too.
And typing that sentence really, really makes me want to link http://xkcd.com/927/ . I'm unconvinced adding a fourth choice at this very late date can fix anything.
I will go further and say that I will even take readability over convenience.
This is a step in the reverse direction. Please don't do this. I am not sure why this is even considered. We already have ways to do this clearly. Let us not add another way to do this that is less readable but a lot easier to write. That is a deadly combination.
Features in python are geared towards more readable code (I know about the stuff you can do with things like comprehensions, but hey I think their power justifies them enough). This will lead to people using this format due to initial convenience, but ending up regretting it.
Please remember that code is read more often than it is written. So requiring a little verbosity, if that can enhance readability even a little bit, is good. I hope these kinds of good things about python do not get removed.
I am coming from 9 years of experience with PHP. And I will say that this is not worth it. And this is actually one of the features I have come to like in Python now.
And that is not considering the implications of having expression evaluation inside strings...
This is a simple example that does seem more readable (though it also would be if you explicitly added the value=value), but I think the concerns people have are about more complex examples, and the fact that readability also suffers when arbitrary code is allowed within strings themselves.
If you don't like using the empty {} (which seems fair enough), you can already use a couple of slightly more verbose but more explicit options:
"{value} of very long text here...".format(**locals())
or (more verbose with many keys)
"{value} of very long text here...".format(value=value)
I tend to prefer the latter even with multiple keys, but also think the former is better as it makes it explicit that the string is populated with the locals rather than implicitly doing it to all strings.
The documentation at https://docs.python.org/3/library/functions.html#locals doesn't indicate that locals is not part of the language - where is it documented that using it is a hack and that python implementations need not support it?
I think this is where our difference comes from.
When we are reading code, we are not usually reading the strings contained within.
We are reading variables names, we are looking at where they are used, where they are assigned, stuff like that.
I very rarely look inside the strings when we are actually reading code. When reading code, strings are just black boxes where nothing can happen.
So we can completely ignore them.
With this pep, that will change. And readability takes a big hit. Strings are no longer black boxes.
And actually you can get the "{value} of very long text" format right now using the format function, but right now it requires an explicit list/dictionary of variables.
This explicit list of variables really helps in the context of reading code. And forcing users to write down that explicit list of variables to use, is good.
How do you know where the string ends? If your editor has syntax coloring and you look for the end of the string-color block, then the expressions inside the string will also have a different color and will pop right out. If you don't use syntax coloring you have to scan for " or ', and you'll learn to scan for { too.
I usually just go to the end of the line. or if that is not possible, I think I usually look at the next closest thing that obviously is not a string and scan backwards.
I mean, usually I didn't have to do anything consciously (or do a scan) to spot the end of a string.
When I'm reading code, I'm always reading string literals contained within it. Strings that aren't integral to the code should be external resources, and I'm not even convinced it's possible to black-box them at a below-conscious level; you can selectively actively ignore them, but that takes additional effort that is only generally a net win in particularly poorly structured code.
Also coming from a php background, I'll 100% agree with this. I always cringe when I see strings formatted like this for pretty much the same reasons. It makes it painful to figure out what the hell a string is actually going to look like when you have a bunch of code embedded inside it
disagree. it only allows Python expressions in the same context in which they are already allowed: in the code. it's not possible to create an f-string at runtime.
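A minimal sketch of that point, assuming an interpreter that implements PEP 498 (Python 3.6 or later); the variable names are invented. Text that arrives at runtime is only ever a value, so braces inside it are not evaluated.

```python
user_input = "{secret}"   # pretend this came from an untrusted user
secret = "hunter2"

# The f-string is fixed in the source code; user_input is just a value.
# Its braces are inserted as plain text and never evaluated.
greeting = f"You typed: {user_input}"

print(greeting)
```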
Security was not my primary concern, but even if the design can be made secure, it is still a concern (because bugs happen). In any case, you just stated another counter argument - it breaks orthogonality of the language (string being interpolated must be a literal), just to make one assignment not explicit.
Maybe it is popular in other languages - their call, but the fact is, it goes quite wildly against Python design philosophy.
I would also like to note that there are templating systems that let you evaluate arbitrary Python expressions. Perhaps these would be a better choice for users who feel the need for this proposal.
i agree that there are multiple templating solutions, but i disagree that they fill the same niche. this is a language usability (UX even) thing, not a purity thing. the fact is people do '...'.format(**locals()) for the same purpose (or '...' % locals()), which is strictly worse and may not even work on some python implementations. i also disagree that this feature goes against the zen of python: practicality beats purity, after all.
It would also render the previous ways obsolete. New-style formatting never took over completely from old-style because it wasn’t sufficiently compelling—`"%s %s" % (a, b)` versus `"{} {}".format(a, b)` doesn’t have a clear winner. But `f"{a} {b}"`? Clearly superior. With the exception of backwards compatibility matters (which will be a nuisance for far too long), there would really be no reason to keep using the old ways in most places. i18n/l10n would really be the only mainstream reason for using anything other than f-strings.
"It would also render the previous ways obsolete."
I don't think it would. If I understand this PEP correctly, the "format" method is significantly more dynamic. For instance, I don't think this new PEP would allow for cases where the template isn't stored directly in the program, or where the values to interpolate are not local variables. So you would still need to keep str.format around for those use cases.
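A sketch of the kind of case meant here. The template table and the `render` helper are hypothetical, standing in for templates that live outside the program (a config file, a translation catalogue) and values that are not local variables.

```python
# Hypothetical templates that would normally be loaded at runtime.
templates = {
    "en": "Hello {name}, you have {count} new messages.",
    "de": "Hallo {name}, du hast {count} neue Nachrichten.",
}

def render(lang, **values):
    # The template is chosen at runtime, so it cannot be an f-string literal.
    return templates[lang].format(**values)

record = {"name": "Ada", "count": 3}   # values need not be local variables
print(render("en", **record))
print(render("de", **record))
```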
Your example isn't very compelling, but for longer strings with more complex formatting, I'd say .format() is pretty compelling. I find that in general
'{x} blah blah blah {y}'.format(x=x, y=y)
is more readable than
'%s blah blah blah %s' % (x, y)
even if the former is a bit longer.
On top of that, there's a whole bunch of stuff you can do with .format() that just isn't possible with %.
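A few examples of what that can look like. As far as I know none of these has a direct % equivalent; the values are made up.

```python
z = 3 + 4j

by_attr = "real part: {0.real}".format(z)   # attribute lookup inside the field
centered = "[{:^9}]".format("mid")          # centre alignment
grouped = "{:,}".format(1234567)            # thousands separator

print(by_attr, centered, grouped)
```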
I would be very happy to see some version of this accepted.
When it comes to native String interpolation Groovy has it, Scala has it, ES6 has it apparently; to a more limited extent Bash, PHP, Perl have it too of course.
I can't help feeling that other devs are now coming to Python expecting this kind of feature, and are disappointed to find three (harder, often less readable) ways to do it instead. Got to keep up with the Joneses, etc...
It doesn't mean this is a good feature to have - it allows objects within scope to now be directly included in a string, which isn't a secure thing to do.
Nonononono. Python does not need more string literal specifiers. For a language that avoids symbols and sigils like the plague, it already has an absurd amount of string literal syntax.
Why not make a strfmt library on pypi that provides a single fmt(s, args, kwargs) function and let people call that? Why the obsession with more builtins?
The first thing I thought of when I looked at the PEP was, "this is like a string version of register_globals=on".
A string literal whose value automatically changes with the code surrounding it sounds like a really bad idea.
I also noticed that the PEP uses str.format method as a strawman, ignoring the fact that % string interpolation is very popular and does not need replacing, which is at the core of this problem in the first place; Someone keeps trying to replace something that does not need replacing.
Furthermore, I can't help but think that this would eventually become a complete literal string DSL (if not one already) inside of Python.
> The first thing I thought of when I looked at the PEP was, "this is like a string version of register_globals=on", which is an unsettling thought to have about Python, my favorite language.
> The idea of having a string that is automatically dynamic and whose value is hardly predictable upon first glance, wholly dependent on the stability of the code surrounding it, sounds like an absolutely horrendous idea.
What?! It's not a dynamic string. It's a string concatenation expression with syntactic sugar.
This isn't PHP's register_globals. It's PHP's "{$n + 1}".
What this does, you can already do. "Foo " + bar + " baz" already exists. This is merely nicer syntax.
"Foo " + bar + " baz" is not a string literal, and neither is one using format or the % operator. All of those things return dead strings. This pep is about creating a kind of 'live' string literal, which python does not have right now (or need IMHO). So this is not merely a nicer syntax.
It is live in the sense that you can take that string literal and put it in a different context and it will result in a different evaluation, possibly yielding different value for the string and possibly different side effects (after calls to evaluate expressions).
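Both views can be seen in a small sketch, assuming an interpreter that implements PEP 498 (Python 3.6 or later); the function names are invented. The same literal text yields different strings depending on where and when it is evaluated, and it matches the equivalent concatenation each time.

```python
def describe(n):
    # {n + 1} is evaluated against the current value of n on every call.
    return f"n + 1 is {n + 1}"

def describe_concat(n):
    # The concatenation that the f-string above stands in for.
    return "n + 1 is " + str(n + 1)

first = describe(1)
second = describe(10)
print(first, "|", second)
```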
I switch back and forth between format and %, and never use locals in the format. It's annoying, every time a string is written, to try to decide which way is better for this instance.
That said, is there a way to do ('%.3f' % x) with this?
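As I read the PEP, yes: f-strings accept the same format specs as str.format() after a colon. A quick sketch, assuming Python 3.6 or later:

```python
x = 3.14159265

old_style = '%.3f' % x           # printf-style
new_style = '{:.3f}'.format(x)   # str.format()
f_string = f'{x:.3f}'            # the same format spec inside an f-string

print(old_style, new_style, f_string)
```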
I am having some problems trying to understand the implementation. What would the AST from evaluating "f'{a+1}'" look like? Will there be a special AST node for f-strings, or will it be pre-structured into the AST?
If it's a special node, is it the responsibility of the byte code generator to parse the string? My belief is that it's part of the parser's job, so the AST will never contain an f-string.
What does a syntax error report look like? Or traceback? Will it be able to narrow down the part of the string which causes a problem?
Can f-strings include f-strings, like:
f"{a + (f' and {b+1}')}"
I assume the answer is 'yes, and you shouldn't do that', which I can accept.
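For what it's worth, this does evaluate in interpreters that implement PEP 498 (Python 3.6 or later), as long as the inner literal uses a different quote character. The values below are placeholders.

```python
a = "first"
b = 1

# The inner f-string is evaluated first, then concatenated onto a.
nested = f"{a + (f' and {b+1}')}"

print(nested)
```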
Support for arbitrary expressions inside of an f-string means that the following is allowed,
and will work, and will print something, but it won't be "1 = 2". Nor will any but heavy-weight analysis tools be able to figure out that this 'f' is a generator.
if attr=='yields' :
    yield_unit = self._grab_attr_(obj,'yield_unit')
    if yield_unit:
        ret = '%s %s'%(ret,yield_unit) # FIXME: i18n?
    return ret
The penultimate line could be rewritten, validly, as:
ret = f'{ret} {yield_unit}' # FIXME: i18n?
The introduction of a typo, from 'yield_unit' to 'yield unit', would drastically change the function, and be very hard to spot.
ret = f'{ret} {yield unit}' # FIXME: i18n?
Yes, Don't Do That, but we know that people use things like syntax highlighters to help understand the code and identify mistakes like this.
EDIT: the PEP says that the expressions are "parsed with the equivalent of ast.parse(expression, '<fstring>', 'eval')". That means that 'yield' is not allowed.
Presumably it'd be handled similarly to how PHP handles {} and $ in strings. As soon as possible, you swap "{foo} {bar}" for (foo.__format__() + " " + bar.__format__())
It's not so simple. The full implementation has to do something like insert an AST into the right place, because {foo} can be an expression like f"{__import__('math').cos(len(s))}".
The Python tokenizer will pass an f-string to the AST builder, which has to pass the string to another tokenizer to generate the new AST. Because f-strings can contain f-strings, this process is recursive. The end result is a new AST that replaces the original f-string.
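One way to look at the result is to dump the tree with the ast module. This sketch reflects how CPython 3.6 and later ended up representing f-strings (a `JoinedStr` node holding `FormattedValue` children), which is an assumption about the eventual implementation rather than something this thread settles.

```python
import ast

# Parse the f-string as an expression and look at the resulting node.
tree = ast.parse("f'{a+1}'", mode="eval")
node = tree.body

print(type(node).__name__)                   # node type for the whole f-string
print(type(node.values[0]).__name__)         # node type for the {a+1} field
print(type(node.values[0].value).__name__)   # the parsed expression inside it
```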
>In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it
Well, now you're saying that the reform is wrong (the suggested one or the previous ones?), but I said that I agree with it. So it feels like some arguments are missing between the post you are responding to and the Fence, right?
I think you're right that Interpy and PEP 498 have a different, earlier binding strategy than say() does. I've found the late / local binding of say() and fmt() convenient. Are there use cases where that earlier binding is valuable or critical?
Your example highlights the binding strategy, but more typical would be:
There's an interesting competing PEP which allows for the way in which the expressions are interpolated into the string to be customized: https://www.python.org/dev/peps/pep-0501/
I hope this gets implemented. A year ago or so we had several customers wanting to custom format output filenames and directories in a desktop application and we settled for something which is almost exactly this, so the {identifier:format spec} idea, and ever since implementing it I wish any language had it, as we found it really convenient and having no apparent disadvantages in comparison with printf-%-style in C or Python/streams in C++/{}-style in C#/Python
I don’t understand what you’re seeking. This is purely for string literals. If you’re taking a user input string like `"foo-{date}.{extension}"`, you could use `string.format(date=…, extension=…)`
Yes it's for literals only, just saying I like the inline style and readability and succinctness of it and wouldn't mind it being implemented in Python and other languages.
But this wouldn't solve your case because it could only be applied to strings that are in the source code. Your users don't have the ability to change those strings. Instead, they want to provide a format string, which you fill in. That is already available, eg, via str.format(**kwargs), where you control the kwargs and the user controls the str.
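A sketch of that split of responsibilities. The `build_filename` helper and the pattern are hypothetical; the user supplies the pattern, including a format spec, and the program decides which names are available.

```python
import datetime

def build_filename(user_pattern, date, extension):
    # The user controls the pattern; the program controls the keyword names.
    return user_pattern.format(date=date, extension=extension)

pattern = "foo-{date:%Y-%m-%d}.{extension}"   # e.g. typed into a settings dialog
name = build_filename(pattern, datetime.date(2015, 8, 1), "csv")

print(name)
```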
However, str.format() is not without its issues. Chief among them is its verbosity. For example, the text 'value' is repeated here:
>>> value = 4 * 20
>>> 'The value is {value}.'.format(value=value)
'The value is 80.'
Even in its simplest form, there is a bit of boilerplate, and the value that's inserted into the placeholder is sometimes far removed from where the placeholder is situated:
>>> 'The value is {}.'.format(value)
'The value is 80.'
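For comparison, the same example written with the proposed syntax, assuming an interpreter that implements PEP 498 (Python 3.6 or later):

```python
value = 4 * 20

with_keyword = 'The value is {value}.'.format(value=value)
positional = 'The value is {}.'.format(value)
interpolated = f'The value is {value}.'   # the name appears once, in place

print(interpolated)
```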
You can even hack it into a string class if you don't mind using even more scary hacks like monkey patching built in classes.
def I(s):
    import inspect
    # Look up the caller's local variables via its stack frame.
    frame = inspect.currentframe()
    caller_locals = frame.f_back.f_locals
    return s.format(**caller_locals)

def main():
    a = 12
    b = 10
    print(I('A is {a} and B is {b}'))

if __name__ == '__main__':
    main()
This is a very handy feature of PHP and would be useful in Python. I think readability will be solved by syntax highlighting: expressions in an f-string would be highlighted like normal expressions, rather than like string content. This is what is already done for PHP.
I'm really happy to see this. I know it's petty, but this was the biggest reason I decided to focus on learning the ins and outs of Ruby instead of Python.
The fact they had to hack in a workaround so != works is a point against it. And they acknowledge you can use repr()/str()/ascii() directly.
They want to keep it for str.format() compatibility, but I'm unconvinced. It hurts readability, and is redundant (There should be one-- and preferably only one --obvious way to do it.)
Something like that would be very easy to add to a linter, even if not to the official PEP8. (Although I totally agree, only use 1 method, whichever it is...)