First example scrolling through that post: How is static keymaps: [[[u16; 3]; 2]...

pcwalton · on Nov 21, 2019

Types going after identifiers avoids the need for the lexer hack, which causes all sorts of problems in C (such as "typename"). A colon nicely separates the two; I prefer something there as opposed to "x int" like in Go.

You have to nest square brackets to avoid ambiguity. Is &int[] an array of references or a reference to an array?
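
As a minimal sketch of how the nested brackets settle that question in Rust (the function and variable names here are made up for illustration), the two readings are simply spelled differently:

```rust
// A slice of references: each element is a `&i32`.
fn sum_refs(items: &[&i32]) -> i32 {
    items.iter().map(|r| **r).sum()
}

// A reference to an array: the array itself holds plain `i32` values.
fn sum_array(items: &[i32; 3]) -> i32 {
    items.iter().sum()
}

fn main() {
    let (a, b, c) = (1, 2, 3);
    let array_of_refs: [&i32; 3] = [&a, &b, &c];
    let ref_to_array: &[i32; 3] = &[1, 2, 3];
    println!("{}", sum_refs(&array_of_refs));
    println!("{}", sum_array(ref_to_array));
}
```

The `&` sits either inside or outside the brackets, so neither type can be mistaken for the other.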

Rusky · on Nov 21, 2019

Rust has an equivalent to `typename`, and it even requires it more often- the turbofish.

pcwalton · on Nov 21, 2019

The turbofish rule is much easier to learn: just use ::<> when explicitly providing types for a call. The C++ concept of a "dependent qualified name" is a lot harder to explain.

What's important isn't how often you need to help the compiler: it's how easy the rules are. The turbofish is unfortunate, but it's nowhere near as bad as typename.
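
For readers who have not met it, here is a small sketch of the turbofish rule, using only standard library calls (the helper names are hypothetical):

```rust
fn parse_port(text: &str) -> Option<u16> {
    // The target type is given explicitly at the call site with `::<>`.
    text.parse::<u16>().ok()
}

fn count_evens(limit: u32) -> usize {
    // `collect` can build many collections; the turbofish picks `Vec`.
    (0..limit).filter(|n| n % 2 == 0).collect::<Vec<_>>().len()
}

fn main() {
    println!("{:?}", parse_port("8080"));
    println!("{}", count_evens(10));
}
```

In both calls the `::<>` appears only where a type is supplied explicitly to a generic function, which is the whole rule.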

Rusky · on Nov 22, 2019

But C++ could take the same route of consistency regardless of the lexer hack.

The turbofish is a strict superset of `typename`, and everywhere C++ lets you skip it it could simply require it instead.

pcwalton · on Nov 22, 2019

I don't think that's true. Consider a modified version of [1]:

    template<typename T> class X {
        void foo() {
            typename T::A* pa;
        }
    };

The problem here is that C++ can't parse this without knowing whether T::A is a type or not. Otherwise it might be "T::A multiplied by pa". This is the lexer hack in action.

Rust, by contrast, has no such limitation [2]:

    trait SomeTrait {
        type A;
    }
    struct X<T> {
        f: T,
    }
    impl<T> X<T> where T: SomeTrait {
        fn foo() {
            let pa: *mut T::A;
        }
    }

This compiles and runs just fine with no need for a turbofish on T::A, because Rust has no lexer hack.

[1]: https://en.cppreference.com/w/cpp/language/dependent_name

[2]: https://play.rust-lang.org/?version=stable&mode=debug&editio...

Rusky · on Nov 22, 2019

That's true but antiparallel to my point:

C++ typename could have the same consistency as Rust's turbofish- its complicated rules are not necessitated by the lexer hack.

(In a sense, the complicated rules are what enable the lexer hack.)

pjmlp · on Nov 22, 2019

C++20 has simplified the need for typename use cases.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p063...

discardable_dan · on Nov 21, 2019

"The syntax is this way to make lexing it easier" is not a good argument for syntax. Ever. Lex it into tokens, parse it using semantic analysis, and be done. Plenty of compilers have been doing this for a long while now, and plenty of work has been done to make this a non-problem. Choosing syntax because it's slightly-easier to implement but slightly-harder to use is not a recipe for adoption.

pcwalton · on Nov 22, 2019

The problems with the lexer hack are user-facing problems, not compiler-writer-facing problems. They include typename, order of declarations being significant, weird function pointer syntax, and the most vexing parse.

discardable_dan · on Nov 22, 2019

I'm not advocating for the lexer hack. There are non-hack-y alternatives, hiding this pain from users. The options of "the lexer hack" or "identifiers first" is a false dichotomy. There are many ways to lex and then semantically analyze programs, and I do not understand why you are arguing as if that is not true.

steveklabnik · on Nov 21, 2019

What’s regular for computers is also more regular for humans. You’re absolutely right that taken to an extreme, doing things for computers isn’t great, but neither is making a super complex grammar.

discardable_dan · on Nov 22, 2019

> What’s regular for computers is also more regular for humans.

What does this even mean? Can you define "regular"?

> neither is making a super complex grammar

This isn't about grammatic complexity, it's about the location of a type relative to the associated identifier.

steveklabnik · on Nov 23, 2019

Regular has a technical meaning here, that is, Chomsky’s grammar hierarchy. It’s where the “regular” in “regular expression” comes from. That said, I’m using it in an imprecise way here to mean “simpler to process.” (This is because regular languages are simpler to process than say, context-sensitive languages.)

Location of the type is about grammar complexity. Rust’s grammar plays into its type inference capabilities, and the pattern syntax. There’s an underlying uniformity with syntax elsewhere.

gbear605 · on Nov 21, 2019

The example of typename shows that it’s a problem that can’t be overcome by the compiler, so it’s trading off one bad syntax for another, not trading off bad syntax for difficulty to implement.

discardable_dan · on Nov 22, 2019

Wikipedia disagrees with you: https://en.wikipedia.org/wiki/The_lexer_hack

petschge · on Nov 21, 2019

I am sure there are good theoretical arguments. But they are hard on the humans. Ideally I would want something like

  constdata keymaps[1,2,3, u16]

That is easy to read, gets rid of all the extra line noise and directly tells me everything I need to know about memory layout and performance.

  1.) it is constant, known at compile time and can be put into a read-only segment (or possibly flash rom on an embedded system).

  2.) it is named keymaps. The name is important and should come early

  3.) it is an array. arrays and primitive datatype have many important differences and programming languages should not try to hide that.

  4.) it has dimensions 1 by 2 by 3 (in that order). Listing the "3" first in Rust when the first dimension only has extent 1 might have good reasons but is damn hard to read if you have more than 2 dimensions. Especially if you end up with things like 3 by 3 by 4 by 3. Which of the inner two is larger?

  5.) Having the type of the element last makes sense, because in terms of memory layout that just means that we have 2 consecutive bytes. It also makes it easier to switch from "a 1 by 2 by 3 array of u16" to "a 1 by 2 array of (three vectors of u16)".

Now you will probably give me reasons why I can't have that. But when I am coding I don't care how hard it is on the compiler writers (as long as I can express things unambiguously), but want to have it as easy as possible so I have brain cycles to spare to think about data layout and algorithms.

cdirkx · on Nov 21, 2019

You can have anything you want with macros :)

I just wrote a macro `array` [1] that allows you to write

  #[no_mangle]
  static keymaps: array!{ u16[1, 2, 3] } = [
    [
      [1, 2, 3],
      [4, 5, 6],
    ],
  ];

or alternatively using a type alias

  type Keymaps = array!{ u16[1, 2, 3] };

  #[no_mangle]
  static keymaps: Keymaps = ...

On a more serious note, in general Rust favors explicit simple syntax: the only syntax related to arrays you need to learn is `[TYPE; LENGTH]` which is the way to write an array of type TYPE and length LENGTH, pretty straightforward. `[[[usize; 3]; 2]; 1]` is simply a composition of such arrays, as multidimensional arrays are just arrays of arrays.

C has a few more variants: the implicit length of `keymaps[]`, the `[0] = ...` initializer, the alternative `keymaps[1,2,3]` syntax. This is nice syntactic sugar, but you don't technically need it. Although if you really don't like the raw Rust syntax, you can always use macros like shown or a library like multiarray [2].

In a way I would say this makes Rust easier to learn: there are only a few symbols and patterns you need to learn to recognize, and the rest is compositions.

[1] WARNING: macro definitions are very symbol heavy, and thus even more unreadable.

https://play.rust-lang.org/?version=nightly&mode=debug&editi...

[2] https://docs.rs/multiarray/0.1.3/multiarray/
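
The playground link above is truncated, so the following is not the macro from that link. It is only a hypothetical sketch of how an `array!` macro of this shape could be written with `macro_rules!`, assuming the first listed dimension should become the outermost array:

```rust
// Hypothetical macro: `array!{ T[a, b, c] }` expands to `[[[T; c]; b]; a]`.
macro_rules! array {
    // One dimension left: a plain array type.
    ($t:ty [ $n:expr ]) => { [$t; $n] };
    // Several dimensions: the first one becomes the outermost array.
    ($t:ty [ $n:expr, $($rest:expr),+ ]) => { [array!($t [ $($rest),+ ]); $n] };
}

type Keymaps = array!{ u16[1, 2, 3] };

static KEYMAPS: Keymaps = [[[1, 2, 3], [4, 5, 6]]];

fn main() {
    // Same size as the hand-written `[[[u16; 3]; 2]; 1]`.
    println!("{}", std::mem::size_of::<Keymaps>());
    println!("{}", KEYMAPS[0][1][2]);
}
```

The macro only rewrites the type; initializers and indexing still use the ordinary nested-array syntax.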

pcwalton · on Nov 22, 2019

The consequences of the lexer hack are hard on humans (typename, most vexing parse, order of declarations being significant, weird function type syntax), not just compiler writers.

Ar-Curunir · on Nov 22, 2019

You can already define a custom type which will allow you to have a nice syntax for multidimensional arrays: `Matrix<1,2,3>`. It solves your issue of nesting brackets, and you can impl arbitrary indexing for it.

kazagistar · on Nov 22, 2019

Unfortunately, rust does not have numeric types outside the special case baked in arrays, so it cannot do that yet afaik. There is a ticket for it, but it needs work.

steveklabnik · on Nov 22, 2019

You can sorta do it kinda today: https://crates.io/crates/typenum

But it will be much nicer and better once const generics lands, it's true.

Ar-Curunir · on Nov 22, 2019

You can do it in nightly already.
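
Const generics have since been stabilized (in Rust 1.51), so a type along these lines now works on stable. This is a minimal sketch only: `Grid` and its field are hypothetical names, and the element type is fixed to `u16` for brevity.

```rust
use std::ops::Index;

// Hypothetical type: dimensions are part of the type, outer dimension first.
struct Grid<const A: usize, const B: usize, const C: usize> {
    cells: [[[u16; C]; B]; A],
}

impl<const A: usize, const B: usize, const C: usize> Index<(usize, usize, usize)>
    for Grid<A, B, C>
{
    type Output = u16;

    // Indexing with a tuple hides the nested brackets from the caller.
    fn index(&self, (a, b, c): (usize, usize, usize)) -> &u16 {
        &self.cells[a][b][c]
    }
}

fn main() {
    let keymaps: Grid<1, 2, 3> = Grid { cells: [[[1, 2, 3], [4, 5, 6]]] };
    println!("{}", keymaps[(0, 1, 2)]);
}
```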

cdirkx · on Nov 21, 2019

Also, normally the array in Rust would also be a constant, with the keyword `const` instead of `static`: the reason it is static however, is so the C program can access it.

petschge · on Nov 22, 2019

I don't mind the const vs static. I mind just about everything else.

inferiorhuman · on Nov 21, 2019

Sure I can parse Rust. But it is definitely more complicated, and more "noisy".

I'm dabbling in some microcontroller stuff currently. The one thing I've noticed is that the Arduino (C++) environment seems to rely on magic. Lots of mysterious constants, registers, etc and it's not entirely clear what's what.

Meanwhile using rust in this environment is very explicit. It's quite a bit more verbose than the C++ version. I'm also sure some of this is due to me having to write the implementation itself but for me it's a lot easier to understand what's going on when things are nicely typed. It's the difference between:

  pinMode(LED_BUILTIN, OUTPUT)

which can be expanded to:

  PIO_Configure(
      g_APinDescription[ulPin].pPort,
      (g_pinStatus[ulPin] & 0xF0) >> 4 ? PIO_OUTPUT_1 : PIO_OUTPUT_0,
      g_APinDescription[ulPin].ulPin,
      g_APinDescription[ulPin].ulPinConfiguration);

  g_pinStatus[ulPin] = (g_pinStatus[ulPin] & 0xF0) | PIN_STATUS_DIGITAL_OUTPUT;

and

  let mut pioc = p.PIOC.split(&mut pmc);
  let mut blue = pioc
      .pc25
      .into_peripheral_b(&mut pioc.absr)
      .into_push_pull_output(&mut pioc.mddr, &mut pioc.oer);

expanded to (I used a proc macro here):

  absr.absr().write(|w| w.#accessor().set_bit());
  oer.oer().write_with_zero(|w| w.#accessor().set_bit());

Specifically I really like that I can access the parts of the register by name and that access is typed. You can't write to a read-only register, you can't modify a write-only register, and if you don't have a default value defined you call something else (e.g. write_with_zero) that makes it clear what you're doing.

Edit: Another thing I really prefer about the Rust API over the Arduino/C++ one is that state is encoded in types. So you may have a GPIO pin PC25. But that type takes a type parameter indicating state e.g. PC25<PeripheralB<Output<PushPull>>>. Yeah that's verbose but it's also very explicit. If you have something (e.g. UART/USART driver) that needs a pin to be configured in a specific manner you'll have to go through some non-trivial effort to pass an incorrectly configured pin in. As a result if your program compiles you can be more confident that it will do what you expect.
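
A stripped-down sketch of that idea, with no real hardware access: all type and method names below are hypothetical stand-ins, not the API of any particular HAL crate.

```rust
use std::marker::PhantomData;

// Hypothetical marker types for pin states; real HAL crates define their own.
struct Unconfigured;
struct PushPullOutput;

// A pin whose configuration is tracked in its type parameter.
struct Pc25<Mode> {
    _mode: PhantomData<Mode>,
}

impl Pc25<Unconfigured> {
    fn new() -> Self {
        Pc25 { _mode: PhantomData }
    }

    // Consumes the unconfigured pin and returns it in its new state.
    fn into_push_pull_output(self) -> Pc25<PushPullOutput> {
        Pc25 { _mode: PhantomData }
    }
}

// A driver that only accepts a correctly configured pin.
struct Led {
    _pin: Pc25<PushPullOutput>,
    on: bool,
}

impl Led {
    fn new(pin: Pc25<PushPullOutput>) -> Self {
        Led { _pin: pin, on: false }
    }

    // Flips the software-side state and reports the new value.
    fn toggle(&mut self) -> bool {
        self.on = !self.on;
        self.on
    }
}

fn main() {
    let pin = Pc25::<Unconfigured>::new().into_push_pull_output();
    let mut led = Led::new(pin);
    println!("{}", led.toggle());
    // Led::new(Pc25::<Unconfigured>::new()) would not compile:
    // the pin is still in the `Unconfigured` state.
}
```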

liamdiprose · on Nov 22, 2019

I love the typed API svd2rust generates as well. Generally you can just let autocomplete do the driving, the only things needing a brain are the abbreviated register names manufacturers use and the order of operations needed.

I wonder if Rust would be better suited for Arduino/embedded beginners. Rust is quite painless when you just want to glue a few crates together. I'm sure everyone would rather debug a compiler error than some invalid memory issue happening on the microcontroller.

Avamander · on Nov 22, 2019

The "mysterious constants, registers, etc" are very readable once you get familiar with the agreed abbreviations and the MCU you're programming for (and how bit shifts work in the code example you gave). I really disagree that Rust is somehow more readable, especially not the example you brought. How can you say `pioc.oer` is somehow more understandable than any piece of that C code?

inferiorhuman · on Nov 22, 2019

> How can you say `pioc.oer` is somehow more understandable than any piece of that C code?

Easily. pioc.oer tells you that you're using the PIOC peripheral, oer tells you that you're manipulating the oer register specifically, and the mutable reference (&mut) indicates that you're modifying it (and the borrow checker ensures you're not going to be modifying it in two places at once).

Additionally the rust embedded folks have a practice of returning a "constrained" structure from an initialization function. Typically the configuration function will take ownership of the peripheral and then return a restricted wrapper around it. This means that if you're doing something that will result in immutable registers you'll get back a structure that doesn't allow you to modify those registers. So, for instance, trying to configure the watchdog timer twice on the MCU I'm using will not compile because you don't even have that original object around anymore. If the program compiles you're probably OK.
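
The "constrained" pattern can be sketched without any hardware. The names below are hypothetical and the struct only stores a number instead of writing a register; the point is that `configure` takes `self` by value.

```rust
// Hypothetical watchdog peripheral; names are illustrative only.
struct Watchdog;

// What remains after configuration: no method here can reconfigure it.
struct ConfiguredWatchdog {
    timeout_ms: u32,
}

impl Watchdog {
    // Takes `self` by value, so the original handle is gone afterwards.
    fn configure(self, timeout_ms: u32) -> ConfiguredWatchdog {
        ConfiguredWatchdog { timeout_ms }
    }
}

impl ConfiguredWatchdog {
    fn timeout_ms(&self) -> u32 {
        self.timeout_ms
    }
}

fn main() {
    let wdt = Watchdog;
    let configured = wdt.configure(500);
    println!("{}", configured.timeout_ms());
    // wdt.configure(1000); // would not compile: `wdt` was moved above
}
```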

Nowhere in that C code is any of that referenced. There's no idea which peripheral is being manipulated. ulPin and ulPinConfiguration both expand to integers, so there's no guarantee you've even gotten the parameters in the right order. Likewise the shifting and masking is exposing unnecessary implementation details, and things that compile don't necessarily do what you think they may do (e.g. modifying immutable registers).

Avamander · on Nov 22, 2019

> There's no idea which peripheral is being manipulated.

The Arduino-y function usually maps internal pin mappings to the pin numbers on the given PCB. It's quite clear if you know what you're compiling the code for. Not to mention that how a pin-remapping function works usually doesn't matter, the resulting abstraction is very nice to use.

> pioc.oer tells you that you're using the PIOC peripheral, oer tells you that you're manipulating the oer register specifically

And why should I care about that information when I already have the abstraction written before? The Rust code is much much worse in terms of ease-of-use in this case - manually having to look up how pins on the board map to internal registers is cumbersome. Some of the confusion about the names might also stem from that I expect consistent capitalization when dealing with registers, why would `pioc` be lowercase if it's in reality a register being modified? It's actually weird. I won't even begin how horrible-looking the "expanded" form is, compared to the 2-line C equivalent.

One more thing I just now realized, the Rust team made an incredibly bad decision picking the symbols, for example for mutable references. I don't see anyone with any good amount of C experience ever wanting to use Rust if they have to re-learn what `&` really means - useless waste of time for most. It's akin to designing a new safer bike but switching the handlebar direction. And in the end, the amount of symbols in Rust, combined with how annoying they're to type on non-US layouts, combined with the (wrongly) carried over connotations from other languages makes it a terrible replacement for what it's advertised for.

> So, for instance, trying to configure the watchdog timer twice on the MCU I'm using will not compile because you don't even have that original object around anymore.

That is very cumbersome and illogical. Reconfiguration is quite common.

> There's no idea which peripheral is being manipulated.

I think that's just your unfamiliarity with the platform and the example you chose. If you'd write the exact same code you brought as an example in C it'd be much clearer than the Rust code and just two lines. I'd love to see the asm of the Rust code.

inferiorhuman · on Nov 22, 2019

Your criticism (e.g. OMG wrong case, OMG C uses & to mean something else) seems mostly centered around the fact that rust isn't C and less around the merits of rust itself. But I'll bite...

> Some of the confusion about the names might also stem from that I expect consistent capitalization when dealing with registers, why would `pioc` be lowercase if it's in reality a register being modified?

Typically in rust screaming snake case is reserved for constants. In this case, pioc is not a register (so there you go). In the context of the embedded stuff the peripherals typically get screaming snake case names at the top level struct. In this example I've configured it and assigned it to a local variable named pioc.

> That is very cumbersome and illogical. Reconfiguration is quite common.

In the example I gave it's not possible. After an initial write to the watchdog's configuration register all subsequent writes are ignored by the MCU. That's the whole point of having compile time checks. If it compiles, it's probably OK. If it doesn't compile you're probably doing something that won't work or won't do what you expect.

If you were to take as the example something that can be modified, you'd still have the functions lying around to modify the register.

> I think that's just your unfamiliarity with the platform and the example you chose.

I think you'd probably want to guess again. Which peripheral is being manipulated? What happens if I get a magic number wrong and that function operates on the wrong peripheral?

Avamander · on Nov 23, 2019

> seems mostly centered around the fact that rust isn't C and less around the merits of rust itself.

If a language is advertised as a replacement for C/C++ then one can reasonably expect there to be little that works counter-intuitively coming from the to-be-replaced language.

> In this case, pioc is not a register (so there you go).

"It's in reality a register being modified" is not the same as "it's a register".

> After an initial write to the watchdog's configuration register all subsequent writes are ignored by the MCU. If it compiles, it's probably OK.

Hardware isn't perfect. You brought up the immutability of the watchdog timer as a benefit, it really isn't, that's all I wanted to say. I also doubt that just a compiling piece of Rust can usually handle a hardware failure or an error.

> Which peripheral is being manipulated? What happens if I get a magic number wrong and that function operates on the wrong peripheral?

If you already have an Arduino-y abstraction then it doesn't matter in which language it's written, the same opaqueness would happen if you can't look up the pin mapping from documentation. If you really need to see which peripheral is being manipulated then it's not difficult to write two very simple lines of C to do the same thing just as clearly. You're comparing two very different things, it's just not a very good comparison.

rovolo · on Nov 22, 2019

I'm not sure this is the reason, but function types look better if you put the return type after the parameter types. Compare C to Kotlin:

    float (* f)(int);
    val f : (int) -> float;
    float[] (* map)(int[], float (*)(int));
    val map : (int[], (int) -> float) -> float[]

You could put the return type before the function, but then the type will look inconsistent with the function declaration:

    float[] map(int[] arr, (int) -> float f)
    fun map(arr : int[], f : (int) -> float) : float[]

So if you're passing functions around, it tends to look nicer if you always put the type after the variable/argument/function
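
The same comparison can be made in Rust, which also puts types after names. A small sketch (the helper `half` is made up for the example):

```rust
// The parameter `f` has the function-pointer type `fn(i32) -> f32`.
fn map(arr: &[i32], f: fn(i32) -> f32) -> Vec<f32> {
    arr.iter().map(|&x| f(x)).collect()
}

fn half(x: i32) -> f32 {
    x as f32 / 2.0
}

fn main() {
    // A variable holding a function reads the same way as a declaration.
    let f: fn(i32) -> f32 = half;
    println!("{:?}", map(&[1, 2, 3], f));
}
```

The type `fn(i32) -> f32` has the same shape as the declaration `fn half(x: i32) -> f32`, which is the consistency being described.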

pitaj · on Nov 21, 2019

Types go after identifiers because then the types can be left out when using type inference:

    let mut nums = Vec::new();
    nums.push(0_u32);
    // nums: Vec<u32>

zem · on Nov 22, 2019

i find the rust version way more readable; it's clear that it's an array of 1 (array of 2 (array of 3 u16s)), which corresponds to the way "multidimensional" arrays are actually laid out in c and friends.
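
a quick sketch of that correspondence, using made-up values for the keymap: the type reads inner to outer, the indices go outer to inner, and walking the nesting visits elements in memory order.

```rust
static KEYMAPS: [[[u16; 3]; 2]; 1] = [[[1, 2, 3], [4, 5, 6]]];

// Walks the nested arrays outermost first, which matches memory order.
fn flattened() -> Vec<u16> {
    KEYMAPS.iter().flatten().flatten().copied().collect()
}

fn main() {
    // Indices run outer to inner: 1 block, 2 rows, 3 columns.
    println!("{}", KEYMAPS[0][1][2]);
    println!("{:?}", flattened());
    // Six `u16` values stored back to back, with no padding in between.
    println!("{}", std::mem::size_of_val(&KEYMAPS));
}
```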

petschge · on Nov 22, 2019

and why can't we write it like you just did, with outer to inner dimensions going from left to right, i.e. in the same order the indices go when we actually use elements?

pitaj · on Nov 22, 2019

Array filling shares syntax with array typing and array declaration:

    [0; 10] // ten-element array filled with zeros
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] // equivalent array with ten zeroes typed out
    [i32; 10] // type of ten-element i32 array

Because the "filling" syntax is optional, it makes sense to place it after the fill value. The array typing syntax follows. Essentially you specify the value and say "copy this X times" (it only works with types that implement Copy IIRC)

https://play.rust-lang.org/?version=stable&mode=debug&editio...
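
Since the playground link is truncated, here is a separate minimal sketch of the fill syntax, including how it nests for the keymap-shaped type discussed above (the function names are hypothetical):

```rust
// Ten zeros written with the fill syntax.
fn ten_zeros() -> [i32; 10] {
    [0; 10]
}

// The fill expression nests exactly like the type does.
fn zeroed_keymaps() -> [[[u16; 3]; 2]; 1] {
    [[[0; 3]; 2]; 1]
}

fn main() {
    println!("{:?}", ten_zeros());
    println!("{:?}", zeroed_keymaps());
    // `[String::new(); 3]` is rejected: `String` is not `Copy`.
}
```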