The Linux Kernel Looks to “Bite the Bullet” in Enabling Microsoft C Extensions

netbsdusers · 2025-11-10T11:53:23 1762775603

If it's about "prettier code" then I think a number one candidate would be making bitfields more viable for use. It could make driver code much cleaner and safer.

Windows is only targeting little-endian systems which makes life easier (and in any case they trust MSVC to do the right thing) so Windows drivers make much use of them (just look at the driver samples on Microsoft's GitHub page.)

Linux is a little afraid to rely on GCC/Clang doing the right thing and in any case bitfields are underpowered for a system which targets multiple endians. So where Windows C uses bitfields, Linux instead uses systems of macros, usually for shifting and masking. This is considerably uglier and easier to make a mess of. It would be a real improvement in quality-of-life if this were not so.
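
A minimal sketch of the shift-and-mask pattern, next to the bitfield spelling of the same layout. The register layout, the `CTRL_*` macro names, and `ctrl_demo` are all invented for illustration; they are not taken from any real driver or kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit control register, invented for illustration:
 *   bits 0..3  : mode
 *   bit  4     : enable
 *   bits 8..15 : channel
 */
#define CTRL_MODE_SHIFT    0
#define CTRL_MODE_MASK     (0xFu << CTRL_MODE_SHIFT)
#define CTRL_ENABLE_SHIFT  4
#define CTRL_ENABLE_MASK   (0x1u << CTRL_ENABLE_SHIFT)
#define CTRL_CHANNEL_SHIFT 8
#define CTRL_CHANNEL_MASK  (0xFFu << CTRL_CHANNEL_SHIFT)

/* Extract a field: mask first, then shift down. */
#define CTRL_GET(reg, field) \
    (((reg) & CTRL_##field##_MASK) >> CTRL_##field##_SHIFT)

/* Replace a field: clear the old bits, then OR in the new value. */
#define CTRL_SET(reg, field, val) \
    (((reg) & ~CTRL_##field##_MASK) | \
     (((uint32_t)(val) << CTRL_##field##_SHIFT) & CTRL_##field##_MASK))

/* Bitfield spelling of the same layout. Bit order and padding are
 * implementation-defined, so it is not guaranteed to match the macros. */
struct ctrl_bits {
    unsigned int mode    : 4;
    unsigned int enable  : 1;
    unsigned int         : 3;
    unsigned int channel : 8;
    unsigned int         : 16;
};

/* Builds a register value with the macros and returns it. */
static uint32_t ctrl_demo(void)
{
    uint32_t reg = 0;

    reg = CTRL_SET(reg, MODE, 5);
    reg = CTRL_SET(reg, ENABLE, 1);
    reg = CTRL_SET(reg, CHANNEL, 0x2A);
    assert(CTRL_GET(reg, MODE) == 5u);
    return reg;
}
```

The macro version pins every bit position explicitly, which is why it behaves the same on any host. The bitfield version is shorter to read and write, but the compiler decides where each field lands.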

You can also look at Managarm (which benefits from C++ here) for another approach to making this less fraught: https://github.com/managarm/managarm/blob/a698f585e14c0183df...

reactordev · 2025-11-10T12:10:15 1762776615

Does anyone big endian anymore?

mort96 · 2025-11-10T12:19:15 1762777155

PowerPC "supports" both, but I believe it's typically run in big endian mode. Same with MIPS AFAIK.

(Mini rant: CPU people seem to think that you can avoid endianness issues by just supporting both little and big endian, not realizing the mess they're creating higher up the stack. The OS's ABI needs to be either big endian or little endian. Switchable endianness at runtime solves nothing and causes a horrendous mess.)

sumtechguy · 2025-11-10T15:14:23 1762787663

Think windows only supported one machine type declared with big endian from what I can see in the docs with the PE format. https://learn.microsoft.com/en-us/windows/win32/debug/pe-for...

There may have been others in the NE format. Also pretty sure the older power pc mac 7/8 machines were big endian.

qdotme · 2025-11-10T12:57:13 1762779433

You could actually support both at runtime with both ABIs being available. This is done routinely on x86_64 with x86 ABI for compatibility (both sets of system libraries are installed), for a while I used to run 3 ABIs (including x32 - the 64bit with short pointers) for memory savings with interpreted languages.

IRIX iirc supported all 4 variants of MIPS; HP-UX did something weird too! I’d say for some computations one or the other endianness is preferred and can be switched at runtime.

Back in the day it also saved on a lot of network stack overheads - the kernel can switch endianness at will, and did so.

mort96 · 2025-11-10T15:10:17 1762787417

Are you advocating that Linux systems on PowerPC should have two variants of every single shared library, one using the big endian ABI for big endian programs and one using the little endian ABI for little endian programs?

Because that's how 32-bit x86 support is handled. There are two variants of every library. These days, Linux distros don't even provide 32-bit libraries by default, and Ubuntu has even moved to remove most of the 32-bit libraries from their repositories in recent years.

Apple removed 32-bit x86 support entirely a few years back so that they didn't have to ship two copies of all libraries anymore.

What you're proposing as a way to support both little and big endian ABIs is the old status quo that the whole world has been trying (successfully) to move away from for the past decade due to its significant downsides.

And this is to say nothing of all the protocols out there which are intended for communication within one computer and therefore assume native endianness.
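
To make the native-endianness assumption concrete, here is a small sketch contrasting a decoder that fixes the byte order with one that reads in host order. The function names (`load_le32`, `load_native32`, `host_is_little_endian`, `endian_demo`) are made up for this example and do not come from any particular protocol or library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode 4 bytes as little-endian, whatever the host byte order is. */
static uint32_t load_le32(const unsigned char *b)
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Decode 4 bytes in host order: the result depends on the machine. */
static uint32_t load_native32(const unsigned char *b)
{
    uint32_t v;

    memcpy(&v, b, sizeof v);
    return v;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Same bytes, two readings: only the explicit decoder is portable. */
static void endian_demo(void)
{
    const unsigned char wire[4] = { 0x78, 0x56, 0x34, 0x12 };

    assert(load_le32(wire) == 0x12345678u);
    if (host_is_little_endian())
        assert(load_native32(wire) == 0x12345678u);
    else
        assert(load_native32(wire) == 0x78563412u);
}
```

A protocol written in the style of `load_native32` works only as long as both ends share a byte order, which is exactly the assumption a mixed-endian userland would break.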

fredoralive · 2025-11-10T12:54:37 1762779277

Linux has mostly transitioned to little endian on PowerPC. AIX remains big endian.

mort96 · 2025-11-10T15:13:58 1762787638

Oh, that's excellent news!

mghackerlady · 2025-11-10T14:23:57 1762784637

Also, Net and Open BSDs support big endian PowerPC (NetBSD supports big endian arm even)

mort96 · 2025-11-10T15:27:59 1762788479

What's the reason someone would willingly choose to run a big endian ARM OS?

yjftsjthsd-h · 2025-11-10T14:59:10 1762786750

> NetBSD supports big endian arm even

AFAIK, this is probably the easiest way to test BE on hardware (if you need that for some reason) - NetBSD on a Raspberry Pi running in BE mode is easy to use.

zamalek · 2025-11-10T13:05:16 1762779916

Linus hates big endian, and has some choice words to say about switchable [1]. This incarnation of Linus is certainly my favorite :)

[1]: https://lore.kernel.org/lkml/CAHk-%3DwgYcOiFvsJzFb%2BHfB4n6W...

mort96 · 2025-11-10T15:16:12 1762787772

Linus hates introducing a ton of complexity and opportunity for bugs for no upside. Pre-emptively adding runtime endianness switching to RISC-V when there's not even market demand for it 100% falls into that category. Adding runtime endianness switching to the RISC-V ISA also falls into that category.

Supporting big endian for big-endian-only CPUs does not fall into that category.

galangalalgol · 2025-11-10T13:27:59 1762781279

Linux still supports BE for several targets; his point, I think, was that no one uses RISC-V as BE except maybe in an academic setting. I don't think LLVM or GCC will even target BE, so not sure how they were going to compile those mods anyway.

netbsdusers · 2025-11-10T14:36:18 1762785378

Some firm called CodeThink have been instigating it for RISC-V lately: https://www.codethink.co.uk/articles/risc-v-big-endian-suppo...

fredoralive · 2025-11-10T14:50:39 1762786239

The idea of more big endianness in Linux wasn’t particularly welcomed by Linus Torvalds however: https://lore.kernel.org/lkml/CAHk-%3DwgYcOiFvsJzFb%2BHfB4n6W...

mort96 · 2025-11-10T15:20:15 1762788015

We just have to pray that the relevant standardization bodies recognise this for the terrible idea that it is and don't ratify it.

stonemetal12 · 2025-11-10T14:41:06 1762785666

ARM does both.

claudex · 2025-11-10T12:38:47 1762778327

z/Processor is big endian

1718627440 · 2025-11-10T12:50:57 1762779057

Note that these are not the Microsoft "C Extensions", but the "Microsoft C Extensions" of the GNU Compiler Toolchain. I doubt MSVC supports -fms-extensions.

fuhsnn · 2025-11-10T08:56:56 1762765016

> though some may feel the wrong way around Microsoft C behavior being permitted

The same extension can be enabled with `-fplan9-extensions`, might be more appealing to some!

tleb_ · 2025-11-10T09:58:19 1762768699

-fplan9-extensions adds even more, it is not an alias: https://gcc.gnu.org/onlinedocs/gcc-15.2.0/gcc/Unnamed-Fields...

One of the links to past discussions was from Apr 2018 and discusses it. At that time GCC -fplan9-extensions support was too recent (gcc-4.6) to be considered. https://lore.kernel.org/lkml/20180419152817.GD25406@bombadil...

Now the reasoning isn't present in the patch but it probably is because they want step increments and -fms-extensions is a small-ish first step. Maybe -fplan9-extensions could make sense later, in a few years.

molticrystal · 2025-11-10T13:53:55 1762782835

Plan 9 extensions would only require enough examples to justify and might not take years. Though your assessment that it would take years would be right if there's a dearth of kernel spots to point to where automatic pointer conversion for anonymous fields, or using the typedef name to access them, offers some improvement, not necessarily even a huge improvement.

Since with the Microsoft extension, it was just waiting until enough examples were woven into the discussion to overcome the back and forth that was preventing "biting the bullet".

Quarrel · 2025-11-10T10:05:10 1762769110

It certainly seems to me that using this would eliminate 75% or so of the objections to it.

For this use case, at least, it feels like a CS version of racism. MSFT is bad, so no MSFT.

It largely clears up an idiosyncrasy from the evolution of C.

(but, as someone that briefly worked on plan9 in 1995/96, I like your idea :)

wahern · 2025-11-10T11:32:54 1762774374

Can you confirm whether or not anonymous member structures originated with the Plan 9 C compiler? I know I first learned of them from the Plan 9 compiler documentation, but that was long after they were already in GCC. I can't find when they were added to Microsoft's C compiler, but I'm guessing GCC's "-fms-extensions" flag is so named simply because it originated as a compatibility option for the MinGW project, and doesn't by itself imply they were a Microsoft invention. GCC gained -fms-extensions and anonymous member structures in 1999, and MinGW is first mentioned in GCC in 1997. (Which maybe suggests Microsoft C gained anonymous structure members between 1997 and 1999?)

Relatedly, do you know if anonymous member unions originate with C++, Plan 9 C, or elsewhere?

dwattttt · 2025-11-10T11:52:09 1762775529

Archives of published MS SDKs show they were using the feature in NT 3.1's public headers in 1993, so it's at least that old.

https://archive.org/details/win32-sdk-final-release-nt-31

tleb_ · 2025-11-10T10:29:02 1762770542

Do you have references to objections? I couldn't find any on the lkml threads.

bitwize · 2025-11-10T14:42:26 1762785746

I can't wait for the kernel to support HolyC code.

unwind · 2025-11-10T08:24:08 1762763048

Huh. I thought the article was vague on what exactly these extensions permit, so I thought I'd look up the GNU documentation. Surprisingly, it [1] was rather vague too!

The only concrete example is:

Accept some non-standard constructs used in Microsoft header files.

In C++ code, this allows member names in structures to be similar to previous types declarations.

    typedef int UOW;
    struct ABC {
      UOW UOW;
    };

[1]: https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html#in...

messe · 2025-11-10T08:34:06 1762763646

The important one is "Unnamed Structure and Union Fields"[1], in particular unnamed structs and union fields without a tag.

ISO C11 and onward allows for this:

    struct {
      int a;
      union {
        int b;
        float c;
      };
      int d;
    } foo;

In the above, you can access b as foo.b. In ISO C11, the inner struct/union must be defined without a tag. Meaning that this is invalid:

    struct {
      int a;
      union bar {
        int b;
        float c;
      };
      int d;
    } foo;

As is this:

    union bar { int b; float c; };

    struct {
      int a;
      union bar;
      int d;
    } foo;

-fms-extensions makes both of the above valid. You might be wondering why this is useful. The most common use is for nicer struct embedding/pseudo-inheritance:

    struct parent {
      int i;
      void *p;
    };

    void parent_do_something(struct parent *p);

    struct child {
      struct parent;
      const char *s;
    };

    struct child *c;
    struct parent *p = (struct parent *)c; // valid: the parent is the first member
    parent_do_something(p);
    c->i++; // valid: c is a pointer, so use ->
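
For comparison, a sketch of the same embedding in plain ISO C, without -fms-extensions. The parent has to be a named member, so every access spells out the path. The member name `base`, the body of `parent_do_something`, and `child_demo` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

struct parent {
    int i;
    void *p;
};

/* Hypothetical body for illustration: bumps the parent's counter. */
static void parent_do_something(struct parent *p)
{
    p->i++;
}

struct child {
    struct parent base;   /* named member: plain ISO C, no extension */
    const char *s;
};

/* Returns the counter after one call through the embedded parent. */
static int child_demo(void)
{
    struct child c = { .base = { .i = 41, .p = NULL }, .s = "child" };

    parent_do_something(&c.base);               /* explicit path to the parent */
    assert(offsetof(struct child, base) == 0);  /* parent sits first */
    return c.base.i;   /* the shorter c.i would need -fms-extensions */
}
```

The extension removes the `.base` hop: with `struct parent;` as an unnamed member, the parent's fields are reachable directly as members of the child.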

[1]: https://gcc.gnu.org/onlinedocs/gcc/Unnamed-Fields.html

RobotToaster · 2025-11-10T11:03:51 1762772631

Am I right to think this is really unobjectionable, and is only being objected to because MS "invented" it?

zinekeller · 2025-11-10T11:40:12 1762774812

border-box says hi [1]

[1]: https://www.paulirish.com/2012/box-sizing-border-box-ftw/

(Funnily, tables always default to border-box, so the objections in CSS standardization at the time were really silly.)

creshal · 2025-11-10T09:18:13 1762766293

Why is this still not standardized?

wahern · 2025-11-10T09:55:21 1762768521

The original proposal at https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1406.pdf explains why.

> Some implementations have permitted anonymous member-structures and -unions in extended C to contain tags, which allows tricks such as the following.

  struct point { float x, y, z; };
  struct location {
    char *name;
    struct point; // inheritance in extended C, but
                  // forward declaration in C++
  };

> This proposal does not support that practice, for two reasons. First, it introduces a gratuitous difference between C and C++, since C++ implementations must treat the declaration of point within location as a forward reference to the type location::point rather than a definition of an unnamed member. Second, this feature does not seem to be used widely in applications, perhaps because it compiles differently in extended C vs. C++.

arguflow · 2025-11-10T08:40:51 1762764051

A really good example of it is in this lore thread here [1]. He explains it better than me so I'll just link it here

[1]: https://lore.kernel.org/lkml/200706301813.58435.agruen@suse....

dooglius · 2025-11-10T13:43:57 1762782237

It seems like the article is trying to suggest there is some drama here when the change is completely anodyne.

> barring any objections from prominent Linux kernel developers or Linus Torvalds himself.

Just like any other patch, is there any reason to think someone would be likely to object here?

MangoToupe · 2025-11-10T11:41:32 1762774892

This really speaks more to the inadequacy of C rather than Microsoft.

yinkindog · 2025-11-10T12:11:39 1762776699

Extremely tangential: I maintain some of Rasmus's code. I've never met the man. I'd heard that kernel programmers were the "rockstar programmers of rockstar programmers", but I only grok it now.

His code is so clear, clean, concise, commented it feels divine in comparison to the drivel I subject myself to daily.

piotrpdev · 2025-11-10T12:26:33 1762777593

Mind linking to some example(s)? Would love to see :-)

nurettin · 2025-11-10T12:54:37 1762779277

Tinfoil Hat Time: Microsoft is dropping windows and OS development, MS/Linux in the future.

You've heard it here first.

not_a_bot_4sho · 2025-11-10T14:51:28 1762786288

The future is now (actually a few years old):

https://github.com/microsoft/azurelinux

1718627440 · 2025-11-10T14:20:53 1762784453

But will it be Win32/Linux or GNU/NT ?

mrlonglong · 2025-11-10T09:15:23 1762766123

Microsoft "embrace, extend and takeover" comes to mind here. Caveat emptor.

juliangmp · 2025-11-10T10:41:24 1762771284

Compiler extensions have existed ever since C was standardized. I don't like Microsoft either but this really isn't a case of them acting maliciously.

pezgrande · 2025-11-10T09:51:54 1762768314

Isn't this a case of Evil Linux embracing M$ in order to extinguish it?