> You could put a function's definition in every source file that needs it, but that's a terrible idea since the definition has to be the same everywhere if you want anything to work. Instead of having the same definition everywhere, we put the definition in a common file and include it where it is necessary. This common file is what we know as a header.
In C and C++ parlance "definition" should be "declaration" and "implementation" should be "definition".[1] The terminology is important if you don't want to get confused when learning more about C and C++. This is compounded by the fact that some languages describe these roles in the author's original terms. (Perhaps the author's terminology reflects his own confusion in this regard?)
[1] This is indisputable given the surrounding context, but I didn't want to paste 3-4 whole paragraphs.
Thank you for the correction, you are entirely correct and it is a big oversight on my part. I tried to use as few technical terms as possible to make the article more approachable but ended up doing something worse: I misused a technical term which is misleading.
I will correct this as soon as I have some time to do so.
In extremely old code, I sometimes see people preferring to manually write declarations of functions (even libc functions) in every source file instead of including a header.
To add to the confusion, in certain cases it is possible to put a function's definition in a header file (for example if it's a function template, or in an anonymous namespace, or the static keyword is used to indicate internal linkage). So it is possible to write this function definition manually in every translation unit.
Otherwise, the ODR rule requires functions to be defined exactly once.
> One and only one definition of every non-inline function or variable that is odr-used is required to appear in the entire program (including any standard and user-defined libraries). The compiler is not required to diagnose this violation, but the behavior of the program that violates it is undefined.
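To illustrate the exceptions above, here is a hypothetical header in which each definition may legally appear in every translation unit that includes it:

```cpp
// math_utils.h -- hypothetical header; each of the definitions below
// may legally appear in every translation unit that includes it.

// Function template: implicitly instantiated in each TU that uses it.
template <typename T>
T twice(T x) { return x + x; }

// inline function: the ODR allows one identical definition per TU.
inline int square(int x) { return x * x; }

// static function (internal linkage): each TU gets its own private
// copy, so the cross-TU part of the ODR does not apply at all.
static int cube(int x) { return x * x * x; }
```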
Interestingly, a form of the "one definition rule" even applies to some of those functions that can appear in multiple translation units, specifically inline functions and function templates (not static functions or those in an unnamed namespace). In those cases it says that they must be identical in all the translation units they're defined in, so it's more like a "unique definition rule" for them.
This sounds like it would be easy – just put the definition in a header file. But even if the text of a function is identical in different translation units, it can still be ODR-different between them if the symbols that they look up are different due to other header files included before them declaring different things or doing things like "using namespace std". Argument-dependent lookup is especially dangerous here. As with other ODR violations, this causes undefined behaviour and compilers/linkers aren't required to issue a diagnostic (and they usually don't!). I believe C++20 modules will solve this problem.
The functions can also be different if different compilation units were compiled using different compiler settings (or even different compilers).
In Titus Winters' recent Pacific++ talk[1], he pointed out that even something as simple as including an 'assert' statement will violate the ODR if some compilation units are compiled with 'debug' settings and some aren't. This can easily happen with build systems that cache compiled object files if changes in compilation flags don't automatically invalidate the cache.
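A sketch of how that happens, using a hypothetical shared header:

```cpp
// checked.h -- hypothetical header shared by several translation units.
#include <cassert>

inline int check_positive(int x) {
    // If one TU is compiled with -DNDEBUG and another without it, the
    // assert below expands to nothing in the first TU only. The two
    // copies of check_positive are then no longer identical, which
    // violates the ODR -- and no compiler or linker has to diagnose it.
    assert(x > 0);
    return x;
}
```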
> even if the text of a function is identical in different translation units, it can still be ODR-different between them if the symbols that they look up are different due to other header files included before them declaring different things
Exactly. This is why you should never put anonymous namespaces or definitions of objects with internal linkage in a header file.
As a non-C++ programmer, what I don't understand is why people need to write header files by hand. Surely they could be auto-generated from the source code?
Remember you are writing your comment in 2018; what was feasible in 1975 was very different. Programmers back then were smart enough to pull the job off, but the computers they had were limited enough that it wasn't worth it. First, because the computers were limited, programs had to be smaller, so the benefit wasn't as great. Second, computers were slower, so it was worth spending extra human effort once to save a lot of computer effort.
C++ has been working on modules to fix this for years now. It turns out to be harder than you would think. Everyone agrees with the basic problem statement, but there is disagreement on the details. All sides have good points in favor of their approach, and there are some places where you cannot have both, as they are incompatible. Progress is being made (and a lot of the incompatibilities turned out to be solvable, but it took years of thinking to come up with how).
Header-generator tools are out there, but I'm not sure how usable they are. It's rarely done though. It's a bit of a pain having to 'write everything twice', but there's more to header files than re-declaring what you've done in your source files.
As berti said: enums, constants, macros, simple struct type declarations, typedefs, and templates, are examples of hand-written code that might belong entirely in the header file.
It would be possible to keep these constructs in a hand-written header file while auto-generating the rest, but C++ developers aren't convinced this is worth doing.
Headers are used to control visibility of declarations (note though that visibility of symbols themselves is controlled by other means), and typically contain more than just function declarations (structs, enums, etc.). There are basically two common patterns to split public and private declarations:
- public header for consumers of the interface, private header for internal use only
- public header for consumers, private declarations directly in the .cpp/.c file
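A sketch of the first pattern, with hypothetical names (shown here as one file for brevity; the comments mark where the file boundaries would fall):

```cpp
// widget.h -- public header shipped to consumers: only the interface.
class Widget;                   // opaque: the layout is not exposed here
Widget* widget_create(int size);
int     widget_size(const Widget* w);
void    widget_destroy(Widget* w);

// widget_internal.h -- private header, included only inside the library.
class Widget {
public:
    int size;
    int refcount;               // internals free to change between releases
};

// widget.cpp -- the implementation includes both headers.
Widget* widget_create(int size)      { return new Widget{size, 1}; }
int     widget_size(const Widget* w) { return w->size; }
void    widget_destroy(Widget* w)    { delete w; }
```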
A C++ header file is much more than just serving as an interface to an implementation. Sure it can be used in that way, but it doesn't have to.
Have you checked out the standard library's header files? If you do, you'll see they are full of templates, and constexpr functions. In some sense, the header file is where most of the code is.
Have you also checked out the concept of a single-header dependency? Given the myriad different ways to manage dependencies, dependencies that consist of just a header file are extremely convenient to have. This is the logical conclusion when you can put increasingly complicated things in header files.
If anybody is curious about how templates work, there are many ways, but the two most historically popular are:
Prelinker:
1. Compile each file, noting which templates are needed in a section in the object file
2. Have a special program called the "prelinker" run before the linker that reads each .o file, and then somehow instantiates each template (which usually, but not always requires reparsing the C++ file)
Weak Symbols:
1. When compiling instantiate every needed template, but mark it somehow in the object file as being weak, so that the linker only pulls in one instantiation for each definition.
The prelinker used to be more popular: if you instantiate the same template in every single file, the compiler does far more work with the weak-symbols approach. But now weak symbols are popular, both because they are much simpler to implement and because compilation is usually parallel while linking typically is not, so wall-clock times may even be faster.
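A single-file sketch of the weak-symbols world, with a hypothetical template: every TU that uses max_of&lt;int&gt; emits its own copy as a weak symbol and the linker keeps one; since C++11, explicit instantiation plus `extern template` can be used to avoid the duplicated compile-time work.

```cpp
// a.h -- hypothetical header: because the template's definition is
// visible in every TU, each TU that calls max_of<int> instantiates it
// and emits it as a weak symbol; the linker then folds the duplicates.
template <typename T>
T max_of(T a, T b) { return a < b ? b : a; }

// One TU can force the instantiation explicitly...
template int max_of<int>(int, int);

// ...and other TUs could then declare
//     extern template int max_of<int>(int, int);
// to suppress their own instantiation (C++11), trading per-TU compile
// work for a single up-front instantiation.
```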
Nobody, AFAIK. Even with an explicit "export template", it's basically impossible because of the interaction of all the features of the C, C++, and preprocessor parts of the language.
(Precompiled headers are/were a thing, but they're very brittle. Of course, one assumes you already know this; I'm trying to provide additional exposition.)
People started to use templates for metaprogramming and from that point on the scope for "reuse" of templates isn't really there. (Reusing parsing might be plausible, but it's really difficult because parsing is extremely context-sensitive because of SFINAE, #defines, etc.)
Some might comment that "modules" is "export template" all over again, but this time there are actually 2-3 implementations of 2-3 of the proposals and everyone is confident that the remaining minor problems can be resolved satisfactorily... and they're all exchanging experiences to help each other!
"Precompiled headers are/were a thing, but they're very brittle."
In compilers other than Visual Studio, yes (or so I'm mostly told - I don't have all that much experience with them), but msvc has had them since at least VS6 (late 1990s) when I first started using them, and they work very well and have saved me many, many hours since then. Maybe once or twice I had to delete the pch file in all that time, and that was most likely more of an issue of the GUI mangling the saved internal state than the actual compiler.
I've heard pushback against precompiled headers from Unix land for 2 decades, and I'm not really sure where it comes from. I have the impression it's mostly cognitive dissonance - 'msvc has it and gcc doesn't, therefore it must be bad because gcc is "better" than msvc'. It's similar to #pragma once - in use, it's objectively better in every possible way than include guards, and gcc fanboys still dismissed it back when gcc didn't have it.
FWIW, gcc has had pragma once for ages. On the other hand, recently we had issues with MSVC not recognizing that a header and its symlink were the same. GCC and clang had no problems.
There is a reason pragma once is not standardized. Defining when two include lines refer to the same file is extremely hard.
Prelinkers are dying, but 20 years ago they were the normal way of doing things. 30 years ago, Cfront had to use them because it was relying on existing unix linkers that did not have weak symbol support.
I am still not sold on templates. They look like they might help the Googles and Microsofts, but for most code bases they seem to force a dual system of dependencies without much benefit.
EDG's frontend used to be prelinker-based. I would guess that it supports the weak-symbols method now, since they were involved in developing the Itanium ABI:
Yep, and EDG was the only frontend to implement "export template" (and Comeau was the only backend to implement it, so far as I know). I think that's the only reason why they implemented it this way - outside of template export, there's no particular reason to do templates like that. The other technique, with folding duplicate sections in object files, is necessary for inline functions anyway, so might as well use it for templates...
That's backwards if my memory is correct. Borland compiled a copy of the template for each file[1] you compiled, but cfront tried to compile each template only once.
Cfront also basically never worked well.
1: the technical term is "compilation unit" since #include means you're never compiling just one file
Headers allow you to ship a binary without the full source code. If you want to build a linux kernel module you don't need all of the linux sources, just the headers.
In principle you're right, but the notion that a header file exposes just the "interface" is completely false. Class definition, private variables and functions, etc. are all exposed in the header file.
Header files are not a way to only expose the interface. You give up a lot more in C++.
I'd never had to deal with pesky header files until I started developing C, and it immediately struck me as a royal pain in the ass. Even after a couple of years of developing C/C++, I find the whole concept of header files archaic. The #include preprocessor directive literally copy-pastes text with no intelligence whatsoever. The user is then burdened with guarding #includes using what I call a patchy, half-baked solution: #ifndef/#define/#endif, or #pragma once.
It should be handled automatically by the compiler/preprocessor or IDE, and I believe it is now being addressed in the C++20 spec with the advent of "modules". This should have been sorted out way back in 1989.
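For reference, the two guard styles being compared, in a hypothetical header:

```cpp
// vec3.h -- the classic include guard: portable, but the macro name
// must be unique across every header the project ever includes.
#ifndef MYPROJECT_VEC3_H
#define MYPROJECT_VEC3_H

struct Vec3 { float x, y, z; };

#endif // MYPROJECT_VEC3_H

// The non-standard but widely supported alternative is a single line
// at the top of the header instead of the guard pair:
//     #pragma once
```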
> In principle you're right, but the notion that a header file exposes just the "interface" is completely false.
Sorry, but it is not "completely false". Doing it properly requires a carefully designed interface to hide internal data structures, and splitting out the end user headers from the internal headers, but it works.
> I'd never had to deal with pesky header files until I started developing C, and it immediately struck me as a royal pain in the ass.
It's like saying, "I've never had to deal with pesky .py files until I started developing Python."
> Doing it properly requires a carefully designed interface to hide internal data structures, and splitting out the end user headers from the internal headers, but it works.
Out of curiosity, how? (If pointers or other mechanisms for memory indirection are allowed then it's pretty easy, so let's agree to ban those.)
Not a rhetorical question, despite the parenthetical.
Because I want full encapsulation without giving up by-value semantics. I want my clients to be able to take objects of some opaque record type that I've defined, and put them entirely on the stack, or in some contiguous block of memory entirely of their making, without ever getting to know the constituent fields of the record. I want it all, I want to have my cake and eat it too.
When the parent (er, great-grandparent?) says their parent is wrong to deny that headers only reveal the interface, I think it's a little disingenuous to base that on an unrevealed assumption that PImpl is in play. PImpl is basically the idiom that begot Java. One reason I might opt for C++ over Java is a desire for finer control over the location of memory -- but it's important for me to know that to actually get that, I'll probably have to sacrifice information hiding. It's a trade-off. Yes, on some level it's better to have the option to make that trade-off, but the product here is "encapsulation or value semantics", not "encapsulation and value semantics".
Mind, I'm not a C++ developer. Maybe these days link-time heroic optimization makes the "right" decisions and collapses these kinds of indirections in all the sorts of situations you'd want it to. I write a lot more Java, and my understanding is that HotSpot gets up to a lot of heroics pertaining to this stuff these days -- I've noticed HotSpot will churn through a workload involving processing a collection of records far more quickly if you can arrange for it to stream through an array, even if you'd expect the records to be scattered randomly throughout memory.
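One fragile way to get value semantics with a hidden layout is the aligned-buffer trick sometimes called "fast pimpl"; here is a sketch with hypothetical names (strictly, C++17 code should use std::launder on these casts):

```cpp
#include <new>

// bar.h -- public header: clients can put Bar on the stack or in an
// array without ever seeing its fields. The buffer's size and alignment
// must be kept in sync with the hidden layout by hand -- exactly the
// fragility this sub-thread is about.
class Bar {
public:
    Bar();
    ~Bar();
    int value() const;
private:
    alignas(8) unsigned char opaque[16];   // must cover sizeof(Impl)
};

// bar.cpp -- the real layout lives here, invisible to clients.
struct Impl { int a; int b; };
static_assert(sizeof(Impl) <= 16, "opaque buffer too small");

Bar::Bar()  { new (opaque) Impl{41, 1}; }
Bar::~Bar() { reinterpret_cast<Impl*>(opaque)->~Impl(); }
int Bar::value() const {
    const Impl* p = reinterpret_cast<const Impl*>(opaque);
    return p->a + p->b;
}
```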
This code can crash on platforms where unaligned access is not allowed; you need an alignas on public Bar. And of course there are catastrophic bugs waiting if the size of opaque is not kept in sync properly. The point is: the module system should be doing something like this for you behind the scenes.
> I want my clients to be able to take objects of some opaque record type that I've defined, and put them entirely on the stack
If you want to put the object on the stack, the compiler has to know the size of the object to reserve enough space on the stack. How can it know the size of the object if it does not have its full definition somewhere?
A module symbol table, for example, where only the compiler can actually see the complete information about a type, while the consumer code can only access what is exposed as public.
This is fine if you can recompile a program against a new version of the library, but if you just want to relink it doesn't work well. This is actually quite common when a .so is replaced with a newer version in a system without rebuilding the world.
Some languages, like Ada I think, have first-class support for runtime-sized, stack-allocated types, so it might work there.
Sometimes you need to. You cannot entirely stack allocate an object that uses PIMPL. Also you cannot allocate an array of such objects compactly in memory.
On the other hand, if you want to be able to evolve the class member variables but still maintain a stable ABI, you need to hide the memory layout, for example with PIMPL.
But this is a C++ limitation. For example, Objective-C (and soon Swift as well) allows modifying the class layout, adding properties, etc., without changing the ABI.
They must have some level of indirection in the resulting code to accomplish that like virtual inheritance. There is a price and this can be done in C++ too, but you have to opt into the cost.
Actually not, it works a bit differently.
In Objective-C 1, there was no indirection and one had to explicitly use the equivalent of PIMPL to hide private members from the header or avoid the fragile base class problem.
In Objective-C 2 the object meta-data contains a table of instance variable offsets. The dynamic linker can modify this table at load time so you can freely add both instance variables and methods to new revisions of a class.
So what is the deal? Well, when the holder object itself is heap allocated, pimpl is inefficient because every access will require dereferencing two pointers.
Also you cannot put protected or virtual members in the internal pimpl class (then there would be no point to have those in the first place).
That being said, it is not like Objective-C is some pinnacle of performance - you cannot allocate objects on the stack, and the compiler doesn't perform any devirtualization. So for performance-critical code you have to drop down to C or...C++ :)
You can split "secret" data structures into another header, and then simply not give that secret header to the customer. If you give them the binary and a header with some structures missing, the user can still use whatever you did give them.
I mean header files are C and C++ source files, so saying you didn't have to deal with them until using C and C++ is a tautology. Likewise, you won't have to deal with .py files until you're using Python.
> In principle you're right, but the notion that a header file exposes just the "interface" is completely false. Class definition, private variables and functions, etc. are all exposed in the header file.
If you don't want to expose class internals, just use the PIMPL idiom. It's an extra indirection to protect your own abstraction, so naturally C++ decides to opt for performance by default.
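A minimal sketch of the idiom with hypothetical names (again one file for brevity, with the header/source split marked in comments):

```cpp
#include <memory>

// counter.h -- only a pointer to the implementation is exposed, so the
// private layout can change without recompiling client code. The cost
// is an extra heap allocation and pointer indirection.
class Counter {
public:
    Counter();
    ~Counter();
    void bump();
    int  get() const;
private:
    struct Impl;                 // defined only in counter.cpp
    std::unique_ptr<Impl> impl;
};

// counter.cpp
struct Counter::Impl { int n = 0; };

Counter::Counter() : impl(new Impl) {}
Counter::~Counter() = default;   // must live where Impl is complete
void Counter::bump()      { ++impl->n; }
int  Counter::get() const { return impl->n; }
```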
That’s an annoying workaround at best. It’s not necessary in C. It wouldn’t be necessary in C++ if you could declare incomplete class types but still declare their member functions. This would have zero performance impact and improve encapsulation.
This is more common than `typedef void*` in my experience, because it actually provides some minimal amount of type safety.
You can also do this in C++ of course, but you can't use member functions this way. My real complaint here is that the Pimpl idiom in C++ is more cumbersome than a simple forward declaration and free functions, which is available in C.
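The C pattern being referred to, sketched with hypothetical names (this also compiles as C++):

```cpp
#include <stdlib.h>

/* stack.h -- consumers see only an incomplete type plus free
 * functions; sizeof(struct Stack) is unknown to them, so they must
 * hold it through a pointer. */
struct Stack;                        /* forward declaration only */
struct Stack* stack_new(void);
void          stack_push(struct Stack* s, int v);
int           stack_pop(struct Stack* s);
void          stack_free(struct Stack* s);

/* stack.c -- the full definition is hidden in the implementation. */
struct Stack { int data[64]; int top; };

struct Stack* stack_new(void) {
    struct Stack* s = (struct Stack*)malloc(sizeof *s);
    s->top = 0;
    return s;
}
void stack_push(struct Stack* s, int v) { s->data[s->top++] = v; }
int  stack_pop(struct Stack* s)         { return s->data[--s->top]; }
void stack_free(struct Stack* s)        { free(s); }
```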
In principle I agree. An IDE should be able to automatically generate something like a header file regardless of language.
But still, no IDE or tool I've seen does this better. A header file gives you a very nice overview, and it really isn't any hassle to speak of to keep them up to date.
It takes getting used to (as with everything when starting out with a new language), but before you know it you might even start to miss them in other languages.
Header files absolutely are a hassle to keep up to date. Besides, they're a terrible experience. You have to pay attention to include order! You have to write forward declarations! You have to write include guards, in 2018!
They're also bad as an overview. They may have been good in the 1980s, but nowadays a proper documentation generator gives you better formatting, search features, and cross-references. Inline documentation is particularly obnoxious in header files: I like seeing function-level documentation alongside interface and implementation, but if I put the documentation in both the .cpp and the .h file it's duplicated in two places and easily gets out of date.
The order of headers mattering is very rare though, and not indicative of normal C++ code. It's rare, but not unheard of unfortunately, as most C code is also available to C++.
#pragma once is not in any C++ standard, although basically all modern compilers support it. The reason it is not in the standard is that its behaviour in the presence of symlinks may differ depending on the filesystem and compiler. Is it the unique combination of filename + content that should be included once or is it the file path? etc.
> The reason it is not in the standard is that its behaviour in the presence of symlinks may differ depending on the filesystem and compiler
I have the feeling that the difference matters to almost nobody and should be avoidable by the few people it actually affects. I had more issues with colliding include guards than I had with symlinks and I still end up replacing CLASSNAME_H with PROJECT_CLASSNAME_H in our own headers every now and then since the autogenerated guards are too naive.
> You have to pay attention to include order! You have to write forward declarations! You have to write include guards, in 2018!
Include order is a valid complaint; the others truly are non-issues.
I mean, we have other more pressing issues. Such as the fact that garbage collection is still a thing in 2018.
> ...but nowadays a proper documentation generator gives you better...
Examples of that?
> I like seeing function-level documentation alongside interface and implementation, but if I put the documentation in both the .cpp and the .h file it's duplicated in two places and easily gets out of date.
This is the only thing that I dislike about header files. And it is a constant reminder how bad IDEs are at handling this.
Doxygen can generate a list of all class members, including those from all ancestors. If you want to do that by checking header files, you'll have to keep jumping between files, going from parent class to parent class.
... doxygen, no. I'd rather take header files than doxygen.
I hope he meant something else, since he said "nowadays". Doxygen is ok if you haven't set up your build environment. But when you actually are in the code (and somewhat familiar with it) I very much prefer just reading the header files to switching to a browser.
> Header files absolutely are a hassle to keep up to date.
This happens because there is some redundancy between the header file and the source. You can often arrange your code so that there is no redundancy (or almost zero).
> But still, no IDE or tool I've seen does this better.
Why aren't Delphi, Oberon, and Modula-3 immensely better? Each file is an independent module except the one with main. Then the compiler quickly figures out in one pass what all the function signatures are. The minor drawback with this is that the one-pass method requires you to pay attention to declaration order, though a two-pass method renders that approach irrelevant. The IDE has all the symbolic goodness you need, and no goddamn header files. Also it's still easier than dealing with managing the dependency graph yourself as is necessary with C file include order.
I say this as a guy who enjoys C and has done way more work in C than Delphi, but give credit where it's due.
Because there is no separation between API and implementation? Header files are great, you don't have to wade through implementation details or documentation to figure out what is API and what isn't, even more so with good use of pimpl. You hardly need separate documentation tools, because the header (when structured well) is just the signatures and documentation. Templates muddy this a bit, but you put those in .inl files and it all works conceptually very pure again.
Headers (or at least 'separate specification of API') are/is great, and I miss it dearly in other languages.
Sure, there are a bunch of tools for pretty much every language, but I don't want to deal with that, nor switch between browser and editor for it. I know people complain all the time about headers, I was just saying I love the separation, even if it's an historical accident.
I started with Apple ][ Basic, but then moved to Turbo Pascal 3.0, 4.0 and 5.0 and enjoyed quite a lot the "Unit" system. But now came the realization that it was not possible for one to do cyclic references - e.g. "Unit A" refers to symbols from "B", and vice versa... Not that it's a good idea, but "C" linkage is okay with this, and you can split things much easier, thus probably ruining things in the long term... but sometimes it can help...
Ha, I must've forgotten about it completely! Thank you! And yes, I've never read the manual (unlike the tons of "C" books that followed). Pascal was simply taught orally, and on your own (back then), and that was the case for quite a lot of apps...
And all I've seen are quite inferior to the header file. They are a nice addition and can be used for quick navigation. But it does not diminish the usefulness of the header file.
Separation of module interface and module implementation is useful, but headers are a horrible way to do that. For an example of how it can be done right, look at Borland Pascal dialects (with "interface" and "implementation" section in each unit, where "interface" can be extracted if desired), or Ada, or ML.
Modula-2 does it better. Each module is split in two files - definition and implementation. Unlike C++ header files, def files are compiled rather than inlined, so they don't have the potential to produce a different result for each source file that references them. The result is a 100% reliable dependency graph across all source files, no need for a makefile, and blazingly fast compilation (each def file is only parsed and compiled once).
That has not been my experience in 35 years of dealing with headers in C and C++ in large-scale software projects. Ruby, Swift, and even Python (or Pascal if we want to go way back) are significantly easier to deal with for anything sophisticated.
Yeah, it can only be swapped out at link time (without crazy magic). That satisfies two very common cases though, when there is only one implementation (and the interface is a sacrifice to the OOP gods) and when the second implementation is for unit testing only.
Those 2 plus other crazy rules (like only one interface defined per file) mean that many projects will end up with just as many interface files as a C++ project would have header files.
The section describing how a function call is made appears to be slightly incorrect.
The return value of the `add` function, in most ABI definitions, would be stored in a register. After that, the `main` function may then copy that value to its own space it has reserved on the stack.
This is at odds with the description in the article, which seems to describe `add` passing its return value to `main` via the stack.
(This is assuming no optimizations - all this would most likely be inlined anyway, with no function call.)
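A sketch of the convention being described, assuming the x86-64 System V ABI (the details differ on other platforms):

```cpp
// Under the x86-64 System V calling convention, the two int arguments
// arrive in the edi and esi registers and the sum is returned in eax;
// nothing is passed back through the stack. The caller is then free to
// copy eax into whatever storage it reserved for the variable.
int add(int a, int b) {
    return a + b;          // compiler places the result in eax
}

int call_add() {
    int sum = add(2, 3);   // eax copied into the caller's slot for sum
    return sum;
}
```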
I've been looking for something like this to send to some C++ newbies. This is almost what I need but not exactly.
Is there something similar that explains how libraries work?
Libraries are just a bundle of .o files (e.g. run "ar t /usr/lib/x86_64-linux-gnu/libpython2.7.a" on an ubuntu with devtools installed and you'll see all the .o files that are in libpython2.7).
The special thing about them though is that the linker will not include any .o files for which no symbols are referenced.
You can think of the typical linker algorithm as follows:
If I am missing a symbol X, scan through all not-yet-used objects in each library until you find it, then include that .o file. Now check if any symbols are missing again and repeat until done.
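A toy sketch of that loop in C++ (hypothetical data model; real linkers add the single-pass ordering subtleties):

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model of a static-library member (.o file).
struct Obj {
    std::set<std::string> defines;  // symbols this member provides
    std::set<std::string> needs;    // symbols this member references
};

// While some symbol is undefined, scan the not-yet-used members for one
// that defines it, pull it in (possibly introducing new undefined
// symbols), and repeat until no more progress can be made.
// Returns how many members were pulled in.
int link(std::set<std::string> undefined, std::vector<Obj> library) {
    std::set<std::string> defined;
    int pulled = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (auto it = library.begin(); it != library.end(); ++it) {
            bool wanted = false;
            for (const auto& s : it->defines)
                if (undefined.count(s)) wanted = true;
            if (!wanted) continue;
            for (const auto& s : it->defines) {
                defined.insert(s);
                undefined.erase(s);
            }
            for (const auto& s : it->needs)
                if (!defined.count(s)) undefined.insert(s);
            library.erase(it);
            ++pulled;
            progress = true;
            break;              // restart the scan after each pull
        }
    }
    return pulled;
}
```

For example, if `f`'s member references `g`, asking for `f` pulls in two members and leaves an unreferenced third member out of the link entirely.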
> Now check if any symbols are missing again and repeat until done.
That properly describes linking using --start-group/--end-group. Without those flags, the process is closer to "look in the first .a file for definitions of all currently undefined symbols; then look in the next .a file for definitions of remaining undefined symbols; etc.". The difference becomes apparent if you link chains of libraries in the wrong order, or if you have cyclic dependencies between libraries; normally they will not be resolved unless you use the grouping flags. (But really, you should avoid making such cycles in the first place!)
Static libraries are more or less that yes (at least, that mental model is sufficient for pretty much all day to day use of them), but dynamic libraries aren't, at all.
As an aside, I found it very weird that the OP claimed that 'nobody uses static linking'. Wut? Static linking is everywhere (so is dynamic linking, of course, but implying that one is merely a quaint remainder from the past is just so odd).
There has been a heavy bias towards shared objects in glibc and the GNU userspace utilities for a long time. Today that mostly means dynamic libraries (there are also statically linked shared objects, but those are mostly a quaint reminder of the past).
I don't hate dynamic linking as much as I used to, but it is still a peeve of mine. Linux is one of the most backwards compatible kernels, but the heavy usage of dynamic linking means that a linux program from 20 years ago either won't work at all, or will be buggy.
A statically linked program from 20 years ago will work (but possibly won't have sound; though it's possible to get either alsa or pulse audio to emulate OSS and then even sound will work).
That's a very simplistic and error-prone description of what a library is supposed to be. Even in high-level descriptions, it's very important to be aware of the fundamental differences between static and dynamic libraries, and how they are integrated in the build process, which has a fundamental role in basic tasks such as deploying applications.
There is a really good paper by Ulrich Drepper on dynamic linking with ELF and glibc (+ a bunch of other important stuff if you're writing libraries) [1], but it's way too low level for what GP wants. I'm sure there must be something simpler around but I haven't seen it.
I was hoping that in the context of discussing this article, it would be clear what a .o file is. Discussing libraries without that knowledge is hard, so I just have to assume the person asking the question knows it.
.a is an extension for one type of library, sorry if that confused you, but it can be ignored for the purposes of this explanation.
And as we know, this has a side effect in C++ with globally constructed (but unreferenced) objects: these essentially get "dropped" when linking, if they come from a library.
> Is there something similar that explains how libraries work?
Check out this HN discussion on an ebook on linking and loading. The book itself is a treat, and the discussion around the book is very informative as well.
"Basically, the compiler has a state which can be modified by these directives. Since every .c file is treated independently, every .c file that is being compiled has its own state. The headers that are included modify that file's state. The pre-processor works at a string level and replaces the tags in the source file by the result of basic functions based on the state of the compiler."
I almost sort of get what the author means here, but then I don't really. I mean, there is no 'state' for the compiler that is modified by preprocessor directives, so this is probably an analogy or simplification he's making here, but I don't really understand how he gets to the mental image of 'compiler state'. Why not just say it like it is: the preprocessor generates long-assed .i (or whatever) 'files' in memory before the actual compiler compiles them, and their content can differ between compilation units because preprocessor preconditions might vary between them?
It's neither an analogy nor a simplification. When he speaks about "compiler state" he means "preprocessor symbol table state". When the preprocessor processes a file, its state is mutated -- symbols get defined, redefined or undefined.
What you propose as a replacement (an in-memory file) does not provide any insight into why the same file preprocessed twice may end up looking different, or why the order of included files matters.
Well, in that case, I guess it's a matter of definitions. When I teach C++, I find it much more useful to make a clear separation between 'preprocessor' and 'compiler', rather than making the preprocessor part of the compiler and then making the... uh... 'actual compiler' also part of the compiler.
When you take the preprocessed state of a compilation unit, by having the preprocessor write it out to disk, and show someone the effects of passing one -D flag or another, or of changing the order of includes, that directly and concretely shows what is going on. This preprocessed file is then passed on to the actual compiler. There is a clear separation between stages, easy to understand, and useful to boot when the time comes that you have to debug a related issue and want to look at the preprocessed file to see what's going on.
Bit of a side-question, but somewhat related. Is anyone working on "whole program compilation"? I don't mean whole program optimisation, I mean an attempt to read all files for a given target in memory at the same time and then generate all translation units in one go (all in memory? and maybe linking them in memory too?). Clearly, there would be caveats (strange header inclusion techniques relying on macros to modify text of include files would break, gigantic use of memory and so forth), but for those willing to take the risk, presumably this should result in faster builds right?
In fact, ideally you'd even generate all binaries for a project in one go but that may be taking it a step too far :-)
At any rate, I searched Google Scholar for any experience reports on this and found nothing. It must have a more technical name, I guess...
The terms you're looking for are "single compilation unit" or "unity build".
It's used sometimes, I think mostly to help the compiler optimise better.
Build times for a full rebuild may be faster, but may not, since traditional builds can use many CPU cores. However, it stops incremental builds from working - if you modify one source file, you have to recompile everything.
Or a compile server / incremental compilation. Tom Tromey worked on supporting this in GCC years ago, and blogged about the roadblocks that he met on the way. I don't remember the details, but eventually the project was abandoned.
It might still be interesting to read through this stuff; throw "tromey gcc compile server" at a search engine and see what comes up.
We do "kind of that": our source is split across various cpp files to keep it well organized. Our build scripts then generate a single cpp file with ~30 #includes.
Compilation of the 11 MB object file takes about 30s and requires at most 1 GB of memory (the lib is not that huge, ~20 kloc), but I think when we started on it, the build took a few minutes for the lib alone. So we save some dev time, but building all unit test executables still takes an additional 1m30s, so that's only a minor improvement. But I think the real gain is much better optimization (the architecture of the lib is great to maintain, and bugs are at least critical or might even have a lethal impact; there is a lot of potential for inlining/LTO).
I don’t think that article’s accurate. At least not anymore. Modern C++ compilers do less while compiling, and much more while linking. This allows them to inline more stuff and apply some other optimizations.
I hope someday the build and linking process will be standardized, but I don't believe it will happen, because many members of the committee come from Microsoft, Google and other tech giants who want to sell their compilers (or give them away for free, but still).
There are too many interests at stake, and standardization would kill many of them.