What science can tell us about C and C++'s security (2020) (alexgaynor.net)
45 points by NotSwift on Aug 2, 2021 | 70 comments



> In conclusion, the empirical research supports the proposition that using memory-safe programming languages for these projects would result in a game-changing reduction in total number of vulnerabilities

There are some missing pieces here. A software rewrite introduces a lot of new bugs. Take one of these OSs. They’ve been fixing security bugs for decades, ~30% of which are non-memory-safety related. A rewrite in a truly memory-safe language will have zero memory-safety bugs but will include a substantial number of new, non-memory-safety security bugs, the kind it took decades to remove from the original code.

I think there would be a long-term payoff, but considering the resources required, it’s unclear if a rewrite effort could even succeed at all. Realistically, you’re going to end up forking, with some development happening on a rewrite fork with low feature velocity and other development happening on a higher-feature-velocity fork with no rewrite effort. With low feature velocity and, for years, actually more security bugs, I’m not sure the rewrite fork would even survive.

“using memory-safe programming languages for these projects” is easy to say, but what’s the actual path that can be followed from here to there?


“Using” memory-safe languages as in selecting a language for green-field projects vs. “using” to mean “rewrite in” are two very different propositions. I don’t think many would consider rewriting a million-line code base in a different language without motivation other than fixing memory safety. However, all large code bases have some modularity, and modules are added and rewritten. At that point the decision to use a different language is easier. That this now happens in Linux and elsewhere is a clear sign of the times.


In principle I agree with this. However, if companies really believe there is zero tolerance for security issues, the only way forward is to proceed, no matter the cost, in a way that will minimize security problems. Otherwise we are just paying lip service to security.


If you want zero security issues, probably Ada with SPARK is the way to go.


What you say is true, and I'm not sure that "rewrite everything" is a valid approach. But if you're starting something today, you might take this data into account when choosing the language...


Yes. I don't think rewrite everything was what the original author was arguing for, either.


But someone will argue for it, so the warning not to is worthwhile.


I don't quite agree with you. Rewriting software in general removes a lot of security related bugs because the new design is more clear. Of course some new ones will be introduced and need to be weeded out, but I think those kinds of bugs are the minority and can be weeded out in a fraction of the time it took for the original software.

On the other hand, it introduces a lot of usability bugs. Many things are the way they are because of many bug reports from many users who all have their own workflows and usage patterns. These kinds of bugs are harder to weed out. They can also lead to early rejection by the users of the old software.

Then there is the Version-2 effect of packing in too many new features and deliberately introducing fundamental changes that break everything at once. This, in my opinion, is the worst about a complete rewrite.


Cf. "Second System Effect". How many examples of successful rewrites of large systems can you come up with?

The new design will only be better if there were actual lessons learned, which probably at least requires that it be done by the same people.

Also, compared to incremental development I can attest that a big-bang rewrite takes a lot of grit to do all the grunt work again, replacing perfectly working and now rather boring infrastructure. It's very hard to suffer through it, even if some other parts could be substantially better.


> How many examples of successful rewrites of large systems can you come up with?

Considering the size of the software industry... a LOT.

I have done a DOZEN rewrites from language to language, and each (bar one) has been successful.

But successes are not proclaimed the way failures are...

---

A good rewrite is just a good refactoring but with a higher chance of success. A bad rewrite is mostly an amplification of larger issues in the org/group, lack of experience (yeah, your first one is likely to fail!) or lack of clear goals.

Also, rewriting terribly designed software in another language without fixing the terrible design WILL be terrible.

---

On this matter, considering the calibre of the people that build C/C++ things at the core AND the calibre of Rust/Ada and the other few tools, plus the enormous resources of the companies behind them, a rewrite should be successful. But when it is not, and you zoom in on the postmortems, how much is truly to blame on the langs or tools? Not much.

* And if the tools do carry a big part of the blame, it's because they were not the correct tools for the job... so, again, part of the larger issues in the org/group.


> I have done a DOZEN rewrites from language to language, and each (bar one) has been successful.

How large were these systems? How many developers? How long did the rewrite take?

> A good rewrite is just a good refactoring but with a higher chance of success

Are you saying it's not a restart from scratch? In that case I don't call that a rewrite. Maybe a "rewrite of this or that module" which is more likely to succeed.


Most have been mainly me. I was normally hired and got stuck doing this kind of stuff while the other team was supporting the old project. Along the way I was the one who pushed source control, a ticket system, and crude CI to the larger team...

Maybe being solo or being part of a small team is good after all. A rewrite is normally called for because of a BIG mess, and most of the work is simplifying stuff.

My niche is "enterprise" software, which is HIGHLY dependent on niche details from non-developers, so there is no clarity about what the software actually does, why, or how - and not even the people who use it understand it. Zero or useless documentation, and all the red flags you can think of.

Two have been large-ish (I think large from the POV of most people in my country, but not from the POV of people who work at companies like Google).

One was at a startup (at the time one of the largest paying customers of Google Cloud). The other came after a succession of failed attempts by maybe 2 or 3 groups of developers (the one before me was a big consulting company with a large team that left me with a BIIIIG mess!).

The longest rewrite took 2 years (from FoxPro to .NET, an ERP, 1-5 devs, only me the whole time; P.S.: just the core, the rest came later). This was a near failure that was turned around.

The one where we failed, it was because we didn't properly understand the severity of downtime for the customers and didn't provide a transparent, real-time migration of the data. The cost in days (for the testing, upgrade and all that) cost them more money than they paid for the new software.

The company I worked for back then went broke working for the government on another contract, and the customer was not willing to recover from it. It is certainly the saddest story of my career, because it was not impossible to turn the ship around - the software was ready, we only needed to retrofit a way to move the data without downtime - but everything went under in no time...


> A good rewrite is just a good refactoring but with a higher chance of success

> Are you saying it's not a restart from scratch? In that case I don't call that a rewrite. Maybe a "rewrite of this or that module" which is more likely to succeed.

No, the rewrites have been from scratch, including from LANG-A to LANG-B.

What I mean is that the properties of a successful refactoring and a successful rewrite are similar. Only with a rewrite you can go DEEPER and WIDER in what you change/improve.

For example, when I rewrote a certain program that used Access/Excel, just using PostgreSQL was a massive plus, and not only for speed and reliability. A refactoring would probably only go for internal cleaning, but with the rewrite I slashed a lot of moving parts.


Ironically, the second-system effect was named by Fred Brooks in The Mythical Man-Month. It is ironic because his background was heading development of a large, and ultimately successful, second system: IBM's OS/360.

Moving on, Brooks correctly identified that third systems do much better than second systems. (The pattern is that the first system suffers from your general incompetence, the second system tries to over-abstract and shove too many features in there, and the third system is the first where you can strike a pragmatic balance.)

Still, Mozilla/Firefox was a rewrite of Netscape, Python 3 a rewrite of Python 2, Doom a rewrite of Wolfenstein 3D, and so on.

Many companies have a history of their third version of software being significantly better, meaning that they survived the first and second systems to get there. For example the first successful version of Windows was Windows 3.1.

There is another kind of successful rewrite, which is a copycat product. For example MS Excel is a rewrite of Lotus 1-2-3 and Microsoft Word is a rewrite of WordPerfect. And, to keep this from just being picking on Microsoft, Google's Dalvik was a rewrite of the JVM. (And then Oracle sued...)


> Rewriting software in general removes a lot of security related bugs because the new design is more clear.

That seems unduly optimistic. A rewrite may keep the same design because it's known to work. Or it may try to make a better one. That new design can easily fail to actually be better. You can have the version-2 effect in the design, not just in the features.


That might be a very naive view of things, but can't you extract a test suite from the current system (including non-regression tests) and use it to develop the new system? Or do it part by part, ship-of-Theseus style?


Not sure that's gonna work.

Operating systems abstract away hardware differences and provide portable APIs to userspace processes. Internally, OS kernels spend non-trivial complexity dealing with random hardware shenanigans. You can't unit test that.


I totally agree with this sentiment. It is probably better to build robust code-quality tooling for scanning out and eliminating these vulnerabilities. Taking C as an example: it is a simple language, so it is probably also the easiest language in common low-level use to parse and to detect serious issues in.

The benefits to that are more advantageous given the sheer amount of code on systems.


> I posit that the second set stays the same size: there’s no reason or evidence to think that porting C++ to a memory-safe language results in additional SQL injection.

I think I disagree, although I prefer C to C++ these days. I posit that memory safe programming languages are more complex than C and sometimes cause code that would be simple in C to be written in a more complex way to allow the memory safety to be maintained. I posit that complexity is one of the causes of security vulnerabilities.


Can you give an example?


I can't think of a good one. Graph-like data structures are a common example of something that's a bit difficult in Rust - see https://stackoverflow.com/questions/34747464/implement-graph.... But the answers there seem reasonable and I'm not a Rust expert. I guess my point is that nobody would even need to ask this question if the implementation were in C.
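
For illustration, here's the kind of thing I mean - a minimal sketch (the Node type and names are made up) of a cyclic, pointer-linked graph that is trivial in C-style code but that safe Rust rejects without indices, Rc<RefCell<...>>, or unsafe:

    // Hypothetical sketch: back-and-forth edges via raw pointers.
    #include <vector>

    struct Node {
        int value;
        std::vector<Node*> neighbors;  // non-owning edges
    };

    int main() {
        Node a{1, {}}, b{2, {}};
        a.neighbors.push_back(&b);  // a -> b
        b.neighbors.push_back(&a);  // b -> a: a cycle, no ceremony
    }                               // ...but nothing prevents a dangling Node*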


> In conclusion, the empirical research supports the proposition that using memory-safe programming languages for these projects would result in a game-changing reduction in total number of vulnerabilities.

I think this is too strong a conclusion. How many of the major data leaks and ransomware attacks exploit memory safety issues? Not many. The bulk of them target misconfigurations, vulnerabilities caused by bad text-based protocols, logic errors in software, and social engineering.

That's not to say memory safety doesn't matter, and you can get very pernicious and subtle bugs when you get too clever in C-based languages. That said, the languages that boast about their memory safety are written in C/C++ (python, ruby, Java, llvm) and run on operating systems that provide process isolation with memory safety written in C/C++ on top of hypervisors which are also written in C/C++.

You can argue, as the article does, that use of C/C++ inevitably results in many memory safety issues, and that therefore we should use memory safe languages. Except this doesn't take into account the entire categories of vulnerabilities that have been entirely eliminated because of good C/C++ abstractions like process isolation, virtual memory, filesystems, tcp-ip, hypervisors, and so on. But we take these luxuries for granted, and the benefits they confer become invisible.

I think there is a much more mundane lesson here. Good abstractions prevent entire classes of vulnerabilities, and bad abstractions are leaky no matter how careful you are. C and C++ are pretty bad languages insofar as they give you limited options for building good abstractions, but with careful programming it can be done: much of the best, most reliable and most complex software is written in these low-level, memory-unsafe languages, and all major security advancements we've actually made in the real world are still implemented in memory-unsafe languages.


> That said, the languages that boast about their memory safety are written in C/C++ (python, ruby, Java, llvm) and run on operating systems that provide process isolation with memory safety written in C/C++ on top of hypervisors which are also written in C/C++.

Not all of them, and it is more a consequence of building on top of existing tooling instead of doing everything from scratch.

D was originally implemented in C++, nowadays the reference compiler is pure D.

Go was originally implemented in C, nowadays the reference compiler is pure Go.

C# and VB.NET were originally implemented in C++, nowadays the compilers are pure C# and VB.NET, while Microsoft keeps porting the runtime from C++ to C#, with each release.

Java compilers like JikesRVM, MaximeVM and GraalVM are implemented fully in Java.

Just like bringing new parties into power: first accept the existing forces, then rebuild the system once enough parliament seats have been secured.


> I think there is a much more mundane lesson here. Good abstractions prevent entire classes of vulnerabilities

This is how I try to write C; in the end there is very little "unsafe" code in it - few complicated pointer dereferences, few dynamic allocations, almost no transfer of ownership anywhere. No callbacks, and requiring e.g. a ref-counted handle is very rare.

Mostly function calls, pointers only valid during function call (or make a copy, e.g. ID strings), creating the data in the module that understands it, using suitable allocation strategies and centralized resource release. Asynchronous messaging instead of hard to follow callbacks (which break the flow of execution).
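
Roughly, a minimal sketch of that style (the Device module and its names are invented for illustration): the module owns its data statically, callers see a pointer only for the duration of a call, and anything they keep is a copy.

    #include <cstdio>
    #include <cstring>

    struct Device { char id[16]; int state; };

    static Device g_devices[8] = {{"disk0", 0}};  // module-owned storage, no malloc
    static int g_count = 1;

    // Returned pointer is only valid until the next call into the module.
    const Device* device_lookup(const char* id) {
        for (int i = 0; i < g_count; i++)
            if (std::strcmp(g_devices[i].id, id) == 0) return &g_devices[i];
        return nullptr;
    }

    void caller() {
        const Device* d = device_lookup("disk0");
        char id_copy[16];
        if (d) std::snprintf(id_copy, sizeof id_copy, "%s", d->id);  // copy, don't retain d
    }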

The current code I have has grown to be devoid of dynamic memory allocations simply for performance/latency reasons. It now has good MISRA compliance without ever trying hard. There have been a few bugs (including very silly and obvious concurrency bugs) but due to the app being so "fixed" those have always turned up pretty quickly (so far) and have been easy to spot and fix.

For larger projects, I feel like the value of bottled abstractions keeps decreasing. To connect things in an efficient and maintainable way, all these layers have to be unwrapped.


The Rust compiler essentially automates all this. Nobody is saying you can't write good C. The argument is that the data suggests a memory-safe language eliminates an entire realm of possibilities available in C. This is an attempt to push the general impression beyond "hypothetically, safe languages are safer" (which tends to be the talking point ad nauseam in language flame wars) to "we've established empirically that safe languages result in at least 65% fewer CVEs". Safe Rust doesn't really let you introduce concurrency bugs either, which you admit show up even in well abstracted C.


> Nobody is saying you can't write good C.

I would say that, honestly.

Or at least: I don't think anybody can reliably write non-trivial programs in C that 100% of the time avoid the kinds of memory-safety bugs that Rust protects against at compile-time. And that's what matters. Humans are fallible, and everyone makes mistakes sometimes.


I 100% agree with this. I don't believe that a single human on the planet can write a nontrivial C program that processes untrusted input and contains no security vulnerabilities on their first attempt.

And this is a much much much easier task than having a team develop and maintain such a system over time.


> Safe Rust doesn't really let you introduce concurrency bugs either, which you admit show up even in well abstracted C.

It was really obvious stuff that was immediately reproduced, though. It related to a refactor, incorporating some learnings, that restricted concurrency almost to a single implementation file.

My conjecture is that if code is littered with synchronization primitives, the structure is wrong. A system that automates the painful part of getting the use of synchronization primitives right is possibly rewarding bad structure, at least in the short term.


I think it really depends on the project/application. More often than not you should probably have a framework for handling concurrency (whether written by yourself or someone else), sure. But sometimes you find yourself writing/maintaining the framework, too.

I've experienced times where, when writing concurrent code, I thought what I should be doing was okay because I would have done the same thing in C but Rust stepped in and said "actually I can't allow you to do that, Dave". I don't know if you've ever experienced something similar but it's pretty awesome when it happens.


I don't disagree, TBH I find stuff like memory models kind of mind bending. Concurrency is easy to get wrong, that's why it might be best to avoid (centralize) it. That's what Golang did I think.

I'm not up to date with a lot of the high-tech SMP stuff that is found in e.g. Linux - I'm sure they have a lot of direct memory sharing, I found something in their red-black tree for example, and for sure there are a lot of spinlocks, schedule calls, and whatnot. I wonder if there has been a trend of turning back the explicit-synchronization dial a little, even there? It would make sense to me: memory has been becoming slower and slower (in terms of latency, not raw performance, and relative to increasing CPU throughput), and there seems to be a trend toward more asynchronous (non-blocking, message-passing) stuff like io_uring.


Could you explain a bit more about how you see Golang improving this?

For me, Go is trickier to use concurrently than Rust, and requires keeping track of more details you need to manage yourself without compiler assistance or checking.

For example, the default data structures like Map are not thread-safe, but you'll only discover that you've used a Map from multiple threads without synchronization when you actually have a concurrent modification at runtime, which might not show up at low load during testing.

You're also responsible for explicitly locking (and unlocking) any synchronization primitives before accessing a synchronized resource, and ensuring you don't hold any references to synchronized data past the unlock.

With Rust, this is all encoded in the APIs, and verified by either the type checker or the borrow checker.

Since I need to deal with Golang at work, I've been trying to get less frustrated with it, so I'm really interested in hearing perspectives about what people appreciate about Golang, or what it does right or does well.


I've never used Golang myself, but their mantra is "share data by communicating, instead of communicating by sharing data".

This "communicating" is batteries-included through Golang's channels.


In practice, in my experience at least, pure channel implementations are rare. It's rather hard to do (although I'll admit that was one of the most fun things for me about Go: "how can I solve this problem purely using channels?"). Sadly much Go code is far from the mantra. I've even heard seasoned Go programmers pass along the "wisdom" that "if you're using channels you're doing it wrong" (implying you haven't built the correct abstraction). Most Go I've encountered sadly uses shared memory, waitgroups and sync.Once. I think there are performance reasons for this too, or something like "channels are slow".


Unfortunately, a large amount of code (even programs that are specifically built as security features) isn't written like that, so CVEs about banal buffer overruns and double frees that would have been trivial to prevent with proper abstractions keep popping up, and somehow we think this is OK.


> The bulk of them target misconfigurations, vulnerabilities caused by bad text-based protocols, logic errors in software, and social engineering.

Unfortunately, these are often not called "vulnerabilities"; or rather, when people talk about "vulnerabilities" they are referring to software bugs, which are often not the main attack vectors used in real data breaches.

This is a semantic terminology thing and not really useful to most people, and it is part of the reason why the conclusion is a little overblown: even though it might cut down on the number of CVEs, the reality is that it probably won't have much impact on the number of data breaches and ransomware attacks. Those are much more commonly started by phishing and, less frequently, by well-known vulnerabilities of all classes - memory corruption being only a fraction of that fraction. It's just not super relevant to security anymore, at least in comparison to what is costing people billions of dollars a year.


Unfortunately, CVEs come up when security researchers find something interesting to publish, whereas actual breaches usually come from a hacker exploiting some sysadmin's poor configuration.


From An Empirical Study of Vulnerabilities in Cryptographic Libraries[0]:

> Among our most interesting findings is that only 27.2% of vulnerabilities in cryptographic libraries are cryptographic issues while 37.2% of vulnerabilities are memory safety issues, indicating that systems-level bugs are a greater security concern than the actual cryptographic procedures.

These libraries were developed by security experts and even they made some serious errors.

[0] https://arxiv.org/abs/2107.04940


> that use of C/C++ inevitably results in many memory safety issues, and that therefore we should use memory safe languages. Except this doesn't take into account the entire categories of vulnerabilities that have been entirely eliminated because of good C/C++ abstractions like process isolation, virtual memory, filesystems, tcp-ip, hypervisors, and so on.

I don’t see your point here. Yes, C/C++ abstractions like process isolation, virtual memory, etc. are good, but languages like Rust don’t eliminate them.


> C and C++ are pretty bad languages insofar as they give you limited options for building good abstractions

Which abstractions can you not create in C++? The only clumsy abstraction I can think of is algebraic data types - and C++ is not a functional language. C++ has an extremely powerful type system for any abstraction you can imagine. But that is also part of the problem :)
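
To be fair about the clumsiness: a sum type is expressible with std::variant, it just takes noticeably more ceremony than in an ML-family language. A minimal sketch (Shape and friends are invented for illustration):

    #include <cstdio>
    #include <type_traits>
    #include <variant>

    struct Circle { double r; };
    struct Rect   { double w, h; };
    using Shape = std::variant<Circle, Rect>;  // the "algebraic data type"

    double area(const Shape& s) {
        return std::visit([](const auto& sh) -> double {
            using T = std::decay_t<decltype(sh)>;
            if constexpr (std::is_same_v<T, Circle>) return 3.14159 * sh.r * sh.r;
            else                                      return sh.w * sh.h;
        }, s);
    }

    int main() { std::printf("%f\n", area(Shape{Rect{2, 3}})); }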

Personally, I think C is a better language for lower-level development, as it is so simple that, once well-established practices are created, they are much easier to follow. That codebases grow to over 10 million lines of code in each of these cases is maybe a sign that there are other architectural issues of scalability. I'd be curious to compare something like OpenBSD and look at _WHY_ those codebases have fewer vulnerabilities. I think there's a lot more to this than language choice. Change velocity is of course one factor - maybe we need to slow down? It looks like OpenBSD had only 1 vulnerability last year:

https://stack.watch/product/openbsd/openbsd/

I would like to see a simple language like C with Rust's memory-safety guarantees. These modern languages literally try to shoehorn EVERY feature from EVERY language so they can appeal to the lowest common denominator. But what do you get with that? A systems-level Java?


I like this style of discourse. Here's some data, here's how you could prove me wrong, let's talk when you discover additional data. It's too easy to say something like, "well I see your data but since companies care about features not bugs and we can't rewrite everything in safe languages ... and I know 3 people who can write safe C ...". A statement like this does not disprove the author and in a way actually detracts from the discussion. Normally you'd need a moderator to keep people on the rails. It's neat to see the person arguing lead with their impression of what could further the discussion and invite others to participate logically.


Let's also not forget that AT&T tried to fix C with Cyclone, and lint was already available in 1979, so they definitely knew what child they had placed into the world.


I wonder if there is another way to partially explain these results:

- many (most?) programmers were taught C/C++ for several decades

- some programmers taught themselves Rust because they were driven and interested

- these Rust programmers are probably better programmers on average

- they wrote (or would write) better code that is somewhat elegant. And they have the advantage of starting with code that has had its memory bugs fixed over the last few decades

So I wonder: if you had taught the masses Rust starting in the 1980s... would the masses have found other ways to write buggy/crappy code?

(I realize this could sound a bit condescending; fwiw I did not teach myself Rust so I'm not counting myself in the 'better' group)


The `eval` part is total rubbish. The author draws a connection from memory safety to eval. For example, both Rust and Haskell are "memory safe" languages without `eval`. He is probably thinking of interpreted languages like Lisp.


At least in C++11 and later, many classes of these memory bugs are eliminated with more modern container and pointer types. It’s not uncommon to have a company policy of not using “new” or “delete” anywhere.
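
For instance, a minimal sketch of what such a policy looks like in practice (Connection is a stand-in type): ownership lives in containers and smart pointers, so every delete is generated for you.

    #include <memory>
    #include <vector>

    struct Connection { int fd = -1; };

    int main() {
        auto conn = std::make_unique<Connection>();  // heap-allocated, owned
        std::vector<std::unique_ptr<Connection>> pool;
        pool.push_back(std::make_unique<Connection>());
        // No delete anywhere: destruction is automatic and happens exactly once.
    }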


Equating C with modern C++ is a common sleight of hand among Rust evangelists. Most modern C++ projects with a fresh codebase have almost zero use of new or delete. It turns out that C++ is a lot better than it was 10 years ago.


The problem is which modern C++ projects?

Android source code is definitely not one of them, and yet Google as ISO C++ contributor should know all about modern C++, right?

Ah, what about Microsoft and their UWP code samples for C++ developers, or the C++/WinRT based libraries?

As advocates from C++ Core Guidelines, surely those samples will be perfect examples from modern C++, right?

Or what about Bloomberg, with heavy contributors like John Lakos?

Maybe they are still in the process of adopting C++11 and C++14, while writing a book about language adoption issues.

I like C++ a lot, but we really need a compiler switch to turn off compiling Vintage C++; it would be marvelous.


Every single codebase you have just cited here (except for the nebulous "advocates from C++ core guidelines") is older than the Rust language. Look at high frequency trading code or maybe recent game engines for good C++ projects - I don't know if any of these are open source.

I 100% agree with breaking C compatibility, breaking ABI compatibility, and turning off vintage C++.


C++/WinRT is from 2015.

C++ Oboe SDK for Android was initially released in 2018.

Android Games Developer SDK was just released last month.

What game engines? Lumberyard, Unreal and Godot certainly not.

Vulkan C++ bindings from Khronos or Dawn for WebGPU? Also not.

The only modern C++ I can find in real life are at CppCon and C++Now talks, and books promoting it.

Even Bjarne Stroustrup has recently, in an interview, mentioned how disappointed he is that the C++ Core Guidelines keep being largely ignored by the community.

There is even, yet again, a paper from him in the next C++ mailing trying to advocate for better C++.


C++ is definitely better, but it's still not memory safe. Compared to Rust, you still have little tracking of which thread has access to which variable at which time. Even in modern C++, you still have to care about iterator invalidation.
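
For example, this classic compiles cleanly under any modern standard and is still undefined behavior (a minimal sketch):

    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        for (int x : v) {
            if (x == 2)
                v.push_back(4);  // may reallocate: the loop's hidden
                                 // iterators now dangle -> UB
            std::printf("%d\n", x);
        }
    }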


It's far from done, but the GCC static analyzer actually can find iterator invalidation!


Some of it. Because a sound analysis would throw up false positives too frequently, they made the logical decision to use an unsound analysis. This is helpful, but cannot prevent the entire class of issues.


Having no instances of new or delete does not, in any way, prevent the entire class of memory vulnerabilities. Running off the end of a buffer when processing untrusted data is just as easy. Heck, you can still absolutely get UAF issues even if you never allocate on the heap, simply by holding a reference to a stack-allocated object past its lifetime. Given how weird the rules around lifetime extension are, this can happen in really, really subtle ways.
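
A minimal sketch of a heap-free UAF along those lines: lifetime extension applies when a const reference binds directly to a temporary, but not through a function return.

    #include <cstdio>
    #include <string>

    const std::string& shorter(const std::string& a, const std::string& b) {
        return a.size() < b.size() ? a : b;
    }

    int main() {
        // Both arguments are temporaries that die at the end of this statement.
        const std::string& s = shorter("hi", "longer");
        std::printf("%s\n", s.c_str());  // dangling reference: UB, no heap involved
    }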

C++11 is not a safe language. Not even close. It is much much much better than what came before, but it is not safe.


While unique/shared_ptr alleviate some of these issues the STL is still full of UB though, and that can’t be fixed easily.


What proportion of C++ code is "modern"?


It's still quite easy to have memory safety bugs in modern C++ though.

For example, std::string_view is basically a pointer: as soon as it points to a string that has gone out of scope, you're in trouble if you use it again.
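
A minimal sketch of that trap - std::string_view is a non-owning pointer-plus-length, so it happily outlives its string:

    #include <cstdio>
    #include <string>
    #include <string_view>

    std::string_view name() {
        std::string s = "temporary";
        return s;  // returns a view into a local that is about to be destroyed
    }

    int main() {
        std::string_view sv = name();
        std::printf("%.*s\n", (int)sv.size(), sv.data());  // dangling: UB
    }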


Another one is std::span, where, contrary to Microsoft's gsl::span, WG21 decided it was a good idea not to do bounds checking.
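
A minimal sketch of the consequence - operator[] on std::span is unchecked, so an out-of-range index is undefined behavior rather than the fail-fast termination gsl::span gives you:

    #include <span>
    #include <vector>

    int main() {
        std::vector<int> v{1, 2, 3};
        std::span<int> sp{v};
        int ok  = sp[1];   // fine
        int bad = sp[10];  // out of bounds: no check, UB (gsl::span would terminate)
        (void)ok; (void)bad;
    }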


The C++98 (Win32/MFC) codebase that I occasionally touch has a lot of ill-designed abstractions in it and is full of potential memory problems, but at least one can halfway see what's happening, and a full rebuild of the 30-year-old codebase can be done in < 10 minutes.

Not sure if it's worse than the impossible-to-grok and slow-to-build C++11+ codebases that I've seen - everything wrapped in unique_ptrs and shared_ptrs, with lots of unused overloaded constructors and methods for every const and copy/move/value-construct situation, plus an icing of templates. The trend is to assume that problems are solved by wrapping everything in more layers. But it seems like this ends in maybe fewer memory problems but also a lot less useful functionality, and makes it a lot slower to build and so much harder to add, change and fix stuff.

The best code I've seen uses very, very few C++ features (if any at all) and just gets things done in a straightforward way without celebration.


The problem with this sort of analysis is that they are discussing improvements to average commercial software, but, from a security perspective, average commercial software is atrociously terrible and grossly inadequate in any non-trivial security context.

To achieve a game-changing reduction in the total number of vulnerabilities and make these systems barely adequate would require improvements on the order of 10 to 100 times. A mere 100% improvement does not even move the needle when you need to improve by 10,000% to get to acceptable.

It is far more reasonable to look at the existing security-critical systems and designs that already achieved outcomes 100 times better than average commercial software, such as those certified to Orange Book A1, and identify the high-ROI improvements that can be cost-effectively retrofitted to lower-security designs. This is far more likely to result in a good outcome than stacking 30% improvements onto a completely inadequate foundation, in much the same way it would be easier to retrofit a luxury car design to create a cheap car than to enhance a go-kart design.


I work in this space, and agree that "average commercial software is atrociously terrible and grossly inadequate" - not just in security but by any metric you can think of.

However, this is the thing: a lot of security comes down to improving how you deal with your inputs/outputs. And a lot of how you improve that is making better designs for your classes/structs/enums/functions/etc.

In a language like Rust, people think the "borrow checker" is the only (or major?) reason for the improvements, but working in this niche, just the fact that I must model everything in terms of structs/enums/traits has "fixed" tons of stuff without even trying. And I get, for free, faster execution and half the resource consumption (or less!), which lowers the bills!

Yes, you still need to worry about security and safety, but the internals get nearly fixed as a consequence of Rust's design choices. For me, the borrow checker is a small thing in the whole picture; it is the rest that pays off much more ROI.


> people think the "borrow checker" is the only (or major?) reason for the improvements, but working in this niche, just the fact that I must model everything in terms of structs/enums/traits has "fixed" tons of stuff without even trying.

Sure, but Haskell had structs and enums (called algebraic data types) and traits (called type classes) (and pattern matching) in 1988, and even the people who think that Haskell should be used much more often than it is will concede that it is not sensible to use Haskell on many projects on which Rust would be a sensible choice (implementing a web browser, for example).

In other words, there are probably good technical reasons you weren't using structs, enums, traits and pattern matching before the invention of the borrow checker.


I used F# before Rust, and found it much more approachable than Haskell. Somehow Haskell looks pretty interesting (I read http://learnyouahaskell.com long ago), yet I was unable to figure out how to use it, whereas with F# it was almost immediate. It's like how I get Vue yet not Angular...

And in the case of Rust, it fits (me) even better than F#. I write "better" functional code in Rust than in F#, and the fact that I can easily do small imperative code here and there is a big plus.

Anyway, I know most features of Rust are not innovative per se, but I find the way they are mixed pretty easy to grasp.


Average commercial software is what 99% of developers get to write; the number of places for the mythical elite 10x developers that always do everything perfectly is very tiny.


I pity companies that will follow this hype train and rewrite perfectly working software to achieve nothing in the end, from a pragmatic point of view.

You could write an essay for every other vector of software development and claim that a rewrite will do you wonders:

E.g.: FP, strong type systems, Linux...

The problem is, in real life, we have a bunch of problems to solve from many vectors, not just (the little square of) memory-safety-related bugs.

And people in the real world tend to weigh those vectors together for their particular problems, and often their answers, even if they go for C++, are a better fit for their issues than an imperative from a person who looks from only one perspective and thinks it is 'the most important thing®' that should be pursued, at the risk of avoidable financial bankruptcy.

Think for a minute whether native access to a given library, say LLVM or TensorFlow, is much more important for the project and/or the company's survival...

It's like Bush's "War on Terror": making people fear some boogeyman while actually inflicting the real terror they said you should trust them to avoid.

The specialized, Cartesian division of technology tends to turn us all into neurotics, and the problem with neurotics is that they don't see the bigger picture, which is much more complex and systemic, as things in the real world tend to be.

Contrary to the extremists' views, modern C++ is a fine option, and people should not fall for these FUD tactics, which make them afraid to opt for a perfectly nice platform for developing things, one that has proved itself over and over through time doing amazing things.

Rust might be a fine option too, but this whole urge to rewrite C++ software is pure non-sense.

https://www.joelonsoftware.com/2000/04/06/things-you-should-...


[flagged]


I tend to agree with you. Some people add "Science" to the title just because 1) there are some kind of statistics in the post and/or 2) it makes the post more appealing.


Buffer overflow is not a bug, it's the consequence of a bug. Unsafe languages don't inherently have more bugs; they have more dangerous consequences of bugs.

If you want to reduce the consequences, there are a number of things you can do. If you want to reduce the number of bugs, you should look at what the cause of the bug is, not the consequence. C is a small language about moving stuff around memory with pointers, so the consequences of bugs are commonly going to be a wrong read/write. That fact says very little about what the mistakes actually are.


You can tell the author will not address the matter seriously already from the title. First, talking about "Science" in general seems like grandstanding; and it immediately becomes clear that the author is citing some industry surveys regarding people's perceptions of the causes of bugs. So it's not like "Science has spoken".

Second, C and C++ are very different languages.

Thirty years ago, you could have put them in the same basket; but times have changed. C++, with its standard library, has advanced to a point where issues such as use-after-free and double-free disappear - not through programmer discipline, but by the programmer simply not allocating and freeing memory themselves. See this post:

"Why doesn't C++ have a garbage collector?" https://stackoverflow.com/a/48046118/1593077

Moreover, when you're dealing with low-level and system software, especially OS kernels, it is usually either impossible or impractical to use a higher-level language, with a large interpreter or an even-larger virtual machine. And when the "safe languages" are used for such tasks, one often goes into unsafe mode, like Java's JNI.

Now, it's quite possible that using Rust in more lower-level settings will result in fewer memory bugs. But the author has overreached with a flamebait post and claim.


Ideally yes; unfortunately too many people keep writing C+, not C++.

Trying to prevent that seems to be a quixotic endeavour, other than by migrating to something else (it doesn't need to be Rust), unless C++ vendors finally adopt some kind of -fvintage-c++=off compiler flag.


Well, the C++ Core Guidelines project is focused in particular on static-analysis measures, and there is quite a bit of collaboration from IDE developers (not sure about compiler authors).

So, a "vintage C++ off" is actually less unrealistic than you might believe. In particular, I see IDEs and compilers easily shifting to making it annoying for people to make raw allocations for example.

Still, point well taken about people tending to write "C+". They then also complain about how modern C++ is complicated and they can't "see what's going on" etc.


Watch the latest Bjarne interview, where he shows his disappointment at the C++ Core Guidelines being largely ignored by the industry.

https://www.youtube.com/watch?v=ae6nFZn3auQ

Basically, unless you are using Visual C++, there is little chance the C++ shop will have much C++ Core Guidelines love, and even in Visual C++, stuff like the "borrow checker lite" is mostly broken beyond basic examples.


Is there at least a lint tool for checking compliance with the Core Guidelines?



