Boeing identifies new software problem on grounded 737 Max

MDWolinski · on Feb 6, 2020

All software is buggy. The problem with the MCAS system is that pilots were not informed that it was there, nor were they given a way to override it and take full control of the airplane. Also, while the MCAS system relied on two sensors, if either failed, the MCAS system itself failed, so there was no built in back-up for it.

Bugs in software happen because situations where they arise are sometimes hard to predict. You can test your software all you want but it's not until it's in the field that you start discovering new issues because people tend to do things in ways developers didn't consider.

Tesla's software has over a billion miles of data on it and it still has issues in some basic functionality. And let's not talk about Iowa which in itself was a major failure in software release management.

Stierlitz · on Feb 7, 2020

@MDWolinski “.. while the MCAS system relied on two sensors ..”

MCAS used only the one sensor, this decision made so as to avoid recertification.

http://www.b737.org.uk/mcas.htm

“Are we vulnerable to single AOA sensor failures with the MCAS implementation or is there some checking that occurs?”

https://www.aviationtoday.com/2019/11/02/boeing-ceo-outlines...

mnm1 · on Feb 7, 2020

Not all software is buggy to the point of killing almost four hundred people. Comparing some shit app some interns built for Iowa with avionics software is frankly insulting to the people who work hard to make avionics software. The same goes for Tesla. The avionics industry, including Boeing, used to have a great record in this area. Even if the mcas bugs were unavoidable, the fact still is that the design was fatally flawed due to either sensor being a single point of failure. And of course, the main problem that the whole, entire airplane is unstable in the air. How can you still make excuses for Boeing at this point in time? The only reason this bug should be irrelevant is because this plane should never carry another commercial passenger. But I'm sure profits will prevail over lives once again starting this summer or whenever the FAA gives their go ahead.

catalogia · on Feb 7, 2020

> And of course, the main problem that the whole, entire airplane is unstable in the air.

That's not really true. The airframe is fine, except it doesn't handle like a 737. MCAS was meant to make the MAX handle like a 737.

Mentour Pilot, a 737 instructor with a youtube channel, has covered this fairly extensively: https://youtu.be/TlinocVHpzk?t=951

mnm1 · on Feb 7, 2020

Great video. But he says the exact same thing I did. The MCAS is necessary because of the different engine placement. So the airplane cannot recover from a stall without it. That, to me, makes the entire airplane unstable and improper for commercial flights as this is an expected condition at times that the plane should be able to recover from. The airplane cannot function without a deeply flawed software system no one understands and no one knows how to operate. Changing the software doesn't change any of these things.

dschuler · on Feb 7, 2020

It is possible to recover from a stall without MCAS, but the handling characteristics have changed. The requirement is that the force on the yoke increases with the angle of attack, but in the case of the MAX, the stick force becomes lighter at a certain point. As I understand it, Boeing was mainly trying to avoid training pilots for any changes.

catalogia · on Feb 7, 2020

> That, to me, makes the entire airplane unstable

That's fine, but know that the way you're choosing to use the word "unstable" doesn't square with how it's actually used in the aviation industry.

mnm1 · on Feb 8, 2020

Ok. I should have said completely unsafe for commercial passenger use instead.

m4rtink · on Feb 7, 2020

There is one specific Iowa which (if it was still in service) could become very deadly if it was using a buggy app.

linuxftw · on Feb 6, 2020

[flagged]

muldvarp · on Feb 6, 2020

> The MCAS system didn't fail. It operated exactly as it was designed, it received faulty sensor data which in turn flew the plane into the ground.

You're technically right, but I'd say that incorrect sensor data is something that should be taken into account. And failing to take that into account is a failure on part of the MCAS.

kayfox · on Feb 6, 2020

It did take it into account, but the incorrect angle reporting was not severe enough to result in the sensor being marked as faulty. From the sound of things it was designed to deal with large bit flips, sensor jamming the the home position or sensor misalignment (which is at fixed angles because of the mounting holes) rather than the sensor jamming in other positions or being horribly miscalibrated.

The company that repaired the sensor used in the Lion Air Flight did loose their license to repair any and all certified avionics parts and is probably out of business at this point. Donno about the technician that forged the post-install tests of that sensor.

salawat · on Feb 6, 2020

The architecture of the system was such that no cross-checking could occur. There was a feature to implement an AoA disagree light, but that only came into play if an optional package was purchased.

The Lion Air crash data had the computer registering an angle off by a ridiculous degree that should not have even been attainable by flying one of these planes in the manner for which it was designed.

The Ethiopian FDR showed even more ludicrous measurements as I recall. It wasn't just a case of "It wasn't bad enough", it was a case of the computer believing the plane was belly flopping into the air stream (75 degrees AoA!).

Note the Space Shuttle only hits around 40 degrees on it's reentry in order to keep the wind hitting the thermally insulated bottom, and that only at Mach 10+.

There is no excuse for any software engineer/computer scientist to ignore the basic physics of the domain they are coding for to the point they cannot identify when a critical measurement is so far out of whack as to be instantly dismissed.

FDR graph: https://visualapproach.io/wp-content/uploads/2019/04/Data_Sh...

pdonis · on Feb 6, 2020

> I'd say that incorrect sensor data is something that should be taken into account

I think that is the GP's point: MCAS operated exactly as it was designed, and it killed people because it was designed incompetently, not taking into account things that obviously should have been taken into account.

jahlove · on Feb 6, 2020

Is the fact that MCAS only relied on data from a single AOA sensor (when two existed) not a defect?

salawat · on Feb 7, 2020

Multi-sensor inputs are not required for systems whose failure impact isn't rated as catastrophic, which Boeing specifically avoided in their safety analysis, ostensibly due to the fact they knew the FAA would require sim training for a multi-sensor system. So if you accept Boeing's original classification it is not a defect.

Given that upon further testing after these disasters, the results required them to reclassify MCAS and the Flight computer it runs on as a single, potentially catastrophic point of failure, it is now a defect.

Just wanted to point out that regulationwise, that situation evolved over time due to withholding or material omission of critical information with regards to the nature of the system to regulators.

Whether or not a jury will agree is yet to be seen.

jahlove · on Feb 14, 2020

> So if you accept Boeing's original classification it is not a defect.

Then it's a design defect. The FAA even seems to think so. It seems like hair splitting.

sopooneo · on Feb 7, 2020

I think saying the system didn't fail relies an overly tight notion of the MCAS system boundary.

carefulfungi · on Feb 7, 2020

> The MCAS system didn't fail. It operated exactly as it was designed, it received faulty sensor data which in turn flew the plane into the ground.

"It worked on my airplane"

pfundstein · on Feb 7, 2020

> All software is buggy.

Assuming that's not hyperbole and just to be pedantic:

  mov ax,cs
  mov ds,ax
  mov ah,9
  mov dx, offset Hello
  int 21h
  xor ax,ax
  int 21h
  
  Hello:
    db "Hello World!",13,10,"$"

tantalor · on Feb 7, 2020

Bug report for you...

Expected: "Hello, World!"

Actual: "Hello World!" (missing comma)

See spec: https://en.wikipedia.org/wiki/%22Hello,_World!%22_program

pelliphant · on Feb 7, 2020

Don't know why I thought this was so funny, but I did. =)

kalium_xyz · on Feb 7, 2020

Sad thing we have little insight on microcode and the exact way machine code gets interpreted, for all we know there might still be a resulting bug from any step asm to execution.

kayfox · on Feb 6, 2020

I think the main thing we are seeing here is hundreds of smaller fixes that usually form the steady stream of Airworthiness Directives that an aircraft currently supported by the manufacturer sees turning into a news event every single time one comes out.

So far only one "aircraft" has had perfect software, and that was the Space Shuttle, every single other aircraft out there has had software issues that are worked out over the life of the aircraft, just like every piece of software, even that which has very strict testing regimes, has had defects in it.

joncp · on Feb 6, 2020

That's just a scale issue. The Space Shuttle only flew 135 times, so those one-in-a-million corner cases never really had a chance to happen. If it were to fly millions of missions like the 737 fleet has, then bugs would surface for sure.

kayfox · on Feb 7, 2020

The software quality of the space shuttle is much higher than commercial aircraft.

https://www.fastcompany.com/28121/they-write-right-stuff

blattimwind · on Feb 6, 2020

> So far only one "aircraft" has had perfect software, and that was the Space Shuttle

Actually it had 3+ known bugs.

rootbear · on Feb 6, 2020

And shortly after the code for the Apollo Guidance Computer was put up on Github, someone found a bug!

kayfox · on Feb 6, 2020

Drat, we have no known examples of perfect software.

radiorental · on Feb 7, 2020

10 goto 10

glitchc · on Feb 7, 2020

Great for those cold winter nights!

Will probably be optimized out by a modern compiler though. Sad.

0xffff2 · on Feb 6, 2020

True enough for modern commercial airliners; I can't help but point out that plenty of aircraft have "perfect" (read: non-existent) software. Those aircraft too generally have issues that are worked out over their lifetime. Software seems to be especially error prone, but maybe that's just because the mechanical engineers have a head start of several hundred years.

kayfox · on Feb 6, 2020

Honestly the mechanical engineering is also error prone, its just got margins that make it hard to mess up. There are instances where it does mess up, like the long list of cargo door issues through the 80s, the 737 rudder issue (as well as lots of other less famous hydraulic servo issues), and other issues.

bronco21016 · on Feb 7, 2020

That’s what I love about the older aircraft that are more mechanical. When it’s broke it’s obvious it’s broke. The mechanic comes out and can physically identify a part that’s broken and replace it. Problem solved and on you go.

With all of these aircraft that are so computer reliant it becomes this magic box that is nearly impossible to diagnose and fix quickly. You do the circuit breaker reset, then reset the whole jet, then check the connectors of the components of the system, then change some of the computers/controllers, all the while checking for any fault code that might lead you down the right path.

This process often takes 30-60 minutes by which time you’re boarded and ready to go and if it’s not fixed by then it turns into getting everyone off the aircraft and finding a different aircraft so the broken ship can be taken to the shop and a through investigation of the issue can be done.

Meanwhile the customers riding in the MD88 already had their mechanical part replaced and they’re on their way, none the wiser because the mechanic got it diagnosed and replaced before boarding was even done.

swiley · on Feb 6, 2020

I really wonder about these large engineering corporations, Toyota seems to have similar problems with software.

Part of me feels like many of these companies don’t keep code secret to protect IP, instead they do it because they know it’s a burning train wreck and don’t want people to find out.

salawat · on Feb 6, 2020

That's interesting. Toyota would be one of the last companies I'd expect to hear that about. They're notorious in Quality circles for taking Quality seriously; at least as far as their production line is concerned. Do they not apply that same philosophy to in house software?

mark-r · on Feb 6, 2020

The investigations carried out for unintended acceleration in Toyotas didn't paint a good picture.

https://www.safetyresearch.net/blog/articles/toyota-unintend... https://users.ece.cmu.edu/~koopman/pubs/koopman14_toyota_ua_...

s5ma6n · on Feb 7, 2020

Damn...

- "No configuration management"

- "No bug tracking system"

- "No formal specifications"

- "9,273 – 11,528 global variables"

- "Uses recursion, no mitigation for stack overflow. Memory just past stack is OSEK RTOS area"

I thought of Toyota as a much better company in terms of safety and reliability. I can't imagine other manufacturers and their code.

Glawen · on Feb 7, 2020

The slides are very bad quality work, the guy clearly never worked in the automotive world. Example: the automotive safety norms is called ASIL (iso26262), and it is perfectly ok to have a single ADC chip sampling the accelerator pedal input. SIL safety levels requires much more rendundancy than ASIL which is aimed at enabling carmaker to build affordable yet safe systems.

Another is the race conditions. Unless toyota/denso is very stupid, I really doubt than more than one thread is running on the CPU, because automotive OSEK typically run in locked step mode, meaning everything is run in one sequential thread, even if there are several cpu core.

Thirdly, global variables, as there is just one thread, are a perfectly ok thing to use, provided you add a special thing in the OS which guarantees that all inputs are frozen when a block of functions are called.

It is a very orientated slideshow with unproven claims, he discredit himself.

salawat · on Feb 6, 2020

Jesus...

Thanks for posting that. That was a horrifying read. I'm at a loss for words.

Looks like I've got some more reading to do...

thanatropism · on Feb 6, 2020

The Toyota / Arthur Deming quality philosophy is really applicable to repeatable process where quality control means detecting abnormal variation amidst normal variation.

vikramkr · on Feb 6, 2020

So not only are they trying to fix a fundamental hardware issue with a software patch, their inability to do software properly extends beyond just their MCAS system? This is a good reminder that air travel's extraordinary safety record isn't just a given, it's something that takes real work to achieve and when the people responsible for putting in that work (Boeing, regulators) begin taking safety for granted, that's when people die.

totalZero · on Feb 6, 2020

I agree. But we should keep in mind that there's no such thing as a bad apple; we can't blame individual executives or regulators.

It's a bad barrel: a company that has, on a cultural level, put its business motive above its responsibility to deliver a safe and high-quality product. We have seen documented evidence that employees knew there were dangers and problems, and discussed these issues, but nobody cared enough to slow things down and get the product right.

vikramkr · on Feb 6, 2020

Absolutely. That's why I said Boeing and the FAA failed their responsibility as opposed to a Boeing exec or a particular legislator - there are organizational, structural problems. Sure, some individuals made the decision to ignore reports or set a new culture, but the fact that they succeeded is concerning - why did everyone else enable them? Is there anything we could have done to encourage those engineers to whistleblow their concerns before the planes crashed? Would they have been taken seriously by the FAA or the media or investors who were pushing for growth at all costs? These are deep, structural problems.

salawat · on Feb 6, 2020

Interesting you're getting downvotes. Not sure if the general audience or not. Regardless, the primary motivation to get a worker in a highly consolidated industry to blow the whistle is for them not to feel like they only have one choice. I can see where Boeing having consolidated as much as it has can give the culture extra resilience against employee disruption simply from the fact there isn't anywhere else to go.

vikramkr · on Feb 6, 2020

I think if you get one or two downvotes for whatever reason, as hn starts to grey out the text, a bit of bandwagon effect happens, not sure why.

Good point about whistleblowing. Perhaps the faas reliance on self regulation alsobplayedbibto that consolidation, so even the one other place they might have gone was just something that looped right back to the monolith.

stallmanite · on Feb 7, 2020

Perhaps the threshold for graying out text should be much higher as the current setting is very path-dependent in that the first few people to vote on a comment have a disproportional effect.

jsight · on Feb 6, 2020

It isn't really a fundamental hardware issue, its a fundamental issue with trying to work around the training requirements that should come along with a new airframe.

I suspect that a thorough review of some of the more complex Airbus airframes currently in operation would result in some similarly scary findings, tbh.

artursapek · on Feb 6, 2020

Complacency kills. This is particularly true in aviation.

clSTophEjUdRanu · on Feb 6, 2020

As a former software defense worker, I wish there were 3rd party audits of code and dev ops. If you saw the code that's flying in missiles, aircraft, etc and how they got there youd want to go live in a cave.

jacquesm · on Feb 6, 2020

Some whistleblower should one day post an archive of Airbus or Boeing's software archives. That would make for interesting reading.

Glawen · on Feb 7, 2020

It's usually worthless without knowing what is attached to the input/output of the microcontroller. A lot of things are done ecternally on the wiring.

kayfox · on Feb 6, 2020

They have missiles for those caves. ;)

bradknowles · on Feb 6, 2020

It doesn't have to be a missile.

Any fuel-air explosive will do.

pnako · on Feb 6, 2020

A bug in a plane can make it crash in a fireball. But a bug in a missile is something that would make it NOT crash in a fireball.

Thus the obvious solution to quality problems is to switch missile software engineers and aircraft software engineers, and encourage them not to care about quality.

monocasa · on Feb 7, 2020

Or make it crash into a fireball literally anywhere except where it was supposed to.

V_Terranova_Jr · on Feb 7, 2020

Seconded.

platz · on Feb 6, 2020

> designed to warn of a malfunction by a system that helps raise and lower the plane’s nose

So, they can't even name the mcas system anymore?

slumdev · on Feb 6, 2020

I thought that's what the MCAS was. Unless there's more than one system that overrides the pilot to pitch the nose down?

kayfox · on Feb 6, 2020

Speed trim and the trim system in general.

onychomys · on Feb 6, 2020

In their defense, probably every piece of software of any complexity at all has bugs waiting to be found, and it's not super surprising that they found some new ones while doing a rigorous testing regimen.

coldpie · on Feb 6, 2020

In their defense, the software industry is a complete joke in terms of quality control. I hope that this and the spate of ransomware will wake the industry up to realize we need new processes, languages and tooling to make software provably correct. It's clear we can't use our existing languages and tooling to make high quality software. Realistically though, that's not going to happen. The smart move is to eliminate as much software from your life as possible. It's only going to get worse as decades of laziness catches up to us.

thawaway1837 · on Feb 6, 2020

This defense basically says that software engineering is far more of a joke than hardware engineering.

But it’s not a real defense for the Max because the problem with the Max is that Boeing has shifted a lot of the work from hardware to software.

This defense only punts the ball a few yards because the question now is why Boeing chose to shift a lot of the reliability from hardware, which this defense admits is better engineered, to software, which it admits is almost intrinsically worse.

If anything, this defense leads to the conclusion that the Max is an intrinsically unsafer plane, because it has shifted far more of the burden to software which is an intrinsically worse engineering discipline.

If it was just a matter of the software QA being rushed then the additional time gained by the grounding of the planes could solve the problem. But if it is software engineering itself which is the problem then no additional time will help unless it leads to shifting some of the burden of safely flying the plane from software to hardware, which doesn’t even appear to be an option Boeing has considered.

harikb · on Feb 6, 2020

This seems like a weird way to look at the problem. Hardware also has software/logic. The only reason it appears more stable is that people are shit scared of finding a bug that can't be fixed. If we delay the software release as long as a hardware development takes, with corresponding QA, I don't see why switching it all to hardware is going to make any more sense. Immutability is all it has, unless you are somehow implying that one can be hacked (by bad actors) and the other can't be.

nomel · on Feb 6, 2020

I've worked for a couple of fabless semiconductor startups. When fixing a hardware mistake costs millions for a new mask, there are bugs in all the silicon, even the big guys [1].

[1] https://www.theregister.co.uk/2018/01/26/cloudflare_crashes_...

bradstewart · on Feb 7, 2020

While I do agree that hardware engineering generally employs more verification and validation, look at the errata sheets for modern processors. They have a lot of bugs as well--most of which are patched in microcode (software) or handled by operating system kernels (also software).

Anarch157a · on Feb 7, 2020

> most of which are patched in microcode

Because most of it is software anyway. Unless the CPU is something ridiculously simple, like an 8-bit microcontroler, a lot of the instructions are executed by microcode stored in ROM on the CPU. What those patches do is load a replacement software in RAM at boot time.

m_fayer · on Feb 6, 2020

Watching Android descend from a place of dorky stability to sleek sealed glitch-city, kinda makes me think it's the institutions, as opposed to the tooling, that's behind the plunge in the quality of mainstream software.

jsight · on Feb 6, 2020

I can't agree. I've been using Android since the Nexus One days, and it is generally much more stable for me than some of those early devices.

shadowgovt · on Feb 6, 2020

Was Android the stable one? I recall relative to iOS that it was the one likely to be eating its entire battery in the early iterations.

peteradio · on Feb 6, 2020

Feed me features I shit backlogs.

salawat · on Feb 6, 2020

I couldn't help but laugh, because it is so damn true.

It's taken me a while to come to the realization that all project management is is a backlog manufacturing and prioritization layer that operates on top of actual software engineers.

Most just want to implement things, and don't care what it is they are implementing as long as they are getting paid.

kjs3 · on Feb 7, 2020

we need new processes, languages and tooling to make software provably correct

This is so not true.

Processes? Like T-CMM/TSM, FAA-iCMM, ISO 90003, TCSEC, TSP, etc., etc.? Languages? Like Ada, MISRA, Modula-2/IEC1131, Cyclone, etc., etc.? Tools? Like FRAMA-C, MALPAS, SPADE, SPARK, TLA+, etc., etc.?

The simple fact of the matter is that we've had all the processes, languages & tools to write high-reliability, secure code for decades. Go check out the CMU SEI for how long that one institution has been trying to get people to do the right thing.

The joke is that the software industry by and large wants to pretend we don't, and claim that there's some magical new tech that needs to get invented and when it arrives everyone will jump on it and suddenly software bugs will be a thing of the past. And every new tool, tech or whatever that shows up re-proves some unpleasant basic facts: writing secure, provable, reliable code is time consuming, usually difficult, which translates to relatively expensive. And the software industry doesn't want to hear that, or invest in it, and excuses it's miserable security record with "it's not us, the tech doesn't exist to do the job right".

As a sidenote, this is why I have a visceral distaste for the Rust people (the language seems fine). If they'd spent those (presumedly) thousands of man years building on and improving Ada instead of being all NIH averse we'd be much further down the road. They could have participated in the process and gotten their wish list included in the Ada-2012 standard and built on 35+ years of experience. But, hey, it's more fun to start from scratch every decade or so.

starfallg · on Feb 6, 2020

Nah. Just introduce 'modern software development methodologies'. Put in a CI/CD pipeline and Blue/Green with Canary. Roll back the release on hull loss. /s

nerpderp82 · on Feb 6, 2020

The snark is not warranted. Those techniques are absolutely fine when you are building mail order catalogs on the internet.

Waterluvian · on Feb 6, 2020

I think the underlying issue is that when it comes to safety, privacy, and security, even if people aren't dying in a plane crash, you still need to do it with a basic level of competence. You cannot roll back a privacy leak.

Now if you're talking about features that don't touch these things, go nuts. And there lies a second issue: the identification and distinction between what stuff is "go nuts" and what stuff is "do it right the first time". You can't approach every project with the latter or you'll waste so much time in the design phase when really you should just throw your new CSS theme out there and see how it goes. But you just can't approach the "account creation" feature the same way.

ikiris · on Feb 6, 2020

Until privacy leaks have more than 0 consequences, why would anyone care to prevent them?

catalogia · on Feb 7, 2020

Basic morality? Do you consider yourself a good person? Do you do the right thing, even when nobody is forcing you to? Do you steal when nobody is looking?

And yes, corporations aren't people, corporations don't have morality, I've heard all those excuses. Those corporations are comprised of people and those people should not be held to a lower standard.

whynaut · on Feb 6, 2020

The "on hull loss" bit addresses the fact that the discussion is around much more critical software. I didn't read it as snark.

avh02 · on Feb 6, 2020

The snark is warranted if you call yourself (generic yourself, not you in particular) a software "engineer" - but yeah, if your ego's content with "developer" then the snark's not warranted.

jimmaswell · on Feb 6, 2020

It's the job title the company gave me so I use it. No deeper meaning to me.

shadowgovt · on Feb 6, 2020

Even NASA shipped a computer with known severe failure modes for the Apollo vehicle.

thawaway1837 · on Feb 6, 2020

NASA was also doing things that were never done before.

The 737 Max is much worse at doing things that were done much better for decades before it.

fsh · on Feb 6, 2020

This was 60 years ago in a desperate race with the Soviets and with an expendable flight crew. Hardly comparable conditions to commercial aviation.

ClumsyPilot · on Feb 6, 2020

You can have a known risk/failure mode, and risk manage it.

It's another matter entirely to pass on the flawed device to a third party with no preparations and warning. Then people die.

GiorgioG · on Feb 6, 2020

> Those techniques are absolutely fine when you are building mail order catalogs on the internet.

Given that attitude, it's no wonder 99% of software out there is utterly bug-ridden.

gmanley · on Feb 6, 2020

Not all software has the same level of criticalness. The difference between a bug in an airplane vs one in a game or entertainment software is gigantic. Sometimes it really is more important to get the software out quicker vs it being bug free. Plus, formally provable software exists and is used in some mission critical applications including some Airbus software. It's just that it takes a lot longer to develop. These problems are more a business and process issue, as in Boeing prioritizing speed over quality.

SilasX · on Feb 6, 2020

The "software industry in general", sure, but my understanding is that any flight-critical software goes under a much more rigorous testing regime and uses much better practices the average software project. It's probably why such issues were caught so early here.

jffhn · on Feb 6, 2020

>critical software goes under a much more rigorous testing regime

Testing is not only not enough, it's also not required :) (as Dijkstra said, it can only show the presence of bugs, not their absence)

For example, the software for some automated subways in France has never been tested, only proven correct (source: https://www.youtube.com/watch?v=jc9QmqKIUj4&t=54m50s).

Similar formal methods are used for critical software pieces in Airbus aircrafts (source: the same speaker as for the linked video, but I don't remember which talk that was in).

jammygit · on Feb 6, 2020

When proving its correct, isn’t there always the possibility that the proof abstracted away some real world nuance into an assumption, and that assumption could be wrong?

jffhn · on Feb 7, 2020

Yes, I guess the proof is only relative to the formal spec, if the spec (formal or not) is nonsense it's another problem (so you might not have to test the program, but still have to test the complete system).

But I guess (again) that having formal specs getting proven helps to avoid inconsistent specs. I recall a talk by Martyn Thomas where he said that with these methods it was harder to get programs to compile (for example, the compiler could complain if it detected an inconsistency, either between the spec and the program or within the spec itself).

DigitalBison · on Feb 6, 2020

This topic reminds me of this relatively old (but still super interesting) article about the team that worked on the space shuttle's onboard computer systems and the rigorous processes they followed to ensure correctness and safety: https://www.fastcompany.com/28121/they-write-right-stuff.

Seenso · on Feb 6, 2020

> the team that worked on the space shuttle's onboard computer systems and the rigorous processes they followed to ensure correctness and safety

Those processes are expensive. It'd imagine that they're a huge political problem to maintain, giving cost-cutting pressures and the temptations of COTS [1].

[1] https://www.faa.gov/aircraft/air_cert/design_approvals/air_s...

nealmcb2 · on Feb 15, 2020

Compliance with SEI CMMI Level 5 probably is expensive. While looking for more data I noted that several Boeing units are at Level 5. And I also found that between 2008 and 2019, about 12% of appraisals given across all industries and countries were at maturity levels 4 and 5, and hundreds are at Level 5 (more than Level 4), according to CMMI Institute: https://cmmiinstitute.com/resource-files/public/cmmi-adoptio...

So it all makes me wonder about the official maturity of Boeing's 737 team, and also how much that rating relates to actual software quality and safety objectives (as apparently achieved in the space shuttle work)....

DigitalBison · on Feb 6, 2020

Oh totally, I can't dispute that. It's just good reading in general, but is more of a response to the grandparent comment stating "the software industry is a complete joke in terms of quality control". I think that's probably not a completely off-base assessment of quality control on average in the tech industry, but the point is it's not like no one knows how to do better; rigorous QA processes exist, but it's a huge tradeoff between cost/agility/etc and the expected cost/damage of not getting it right.

bradknowles · on Feb 6, 2020

There are SEI Level 5 operations in the world. They do exist. Or at least, we've been told they do.

But they are also super expensive.

And no one has any money to spend on anything like testing or quality any more.

bronson · on Feb 6, 2020

After 350 people die is early??

SilasX · on Feb 6, 2020

Okay, I think that's an unhelpful way to put it, but it's still worth addressing, and there's a lot to unpack here, so here goes.

My original point was that flight critical software shouldn't be lumped in with the bugs-typical world of software. I still say that's correct.

But "bug" has a range of meanings. From what I read of the story of the original fatal error on the 300 MAX, the software was not buggy in the sense of "deviating from specification". It's just that -- as it turns out -- the specification was bad, and the MCAS should not have overridden pilot input when it did.

In a sense -- the one I consider most important -- that matches my model of flight-critical software. As software, they made absolutely sure that, in every case, it did what it was intended to; to the extent that it was in error, it was not in the sense of "we failed to make the software do what the spec says".

OTOH, I agree that "bug" is taken more broadly in software than the sense I was using here, that there usually isn't such a clean ("Waterfall") separation between "specifying what the software should do" and "ensuring that it does that" -- there's a collaborative process of ferreting out the implications of "what the system should do at what points".

In these latest events, it looks like the bug was in the narrower sense I meant above, of "it doesn't meet the spec", and that bug was found before deaths. From the article:

>The problem was that an indicator light, designed to warn of a malfunction by a system that helps raise and lower the plane’s nose, was turning on when it wasn’t supposed to, the company said.

Now, in fairness, that's a much later discovery than my optimistic model held, but still far earlier than "being used for flying the public around", as (the analog of) might happen with the software industry more generally.

thawaway1837 · on Feb 6, 2020

The only issue is here is that the proximate cause of failure was what you’re saying.

What is worrying is that there could be many other potential sources of failure which simply didn’t trigger because the MCAS situation was so bad it led to 2 plane crashes before they even had a chance to bring down some planes.

In other words, if you have 2 bugs, one with a 1/100 chance of triggering and the other with a 1/1000 chance of triggering, by the time you hit 500 attempts, odds are the first bug triggered twice, while the second didn’t even once. So you solve the first bug, that still leaves you with a problem that has a 1/1000 chance of triggering, when the expectation is that bugs should only trigger at worst 1/10000 times.

Solving the MCAS issue is not by itself a reliable indicator that this is a safe plane to fly, especially since we know there are many fundamental procedural reasons to be worried about the quality of the plane.

SilasX · on Feb 6, 2020

Sounds right to me! My point is just, it doesn't feel right to lump this kind of failure in with "lol all software is super buggy and this is no exception". Yes, there was a failure here; no, it didn't look anything like "ohhh yeah pointers are super hard to get right".

jsjohnst · on Feb 6, 2020

> the MCAS should not have overridden pilot input when it did

The entire purpose of MCAS was to override pilot input as the typical instincts of a trained 737 pilot is to do the wrong thing. The failure was not handling edge cases (conflicting or missing sensor data) and no clear way to turn off the system when it was failing.

brokenmachine · on Feb 7, 2020

So, what you're saying is that MCAS should not have overridden pilot input when it did?

jsjohnst · on Feb 7, 2020

So what you’re saying is you have a need to just be “right“ without also understanding the back story behind it?

(I didn’t say anything about GP being wrong, I simply gave more details about the how and why)

salawat · on Feb 6, 2020

>It's clear we can't use our existing languages and tooling to make high quality software.

Gotta fundamentally disagree with you right there. We absolutely can make high-quality software with our current tooling. The issue is that doing that is expensive and time consuming, and the Market optimizes on good enough to be sold and not dropped.

This expense and difficulty isn't an inherent fault of the tools, but rather the monstrous other side of the coin in proving what your system isn't.

There are many implementations that are composable to generate an end result, the trick is to expend the energy to ensure you've made the specific one that also doesn't run into undefined behavior, domain specific or otherwise.

You will never escape from the tyranny of having to clearly communicate to a perfectly obedient machine exactly what it is you want it to do; part of which is being able to identify when you don't have all your requirements right.

thawaway1837 · on Feb 6, 2020

Airbus has been doing it for a while. The big difference between Airbus and Boeing has been how much more reliance Airbus places on technologies like MCAS vs straight up pilot skill.

Airbus has designed their planes to counteract bugs by adding tremendous amounts of redundancies. The Max is Boeing’s effort at creating an Airbus like plane without the redundantsafety features.

ixacto · on Feb 6, 2020

That is why I refuse to fly in anything that has Boeing on the side of it. Based on the 737 max fiasco and previous incidents they it looks like they are willing to make compromises in safety for profit. The regulatory agencies including the FAA here in the USA are captured and thus irrelevant, and the attempt to fix them died https://en.m.wikipedia.org/wiki/Regulatory_capture#Federal_A...

Unless we see fines based on a percentage of revenue or the CEO/Board of Boeing being thrown in prison, then we won’t see change.

Pmop · on Feb 6, 2020

Research on software reliability is plentiful and throughful, and there are existing tools to use such knowledge in a way to ensure safety of critical systems, from development to deploy. The thing is that they most likely ran their calculations and concluded that it's more profitable to cheap out on software development and QA, and pay for insurance or fines, should anything go wrong later. Engineers who devoted their time and effort to learn aforementioned techniques are very unlikely to work for 9$/h, it's more profitable and less stressful to go wash cars or whatever.

anshou- · on Feb 6, 2020

The people, processes, languages, and tools aren't the root problem. The root problem is always time and money.

rfmw19 · on Feb 6, 2020

> The root problem is always time and money.

I agree with you. However, if you're not some high level manager who controls this, what's the next best thing you can do?

I think most problems are less technical and are more about people and processes. You can still argue you don't have enough influence there, and that's completely possible and realistic. But that should be where we direct attention.

Yes, some technical advances in better tools and languages that provide stricter proofs and so on are needed and will help. But ultimately it's still the people that need to learn, use, and enforce the processes.

robocat · on Feb 6, 2020

> In their defense, the software industry is a complete joke in terms of quality control.

My dishwasher stopped working today. Last week I had to replace a wheel bearing in my van ($900). I just got a refund for a pond aerator that stopped working after a few weeks. And the head came off my plastic toy.

Hardware fails all the time: does that mean I can generically say all manufacturing has shitty QC?

blt · on Feb 6, 2020

I'd say the aerator and toy are analogous to software bugs, but the wheel bearing is a wear item. Even the best-engineered hardware has parts that wear out. That kind of planned maintenance has no equivalent in software.

pinkfoot · on Feb 6, 2020

In our defence, our customers are a compete joke in terms of the price they are prepared to pay for quality sfotware.

See MCAS programmers at $9/h.

dotancohen · on Feb 6, 2020

I just checked and it is true. The subcontractors who wrote the MCAS software were paid $9/h:

    > Increasingly, the iconic American planemaker and its
    > subcontractors have relied on temporary workers making
    > as little as $9 an hour to develop and test software,
    > often from countries lacking a deep background in
    > aerospace -- notably India.

https://www.bloomberg.com/news/articles/2019-06-28/boeing-s-...

salawat · on Feb 6, 2020

Note those engineers (HCL) only worked on the Multi-Function display code. I don't have the article on hand, but they did ascertain at one point that the MCAS code was written in house. I'll try to find the article again. I think it was Seattle Times.

Found it. Was Bloomberg. Here is the old thread.

https://news.ycombinator.com/item?id=20309052

thawaway1837 · on Feb 6, 2020

It’s a lot more comfortable to blame Indian engineers (and of course the HN crowd loves that explanation, no matter how invalid it is) than to place the blame where it belongs. Squarely on Boeing management.

slumdev · on Feb 6, 2020

It doesn't mean much that the software was written in-house.

Boeing has literally relocated their plant operations cross-country in the past in order to break the workers' unions.

Just think about how much pressure they can exert on a non-union profession like software development.

anticensor · on Feb 6, 2020

That is just enough for one employee.

coldpie · on Feb 6, 2020

That's still an industry problem. No one hires doctors or architects for $9/h.

PeterisP · on Feb 6, 2020

If you go for countries with lower wages (just as Boeing did for engineers), you definitely can hire doctors and architects for $9/hour or even less; as $9/h is more than the average doctor or architect gets paid there, you could definitely pick some who are quite decent.

E.g. just to use a random example, Russian statistics show that the average pay of an architect is something like $800/month in many regions - in some more, in some less, but definitely well withing $9/hour; and for doctors in regions outside Moscow it's something roughly similar; Moscow seems to get something like 2200$/month in the average statistics so that's more than $9/hour if they're working sane hours.

reilly3000 · on Feb 6, 2020

No new processes needed. ISO 90003 covers all kinds of software engineering practices that ensure quality and correctness. I wish more firms used what is already out there.

purpleidea · on Feb 6, 2020

Provably correct software (and memory and type safe languages!) are both important aspects. But they can still cause you to die in a software bug. The third and most important aspect is that the code must be viewable by any third party or individual. Open source here would be ideal, but at a bare minimum, we should be allowed to see the code that our lives depend on.

simion314 · on Feb 6, 2020

>and it's not super surprising that they found some new ones while doing a rigorous testing regimen. reply

So they did not do enough testing before, just the happy case and never considered sensor failures. I am now wondering if they tested the rest of the software or FAA will just look into MCAS and ignore all the rest.

johannes1234321 · on Feb 6, 2020

As a plane crash costs lives of a few hundred people the criteria should be different from "normal" software. And for most parts and most history software in aeronautics was created with special care and tooling and under research of provability.

organsnyder · on Feb 6, 2020

In the case of MCAS, that's likely true: the software was behaving exactly per spec; the problem was in the earlier requirements-definition phase.

salawat · on Feb 6, 2020

A grievous failure in Systems Engineering and FMEA, I believe is the industry term.

skissane · on Feb 6, 2020

> it's not super surprising that they found some new ones while doing a rigorous testing regimen

Why didn’t they find these bugs during the plane’s development and certification? It appears the testing then was less rigorous. They should extend the same rigorous testing to their other aircraft models also.

totalZero · on Feb 6, 2020

Shouldn't that rigorous testing regimen have preceded the sale and flight of these large passenger aircraft?

kayfox · on Feb 6, 2020

A regime which finds all the defects would cost billions on its own, only one such piece of software has been subject to such a regime and that's the Space Shuttle guidance program. Every other piece of software is tested to levels set by the aeronautical industry and regulators, which may not include all possible scenarios.

You will note, some of these scenarios are essentially fuzzing the memory of the flight computer and seeing what happens, ideally most bit flips will either be detected or be minor, but some can end up causing issues. I'm not sure if it would be possible to explore all the branches in the software in any sort of reasonable amount of time.

https://en.wikipedia.org/wiki/Qantas_Flight_72

imglorp · on Feb 6, 2020

That cost is already baked into aviation hardware. It's why fasteners are so expensive, for example, because the cost of certifying, tracking their history, and quantifying their quality is already amortized into each bolt.

Flight software development lifecycle--and regs as you point out--still needs to catch up.

m4rtink · on Feb 6, 2020

Actually, for fasteners there is often a material difference - cheaper fasteners are cut to shape while better fasteners are forged into shape, resulting in a much stronger part due to the resulting grain flow.

tgflynn · on Feb 6, 2020

> A regime which finds all the defects would cost billions on its own, only one such piece of software has been subject to such a regime and that's the Space Shuttle guidance program.

Do you have a source for that claim ? I'm skeptical that even a testing process costing billions of dollars could claim to cover all possible inputs and states for a software system of any complexity.

raisedbyninjas · on Feb 6, 2020

This describes not a testing regimen, but the entire development lifecycle. https://www.fastcompany.com/28121/they-write-right-stuff

tgflynn · on Feb 6, 2020

It's a very interesting article but

"as perfect as human beings have achieved" != perfect

andbberger · on Feb 6, 2020

What's this about the space shuttle? I'm intrigued.

I thought that the paris metro was the posterchild of formal verification methods.

salawat · on Feb 6, 2020

The software that went into the Space Shuttle was coded to CMM level 5; this is a level of Quality Assurance that would make even the most Iron Fisted, detail oriented QA begin to sweat at the pile of paperwork and pressure ahead of them.

It's a completely different experience that likely only a handful of modern testers would even contemplate nowadays, but has always been a shining example of something to aspire to be part of one day for me.

It isn't at all easy, and basically requires the discipline to understand, document, and justify every single line of code within the context of the overarching system.

No business wants to deal with it, and it is no surprise that only publically funded projects tend to get anywhere close.

https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/201100...

https://developers.slashdot.org/story/00/05/19/050258/space-...

onychomys · on Feb 6, 2020

Sure. But I'd hope that they brought in new people, with a new outlook on things, and started all over from scratch. They're probably running mostly the same tests they were before the crashes, and also some new ones that people have dreamed up.

tremon · on Feb 6, 2020

I think it's more likely that all the managers in the chain of control suddenly feel like their ass is on the line if another 737 MAX drops out of the sky, which is why they're now paying more attention to what their engineers have been telling them for years.

Which is probably a good thing, except for the part that it took hundreds of people to die first to get here.

nradov · on Feb 6, 2020

The Space Shuttle flight computer software was effectively bug free. Only a tiny number of defects were found post release, and none impacted safety.

Findeton · on Feb 6, 2020

Actually there are two types of software failures: design failures and implementation failures. Design failures are harder, but we already have ways of completely avoiding implementation failures, with systems that prove mathematically that the software implements the design. NASA for example uses those in many occasions. Boeing doesn't, obviously, and there's no valid excuse.

anshou- · on Feb 6, 2020

When some software is being built the constraints are much more rigorous, but it all comes down to the organization. The guidelines established by NASA for their space program are a good example of rigorous requirements. Boeing should not be unfamiliar with this concept and can certainly afford to do this properly. This move was motivated purely by greed.

benwerd · on Feb 6, 2020

There is a less than zero chance I'll be boarding one of these planes again, ever. Trust is an important idea in any product, but particularly in areas like aviation, and I don't see how they can possibly build it back.

I do see an opportunity for software that ensures you are only booking journeys on the aircraft you feel are safe.

oska · on Feb 7, 2020

I agree with you. This plane is fundamentally flawed and the reason such a flawed plane went into production was to reward short-term, sociopathic thinking and earn short-term profits (which ended up blowing up in their faces anyway). Taking a flight on this plane now would, for me, be like going back to an abusive partner after they'd thrown me down a flight of stairs.

apexalpha · on Feb 6, 2020

on the other hand, the software for this plane might be the most scrutinized ever when all this is over.

oska · on Feb 7, 2020

The hardware remains fundamentally flawed.

frandroid · on Feb 12, 2020

> Asked about a likely date for a return to service for the Max, Dickson said it isn’t helpful to talk about timelines. Boeing needs to concentrate on making complete, quality submissions on its fixes for the plane, he said.

Ahh, "we'll ship it when it's ready, not on some arbitrary deadline." Music to any engineer/builder's ears.

stagas · on Feb 6, 2020

Can someone explain how a hugely complex machine with mostly parallel working analog parts fits into the digital computing paradigm? Isn't it predetermined to fail under extreme conditions, like those that are found while flying inbetween clouds and thunderstorms with all that pressure and fluctuations? How does sampling not fail, like, all the time? What kind of tooling is being used to mitigate for all these? Does anyone know?

garbage_88564 · on Feb 6, 2020

I am very naive to commercial aviation but this is my experience with building and crashing model aircraft repeatedly. I fly mostly FPV which puts me in the first person view from the cockpit.

Yes, electronics fail in the most weirdest ways due to connector failures, RF interference, software error, sensor failure.

When my systems start failing or acting up due to improper stabilization PID gains, etc. I have a big switch for MANUAL mode. I am able to fly this thing as long as the servos, radio, and camera get power. All sensors could be sheared off. I have no idea what my airspeed is ever because I don't use pitot tubes so I use a known engine throttle % whose stall characteristic I understand for level flight in various wind conditions and I don't make sudden maneuvers at throttle below this point.

Fixed wing planes have remarkable aerodynamic stability and I don't understand why 737 MAX cannot be piloted in a fly by wire manner with all computer aids disabled, giving the pilots direct control of the servos with a big red switch that mechanically disconnects the flight computers. This requires almost no code to implement.

jaywalk · on Feb 6, 2020

On Boeing aircraft, the pilots essentially do have "direct control of the servos" at all times. MCAS was implemented to make the MAX fly just like the NG despite the difference in engine size and placement. What MCAS actually did was not modifying the pilots inputs, but adjusting the stabilizer trim in certain scenarios.

The pilots do have direct control over the stabilizer trim, and have always had the ability to disable the electronic system in case of stabilizer trim runaway. This was not new to the MAX, and would have effectively disabled MCAS.

pdonis · on Feb 6, 2020

> What MCAS actually did was not modifying the pilots inputs, but adjusting the stabilizer trim in certain scenarios.

But MCAS had enough control authority to override the pilots' inputs; the pilots of both crashed aircraft were desperately trying to pull the nose up, but couldn't because MCAS had put in so much nose down trim that they couldn't counteract it.

> The pilots do have direct control over the stabilizer trim, and have always had the ability to disable the electronic system in case of stabilizer trim runaway. This was not new to the MAX, and would have effectively disabled MCAS.

The ability to disable the automatic stabilizer trim system was not new to the MAX, yes.

What was new to the MAX was that, unlike previous 737s, disabling the automatic stabilizer trim system would also disable the manual electric stabilizer trim system, so that the only way to adjust the trim would be by using the mechanical trim wheel. And it was possible for MCAS to adjust the trim into a range where it was mechanically impossible to adjust it back using the mechanical trim wheel.

garbage_937648 · on Feb 6, 2020

Pilots do not have direct servo control of the aircraft if there is any possibility of any computer system adjusting the servos aside from the throw commanded by the sticks held by the pilot.

Reading comments such as:

> The problem was that an indicator light, designed to warn of a malfunction by a system that helps raise and lower the plane’s nose, was turning on when it wasn’t supposed to, the company said.

Implies that there is intrinsically some computer system that continually parses the commanded stick deflection and applies an overlay.

What I am suggesting is a single toggle to make everything shut up and reset all servos to their midpoint all at once in one shot and let the pilot just fly the plane.

I have not seen any evidence that such a system exists. It is the elephant in the room. Airplanes do not need complex electronics to just fly if they are aerodynamically stable, and this plane is more or less stable except that under some conditions it will make the pilot soil their pants at higher AoA, which is where the promises of MCAS come in. Big deal. They can mentally compensate against that manually better than fighting a computer system working actively against your commanded inputs.

I have experienced the joy of a badly tuned PID controller turning my stabilization system into involuntary high speed descent. The fix is always to tell the computer to shut up and just fly the plane 100% manually.

pdonis · on Feb 6, 2020

> Implies that there is intrinsically some computer system that continually parses the commanded stick deflection and applies an overlay.

That's not how the 737 works. The 737 is not a fly by wire aircraft. The pilots control the rudder, ailerons, and elevators electrohydraulically; there is no computer filtering. The electric stability trim system, which is what MCAS feeds its input into, controls the trim tabs on the elevators. This does not change anything about the pilots' inputs to the elevators, but it does change the aerodynamics of the elevators in a way that can limit the pilots' ability to control pitch.

> this plane is more or less stable except that under some conditions it will make the pilot soil their pants at higher AoA, which is where the promises of MCAS come in. Big deal.

If the 737 MAX had been a new aircraft type, it would not have been a problem. There might still have been some adjustment needed to meet FAA certification requirements for stick force (basically, the stick force is supposed to increase with increasing angle of attack, so the pilot has to pull harder to keep the nose going up as you get closer to a stall). But there would not have been a need to cobble together anything like MCAS.

The problem was that Boeing wanted the 737 MAX to be certified under the existing 737 type certificate (because otherwise the potential customers wouldn't want it, since they didn't want to have to re-train and re-certify all their pilots), which meant that the stick force as a function of angle of attack had to be the same as for previous 737s. But the new engines on the 737 MAX made the plane aerodynamically different, so the "natural" stick force was different. MCAS was a software kludge to try to change the stick force.

kayfox · on Feb 7, 2020

>If the 737 MAX had been a new aircraft type, it would not have been a problem.

If the aircraft was a new type this still would have been an issue that would have to be corrected, see FARS 25.173.

pdonis · on Feb 7, 2020

Yes, which is why I said: "There might still have been some adjustment needed to meet FAA certification requirements for stick force"

garbage_937648 · on Feb 7, 2020

Good point

pdonis · on Feb 6, 2020

> Fixed wing planes have remarkable aerodynamic stability and I don't understand why 737 MAX cannot be piloted in a fly by wire manner with all computer aids disabled

It can be. MCAS can be disabled by disabling the electric stability trim system. The problem is that if you do that in a situation where MCAS has already adjusted the trim far enough from where it should be, it can be mechanically impossible to put the trim back where it belongs without using the electric trim system. So you have to first use the electric trim system to put the trim back where it belongs, then disable it so MCAS can't mess it up again.

salawat · on Feb 6, 2020

https://en.wikipedia.org/wiki/Failure_mode_and_effects_analy...

https://en.wikipedia.org/wiki/Fault_tree_analysis

Basically, you should be designing every system to gracefully handle the failure of every other system on which it is dependent.

So the MCAS routines, if they had been done correctly, and properly classified as to the hazard level, should have taken into account failures of the Flight Computer they were running on, anomaly detection via cross-check with the second AoA vane, etc. That quite clearly did not happen.

The same approach applies with any other hardware/software integration. Your sensors will break. You therefore need to determine what you need to do when that happens.

starpilot · on Feb 6, 2020

notadoc · on Feb 6, 2020

Throw in the towel on the 737 Max and go back to the drawing board.

bsimpson · on Feb 6, 2020

I'm shocked they keep publicly working on this plane.

I get that planes cost more money than I can fathom, and that making a whole fleet of impossible amounts of money costs a gazillion dollars. Still, this one seems spent. Nobody is going to knowingly fly on a 737 Max.

They ought to have retired the plane last year. They can design a new plane (that because of economics, will probably be very similar to this plane), release it when it's been properly vetted, swap out Maxes for it, retrofit those Maxes, etc.

I realize this is naive armchair quarterbacking from someone who has never worked in aviation, but there's a reason that Philip Morris is called Altria now and that Weinstein Co was merged into Spyglass. If the public doesn't trust your brand, no amount of "but we fixed it with this patch we rushed out the door" is going to change that.

yellow_lead · on Feb 6, 2020

> Nobody is going to knowingly fly on a 737 Max.

I will respectfully disagree with you. Most airlines won't tell you your aircraft when you book a ticket and even if they do, they seem allowed to change it at the last minute. If you have spent $500,$1000, $XXXX on a ticket, will you not board if you discover the plane has been switched? Will you avoid airlines that don't guarantee you a certain plane?

There are not many great options to travel long distances quickly. If this plane becomes commonplace, I'm afraid consumers will be forced to use it. If there are other ideas on how this could play out, I'm happy to consider them, but I fear this is nearly a given.

thawaway1837 · on Feb 6, 2020

I wouldn’t be surprised if ticket booking sites didn’t throw up a “this could potentially be a 737 Max plane” indicator the moment these planes are ungrounded.

Because the first ticketing site that did that would gain a definite edge. And even if the airline doesn’t tell you what plane is flying, we know what routes and airlines are flying 737 Max’s, which is enough to raise an indicator. If that happens, flying a 737 Max may potentially cause airlines to lose money even if the particular flight isn’t a 737 Max, because it might be one.

blt · on Feb 6, 2020

> Most airlines won't tell you your aircraft when you book a ticket

I don't think this is remotely true. I just checked Delta and British Airways; both show the aircraft type in "details". (Pick the 747 flights from BA while you still can!)

> even if they do, they seem allowed to change it at the last minute

Airlines shuffle aircraft occasionally, but usually the aircraft type for a particular flight is predictable. If airline A flies the 737 Max and airline B flies the A320 on the same route, people will flock to airline B.

gpm · on Feb 6, 2020

Last time I was on a delta flight, their terms of service (or whatever they call them) explicitly reserved the right to swap out the type of aircraft at any point in time. That was less than a year ago, I assume it is still the case.

oska · on Feb 7, 2020

They won't swap you to a 737-MAX if they don't own any of that aircraft. Delta, your example, does not have any in service and has not ordered any. It's not hard to find out which airlines use the plane and which don't.

yellow_lead · on Feb 7, 2020

Your experience may vary to mine. Recently I tried to determine my aircraft for a connecting flight on Air Canada and I couldn't tell it before I booked the flight (maybe that is the distinction, before booking vs after).

0xffff2 · on Feb 7, 2020

I just looked at flights on Air Canada's website and I can see the aircraft type under "details" there just like every single other booking website since online flight booking has been a thing.

oska · on Feb 7, 2020

It's quite easy to look up which airlines use this aircraft and which don't.[1] And further, which airlines are major purchasers of this aircraft (e.g. Southwest, Flydubai and Lion Air). And then avoid flying those airlines.

[1] https://en.wikipedia.org/wiki/List_of_Boeing_737_MAX_orders_...

kayfox · on Feb 6, 2020

My experience is most airlines in the US will tell you on booking, its just not put out in the open.

inferiorhuman · on Feb 7, 2020

The type of equipment an airline plans to use for a given route is generally decided upon well in advance. The information is typically not that hard to find. Most (American) airlines will show you when you search.

https://i.imgur.com/h8k2IFK.jpg

https://i.imgur.com/7xfQpQX.jpg

https://i.imgur.com/7Ax2AQh.jpg

https://i.imgur.com/z6YOo2G.jpg

https://i.imgur.com/QGxNFXp.jpg

nemetroid · on Feb 6, 2020

And it will often say "38M" rather than "Boeing 737 MAX 8".

kayfox · on Feb 6, 2020

Should be B38M.

Its standard ICAO terminology.

I haven't seen that sorta thing on booking sites, but its common on sites like FlightAware.

JumpCrisscross · on Feb 6, 2020

> They ought to have retired the plane last year

This would bankrupt the company.

(Not necessarily uncalled for. But not something it will do on its own.)

> If the public doesn't trust your brand

The public has shown one preference, above all else, when it comes to flying: pricing. The 737 MAX will be renamed, re-certified and nobody but an obsessive minority will avoid flying on it.

(This is why, btw, we need strong airline regulators. Market pressure is ceteris paribus insufficient.)

TeMPOraL · on Feb 6, 2020

> This would bankrupt the company.

Is that even possible? From what I understand, it's a strategic company for the United States, so they have to keep it alive no matter what (I recall reading that it's actually a law) - or at least its military branch.

rsynnott · on Feb 6, 2020

It can certainly be kept alive through bankruptcy. Boeing isn't going anywhere. The board's share value might, tho.

nokcha · on Feb 6, 2020

It is that possible that it can be bankrupted in the sense that stockholders get wiped out even if the US government bails it out in the sense of keeping it alive operationally.

TylerE · on Feb 6, 2020

Do you really want to re-regulate the airlines? Fares will at least double.

JumpCrisscross · on Feb 6, 2020

> Do you really want to re-regulate the airlines?

No, pardon me, I meant planes and their maintenance. Airlines setting fares and competing for slots is fine.

pdonis · on Feb 6, 2020

> I meant planes and their maintenance

Planes and their maintenance are already strictly regulated in the US. Note that the two crashes were not of planes flown by US carriers. The first crash (Lion Air) had maintenance irregularities that contributed to it.

MCAS, however, is not a maintenance failure but a design failure. "More regulation" is not really a good description of what is needed to prevent future design failures like that. What is needed is regulation that can't be outsourced to the companies being regulated, as the FAA was doing.

slashdev · on Feb 6, 2020

Yes. Regulations are worth the downsides when it's a matter of life and death. Plus it's already highly regulated, so being a little stricter is not going to mean noticeably higher fares.

The problem here is the regulatory body basically punted and let the company regulate themselves. Like that was going to work...

aianus · on Feb 6, 2020

Strong disagree. There is always an economic tradeoff. Hypothetically going from 0.00001% chance of death to 0.000001% chance of death is not worth my ticket price going from $500 to $2000.

Also why I drive to work every day instead of some safer form of transportation.

slashdev · on Feb 7, 2020

The price going from $500 to $2000 for just enforcing reasonable regulations and oversight on Boeing is a pretty blatant strawman.

I strongly disagree with your strong disagree.

TylerE · on Feb 7, 2020

It’s not a straw man. It’s actual facts. This isn’t a hypothetical.

Airlines were heavily regulated. We know how it worked.

slashdev · on Feb 7, 2020

It's a straw man. Reasonable regulations are not going to raise prices by 400%. That's absurd.

Nobody is saying regulate them to death. But maybe, just maybe, don't let the company themselves certify their own aircraft when it comes to matters of safety, required training, etc.

aianus · on Feb 8, 2020

> Regulations are worth the downsides when it's a matter of life and death.

I am in disagreement specifically with this absolute quote.

Regulations may or may not be worth the downsides, even when they save lives. The economic costs matter and should be calculated and a reasonable tradeoff should be made.

You can't make a statement like that and then turn around and complain about my hypothetical 300% price increase for a 900% safety increase.

slashdev · on Feb 14, 2020

I think we're in agreement then. I don't think in terms of absolutes, there's always a trade-off to be made and one must be rational and measured.

TylerE · on Feb 6, 2020

Regulation (Pre-1978) did mean much, much higher fares.

By 1990 alone inflation adjusted fares had fallen 30% since deregulation. They've fallen more since.

"Airline revenue per passenger mile has declined from an inflation-adjusted 33.3 cents in 1974, to 13 cents in the first half of 2010. In 1974 the cheapest round-trip New York-Los Angeles flight (in inflation-adjusted dollars) that regulators would allow: $1,442. Today one can fly that same route for $268." (https://www.bloomberg.com/businessweek/bwdaily/dnflash/conte...)

slashdev · on Feb 6, 2020

I think it would be a mistake to attribute all of those price changes to deregulation. Lot's of things have happened since then, including improvements in efficiency and technology, insane competition driving profit margins nearly to zero, removing lots of frills and comforts, increased automation (not in the plane as much as everywhere else in the process and org.) If you look at other regulatory environments you'll see similar price drops. I'm sure it was a factor to some extent though - but lower prices is not a good argument for playing fast and loose with human lives.

TylerE · on Feb 6, 2020

They aren't though. As I've stated in other comments, airline travel is incredibly safe. About 500x safer than driving.

This "life and death" talk is just tabloid sensationalism. The numbers don't support it, all. Airline travel is the safest means of transports that exists, or has ever existed, by quite a margin.

slashdev · on Feb 7, 2020

I agree, and I get how safe air travel is. But let's not underplay this either. We had two crashes with hundreds of fatalities because the regulator decided to let the fox watch the hen house. That's sad and avoidable.

thawaway1837 · on Feb 6, 2020

No it wouldn’t.

The regulations worked fine and weren’t super expensive, until the FAA decided to outsource its regulations to the very companies it was supposed to regulate (Boeing).

TylerE · on Feb 6, 2020

https://en.wikipedia.org/wiki/Airline_deregulation#Price

brianwawok · on Feb 6, 2020

Depends how many people die at current levels?

TylerE · on Feb 6, 2020

Very very few.

Air travel in the US averages 0.2 deaths per 10 billion passenger-miles.

Driving is 150 deaths per 10B P-M.

Driving across a small town is more likely to kill you than flying across the country.

brianwawok · on Feb 6, 2020

That is current. I suspect the longer that an industry has heavy self regulation, the higher the cost is.

I also do not agree with the premise that imposing stricter regulation would "double" the cost of tickets. Airline tickets have come down in price for MANY reasons, not just less regulation.

TylerE · on Feb 6, 2020

Incorrect.

Deregulation happened in 1978. Deaths have been trending down ever since.

brianwawok · on Feb 13, 2020

Do you know that correlation does not imply causation?

I would suggest the site https://www.tylervigen.com/spurious-correlations

triceratops · on Feb 6, 2020

That also has the upside of cutting carbon emissions from air travel.

masswerk · on Feb 7, 2020

Mind that the 737 Max is all about avoiding recertification (or grandfathering) and avoiding pilot retraining. Its raison d'être is to allow airlines with an existing fleet of 737s to operate with existing crews without further training and/or certification. Meaning, designing a new plane allowed operators to switch to Airbus, as well, which might be the more attractive option at the moment.

This is all about operator logistics and lock-in by costs of retraining and infrastructure. (I guess, this may be the real point to be addressed, since, as this is such a crucial factor, procedures are prone to be repeated in other configurations in the future. Failures are highly likely to be repeated, regardless of the actor.)

Edit: To emphasize the last thought, the 737 Max may be more of a systemic failure of the entire business, its regulations and how they are conducted, than a failure by a single actor on a single instance.

alkonaut · on Feb 6, 2020

I’ll probably fly it. As in, I wouldn’t pay an unlimited amount of time or money to avoid it, and it will be expensive (e.g a 10 hour drive instead of a one hour flight).

I assure you that outside of people interested in tech, few I know have any idea what the MAX8 is, how to tell one from the NG or an A320. Once the plane flies it’s going to be business as usual.

rootusrootus · on Feb 6, 2020

I am willing to bet that most people do not care enough to fight it. They want cheap, everything else is secondary. They will rationalize that all of this testing has made the MAX arguably the safest airliner they can fly in, and step aboard they will.

Most people are not well represented by the fine folks on HN.