Boeing 737 Max 8 failures: an explanation?

Hillhater

I came across this and thought some ES members might be interested ..

BEST analysis of what really is happening on the #Boeing737Max issue, from my brother-in-law, who's a pilot, software engineer & deep thinker. Bottom line: don't blame software; that's the band-aid for many other engineering and economic forces in effect.

Some people are calling the 737MAX tragedies a #software failure. Here's my response: It's not a software problem. It was an Economic problem that the 737 engines used too much fuel, so they decided to install more efficient engines with bigger fans and make the 737MAX.
Airframe problem. They wanted to use the 737 airframe for economic reasons, but needed more ground clearance with bigger engines. The 737 design can't be practically modified to have taller main landing gear. The solution was to mount them higher & more forward.
Aerodynamic problem. The airframe with the engines mounted differently did not have adequately stable handling at high AoA to be certifiable. Boeing decided to create the MCAS system to electronically correct for the aircraft's handling deficiencies.
During the course of developing the MCAS, there was a:
Systems engineering problem. Boeing wanted the simplest possible fix that fit their existing systems architecture, so that it required minimal engineering rework, and minimal new training for pilots and maintenance crews.
The easiest way to do this was to add some features to the existing Elevator Feel Shift system. Like the #EFS system, the #MCAS relies on non-redundant sensors to decide how much trim to add. Unlike the EFS system, MCAS can make huge nose down trim changes.
On both ill-fated flights, there was a:
Sensor problem. The AoA vane on the 737MAX appears to not be very reliable and gave wildly wrong readings. On #LionAir, this was compounded by a
Maintenance practices problem. The previous crew had experienced the same problem and didn't record the problem in the maintenance logbook. This was compounded by a...
Pilot training problem. On LionAir, pilots were never even told about the MCAS, and by the time of the Ethiopian flight, there was an emergency AD issued, but no one had done sim training on this failure. This was compounded by an..
Economic problem. Boeing sells an option package that includes an extra AoA vane, and an AoA disagree light, which lets pilots know that this problem was happening. Both 737MAXes that crashed were delivered without this option. No 737MAX with this option has ever crashed.
All of this was compounded by a:
Pilot expertise problem. If the pilots had correctly and quickly identified the problem and run the stab trim runaway checklist, they would not have crashed.
Nowhere in here is there a software problem. The computers & software performed their jobs according to spec without error. The specification was just shitty. Now the quickest way for Boeing to solve this mess is to call up the software guys to come up with another band-aid.

I'm a software engineer, and we're sometimes called on to fix the deficiencies of mechanical or aero or electrical engineering, because the metal has already been cut or the molds have already been made or the chip has already been fabbed, and so that problem can't be solved.

But the software can always be pushed to the update server or reflashed. When the software band-aid comes off in a 500mph wind, it's tempting to just blame the band-aid.

 
These last two years have seen historic lows in the price of fuel, but both the 737 and the 737 MAX were developed at times when people feared fuel prices would climb in the near future.

There is a constant drive for higher efficiencies, and when more efficient engines are made, there is a desire to squeeze a square peg into a round hole.

When it comes to the angle of attack (AoA), I am reminded of Charles Lindbergh. Long before he became the first pilot to fly solo non-stop across the Atlantic, he was an airmail pilot who flew in every kind of bad weather. He would fly low and follow railroad tracks, where a nose-dip could kill him.

He had the best instruments that were available, and yet he trusted a spoon hanging by a string to tell him if he was level or not. He often flew through fog, and like all pilots, he experienced how he felt like he was going up when he couldn't see, even though he was actually flying level.

Instruments occasionally failed him, but when his life was on the line on a foggy night, the spoon never failed.
 
I'd say there was a software problem as well. During the LionAir accident, one AoA sensor was reading normally and the other one was reading way too high. Intelligent software could have cross checked with airspeed, vertical speed, bank angle and engine setting and determined that one sensor was bad - and disregarded that reading, instead using the other reading to maintain stability. Or at the very least disabled nose-down trim while the disagreement was present. That would have prevented both accidents.
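
To make the "at the very least" option concrete, here is a minimal sketch of a disagreement check that blocks automatic nose-down trim while the two vanes differ. It is only an illustration of the idea in this post, not how Boeing's software works: the function name, the `TrimDecision` type, and the 5-degree limit are all made-up placeholders.

```python
from dataclasses import dataclass

# Hypothetical threshold: an illustrative placeholder, not a certified value.
DISAGREE_LIMIT_DEG = 5.0


@dataclass
class TrimDecision:
    allow_nose_down: bool
    reason: str


def nose_down_trim_allowed(aoa_left_deg: float, aoa_right_deg: float,
                           limit_deg: float = DISAGREE_LIMIT_DEG) -> TrimDecision:
    """Inhibit automatic nose-down trim while the two AoA vanes disagree."""
    delta = abs(aoa_left_deg - aoa_right_deg)
    if delta > limit_deg:
        return TrimDecision(False, f"AoA disagree by {delta:.1f} deg")
    return TrimDecision(True, "AoA vanes agree")
```

With readings of 4 and 5 degrees the sketch allows trim; with 4 and 25 degrees it refuses and reports the disagreement. It deliberately does not try to decide which vane is the bad one.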

However, the author does have a good point in that a lot of things (including certification) went bad.

As to the certification issue, in April Boeing praised the new administration for helping them "streamline" the certification process. Looks like there might have been some downsides to such streamlining.
 
Good write up quoted by the OP. Looks pretty consistent with all I've read to date in a condensed form.

This gent is pretty good with facts. He flies right seat in 777 and is about the facts, not emotional diatribe. A good reporter. https://www.youtube.com/user/blancolirio/videos
 
billvon said:
Intelligent software could have cross checked with airspeed, vertical speed, bank angle and engine setting and determined that one sensor was bad - and disregarded that reading, instead using the other reading to maintain stability. Or at the very least disabled nose-down trim while the disagreement was present. That would have prevented both accidents.

I agree that better software can partially compensate for some of the other problems. An aircraft is a system and all the components have to work together as a system. Many high performance aircraft are so aerodynamically unstable they need computers just to stay in the air.
 
fechter said:
I agree that better software can partially compensate for some of the other problems. An aircraft is a system and all the components have to work together as a system. Many high performance aircraft are so aerodynamically unstable they need computers just to stay in the air.
Definitely. It's also worth noting that Airbus aircraft are now entirely fly by wire - they use computers full time to stabilize the plane (although the airframes themselves are relatively stable.) There have been a fair number of accidents related to this system, although to Airbus's credit there have been far fewer than anyone anticipated from such a complex system with 100% authority.
 
Reminds me of the stealth fighter teething problems. Built unstable as hell, it depends entirely on the software to make it flyable. They had some crashes in the first few years of operational use. But in that case, one pilot dies.

Very sad to see this happen in an airline. One thing they taught me in pilot training, all the FAA regulations are written in blood. This time, too much blood. But to be fair to Boeing, you do need it to be possible to get a new aircraft through certification.

IMO, if you need to be in autopilot during climb out to be stable, that's a bit too unstable a design for an airline. I hate the idea of pilots letting autopilot take off and land. If they are not good enough, they should not be commercial pilots. I'm not saying no fly by wire, but I'm saying when the pilot pulls that stick up, the plane should stick up. Not countermand the stick and dive. Not till they are at 10,000 feet AGL, and have room for recovery. Yes, that takes a pilot that doesn't stall the thing out on take off. Yes, that should be a no problem thing for a commercial pilot.
 
The best analysis I've seen is pointing to a lack of basic piloting skills which leads to not disengaging the autopilot then manually trimming the plane when there is a runaway trim.
They receive very little/no training on the specific flight characteristics of the Max8 which seems to include a strong pitch response to power changes.
 
Grantmac said:
The best analysis I've seen is pointing to a lack of basic piloting skills which leads to not disengaging the autopilot then manually trimming the plane when there is a runaway trim.

Is this something they're likely to have time to do if they're not ready & prepared for it happening?
 
Grantmac said:
The best analysis I've seen is pointing to a lack of basic piloting skills which leads to not disengaging the autopilot then manually trimming the plane when there is a runaway trim.
Those aren't basic pilot skills.

In any emergency, a pilot has three primary jobs:
Aviate
Navigate
Communicate

Those are the most basic piloting skills.* 99% of accidents out there occur because a pilot fails to do one of those three things.

Debugging the plane's systems comes in a distant fourth, and only happens after the rest of the above are under control. That's why this is such a big deal. Unless you know how to quickly disconnect that system (which was NOT part of any new training for the 737 MAX) the system will continue trimming down until it overpowers the pilot and makes it impossible for him to fly the airplane.

(* - one of the harder things for new pilots to learn is to do all those things first BEFORE debugging the plane. For example, during an engine-out in a single-engine plane, you have to trim the plane for best glide, select an emergency landing site, set up a pattern for the landing site, declare an emergency - and only THEN start messing with the engine to try to get it restarted. During training, if you try to debug first (which is your first impulse) you'll get dinged by your instructor or evaluator.)
 
dogman dan said:
IMO, if you need to be in autopilot during climb out to be stable, that's a bit too unstable a design for an airline.
Well, the problem only occurs during sudden increases in engine power, or steep turns at high power settings - neither of which happens during normal climbout.
I'm not saying no fly by wire, but I'm saying when the pilot pulls that stick up, the plane should stick up. Not countermand the stick and dive.
Right, and it always does. But the trim system is capable of overpowering the pilot in this case.
Not till they are at 10,000 feet AGL, and have room for recovery. Yes, that takes a pilot that doesn't stall the thing out on take off. Yes, that should be a no problem thing for a commercial pilot.
99.9% of pilots take off manually. Were you thinking that the pilots of 737 MAX flights were taking off under autopilot control? They are not, and that wasn't the problem.
 
I've actually had a loss of engine power on initial climb that ended in a completely undamaged aircraft, which also happened to be an antique taildragger. So I've generally got those priorities in place when it really matters.

However, dealing with runaway trim falls under Aviate, since it can overpower manually flying the aircraft. Since this isn't happening on takeoff but rather around the time they'd be engaging the autopilot, disengaging it definitely falls under aviating as well.
Any time you change the state of a system and something gets dramatically worse (with the exception of carb heat) then the very first action is to reverse that change.
 
Grantmac said:
I've actually had a loss of engine power on initial climb that ended in a completely undamaged aircraft, which also happened to be an antique taildragger. So I've generally got those priorities in place when it really matters.
Me too, although it didn't result in a forced landing (fortunately.)
However dealing with run away trim falls under Aviate since it can overpower manually flying the aircraft.
Right - but again, pilots are not (currently) trained to do that. And when the sh!t hits the fan, pilots generally fall back on their training.
Since this isn't happening on take off but rather around the time they'd be engaging the autopilot then disengaging it definitely falls under aviating as well.
Uh - it happened on takeoff both times. Lion Air reported problems just three minutes after takeoff; the Ethiopian flight reported problems one minute after takeoff. The autopilot was not engaged on the Lion Air flight, and was likely not engaged on the Ethiopian flight.
 
Fortunately it was a partial power situation and I made the field for a downwind landing. An incident but not an accident.


Do we KNOW they were hand flying? Because the auto usually goes on just after wheels up according to a few sources I've asked.

If the MCAS has the ability to apply that much trim when not on auto based on a non-redundant sensor then that seems like a huge design oversight.
 
More info being released..

...
An off-duty pilot riding in the cockpit of the (Lion Air) Boeing 737 Max 8 fixed a malfunction on the aircraft's second-to-last flight, before the same plane crashed into the Java Sea during a different flight the next day, Bloomberg reported on Tuesday evening.
The pilot reportedly advised the crew to kill the power to a motor that was pointing the aircraft’s nose downward. That move helped prevent a catastrophe, according to Bloomberg.
The aircraft was being operated by a different crew the next day, on October 29, 2018, and crashed fewer than 15 minutes after takeoff, killing all 189 passengers and crew on board.
Apparently they had to switch off the MCAS system?
But they never reported the problem to maintenance engineers!
https://www.businessinsider.com.au/lion-air-crash-off-duty-pilot-saved-boeing-737-max-2019-3?r=US&IR=T
 
That's nuts if true!

Likewise if the MCAS can over-power the pilot and crash the plane in manual flight based on a single sensor - I always had the impression any critical instruments or sensors were triple redundant on aircraft?

It's one thing to say "if auto-pilot malfunctions, disengage it and fly manually" but quite another to expect a pilot to diagnose a faulty system that can overpower manual flight and have to kill power to it (switch on console? breaker?) to regain control. It's hard to imagine such a design would pass review.

It's hard not to speculate on events like this...
 
I'm a retarded balloon pilot, but I got the impression that the newest airline jets did take off and land on some kind of automated systems, rather than just glide path warnings, leading to a bit of pilot dependence on it. This one is not exactly autopilot, I guess, but an all-the-time anti-stall thing. What I heard was that in the past it would push the stick down if you were stalling rather than just shake the stick, but the pilot could override it with brute strength. Now you cannot, unless you turn it off, as that guy did the day before the other crash. Busy aviating like hell, those pilots that crashed did not have time to break out a manual and find the kill switch. I just thought if they went full manual, or at least could override if required, up to a higher altitude, then they would have time to aviate out of a problem.
 
Grantmac said:
Do we KNOW they were hand flying?
On the first flight, yes. We have the DFDR records. They took off and the pilot started fighting the MCAS almost immediately. He'd trim the nose up, MCAS would trim it down. This happened over and over. At one point they lowered the flaps as part of the debugging process. This disabled that MCAS feature so the MCAS stopped fighting him. They likely thought they had fixed it. Then they raised the flaps, the fight started again - and the MCAS eventually won.
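
A toy model of that tug-of-war, purely for illustration. It assumes each pilot correction recovers a bit less trim than MCAS put in, and that nothing moves while the flaps are down; the step sizes and units are invented and are not taken from the DFDR data.

```python
def simulate_trim_fight(cycles, mcas_step=2.0, pilot_step=1.5,
                        flaps_down_cycles=frozenset()):
    """Toy model: return stabilizer trim after each MCAS/pilot exchange.

    Negative values mean nose-down trim. All step sizes are made-up units.
    """
    trim = 0.0
    history = []
    for cycle in range(cycles):
        if cycle not in flaps_down_cycles:
            # MCAS trims nose down (only active with flaps up in this model).
            trim -= mcas_step
            # Pilot trims back nose up, but never past neutral.
            trim = min(0.0, trim + pilot_step)
        history.append(trim)
    return history
```

Calling `simulate_trim_fight(5, flaps_down_cycles={2, 3})` shows the trim drifting nose-down, pausing while the flaps are down, then drifting again once they come back up.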

We don't have the DFDR data from the second flight yet. But if that problem occurred _without_ manual input, they would only have flown for a few seconds before nosing into the ground - so it's very likely they were hand flying.
If the MCAS has the ability to apply that much trim when not on auto based on a non-redundant sensor then that seems like a huge design oversight.
Yep.
 
Hillhater said:
Apparently they had to switch off the MCAS system ?
But they never reported the problem to maintenance engineers. !
Well, turned off the trim motors, rather than the MCAS. But without those motors the MCAS can't cause trouble. From what I read they DID report this. Once they started looking they saw something like a dozen reports of problems like this in the ASRS and in airline maintenance records.

dogman dan said:
Im a retarded balloon pilot, but I got the impression that the newest airline jets did take off and land on some kind of automated systems, rather than just glide path warnings.
While many modern airliners _can_ fly that way, it's been my experience that most don't. Takeoffs and landings are almost always hand flown. The type of landing where you need 100% automation (a CAT IIIC - zero ceiling, zero RVR) is rare and requires a lot of certification of both airplane and pilot.
I just thought if they went full manual, or at least could over ride if required, to a higher altitude, then they would have time to aviate out a problem.
That was the old paradigm for Boeing. They changed it for the 737 MAX - and apparently pilots were unaware of the change.
Punx0r said:
Likewise if the MCAS can over-power the pilot and crash the plane in manual flight based on a single sensor - I always had the impression any critical instruments or sensors were triple redundant on aircraft?
Usually. From the DFDR data from the first flight, this was caused by a single bad sensor.
It's one thing to say "if auto-pilot malfunctions, disengage it and fly manually" but quite another to expect a pilot to diagnose a faulty system that can overpower manual flight and have to kill power to it (switch on console? breaker?) to regain control. It's hard to imagine such a design would pass review.
Washington (the state) has started a criminal investigation. They are looking to see:

1) How much warning Boeing had that this was a problem (we know they had a fair amount.)
2) How much political pressure was applied to the FAA to "streamline" the approval process for the MAX.
3) Whether the behavior of the MCAS system was misrepresented to pilots or regulators.

This might be necessary but it will also make the NTSB's investigation _much_ more difficult - because now people will be giving the answers they think will keep them out of jail.
 
That is some interesting insight Bill which is outside what I'd been able to bring up in my searching.

It definitely does not bode well for consumer trust in Boeing if it should come to light that at any point they had misgivings about that system.
 
I suppose the question now becomes...which airlines insisted on redundant AoA sensors, and which ones (like the two crashed planes) found a single vital sensor to be allowable.

I am DEEPLY disappointed that Boeing would allow any plane to leave their factory with single AoA sensor as an option. As stated above, certain vital components should have triple redundancy...
 
Since there are knowledgeable pilots on this thread, let me ask this question:

Early on, they stated how the 737 max engines could provide "lift." Well to me that was happening because they possibly installed the new engines with a positive tilt upwards, that bought more clearance for the nacelle with the runway but added a rotation thrust vector and subsequent lift vector that was engine power dependent.

I wonder if part of this MCAS installation was to tame the unintended consequence of getting more runway clearance for the nacelles?
 
spinningmagnets said:
which airlines insisted on redundant AoA sensors, and which ones (like the two crashed planes) found a single vital sensor to be allowable.
To be clear, the MAX has two AOA sensors. They are the 'dangly things' just beneath the cockpit windows on many airliners. One is shown in this pic: (vane in the middle of the circle)
[Image: DrY-rMlWsAANnIH.jpg]

The software was only _looking_ at one sensor, though (or doing dumb averaging that caused a single failure to make it look like the AOA was too high - that's unclear so far.) If the software could reject the bad sensor data then you wouldn't have seen this problem. I suspect that their first attempt to fix it will be a software patch that detects and ignores a bad sensor. It shouldn't be hard - if your airspeed is normal and you are climbing, and one sensor shows a 30 degree AOA, it's bad.
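
Here is a rough sketch of what such a sanity check could look like. It leans on the textbook approximation that, wings level in still air, angle of attack is roughly pitch attitude minus flight path angle. The function names, the choice of inputs, and the 10-degree tolerance are my own assumptions for illustration, not anything from Boeing's patch.

```python
import math


def expected_aoa_deg(pitch_deg: float, vertical_speed_mps: float,
                     true_airspeed_mps: float) -> float:
    """Rough AoA estimate: pitch attitude minus flight path angle.

    Assumes wings level, still air, and a positive true airspeed.
    """
    if true_airspeed_mps <= 0:
        raise ValueError("true airspeed must be positive")
    ratio = max(-1.0, min(1.0, vertical_speed_mps / true_airspeed_mps))
    flight_path_deg = math.degrees(math.asin(ratio))
    return pitch_deg - flight_path_deg


def pick_plausible_aoa(aoa_left_deg, aoa_right_deg, pitch_deg,
                       vertical_speed_mps, true_airspeed_mps,
                       tolerance_deg=10.0):
    """Return the vane readings that sit within tolerance of the estimate."""
    estimate = expected_aoa_deg(pitch_deg, vertical_speed_mps,
                                true_airspeed_mps)
    readings = {"left": aoa_left_deg, "right": aoa_right_deg}
    return {side: value for side, value in readings.items()
            if abs(value - estimate) <= tolerance_deg}
```

For a normal climb (10 degrees of pitch, 10 m/s up at 120 m/s), a vane showing 30 degrees falls well outside the tolerance and is dropped, while one showing 5 degrees is kept. A real system would also have to handle turns, gusts, and the case where both vanes are rejected.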
 
bigmoose said:
Since there are knowledgeable pilots on this thread, let me ask this question:

Early on, they stated how the 737 max engines could provide "lift." Well to me that was happening because they possibly installed the new engines with a positive tilt upwards, that bought more clearance for the nacelle with the runway but added a rotation thrust vector and subsequent lift vector that was engine power dependent.
Maybe. But that's a very complicated thing to figure out and I doubt it was just one thing that caused the new pitch-up moment.
I wonder if part of this MCAS installation was to tame the unintended consequence of getting more runway clearance for the nacelles?

Yes, but I think it was a long train of issues that led to this.

1) They wanted the new efficient engines. But the fans were bigger, and the 737 was designed to be low to the ground* so you didn't need fancy equipment for cargo/luggage/maintenance.

2) There was no way they were going to fit under the wing. So they had to move them ahead of the wing - _and_ lengthen the landing gear. It also affected weight and balance.

3) The new position seemed to work OK at first. It should be noted that when you increase power on aircraft with engine configurations like the 737 (and 757, and 767), you get some nose-up tendency because the engines are placed below the center of gravity. But in a few situations (like steep turns and high angles of attack at low speeds) this was more pronounced.

4) The pitch up during later flight envelope testing even caught the test pilots by surprise. I think Boeing referred to this as "unique aircraft handling characteristics" in a press release. An airframe redesign would have taken a long time, lost a lot of sales to Airbus and didn't even have a high chance of success, since so many things go into airframe design.** So they added a feature to the MCAS to trim the nose down in those two cases as a stopgap.

* - This was fine with the first JT8D engines - cigar-shaped low-bypass engines. They were long but skinny. The next upgrade, to the CFM56-3B engines, was problematic because they couldn't fit them in the same space (sound familiar?). So they reduced the size of the fan, which reduced efficiency, and moved the engine further forward. They also played around with the inlet shape, which is why many 737s have a funny oval engine intake rather than a round one. It is likely that their success with this modification made them confident they could do it again on the MAX with the LEAP engines.

** - Everything affects everything in aircraft design. Raising the engines should have REDUCED the pitch-up tendency when power is applied since there is less mechanical advantage. But a positive rig angle - the angle the engines are mounted in relation to aircraft centerline, which you mentioned above - might have increased it. More air over the wing from the larger fan might have increased pitch-up when flaps are deployed, but not when they are not deployed - which is where they saw the problem. But if the larger, further forward engine shadows the wing at high angles of attack, the opposite may happen as the nose comes up. So it would be hard for anyone to say "ah, if we just change the rig angle/mounting position all will be well!"
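
The "less mechanical advantage" point is just a lever arm: the nose-up moment from thrust is thrust times the vertical distance of the thrust line below the center of gravity. A tiny worked sketch follows, with made-up numbers and a hypothetical function name. It covers only this one effect and ignores rig angle, airflow over the wing, and everything else mentioned above.

```python
def thrust_pitch_moment_nm(thrust_n: float,
                           thrust_line_below_cg_m: float) -> float:
    """Nose-up pitching moment (N*m) from thrust acting below the CG."""
    return thrust_n * thrust_line_below_cg_m


# Illustrative values only: same thrust, engine raised by half a metre.
low_mount = thrust_pitch_moment_nm(100_000.0, 2.0)
high_mount = thrust_pitch_moment_nm(100_000.0, 1.5)
```

In this example the raised mount gives a smaller moment than the lower one for the same thrust, which is why raising the engines alone should have reduced the power-on pitch-up.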
 
Thanks Bill! Great write up and great information!
 