CrowdStrike Clowns Strike Again
Salvete, amici!
Along with millions of other people, the Mz. and the K’s fell victim to CrowdStrike’s incompetence last Friday:
- https://arstechnica.com/information-technology/2024/07/major-outages-at-crowdstrike-microsoft-leave-the-world-with-bsods-and-confusion/
- https://www.nytimes.com/2024/07/22/technology/crowdstrike-outage-congress-hearing.html
Their codeshare flight to Amsterdam was cancelled, though it wasn’t clear whether that was done preemptively by Delta or by KLM, and they were stuck at Berlin’s notoriously late, over-budget, and poorly designed “new” Brandenburg airport.1 While I tried to get help from Delta (there was an estimated 6 hour 45 minute wait for a callback on the helpline for Medallion members), KLM was fortunately able to rebook them through Paris with an overnight layover. KLM-AirFrance’s IT operations seem to have been substantially less exposed to the consequences of CrowdStrike’s blunder than those of their US partner, which lost track of its flight crews in the outage.
It looks like it will take through the end of this week for Delta to get things under control, though it would not surprise me if the airline takes another week to shake off its Blue Screen of Death hangover.
The root cause of the debacle appears to be dereferencing a null pointer, a basic programming error that every decent programmer working in memory-unsafe programming languages like C/C++ guards against paranoiacally:
If that is in fact the culprit, it speaks to poor coding and verification practices within CrowdStrike. (Automatic zero had it been a CS217 assignment. Dave Hansen should glide off the piste and beat those fools with his ski poles.) That kind of bug absolutely should have been caught in code review, or simply by using standard static analysis tools:
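For anyone who hasn't had the pleasure, this is roughly what that paranoia looks like in C. It is a generic sketch and not CrowdStrike's code: `rule_entry`, `find_rule`, and `read_rule_flags` are names I made up purely for illustration, and the table layout is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record type, invented for this illustration. */
typedef struct {
    uint32_t rule_id;
    uint32_t flags;
} rule_entry;

/* Hypothetical lookup: returns NULL when there is no table
 * or the index is out of range. */
const rule_entry *find_rule(const rule_entry *table, size_t count, size_t index)
{
    if (table == NULL || index >= count) {
        return NULL;
    }
    return &table[index];
}

/* Guarded read: returns 0 on success, -1 if there is nothing to read.
 * The pointer is checked before it is dereferenced. */
int read_rule_flags(const rule_entry *table, size_t count, size_t index,
                    uint32_t *out_flags)
{
    const rule_entry *entry = find_rule(table, count, index);
    if (entry == NULL || out_flags == NULL) {
        return -1; /* refuse to touch a pointer that points nowhere */
    }
    *out_flags = entry->flags;
    return 0;
}
```

Drop the `entry == NULL` check and `read_rule_flags` dereferences whatever `find_rule` hands back, which is undefined behavior whenever the index is out of range. An unchecked use of a pointer that can be NULL is the sort of path static analyzers are built to flag.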
CrowdStrike has plenty of form for incompetence and mischief. Over a two-year stretch my work-issue laptop has gone through regular periods of near-unusability as the falcond daemon sucks up all of the CPU cycles and often lots of RAM. Each time the fix has been to wait, and wait, for CrowdStrike to issue a patch.
CrowdStrike’s suggested fixes for this cluster don’t inspire confidence in the company’s technical chops.
And CrowdStrike’s CEO George Kurtz was the CTO of McAfee at the time the anti-virus software company caused another worldwide Windows Blue Screen Of Death meltdown when that company released a similarly faulty update.
(Though several outlets report that the damage from that led to McAfee being sold to Intel, by implication at a bargain price, that 2011 transaction seems to have been only minimally affected by the 2010 fiasco. Intel bought McAfee at a 60% premium over its market price.)
Kurtz managed to fail upwards into billionaire status with CrowdStrike.
CrowdStrike was also the leading rain wall in the early days of the bullshit hurricane of Russiagate, publicly proclaiming that Russian-government-sponsored hackers had breached DNC computers and removed a cache of highly embarrassing emails (revealing, among other things, the DNC’s coordinated efforts to rig the 2016 Democratic Party primary process against Bernie Sanders in order to secure the party’s nomination for Hillary Clinton). Only under oath in a closed session of the House Intelligence Committee did CrowdStrike’s Shawn Henry admit that the company had no real evidence of Russian involvement, facts kept from the US public until mid-2020. Veteran CIA analyst Ray McGovern’s pieces for Consortium News explain this hoodwinking very clearly.2
FTC Chair Lina Khan, who is likely the most beneficently effective public servant appointed by Joe Biden (probably an accident), connected the CrowdStrike-caused outage to market concentration: “All too often these days, a single glitch results in a system-wide outage, affecting industries from healthcare and airlines to banks and auto-dealers. Millions of people and businesses pay the price. … Another area where we may lack resiliency is cloud computing. In response to @FTC’s inquiry, market participants shared concerns about widespread reliance on a handful of cloud providers, noting that consolidation can create single points of failure.”
I’d love to see Khan go after CrowdStrike, but I suspect the PR-centric groaf-tech company is too plugged in and too useful to too many political players for her to have any chance of curbing it.
1. The awfulness of Berlin Brandenburg Airport and the story of its excruciatingly slow and inept design and construction put paid to any residual notion of Germany as a paragon of efficiency and engineering know-how:
- https://www.bbc.com/news/world-485273088
- https://www.dw.com/en/berlins-new-airport-finally-opens-a-story-of-failure-and-embarrassment/a-55446329
- https://edition.cnn.com/travel/article/berlin-brandenburg-airport-one-year-on/index.html
- https://www.thegermanreview.de/p/the-real-story-behind-berlins-airport
- https://interestingengineering.com/science/heres-how-berlin-brandenburg-airport-became-one-of-the-biggest-engineering-failures
The venerable Berlin-Tegel airport that it replaced was indeed too small and too old to serve the city much longer, but it was an exceptionally convenient and flier-friendly throwback to mid-20th-century airport design. Until 2020 you could step out of a taxi, walk 20 meters to a check-in desk, and then another 20 meters through security right at your departure gate. Incredibly easy.
2. CrowdStrike defended its hacking claims in an obfuscatory blogpost that relies on statements in the Senate Intelligence Committee report. Those statements ultimately refer back to CrowdStrike’s evidence-free assertions for substantiation, sometimes via a chain of assessments attributed to various US intelligence agencies that had not themselves conducted any digital forensics but instead relied on CrowdStrike’s summaries.