Daily Digest — 2026-05-27

26 newsletters today.

In this digest

Abandoned Futures — The Northrop YB-49: The Flying Wing Bomber That Was 40 Years Too Early and Killed by a Phone Call
ArXiv Paper Digest — HTMLCure: Turning Browser Experience into State Guided Repair for Interactive HTML
Daily Automotive Engines — Oil Pump Relief Valve Bypass Routing: Where Does the Excess Oil Actually Go?
Daily Debugging Puzzle — Go's Nil Map Asymmetry: The Read That Smiles While the Write Panics
Daily Digital Circuits — Built-In Self-Test (BIST): How Hardware Tests Its Own Memories and Logic at Power-Up
Daily Electrical Circuits — Current Mirrors: The Workhorse of Analog IC Bias Distribution
Daily Engineering Lesson — Ground Loops: Why "Ground" Isn't Always Zero Volts
Forgotten Books — The Victorian Spermicide: Quinine Pessaries and 8 Million Satisfied Customers
Forgotten Darkroom — The Rule-of-Thumb Workman vs. the Intelligent Investigator: A 1902 Plea for Scientific Literacy in the Trades
Forgotten Patent — Herman Hollerith's Electric Tabulating Machine: The 1889 Patent That Invented the Data Center — and Became IBM
Daily GitHub Zero Stars — polygentic/anti-fun
Daily Hardware Architecture — The Branch Predictor's Capacity Pressure: Why More Branches Means Worse Prediction
Hacker News Deep Cuts — An Intensive Introduction to Cryptography
HN Jobs Teardown — Khan Academy: What Their Hiring Reveals
Daily Low-Level Programming — TLB Shootdowns: Why Unmapping Memory on One Core Stalls All the Others
RFC Deep Dive — RFC 2577: FTP Security Considerations
Stack Overflow Unanswered — Shared memory between two elf files
Daily Software Engineering — The Merkle Tree Pattern: Comparing Massive Datasets Without Sending Them Over the Wire
Tool Nobody Knows — bbe: sed for Binary Files (When Hex-Editor-Through-a-Pipe Just Won't Do)
What If Engineering — What If We Pumped Liquid Nitrogen Through Skyscraper Beams to Make Them Stronger?
Wikipedia Rabbit Hole — Cryogenic deflashing
Daily YT Documentary — Building a Film Festival Without Gatekeeping | Not Film Fest at NAB 2026
Daily YT Electronics — How to Make a Transformerless Power Supply #engineering #electrical #gkfacts #currenttransformer
Daily YT Engineering — Whiteboard Lesson: The Fluid Dynamics of Why Your Room Won't Cool
Daily YT Maker — RaspberryPi5 Smart Kiosk Build Complete
Daily YT Welding — Start of the Fummins project. Removing the Cummins and CNC plasma cutting to fix a new engine stand.

Abandoned Futures

The Northrop YB-49: The Flying Wing Bomber That Was 40 Years Too Early and Killed by a Phone Call

2026-05-27

On June 5, 1948, test pilot Glen Edwards lifted the second prototype Northrop YB-49 off the runway at Muroc Army Air Field. Ninety minutes later, the aircraft disintegrated in the Mojave Desert, killing all five crew. The base was renamed Edwards Air Force Base in his honor. The flying wing program had four more years to live, and then Jack Northrop's masterpiece would be cut up with acetylene torches while he watched.

The YB-49 was a 172-foot-wingspan, eight-engine jet bomber with no fuselage and no tail. Just a wing. Northrop had been chasing this configuration since 1929, convinced that eliminating parasitic drag from fuselages and empennages was the path to true aerodynamic efficiency. The math was unassailable: a pure flying wing carries no structure that isn't generating lift. The XB-35 piston version flew in 1946; the YB-49 jet conversion flew in 1947. It could carry 10,000 pounds of bombs 2,000 miles. Its radar cross-section, accidentally, was so small that ground controllers repeatedly lost it on approach to Andrews Field in 1949 — a fact buried in test reports for thirty years until B-2 engineers rediscovered it.

Then came the cancellation. The official reasons were technical: yaw instability without a vertical tail made it a poor bombing platform with 1948 analog autopilots, and the bomb bay was too short for the Mark III nuclear weapon. The unofficial reason, according to an interview Jack Northrop recorded with Clete Roberts in 1979, was a meeting with Air Force Secretary Stuart Symington in 1948 where Symington allegedly demanded Northrop merge with Convair. Northrop refused. The contract went to the Convair B-36 instead. Symington left the Air Force in 1950 and went on to serve as a U.S. senator from Missouri.

Eleven flight-worthy airframes were ordered scrapped in 1953. Not mothballed. Scrapped. Jack Northrop, banned from his own factory's wing program, was reportedly told the destruction was to prevent "further development by parties who didn't deserve it." He didn't see a flying wing fly again until 1980, when classified B-2 program officials brought him into a SCIF and showed him the design. He died ten months later.

Why it works now:

Fly-by-wire: The yaw instability that doomed the YB-49 is trivially solved by digital flight controls running at 100+ Hz. The B-2 proved this in 1989. A clean-sheet civilian flying wing — say, a 250-passenger transport — would have 30% lower fuel burn than a conventional tube-and-wing per Boeing's own X-48B blended-wing-body tests (2007-2012).
Composite manufacturing: The YB-49's aluminum structure couldn't handle the bending loads of a true span-loaded wing. Carbon fiber laid up with automated fiber placement makes the 1948 structural problem disappear.
Cabin pressurization: The cylindrical fuselage exists largely because cylinders handle pressure differentials elegantly. Modern multi-bubble composite pressure vessels (proven on the 787) make non-cylindrical pressurized cabins manufacturable.
Boundary layer ingestion: Engines buried in the wing root, sucking in slow boundary-layer air, recover 4-8% additional efficiency — exactly the YB-49's engine placement.

NASA and Boeing's X-48 program demonstrated all of this between 2007 and 2013. The Aurora D8 "double bubble" took some of it forward. JetZero received $235 million in 2023 to build a full-scale BWB demonstrator by 2027. Eighty years after Glen Edwards died, the configuration is finally getting its industrial chance — but the lost decades were a choice, not a technical inevitability.

Key Takeaway: The YB-49 wasn't aerodynamically wrong — it was electronically premature and politically inconvenient, and we burned 60 years of efficiency gains pretending otherwise.

ArXiv Paper Digest

HTMLCure: Turning Browser Experience into State Guided Repair for Interactive HTML

2026-05-27

Authors: Jiajun Wu, Jian Yang, Tuney Zheng, Wei Zhang

ArXiv: 2605.26807v1

If you've ever asked an LLM to "build me a webpage," you've probably noticed something strange: the page looks great in the initial screenshot, but the moment you actually try to use it — scroll, click a button, resize the window, play the little game it generated — something falls apart. A dropdown doesn't open. A modal traps you. The layout collapses on mobile. The page looked fine; it just didn't work fine.

That gap between "looks correct" and "behaves correctly" is the problem HTMLCure tackles. The authors point out that most evaluation pipelines for AI-generated HTML judge pages from a single screenshot, which means tons of interactive bugs slip through. Worse, when a page does fail the screenshot test, it often gets thrown out entirely — even though many of those pages are almost right and could be fixed with a small repair.

HTMLCure's idea is to evaluate and repair HTML the way a real user would experience it. Specifically, it:

Actually drives the browser. It loads the page across multiple viewports (desktop, mobile, etc.) and exercises it through realistic interactions — scrolling, hovering, clicking, resizing, even playing through gameplay states.
Records what happens at each step. Instead of just capturing one screenshot at the end, it logs deterministic browser traces: what the DOM looked like, what events fired, what visual state the page was in at each interaction.
Feeds those traces back as repair signals. When something breaks — say, a button visually appears but does nothing on click — the trace gives an LLM precise, state-grounded evidence of the failure, not just a vague "this page looks wrong."

The "state-guided" part is the key insight. A static screenshot tells you what a page is; a recorded interaction trace tells you what a page does and fails to do. By treating the browser itself as the source of truth — and turning real interaction history into structured feedback an LLM can act on — HTMLCure can fix pages that screenshot-based evaluators would have silently discarded.

This matters more than it might sound. As LLMs increasingly generate real applications instead of static mockups, the bottleneck isn't generation quality — it's verification under use. Tools that only score the first frame will keep approving pages that fail on the second click. HTMLCure points at a different evaluation paradigm: judge code by interacting with it, and use that interaction itself as the repair signal.

Why it matters: As AI-generated interfaces move from demos to deployment, evaluation has to shift from "does it render?" to "does it behave?" — and HTMLCure shows that browser interaction traces can do both the judging and the fixing.

Daily Automotive Engines

Oil Pump Relief Valve Bypass Routing: Where Does the Excess Oil Actually Go?

2026-05-27

We've covered relief valves before as the pressure-limiting safety device, but here's the question most people never ask: when that relief valve cracks open and dumps oil, where does it actually go? The answer separates good engine designs from oil-starvation grenades.

There are three common bypass routing strategies, and each has serious consequences for cold-start behavior, aeration, and pump efficiency.

Internal bypass (pump-to-inlet): Excess oil dumps back to the pump's suction side. Simple, compact, and used on most gerotor pumps. Problem: the oil recirculates through the pump repeatedly, picking up heat and shearing. On a cold start with 0W-20 at -20°F, you can see relief flow >50% of pump output, and that oil never cools.
External bypass (return-to-sump): Excess oil routed back to the oil pan via a dedicated passage, usually splashing onto the windage tray or pan baffle. Cooler oil, but the falling stream aerates the sump — a major cause of foamy oil on high-RPM engines. BMW's S65 V8 famously had aeration issues partly from this.
Two-stage bypass: Premium designs (Porsche, modern Ford Coyote) route bypass through a calmed passage that discharges below the oil level, eliminating aeration. Adds cost and complexity but preserves oil quality.

Real-world example: The LS-series small block uses internal bypass on its gerotor pump. During cold starts in winter, owners report momentary low-oil-pressure warnings — that's the relief valve fully open, oil short-circuiting the pump inlet, and pressure briefly spiking then dropping as viscosity falls. It's normal behavior, but it's also why oil pump shaft failures (the infamous LS pump shaft snap) tend to happen on cold mornings: the relief valve can't dump enough volume fast enough, and pressure spikes drive shaft torque past its limit.

Rule of thumb: Relief valve flow capacity should equal at least 40% of maximum pump displacement. If your pump moves 10 gpm at redline and the relief can only dump 3 gpm, you'll over-pressure the gallery and crack the filter housing or blow the front seal. This is why aftermarket "high-volume" pumps without matched relief valve upgrades can destroy stock oiling systems.

One subtle detail: bypass passage diameter matters as much as valve diameter. A relief valve that opens fully but discharges through a restrictive passage creates back-pressure that fights the spring, causing pressure oscillation and valve flutter. Engineers size the bypass passage at 1.5–2× the valve port area to ensure free flow.
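The two sizing rules above are simple enough to check numerically. The sketch below is only an illustration of that arithmetic: the function names and the 10 mm port diameter are made up for the example, and it assumes round ports and passages.

```go
package main

import (
	"fmt"
	"math"
)

// reliefAdequate applies the 40% rule of thumb quoted above: the relief valve
// should be able to pass at least 40% of the pump's maximum flow.
func reliefAdequate(pumpGPM, reliefGPM float64) bool {
	return reliefGPM >= 0.4*pumpGPM
}

// passageDiameterRange converts the 1.5x to 2x area guideline into a diameter
// range for a round bypass passage, given a round valve port diameter.
// Area scales with diameter squared, so diameter scales with the square root.
func passageDiameterRange(portDiameter float64) (dMin, dMax float64) {
	return portDiameter * math.Sqrt(1.5), portDiameter * math.Sqrt(2)
}

func main() {
	// The 10 gpm pump with a 3 gpm relief from the text.
	fmt.Println("relief adequate:", reliefAdequate(10, 3))

	// Hypothetical 10 mm valve port.
	lo, hi := passageDiameterRange(10)
	fmt.Printf("passage diameter: %.1f to %.1f mm\n", lo, hi)
}
```

Note that a 1.5x to 2x increase in area is only about a 1.22x to 1.41x increase in diameter, which is why a passage that looks only slightly larger than the valve port can still be adequate.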

See it in action: Check out 07-13 Chevy 2nd oil pressure relief @nnbsxisaac2280 #chevrolet #shorts by JD Coats to see this theory applied.

Key Takeaway: Where bypassed oil goes — back to pump inlet, splashing into the sump, or routed below oil level — determines aeration, cold-start behavior, and whether your high-volume pump upgrade kills your engine.

Daily Debugging Puzzle

Go's Nil Map Asymmetry: The Read That Smiles While the Write Panics

2026-05-27

This program loads a server config from JSON and adds a request-ID header before serving. The JSON file looks fine, the loader returns no error, even the first header lookup prints cleanly. Then the program crashes with a panic on a line that is doing the most ordinary thing imaginable.

type ServerConfig struct {
    Host    string            `json:"host"`
    Port    int               `json:"port"`
    Headers map[string]string `json:"headers"`
}

func loadConfig(path string) (*ServerConfig, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    cfg := &ServerConfig{}
    if err := json.Unmarshal(data, cfg); err != nil {
        return nil, err
    }
    return cfg, nil
}

func main() {
    // server.json contains: {"host": "api.example.com", "port": 443}
    cfg, err := loadConfig("server.json")
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("existing X-Auth:", cfg.Headers["X-Auth"]) // prints "", no panic
    cfg.Headers["X-Request-ID"] = newRequestID()           // panic here
    log.Fatal(http.ListenAndServe(":8080", handler(cfg)))
}

The Bug

The JSON file has no "headers" key. json.Unmarshal doesn't fabricate fields that aren't present — it leaves cfg.Headers at its zero value, which for a map is nil. So far, so reasonable.

The cruel part is Go's deliberately asymmetric behavior around nil maps:

Reading from a nil map is legal. m[k] returns the zero value of the value type. len(m) returns 0. for k, v := range m simply runs zero iterations. No panic, no warning.
Writing to a nil map panics with assignment to entry in nil map. There is no allocation-on-demand; the runtime needs a hash table to insert into, and there isn't one.

This asymmetry is what turns the bug from "obvious nil deref" into "ticking time bomb." The diagnostic line — fmt.Println(cfg.Headers["X-Auth"]) — that a careful programmer would write to confirm the map looks healthy will always succeed, regardless of whether the map is nil or empty. The map looks empty, behaves empty, and reports empty length. Only the first write reveals that "empty" and "nil" are not the same animal.

It's especially nasty because the panic site has no syntactic clue. cfg.Headers["X-Request-ID"] = ... is the kind of line that gets reviewed in two seconds. The nil-ness was introduced thousands of bytes earlier, in a deserialization step that didn't even know Headers existed.

The Fix

After unmarshaling, ensure any map fields you intend to write to are non-nil. Either initialize them eagerly in your loader, or give the type a UnmarshalJSON that does it:

func loadConfig(path string) (*ServerConfig, error) {
    data, err := os.ReadFile(path)
    if err != nil {
        return nil, err
    }
    cfg := &ServerConfig{}
    if err := json.Unmarshal(data, cfg); err != nil {
        return nil, err
    }
    if cfg.Headers == nil {
        cfg.Headers = make(map[string]string)
    }
    return cfg, nil
}

A defensive alternative at the call site is a small helper like setHeader(cfg, k, v) that lazily allocates. The standard library's maps.Insert does not help here, since it assigns into the destination map and panics just the same when that map is nil. Slices avoid this trap because append on a nil slice allocates, but maps have no equivalent: there is no mapAppend, only direct subscript assignment, and that requires a backing table.
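A minimal sketch of that helper, together with a probe that demonstrates the read/write asymmetry. The setHeader name comes from the text; its body, the writePanics probe, and the sample values are assumptions made for illustration, and the struct omits the JSON tags for brevity.

```go
package main

import "fmt"

// ServerConfig mirrors the struct from the puzzle (JSON tags omitted).
type ServerConfig struct {
	Host    string
	Port    int
	Headers map[string]string
}

// setHeader is a hypothetical helper: it allocates the map on first write,
// so callers never assign into a nil map.
func setHeader(cfg *ServerConfig, key, value string) {
	if cfg.Headers == nil {
		cfg.Headers = make(map[string]string)
	}
	cfg.Headers[key] = value
}

// writePanics reports whether a direct write to m panics.
func writePanics(m map[string]string) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	m["probe"] = "x"
	return false
}

func main() {
	cfg := &ServerConfig{Host: "api.example.com", Port: 443} // Headers is nil

	// Reads on the nil map succeed and look exactly like an empty map.
	fmt.Println(len(cfg.Headers), cfg.Headers["X-Auth"] == "")

	// A direct write panics; the probe recovers so the demo can continue.
	fmt.Println("direct write panics:", writePanics(cfg.Headers))

	// The helper allocates on demand.
	setHeader(cfg, "X-Request-ID", "abc123")
	fmt.Println(cfg.Headers["X-Request-ID"])
}
```

The helper keeps the nil check in one place, which is useful when a config struct is constructed in several ways and not every path goes through loadConfig.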

Key Takeaway: In Go, a nil map reads silently like an empty one but panics on every write — so always initialize map fields after deserialization, because absent JSON keys leave them nil.

Daily Digital Circuits

Built-In Self-Test (BIST): How Hardware Tests Its Own Memories and Logic at Power-Up

2026-05-27

Once a chip leaves the fab, you can't probe its internal SRAMs with a logic analyzer — they're buried under ten metal layers. So designers bake the tester into the silicon itself. Built-In Self-Test (BIST) is dedicated on-chip hardware that exercises memories and logic, compares results against expected values, and raises a pass/fail flag — all without external equipment.

Memory BIST (MBIST) is the most common flavor. A small finite state machine walks an algorithmic pattern across every cell of an embedded SRAM. The classic algorithm is March C-:

↕ Write 0 to all cells (any order)
↑ Read 0, write 1 (ascending addresses)
↑ Read 1, write 0 (ascending)
↓ Read 0, write 1 (descending)
↓ Read 1, write 0 (descending)
↕ Read 0

This catches stuck-at faults, transition faults, coupling faults between adjacent cells, and address decoder errors — the failure modes that real SRAMs actually exhibit. The whole sequence runs in 10N cycles where N is the number of addresses.

Rule of thumb: A 1 Mb SRAM (131,072 × 8) clocked at 500 MHz finishes March C- in 131,072 × 10 / 500e6 ≈ 2.6 ms. That's why boot delays on modern SoCs include a "BIST window" — the chip is literally testing its own RAMs before letting firmware touch them.
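To make the 10N figure concrete, here is a toy software model of March C-. It assumes the usual formulation in which each of the four middle elements does one read and one write per address. The Memory type and its StuckAt1 field are invented for the example; a real MBIST controller is a hardware state machine, not software.

```go
package main

import "fmt"

// Memory is a toy 1-bit-wide SRAM model. StuckAt1 optionally names one cell
// that always reads 1, standing in for a stuck-at fault (-1 means no fault).
type Memory struct {
	cells    []uint8
	StuckAt1 int
}

func NewMemory(n int) *Memory {
	return &Memory{cells: make([]uint8, n), StuckAt1: -1}
}

func (m *Memory) Write(addr int, v uint8) { m.cells[addr] = v }

func (m *Memory) Read(addr int) uint8 {
	if addr == m.StuckAt1 {
		return 1
	}
	return m.cells[addr]
}

// MarchCMinus runs the six March C- elements. It returns whether every read
// matched its expected value, plus the total number of read/write operations.
func MarchCMinus(m *Memory) (pass bool, ops int) {
	n := len(m.cells)
	pass = true

	// readWrite checks for expect, then writes next: one step at one address.
	readWrite := func(addr int, expect, next uint8) {
		if m.Read(addr) != expect {
			pass = false
		}
		m.Write(addr, next)
		ops += 2
	}

	for a := 0; a < n; a++ { // element 1: write 0 (any order)
		m.Write(a, 0)
		ops++
	}
	for a := 0; a < n; a++ { // element 2: ascending, read 0 then write 1
		readWrite(a, 0, 1)
	}
	for a := 0; a < n; a++ { // element 3: ascending, read 1 then write 0
		readWrite(a, 1, 0)
	}
	for a := n - 1; a >= 0; a-- { // element 4: descending, read 0 then write 1
		readWrite(a, 0, 1)
	}
	for a := n - 1; a >= 0; a-- { // element 5: descending, read 1 then write 0
		readWrite(a, 1, 0)
	}
	for a := 0; a < n; a++ { // element 6: read 0 (any order)
		if m.Read(a) != 0 {
			pass = false
		}
		ops++
	}
	return pass, ops
}

func main() {
	good := NewMemory(1024)
	pass, ops := MarchCMinus(good)
	fmt.Println("fault-free:", pass, "ops:", ops)

	bad := NewMemory(1024)
	bad.StuckAt1 = 37
	pass, _ = MarchCMinus(bad)
	fmt.Println("with stuck-at-1 cell:", pass)
}
```

The operation count is 1 + 2 + 2 + 2 + 2 + 1 = 10 per address, which is where the 10N cycle estimate comes from when each operation takes one clock.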

Logic BIST (LBIST) takes a different tack: an on-chip LFSR generates pseudo-random patterns, scans them into the design's flip-flops, captures the response, and compresses it into a MISR (Multiple Input Signature Register). The final signature is compared against a golden value computed during simulation. One 32-bit signature represents millions of cycles of behavior — if even one flop misbehaved, the signature mismatches.
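The pattern-generate, capture, compress loop can be sketched in software. Everything below is an illustrative assumption, not any vendor's implementation: the 16-bit LFSR taps, the use of the CRC-32 polynomial for MISR feedback, the made-up logicUnderTest function, and the single-cycle bit flip used as a fault model.

```go
package main

import "fmt"

// lfsrNext advances a 16-bit Fibonacci LFSR with taps at bits 16, 14, 13, 11
// (a commonly cited maximal-length configuration).
func lfsrNext(s uint16) uint16 {
	bit := (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
	return (s >> 1) | (bit << 15)
}

// misrStep shifts the signature with polynomial feedback and folds in one
// response word. The CRC-32 polynomial is used here purely as an example.
func misrStep(sig, resp uint32) uint32 {
	const poly = 0x04C11DB7
	msb := sig >> 31
	sig <<= 1
	if msb == 1 {
		sig ^= poly
	}
	return sig ^ resp
}

// logicUnderTest is a made-up combinational block standing in for real logic.
func logicUnderTest(p uint16) uint32 {
	return uint32(p&(p>>3)) | uint32(p^(p<<1))<<16
}

// runLBIST applies cycles pseudo-random patterns and returns the signature.
// If upsetCycle >= 0, one response bit is flipped on that cycle to model a
// single misbehaving flop.
func runLBIST(cycles, upsetCycle int) uint32 {
	state := uint16(0xACE1) // any nonzero seed works
	var sig uint32
	for c := 0; c < cycles; c++ {
		resp := logicUnderTest(state)
		if c == upsetCycle {
			resp ^= 1 << 7
		}
		sig = misrStep(sig, resp)
		state = lfsrNext(state)
	}
	return sig
}

func main() {
	golden := runLBIST(100000, -1)
	faulty := runLBIST(100000, 4242)
	fmt.Printf("golden=%08x faulty=%08x match=%v\n", golden, faulty, golden == faulty)
}
```

In this model a single flipped bit always changes the final signature, because the feedback shift is an invertible linear map and a nonzero difference can never shift back to zero. With many erroneous cycles, differences can in principle cancel (aliasing), at a rate of roughly one in 2^32 for a 32-bit register.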

Real-world example: Automotive MCUs under ISO 26262 functional safety must demonstrate fault detection during every key-on event. An NXP S32K344 runs both MBIST and LBIST during the ~50 ms boot window. If any block fails, the chip refuses to release the CPU from reset — better a dead dashboard than a runaway throttle. The same hardware also runs periodic online BIST during idle slots, scrubbing for faults that develop while the car is driving.

The cost is area: MBIST controllers add roughly 2–4% to SRAM area, and full-coverage scan chains for LBIST add ~5–10% to logic area. Compared to the cost of shipping a defective chip — or worse, a chip that fails in the field — it's a bargain.

See it in action: Check out Software Based Self Test of Embedded Processors by ProjetsECE to see this theory applied.

Key Takeaway: BIST hardware turns the chip into its own tester, running algorithmic memory patterns and pseudo-random logic stimuli at power-up so silicon faults are caught before software ever runs.

Daily Electrical Circuits

Current Mirrors: The Workhorse of Analog IC Bias Distribution

2026-05-27

Inside nearly every op-amp, comparator, and analog IC you've ever used, dozens of current mirrors are quietly copying a single reference current to bias differential pairs, output stages, and gain nodes. Understanding them unlocks how analog ICs actually work — and lets you build precision bias networks on a breadboard with matched transistor pairs.

The basic BJT mirror uses two matched transistors (Q1 and Q2) with their bases tied together and emitters grounded. Q1 is diode-connected (base shorted to collector), forcing it into the active region where its V_BE sets the shared base voltage. Since Q2 sees the same V_BE, it conducts the same collector current — assuming matched devices and equal temperatures.

The reference current is set by a resistor from V_CC to Q1's collector:

I_REF = (V_CC − V_BE) / R

For V_CC = 5 V, V_BE ≈ 0.65 V, R = 4.3 kΩ → I_REF ≈ 1 mA. Q2 then sinks 1 mA from whatever load you connect to its collector (a PNP mirror referenced to the positive rail would source it instead), regardless of load voltage, as long as Q2 stays out of saturation (V_CE > ~0.2 V).

Real-world non-idealities matter:

Base current error: Both transistors steal base current from I_REF, so I_OUT = I_REF × β/(β+2). For β=100, that's a 2% error. Fix with a Wilson or cascode mirror.
Early effect: Q2's output current rises slightly as V_CE2 increases, giving finite output impedance (~50–200 kΩ). A cascode mirror boosts this to many megohms.
Thermal mismatch: A 2 mV V_BE difference causes ~8% current error. Use a matched pair like the BCM847 dual NPN or the venerable CA3046 transistor array — same die, same temperature.
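The numbers quoted for the basic mirror can be reproduced with a few lines of arithmetic. This sketch assumes a thermal voltage of about 25.85 mV (roughly 300 K) and the ideal exponential I_C versus V_BE relationship; the function names are made up for the example.

```go
package main

import (
	"fmt"
	"math"
)

// refCurrent returns I_REF = (V_CC - V_BE) / R in amperes.
func refCurrent(vcc, vbe, r float64) float64 { return (vcc - vbe) / r }

// baseCurrentRatio returns I_OUT / I_REF = beta / (beta + 2) for the basic
// two-transistor mirror.
func baseCurrentRatio(beta float64) float64 { return beta / (beta + 2) }

// vbeMismatchError returns the fractional current error exp(dVBE/VT) - 1
// caused by a V_BE offset between the two transistors.
func vbeMismatchError(dVBE, vt float64) float64 { return math.Exp(dVBE/vt) - 1 }

func main() {
	const vt = 0.02585 // thermal voltage near 300 K, in volts (assumed)

	fmt.Printf("I_REF = %.3f mA\n", refCurrent(5, 0.65, 4300)*1e3)
	fmt.Printf("base-current error = %.2f %%\n", (1-baseCurrentRatio(100))*100)
	fmt.Printf("2 mV mismatch error = %.1f %%\n", vbeMismatchError(0.002, vt)*100)
}
```

The mismatch term is exponential in the V_BE difference, which is why a couple of millivolts matters far more than the base-current error at typical β values.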

Practical example: Biasing a differential pair tail current. You want 200 µA flowing through the diff pair's emitters, independent of supply variation. A simple resistor wouldn't do — its current would shift with V_EE. A current mirror referenced to a stable voltage (a bandgap, say) holds the tail current rock-steady. This is exactly how the LM741's 19 µA tail current is set internally.

The Widlar variant adds a small emitter resistor to Q2, letting you generate microamp currents from a milliamp reference without using a 5 MΩ resistor: I_OUT × R_E = V_T × ln(I_REF/I_OUT). With I_REF=1 mA, R_E=5 kΩ → I_OUT ≈ 20 µA. Beautiful logarithmic compression in one resistor.
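The Widlar equation has I_OUT on both sides, so it has to be solved numerically. One simple way, shown here as a sketch, is bisection; the solver name is made up, and V_T ≈ 25.85 mV is an assumed room-temperature value.

```go
package main

import (
	"fmt"
	"math"
)

// widlarOutput solves I_OUT*R_E = V_T*ln(I_REF/I_OUT) for I_OUT by bisection.
// The left side rises and the right side falls as I_OUT grows, so there is
// exactly one root between 0 and I_REF.
func widlarOutput(iRef, rE, vT float64) float64 {
	lo, hi := iRef*1e-9, iRef
	for i := 0; i < 100; i++ {
		mid := (lo + hi) / 2
		if mid*rE-vT*math.Log(iRef/mid) > 0 {
			hi = mid // resistor drop too large: root is at a lower current
		} else {
			lo = mid
		}
	}
	return (lo + hi) / 2
}

func main() {
	iOut := widlarOutput(1e-3, 5e3, 0.02585) // 1 mA reference, 5 kΩ emitter resistor
	fmt.Printf("I_OUT = %.1f uA\n", iOut*1e6)
}
```

For the values in the text this lands at about 20 µA, in line with the figure quoted above.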

MOSFET mirrors work identically but with no base current error — perfect for CMOS ICs, though they suffer worse V_GS matching than BJTs.

Key Takeaway: A current mirror copies one well-defined reference current to many loads using matched transistors, forming the bias backbone of every analog IC ever made.

Daily Engineering Lesson

Ground Loops: Why "Ground" Isn't Always Zero Volts

2026-05-27

In schematics, ground is a single triangle symbol — a reference point assumed to be 0V everywhere. In real systems, ground is copper wire with resistance, and current flowing through that resistance creates voltage differences between points that are all supposedly "ground." When two pieces of equipment share a ground connection through multiple paths, you create a ground loop: a closed conductive loop that picks up magnetic interference and carries stray currents you never intended.

How they form: Imagine a sensor mounted on a machine, wired back to a PLC in a control cabinet 30 meters away. The sensor's signal ground connects to the PLC's analog ground via the cable shield. But both the machine and the cabinet are bonded to building ground through their power cords. Now there are two paths between the sensor ground and PLC ground: through the signal cable, and through the building's safety ground. That loop is an antenna.

Why it matters: A 60 Hz magnetic field from a nearby motor passes through the loop and induces a circulating current (Faraday's law). Even a few millivolts of difference between "grounds" gets injected into your signal path. On a 4-20 mA loop you might not notice. On a thermocouple reading microvolts it swamps the measurement, and on audio equipment you hear it as a 60 Hz hum.

Rule of thumb: Copper wire is roughly 1.7 mΩ per foot for 12 AWG. If 10 A of fault or return current flows through 50 feet of ground conductor, the voltage drop is 10 × 50 × 0.0017 ≈ 0.85 V. That's not "zero" — and any sensitive instrument referencing the far end sees that 0.85 V as offset.
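The same arithmetic as a tiny helper, swept over a few currents to show how quickly "ground" drifts away from zero. The function name and the sweep values are illustrative; the 1.7 mΩ per foot figure is the rounded number from the rule of thumb.

```go
package main

import "fmt"

// groundDrop returns the voltage developed along a ground conductor:
// current (A) x length (ft) x resistance per foot (ohm/ft).
func groundDrop(amps, feet, ohmsPerFoot float64) float64 {
	return amps * feet * ohmsPerFoot
}

func main() {
	const awg12 = 0.0017 // ohm per foot, the rounded figure used in the text
	for _, amps := range []float64{0.1, 1, 10} {
		fmt.Printf("%5.1f A over 50 ft -> %.4f V\n", amps, groundDrop(amps, 50, awg12))
	}
}
```

Even the 0.1 A case produces several millivolts, which is already large compared with a thermocouple signal.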

Real-world example: Recording studios and broadcast facilities are obsessive about this. A guitarist plugs an amplifier into one wall outlet and a pedalboard into another on a different circuit. The two outlets' ground references differ by a few hundred millivolts at 60 Hz. The signal cable between them completes the loop, and the speakers buzz audibly. Fix: plug everything into one outlet strip (single-point ground), or use an isolation transformer on the signal line.

Common mitigations:

Single-point grounding (star topology): all grounds converge at one node, no loops possible
Differential or current-loop signaling (RS-485, balanced audio, 4-20 mA): receivers ignore common-mode ground offsets
Optical isolation: optocouplers or fiber break the conductive path entirely
Isolation transformers: for analog signals or AC power
Ground the shield at one end only: common practice for instrumentation cable

See it in action: Check out SHORTS - WHY WE BOND (Neutral & Ground) Explained in 3 Minutes by Electrician U to see this theory applied.

Key Takeaway: Ground is a wire, not a magical zero-volt plane — and any loop you form between two "ground" points will pick up noise and carry currents that corrupt your signals.

Forgotten Books

The Victorian Spermicide: Quinine Pessaries and 8 Million Satisfied Customers

2026-05-27

Book: Dr. Foote's home cyclopedia of popular medical, social and sexual science by Edward B. Foote (1902)

Read it: Internet Archive

Tucked into the back-matter advertisements of Dr. Foote's 1902 medical encyclopedia is a remarkable artifact of Victorian reproductive technology: a discreet pitch from A. Lambert & Co. of 16 Dalston Lane, London, for "VIMULE Soluble Tablets," sold as a contraceptive pessary. What is startling is not that such a product existed, but the chemistry behind it — and the casual confidence with which it was marketed.

"Their active property is a special preparation of Quinine, which has been proved, under careful microscopic examination and experience of many years to be the most perfect of any for instantly destroying the generative nature of the seminal fluid."

The ad goes further, boasting commercial scale that would impress a modern e-commerce founder:

"For a considerable time we have recommended these to numbers of ladies in delicate health, and although we have sold upwards of eight millions 8,000,000 we have not had a single complaint of failure."

Dr. Edward Bliss Foote (1829–1906) was a New York physician and one of the most prolific popular-medical writers of the 19th century — a freethinker who championed birth control, dress reform, and frank sexual education at a time when the Comstock laws made distributing such information a federal crime. His Home Cyclopedia was a doorstop compendium that mixed practical recipes with surprisingly progressive social commentary, and its commercial appendices reveal an entire shadow economy of "marital appliances" sold by mail order.

Was the quinine claim real? Surprisingly, yes — partially. Quinine is a genuine spermicide. In 1885, the British pharmacist Walter Rendell introduced quinine-based pessaries that became the standard chemical contraceptive in Britain for roughly fifty years, used widely until the 1930s. Laboratory studies confirmed that quinine sulfate at sufficient concentrations does immobilize spermatozoa. Marie Stopes, the founding figure of British family planning, recommended quinine pessaries in her clinics well into the 1920s.

The catch: quinine's spermicidal effect is weak and unreliable compared to modern nonoxynol-9, and the "8 million sold, zero failures" claim is obvious advertising puffery. Real-world failure rates were likely 20–30% per year — better than nothing, but the "absolutely reliable and unfailing" pitch contributed to many unplanned pregnancies. Quinine pessaries were finally displaced by the rubber diaphragm and later by hormonal contraception.

The truly forgotten detail is the marketing positioning:

"Can be used by the wife without the knowledge of her husband" — covert female-controlled contraception, decades before "the pill" became shorthand for the same idea.
Sold in "sealed metal boxes... perfectly free from observation" — the discreet brown-paper-package model still used by sexual wellness brands today.
Distributed by a "Manufacturer of Surgical Instruments" that also sold trusses, syringes, and "Every description of Appliance for the Prevention of Conception" — a Victorian Planned Parenthood storefront hiding inside a medical-supply shop.

A century before Plan B and modern femtech apps, a London surgical-supply firm was already selling discreet, female-controlled fertility products by mail — with overconfident efficacy claims that would make a modern FTC regulator wince.

The forgotten claim: Quinine pessaries were the dominant chemical contraceptive of the late Victorian era — real spermicide chemistry hidden inside aggressive, unreliable marketing.

Forgotten Darkroom

The Rule-of-Thumb Workman vs. the Intelligent Investigator: A 1902 Plea for Scientific Literacy in the Trades

2026-05-27

Book: A manual of photoengraving : containing practical instructions for producing photoengraved plates in relief-line and half-tone by Jenkins, H. (Harry), 1868- (1902)

Read it: Internet Archive

Buried in the preface of a 1902 trade manual for photoengravers — a workshop guide for men who etched zinc plates with acid to reproduce illustrations for newspapers and magazines — Harry Jenkins makes a quietly radical argument about what it means to be a skilled tradesman. The book itself is a practical recipe book: how to coat a plate, how to mix the bichromate sensitizer, how to expose a halftone screen. But before the formulas, Jenkins pauses to deliver a small sermon:

The importance of a study of the scientific laws upon which the practical work is based can not be too strongly emphasized. It is the possession of this knowledge that makes the difference between the intelligent investigator and the "rule-of-thumb" workman, and the student is urged to give ample attention to these fundamental principles.

This is a striking framing for 1902. The photoengraving trade was, at the time, the digital photography of its day — a young, fast-moving technology that had only existed commercially for about two decades. Most practitioners learned by apprenticeship, copying the master's gestures without understanding why the chemistry worked. A "rule-of-thumb" workman could produce a perfectly serviceable plate on a good day, but when the bath went off, or the weather shifted, or a new pigment behaved oddly, he had no theory to fall back on.

Jenkins is arguing for what we'd now call first-principles thinking. He wants his readers to understand the chemistry of silver halide reduction, the optics of the halftone screen, the acid-resist behavior of hardened gelatin — not because the theory is beautiful, but because recipes break and principles don't.

What's remarkable is how thoroughly modern society has re-learned this lesson and then forgotten it again. The contemporary parallel is almost too obvious: the developer who can wire up a React component by pattern-matching against Stack Overflow but cannot debug when the framework misbehaves; the data scientist who runs model.fit() without understanding gradient descent; the home cook who follows recipes flawlessly but cannot rescue a broken sauce. Jenkins would recognize all of them as "rule-of-thumb workmen."

The forgotten wisdom here isn't a technique or a recipe — it's an attitude toward craft. In an era when YouTube tutorials and LLM-generated code make the rule-of-thumb path easier than ever, Jenkins's 1902 preface reads like a warning shot across 124 years. He believed that the difference between a competent tradesman and a master wasn't dexterity or experience, but the willingness to learn the underlying science when the trade school told you that you didn't need to.

The photoengraving industry that Jenkins wrote for was obliterated by offset lithography within fifty years. But the practitioners who understood the chemistry, men like Frederic E. Ives, who co-wrote Jenkins's color-work chapters, went on to invent the next generation of imaging. The rule-of-thumb workmen simply lost their jobs.

The forgotten claim: Mastery in any technical trade comes not from memorizing procedures but from understanding the scientific principles beneath them — because recipes fail and principles don't.

Forgotten Patent

Herman Hollerith's Electric Tabulating Machine: The 1889 Patent That Invented the Data Center — and Became IBM

2026-05-27

The 1880 U.S. Census took eight years to tabulate 50 million people. By the time the results were published, they were already obsolete. The Census Bureau feared the 1890 count (projected at 63 million) might take longer than the decade itself. A young engineer named Herman Hollerith, working at the Bureau and watching clerks hand-tally tick marks for years, decided to mechanize the problem.

His solution was filed as U.S. Patent 395,782, "Art of Compiling Statistics," granted January 8, 1889. The mechanism: encode each person's data — age, sex, race, occupation, marital status — as a pattern of holes punched into a stiff card the size of an 1887 dollar bill. To tabulate, a clerk placed the card under a press of spring-loaded pins. Where a hole existed, the pin dropped through into a mercury-filled cup beneath, completing an electrical circuit that advanced a counter dial by one. To sort, the same circuit triggered a solenoid that flipped open the lid of a corresponding sorting bin.

It was, in essence, a parallel query engine. Set the wiring to "count cards where column-12 = female AND column-22 = widowed" and the machine performed what we'd now call a SELECT COUNT(*) WHERE filter — at roughly 80 cards per minute, against millions of records.

The 1890 Census finished in six weeks for the initial population count, and the full statistical analysis in two years. Hollerith's machines saved the government an estimated $5 million (about $170 million today). Governments worldwide lined up: Russia, Austria, Canada, France, Norway. In 1896 Hollerith founded the Tabulating Machine Company. In 1911 it merged with three others. In 1924 the conglomerate renamed itself International Business Machines — IBM.

What's startling is how many modern concepts the patent already contained:

Structured records with fixed fields — each column position meant something specific. This is a database schema.
Boolean query composition — by wiring multiple sensing pins in series (AND) or parallel (OR), an operator built compound predicates. This is SQL's WHERE clause as copper and mercury.
Indexed sorting — the sorter bins are a one-pass radix sort, the same algorithm modern databases use for billion-row joins.
Separation of storage and compute — cards (data) were physically distinct from the tabulator (processor). Swap the cards, run a different query. This is the architectural premise of every data warehouse from Teradata to Snowflake.
Batch processing — load a tray, return later for results. The mainframe job queue, born 70 years before mainframes.
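To make the database analogy concrete, here is a minimal Python sketch of the tabulator and sorter. The card layout (a dict mapping column number to the punched value), the sample data, and the function names are assumptions made for illustration; only the "column-12 = female AND column-22 = widowed" query comes from the text above.

```python
from collections import defaultdict

# Hypothetical cards: column number -> value punched in that column.
cards = [
    {12: "female", 22: "widowed"},
    {12: "female", 22: "married"},
    {12: "male", 22: "widowed"},
    {12: "female", 22: "widowed"},
]


def tabulate(cards, wiring):
    """Count cards where every wired column matches (pins in series = AND)."""
    return sum(
        all(card.get(col) == value for col, value in wiring.items())
        for card in cards
    )


def sort_into_bins(cards, column):
    """One sorter pass: drop each card into the bin for its value in `column`."""
    bins = defaultdict(list)
    for card in cards:
        bins[card.get(column)].append(card)
    return dict(bins)


female_widowed = tabulate(cards, {12: "female", 22: "widowed"})
bins_by_marital_status = sort_into_bins(cards, 22)
```

Swapping the `wiring` dict is the software equivalent of re-plugging the machine: the cards stay the same and only the query changes.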

The punched card itself outlived Hollerith by half a century. FORTRAN, COBOL, and the entire 1960s programming workflow ran on 80-column cards descended directly from his 1889 design. The IBM 80-column card was standardized in 1928 — and the column count survives in your terminal emulator's default width today.

Could it be built better now? It already was: every analytical database — BigQuery, Redshift, ClickHouse, DuckDB — is Hollerith's machine with the mercury cups replaced by SSDs and the sorting bins replaced by hash partitions. The conceptual leap was his. The semiconductors are an implementation detail.

Key Takeaway: Hollerith's 1889 patent didn't just mechanize counting — it invented the structured record, the indexed query, and the separation of data from processor, founding both IBM and the entire architecture of modern analytical databases.

Daily GitHub Zero Stars

polygentic/anti-fun

2026-05-27

Language: Unknown

Link: https://github.com/polygentic/anti-fun

Out of today's batch of zero-star repos, polygentic/anti-fun stands out purely on the strength of its name. In a sea of "blank-app," "dotfiles2," and randomly-named placeholder projects, a repo called anti-fun under an org called polygentic demands a second look. There's no description, no language detected, and no topics — but the naming convention suggests someone thinking carefully about what they're building rather than reaching for a generic scaffold.

The polygentic organization name itself is interesting wordplay — riffing on "polyglot" or "polygenic," it hints at a project space dealing with multi-agent systems, multi-language tooling, or perhaps genetic/evolutionary approaches. "Anti-fun" as a project name has the deadpan quality of names like boring-company or no-bullshit — projects that lean into being deliberately unglamorous about something the maintainer takes seriously.

Without a README or code yet visible, this repo is essentially a name claim — an early stake in a concept. That's actually one of the more interesting moments to discover a project, because:

You can watch an idea take shape from commit zero
The earliest contributors often have outsized influence on direction
The naming choices often reveal the maintainer's taste and worldview before any code does

Who might find it useful: Developers who enjoy following projects from inception, anyone curious about polyagent/polyglot systems, and people who appreciate a project name that doesn't try to sound enterprise-friendly. Star it now and you'll be employee #1 if it turns into something — or you'll have a fun bookmark to revisit in six months when you've forgotten what "anti-fun" was supposed to mean.

Sometimes the gems are the ones you have to wait on.

Why check it out: A deliberately-named project from an intriguingly-named org caught at commit zero — perfect for developers who like getting in on the ground floor of an idea before it has a README.

Daily Hardware Architecture

The Branch Predictor's Capacity Pressure: Why More Branches Means Worse Prediction

2026-05-27

Every branch predictor we've discussed (TAGE, perceptron, tournament) assumes it has enough table entries to remember each branch's behavior. In reality, predictor tables are small (typically 4–32 KB), and a large program can keep thousands to tens of thousands of distinct static branches live in its working set. When branches outnumber entries, they alias: two unrelated branches hash to the same predictor slot and corrupt each other's history. This is called predictor capacity pressure, and it's the silent killer of prediction accuracy in large codebases.

The math is brutal. A typical L1 BTB holds ~4K entries. A modern Pattern History Table holds ~16K 2-bit counters. If your hot loop spans 8K static branches (common in interpreters, JITs, and big switch-heavy code like protobuf decoders), half your branches collide. Aliasing patterns can be constructive (both branches go the same way — no harm) or destructive (they disagree, and the counter oscillates uselessly).
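The collision effect is easy to reproduce in a toy model. The sketch below is a simulation under simplifying assumptions, not a model of any real CPU: every branch always goes the same direction, the "hash" is just PC modulo table size, and branches execute round-robin. The function name and parameters are made up for illustration.

```python
import random


def mispredict_rate(num_branches, table_size, rounds=50, seed=1):
    """Simulate a table of 2-bit saturating counters indexed by pc % table_size."""
    rng = random.Random(seed)
    # Assumption: each static branch is perfectly biased (always or never taken).
    taken = [rng.random() < 0.5 for _ in range(num_branches)]
    counters = [1] * table_size  # start every counter at "weakly not taken"
    misses = total = 0
    for _ in range(rounds):
        for pc in range(num_branches):
            slot = pc % table_size
            predicted_taken = counters[slot] >= 2
            if predicted_taken != taken[pc]:
                misses += 1
            # Train the shared counter toward this branch's actual outcome.
            if taken[pc]:
                counters[slot] = min(3, counters[slot] + 1)
            else:
                counters[slot] = max(0, counters[slot] - 1)
            total += 1
    return misses / total


fits = mispredict_rate(num_branches=128, table_size=256)
aliased = mispredict_rate(num_branches=512, table_size=256)
```

With fewer branches than slots, the only misses are warm-up. Once two branches share each slot, the pairs that disagree keep dragging the counter back and forth, which is the destructive aliasing described above.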

Real-world example: switch-based bytecode interpreters. A classic interpreter loop dispatches every bytecode through one giant switch, which compiles to a single indirect branch. That one branch sees a near-random stream of targets, so a BTB that remembers one target per branch site mispredicts it constantly. The fix is threaded dispatch: each bytecode handler ends with its own indirect jump to the next handler. Now the BTB sees one branch site per handler, each with its own history of "what bytecode usually follows me." CPython adopted this via computed gotos, and its source comments credit it with up to a 15–20% speedup depending on compiler and CPU. Predictors that fold global history into indirect-target prediction narrow the gap on newer cores, so measure before assuming the win.

Rule of thumb: If your working set of taken branches exceeds the BTB size, expect 2–5× more mispredicts. Estimate it: count unique branch PCs executed in your hot path. If it's over ~4000, you're aliasing. perf stat -e branch-misses,branches tells you the rate; perf record -e branch-misses tells you where they cluster.

Modern CPUs fight back with several techniques:

Hierarchical BTBs: Intel's Golden Cove has L1 BTB (~1K), L2 BTB (~12K). L1 hits in 1 cycle; L2 hits cost a bubble but cover more branches.
Tagged predictors (TAGE): Use partial tags so colliding branches detect the collision and fall back to a base predictor instead of corrupting each other.
PC-folding hash functions: XOR multiple PC bits to spread branches more uniformly across tables.

The practical lesson for programmers: code size matters for prediction accuracy, not just for icache. Macro-expanded code, unrolled loops past a point, and bloated template instantiations create branches that compete for predictor entries. Sometimes a smaller, "slower" loop predicts better and runs faster overall.

Key Takeaway: Branch predictors have fixed capacity — when your code has more branches than table entries, they collide and corrupt each other's history, making smaller code paths often predict (and run) faster than larger ones.

Hacker News Deep Cuts

An Intensive Introduction to Cryptography

2026-05-27

Link: https://intensecrypto.org/public/index.html

HN Discussion: 3 points, 0 comments

Free, high-quality cryptography textbooks are rare. Most introductory material splits into two unsatisfying camps: cookbook-style guides that teach you which library function to call without explaining why, and graduate-level treatises that demand measure theory before page ten. Boaz Barak's An Intensive Introduction to Cryptography — the lecture notes from his Harvard course — sits in the genuinely useful middle.

Based on the URL and the title, this is the full online edition of Barak's textbook-in-progress, freely readable at intensecrypto.org. Barak is one of the more thoughtful expositors in theoretical CS, and his approach to cryptography reflects that: he treats crypto as a branch of computational complexity rather than a bag of tricks. The "intensive" in the title isn't marketing — it means the book derives security properties from first principles, with proofs.

What a technical reader is likely to find here:

Definitional rigor. Most engineers know that AES is "secure" but couldn't precisely state what that means. Barak's notes walk through IND-CPA, IND-CCA, and the reduction-based proofs that connect them to hardness assumptions.
Modern coverage. Beyond the symmetric/asymmetric basics, the course historically covers zero-knowledge proofs, fully homomorphic encryption, lattice-based crypto, and post-quantum constructions — the actual frontier, not just the textbook 1990s pipeline.
Pseudorandomness as the unifying concept. Barak's framing — that almost all of cryptography reduces to constructing pseudorandom objects from minimal assumptions — is genuinely clarifying once it clicks.
Free and maintained. No paywall, no DRM, no $120 hardback. The notes get revised as the field evolves.

For practitioners, the value isn't that you'll write your own AES (you shouldn't). It's that after working through material like this, you stop treating crypto primitives as magic. You can read a protocol spec, identify the security game it's playing, and notice when something is off — when a "nonce" is being reused, when a MAC is missing, when a construction is leaking through a side channel. That intuition is exactly what separates engineers who quietly ship subtly broken systems from those who don't.
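A toy version of the IND-CPA game shows why nonce reuse is fatal. This is an illustrative sketch, not taken from Barak's notes: the stream cipher here is a throwaway construction (SHA-256 in counter mode) chosen only so the example is self-contained, and every function name is hypothetical. Don't use it for real encryption.

```python
import hashlib
import secrets


def keystream(key, nonce, length):
    """Toy keystream: SHA-256(key || nonce || counter), concatenated."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def make_oracle(reuse_nonce):
    """Return an encryption oracle with a hidden key."""
    key = secrets.token_bytes(32)
    fixed_nonce = secrets.token_bytes(16)

    def encrypt(message):
        nonce = fixed_nonce if reuse_nonce else secrets.token_bytes(16)
        stream = keystream(key, nonce, len(message))
        return nonce + bytes(m ^ s for m, s in zip(message, stream))

    return encrypt


def play_once(reuse_nonce):
    """One round of the game: the adversary wins if it guesses the hidden bit."""
    encrypt = make_oracle(reuse_nonce)
    m0, m1 = b"attack at dawn!!", b"retreat at dusk!"  # equal lengths
    hidden_bit = secrets.randbelow(2)
    challenge = encrypt(m1 if hidden_bit else m0)
    # Adversary: ask the oracle for Enc(m0) and compare with the challenge.
    guess = 0 if encrypt(m0) == challenge else 1
    return guess == hidden_bit


def win_rate(reuse_nonce, trials=400):
    return sum(play_once(reuse_nonce) for _ in range(trials)) / trials
```

With a reused nonce the scheme is deterministic, so the adversary wins every round. With fresh nonces the comparison tells it nothing and it is reduced to a coin flip, which is what the definition demands.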

At three points and zero comments, this is precisely the kind of submission HN tends to upvote enthusiastically when someone notices it — a free, high-signal educational resource from a credible author. It just landed at the wrong moment in the queue.

Why it deserves more upvotes: A rigorous, free, modern cryptography textbook from a top theoretical CS expositor is exactly the evergreen reference HN exists to surface.

HN Jobs Teardown

Khan Academy: What Their Hiring Reveals

2026-05-27

Source: HN Who is Hiring

Posted by: dangoor

Of the ten postings, Khan Academy's is the most strategically revealing because it contains a quiet confession most companies bury: "Our site has been built on Python 2 and Google App Engine for its first 10 years of exis[tence]." That single clause is the entire teardown.

The stack tells a survival story. Python 2 reached end-of-life on January 1, 2020 — roughly three months before this posting. Khan Academy is openly hiring senior backend/fullstack engineers to do what is almost certainly a multi-year migration off a deprecated runtime and off classic Google App Engine (a platform Google itself has been steering customers away from in favor of GAE Standard 2nd gen, Cloud Run, and GKE). The fact that they lead with this in the posting — rather than hiding it — is a recruiting tactic: they're filtering for engineers who get excited about brownfield migrations on systems serving millions of students.

What it reveals about the company's stage. Khan Academy is 10+ years old, non-profit, and load-bearing for global education — and the timing (March/April 2020, school closures worldwide) means traffic just exploded. They explicitly mention being focused on "helping teachers working in their classrooms (and a lot of teachers and parents are using Khan away from their classrooms right now!)." So they're scaling demand and re-platforming and doing it as a non-profit. That's a brutal trifecta.

Skills/trends highlighted:

Legacy migration expertise is back in demand — the industry is paying off a decade of GAE/Python 2 technical debt simultaneously.
Remote USA/Canada hiring from a Mountain View company in March 2020 — early signal of the pandemic-driven remote-default shift that would reshape SF Bay hiring within months.
Mission-driven hiring as compensation strategy — non-profits compete with FAANG salaries by selling impact, and "free education for anyone, anywhere" during a global school shutdown is the strongest version of that pitch they'll ever have.

Green flags: radical honesty about the stack, clear mission, remote-friendly, senior-level roles (no junior burnout fodder for the migration).

Red flags: a 10-year Python 2 / GAE codebase means significant coupling to deprecated APIs (ndb, the old taskqueue, webapp2). Whoever joins is signing up for years of 2to3, datastore migrations, and arguing about whether to go Cloud Run vs. GKE. The posting doesn't mention what target stack they've chosen — which suggests they may not have decided yet.

The signal: The Python 2 sunset plus COVID-driven traffic spike is forcing a generation of education-tech companies to re-platform under load — and they're using mission, not money, to recruit the senior engineers who can do it.

Daily Low-Level Programming

TLB Shootdowns: Why Unmapping Memory on One Core Stalls All the Others

2026-05-27

Every core caches virtual-to-physical translations in its own private TLB. When one core modifies a page table — via munmap, mprotect, madvise(MADV_DONTNEED), swap-out, or COW — every other core that might have cached that translation now holds a stale entry. The hardware does not coherently invalidate TLBs across cores. The OS must do it in software, and the mechanism is called a TLB shootdown.

The sequence on x86 Linux:

Initiating core updates the PTE and flushes its own TLB with INVLPG.
Kernel walks the mm_cpumask to find which cores may still hold TLB entries for this address space.
It sends each of them an inter-processor interrupt (IPI) — vector CALL_FUNCTION_VECTOR or a dedicated TLB vector.
Each target core takes the interrupt, runs flush_tlb_func, executes INVLPG (or a full MOV CR3 for batch flushes), and ACKs.
Initiating core spins until all ACKs arrive, then returns to userspace.

The cost: an IPI round-trip is ~1–3 µs on modern Xeons. With 64 cores all running threads of the same process, a single munmap can stall the originator for tens of microseconds and steal cycles from every other core. At scale this dominates.

Real-world example: A JVM with 200 threads on a 96-core box calling System.gc(). The collector unmaps reclaimed regions; each munmap fires shootdowns to all 96 cores. Production traces have shown 40% of "GC pause time" being TLB shootdown IPIs, not actual collection work. Same pattern hits Go's scavenger releasing memory back to the OS.

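Rule of thumb: As a pessimistic budget, a shootdown costs ~1.5 µs × (active cores in mm_cpumask). At 64 cores that's ~100 µs per unmap in the worst case; IPIs are sent in parallel, so measured stalls are often lower. If you're unmapping in a hot loop, batch the operations into fewer, larger calls. MADV_FREE instead of MADV_DONTNEED can also help because it defers reclaim and avoids re-faulting pages you reuse, but it still has to update page-table entries, so check on your kernel whether it actually reduces shootdowns.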

Mitigations the kernel already does: coalescing multiple invalidations into a single full flush (MOV CR3) when the per-page list exceeds tlb_single_page_flush_ceiling (default 33 pages), and PCID tags so a MOV CR3 doesn't blow away unrelated entries. ARM is better off here — TLBI instructions broadcast over the interconnect, no IPI needed — but the stall on the interconnect is real too.
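The rule of thumb and the flush ceiling can be turned into a back-of-envelope calculator. This is a sketch of the arithmetic in the text, not of kernel code: the 1.5 µs figure and the 33-page ceiling come from above, the function names are made up, and treating the cost as strictly linear in core count is a worst-case assumption.

```python
def shootdown_cost_us(active_cores, ipi_round_trip_us=1.5):
    """Pessimistic stall estimate for one shootdown, in microseconds."""
    return active_cores * ipi_round_trip_us


def flush_strategy(pages, ceiling=33):
    """Mirror the ceiling rule: past the threshold, flush everything."""
    return "full_flush" if pages > ceiling else "per_page_invlpg"


def unmap_loop_cost_us(regions, active_cores, batched):
    """Compare unmapping `regions` one call at a time against one batched call."""
    calls = 1 if batched else regions
    return calls * shootdown_cost_us(active_cores)


one_unmap = shootdown_cost_us(64)
hot_loop = unmap_loop_cost_us(1000, 64, batched=False)
batched = unmap_loop_cost_us(1000, 64, batched=True)
```

Under these assumptions, a thousand separate unmaps on a 64-core process cost about 96 ms of stall, while the batched version pays for a single shootdown.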

Key Takeaway: Unmapping memory isn't a local operation — it's a synchronous, all-cores IPI storm whose cost scales linearly with how many cores share your address space.

RFC Deep Dive

RFC 2577: FTP Security Considerations

2026-05-27

RFC: RFC 2577

Published: 1999

Authors: M. Allman, S. Ostermann

By 1999, FTP was well over a quarter-century old (the original spec, RFC 114, dates to 1971), and the wider world had finally noticed that it was riddled with attack surface. Rather than try to fix FTP itself (a Sisyphean task given how much deployed software depended on its exact quirks), Mark Allman and Shawn Ostermann wrote RFC 2577 as an operator's playbook: a catalogue of the known ways FTP could be abused, with concrete mitigations. It is one of the more honest documents the IETF has published, essentially saying "this protocol is full of holes, here is how to live with them."

The headline attack is the FTP bounce attack. FTP uses two channels: a control connection (port 21) and a separate data connection. In active mode, the client tells the server where to deliver data via the PORT command, which specifies an arbitrary IP address and port. Nothing in the original spec required that address to be the client's own. So an attacker could connect to ftp.victim.com, issue PORT pointing at internal.target.com:25, and have the FTP server happily open a TCP connection to a host behind a firewall, using the FTP server as a proxy. Worse, attackers could stage arbitrary bytes on the FTP server (uploaded as a file), then bounce them at an SMTP port to send forged mail, or scan internal networks one port at a time. RFC 2577 is Informational, so it recommends rather than mandates: servers should refuse data connections to low-numbered ports, and may refuse PORT commands whose IP doesn't match the control connection's peer. The latter is the default nearly every FTP daemon ships today.

The document then enumerates a parade of other issues:

Privileged ports: servers should refuse data connections to well-known ports (<1024), which blocks bounces into services such as SMTP, telnet, or rsh.
Port stealing: because PASV data ports are often allocated sequentially, an attacker can predict the next port and connect first, hijacking or denying the legitimate transfer. The fix: randomize port selection.
Brute-force passwords: FTP transmits credentials in cleartext and historically had no rate-limiting. RFC 2577 recommends delays after failed logins, capped attempt counts, and disconnecting after N failures — basic hygiene that wasn't universal in 1999.
Username enumeration via USER replies: servers should return the same 331 reply to USER whether the username exists or not, and reject the login only after PASS.
Anonymous FTP pitfalls: writable directories enable warez staging; ~ftp permissions are easy to misconfigure so that files outside the public area become readable.
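The bounce and port-stealing mitigations above fit in a few lines. This is a hedged sketch of the checks, not code from any real FTP daemon: the function names are hypothetical, and the policy shown (peer must match, no ports below 1024, random passive port) is the strict combination most servers default to.

```python
import ipaddress
import secrets


def parse_port_argument(argument):
    """Parse 'h1,h2,h3,h4,p1,p2' from a PORT command into (host, port)."""
    fields = [int(part) for part in argument.split(",")]
    if len(fields) != 6 or any(f < 0 or f > 255 for f in fields):
        raise ValueError("malformed PORT argument")
    host = ".".join(str(f) for f in fields[:4])
    port = fields[4] * 256 + fields[5]
    return host, port


def port_command_allowed(argument, control_peer_ip):
    """Reject PORT targets that enable bounce attacks."""
    try:
        host, port = parse_port_argument(argument)
    except ValueError:
        return False
    # Bounce protection: the data target must be the control connection's peer.
    if ipaddress.ip_address(host) != ipaddress.ip_address(control_peer_ip):
        return False
    # Never connect out to well-known service ports.
    if port < 1024:
        return False
    return True


def pick_pasv_port(low, high):
    """Port-stealing protection: choose the passive port unpredictably."""
    return low + secrets.randbelow(high - low + 1)
```

For example, `port_command_allowed("10,0,0,5,0,25", "203.0.113.9")` is rejected twice over: the address is not the client's, and port 25 is SMTP.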

What makes RFC 2577 quietly important is that it's still the canonical reference for FTP hardening 27 years later. Modern FTP servers — vsftpd, ProFTPD, pure-ftpd — implement essentially this checklist. If you've ever wondered why vsftpd.conf has knobs like port_promiscuous=NO or pasv_min_port/pasv_max_port, you're looking at RFC 2577 enforcement.

The broader lesson is architectural. FTP's sins flow from one design decision: using a separate data channel whose endpoint is negotiated in-band over the control channel. That choice makes NAT traversal painful (every NAT needs an FTP ALG to rewrite PORT commands), makes TLS retrofit awkward (FTPS exists, but the dual-channel TLS dance is famously fiddly), and creates the bounce attack as a near-direct consequence. SFTP — which is just file operations over SSH — sidesteps every one of these problems by using a single authenticated channel. RFC 2577 is, in a sense, the document that finally made the case for replacing FTP, even as it taught us how to keep it limping along.

Why it matters: Every FTP-server hardening default you take for granted — bounce-attack prevention, port randomization, login throttling — is straight out of this 1999 operator's playbook for a protocol the IETF had quietly given up on redesigning.

Stack Overflow Unanswered

Shared memory between two elf files

2026-05-27

Stack Overflow: View Question

Tags: linker, ld, elf, bare-metal

Score: 1 | Views: 99

The asker has two bare-metal ELF images that will be co-resident in physical memory and wants them to share a region containing both data and code. Their instinct is to drop my_object_file.o(.data) into a fixed-address section (.my_section 0x10000) in both linker scripts and hope the resulting symbols line up.

Why it's tricky: two independent link invocations are exactly that — independent. Even with the same object file and the same address, the linker is free to:

Order symbols differently within the section (especially if other inputs contribute).
Apply different relocations because the surrounding code references symbols at different addresses.
Inline, fold (ICF), or garbage-collect (--gc-sections) entries that look unused from one ELF's perspective but not the other's.
Generate PLT/GOT stubs in one image and not the other.

So "include the same .o in the same fixed section" is not sufficient on its own. You need the addresses, not the inputs, to be the contract between the two ELFs.

A more robust approach:

Single source of truth for layout. Build the shared region as its own ELF (or a flat binary) linked once at 0x10000. Then either generate a symbol file from it (for example by converting nm output into linker-script assignments) or let the linker read the symbols straight from the ELF with --just-symbols.
Consume symbols, not sources. In each of the two application linker scripts, pull in those addresses via INCLUDE shared.syms (a file of name = address; assignments) or pass -Wl,--just-symbols=shared.elf. Now both ELFs reference identical absolute addresses without re-linking the shared code.
Alternatively, hardcode the contract: declare each shared symbol with PROVIDE(my_func = 0x10010); in both scripts. Ugly, but explicit.
Mark the section NOLOAD in one of the ELFs if only the other should actually deposit bytes there — otherwise both loaders will write the region and the second write wins.
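One way to produce the shared.syms file is to convert nm output into linker-script assignments. The helper below is a sketch under assumptions: it expects the default three-column nm format (address, type letter, name), keeps only global defined symbols, and the sample symbol names other than my_func are invented for illustration.

```python
def nm_to_linker_script(nm_output):
    """Turn `nm` output into `name = 0xADDR;` lines for a linker script."""
    assignments = []
    for raw_line in nm_output.splitlines():
        parts = raw_line.split()
        if len(parts) != 3:
            continue  # undefined symbols have no address column
        address, kind, name = parts
        # Uppercase type letters mark global symbols; skip locals and undefined.
        if not kind.isupper() or kind == "U":
            continue
        assignments.append(f"{name} = 0x{int(address, 16):08x};")
    return "\n".join(assignments)


# Hypothetical output of `nm shared.elf` for a region linked at 0x10000.
sample_nm = """\
00010000 D shared_counter
00010010 T my_func
00010040 t local_helper
         U external_thing
"""

script_fragment = nm_to_linker_script(sample_nm)
```

Writing `script_fragment` to shared.syms gives both application links the same absolute addresses, and the file can be regenerated whenever the shared image is rebuilt.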

Gotchas:

Code sharing is harder than data. Shared functions must be position-independent or linked at the exact address they'll execute from. Any internal calls/branches resolve at link time, so both ELFs must agree on those targets — which is exactly what --just-symbols from a pre-linked shared image gives you.
BSS-style zero-init in a shared region is dangerous: whichever ELF's startup runs second will clobber state the first one set up.
Cache coherency and MPU/MMU regions matter on real silicon — the shared window typically needs to be mapped as shareable (or uncached) on both cores/contexts.
Don't trust --gc-sections here; mark the shared section with KEEP(...).

The challenge: Two independent linker invocations can't be coerced into producing matching symbol addresses by accident — the shared layout must be linked once and then imported as addresses into both ELFs.

Daily Software Engineering

The Merkle Tree Pattern: Comparing Massive Datasets Without Sending Them Over the Wire

2026-05-27

You have two replicas, each holding 500 GB of data. One has drifted out of sync. How do you find the differences without shipping the full 500 GB across the network? You build a Merkle tree: a tree where every leaf is a hash of a data block, and every internal node is a hash of its children's hashes. Compare roots: if they match, the datasets are identical. If they differ, walk down only the branches that disagree.

The magic is the asymmetry: equality is proven with a single hash comparison, but inequality is localized in O(log n) steps. You only transfer the actual data for the blocks that truly differ.

How it's built:

Partition data into fixed-size chunks (e.g., 4 KB blocks, or key ranges in a database).
Hash each chunk → these become leaf nodes.
Concatenate sibling hashes and hash again → parent nodes.
Repeat until you have a single root hash.
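The four steps above, plus the walk-down comparison, can be sketched in Python. This is a minimal illustration rather than how any particular database implements it: it assumes both sides have the same number of chunks, uses SHA-256, and duplicates the last hash on odd-sized levels. Both trees live in one process here; in a real system each level's comparison would be a network exchange.

```python
import hashlib


def sha(data):
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return a list of levels: levels[0] is leaf hashes, levels[-1] is [root]."""
    level = [sha(chunk) for chunk in chunks]
    levels = [level]
    while len(level) > 1:
        # Pad odd levels by repeating the last hash so every node has two children.
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [sha(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def differing_chunks(tree_a, tree_b):
    """Walk down from the root, following only subtrees whose hashes disagree."""
    if tree_a[-1] == tree_b[-1]:
        return []  # roots match: datasets are identical
    suspects = [0]
    for depth in range(len(tree_a) - 1, 0, -1):
        children_a, children_b = tree_a[depth - 1], tree_b[depth - 1]
        next_suspects = []
        for node in suspects:
            for child in (2 * node, 2 * node + 1):
                if child < len(children_a) and children_a[child] != children_b[child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


replica_a = [bytes([i]) * 4 for i in range(10)]
replica_b = list(replica_a)
replica_b[7] = b"drift"
changed = differing_chunks(build_tree(replica_a), build_tree(replica_b))
```

Only the chunks listed in `changed` need to cross the wire; everything else is proven identical by hashes alone.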

Real-world example: Cassandra and Amazon's Dynamo (as described in the 2007 Dynamo paper) use Merkle trees for anti-entropy repair. When two replicas need to reconcile, they exchange root hashes first. If equal, done: zero data transferred. If not, they walk the tree, exchanging only the hashes of disagreeing subtrees at each level. For a 1 TB table partitioned into 1 million 1 MB chunks, finding the one corrupted chunk means descending ~20 levels (log₂ 1,000,000 ≈ 20), a few dozen hash comparisons, instead of streaming the whole table.

Git uses the same idea — every commit is a Merkle tree of file hashes. That's why git fetch can determine what objects you're missing with a handful of hash exchanges. Bitcoin uses it so light clients can verify a transaction is in a block without downloading the block.

The bandwidth rule of thumb: for n chunks with d differences, naive sync transfers all n chunks; Merkle sync transfers on the order of d × log₂(n/d) hashes plus the d differing chunks. When d is small relative to n, the savings are enormous. When d approaches n, you're better off just shipping everything.

Watch out for:

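Chunk boundaries matter. Inserting one byte at the start shifts every fixed-size block, making everything look different. Content-defined chunking, where a rolling hash picks the boundaries (as in backup tools such as restic and borg), handles this; rsync's rolling checksum attacks the same shifting problem from a different angle.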
Tree depth vs. chunk size tradeoff. Smaller chunks = more precise diffs but bigger trees. Tune for your access patterns.
Build cost is O(n). You're hashing everything. Cache the tree and update incrementally on writes.

Key Takeaway: Merkle trees turn "are these two datasets equal?" from a linear-bandwidth question into a logarithmic one — invest in the hash tree once and reconciliation gets dramatically cheaper forever after.

Tool Nobody Knows

bbe: sed for Binary Files (When Hex-Editor-Through-a-Pipe Just Won't Do)

2026-05-27

Every few years someone hands me a corrupt firmware blob, a leaked savefile, or a multi-gigabyte mmap'd database with one wrong byte at offset 0x4A2E, and asks me to "just sed it." I used to mumble something about xxd | sed | xxd -r and quietly hate my life. Then in 2005 I discovered bbe — the Binary Block Editor — and never looked back.

bbe is sed's binary-aware cousin. It understands the concept of a block (fixed-length, delimiter-bounded, or pattern-anchored), and runs a tiny command language against each block. It handles NUL bytes natively, takes hex escapes everywhere a pattern is expected, and streams gigabytes without buffering the whole file. On Debian/Ubuntu it's a one-liner: apt install bbe.

The trivial case — substitute a binary pattern:

$ bbe -e 's/\x89PNG\r\n\x1a\n/\x89BAD\r\n\x1a\n/' good.png > broken.png

Try doing that with sed: the pattern spans two newlines, and sed is line-oriented, so you end up fighting the pattern space before you match a single byte. GNU sed tolerates NUL bytes, but POSIX makes no promises about binary input at all.

Patch a single byte at an exact offset (the firmware-modder's bread and butter):

# Replace the byte at offset 0x4A2E (18990 decimal) with 0xAA, leave everything else alone
$ bbe -e 'r 18990 \xAA' firmware.bin > patched.bin

With no -b, bbe treats the whole input as a single block, so the offset is simply the file offset. r OFFSET STRING overwrites bytes starting at that position in the block. There's also i (insert), d (delete bytes), j (skip the first N bytes of a block), y/abc/xyz/ (transliterate), and w FILE to write a block out to disk.

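Operate per fixed-size record, e.g. drop the last four bytes (a padding field, say) from every 512-byte record. bbe's patterns are literal byte strings rather than regular expressions, so position-based commands like d are the way to target the end of a record:

$ bbe -b ':512' -e 'd 508 4' records.bin > trimmed.bin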

Block defined by delimiters: extract every JPEG embedded inside a memory dump or PCAP, one file per match:

$ bbe -b '/\xFF\xD8\xFF/:/\xFF\xD9/' -e 'w jpeg_%B.jpg' -o /dev/null memdump.bin
# %B expands to the block number, so each match lands in its own numbered file

That one's the killer use case for me. Carving files out of unknown binary containers usually means firing up a 600 MB forensics suite. Three flags of bbe and you're done.

Translate bytes wholesale, useful for trivial obfuscation or remapping byte values in fixed-width records (y maps byte values one-for-one; it cannot reorder bytes, so it won't fix endianness):

$ bbe -e 'y/\x00\xFF/\xFF\x00/' image.raw > inverted.raw

Chain multiple commands on each block like sed:

$ bbe -b '/HEADER/:/FOOTER/' \
       -e 'd 0 8; s/\xDE\xAD\xBE\xEF/\xCA\xFE\xBA\xBE/; r 16 \x01\x02' \
       blob.dat

Why not just use a Python script with open(..., 'rb') and .replace()? Because the obvious read().replace() pulls the whole file into memory, and your "quick fix" for the 40 GB VM image becomes a 40 GB allocation. bbe streams. It's also the right call inside shell pipelines where dropping into a real language feels like overkill: dd if=/dev/sdb | bbe -e 's/\xDE\xAD/\xBE\xEF/' | dd of=/dev/sdc just works.
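For comparison, here is roughly what a streaming replace has to do if you write it yourself. This is an illustrative sketch, not bbe's implementation: the function names and chunk size are arbitrary, and the interesting part is the tail buffer that catches matches straddling a chunk boundary.

```python
import io


def stream_replace(src, dst, pattern, replacement, chunk_size=65536):
    """Replace every occurrence of `pattern` while holding only ~one chunk in memory."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    keep = len(pattern) - 1  # longest possible prefix of a match cut off by a chunk edge
    buffered = b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        buffered += chunk
        output = bytearray()
        position = 0
        while True:
            found = buffered.find(pattern, position)
            if found == -1:
                break
            output += buffered[position:found] + replacement
            position = found + len(pattern)
        # Hold back the tail: it may be the start of a match completed by the next chunk.
        tail_start = max(position, len(buffered) - keep)
        output += buffered[position:tail_start]
        dst.write(output)
        buffered = buffered[tail_start:]
    dst.write(buffered)  # whatever is left cannot contain a full match


def replace_in_memory(data, pattern, replacement, chunk_size=4):
    """Convenience wrapper for trying the streaming logic on small inputs."""
    src, dst = io.BytesIO(data), io.BytesIO()
    stream_replace(src, dst, pattern, replacement, chunk_size)
    return dst.getvalue()
```

It works, but it is thirty lines of boundary bookkeeping for something bbe expresses in one s/// command.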

The man page is short. Read it once and you'll keep bbe in your toolbox forever — right next to xxd and the dog-eared printout of ASCII codes you keep meaning to throw out.

Key Takeaway: When you need to grep-and-replace inside binary data without round-tripping through hex, bbe gives you sed's ergonomics with native support for NUL bytes, hex escapes, fixed-size blocks, and pattern-bounded extraction — perfect for firmware patches, file carving, and surgical edits on multi-gigabyte streams.

What If Engineering

What If We Pumped Liquid Nitrogen Through Skyscraper Beams to Make Them Stronger?

2026-05-27

Steel gets stronger when it gets colder. This isn't a gimmick — it's a well-documented property of body-centered cubic metals. A36 structural steel at room temperature has a yield strength around 250 MPa. Cool it to liquid nitrogen temperatures (77 K, or −196 °C) and that climbs to roughly 450–500 MPa. Austenitic stainless steels do even better: 304 SS jumps from ~215 MPa yield at room temp to over 700 MPa at 77 K, and its ultimate tensile strength can exceed 1500 MPa. So what if we ran cryogenic plumbing through skyscraper columns to double their load capacity?

The seductive math. A typical 50-story tower has core columns sized for ~30 MN axial loads. If yield strength doubles, you could halve the steel cross-section, or keep the section and double the building height. For a 200,000 m² high-rise using ~15,000 tonnes of structural steel at ~$1,200/tonne, halving steel saves ~$9 million in material alone. Tempting.

Now the heat leak problem. Concrete and air around the column are at ~293 K. A bare steel column at 77 K with a surface area of, say, 400 m² per floor would conduct heat ferociously. With even a high-performance multilayer vacuum insulation (MLI) jacket — say k_eff ≈ 0.0005 W/m·K at 5 cm thickness — heat leak is:

Q = (k × A × ΔT) / t
Q = (0.0005 × 400 × 216) / 0.05 ≈ 864 W per floor

Over 50 floors: ~43 kW of continuous cooling load. Liquid nitrogen's latent heat of vaporization is 199 kJ/kg, so you'd boil off:

43,000 W ÷ 199,000 J/kg ≈ 0.22 kg/s ≈ 19 tonnes per day

At industrial LN₂ prices (~$0.15/kg delivered), that's ~$2,800/day, or $1 million per year just to keep the columns cold. Recondensing on-site with cryocoolers? A Stirling cryocooler at 77 K reaches maybe 15% of Carnot efficiency; ideal Carnot COP at that ΔT is 77/(293−77) = 0.36, so real COP ≈ 0.05. You'd need ~860 kW of electrical input continuously. Goodbye, savings.
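The whole thermal budget fits in a short script, which makes it easy to vary the assumptions. All inputs are the figures quoted above; the function and key names are just for illustration. Because the script carries unrounded values through, its cryocooler figure comes out slightly below the rounded one in the text.

```python
def cryo_budget(floors=50):
    """Recompute the heat-leak, boil-off, and cryocooler numbers from the text."""
    k_eff = 0.0005            # W/(m*K), effective conductivity of the insulation
    area_per_floor = 400.0    # m^2 of cold surface per floor
    t_warm, t_cold = 293.0, 77.0  # K
    thickness = 0.05          # m of insulation
    latent_heat = 199_000.0   # J/kg, LN2 heat of vaporization
    ln2_price = 0.15          # $/kg delivered
    fraction_of_carnot = 0.15

    heat_leak_per_floor_w = k_eff * area_per_floor * (t_warm - t_cold) / thickness
    total_heat_leak_w = heat_leak_per_floor_w * floors
    boiloff_kg_per_s = total_heat_leak_w / latent_heat
    boiloff_tonnes_per_day = boiloff_kg_per_s * 86_400 / 1000
    cost_per_year = boiloff_tonnes_per_day * 1000 * ln2_price * 365

    carnot_cop = t_cold / (t_warm - t_cold)
    real_cop = carnot_cop * fraction_of_carnot
    cryocooler_power_w = total_heat_leak_w / real_cop

    return {
        "heat_leak_per_floor_w": heat_leak_per_floor_w,
        "total_heat_leak_w": total_heat_leak_w,
        "boiloff_tonnes_per_day": boiloff_tonnes_per_day,
        "cost_per_year": cost_per_year,
        "real_cop": real_cop,
        "cryocooler_power_w": cryocooler_power_w,
    }


budget = cryo_budget()
```

Doubling the insulation thickness halves every downstream number, which is a quick way to see how sensitive the conclusion is to that one optimistic k_eff.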

The structural nightmares. Thermal contraction of steel from 293 K to 77 K is about 0.3%. A 200 m column shrinks by 60 cm. Each 4 m story shortens by about 12 mm, so every beam-to-column connection becomes a thermal expansion joint, and the offsets accumulate toward the full 60 cm at the roof. Differential contraction between cooled columns and warm floor slabs would tear conventional moment connections apart. You'd need sliding bearings everywhere, sacrificing the lateral stiffness that makes a moment frame work in the first place. Worse, carbon steels like A36 are far below their ductile-to-brittle transition at 77 K: the added yield strength comes with a collapse in fracture toughness.

And the human factor. A cryogenic leak in an occupied building displaces oxygen: LN₂ expands roughly 700× when it vaporizes. A single 200-liter spill releases 140 m³ of nitrogen gas, enough to asphyxiate occupants in a stairwell. This is not hypothetical: a liquid nitrogen leak at a poultry plant in Gainesville, Georgia, killed six workers in 2021.

Where it almost makes sense. Cryogenic strengthening is real and used: LNG storage tanks exploit the 9% nickel steel's improved toughness at −162 °C, and some aerospace tankage uses cold-strengthened austenitics. But these are incidentally cold — the cryogen is the cargo, not a structural service fluid.

For buildings, the energy-in-perpetuity to maintain strength dwarfs the one-time material savings within ~9 years, and that's before insurance underwriters laugh you out of the room.

Key Takeaway: Cryogenic steel really is twice as strong, but the parasitic refrigeration load and thermal-contraction joint problems convert a $9M material saving into a $1M/year operating bleed plus a building that tries to tear itself apart every winter.

Wikipedia Rabbit Hole

Cryogenic deflashing

2026-05-27

Wikipedia: Read the full article

Pick up almost any small rubber or plastic part — the grommet on your headphones, an O-ring in your faucet, a tiny gear inside a medical device — and run your fingernail along its seam. Feel that? Probably nothing. But moments after that part popped out of its mold, there was a thin, ragged frill of excess material clinging to the parting line where the two mold halves met. That whisker of waste is called flash, and getting rid of it is one of manufacturing's quietly annoying problems. Trim it by hand and you're slow and inconsistent. Trim it by machine and you risk scarring the part. So the industry came up with something wonderfully strange: freeze the part until the flash becomes more brittle than glass, then sandblast it with frozen plastic pellets.

Welcome to cryogenic deflashing. The process exploits a property called the glass transition temperature: the point at which a polymer stops behaving like a chewy solid and starts behaving like a shatterable ceramic. Rubber and silicone have glass transitions well below freezing, typically somewhere between about −70 °C and −125 °C, while stiffer plastics such as nylon are already glassy at room temperature and simply turn more brittle as they cool. Liquid nitrogen, boiling at −196 °C, takes parts comfortably past that threshold.

Here's the clever bit: the flash is thinner than the part itself, so it cools faster and turns brittle sooner. When you blast the chilled parts with media (usually tiny polycarbonate pellets, since you don't want to use something harder than the workpiece), the flash snaps off cleanly while the bulk of the part, still slightly warmer in its core, absorbs the impact without damage. The temperature differential becomes a kind of selective scalpel.
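The "thin cools first" effect can be sketched with a lumped-capacitance model, in which a slab cooled on both faces has a time constant proportional to its thickness. The material properties, gas temperature, and heat-transfer coefficient below are illustrative assumptions, and the lumped model is crude for the thicker wall, whose core actually lags further behind.

```python
import math


def time_to_reach(t_target, thickness_m, t_start=20.0, t_gas=-150.0,
                  h=50.0, rho=1100.0, cp=1900.0):
    """Seconds for a slab cooled on both faces to reach t_target (deg C).

    Lumped model: tau = rho * cp * thickness / (2 * h).
    Default values are placeholder assumptions, not data for a real compound.
    """
    tau = rho * cp * thickness_m / (2 * h)
    return -tau * math.log((t_target - t_gas) / (t_start - t_gas))


flash = time_to_reach(-100.0, 0.0001)  # 0.1 mm flash
body = time_to_reach(-100.0, 0.005)    # 5 mm part wall
print(f"flash: {flash:.1f} s, body: {body:.1f} s, ratio: {body / flash:.0f}x")
```

In this model the time ratio equals the thickness ratio, which is the window the process exploits: blast after the flash has gone brittle but before the body has.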

A few details that make this even cooler:

The tumbling baskets often rotate inside the nitrogen vapor, ensuring uniform exposure and preventing parts from clumping.
Polycarbonate media is preferred because it itself becomes brittle at cryogenic temperatures and gradually pulverizes — so it self-cleans and won't embed in the parts.
The technique handles parts that would be impossible to deflash mechanically, like rubber bellows with internal seams or tiny medical components with intricate geometry.

If you've heard of the Liberty ships cracking in cold seas, or watched someone smash a rose dipped in liquid nitrogen at a science demo, you've seen the same principle: ductile materials become brittle when cold enough. Cryogenic deflashing is essentially that party trick weaponized into a precision industrial process, running 24/7 in factories that make everything from automotive seals to pacemaker housings.

It connects to a broader family of cryogenic manufacturing tricks — cryogenic machining (where coolant is liquid nitrogen, extending tool life dramatically), cryogenic grinding of spices (so volatile oils don't evaporate from the heat of milling), and cryogenic recycling of tires (where the frozen rubber shatters away from the steel belts).

Down the rabbit hole: Nearly every rubber seal in your car was likely blasted with plastic pellets in a liquid-nitrogen-chilled chamber, and that's why you've never noticed a seam on one.

Daily YT Documentary

Building a Film Festival Without Gatekeeping | Not Film Fest at NAB 2026

2026-05-27

Channel: CFA Institute (1310 subscribers)

Of the candidates in today's batch — mostly trailers, awards montages, and hashtag-laden promo clips — this interview stands out as the only one that actually teaches something about how the independent film world works behind the curtain.

Filmed at NAB Show 2026, the League of Filmmakers sits down with producer Gio Labadessa, founder of the Not Film Fest, to dig into a question most festival circuits avoid: why are so many indie filmmakers locked out by submission fees, opaque curation, and personal connections? Labadessa walks through his deliberate effort to build a festival without those traditional gatekeeping mechanisms — covering programming philosophy, how he handles selection without the usual industry politics, and the economics of running an accessible festival.

For anyone curious about the festival pipeline — submitting filmmakers, programmers, or viewers wondering why certain films keep appearing while others vanish — this is a candid look at the structural problems and one founder's concrete attempt to fix them. It's a conversation, not a sizzle reel, which makes it the rare watchable item in a list otherwise dominated by trailers and award show footage.

Note: the channel attribution "CFA Institute" on a film festival video is unusual and may be a metadata mismatch in the source listing.

Why watch: An honest look at how indie film festivals gatekeep — and a working blueprint for running one that doesn't.

Daily YT Electronics

How to Make a Transformerless Power Supply #engineering #electrical #gkfacts #currenttransformer

2026-05-27

Channel: ASHORT3210 (516 subscribers)

Note: this batch was thin — most candidates were PC build vlogs, hashtag-spam shorts, or clickbait. This one at least covers a real circuit concept worth understanding.

A transformerless power supply (more precisely, a capacitive dropper) is a clever and compact way to step mains AC down to a low DC voltage without the bulk and cost of an iron-core transformer. The trick: an X-rated safety capacitor in series with the live line acts as a nearly lossless reactive impedance, dropping most of the mains voltage and limiting the current without burning the difference off as heat. A bridge rectifier and smoothing cap then produce DC, with a Zener clamping the output.
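How much current the series capacitor lets through follows from its reactance, Xc = 1/(2πfC). The sketch below is a first-order estimate that ignores the small drop across the rectifier and Zener. The 100 nF, 230 V, and 50 Hz values are illustrative assumptions, not figures from the video.

```python
import math


def reactance_ohms(capacitance_f, mains_hz):
    """Capacitive reactance Xc = 1 / (2 * pi * f * C)."""
    return 1.0 / (2 * math.pi * mains_hz * capacitance_f)


def available_current_a(v_rms, capacitance_f, mains_hz):
    """Rough RMS current, assuming nearly all the mains voltage drops across C."""
    return v_rms / reactance_ohms(capacitance_f, mains_hz)


xc = reactance_ohms(100e-9, 50)
i = available_current_a(230, 100e-9, 50)
print(f"Xc = {xc / 1000:.1f} kOhm, I = {i * 1000:.1f} mA")
```

The result is a handful of milliamps, which matches the low-power uses this topology is limited to. The arithmetic does nothing to change the safety picture: the output remains tied to mains.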

It's the kind of circuit you'll find inside cheap LED bulbs, mains-powered clocks, and tiny appliance controllers — anywhere you need a few milliamps at 5–12 V and can't justify a transformer or switcher. Understanding why this works (and why it's dangerous) is genuinely useful: the output is not isolated from mains, so every node is live relative to earth. Touching it can kill you.

Worth watching for the circuit concept, but treat it as theory — not a "build this on a breadboard" project. If you want isolated low-voltage power, use a proper transformer or a USB brick.

Why watch: A quick explainer of the capacitive-dropper topology found inside cheap mains-powered electronics — useful engineering knowledge, with the important caveat that the output is not isolated from mains.

Daily YT Engineering

Whiteboard Lesson: The Fluid Dynamics of Why Your Room Won't Cool

2026-05-27

Channel: Engineering In Sight (772 subscribers)

This is a rare find: a whiteboard lesson that takes a problem nearly everyone has experienced — one room in the house that refuses to cool no matter how hard the AC works — and uses it as the entry point into the actual fluid dynamics of forced-air HVAC systems. The framing is genuinely educational rather than gimmicky.

Expect a walk through fan curves (the relationship between static pressure and volumetric flow rate that every blower obeys), system curves (how duct geometry, length, and fittings impose resistance), and the operating point where those two curves intersect. That intersection is what actually determines CFM delivered to a register — not the fan's rated capacity on the box.
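The operating-point idea is easy to sketch numerically. The fan curve below is a made-up parabola and the system curve is the usual pressure-proportional-to-flow-squared form. All coefficients are hypothetical, and SI units (Pa, m³/s) are used instead of inches of water and CFM.

```python
def fan_pressure(q, p_shutoff=250.0, q_free=0.6):
    """Hypothetical fan curve: static pressure (Pa) available at flow q (m^3/s)."""
    return p_shutoff * (1 - (q / q_free) ** 2)


def system_pressure(q, k):
    """Duct system curve: pressure drop grows with the square of flow."""
    return k * q ** 2


def operating_point(k, lo=0.0, hi=0.6, tol=1e-9):
    """Bisect for the flow where the fan curve crosses the system curve."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if fan_pressure(mid) > system_pressure(mid, k):
            lo = mid   # fan still has pressure to spare, so flow can rise
        else:
            hi = mid
    return (lo + hi) / 2


clean = operating_point(k=1500.0)        # reasonable duct run
restricted = operating_point(k=3000.0)   # same fan, twice the duct resistance
print(f"{clean:.3f} m^3/s vs {restricted:.3f} m^3/s")
```

Doubling the duct resistance cuts delivered flow without touching the fan at all, which is the room-that-won't-cool scenario in miniature.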

From there, the lesson connects to real-world failure modes: undersized return ducts starving the blower, long flex runs with too many 90-degree bends spiking pressure drop, and why simply "turning up the fan" doesn't linearly increase flow once you're climbing the steep part of the system curve. It is the kind of foundational HVAC engineering content normally buried in textbooks or ASHRAE handbooks, presented at a whiteboard pace.

At 772 subscribers, this channel is exactly the under-the-radar engineering teaching that deserves attention.

Why watch: A practical, whiteboard-paced explanation of fan curves and duct system resistance that turns a relatable household frustration into a real lesson in applied fluid dynamics.

Daily YT Maker

RaspberryPi5 Smart Kiosk Build Complete

2026-05-27

Channel: thatssojosh (45 subscribers)

Most of today's candidates were hashtag-laden Shorts or quick clips with little real teaching content. This one stands out as the final episode of a multi-part Raspberry Pi 5 build series, which means the creator has actually been documenting a project end-to-end rather than just showing a finished gadget.

The hook here is the addition of a SQLite database for persistent guest sign-ins. That's a genuinely useful step up from the toy projects most beginner Pi tutorials stop at — it crosses the line from "blink an LED" into building something that holds real state across reboots. For anyone learning Python on embedded Linux, watching someone wire a database into a kiosk-style UI is a good template: you see schema design, write/read paths, and how to handle a touchscreen front-end against a local DB all in one project.
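For a sense of what a persistent sign-in layer involves, here is a minimal sketch using Python's built-in sqlite3 module. It is not the code from the video: the table name, column names, and helper functions are all hypothetical, and an in-memory database stands in for the file a real kiosk would use.

```python
import sqlite3
from datetime import datetime, timezone


def open_db(path=":memory:"):
    """Open the database and create the sign-in table if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS sign_ins (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               guest_name TEXT NOT NULL,
               signed_in_at TEXT NOT NULL
           )"""
    )
    return conn


def sign_in(conn, guest_name):
    """Write path: record one guest with a UTC timestamp."""
    stamp = datetime.now(timezone.utc).isoformat()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sign_ins (guest_name, signed_in_at) VALUES (?, ?)",
            (guest_name, stamp),
        )


def recent_guests(conn, limit=10):
    """Read path: most recent sign-ins first."""
    rows = conn.execute(
        "SELECT guest_name FROM sign_ins ORDER BY id DESC LIMIT ?", (limit,)
    )
    return [name for (name,) in rows]


conn = open_db()
sign_in(conn, "Ada")
sign_in(conn, "Grace")
print(recent_guests(conn))  # ['Grace', 'Ada']
```

Passing a file path such as a .db file on the Pi's storage to open_db is what would make the sign-ins survive a reboot. The parameterized queries keep guest-typed names from being interpreted as SQL.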

Because it's the final episode, expect a working demo of the whole kiosk plus a recap of the architecture. It's also worth poking around the earlier episodes if the database layer alone doesn't show enough of the wiring and OS setup. A 45-subscriber channel doing a complete, multi-episode build is exactly the kind of small-creator effort worth a watch.

Why watch: A real end-to-end Raspberry Pi 5 kiosk project that goes beyond hello-world by adding persistent SQLite storage for guest sign-ins.

Daily YT Welding

Start of the Fummins project. Removing the Cummins and CNC plasma cutting to fix a new engine stand.

2026-05-26

Channel: Jimmy Riging (264 subscribers)

Most of this week's candidates were hashtag-spam shorts from the same plasma cutting channel, so this Jimmy Riging video stands out as the only one promising a real, sustained project walkthrough. A "Fummins" is the long-running swap nickname for dropping a Cummins diesel into a Ford chassis — in this case, pulling the inline-six out of a 2nd gen Dodge Ram and transplanting it into an F550. That's a non-trivial engineering exercise: motor mounts, transmission bellhousing adapters, wiring harness integration, fuel system plumbing, and cooling all have to be reconciled between two very different platforms.

This first episode focuses on the teardown side — getting the Cummins out cleanly — plus a practical bit of fabrication: using a CNC plasma cutter to modify an engine stand so it can actually hold a 12-valve Cummins, which is heavier and physically larger than most stands are designed for. That combination of diagnosis, fabrication, and heavy diesel wrenching is exactly the kind of content that small-channel builders do better than the polished shops, because you get to see the real decisions and the mistakes.

At 264 subscribers, Jimmy is clearly just starting out, so the production may be rough, but the project scope is ambitious and worth following from episode one.

Why watch: A ground-floor look at a Cummins-into-F550 swap, including the often-skipped step of fabricating a stand strong enough to hold the engine.
