25 newsletters today.
Abandoned Futures
2026-06-08
In 1955, Bell Aircraft proposed something audacious to the U.S. Air Force and Navy: a supersonic vertical-takeoff fighter that would dispense with runways entirely. The Bell D-188A, given the military designation XF-109 by the Air Force (and briefly XF3L by the Navy), was to be a Mach 2.3 interceptor powered by eight General Electric J85 turbojets arranged in a configuration nobody had attempted before — and nobody has built since.
The layout was a study in brute-force engineering:
Eight engines. One pilot. The design weight was about 23,000 lb, and the projected performance was extraordinary: Mach 2.3 at 60,000 ft, a combat radius of 575 miles, and the ability to take off vertically from a forest clearing, a ship deck, or a bomb-cratered runway. It was meant to solve the central NATO nightmare of the late 1950s — that Soviet first-strike nukes would crater every airfield in West Germany within the first hour of war.
Bell built a full-scale wood-and-metal mockup at its Niagara Falls plant in 1959, completed propulsion ground tests on the swiveling nacelle concept, and ran extensive wind tunnel work. The Air Force assigned the XF-109 designation in 1961. Later that same year the project died, not from a technical failure but from Robert McNamara's cost-effectiveness reviews. McNamara's analysts concluded that eight engines meant eight times the maintenance burden, that the transition from hover to forward flight was thermodynamically marginal (hot exhaust ingestion was a real risk), and that the F-4 Phantom could simply be procured in larger numbers for the same money. The mockup was scrapped. No prototype was ever flown.
Here's why 2026 should reopen the file. Every objection McNamara raised has been answered by 65 years of progress:
The U.S. Navy is currently spending billions on the F/A-XX program seeking carrier-deck flexibility. The Marines fly the F-35B at $109M a copy with a single-point-of-failure lift fan. A modern D-188A — distributed-propulsion, supersonic, dispersible — is not a fantasy. It's a 67-year-old Bell engineering report waiting for someone to dust it off.
ArXiv Paper Digest
2026-06-08
AI coding agents like Claude Code and Gemini CLI have a feature that's become wildly popular: skills. A skill is a little bundle — a markdown file with natural-language instructions, some executable scripts, and a list of tool permissions — that you can drop into your agent to teach it a new trick. Want your agent to know how to deploy to your specific cloud setup? Install a skill. Want it to handle a niche file format? Install a skill.
The problem: skills are basically a new kind of software supply chain, and nobody knows how dangerous it is. A skill is simultaneously code (the scripts it runs) and prompt (the instructions it feeds the agent). That hybrid nature means traditional malware scanners miss things — a script might look benign, but the markdown instructions could nudge the agent into doing something harmful with totally legitimate tools. And prompt-injection detectors miss the other half: a perfectly innocent-looking instruction file paired with a malicious script.
This paper introduces MalSkillBench, the first benchmark designed to measure how well detection tools catch malicious skills. The key contribution is the word runtime-verified: rather than just labeling skills as "looks suspicious," the authors actually ran each malicious skill and confirmed it does the bad thing it claims to do. That ground truth is what's been missing — previous security tools were essentially being graded against vibes.
What's in the benchmark:
The key insight is that a skill's risk lives in the seam between code and language. A reviewer reading just the script sees nothing wrong. A reviewer reading just the markdown sees nothing wrong. Only when an agent stitches them together does the attack materialize — and that's exactly the analysis gap attackers will exploit.
For anyone running agents with third-party skills (which is increasingly everyone), this is the first honest measuring stick for whether your defenses work. Expect this to become the de facto evaluation for skill-scanning tools, the same way SWE-bench became the yardstick for coding agents.
Daily Automotive Engines
2026-06-08
The turbocharger shaft spins on a thin film of pressurized oil — it never actually touches the bearings during operation. That oil film exists in two places: the radial clearance (between shaft and journal bearing OD/ID) and the axial clearance (the thrust bearing endplay, often called "shaft float"). Both are measured in thousandths of an inch, and both determine whether your turbo lasts 200,000 miles or grenades at 50,000.
Radial clearance on a typical journal bearing turbo runs 0.003–0.006" between shaft and bearing ID, plus another 0.003–0.005" between bearing OD and housing bore. That's right — the bearing itself floats on oil, spinning at roughly half shaft speed. This full-floating design doubles the oil film area and lets the bearing absorb shock loads. The trade-off: you need more oil flow (typically 0.5–1.5 gpm at idle pressure) and the rotor assembly has more total play.
Axial clearance (thrust float) is much tighter — usually 0.001–0.004" total endplay measured at the compressor nose. The thrust bearing is a small bronze or steel washer with machined oil grooves that handles the imbalance between compressor and turbine pressure pushing the shaft fore and aft. Boost pressure pushes the compressor wheel backward; exhaust backpressure pushes the turbine forward. They never perfectly cancel, so the thrust bearing eats the difference.
Rule of thumb: Grab the compressor wheel and wiggle. Side-to-side play you can feel (more than ~0.006") is acceptable on a journal bearing turbo — that's the floating bearing doing its job. In-and-out (axial) play you can feel at all is a warning sign; the thrust bearing should be tight enough that movement is invisible to the naked eye.
Real-world example: The Garrett GT2860RS used on Subaru STI builds specifies 0.0035–0.0045" radial clearance and 0.0015–0.0035" axial float. Owners who run E85 with no fuel pressure regulator upgrade often see thrust bearing failure first: the cooler, denser E85 charge creates more boost spikes, hammering the thrust washer until the compressor wheel kisses the housing. The telltale: oil smoke on hot restarts and a perfectly circular rub mark on the back of the compressor wheel.
Excessive clearance anywhere causes oil leakage past the piston-ring seals at each end of the CHRA — those rings need a stable, centered shaft to seal. That's why a worn turbo smokes: it's not the seals failing, it's the bearings letting the shaft move enough to break the seal geometry.
Daily Debugging Puzzle
sizeof Array Decay Trap: The Element Count That Forgets How Many Elements
2026-06-08
This function is supposed to print every element of the array it receives. The caller passes an array of five integers, but only two of them show up on the terminal. The code compiles without a warning on many setups, and the formula sizeof(arr)/sizeof(arr[0]) is one most C programmers have typed a thousand times.
#include <stdio.h>

void print_all(int arr[]) {
    size_t n = sizeof(arr) / sizeof(arr[0]);
    for (size_t i = 0; i < n; i++) {
        printf("%d\n", arr[i]);
    }
}

int main(void) {
    int numbers[] = {10, 20, 30, 40, 50};
    print_all(numbers);
    return 0;
}
Expected output: five lines, 10 through 50. Actual output on a typical 64-bit Linux box: 10 and 20, and then silence.
The parameter declaration int arr[] looks like an array, but it isn't one. In a function parameter list, C silently rewrites every array type into a pointer to its element type. So void print_all(int arr[]) is exactly identical to void print_all(int *arr) — the brackets are visual decoration with no semantic weight.
That means inside print_all, sizeof(arr) is sizeof(int *) — eight bytes on a 64-bit system, four on a 32-bit one. Divide by sizeof(arr[0]), which is sizeof(int) (four), and you get n == 2. The loop runs twice and stops, no matter how big the array actually was.
This is the array-to-pointer decay rule, and it's one of the oldest traps in C. The same expression sizeof(numbers)/sizeof(numbers[0]) works in main, where numbers is a real array of known size — there sizeof(numbers) is 20. The moment you cross the function boundary, the type information evaporates.
Worse: modern compilers can warn about this (-Wsizeof-array-argument in GCC and Clang), but the warning is not in -Wall on older versions, and the code is perfectly well-defined — it just silently does the wrong thing. There's no UB to catch, no sanitizer to flag it.
You must pass the length explicitly. There is no portable way for the callee to recover the size of the original array:
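#include <stdio.h>

/* The callee cannot recover the length, so the caller passes it in. */
void print_all(const int *arr, size_t n) {
    for (size_t i = 0; i < n; i++) {
        printf("%d\n", arr[i]);
    }
}

/* Valid only where `a` is a true array, not a pointer. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

int main(void) {
    int numbers[] = {10, 20, 30, 40, 50};
    print_all(numbers, ARRAY_LEN(numbers));  /* prints all five values */
    return 0;
}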
The macro ARRAY_LEN only works where a is a true array; apply it to a pointer and you'll get the same nonsense. With GCC or Clang in C11 mode, you can guard against that by combining _Static_assert with the __builtin_types_compatible_p extension (a compiler builtin, not part of ISO C), refusing to compile when someone hands the macro a pointer. C99 also offers a sharper signature, void print_all(size_t n, int arr[static n]), which both documents the intent and lets some compilers warn on too-short arrays at the call site.
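One way such a guard might look, as a sketch that assumes GCC or Clang: __typeof__ and __builtin_types_compatible_p are compiler extensions, and IS_ARRAY, the throwaway-struct trick, and sum_ints are illustrative choices made here, not a standard idiom.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC or Clang. __typeof__ and __builtin_types_compatible_p
   are compiler extensions, not part of ISO C11. */

/* 1 when `a` is a real array: an array's type differs from the type of a
   pointer to its first element, whereas a pointer's type matches it. */
#define IS_ARRAY(a) \
    (!__builtin_types_compatible_p(__typeof__(a), __typeof__(&(a)[0])))

/* Element count that refuses to compile when handed a pointer. The
   _Static_assert sits inside a throwaway struct so it can appear in an
   expression; the size of that struct is multiplied by zero. */
#define ARRAY_LEN(a)                                                   \
    (sizeof(a) / sizeof((a)[0]) +                                      \
     0 * sizeof(struct {                                               \
         _Static_assert(IS_ARRAY(a), "ARRAY_LEN needs a true array");  \
         int unused;                                                   \
     }))

/* Hypothetical helper for illustration: sums n ints. */
static int sum_ints(const int *arr, size_t n) {
    assert(arr != NULL);
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += arr[i];
    }
    return total;
}
```

With this version, ARRAY_LEN(numbers) in main still compiles, while ARRAY_LEN(arr) inside a function taking int arr[] stops the build with the assertion message instead of quietly yielding 2.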
The deeper lesson: in C, arrays are not first-class values. They can't be passed, returned, or assigned. Every time it looks like you're "passing an array," you're really passing a pointer to its first element — and the size, the cardinal piece of information, is your job to carry alongside it.
An int arr[] parameter is a pointer in disguise, so sizeof(arr)/sizeof(arr[0]) silently evaluates to 2 (or 1) instead of the array length. Always pass the size explicitly across function boundaries.
Daily Digital Circuits
2026-06-08
Every load and store your CPU executes uses a virtual address. The DRAM only understands physical addresses. Somewhere between the load instruction and the cache, hardware has to translate one to the other — and it has to do it in roughly one clock cycle, or your pipeline stalls on every memory access.
The translation lives in the page table, a tree structure in memory. On x86-64 with 4-level paging, walking that tree takes four memory accesses. If every load triggered a page walk, you'd spend 4× as much memory bandwidth on translation as on the data itself: five accesses for every one that does useful work. The TLB is the cache that prevents this.
A TLB is a small, fully- or set-associative CAM-like structure indexed by virtual page number, storing the corresponding physical page number plus permission bits (read/write/execute, user/supervisor, dirty, accessed). On a hit, translation completes in 1 cycle. On a miss, a hardware page walker (a dedicated state machine) traverses the page table and refills the TLB — typically 20–100+ cycles, longer if the walk itself misses in the data cache.
Real example (the cost of a TLB miss): An Intel Skylake L1 dTLB has 64 entries. Each entry maps a 4 KB page, so the L1 dTLB covers exactly 64 × 4 KB = 256 KB of virtual address space. Walk linearly through a 1 GB array and you'll thrash the dTLB on every page boundary. With ~25-cycle walks happening every 4 KB, you've added ~6 cycles of overhead per kilobyte (about 0.006 cycles per byte), and that's before you account for cache misses.
Rule of thumb: If your working set fits in (L1 dTLB entries × page size), translations are free. Above that, switch to huge pages (2 MB on x86, configured via madvise(MADV_HUGEPAGE) or hugetlbfs) — a single 2 MB entry replaces 512 4 KB entries, expanding reach 512×.
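The reach and walk-overhead arithmetic is easy to rerun for your own numbers. A minimal sketch follows; the function names are hypothetical, and the one-walk-per-page model is a simplifying assumption that ignores the second-level TLB and hardware prefetching.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of virtual address space a TLB level can map without a miss:
   number of entries times the page size each entry covers. */
static uint64_t tlb_reach_bytes(uint64_t entries, uint64_t page_bytes) {
    assert(entries > 0 && page_bytes > 0);
    return entries * page_bytes;
}

/* Average page-walk overhead, in cycles per byte, for a linear scan that
   pays one walk per page touched (assumed model, deliberately pessimistic). */
static double walk_cycles_per_byte(double walk_cycles, uint64_t page_bytes) {
    assert(page_bytes > 0);
    return walk_cycles / (double)page_bytes;
}
```

Calling tlb_reach_bytes(64, 4096) gives the 256 KB reach quoted for 4 KB pages; passing 2 MB as the page size shows how the same entry count stretches 512 times further.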
This is why database engineers obsess over huge pages: a hash table with random access patterns can spend 30–50% of its cycles in page walks if it overflows the STLB. Context switches make it worse — most TLB entries aren't tagged with a process ID (or use a small ASID space), so a switch flushes most of the structure and the next process pays a wave of walks.
Daily Electrical Circuits
2026-06-08
The push-pull converter solves a problem flyback and forward converters can't: efficiently moving hundreds of watts through a transformer without wasting half of every cycle. Two switches alternately drive opposite ends of a center-tapped primary, so the transformer core swings symmetrically through both quadrants of its B-H loop. That doubles the usable flux swing compared to a forward converter, letting you shrink the core for the same power.
How it works: Two MOSFETs (Q1, Q2) connect the ends of the primary to ground; the center tap goes to V_in. When Q1 conducts, current flows through the top half of the primary, inducing voltage in the secondary. Q1 turns off, both switches sit idle for a dead time, then Q2 conducts and drives the bottom half — flipping the secondary polarity. A center-tapped secondary with two diodes (or synchronous FETs) rectifies both half-cycles into a continuous output through an LC filter, exactly like a buck converter's output stage.
The killer advantage: the transformer never sees DC bias because flux swings symmetrically. Compare to a forward converter, which only pushes flux in one direction and needs a reset winding or RCD clamp to demagnetize the core every cycle. Push-pull uses all of the B-H loop, so the core is roughly half the size for the same throughput.
The killer problem: flux walking. If Q1 and Q2 don't conduct for exactly equal times — mismatched gate drive delays, asymmetric R_DS(on), even slight duty cycle errors — the core accumulates DC flux each cycle and eventually saturates. Saturation drops primary inductance to nearly zero, current spikes, and the FETs explode. Modern push-pull controllers use current-mode control (peak current sensing per cycle) to force exactly equal volt-seconds on each switch and prevent walking.
Real-world example: A 48 V telecom rectifier delivering 12 V at 50 A (600 W). Push-pull at 100 kHz with a ferrite EE42 core handles this comfortably. Each FET sees 2×V_in = 96 V during its off-time (the other half-winding's voltage adds through transformer coupling), so you'd pick 150 V MOSFETs with margin.
Rule of thumb (primary turns): For a given core, N_pri = (V_in × D_max) / (4 × f_sw × B_max × A_e). With V_in = 48 V, D_max = 0.45 per switch, f_sw = 100 kHz, B_max = 0.15 T (ferrite, derated for symmetric swing), A_e = 178 mm² (EE42): N_pri ≈ (48 × 0.45) / (4 × 100k × 0.15 × 178e-6) = 21.6 / 10.68 ≈ 2 turns. Note the factor of 4 instead of 2: symmetric drive doubles the available flux swing.
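The rule-of-thumb expression is simple enough to wrap in a helper so the inputs can be varied. This is a sketch only: the function names are hypothetical, all quantities are assumed to be in SI units (A_e in m², f_sw in Hz), and it evaluates the formula exactly as written above without adding any design margin.

```c
#include <assert.h>

/* Primary turns from the rule of thumb:
   N_pri = (V_in * D_max) / (4 * f_sw * B_max * A_e), all SI units. */
static double push_pull_primary_turns(double v_in, double d_max,
                                      double f_sw_hz, double b_max_t,
                                      double a_e_m2) {
    assert(f_sw_hz > 0.0 && b_max_t > 0.0 && a_e_m2 > 0.0);
    return (v_in * d_max) / (4.0 * f_sw_hz * b_max_t * a_e_m2);
}

/* Voltage each switch must block while its partner conducts: 2 * V_in. */
static double push_pull_fet_stress(double v_in) {
    return 2.0 * v_in;
}
```

A real design would round the result up to a whole number of turns and recheck the peak flux density at the chosen count.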
Push-pull dominates the 100 W–1 kW isolated supply range where flyback runs out of steam and full-bridge is overkill.
Daily Engineering Lesson
2026-06-08
Investment casting (also called lost-wax casting) is a precision casting process that produces parts with surface finishes of 60-125 microinches Ra and dimensional tolerances of ±0.005 in/in — good enough that many parts ship straight from the foundry with no machining. The process is ancient (Egyptians used it 5,000 years ago for jewelry) but remains the go-to method for turbine blades, surgical implants, and firearm components.
The seven-step process:
Why it wins for complex parts: Because the pattern is sacrificial, you can cast undercuts, internal passages, and intricate geometry that sand casting (which requires pattern withdrawal) cannot. Jet engine turbine blades have internal cooling channels formed by ceramic cores placed inside the wax pattern — those channels would be impossible to machine.
Cost rule of thumb: Tooling for the wax die runs $5,000-$50,000 (vs. $500-$5,000 for sand-cast patterns), but per-part cost is lower above ~500 units because finishing is minimal. Break-even vs. machining from billet is typically 100-500 parts for complex geometry.
Real-world example: A GE LEAP engine HPT blade is investment cast from a single nickel superalloy crystal (directional solidification keeps grain boundaries from forming perpendicular to the centrifugal load). The blade has 70+ internal cooling holes and serpentine passages — formed by a ceramic core leached out with caustic after casting. No other process can produce that geometry in that alloy.
Limitations: Part size is capped around 75 lb for most foundries (shell weight and pour dynamics), wall thickness minimums are ~0.030", and lead times run 8-16 weeks for first articles because of die fabrication.
Forgotten Books
2026-06-08
Book: The Child's Book of Nature for the Use of Families and Schools Intended to Aid Mothers and Teachers in Training Children in the Observation of Nature. Part 3: Air, Water, Heat, Light, Etc. Revised Edition by Worthington Hooker, M.D. (1886)
Read it: Internet Archive
The title page of this small volume states its purpose plainly:
"THE CHILD'S BOOK OF NATURE FOR THE USE OF FAMILIES AND SCHOOLS INTENDED TO AID MOTHERS AND TEACHERS IN TRAINING CHILDREN IN THE OBSERVATION OF NATURE."
That phrase — training children in the observation of nature — sounds anodyne today, like a generic mission statement. But the man behind it, Worthington Hooker, M.D., was no neutral schoolbook compiler. Hooker was a Yale professor of medicine and one of nineteenth-century America's fiercest opponents of medical quackery. His 1849 work Physician and Patient is sometimes cited as the first American treatise on medical ethics, and he spent his career attacking patent-medicine peddlers, homeopaths, and the "irregular" practitioners then bleeding the country of money and lives.
The forgotten idea baked into this children's book is that observational science education was a public-health intervention. Hooker's premise — explicit in his adult writings and implicit in the very existence of this three-volume set covering Plants, Animals, and Air, Water, Heat, Light, &c. — was that a child who had learned to watch a kettle boil, or to test what dissolves in water, would grow into an adult who could not be fooled by a man selling colored sugar-water as a cure for consumption. He wasn't writing nature lessons to produce naturalists. He was writing them to produce skeptics.
Consider how strange the framing on this title page actually is. The book is addressed not to children, nor even to teachers, but specifically to mothers — at a time when most science textbooks pretended mothers did not exist. Hooker understood that the household, not the schoolroom, was where medical decisions got made, and where the snake-oil pamphlets arrived in the mail. If you wanted to inoculate a generation against quackery, you had to start in the kitchen, with the parent who decided whether the croup got camphor or calomel.
Is the underlying claim — that early science instruction produces durable adult skepticism — actually true? The evidence is mixed but suggestive. Modern studies on "scientific literacy" repeatedly find that simply knowing facts about science correlates only weakly with resistance to misinformation. What correlates strongly is what Hooker was actually teaching: the habit of asking how do I know this? and being willing to test it. The Nobel laureate Richard Feynman's famous childhood, in which his father made him guess why a ball rolled in a wagon and then test it, is essentially the Hooker method. So is the modern movement toward "inquiry-based" elementary science.
What we've lost is the explicit reason Hooker thought this mattered. We teach kids to "do science" as career prep or general enrichment. Hooker taught it as armor.
Forgotten Darkroom
2026-06-08
Book: CIA Reading Room cia-rdp80-00809a000600300299-7: PRODUCES NEW OPTICAL EQUIPMENT, PRECISION INSTRUMENTS by CIA Reading Room (1950)
Read it: Internet Archive
In March 1950, a CIA analyst sat in a Washington office translating an item from the East German newspaper Die Wirtschaft. The subject was, by Cold War standards, almost absurdly mundane: a new camera coming out of a state-owned factory in Dresden-Niedersedlitz. But buried in the dry intelligence prose is a description of a device that would reshape global photography for the next half-century.
A new miniature, single-lens, mirror-reflex camera, the Praktica, has been introduced by the people-owned Camera Plants in Dresden-Niedersedlitz. Of attractive design, the 24 x 36-millimeter camera is simple to operate and equally effective for both amateur and professional photography.
The analyst then catalogued — perhaps without quite realizing it — the complete feature set of what would become the dominant camera format of the 20th century:
The Praktica has all the characteristics of a single-lens mirror-reflex camera, such as direct focusing on the ground glass panel; identical view-finder lens and photographic lens, so that the picture recorded on the film will correspond exactly to the picture seen through the view finder; and preclusion of parallax errors, even with interchangeable lenses of varying focal lengths.
What the CIA was describing was the modern 35mm SLR — the camera that would, for the next 50 years, define what "a serious camera" meant. The Praktica was indeed historic: produced by KW (Kamera-Werkstätten) in Dresden, it's widely considered the first commercially successful 35mm SLR with a pentaprism viewfinder lineage (introduced in successor models). Before this, "serious" photographers used rangefinders or twin-lens reflexes, both of which suffered from parallax — the maddening problem of the viewfinder seeing a slightly different scene than the lens.
The report even notes the killer feature for portrait work:
The picture as seen through the view finder also shows the actual effect of the diaphragm setting, that is, the degree of definition produced under various conditions of aperture ratio. This is particularly important for close-range photography, where proper depth of focus is critical.
That's depth-of-field preview — a feature photographers still pay extra for today. The CIA's anonymous translator was describing, in 1950, the essential workflow of every Canon AE-1, Nikon F, Pentax K1000, and ultimately every modern mirrorless camera: WYSIWYG photography.
What's remarkable is the geopolitical irony. East Germany — soon to be a byword for shoddy consumer goods — was at this moment the world leader in camera engineering, inheriting the optics tradition of Zeiss and Praktica's Dresden facilities. The Praktica line would sell over 9 million units globally before German reunification killed the brand. Japanese manufacturers like Asahi (Pentax) and Nikon would study these designs, refine them, and ultimately eat East Germany's lunch.
So when a CIA officer flagged this paragraph as worth translating in 1950, he was unknowingly filing an intelligence report on the future of how humans would see themselves — every wedding photo, every National Geographic cover, every Vietnam War photograph, all flowing from the design principles laid out in this one declassified memo.
Forgotten Patent
2026-06-08
In the late 1880s, Almon Brown Strowger ran a funeral parlor in Kansas City, Missouri. Business was slowing, and Strowger eventually figured out why: the local telephone operator was the wife of a rival undertaker. When grieving families called and asked for "the undertaker," she routed them to her husband. Strowger's response was not a lawsuit — it was a patent.
On March 12, 1889, Strowger filed US Patent 447,918, "Automatic Telephone-Exchange," granted March 10, 1891. It described a machine that could connect any subscriber to any other subscriber without a human in the loop. The caller's dial sent electrical pulses down the line, and a vertical-and-rotary stepper switch at the exchange counted those pulses and physically stepped a wiper across a bank of 100 contacts. One pulse, then ten (for zero), then seven, then four: the call lands on subscriber 1074. The bias of the operator was eliminated by removing the operator.
The first commercial Strowger exchange opened in La Porte, Indiana in 1892 with 75 lines. By the 1920s, "step-by-step" switches were the dominant telephone-routing hardware on Earth. AT&T resisted for decades, then gave in; Strowger gear was still carrying calls in parts of the US Bell network into the 1990s. A patent born of personal grievance ran the global voice network for a century.
The hidden architectural insight. Strowger did something deeper than automate a clerk. He invented the idea that the user's input is the routing instruction. The dial pulses were not data about a call — they were the routing protocol. The caller, by dialing, was programming the network in real time. There is a straight conceptual line from Strowger's wipers to:
Could it be built better now? It already has been — many times. Crossbar switches replaced Strowger's wipers in the 1940s. Reed relays replaced crossbars. Digital time-division switches (ESS) replaced reeds. IP routers replaced circuit switches. Each generation kept Strowger's core abstraction (caller-supplied address → automated, hop-by-hop selection) and changed only the substrate: brass, then steel, then silicon, then code.
There's a deeper lesson hiding in the patent. Strowger was not an electrical engineer. He had no theoretical model of networks, no graph theory, no Shannon. He had a problem (a corrupt operator), a constraint (he could not change the human), and a question: what if the network itself made the decision? That question is the founding question of every routing protocol since. BGP, OSPF, MPLS, and SDN are all elaborate answers to a Kansas City undertaker's grievance.
Daily GitHub Zero Stars
2026-06-08
Language: Go
This is a Go-based MCP (Model Context Protocol) server that exposes semantic search over your documents and code to any MCP-compatible AI client. It stitches together three pieces that are quickly becoming the default local-RAG stack: Ollama for embeddings, Chroma as the vector store, and Docker for one-command deployment.
What makes it interesting is the language choice. Most RAG tooling in the MCP ecosystem is written in Python or TypeScript, which makes sense because that's where the ML libraries live. But an MCP server doesn't actually do ML — it just orchestrates HTTP calls to Ollama and Chroma. Writing it in Go gives you:
The topic tags (opencode, mcp, rag, ollama, chromadb) suggest the author is targeting the opencode / Claude Code / Cursor crowd — developers who want an AI assistant that can answer questions about a private codebase or knowledge base without sending anything to a third party. That's a real and growing need: company wikis, internal SDKs, and personal note vaults all benefit from a local semantic index that any MCP client can query.
Who would benefit? Self-hosters running Ollama who want to plug a private corpus into their AI workflow, Go developers looking for a clean reference for building MCP servers in something other than TypeScript, and privacy-conscious teams that need RAG without a SaaS dependency.
Daily Hardware Architecture
2026-06-08
Modern x86 CPUs run AVX-512 (and to a lesser extent AVX2) at a lower clock frequency than scalar code. This isn't a bug — it's a thermal and electrical bargain the CPU makes with itself. Wide vector units burn enormous current when active; sustaining nominal frequency through a 512-bit FMA every cycle would exceed the package's voltage regulator limits and the die's thermal envelope.
Intel formalized this with frequency licenses. On Skylake-X / Cascade Lake, a core operates in one of three states:
The transitions aren't instant. When a core hits an AVX-512 instruction, it requests a higher license. The voltage regulator ramps up, but until it does, the core runs at a guaranteed-safe lower frequency for ~20µs. After the AVX-512 burst ends, it stays in the elevated license for ~2ms before dropping back — because thrashing licenses would cost more than just staying there.
Real-world example: A web server that occasionally calls memcpy compiled with AVX-512 can throttle the entire core for milliseconds after each call. If your hot path is scalar pointer-chasing and you sprinkle in a single AVX-512 routine, the scalar code runs slower for the next ~2ms than it would have if you'd used AVX2. Cloudflare famously disabled AVX-512 in their workloads for exactly this reason around 2017-2018.
Rule of thumb: AVX-512 is a net win only if the vectorized region runs long enough to amortize both the frequency transition (~20µs) and the trailing penalty on neighboring scalar code (~2ms). For a 4 GHz → 3 GHz drop (25%), the vector region has to do at least ~1.33× (4/3) as much work per cycle as the scalar version just to break even on its own runtime, and that's before accounting for the tail penalty on whatever runs next.
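The break-even arithmetic can be written down directly. The sketch below uses hypothetical function names and a deliberately simple model: a fixed nominal and licensed frequency, and a trailing window during which scalar code runs at the licensed clock.

```c
#include <assert.h>

/* Minimum per-cycle speedup the vector code needs so that its own
   wall-clock time matches the scalar version run at nominal frequency. */
static double breakeven_speedup(double nominal_ghz, double licensed_ghz) {
    assert(nominal_ghz > 0.0 && licensed_ghz > 0.0);
    return nominal_ghz / licensed_ghz;
}

/* Scalar work lost during the trailing license window, expressed as the
   time (in seconds) that work would have taken at nominal frequency. */
static double tail_penalty_seconds(double nominal_ghz, double licensed_ghz,
                                   double tail_seconds) {
    assert(nominal_ghz > 0.0 && licensed_ghz > 0.0);
    return tail_seconds * (1.0 - licensed_ghz / nominal_ghz);
}
```

Adding the tail penalty to the transition cost gives a rough floor on how much time the vectorized region has to save before the switch pays for itself.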
Ice Lake and later mostly fixed this for client chips by improving voltage regulator response and per-instruction power gating, but server SKUs (Sapphire Rapids, Emerald Rapids) still throttle measurably under sustained AVX-512 load. AMD's Zen 4 handles AVX-512 by double-pumping 256-bit units instead of widening them — no license states, no thermal cliff, but also no peak throughput advantage.
Hacker News Deep Cuts
2026-06-08
Link: https://irfanali.org/blog/repmin
HN Discussion: 1 point, 0 comments
The URL points to a writeup on repmin — a famously deceptive little programming puzzle that has been a favorite of functional programming educators for decades. The problem statement sounds trivial: given a tree of numbers, return a tree of the same shape where every leaf is replaced by the minimum value from the original tree. The catch is the elegance constraint — can you do it in a single traversal?
In an eager language, the obvious answer is no. You need one pass to find the minimum and a second pass to rebuild the tree. But Richard Bird showed in 1984 that lazy evaluation, as found today in Haskell, lets you write something that looks impossibly circular: the function returns both the minimum and the rebuilt tree in one go, with the rebuilt tree referring to the minimum that the same call is still computing. It feels like time travel, hence the post's title.
What makes this particular writeup interesting is the JavaScript angle. JavaScript isn't lazy by default, but it has all the primitives needed to simulate laziness: thunks, closures, generators, and now `Promise`-based deferred evaluation. Translating repmin to JS forces you to make explicit what Haskell hides — and that's where the educational value lives. You see exactly which knots tying-the-knot actually ties.
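The same trick can be imitated in a fully eager language by making the deferred value explicit. The sketch below is in C rather than the post's JavaScript and is not taken from the post: every output leaf points at one shared cell that is filled in only after the single traversal finishes. The type and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Input tree: a leaf carries a value, an inner node carries two children. */
typedef struct Tree {
    int is_leaf;
    int value;                  /* valid when is_leaf */
    struct Tree *left, *right;  /* valid when !is_leaf */
} Tree;

/* Output tree: each leaf points at a shared cell instead of holding a
   value. The cell plays the role of the lazy version's unevaluated thunk. */
typedef struct Out {
    int is_leaf;
    const int *value;           /* shared cell, filled in after the walk */
    struct Out *left, *right;
} Out;

/* One traversal: copy the shape, wire leaves to `cell`, track the minimum. */
static Out *walk(const Tree *t, const int *cell, int *min_so_far) {
    Out *o = malloc(sizeof *o);
    assert(o != NULL);
    o->is_leaf = t->is_leaf;
    if (t->is_leaf) {
        if (t->value < *min_so_far) {
            *min_so_far = t->value;
        }
        o->value = cell;
        o->left = o->right = NULL;
    } else {
        o->value = NULL;
        o->left = walk(t->left, cell, min_so_far);
        o->right = walk(t->right, cell, min_so_far);
    }
    return o;
}

/* The caller owns `cell` and the returned nodes (freeing is omitted here). */
static Out *repmin(const Tree *t, int *cell) {
    int min = INT_MAX;
    Out *o = walk(t, cell, &min);
    *cell = min;  /* tie the knot: every leaf now sees the minimum */
    return o;
}
```

The write to *cell at the end is the step that laziness performs implicitly; here it is a visible mutation, which is the price of doing it in one pass without thunks.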
Why a technical audience should care:
The post has one upvote and zero comments, which is a shame — this is exactly the kind of small, dense programming essay HN used to surface routinely. It's the genre of "I learned a beautiful trick, here's why it matters," not chasing trends, not selling a product.
HN Jobs Teardown
2026-06-08
Source: HN Who is Hiring
Posted by: trithagoras
Of the eleven postings in this thread, Chronosphere's is the most strategically revealing. It's a Series A company built around M3, an open-source metrics platform spun out of Uber. That single sentence — "created by the founders of Chronosphere while at Uber" — is the entire business model in disguise.
The tech stack tells the story. M3 is a Go-based time-series database designed to ingest billions of data points per second at petabyte scale. The fact that Chronosphere is hiring a Senior Frontend Engineer and a Senior UX Designer — not more backend distributed-systems engineers — is the giveaway. The hard infrastructure problem is already solved (it ran Uber). What they're missing is the product layer that turns a Cassandra-killing TSDB into something an SRE at a normal company actually wants to pay for. They're racing to wrap M3 in dashboards, alerting, and a workflow story before Datadog, Grafana Cloud, or a re-energized Prometheus ecosystem eats their lunch.
What it reveals about stage and direction. Series A + NYC/Seattle dual-coast hiring + "Remote for now" (clearly a COVID-era concession, not a remote-first culture) signals a company that is commercializing open source. This is the same playbook as Confluent (Kafka), Databricks (Spark), and Elastic — take the OSS project you built at BigCo, raise venture money, and sell the managed service. The challenge baked into this hire: convincing skeptical infra buyers that they should pay for something they could theoretically run themselves.
Skills and trends highlighted:
Red flags: The posting is copy-pasted boilerplate ("modern and highly scalable monitoring monitoring" — literal duplicated word), no salary band, and no mention of equity. The "Remote for now" hedge suggests internal disagreement about distributed work. Also: founding a company on tech you built at your previous employer always carries lingering IP questions, though Uber open-sourced M3 cleanly.
Green flags: Real production pedigree (not vaporware), clear OSS-to-commercial thesis, and they're hiring product-shaped roles rather than just more engineers — which means someone there understands that distributed-systems excellence doesn't sell itself.
Daily Low-Level Programming
2026-06-08
You've used RDTSC to measure cycles. It reads the 64-bit Time Stamp Counter into EDX:EAX in a handful of cycles — far cheaper than any syscall-based clock. But there's a problem: RDTSC is not a serializing instruction. The out-of-order engine is free to execute it earlier or later than where you wrote it. Your "before" timestamp can be read after instructions you intended to measure, and your "after" timestamp can be read before the work finishes. The measurement becomes noise.
The traditional fix was to bracket RDTSC with CPUID, which fully serializes the pipeline. It works but is brutal — CPUID can cost 200+ cycles and varies by leaf, polluting the very thing you're trying to measure.
RDTSCP (added with Nehalem/Barcelona) is a partial fix. It guarantees that all prior instructions in program order have completed before the TSC is read. It does not prevent later instructions from starting early. It also returns the IA32_TSC_AUX MSR in ECX — which Linux populates with the CPU number, so you can detect if you got migrated mid-measurement.
The canonical recipe for measuring a region:
- CPUID (serialize), then RDTSC (read start). The CPUID fences anything from leaking up past the start.
- RDTSCP (waits for work to retire, then reads), then CPUID (so later instructions can't pull the end-read down past them).
This is exactly what Intel's whitepaper "How to Benchmark Code Execution Times" prescribes, and what the Linux kernel's arch/x86/include/asm/msr.h uses in its precision-timing macros. PTP daemons, DPDK, and io_uring's IORING_FEAT_NATIVE_WORKERS stat collection all use RDTSCP for sub-microsecond timing where syscall overhead would dwarf the measurement.
Rule of thumb for choosing your timer:
- RDTSCP + CPUID bracket. Pin the thread (sched_setaffinity) and check ECX matches at start/end.
- clock_gettime(CLOCK_MONOTONIC) via vDSO is fine; the ~20 ns of overhead is negligible.
The trap: the TSC frequency is not the CPU frequency on modern parts. It ticks at the nominal "base" rate regardless of turbo or P-state. Convert with the TSC frequency from /sys/devices/system/cpu/cpu0/tsc_freq_khz (where your kernel exposes it) or CPUID leaf 0x15, not from /proc/cpuinfo's reported MHz.
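The conversion step can be sketched as follows. This is only the arithmetic, not a way to read the counter; the function name `tsc_ticks_to_ns` and the sample readings are hypothetical, and the TSC frequency is assumed to have been obtained already from one of the sources above.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF  # the TSC is a 64-bit counter


def tsc_ticks_to_ns(start_ticks: int, end_ticks: int, tsc_khz: int) -> float:
    """Convert a TSC delta to nanoseconds.

    tsc_khz is the TSC frequency in kHz (not the current core clock).
    """
    delta = (end_ticks - start_ticks) & MASK_64  # tolerate counter wraparound
    # ticks / (kHz * 1000) gives seconds; multiply by 1e9 for nanoseconds.
    return delta * 1_000_000 / tsc_khz


# Hypothetical readings: 3,000 ticks on a part whose TSC runs at 3.0 GHz.
elapsed_ns = tsc_ticks_to_ns(1_000_000, 1_003_000, tsc_khz=3_000_000)
```

Using the turbo or current core MHz in place of `tsc_khz` would silently scale every result by the ratio of the two clocks.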
RFC Deep Dive
2026-06-08
RFC 114 is the genesis document of file transfer on the internet — written in April 1971, more than a decade before TCP/IP went live on the ARPANET. Every ftp command anyone has ever typed traces its lineage back to this 19-page memo from a graduate student at MIT.
The problem. By 1971 the ARPANET had a handful of sites running heterogeneous mainframes — Multics, TENEX, the IBM 360, the SDS 940 — each with its own filesystem, character set, record structure, and login model. Researchers wanted to ship files between them, but every pair of hosts required an ad-hoc convention. Abhay Bhushan, working under J.C.R. Licklider at MIT, set out to define a single protocol so that "a person, or a program operating on his behalf, can use the network to deal with file systems at remote hosts."
Key design decisions that still shape FTP today.
- ascii/binary toggle that, in 2026, nobody understands.
- RETRIEVE, STORE, APPEND, DELETE, RENAME. The choice, controversial at the time since binary opcodes were faster, set the precedent for SMTP, HTTP, IMAP, and the entire family of text-based internet protocols.
Interesting history. Bhushan presented this work at the 1971 Spring Joint Computer Conference. RFC 114 was revised in RFC 141, then thoroughly rewritten as RFC 172 (1971), RFC 354 (1972), RFC 542 (1973), and finally RFC 765 (1980) when it was ported to TCP. RFC 959 in 1985 is the version most engineers know, but its structure (control channel, data channel, ASCII commands, three-digit reply codes) is recognizably the 1971 design. Bhushan also wrote the first Telnet spec and went on to a career at Xerox PARC.
Why it still matters. Aside from FTP itself (which refuses to die — anonymous FTP mirrors still serve Debian and the kernel sources), RFC 114 established the shape of an application protocol: a stateful conversation in human-readable ASCII over a reliable byte stream, with a separate data path for bulk transfer. SMTP (1982), NNTP (1986), HTTP (1991), and IMAP (1988) all inherited that template. Even today, when you debug an HTTP/1.1 connection with telnet host 80, you are using a design pattern Bhushan picked in 1971 because punching commands by hand made the protocol easier to learn.
Daily Software Engineering
2026-06-08
Eventual consistency is a deal you make with distributed systems: if no new writes happen, all replicas will eventually agree. The word "eventually" is doing a lot of heavy lifting. It might mean 50 milliseconds. It might mean 30 seconds. It might mean "after that node finishes its three-hour catch-up." Your job is to make the application tolerate that window without lying to users.
The classic failure mode is the read-your-own-writes bug. A user updates their profile picture, the write goes to the primary, the page reloads, the read hits a lagging replica, and the user sees their old picture. They click "update" again. Now you have two writes racing, and your support inbox has a complaint that says "your site is broken."
Real-world example: A social app posts a comment. The write lands in region us-east-1. The user in us-west-2 refreshes and sees nothing — replication lag is 800ms. Three fixes are common:
The consistency window calculation: if your replication lag's p99 is 200ms, treat each read as having roughly a 1% chance of landing in the slow tail and seeing stale data. For a user issuing 10 independent reads/sec, the probability that at least one read in a 1-second window hits stale data is roughly 1 − (1 − 0.01)^10 ≈ 9.6%. That's not a rare bug; it's a daily occurrence at any meaningful scale. Measure replication lag and design the UI around its p99, not its average.
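The arithmetic above is easy to rerun with your own numbers. A minimal sketch, assuming each read is stale independently with the same probability (a simplification, since real staleness tends to come in bursts); the function name `p_any_stale` is hypothetical.

```python
def p_any_stale(p_stale_per_read: float, reads: int) -> float:
    """Probability that at least one of `reads` independent reads is stale."""
    return 1.0 - (1.0 - p_stale_per_read) ** reads


# 1% per-read staleness, 10 reads in the window.
p_window = p_any_stale(0.01, 10)
print(f"{p_window:.1%}")
```

Raising the read rate to 100 reads per window pushes the same 1% tail past a 60% chance of at least one stale read, which is why the p99 matters more than the mean.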
Rule of thumb: classify every read by its tolerance for staleness. Strong: account balance before a transfer, inventory at checkout. Bounded: feed timestamps, view counts (stale by ≤5s is fine). Eventual: aggregate analytics, recommendations. Default to bounded. Only pay the latency tax of strong consistency where money or correctness demands it.
The trap junior engineers fall into is treating "eventually consistent" as a synonym for "broken." It isn't — it's a contract. The bug isn't the lag; it's the UI that pretends the lag doesn't exist. A spinner that says "syncing…" for 400ms is honest. A button that silently does nothing because the write hasn't propagated yet is the bug.
Tool Nobody Knows
2026-06-08
For thirty years, every time I wanted to know what the kernel was actually doing, I had bad options. strace stops the world with ptrace. perf samples but doesn't tell you why. SystemTap compiles a kernel module and made me cry in 2009. bpftrace finally fixes it: an awk-shaped language that compiles to eBPF, runs in the kernel at native speed, and aggregates results in-kernel so you don't drown in events.
The one-liner most engineers learn first — and the one that has saved me a dozen late nights — is the system-wide openat() tap:
bpftrace -e '
tracepoint:syscalls:sys_enter_openat {
printf("%-16s %s\n", comm, str(args->filename));
}'
Every process. Every file open. No attaching, no PID, no recompile. The overhead is maybe 1% on a busy box. Try doing that with strace -f from PID 1 and watch your machine catch fire.
The real party trick is in-kernel aggregation. Want a histogram of read latencies, bucketed in log2, in microseconds, for every vfs_read() call on the box, live?
bpftrace -e '
kprobe:vfs_read { @start[tid] = nsecs; }
kretprobe:vfs_read /@start[tid]/ {
@us = hist((nsecs - @start[tid]) / 1000);
delete(@start[tid]);
}'
Hit Ctrl-C and you get an ASCII histogram. No log file rotation, no awk post-processing, no missed events because your userspace was paged out.
A few more I keep in my back pocket:
- fsync? bpftrace -e 'tracepoint:syscalls:sys_enter_fsync { @[comm] = count(); }'. Catches the daemon that's making your SSD sing.
- bpftrace -e 'kprobe:tcp_retransmit_skb { @[kstack] = count(); }'. Better than tcpdump for finding where the kernel decided to resend.
- bpftrace -e 'software:faults:1 { @[comm] = count(); }'. The mystery memory hog reveals itself.
- bpftrace -e 'tracepoint:signal:signal_generate /args->pid == 4242/ { printf("%s -> %d\n", comm, args->sig); }'. Finally know whose kill -9 it was.
List every probe point your kernel exposes (there are tens of thousands) with bpftrace -l 'tracepoint:*' or bpftrace -l 'kprobe:tcp_*'. The probe namespace alone is an education: uprobe, uretprobe, and usdt let you hook userspace symbols and USDT markers (PostgreSQL, Python, OpenJDK all ship them).
Why this beats the mainstream tools:
- strace ptraces, which means every syscall takes two context switches just to be observed. On a 50k-syscall/sec process, you've slowed it to a crawl. bpftrace runs your filter inside the kernel, on the syscall path, with the event never leaving ring 0 unless your script says so.
- perf samples; it tells you which stacks are hot, not why a specific event happened. bpftrace is deterministic: every event you ask about fires your handler.
- Unlike SystemTap, there's no kernel module to compile, no DKMS dance, no kernel panic if you typo a script.
The catches: needs root (or CAP_BPF+CAP_PERFMON on 5.8+), needs a kernel with BTF for kprobe argument access by name (most distros ship it now), and you can hang yourself with an unbounded map. Use hist(), lhist(), and count(), and don't printf a million events a second unless you want to watch a tree fall in a forest you can't observe.
Once it clicks, you stop reaching for strace entirely for production diagnostics. The Brendan Gregg book and the bpftrace reference guide on GitHub are the only docs you need.
What If Engineering
2026-06-08
Shipping containers are tempting building blocks: standardized, structural, and absurdly cheap at ~$3,000 used. Architects love stacking them four or five high for trendy apartments. But what if we went vertical — say, 100 stories? Let's see where the steel cries uncle.
The container as a column. A standard 40-foot ISO container weighs ~3,800 kg empty. Its corner posts — four vertical Corten steel tubes roughly 160×160 mm with ~6 mm walls — carry essentially all vertical load. ISO 1496 certifies each container to be stacked 9 high fully loaded (about 192,000 kg on the bottom corners), giving roughly 1.9 MN per container of rated capacity. The corner castings themselves are tested to ~848 kN each, so ~3.4 MN ultimate before destructive yield.
How tall before the bottom crushes? A container is 2.9 m tall. 100 stories = 290 m (somewhat shorter than Chicago's 346 m Aon Center). If each upper container weighs 30,000 kg loaded (a modest apartment fit-out with furniture, plumbing, two people), the bottom container supports 99 × 30,000 kg × 9.81 m/s² ≈ 29.1 MN. That's 15× the rated stacking load and 8.5× the ultimate corner casting strength. The bottom row pancakes long before you reach floor 30.
Working backwards: with rated 1.9 MN capacity and 30,000 kg per floor (294 kN), you get ~6 floors before yielding the corners. Even using the destructive ultimate (3.4 MN), you cap out at ~11 floors. This matches what's actually been built — the 11-story Containerwolf dorm in the Netherlands needed a separate steel skeleton inside.
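The stacking arithmetic can be checked in a few lines. This sketch uses only the figures quoted above (1.9 MN rated, 3.4 MN ultimate, 30,000 kg per loaded container); the function names are hypothetical, and the model ignores buckling, eccentricity, and any safety factor.

```python
G = 9.81  # m/s^2


def bottom_load_newtons(stories: int, mass_per_container_kg: float) -> float:
    """Weight resting on the bottom container: everything above it."""
    return (stories - 1) * mass_per_container_kg * G


def max_floors_above(capacity_newtons: float, mass_per_container_kg: float) -> int:
    """Whole containers the bottom unit can carry before exceeding capacity."""
    return int(capacity_newtons // (mass_per_container_kg * G))


load_100 = bottom_load_newtons(100, 30_000)       # about 29.1 MN
floors_rated = max_floors_above(1.9e6, 30_000)    # rated stacking capacity
floors_ultimate = max_floors_above(3.4e6, 30_000) # destructive ultimate
```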
Brute-force fix #1: thicken the corner posts. Replace the 6 mm Corten tubes with solid 160×160 mm steel billets. Cross-section becomes 0.0256 m². At A572 Grade 50 steel (345 MPa yield), each post handles 8.8 MN, four posts give 35 MN. Suddenly 100 stories pencils out, but each container now hides ~580 kg of solid steel per post (0.0256 m² × 2.9 m × 7,850 kg/m³), about 2,300 kg total. You've turned a $3,000 box into a $30,000 box, and the "container" is mostly just a decorative skin around a conventional steel frame. You reinvented the skyscraper.
The real killer: lateral load. Wind on a 290 m × 30 m face at 50 m/s with drag coefficient 1.3 gives a force of 0.5 × 1.225 × 50² × 1.3 × 8,700 ≈ 17.3 MN. The overturning moment at the base is ~17.3 MN × 145 m ≈ 2.5 GN·m. Containers connect only at eight corner castings via twist-locks rated to ~250 kN in shear. To resist that moment across a 30 m base, you need tension/compression couples around 83 MN — 330× the twist-lock rating. Without a moment frame or diagonal bracing welded across the entire facade, the building peels apart like a Jenga tower in a stiff breeze.
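The lateral-load numbers follow from the standard drag formula. A minimal sketch using the values in the paragraph above (uniform 50 m/s wind over the full height, resultant acting at mid-height, a simple two-sided couple across the 30 m base); variable names are hypothetical and the model is deliberately crude.

```python
AIR_DENSITY = 1.225  # kg/m^3 at sea level


def wind_force(speed_ms: float, drag_coeff: float, area_m2: float) -> float:
    """Drag force F = 0.5 * rho * v^2 * Cd * A, in newtons."""
    return 0.5 * AIR_DENSITY * speed_ms ** 2 * drag_coeff * area_m2


height_m, width_m = 290.0, 30.0
force = wind_force(50.0, 1.3, height_m * width_m)  # about 17.3 MN
moment = force * height_m / 2                       # about 2.5 GN*m at the base
couple = moment / width_m                           # tension/compression pair
ratio_to_twist_lock = couple / 250e3                # vs. ~250 kN shear rating
```

Carrying full precision gives a ratio of roughly 335; the 330× in the text comes from rounding the couple down to 83 MN first.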
Bonus problem: Corten's 2 mm/century corrosion rate is fine for an ocean voyage. Over a 50-year building life with thermal cycling and condensation between stacked walls (mold heaven), you lose ~1 mm of wall, or 17% of the corner post thickness. Buckling capacity scales with t³, so (5/6)³ ≈ 0.58 of it remains and you've lost ~42% of your structural margin to rust.
Wikipedia Rabbit Hole
2026-06-08
Wikipedia: Read the full article
Drive south from San Francisco on Interstate 280 and you'll cross over something genuinely strange: a two-mile-long building, perfectly straight, that for decades held the record as the longest building in the United States. Beneath it runs a tunnel where electrons are flung to within a whisker of the speed of light. Above it sits the klystron gallery — a corridor of high-powered vacuum tubes that pump microwave energy into the beamline below, like a 3.2-kilometer pipe organ playing to subatomic particles.
This is SLAC, the Stanford Linear Accelerator Center, and its history is studded with Nobel Prizes. Three of them, in fact, came directly from work done inside that beam pipe. Burton Richter shared the 1976 prize for co-discovering the J/ψ particle at SLAC's SPEAR ring in 1974, independently and simultaneously with a team at Brookhaven, which confirmed the existence of the charm quark. Richard Taylor shared the 1990 prize for proving that protons aren't fundamental: they're made of quarks, which SLAC saw by firing electrons at them and watching how they bounced, in a beautiful echo of Rutherford's gold-foil experiment from about 60 years earlier. Martin Perl won in 1995 for discovering the tau lepton, the heavy cousin of the electron.
But the klystrons are the unsung heroes of the whole place. A klystron is a vacuum tube invented by the Varian brothers at Stanford in 1937 (the very same family Ansel Adams photographed sand dunes for in memoriam). It works by bunching up a beam of electrons with a microwave signal, then extracting amplified power as those bunches drift past resonant cavities. SLAC has rows upon rows of them, each one a refrigerator-sized device kicking out tens of megawatts of pulsed microwave power. Without klystrons, the linear accelerator simply doesn't accelerate.
And here's where things get delightfully unexpected:
The same klystron technology that powers SLAC also sits inside every radar dish that guided WWII bombers, every satellite ground station, and — in a smaller cousin form called the magnetron — the microwave oven in your kitchen. The line from "warming up leftover pizza" to "discovering the substructure of matter" is, remarkably, a fairly short one. Both rely on the same trick: take a beam of electrons, give it a gentle shove at just the right frequency, and harvest a tidal wave of microwave power.
Daily YT Documentary
2026-06-08
Channel: Avian Eddie (65 subscribers)
Of today's candidates, this bird-focused mini documentary stands out as the most genuinely educational. The Red-eyed Vireo (Vireo olivaceus) is one of North America's most abundant breeding songbirds, yet it's rarely seen because it spends most of its life high in the leafy canopy. That makes it a perfect subject for a focused mini-doc: most viewers have probably heard this bird without realizing it.
Small-channel birding videos like this one tend to be made by genuine enthusiasts who care about accuracy — and Avian Eddie's framing as a "facts" documentary suggests a structured tour through identification, song, range, and behavior rather than just pretty footage. Red-eyed Vireos are famous among ornithologists for their relentless singing (a single male has been recorded delivering over 20,000 songs in a day) and for their remarkable migration from North American forests down to the Amazon basin.
The other options today are largely school projects, vague "tribute" videos, or AI-generated channels with near-zero subscribers and no real expertise behind them. A short, specific natural-history piece on a single species is the kind of thing small YouTube does best — niche, sincere, and informative.
Daily YT Electronics
2026-06-08
Channel: TwentyTwoLab (1160 subscribers)
The BC547 is one of the most ubiquitous NPN transistors in hobbyist electronics, and understanding how to use it as a low-side switch is a foundational skill. This video walks through a practical breadboard build paired with a circuit diagram — the combination of theory and hands-on demonstration that's often missing from quick component overviews.
What makes transistor-as-switch tutorials worthwhile is that they tie together several concepts: base current limiting (why you need that resistor between your signal source and the base), saturation (driving the base hard enough to minimize Vce and waste less power as heat), and the relationship between hFE and load current. Once you internalize this with a BC547 driving an LED or small load, the same pattern scales up to MOSFETs driving motors, relays, or anything else.
For anyone learning Arduino or microcontroller projects, this is the bridge between "my GPIO can blink an LED" and "my GPIO can control real-world loads." The BC547 specifically handles up to ~100mA, making it perfect for driving small relays, buzzers, or the base of a larger Darlington pair. Watch for: the base resistor calculation and whether the demonstrator explains why saturation matters versus the linear region.
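For a sense of what the base resistor calculation looks like, here is a rough sketch. Every number in it is an assumption for illustration, not taken from the video: a 5 V GPIO, a 20 mA load, a base-emitter drop of about 0.7 V, and a "forced beta" of 10 (deliberately overdriving the base well below the datasheet hFE so the transistor saturates). The function name `base_resistor_ohms` is hypothetical.

```python
def base_resistor_ohms(
    v_drive: float,
    i_load_amps: float,
    forced_beta: float = 10.0,  # assumed overdrive ratio for hard saturation
    v_be: float = 0.7,          # assumed base-emitter drop for a silicon NPN
) -> float:
    """Base resistor that supplies i_load / forced_beta of base current."""
    i_base = i_load_amps / forced_beta
    return (v_drive - v_be) / i_base


# Assumed example: 5 V logic driving a 20 mA load.
r_base = base_resistor_ohms(5.0, 0.020)
```

In practice you would round down to the nearest standard value (2.2 kΩ is slightly high here, 2.0 kΩ or 1.8 kΩ gives more margin) and confirm the GPIO can source the resulting base current.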
Daily YT Engineering
2026-06-08
Channel: STEPX Journal (149 subscribers)
Most of this week's batch is shorts and hashtag spam, but this one is a genuine lecture-format treatment of the general heat conduction equation — the partial differential equation that sits underneath every thermal engineering problem from CPU heatsinks to building insulation to spacecraft re-entry.
The video promises to work through Fourier's law (the constitutive relation that says heat flux is proportional to the negative temperature gradient, q = -k∇T), derive the conduction equation from an energy balance on a differential control volume, and then walk through the boundary conditions that turn an abstract PDE into a solvable problem: prescribed temperature (Dirichlet), prescribed heat flux (Neumann), and convective boundaries (Robin / mixed).
That boundary-condition piece is where a lot of self-taught learners get stuck. You can find Fourier's law on Wikipedia in thirty seconds, but knowing which boundary condition to apply to an insulated wall versus a fin in crossflow versus a heated surface is the practical skill that lets you actually solve thermal problems. A focused lecture from a small channel that takes time on this is more useful than a polished overview that skips it.
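For reference, the equation and the three boundary-condition types can be written compactly. The notation below is the common textbook form, not necessarily what the video uses: density ρ, specific heat c_p, conductivity k, volumetric generation q̇, outward normal n, surface temperature T_s, imposed flux q''_s, convection coefficient h, and fluid temperature T_∞ are all assumed symbols.

```latex
% Fourier's law and the general heat conduction equation
\mathbf{q} = -k \nabla T,
\qquad
\rho c_p \frac{\partial T}{\partial t} = \nabla \cdot \left( k \nabla T \right) + \dot{q}

% Boundary conditions on a surface with outward normal n
\text{Dirichlet (prescribed temperature):} \quad T = T_s
\text{Neumann (prescribed flux):} \quad -k \frac{\partial T}{\partial n} = q''_s
  \quad \left( \text{insulated wall: } \frac{\partial T}{\partial n} = 0 \right)
\text{Robin (convective):} \quad -k \frac{\partial T}{\partial n} = h \left( T - T_\infty \right)
```

An insulated wall is the zero-flux special case of the Neumann condition, while a fin in crossflow is the typical Robin case.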
STEPX Journal is tiny (149 subs) and the production is clearly student-grade, but the topic scope is real engineering coursework, not pop-science.
Daily YT Maker
2026-06-08
Channel: MizPRedator (915 subscribers)
Dowels are one of those shop consumables that seem trivial until you need a specific diameter in a specific wood species and realize the hardware store only carries birch in three sizes. This short build tackles that problem with a shop-made dowel maker — essentially a jig that lets you feed roughly-square stock through a cutting aperture and pull out a clean cylindrical dowel on the other side.
The principle is the same one used by commercial dowel plates and rounders: a sharpened bore (often a sized hole in steel, or a router/chisel-style cutter mounted in wood) shaves the corners off a slightly oversized blank as it's driven or pulled through. The result is a dowel matching the bore's diameter, in whatever species of scrap you happen to have on the bench.
Why this is worth the 60 seconds: it's a tool that makes tools, which is the kind of leverage a small workshop benefits from most. Once you have it, you can produce custom-diameter dowels in walnut, maple, oak, or contrasting woods for through-tenon pegs, drawer pulls, miniature work, or repairs where matching the original wood matters. The build itself teaches a useful concept — that cutting geometry, not exotic materials, is what determines whether a jig works.
MizPRedator's channel sits at the sweet spot of small-shop pragmatism: real builds, no fluff, and ideas you can replicate with scrap and an afternoon.
Daily YT Welding
2026-06-08
Channel: John Shaw (777 subscribers)
Pricing custom fabrication work is one of the hardest skills for any small metal shop to develop, and it rarely gets discussed openly. John Shaw walks through a real job — an engraved planter that doubles as a grave marker — and breaks down how he arrives at a number that covers material, machine time, consumables, design work, and welding labor without scaring off the customer.
What makes this worth watching is the combination of shop process and business reasoning. You see the CAD layout, the plasma cut, the cleanup, and the TIG/MIG welding that brings the piece together, but the running commentary is about why each step costs what it costs. Shaw also touches on the Etsy side of his business, where pricing has to compete globally, versus local custom commissions where the relationship and turnaround time matter more.
For anyone running a CrossFire, Langmuir, or ArcDroid table out of a home shop and trying to turn it from hobby into side income, this is the kind of honest, numbers-on-the-table conversation that's hard to find. It's also useful for customers who want to understand why a custom metal piece isn't $20.
