24 newsletters today.
Abandoned Futures
2026-06-03
In May 1964, at Edwards Air Force Base, a stubby little jet with two enormous holes in its wings lifted straight up off the desert floor. The Ryan XV-5A Vertifan wasn't using rotors, tilting ducts, or vectored thrust. It was using fan-in-wing propulsion: two 62.5-inch-diameter lift fans buried flush inside the wings, plus a smaller one in the nose, all driven by exhaust diverted from a pair of General Electric J85-GE-5 turbojets onto turbine buckets at the fan tips. When the louvers above the fans closed, it was a conventional jet doing 547 mph. When they opened, it was a flying carpet.
The aircraft was built by Ryan Aeronautical in San Diego under a 1961 U.S. Army contract worth roughly $35 million for two prototypes (about $370 million today). The fan technology came from General Electric's X353-5B lift-fan system, which used a clever bit of fluid mechanics: hot exhaust from the J85s was ducted to the rim of each lift fan, where it spun small turbine buckets at the blade tips. This meant the fans produced about 2.6 times the thrust of the raw jet exhaust they consumed — a thrust augmentation ratio that made vertical takeoff actually affordable in fuel terms, unlike the Harrier's brute-force vectored thrust.
Why did it die?
Here's the case for 2026: the fan-in-wing concept's hardest problem was control authority during transition, and that problem is solved. Modern fly-by-wire systems run the F-35B's lift fan / roll-post / swivel-nozzle ballet at 400 Hz without breaking a sweat. Ryan's tip-turbine drive eliminates the F-35B's heavy shaft, clutch, and gearbox — the Rolls-Royce LiftSystem weighs about 4,500 lb and consumes 27,000 horsepower through a mechanical driveshaft running through the airframe. Tip-turbine drive replaces all that with ductwork.
Modern materials make it even more attractive. The original GE fans used steel rotors that ran hot and heavy. Ceramic matrix composites (CMCs), already flying as HP turbine shrouds in the LEAP engine, could let tip turbines handle 2,400°F gas paths at a fraction of the weight. Carbon-fiber fan blades with metal leading edges (titanium on the GE90 and GEnx, steel on the GE9X) would let the same fan diameter produce 40-60% more thrust.
The application that actually wants this: autonomous logistics VTOL. DARPA's ANCILLARY program and the Marines' Aerial Logistics Connector requirement both ask for 1,000-lb-payload VTOL with 300+ kt cruise. Tilt-rotors are mechanically baroque. Lift-plus-cruise quadcopters can't cruise efficiently. A fan-in-wing UAV with CMC tip turbines and modern FBW splits the difference — clean wing in cruise, no exposed rotors, no transition gymnastics that need a human pilot. Ryan was right; the louvers just needed software.
Daily Automotive Engines
2026-06-03
The turbo shaft spins at 150,000-280,000 RPM under boost. How you support that shaft determines spool time, durability, and how forgiving the turbo is to oil contamination. Two camps fight this out: journal bearings (hydrodynamic plain bearings) and ball bearings (angular-contact rolling elements).
Journal bearings are simple bronze or brass sleeves with a pressurized oil film between the shaft and bearing surface. The shaft never actually touches the bearing — it rides on a hydrodynamic wedge of oil maybe 0.0005" thick. Cheap to manufacture, tolerant of dirty oil, and self-damping (the oil film absorbs shaft vibration). Downside: significant parasitic friction at low RPM before the oil wedge fully forms, plus the oil film itself creates viscous drag.
Ball bearings use angular-contact ceramic or steel balls in a cartridge — typically one cartridge supporting both turbine and compressor ends. Rolling friction is roughly 50% lower than a journal bearing at spool-up RPM. The turbo accelerates faster, transient response sharpens noticeably, and you typically gain 15-20% faster spool time on the same hardware.
Real-world example: Garrett's GT2860RS (journal) vs GTX2860R Gen II (ball bearing) — same compressor wheel and turbine, identical A/R housings. Independent dyno testing on a 2.0L EJ20 Subaru shows full boost arriving at ~3,400 RPM with the journal vs ~2,900 RPM with the ball bearing. That's 500 RPM earlier on a turbo where the rotating assembly is otherwise identical.
The rule of thumb for spool time:
Tradeoffs ball bearings can't escape:
One nuance: ball bearing turbos need a 0.040"-0.062" restrictor in the oil feed line. Too much oil pressure floods the cartridge and oil leaks past the piston-ring seals. The classic "ball bearing turbo smoking on overrun" complaint is almost always a missing or wrong-size restrictor.
Daily Debugging Puzzle
sync.Mutex Value Receiver Trap: The Lock That Guards a Copy
2026-06-03
This counter is supposed to be safe for concurrent use. A thousand goroutines each call Increment, and we expect to see 1000 at the end. We get 0. The mutex is taken, the addition runs, no data race is reported by -race. What went wrong?
package main

import (
	"fmt"
	"sync"
)

type SafeCounter struct {
	mu    sync.Mutex
	count int
}

func (c SafeCounter) Increment() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.count++
}

func (c SafeCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.count
}

func main() {
	c := SafeCounter{}
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() { defer wg.Done(); c.Increment() }()
	}
	wg.Wait()
	fmt.Println("Count:", c.Value()) // prints 0
}
Look at the receivers: func (c SafeCounter) Increment(). That's a value receiver, not a pointer receiver. Every call to c.Increment() copies the entire SafeCounter struct, including its sync.Mutex field, onto the goroutine's stack. Each goroutine then locks its own private copy of the mutex, increments its own private copy of count, and discards both when the method returns. The original c.count in main is never touched.
Worse: copying a sync.Mutex is itself a bug. A mutex is a pair of state words; duplicating one mid-lifecycle can produce a copy that thinks it's locked when nothing holds it, or vice versa. Even when the copy starts in the zero state (as here), you've severed every caller from every other caller — the mutex protects nothing meaningful because no two goroutines share one.
The race detector won't catch it, because there is no race: every increment happens on a distinct memory location. go vet does catch it, emitting Increment passes lock by value: SafeCounter contains sync.Mutex. But vet has to be run explicitly: go build never runs it, and go test runs only a limited subset of vet checks (copylocks is not among them by default). The program compiles and runs without complaint, and the warning is easy to miss if you're skimming output.
Use pointer receivers for any method on a type containing a mutex (or any type whose identity matters):
func (c *SafeCounter) Increment() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.count++
}

func (c *SafeCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.count
}
Now c.Increment() passes the address of c, every goroutine locks the same mutex, and the increment is visible across the program. The output becomes 1000, as intended.
The deeper lesson is that Go's choice between value and pointer receivers is not stylistic — it's a correctness boundary. The rule of thumb most teams adopt:
If a type contains a sync.Mutex, sync.RWMutex, sync.Once, sync.WaitGroup, or atomic.*, use pointer receivers everywhere, including methods that don't touch the lock. Mixing receiver kinds invites accidental copies through interface satisfaction.
Run go vet ./... in CI and fail the build on its output. It catches this, the copylocks analyzer is on by default, and the diagnostic is unambiguous. Trusting "go build passed" is not enough.
A value receiver on a type containing sync.Mutex silently copies the lock per call, so every goroutine ends up serializing against its own private mutex. Always use pointer receivers for types with synchronization primitives, and make go vet failures break your build.
Daily Digital Circuits
2026-06-03
You already know combinational hazards exist — a logic output can momentarily glitch when inputs transition, even when both the start and end states agree the output shouldn't change. The follow-up question is: how do synthesis tools actually find and eliminate those hazards before tapeout? The answer is a formal procedure built on prime implicants and Boolean cube covering.
A static 1-hazard occurs when output should stay at 1, but a single input transition causes a momentary 0. Geometrically: two minterms that are adjacent on the Karnaugh map (differ in one variable) are covered by different product terms, with no overlapping cube bridging them. As the input flips, the first AND gate turns off before the second turns on — and the OR briefly sees all zeros.
The fix: add a redundant prime implicant that covers the transition boundary. The resulting cover is called a complete sum — every pair of adjacent minterms in the ON-set is covered by at least one common cube. Synthesis tools call this hazard-free two-level minimization.
Concrete example. F(A,B,C) = A'B + AC. ON-set: {010, 011, 101, 111}. Consider transition 011 → 111 (A flips 0→1, B and C stable). At 011, A'B=1, AC=0, F=1. At 111, A'B=0, AC=1, F=1. Mid-transition, with A in between, both products can be 0 → output glitches to 0. Adding BC (which covers both 011 and 111) eliminates the hazard: F = A'B + AC + BC. The extra term is logically redundant but topologically essential.
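The covering argument above is mechanical enough to check in a few lines. The Go sketch below is only an illustration, not how any particular synthesis tool works: the Cube type, the bit ordering (A is bit 2, B is bit 1, C is bit 0), and the Static1Hazards name are all assumptions made for this example. It flags every pair of adjacent ON-set minterms that no single product term spans.

```go
package main

import "fmt"

// Cube is one product term. Care marks which variables appear in the term,
// Value gives the polarity each of those variables must have.
// Bit layout (an assumption of this sketch): A = bit 2, B = bit 1, C = bit 0.
type Cube struct {
	Name  string
	Care  uint8
	Value uint8
}

// Covers reports whether the product term evaluates to 1 at input m.
func (c Cube) Covers(m uint8) bool { return m&c.Care == c.Value }

// onSet reports whether the sum-of-products cover evaluates to 1 at input m.
func onSet(cover []Cube, m uint8) bool {
	for _, c := range cover {
		if c.Covers(m) {
			return true
		}
	}
	return false
}

// Static1Hazards returns every single-variable transition between two
// ON-set minterms that is not spanned by any one cube in the cover.
func Static1Hazards(cover []Cube, nVars int) [][2]uint8 {
	var hazards [][2]uint8
	for m := 0; m < 1<<nVars; m++ {
		for bit := 0; bit < nVars; bit++ {
			n := m ^ (1 << bit)
			if n < m {
				continue // visit each adjacent pair once
			}
			a, b := uint8(m), uint8(n)
			if !onSet(cover, a) || !onSet(cover, b) {
				continue // output is not 1 on both sides, so no static-1 hazard
			}
			bridged := false
			for _, c := range cover {
				if c.Covers(a) && c.Covers(b) {
					bridged = true
					break
				}
			}
			if !bridged {
				hazards = append(hazards, [2]uint8{a, b})
			}
		}
	}
	return hazards
}

func main() {
	f := []Cube{
		{"A'B", 0b110, 0b010},
		{"AC", 0b101, 0b101},
	}
	for _, h := range Static1Hazards(f, 3) {
		fmt.Printf("hazard: %03b <-> %03b\n", h[0], h[1])
	}
	fixed := append(f, Cube{"BC", 0b011, 0b011})
	fmt.Println("hazards after adding BC:", len(Static1Hazards(fixed, 3)))
}
```

For F = A'B + AC the only flagged transition is 011 <-> 111, and adding the redundant cube BC clears it. The check covers single-input changes in a two-level cover only; dynamic hazards in multi-level logic need the path analysis described next.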
Dynamic hazards are worse — output transitions multiple times when it should transition once. They appear in multi-level logic where one input fans out through paths of different depths. Detection requires path sensitization analysis: trace every signal path from a transitioning input to the output and check if more than one path is simultaneously sensitized with opposing polarities.
Rule of thumb for synthesis: if your downstream logic is edge-triggered (samples on a clock edge after combinational settling), hazards don't matter — STA's setup-time check absorbs them. If downstream logic is level-sensitive (latches, asynchronous resets, clock-gating enables, async FIFO pointers), every hazard is a potential bug. Synthesis tools like Synopsys DC accept set_hazard_free attributes on those specific nets and add redundant cubes only where needed.
Real-world bite: a clock-gating enable computed as en = sel ? a : b with a glitchy sel can produce a runt clock pulse downstream, corrupting hundreds of flip-flops simultaneously — the exact reason ICG cells require glitch-free enables and CAD tools refuse to gate clocks with hazard-prone logic.
Daily Electrical Circuits
2026-06-03
When you cascade gain stages in an amplifier, each stage contributes a pole. With three or more poles inside a feedback loop, the phase shift can exceed 180° before the loop gain falls below unity — and your "amplifier" becomes an oscillator. Dominant-pole compensation is the most common fix: deliberately add a low-frequency pole so the gain rolls off at –20 dB/decade and crosses unity gain before the other poles add significant phase shift.
The idea is brutally simple. If your uncompensated amp has poles at 1 MHz, 10 MHz, and 50 MHz with an open-loop DC gain of 100 dB, the phase shift passes 180° (on its way toward 270°) while the loop gain is still far above unity: wildly unstable under feedback. Add a dominant pole at, say, 10 Hz, and the gain now rolls off cleanly from 100 dB at 10 Hz, hitting 0 dB near 1 MHz (the old first pole). At that crossover the dominant pole has contributed –90° and the old 1 MHz pole roughly another –45°, so the phase margin is only about 45°: stable, but with no room to spare. For a more comfortable margin, cross unity well below 1 MHz by pushing the dominant pole lower still.
How to implement it:
Concrete example: The classic LM301A has an external compensation pin. For unity-gain stability, the datasheet says 30 pF. But if you only need a closed-loop gain of 10, you can drop to 3 pF: the feedback network now attenuates the loop gain by 20 dB, so you can let the dominant pole sit 10× higher in frequency. Result: 10× the bandwidth for the same phase margin. This is why "decompensated" op-amps exist: they trade unity-gain stability for bandwidth at higher gains.
Rule of thumb: Place the dominant pole such that the gain-bandwidth product equals the frequency of your second pole. So if the next pole is at 5 MHz and DC gain is 10⁵ (100 dB), put the dominant pole at 5 MHz / 10⁵ = 50 Hz. This guarantees the second pole hits exactly at the unity-gain crossover, giving ~45° phase margin — marginal but stable. For 60° margin (the usual target), push the dominant pole down another factor of 2.
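The rule of thumb can be checked numerically. The sketch below assumes an all-pole loop gain with real poles and unity feedback; the function names and the bisection search range are choices made for this example, not a standard API. It finds the unity-gain crossover and reports the phase margin there.

```go
package main

import (
	"fmt"
	"math"
)

// gainAndPhase returns |T(jf)| and the phase in degrees for a loop gain with
// DC gain a0 and real poles at the given frequencies (Hz).
func gainAndPhase(a0 float64, poles []float64, f float64) (mag, phaseDeg float64) {
	mag = a0
	for _, p := range poles {
		r := f / p
		mag /= math.Sqrt(1 + r*r)
		phaseDeg -= math.Atan(r) * 180 / math.Pi
	}
	return mag, phaseDeg
}

// PhaseMargin locates the unity-gain crossover by bisection on a log
// frequency axis (valid because an all-pole magnitude falls monotonically)
// and returns the crossover frequency and the phase margin in degrees.
// Assumes the gain is above 1 at 1 mHz and below 1 at 1 THz.
func PhaseMargin(a0 float64, poles []float64) (fc, pm float64) {
	lo, hi := 1e-3, 1e12
	for i := 0; i < 200; i++ {
		mid := math.Sqrt(lo * hi)
		if m, _ := gainAndPhase(a0, poles, mid); m > 1 {
			lo = mid
		} else {
			hi = mid
		}
	}
	fc = math.Sqrt(lo * hi)
	_, ph := gainAndPhase(a0, poles, fc)
	return fc, 180 + ph
}

func main() {
	// DC gain 1e5 (100 dB), second pole at 5 MHz, as in the rule of thumb.
	for _, fd := range []float64{50, 25} {
		fc, pm := PhaseMargin(1e5, []float64{fd, 5e6})
		fmt.Printf("dominant pole %.0f Hz: crossover %.2f MHz, phase margin %.1f deg\n",
			fd, fc/1e6, pm)
	}
}
```

Run as written, the 50 Hz case comes out a little above 50°, slightly better than the 45° straight-line estimate because the second pole also pulls the actual crossover below 5 MHz. Halving the dominant pole to 25 Hz lands in the mid-60s, consistent with the factor-of-2 advice.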
The cost? Bandwidth. You're throwing away gigahertz of potential to buy stability. That's the deal — and it's almost always worth it.
Daily Engineering Lesson
2026-06-03
Many processes don't care about the absolute flow of any one stream — they care that two streams stay in a fixed proportion. Burner air-to-fuel, chlorine-to-water in disinfection, blending two reactants, diluting a concentrate. If the ratio drifts, you get incomplete combustion, undertreated water, off-spec product, or an explosion. Ratio control is the standard scheme that solves this.
The setup has two streams: a wild flow (uncontrolled, set by upstream demand) and a controlled flow (the one with a valve you can move). A flow transmitter measures the wild stream. That measurement is multiplied by a ratio setpoint R, and the result becomes the setpoint for the controlled stream's PID loop:
controlled-flow setpoint = R × measured wild flow
When the wild flow rises, the controlled setpoint rises with it. The PID loop chases that moving target and the ratio holds. The ratio block itself is just a multiplier — no integral action, no tuning. All the work is done by the underlying flow loop.
Real example — combustion air on a natural gas burner. Steam demand swings the fuel valve (fuel is wild, driven by a master pressure controller). You want roughly 10 standard cubic feet of air per cubic foot of methane for clean burn, plus 10% excess air for safety, so R ≈ 11. Fuel flow doubles? The air setpoint doubles, the damper opens, combustion stays clean. Without ratio control, the air loop would lag the fuel change — you'd get a brief fuel-rich pocket, soot, and possibly a flameout.
Cross-limiting is the safety twist on this scheme. On a fuel increase, you raise air first, then let fuel follow the actual measured air. On a fuel decrease, you cut fuel first, then let air follow. The controlled stream is always slaved to whichever side keeps the mix lean. Most industrial combustion control systems do this, and burner management systems (BMS) apply the same air-first logic at startup: air flow is established before fuel is admitted.
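The ratio block and the cross-limiting selectors are small enough to sketch directly. The version below assumes one common arrangement in which a master demand signal (in fuel-flow units) feeds both loops through a low selector and a high selector; the function names and the numbers in main are illustrative, and the PID loops that chase these setpoints are not modelled.

```go
package main

import (
	"fmt"
	"math"
)

// RatioSetpoint is the plain ratio block: a multiplier with no dynamics.
func RatioSetpoint(wildFlow, ratio float64) float64 {
	return ratio * wildFlow
}

// CrossLimit returns the fuel and air flow setpoints for a given master
// demand (in fuel units), measured fuel flow, measured air flow, and
// air-to-fuel ratio.
func CrossLimit(demand, fuelMeas, airMeas, ratio float64) (fuelSP, airSP float64) {
	// Low select: fuel may never exceed what the measured air can burn.
	fuelSP = math.Min(demand, airMeas/ratio)
	// High select: air may never drop below what the measured fuel needs.
	airSP = RatioSetpoint(math.Max(demand, fuelMeas), ratio)
	return fuelSP, airSP
}

func main() {
	const r = 11.0 // air per unit fuel, including excess air

	// Steady state: fuel 100, air 1100. Demand steps up to 150.
	f, a := CrossLimit(150, 100, 1100, r)
	fmt.Println("demand up:   fuel SP", f, "air SP", a)

	// Demand steps down to 60 from the same steady state.
	f, a = CrossLimit(60, 100, 1100, r)
	fmt.Println("demand down: fuel SP", f, "air SP", a)
}
```

On the step up, the fuel setpoint stays at 100 until measured air rises, while the air setpoint jumps to 1650. On the step down, fuel drops to 60 immediately and the air setpoint holds at 1100 until measured fuel falls. Either way the transient is air-rich.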
Rule of thumb: ratio control only works if the controlled stream's PID loop is faster than the wild stream's disturbances. If the wild flow can swing in 2 seconds but your control valve takes 10 seconds to stroke, the ratio will lag and you've got nothing. Pair fast-acting wild streams with fast valves (small, electric, or high-Cv pneumatic with a positioner), and add feedforward if the lag is unavoidable.
Watch for measurement units: if one flow is mass and the other is volumetric, your ratio constant has to include density. Compressible gases at varying pressure need temperature/pressure compensation or your "constant ratio" silently drifts with the weather.
Forgotten Darkroom
2026-06-03
Book: Practical hints on the daguerreotype: being simple directions for obtaining portraits, views, copies of engravings and drawings, sketches of machinery, etc., etc. by the Daguerreotype process by J. H. Croucher (1845)
Read it: Internet Archive
In the opening pages of his 1845 manual on the daguerreotype, J. H. Croucher slips in a small but startling aside about what the field of photography is actually called:
"Notwithstanding the many valuable discoveries with which the researches of Sir John Herschell, Mr. Fox Talbot, Mr. Robert Hunt, and other distinguished philosophers, native and foreign, have recently enriched the science of Photography, or, as it is now termed, Actino-Chemistry, the Daguerreotype process, first divulged in 1839, still retains the highest place in public estimation."
"As it is now termed, Actino-Chemistry." In 1845, six years after Louis Daguerre unveiled his miracle to the world, there was a serious push to rename the entire field. The word "photography" — light-writing, from the Greek phos and graphein — was considered by some philosophers to be too superficial. It described only the visible product. The new proposed name, actino-chemistry, from the Greek aktis (ray) plus chemistry, described what was actually happening: chemical reactions driven by radiation.
The book itself was published in London by Willats, opticians of Cheapside, as the second installment in their "Photographic Manuals" series. Croucher wrote for the burgeoning crowd of amateurs who had bought daguerreotype kits and were now exposed, as he gently puts it, "to frequent annoyance and disappointment." His goal was practical: short directions, no technicalities.
So why did "actino-chemistry" lose? Partly because Herschel's coinage "photography" was simply easier to say, and the public had already adopted it. But the deeper reason is that the term was too accurate. It correctly identified that photography was a subset of a much larger chemical phenomenon — light driving reactions — and the public didn't want a subset; they wanted a name for the magical thing that produced portraits.
But the root never quite died:
The 1845 philosophers were ahead of their time. They understood, decades before the photoelectric effect was formally described by Hertz (1887) and explained by Einstein (1905), that light was a chemical agent — not just a passive illuminator, but an active reagent that broke and made bonds. The "Practical Hints" book is, on its surface, a how-to for polishing silver plates and fuming them with iodine. But buried in its first paragraph is a glimpse of a road not taken: a parallel universe where every smartphone has a tiny actinochemical sensor, and where "actinographer" would be the name for someone who takes pictures.
Forgotten Patent
2026-06-03
On May 31, 1960, two Bell Labs engineers filed US Patent 3,102,230 — "Electric Field Controlled Semiconductor Device." The inventors were Mohamed M. "John" Atalla, an Egyptian-born engineer, and Dawon Kahng, a Korean-born physicist. Almost nobody remembers their names. But by some estimates, humans have manufactured more than 13 sextillion (1.3 × 10²²) copies of their invention — making the MOSFET the most-produced artifact in the history of our species, by orders of magnitude.
A MOSFET — Metal-Oxide-Semiconductor Field-Effect Transistor — is a switch with no moving parts. Apply a small voltage to a "gate" electrode, and a microscopic channel of charge forms underneath a thin layer of silicon dioxide insulator, letting current flow between two other terminals (source and drain). Remove the voltage and the channel vanishes. Off. On. Off. On.
Crucially, the gate is insulated. Almost no current flows into it. That means a MOSFET draws essentially zero power when idle — a property that, scaled up to billions of devices on a chip, is the only reason your phone doesn't melt.
Bell Labs had bet on the bipolar junction transistor (Shockley, 1948) and treated the MOSFET as a curiosity. The silicon-dioxide gate was finicky; surface contamination ruined yields. RCA, Fairchild, and a tiny startup called Intel quietly pushed it forward. The breakthroughs came:
Atalla and Kahng's original device had a channel about 25 micrometers long. Modern "3nm" process nodes have effective gate lengths under 20 nanometers — roughly 1,250× smaller. The basic structure in their 1960 patent diagram is still recognizable in today's FinFETs and gate-all-around transistors. The geometry got weirder; the physics got harder; but the fundamental idea — use a field, not a current, to control conduction — has not changed in 66 years.
It already has been, repeatedly — and we're approaching the limit. Below about 1 nm, quantum tunneling lets electrons leak through gate oxides regardless of voltage. The industry's response is to wrap the gate around the channel on three or four sides (FinFET, GAAFET, RibbonFET) and to use exotic dielectrics like hafnium oxide. But every solution is still a variation on Atalla and Kahng's central trick.
Atalla later founded a security company that pioneered the hardware security module — the device that protects nearly every ATM PIN transaction today. Kahng went on to invent the floating-gate cell that stores this very webpage. Neither received a Nobel Prize. Neither is a household name. Yet between them, they built the substrate of the digital civilization.
Daily GitHub Zero Stars
2026-06-03
Language: Python
sectum-ai tackles a problem that's only going to get more painful as multi-tenant AI products mature: how do you actually prove that Tenant A's data isn't bleeding into Tenant B's model outputs, RAG retrievals, embeddings, agent memories, or fine-tunes? The repo bills itself as a multi-tenant AI verification platform that provisions synthetic tenants, hunts for cross-tenant data leakage across every AI surface, and emits tamper-evident evidence of what it found.
That framing is unusually concrete for an AI-safety adjacent project. Most "AI red team" tooling stops at prompt injection and jailbreak corpora. Sectum is aiming a layer deeper — at the isolation boundary itself, which is where regulators, enterprise security teams, and SOC2/ISO auditors are starting to ask hard questions that few vendors can answer rigorously.
What makes the approach interesting:
Who should care: platform security engineers at SaaS companies bolting LLMs onto multi-tenant products, AI infrastructure teams building shared RAG or agent systems, and GRC/compliance folks trying to map traditional tenant-isolation controls onto stochastic systems. Independent consultants doing AI assurance work could also use it as a structured test harness rather than rolling bespoke probes per engagement.
At zero stars and a fresh push, this is clearly early — but the problem statement is sharp, the scope is honest, and the niche is genuinely underserved.
Daily Hardware Architecture
2026-06-03
You've seen hardware prefetchers before, but the stream buffer is a specific structure that solves a sharp problem: how do you prefetch aggressively without trashing the cache when you're wrong? The answer is to prefetch into a separate, small FIFO that sits beside L1, not inside it.
Norman Jouppi introduced stream buffers in 1990 as a complement to victim caches. The idea: when a load misses L1, allocate a stream buffer entry and start fetching sequential cache lines (line+1, line+2, line+3, line+4) into a small FIFO — typically 4 to 16 entries deep. The lines never enter L1 until the program actually demands them. On the next miss, the L1 controller checks the stream buffers in parallel; if the address matches the head of a stream, that line gets promoted into L1 and the FIFO advances, triggering another sequential prefetch at the tail.
Why a separate buffer instead of just prefetching into L1? Two reasons:
Concrete example: matrix copy. Consider memcpy of a 1 MB array. Each load misses, but the access pattern is perfectly sequential. With a 4-entry stream buffer, after the first miss you have 4 lines in flight. By the time the CPU asks for line+1, it's already waiting in the stream buffer — promote it to L1 in a couple cycles instead of waiting 200+ cycles for DRAM. Effective bandwidth approaches the DRAM peak instead of being latency-limited.
Rule of thumb: Stream buffer depth should cover the memory latency. If DRAM latency is 200 cycles and you consume one line every 16 cycles, you need 200/16 ≈ 13 entries to fully hide latency. Modern Intel CPUs implement variations called the "Streamer" prefetcher with similar depth, watching both forward and backward strides.
Multiple stream buffers run in parallel — typically 4 to 8 — so a workload touching several arrays simultaneously (think a stencil computation reading three rows) gets a dedicated FIFO per stream. When a new miss doesn't match any existing stream, the oldest stream gets evicted, much like cache replacement.
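The allocate, head-match, and evict behavior described above can be modelled in a few dozen lines. This is a functional sketch only: it ignores timing, treats every prefetched line as already arrived, and uses least-recently-used replacement as a stand-in for "oldest stream". The type and method names are invented for the example.

```go
package main

import "fmt"

// StreamBuffer is one FIFO of prefetched line addresses; lines[0] is the head.
type StreamBuffer struct {
	lines    []uint64
	lastUsed int
}

// Prefetcher models a small set of stream buffers sitting beside L1.
type Prefetcher struct {
	bufs   []StreamBuffer
	depth  int
	clock  int
	Hits   int
	Allocs int
}

func NewPrefetcher(nBufs, depth int) *Prefetcher {
	return &Prefetcher{bufs: make([]StreamBuffer, nBufs), depth: depth}
}

// OnL1Miss handles an L1 miss for the given line address. It returns true
// when the line was waiting at the head of a stream buffer.
func (p *Prefetcher) OnL1Miss(line uint64) bool {
	p.clock++
	for i := range p.bufs {
		b := &p.bufs[i]
		if len(b.lines) > 0 && b.lines[0] == line {
			// Head hit: promote to L1, advance the FIFO, prefetch one more line.
			tail := b.lines[len(b.lines)-1]
			b.lines = append(b.lines[1:], tail+1)
			b.lastUsed = p.clock
			p.Hits++
			return true
		}
	}
	// No stream matched: replace the least recently used buffer with a new
	// stream that starts at line+1.
	victim := 0
	for i := range p.bufs {
		if p.bufs[i].lastUsed < p.bufs[victim].lastUsed {
			victim = i
		}
	}
	fresh := make([]uint64, p.depth)
	for k := range fresh {
		fresh[k] = line + 1 + uint64(k)
	}
	p.bufs[victim] = StreamBuffer{lines: fresh, lastUsed: p.clock}
	p.Allocs++
	return false
}

// DepthToHideLatency is the rule of thumb: latency divided by the consumption
// interval, rounded up.
func DepthToHideLatency(latencyCycles, cyclesPerLine int) int {
	return (latencyCycles + cyclesPerLine - 1) / cyclesPerLine
}

// RunInterleaved walks several sequential streams in lock step, the way a
// stencil reading multiple rows would.
func RunInterleaved(p *Prefetcher, bases []uint64, n int) {
	for i := 0; i < n; i++ {
		for _, base := range bases {
			p.OnL1Miss(base + uint64(i))
		}
	}
}

func main() {
	fmt.Println("depth for 200-cycle latency, 16 cycles/line:", DepthToHideLatency(200, 16))

	p := NewPrefetcher(4, 4)
	RunInterleaved(p, []uint64{1000, 5000, 9000}, 100)
	fmt.Println("4 buffers, 3 streams: hits", p.Hits, "allocations", p.Allocs)

	q := NewPrefetcher(2, 4)
	RunInterleaved(q, []uint64{1000, 5000, 9000}, 100)
	fmt.Println("2 buffers, 3 streams: hits", q.Hits, "allocations", q.Allocs)
}
```

With four buffers, each of the three streams pays one allocating miss and then hits at the head every time. With only two buffers the three streams evict each other in rotation and nothing ever hits, which is the case the per-stream FIFO count is sized to avoid.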
The elegance: stream buffers convert latency-bound sequential code into bandwidth-bound code, without ever risking the working set that's actually in L1.
Hacker News Deep Cuts
2026-06-03
Link: https://fgiesen.wordpress.com/2026/05/29/why-does-astc-use-ise-when-almost-nothing-else-does/
HN Discussion: 1 points, 0 comments
This is Fabian Giesen (ryg) — one of the most respected voices in low-level graphics and compression on the open web — writing about a genuinely obscure corner of GPU texture compression. If you don't already follow his blog, this single post is reason enough to start.
The puzzle: ASTC (Adaptive Scalable Texture Compression) is the modern standard for compressed textures on mobile and increasingly desktop GPUs. Buried inside its bitstream is something called Integer Sequence Encoding (ISE) — a scheme for packing sequences of small integers whose ranges aren't powers of two (think trits and quints: base-3 and base-5 digits packed alongside ordinary bits). Almost no other codec, compression format, or hardware spec reaches for this technique. So why ASTC?
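The counting argument behind trits is easy to see in code. The sketch below is not the ASTC bit layout, which interleaves the packed digits with plain bits and uses its own mapping; it only shows the underlying idea that five base-3 digits fit in one byte because 3^5 = 243 is at most 256 (likewise three base-5 digits fit in 7 bits, since 5^3 = 125). The function names are made up for the example.

```go
package main

import "fmt"

// PackTrits stores five base-3 digits (each 0, 1, or 2) in a single byte by
// treating them as one base-3 number. 3^5 = 243, so the result always fits.
func PackTrits(t [5]uint8) uint8 {
	var v uint16
	for i := 4; i >= 0; i-- {
		v = v*3 + uint16(t[i])
	}
	return uint8(v)
}

// UnpackTrits reverses PackTrits by peeling off base-3 digits.
func UnpackTrits(b uint8) [5]uint8 {
	var t [5]uint8
	for i := 0; i < 5; i++ {
		t[i] = b % 3
		b /= 3
	}
	return t
}

func main() {
	in := [5]uint8{2, 0, 1, 2, 1}
	packed := PackTrits(in)
	fmt.Println("packed byte:", packed)
	fmt.Println("round trip: ", UnpackTrits(packed))
	// Naive storage would spend 2 bits per trit (10 bits for five);
	// packing spends 8 bits, or 1.6 bits per trit.
}
```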
Why this matters to a technical audience:
If you've ever wondered why production formats look weird up close — why BC7 has so many modes, why JPEG's quantization tables are what they are, why H.264 has a CABAC — this is the same kind of deep-dive, by someone who actually has the receipts.
HN Jobs Teardown
2026-06-03
Source: HN Who is Hiring
Posted by: ceckhaus
Of the ten postings in this batch, 10x Genomics is the most revealing because the framing itself is a tell. The pitch leads with "help our customers understand and eradicate Covid-19" and uses the word "urgently" — a word almost no senior engineering recruiter uses unless headcount approval just dropped from above. This posting is a pandemic-response hiring surge dressed in a careers page.
The stack: Java, Rust, and React. That trio is unusual and informative:
The split across Frontend, Full Stack, and Platform roles — all senior — tells you they have a working product and need to scale it under load, not greenfield it. Platform engineers in particular get hired when infrastructure pain has become bottleneck pain.
Green flags:
Red flags:
The deeper signal: in March 2020, biotech infrastructure companies discovered they were suddenly critical infrastructure. The Java-plus-Rust stack tells you 10x Genomics is not a startup figuring out product-market fit — it's a scale-up rewriting its performance-critical paths because customer data volumes are outgrowing the original architecture.
Daily Low-Level Programming
2026-06-03
Before 2012, a classic kernel exploit pattern was: trick the kernel into dereferencing a user-controlled pointer, point it at user memory you've prepared with a fake structure or shellcode, profit. The kernel ran in ring 0 with unrestricted access to every page in the address space — including yours. SMEP (Supervisor Mode Execution Prevention, Ivy Bridge 2012) and SMAP (Supervisor Mode Access Prevention, Broadwell 2014) close this door at the page-walk level.
The mechanism is the User/Supervisor bit (bit 2) already present in every page table entry. When SMEP is enabled (CR4.SMEP=1), the CPU faults if ring 0 tries to execute a page where U/S=1. When SMAP is enabled (CR4.SMAP=1), the CPU faults if ring 0 tries to read or write such a page. No new metadata — the bits were always there; the CPU just started enforcing them in the other direction.
But the kernel legitimately needs to touch user pages — that's what copy_from_user() does. The escape hatch is RFLAGS bit 18, AC (Alignment Check), repurposed as a per-instruction SMAP override. The kernel wraps user access with STAC (set AC) and CLAC (clear AC):
stac: RFLAGS.AC=1, SMAP suspended for this CPU
mov from/to the user pointer
clac: RFLAGS.AC=0, SMAP re-engaged
The window is two instructions wide. An attacker who can't control execution between STAC and CLAC can't bypass it. Linux's copy_from_user path on x86-64 is essentially one STAC, a rep movsb, and one CLAC.
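The decision the CPU makes on each access can be written as a small truth function. The Go model below is a deliberate simplification: it looks only at the U/S bit, CR4.SMEP, CR4.SMAP, and RFLAGS.AC, and ignores the present, writable, and no-execute bits as well as implicit supervisor accesses. The type and field names are invented for this sketch.

```go
package main

import "fmt"

type Access int

const (
	Read Access = iota
	Write
	Execute
)

// CPU holds only the state bits this sketch needs.
type CPU struct {
	Supervisor bool // true when running in ring 0
	SMEP       bool // CR4.SMEP
	SMAP       bool // CR4.SMAP
	AC         bool // RFLAGS.AC, set by STAC and cleared by CLAC
}

// Faults reports whether the U/S check blocks the access.
// userPage is the U/S bit of the page table entry (true means U/S=1).
func (c CPU) Faults(userPage bool, a Access) bool {
	if !c.Supervisor {
		return !userPage // user mode may not touch supervisor pages
	}
	if !userPage {
		return false // kernel touching kernel pages passes this check
	}
	if a == Execute {
		return c.SMEP // SMEP is not affected by RFLAGS.AC
	}
	return c.SMAP && !c.AC // SMAP covers reads and writes unless AC is set
}

func main() {
	kernel := CPU{Supervisor: true, SMEP: true, SMAP: true}
	fmt.Println("kernel jumps to user page: ", kernel.Faults(true, Execute))
	fmt.Println("kernel reads user page:    ", kernel.Faults(true, Read))

	kernel.AC = true // stac
	fmt.Println("read between stac and clac:", kernel.Faults(true, Read))
	fmt.Println("jump between stac and clac:", kernel.Faults(true, Execute))
	kernel.AC = false // clac
}
```

The last two lines of output show the asymmetry: STAC opens a window for data access only, so a hijacked function pointer into user memory still faults under SMEP even inside that window.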
Concrete example: CVE-2013-1763 (sock_diag) let an attacker get the kernel to call a function pointer in a user-allocated array. On a pre-SMEP machine, kernel execution jumped to user shellcode and got root. On a SMEP machine with the same bug, the indirect branch faulted instantly — the U/S=1 page wasn't executable from ring 0 — and the kernel oopsed instead of being owned.
Rule of thumb: if your kernel module crashes with a page fault at a user-looking address (below TASK_SIZE, typically below 0x00007fffffffffff) and the fault was a write or read, check whether you forgot copy_from_user and went straight through a pointer. SMAP turned a silent exploit primitive into a loud crash.
Check it on your machine: grep -o 'smep\|smap' /proc/cpuinfo. Disable for debugging with nosmep nosmap on the kernel command line — never in production.
RFC Deep Dive
2026-06-03
RFC: RFC 3428
Published: 2002
Authors: B. Campbell (Ed.), J. Rosenberg, H. Schulzrinne, C. Huitema, D. Gurle
In late 2002, the IETF faced an awkward problem: instant messaging had exploded — AIM, ICQ, MSN, Yahoo! Messenger — but every system was a walled garden running its own proprietary protocol. SIP was already winning the war for VoIP signaling, so a working group asked the obvious question: could SIP carry messages too? RFC 3428 is the answer, and it defines the MESSAGE method that still underpins SMS-over-IP, RCS, and most carrier-grade messaging today.
The core design decision: pager mode. The authors deliberately chose not to build a session-oriented chat protocol. A SIP MESSAGE request is a one-shot transaction — like a two-way pager. The message body (typically text/plain or message/cpim) rides inside the request itself, and the recipient's UA returns a 200 OK to acknowledge receipt. No INVITE, no dialog, no media negotiation. This kept the spec to fewer than 20 pages and let existing SIP proxies route IMs with zero changes.
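To make pager mode concrete, here is a sketch that assembles a minimal MESSAGE request as text. It is modelled loosely on the style of example in the RFC, but the struct, its field names, and the example.com addresses are assumptions for illustration; a real user agent would generate the branch, tag, and Call-ID values and hand the result to a SIP transport.

```go
package main

import (
	"fmt"
	"strings"
)

// PagerMessage holds the fields this sketch needs for one MESSAGE request.
type PagerMessage struct {
	From, To string // SIP URIs (addresses of record)
	ViaHost  string // host the response should come back to
	Branch   string // Via branch parameter (transaction ID)
	FromTag  string
	CallID   string
	CSeq     int
	Body     string // carried as text/plain
}

// Build renders the request with CRLF line endings. The body rides inside
// the request itself, so Content-Length is simply the body's byte length.
// No Contact header is emitted, since a MESSAGE does not set up a dialog.
func (m PagerMessage) Build() string {
	var b strings.Builder
	fmt.Fprintf(&b, "MESSAGE %s SIP/2.0\r\n", m.To)
	fmt.Fprintf(&b, "Via: SIP/2.0/TCP %s;branch=%s\r\n", m.ViaHost, m.Branch)
	b.WriteString("Max-Forwards: 70\r\n")
	fmt.Fprintf(&b, "From: %s;tag=%s\r\n", m.From, m.FromTag)
	fmt.Fprintf(&b, "To: %s\r\n", m.To)
	fmt.Fprintf(&b, "Call-ID: %s\r\n", m.CallID)
	fmt.Fprintf(&b, "CSeq: %d MESSAGE\r\n", m.CSeq)
	b.WriteString("Content-Type: text/plain\r\n")
	fmt.Fprintf(&b, "Content-Length: %d\r\n\r\n", len(m.Body))
	b.WriteString(m.Body)
	return b.String()
}

func main() {
	msg := PagerMessage{
		From:    "sip:alice@example.com",
		To:      "sip:bob@example.com",
		ViaHost: "pc1.example.com",
		Branch:  "z9hG4bK776sgdkse",
		FromTag: "49583",
		CallID:  "asd88asd77a@pc1.example.com",
		CSeq:    1,
		Body:    "Watson, come here.",
	}
	fmt.Print(msg.Build())
}
```

The whole exchange is this one request plus the recipient's 200 OK: no INVITE, no SDP, and no state left behind in the proxies once the transaction completes.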
Why pager mode and not sessions? Session-mode messaging (long-running chats with typing indicators, etc.) was deferred to MSRP (RFC 4975). The authors recognized that mobile and presence-driven IM patterns were inherently transactional — you send a line, it gets delivered, you're done. Forcing a SIP dialog for every "lol" would have been disastrous for battery life and proxy state.
The subtle bits worth knowing:
MESSAGE for high-volume chatter precisely because of this: congestion control is weaker than a real session would provide.
A MESSAGE request can be sent inside or outside a dialog. Outside-dialog is the common case: you address by AOR (sip:[email protected]) and the registrar/proxy fabric routes it like an INVITE.
Why it matters in 2026. If you've used RCS (Rich Communication Services) on Android, every text message you sent rode RFC 3428. The GSMA's RCS Universal Profile uses SIP MESSAGE for short messages and MSRP for chat sessions, with the IMS (IP Multimedia Subsystem) core providing the proxy fabric. Every VoLTE phone has a SIP stack that speaks this method. When Apple finally added RCS support to its Messages app in 2024, the underlying wire protocol was, quietly, RFC 3428 plus a pile of GSMA profiles on top.
The backstory. Dean Willis and the SIMPLE working group spent years fighting over whether IM should be a SIP extension or its own protocol (XMPP/Jabber being the alternative). The compromise was elegant: SIMPLE handled presence and messaging via SIP for the telecom world; XMPP won the consumer/enterprise side. Two decades later, both still coexist: XMPP in WhatsApp's roots and Zoom chat, SIP MESSAGE in every cellular network on Earth. David Gurle, one of the authors, came from Microsoft, which tells you how seriously the IM-walled-garden crowd took this work.
RFC 3428 adds a single SIP MESSAGE request: a roughly 20-page extension that quietly became the backbone of cellular messaging.
Stack Overflow Unanswered
2026-06-03
Stack Overflow: View Question
Tags: assembly, operating-system, kernel, bare-metal
Score: 0 | Views: 167
The asker has a hobby kernel that boots cleanly under QEMU but triple-faults on real hardware right around mov cr0, eax — the instruction that flips the CPU into protected mode. They're loading the kernel at 0x7E00 and chained through Ventoy from a custom ISO.
Why this is the canonical hobby-OS trap. QEMU is forgiving in ways real silicon is not. It tends to start with sane defaults: A20 already enabled, segment caches in a reasonable state, BIOS data areas untouched, and a GDT that "just works" even if your descriptors are slightly wrong. Real hardware, especially via a UEFI-CSM path that Ventoy uses, hands control off in a state that is technically legal but full of footguns.
Likely root causes, ranked by how often I've seen them bite people:
0x92), and verify by writing/reading across the 1 MiB boundary.
cr0 write. If your GDT lives in a sector you didn't actually load, or your descriptors have the wrong granularity/limit, the very next far jump triple-faults. QEMU sometimes papers over a stale code segment cache; real CPUs do not.
int 13h AH=02h has a per-call sector limit (typically 0x7F) and CHS quirks on real disks vs. QEMU's flat image. Use the LBA extension (AH=42h) and check CF and the status code in AH after the call.
0x7C00-ish. QEMU shrugs; real BIOS code may have left data there.
dl may not be what you expect.
Sketch of a debugging plan:
Before mov cr0, eax, write a recognizable byte to 0xB8000 (text VGA). If you don't see it on hardware, you never got there.
Between the cr0 write and the far jump, write a different byte. Now you know which side of the transition died.
lgdt then dump the GDT bytes to screen and compare against your assembled image; this confirms the loader actually copied them.
0x92 unconditionally before enabling protected mode.
Gotcha: "It works in QEMU" is not evidence of correctness; it's evidence that QEMU's defaults masked your bug. Always test under Bochs with -q and on at least one real machine before declaring a boot sequence done.
Daily Software Engineering
2026-06-03
Round-robin load balancing assumes every request costs the same. It doesn't. A search query might take 2ms; the next one might trigger a 4-second report generation. Round-robin will happily keep handing its usual share of new requests to the backend still chewing on that report, while a neighboring server sits idle. Least Connections fixes this by routing each new request to the backend currently handling the fewest in-flight connections.
The algorithm is dead simple: the load balancer tracks active connections per backend. When a request arrives, it picks the backend with the lowest count, increments it, and decrements when the response completes. No probes, no health scores, no prediction — just a counter.
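The counter logic described above can be sketched in a few lines. This is a toy, single-threaded illustration; the class name LeastConnections and the backend names are made up for the example, and a real balancer would need locking and health checks on top.

```python
class LeastConnections:
    """Toy least-connections picker: one in-flight counter per backend."""

    def __init__(self, backends):
        self.active = {name: 0 for name in backends}

    def acquire(self):
        # Pick the backend with the fewest in-flight requests.
        # On ties, min() returns the first one in insertion order.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when the response completes.
        self.active[backend] -= 1


lb = LeastConnections(["a", "b", "c"])
picks = [lb.acquire() for _ in range(3)]  # each backend now holds one request
lb.release("b")                           # "b" finishes first
next_pick = lb.acquire()                  # so "b" gets the next request
print(picks, next_pick)
```

Note that the scan over backends is O(n) per request here; with a handful of backends that hardly matters, but large pools typically keep a heap or sample a couple of candidates instead.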
Why it works: connection count is a lagging proxy for load. A backend stuck on slow requests accumulates connections; a fast backend drains them. The algorithm naturally steers traffic toward whoever's keeping up, without needing to know why they're keeping up.
Real-world example: Picture an image-processing API behind 4 workers. Most requests are thumbnail resizes (50ms), but 5% are full-page PDF rasterizations (8 seconds). With round-robin and 100 req/s, the PDF requests pile up on whichever worker drew the short straw — p99 latency for thumbnails spikes to 8+ seconds because they're queued behind PDFs. Switch to least connections: thumbnails route around the worker chewing on a PDF, and p99 drops back to ~80ms. The PDF worker still finishes in 8s, but it doesn't drag everyone else down.
The weighted variant: when backends have different capacities (e.g., mixed instance sizes), use weighted least connections: divide active connections by capacity weight, pick the lowest ratio. A backend with weight 4 handling 8 connections (ratio 2.0) gets the next request before a weight-1 backend handling 3 connections (ratio 3.0).
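The ratio rule is a one-liner once the counts and weights are in hand. A minimal sketch, using the same numbers as the example above; the function name pick_weighted and the backend labels are invented for illustration.

```python
def pick_weighted(active, weights):
    """Return the backend with the lowest active-connections-to-weight ratio."""
    return min(active, key=lambda name: active[name] / weights[name])


active = {"big": 8, "small": 3}    # in-flight connections per backend
weights = {"big": 4, "small": 1}   # relative capacity

choice = pick_weighted(active, weights)
print(choice)  # "big": ratio 8/4 = 2.0 beats 3/1 = 3.0
```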
Rule of thumb: if your request cost variance (p99/p50 latency ratio) exceeds 5x, round-robin will hurt you. Switch to least connections. Below 2x, the difference is negligible — stick with round-robin for its O(1) simplicity and zero shared state.
Where it breaks: least connections assumes connections correlate with work. For long-lived connections (WebSockets, gRPC streams), it doesn't — a backend with 1000 idle WebSocket connections looks "busy" but isn't. Use connection-count-aware routing only for request/response workloads, or combine with active-request tracking (count in-flight requests, not connections) for multiplexed protocols.
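HAProxy (leastconn), NGINX (least_conn), and Envoy (LEAST_REQUEST) all support it or a close variant natively, and AWS Application Load Balancers offer the similar least-outstanding-requests algorithm. It's usually a one-line config change.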
Tool Nobody Knows
2026-06-03
You kicked off cp giant.iso /mnt/backup twenty minutes ago in another tmux pane. It's still going. Is it 5% done or 95%? You can't Ctrl-C and restart through pv — you'd lose the work. watch ls -la shows the destination file growing but tells you nothing about the source size. The grizzled answer is progress (formerly cv, the "Coreutils Viewer"), a tiny C program by Xfennec that attaches to any already-running process and reads its progress straight out of /proc.
Install it: apt install progress, brew install progress, or dnf install progress. Then point it at the running world:
# Show every recognized coreutils-ish job in flight
$ progress
[ 8409] cp /home/shaun/iso/ubuntu.iso
4.2 GiB / 5.7 GiB [=================>------] 73.6% 187 MiB/s eta 0:00:08
[12290] dd if=/dev/sda of=/mnt/img/sda.img
117 GiB / 953 GiB [===>---------------------] 12.3% 84 MiB/s eta 2:45:11
That's the killer feature — you didn't plan ahead. The cp was launched without pv, without --progress, without anything. progress walks /proc/$PID/fd/, finds the largest regular-file fd, reads /proc/$PID/fdinfo/$FD for the current seek position, and divides by the file's size. Pure user-space, no ptrace, no kernel module, no slowdown on the target.
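The /proc trick is simple enough to reproduce by hand. Below is a rough Python sketch of the same idea, not the actual progress source: the function names parse_pos and largest_file_progress are invented here, it is Linux-only, and it skips everything the real tool does around throughput estimation and command filtering.

```python
import os
import stat


def parse_pos(fdinfo_text):
    """Extract the 'pos' field (current file offset) from fdinfo text."""
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "pos":
            return int(value)
    raise ValueError("no pos field found")


def largest_file_progress(pid):
    """Return (path, pos, size) for the largest regular file pid has open, or None."""
    best = None
    fd_dir = f"/proc/{pid}/fd"
    for fd in os.listdir(fd_dir):
        try:
            st = os.stat(f"{fd_dir}/{fd}")  # follows the symlink to the real file
            if not stat.S_ISREG(st.st_mode):
                continue
            with open(f"/proc/{pid}/fdinfo/{fd}") as f:
                pos = parse_pos(f.read())
            path = os.readlink(f"{fd_dir}/{fd}")
        except (OSError, ValueError):
            continue  # fd closed mid-scan, or no permission to read it
        if best is None or st.st_size > best[2]:
            best = (path, pos, st.st_size)
    return best


if os.path.isdir("/proc/self/fd"):  # only meaningful on Linux
    found = largest_file_progress(os.getpid())
    if found:
        path, pos, size = found
        pct = 100.0 * pos / size if size else 0.0
        print(f"{path}: {pos} / {size} bytes ({pct:.1f}%)")
```

Swap os.getpid() for the PID of your runaway cp and you have a one-shot progress readout, assuming you own the process or are root.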
Useful flags most people miss:
# Continuous monitor mode (like top): refresh until you Ctrl-C
$ progress -M
# Keep watching for dd processes and pick them up when they appear
# Great for "the script will launch a dd somewhere, show me when it does"
$ progress -M -c dd
# Sample I/O for 2 seconds to estimate throughput and ETA
$ progress -W 2 -c dd
# Monitor one PID until it exits (-m), then run another tool
$ progress -mp $(pgrep -f "tar.*backup") && notify-send "tar done"
# Quiet mode: hide warning/error messages, handy in scripts
$ progress -q -p 8409
# Track non-default commands (ffmpeg, openssl, etc.)
$ progress -a ffmpeg -a openssl
By default it watches a built-in list of ~25 commands (cp, mv, dd, tar, cat, gzip, gunzip, xz, zstd, rsync, shred, cpio, md5sum, sha*sum, …). Add yours with -a; skip processes that have a particular file open with -i.
Why this beats the alternatives:
progress doesn't care; it joins the party late.
dd for a progress bar is performing surgery to read a thermometer.
One subtlety worth knowing: progress reports on the largest open file by default, which is usually right but occasionally wrong (e.g. tar reading many sources). Use -c to narrow by command, -p to pin a PID, or -o to restrict to files opened for read or for write, or read /proc/$PID/io yourself for byte counters if you want true throughput including pipes.
It's a 1,500-line C program that does one thing the kernel has been telling you for free since 2.6 — and the only reason you needed it is that nobody wired the /proc/$PID/fdinfo pos field into the standard tools.
progress retrofits a progress bar onto any already-running cp/dd/tar/gzip by reading /proc/$PID/fdinfo — no replanning, no restart, no ptrace.
What If Engineering
2026-06-03
Silica aerogel is 99.8% air held together by a nanoscale silica skeleton. It's one of the lightest solids ever made (down to 3 kg/m³, roughly 3,800 times less dense than lead), the best thermal insulator known outside a vacuum, and, crucially, it's translucent. What if we stopped using it for spacesuits and Mars rovers and started laying it like masonry?
Pure silica aerogel is hilariously fragile: compressive strength around 16 kPa, less than wet cardboard. You could put your thumb through it. So we're not building with that. We're building with polymer-cross-linked aerogel (X-aerogel), which trades a bit of density (~150 kg/m³) for compressive strengths of 1–10 MPa. At best that's on par with fired clay brick (~7 MPa) and well short of concrete (~30 MPa), but workable.
A two-story aerogel wall, 6 m tall, density 150 kg/m³, puts a base pressure of:
ρgh = 150 × 9.81 × 6 ≈ 8.8 kPa = 0.0088 MPa
That's roughly 100 to 1,000× below failure, depending on where in the 1–10 MPa range the material falls. Structurally fine for the wall itself. The roof is another story: you'd need a discrete timber or steel frame, because aerogel can't span. Think of the bricks as ultra-light infill, not load-bearers.
Thermal conductivity of X-aerogel is k ≈ 0.015 W/m·K, vs 0.04 for fiberglass batt and 0.6 for concrete. For a 30 cm aerogel wall with 20°C ΔT:
Q/A = k·ΔT/L = 0.015 × 20 / 0.3 = 1.0 W/m²
A whole 200 m² envelope leaks 200 W, less than two laptops. A code-minimum modern house at the same ΔT leaks roughly 1,500 W. You could heat the place with body warmth and a toaster. R-value works out to about R-115 per foot (R-9.6 per inch), roughly one and a half times closed-cell spray foam at about R-6.5 per inch.
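For anyone who wants to poke at the numbers, here is the wall arithmetic as a short script. All inputs are the figures quoted above; the only outside value is the standard SI-to-US R-value conversion factor (1 m²·K/W ≈ 5.678 ft²·°F·h/BTU). Variable names are just for illustration.

```python
G = 9.81                  # m/s^2
DENSITY = 150.0           # kg/m^3, X-aerogel
WALL_HEIGHT = 6.0         # m, two stories
STRENGTH_PA = (1e6, 10e6) # quoted compressive strength range, 1-10 MPa

K = 0.015                 # W/(m*K), thermal conductivity
DELTA_T = 20.0            # K, inside-outside temperature difference
THICKNESS = 0.3           # m
ENVELOPE_AREA = 200.0     # m^2
SI_TO_US_R = 5.678        # ft^2*F*h/BTU per m^2*K/W

# Self-weight pressure at the base of the wall: rho * g * h
base_pressure = DENSITY * G * WALL_HEIGHT
margins = tuple(s / base_pressure for s in STRENGTH_PA)

# Steady-state conduction through a slab: Q/A = k * dT / L
flux = K * DELTA_T / THICKNESS
total_loss = flux * ENVELOPE_AREA

# R-value of the whole wall, and per inch of thickness, in US units
r_wall_us = (THICKNESS / K) * SI_TO_US_R
r_per_inch_us = (0.0254 / K) * SI_TO_US_R

print(f"base pressure: {base_pressure / 1000:.1f} kPa")
print(f"margin to crushing: {margins[0]:.0f}x to {margins[1]:.0f}x")
print(f"heat flux: {flux:.2f} W/m^2, whole envelope: {total_loss:.0f} W")
print(f"R-value: {r_wall_us:.0f} for the wall, {r_per_inch_us:.1f} per inch")
```

Change THICKNESS to 0.15 and the envelope still leaks only about 400 W, which says a lot about how much of that 30 cm is there for show.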
Silica aerogel transmits 85–90% of visible light through thin samples, but Rayleigh-scatters blue wavelengths — that's why it looks faintly blue and casts a warm orange shadow. A 30 cm brick drops transmission to maybe 40–60%, and everything beyond the wall becomes a blurry impressionist painting. No windows needed: the entire house glows. Daytime interiors at ~3,000 lux without lamps. At night, the walls leak interior light outward — the neighborhood looks like a colony of Chinese lanterns.
$80–150 per liter at small scale. A 200 m² × 0.3 m envelope is 60 m³ = 60,000 L = $5–9 million per house. Mass production via ambient-pressure drying could plausibly drop that 20×, but you're still looking at $250K–450K of walls.
Verdict: structurally feasible as infill, thermally transformative, aesthetically surreal, and roughly the price of a Lamborghini per wall until manufacturing scales. The Passive House crowd should be paying attention.
Wikipedia Rabbit Hole
2026-06-03
Wikipedia: Read the full article
Imagine an engine with no pistons, no turbine blades, no moving parts whatsoever — just a tube of seawater, a magnetic field, and electricity. That's a magnetohydrodynamic converter, and the Soviets, Americans, and Japanese all quietly built working versions during the Cold War.
The physics is elegantly simple. When a conductive fluid (plasma, liquid metal, or even salt water) flows through a magnetic field, the field exerts a force perpendicular to both the flow and itself — the same Lorentz force that makes electric motors spin. Run it one way and you generate electricity from a hot, fast-moving fluid. Run it the other way — inject current into the fluid inside a magnetic field — and the fluid gets pushed. No propeller required.
This is where it gets wonderfully weird. In 1992, Mitsubishi launched the Yamato 1, a 30-meter ship propelled entirely by MHD thrusters. It sucked in seawater, zapped it with a superconducting magnet, and squirted it out the back. It worked. It just wasn't very efficient — about 1.5% — because seawater is a mediocre conductor and you need enormous magnetic fields to make up for it.
You may recognize this concept from the 1990 film of Tom Clancy's The Hunt for Red October, where the Soviet submarine's silent "caterpillar drive" is an MHD thruster (the novel's version uses impellers in tunnels instead). That wasn't pure fiction: the U.S. Navy genuinely investigated MHD propulsion for submarines precisely because it produces no cavitation noise. Spinning propellers create collapsing bubbles that sonar can hear from miles away; a magnetic field pushing water makes essentially no sound.
But the converter's most interesting application runs in reverse. In an MHD generator, you blast superheated plasma (or seeded combustion gases at 2,500+ K) through a magnetic field and harvest electricity directly from the moving ions — skipping the entire turbine-and-generator stage of a normal power plant. The Soviet U-25 facility outside Moscow ran one at 25 MW for years, feeding actual power into the Moscow grid. Theoretical efficiencies push past 60%, beating any conventional steam cycle.
The catch is brutal:
Yet the dream won't die. NASA studied MHD generators for spacecraft. Fusion reactors like ITER will need MHD-adjacent physics to manage their plasmas. And every time someone proposes a hypersonic aircraft, MHD shows up again as a way to control the plasma sheath that forms around the vehicle at Mach 10+.
Daily YT Documentary
2026-06-03
Channel: Film Fest USA (68 subscribers)
Financing is the wall that stops most aspiring filmmakers before a single frame is shot. This guide from Film Fest USA walks through the practical mechanics of raising money for a debut independent feature — the part of filmmaking that rarely gets covered in cinematography tutorials or screenwriting books.
Expect a breakdown of the real funding paths a first-time director actually has available: crowdfunding campaigns (Kickstarter, Indiegogo, Seed&Spark), private equity investors and the SEC rules around soliciting them, grants from organizations like the Sundance Institute and state film commissions, tax incentive programs that vary wildly between shooting locations, and pre-sales to distributors based on attached talent. Each route has different paperwork, different expectations on return, and different risks for a filmmaker with no track record.
The channel is tiny (68 subscribers), so this is an early-stage creator sharing knowledge rather than a polished production — but the topic itself is unusually concrete and actionable compared to the vague "follow your dream" content that dominates the indie film space. If you've ever wondered how a $50k–$500k indie actually gets made before festival season, this is a useful primer.
Note: there are two near-identical uploads of this video on the channel from the same day — pick whichever loads.
Daily YT Electronics
2026-06-03
Channel: Nirman Labs (184 subscribers)
The pickings were slim today — most candidates were YouTube Shorts, hashtag-spam thumbnails, or vague "tutoring program" advertisements. This Smart Blind Stick project from Nirman Labs stands out as the one entry that promises a complete, purpose-driven build rather than a 30-second clip.
The project pairs an Arduino Uno with an HC-SR04 ultrasonic distance sensor and a buzzer (and often a vibration motor) to detect obstacles in front of a white cane and alert the user before they collide with them. It's a classic beginner assistive-technology build, but a genuinely useful one — you learn how to read pulse-echo timing from the ultrasonic sensor, convert microseconds to centimeters, and translate distance thresholds into tiered feedback (slow beep for far, rapid beep for close).
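The two conversions at the heart of the build are small enough to sketch. This is written in Python for readability rather than as Arduino code, and it is not taken from the video: the function names and the distance thresholds are made-up illustrations, while the 0.0343 cm/µs figure is simply the speed of sound in room-temperature air.

```python
SPEED_OF_SOUND_CM_PER_US = 0.0343  # about 343 m/s in air at roughly 20 °C


def echo_to_cm(echo_us):
    """Convert a round-trip echo time in microseconds to one-way distance in cm."""
    return echo_us * SPEED_OF_SOUND_CM_PER_US / 2  # halve it: out and back


def beep_interval_ms(distance_cm):
    """Map distance to a beep interval. Thresholds are hypothetical, tune to taste."""
    if distance_cm < 30:
        return 100    # rapid beep: obstacle is close
    if distance_cm < 100:
        return 400
    if distance_cm < 200:
        return 1000   # slow beep: obstacle is far
    return None       # nothing in range, stay silent


for echo_us in (1000, 5000, 20000):
    d = echo_to_cm(echo_us)
    print(f"{echo_us} us -> {d:.1f} cm -> beep every {beep_interval_ms(d)} ms")
```

On the real hardware the echo time would come from timing the sensor's echo pin, and the interval would drive the buzzer; the arithmetic in between is the same.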
What makes this a worthwhile watch for a beginner is that it ties together three fundamentals in one short project: sensor input (timing a digital echo pulse), conditional logic, and actuator output. Once those three pieces click, the same skeleton scales to parking sensors, robot obstacle avoidance, and proximity-triggered lighting.
Caveat: the channel is tiny and the description is thin, so depth of explanation is unknown. Treat it as a starting wiring reference rather than a deep electronics lesson.
Daily YT Engineering
2026-06-03
Channel: Enginuity (79 subscribers)
Most of today's candidates were clickbait-heavy aviation compilations or hashtag-spam shorts. This one stands out as a genuine engineering project: a solo builder designing a fully 3D-printable, modular RC aircraft and open-sourcing the whole thing.
This installment focuses on the motor mount assembly — a deceptively tricky part of any RC plane. The motor mount has to transfer thrust loads into the airframe, absorb vibration, survive prop strikes, and stay light. Designing one in printed plastic (rather than the usual aluminum or plywood) means thinking carefully about layer orientation, stress concentrations around fastener holes, and how the mount interfaces with the firewall and nacelle.
Because the channel is documenting an ongoing build, you get to see the iteration — the prototype, the rationale for design choices, and how the part fits into the larger modular airframe philosophy. That's the kind of context you almost never get from polished maker content, and it makes this a useful watch for anyone interested in CAD, FDM design-for-manufacture, or amateur aircraft engineering.
At 79 subscribers, this is exactly the kind of small, technical channel worth surfacing before it gets bigger.
Daily YT Maker
2026-06-03
Channel: Tom (680 subscribers)
Pocket hole joinery is one of those techniques that looks deceptively simple in product marketing but can be fussy in practice — drill depth, clamp pressure, and edge distance all matter, and the wrong combination gives you blown-out faces or screws that punch through finished surfaces. Tom's video is worth watching because it shows the Trend Twin jig in the context of an actual build (a pull-out bed for his camper), not a sterile demo on a scrap of pine.
The Trend Twin is interesting in its own right: it's a two-hole jig that lets you drill paired pockets in a single setup, which speeds up repetitive joinery like face-frame and box construction considerably compared to single-hole jigs like the Kreg R3. Seeing it used on a real camper-furniture build — where weight, racking strength, and concealed joints all matter — is more useful than a tool-review unboxing.
For anyone considering pocket-hole tooling or building lightweight cabinetry for a van/RV conversion, this is a practical look at how the jig actually performs in a constrained, real-world project. Tom is a genuinely small channel (680 subs), and the description points to a clear, focused walkthrough rather than hashtag-driven filler despite the tag-heavy title.
Daily YT Welding
2026-06-03
Channel: Buzz Box Fab (7790 subscribers)
A welding table is the single most important fixture in any fabrication shop — it's your reference plane, your ground, your clamping surface, and your heat sink all at once. This build from Buzz Box Fab tackles the familiar problem that commercial fixture tables run thousands of dollars, putting them out of reach for hobbyists and one-person shops. The video walks through constructing a professional-grade table without that price tag.
What makes this one stand out from the flood of welding table content is that the creator includes downloadable DXF files, a STEP file, and a PDF plan. That's a big deal: it means you can take the design straight to a plasma table or laser cutter, or just print the drawings and work from measured prints rather than guessing dimensions from paused video frames. For anyone learning fabrication, having a real engineering drawing to read alongside the build is genuinely educational.
Expect coverage of frame squaring, leg assembly, top flatness considerations, and likely some discussion of fixture hole patterns — the details that separate a wobbly garage table from something you can actually trust as a reference surface. At ~7.8k subscribers, Buzz Box Fab sits in the sweet spot of small-channel craftsmanship: enough experience to do it right, small enough to still show the honest steps rather than a polished montage.
