Real Results from Real Users

Case Studies

Deep-dive stories from developers who transformed their workflows, discovered using the same xf and cass skills you get with your subscription.

500+ Happy Users
50+ Skills Available
10k+ Hours Saved

Stories on this page

7 case studies
deadlock-finder-and-fixer · gdb-for-debugging · multi-pass-bug-hunting

Tokio's Famous Deadlock Pattern, Caught and Fixed in an Async Runtime

/deadlock-finder-and-fixer ran an exhaustive audit pass over asupersync (Jeffrey's Tokio replacement) and filed 8 skill-tagged beads: 5 findings of specific concurrency hazards (each with a shipped fix in git) and 3 clean audits over whole subsystems. The headline finding (`asupersync-0x7fdb`, watch::send_modify calling a user closure while holding the write lock) is the canonical Rust async deadlock pattern. It was found in the project's own code and fixed via a clone-modify-update refactor in commit `3a6ad1ea8`.

Read story · via @doodlestein
simplify-and-refactor-code-isomorphically · extreme-software-optimization · cass

36 Isomorphic Refactor Passes Across a Production Rust Codebase

Across one week, /simplify-and-refactor-code-isomorphically ran 36 numbered passes on asupersync (Jeffrey's Tokio replacement), with artifacts in `refactor/artifacts/` and every change gated on cargo test + cargo check + cargo clippy -- -D warnings. The biggest single landed pass deleted 381 net lines from the trace event subsystem with no observable behavior change. The same skill also ran against frankensqlite, franken-engine, franken-node, ntm, and jeffreysprompts across this machine plus the css and csd boxes.

Read story · via @doodlestein
documentation-website-for-software-project

A Full Docs Site for FrankenTUI in One Shot

Jeffrey ran /documentation-website-for-software-project against his FrankenTUI Rust project the day after the skill shipped. After about an hour of autonomous work, the result was docs.frankentui.com: a complete Nextra site with full-text search, mermaid diagrams, mobile layout, and a live Vercel deployment, all from a single invocation.

Read story · via @doodlestein
xf · cass

Building the Testimonials Wall by Searching Our Own X Archive

How we replaced 28 generic pre-launch testimonials with 40 paying-subscriber-specific quotes by dogfooding the xf skill against Jeffrey's local X data export. 17 paginated search calls, 7,371 unique liked tweets, an afternoon of human curation, all offline.

Read story · via Jeffrey Emanuel
agent-mail

Agent Mail on a Real Project

A developer dogfoods Agent Mail MCP for multi-agent coordination and reports an immediate 'wow' moment on a real project.

Read story · via @telecasterrok
cass · bv · ubs

The Productivity Stack

A cohesive workflow combining cass, bv, UBS, and Agent Mail to move fast without losing context or shipping regressions.

Read story · via @mjs527
bv · beads

Beads + bv: Getting Unstuck on What Next?

A user adopts beads + bv and immediately feels the difference: clearer priorities, less thrash, faster execution.

Read story · via @__preacherman__

Tokio's Famous Deadlock Pattern, Caught and Fixed in an Async Runtime

/deadlock-finder-and-fixer ran an exhaustive audit pass over asupersync (Jeffrey's Tokio replacement) and filed 8 skill-tagged beads: 5 findings of specific concurrency hazards (each with a shipped fix in git) and 3 clean audits over whole subsystems. The headline finding (`asupersync-0x7fdb`, watch::send_modify calling a user closure while holding the write lock) is the canonical Rust async deadlock pattern. It was found in the project's own code and fixed via a clone-modify-update refactor in commit `3a6ad1ea8`.

Source: @doodlestein · Date: Apr 16, 2026

Tokio's Famous Deadlock Pattern, Caught and Fixed in an Async Runtime

"I am increasingly delivering very big, sophisticated skills like the saas security audit skill, the deadlock finder skill, the tax skill, etc."

-- @doodlestein on X (2026-04-16)

1) The Hook

There is a canonical concurrency bug in the Rust async ecosystem. watch::send_modify calls a user-supplied closure while the channel's internal write-lock guard is held. If that closure ever touches another watch channel that someone else is also writing, you get a textbook AB-BA deadlock. The pattern is well-known and explicitly called out in Tokio's watch module documentation.
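A minimal sketch of the hazardous shape in Rust, assuming a simplified watch channel built on std::sync::RwLock (illustrative names, not asupersync's or Tokio's actual types):

use std::sync::RwLock;

// Illustrative stand-in for a watch channel's shared state.
struct Watch<T> {
    value: RwLock<(T, u64)>, // (current value, version counter)
}

impl<T> Watch<T> {
    // The hazard: the user closure runs while the write guard is
    // still held. If `f` touches another Watch that a second thread
    // is writing in the opposite order, the two threads deadlock
    // AB-BA style.
    fn send_modify_hazardous(&self, f: impl FnOnce(&mut T)) {
        let mut guard = self.value.write().unwrap();
        f(&mut guard.0); // user code under our lock: the bug
        guard.1 += 1;    // bump version; receiver notification elided
    }
}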

Asupersync (Jeffrey's Tokio replacement, the runtime under several FrankenSuite projects) shipped its own version of the same bug. The deadlock-finder-and-fixer skill caught it, filed bead asupersync-0x7fdb with the title [deadlock-audit] watch::send_modify runs caller closure inside value.write(), and a fix shipped in commit 3a6ad1ea8, refactoring send_modify to a clone-modify-update pattern.

That same skill ran an exhaustive audit pass over asupersync and produced eight tagged beads in total: five finding specific concurrency hazards (each with a corresponding fix in git) and three reporting clean audits over whole subsystems. The three clean audits matter as much as the five findings; they're the proof that someone walked the code and didn't just stop at the first scary-looking pattern.

2) The Challenge

Concurrency bugs are the worst category of software defect. They survive entire test suites green. They emerge under load in production, often months after the offending commit. They produce hung processes, flaky tests, lost wakeups, livelocks, silent message drops, and database "is locked" timeouts that nobody can reproduce on demand. The community has decades of literature about lock-ordering, await-holding-lock, lost-wakeup, reader-starvation, and the dozen other named pitfalls. None of that literature walks YOUR code looking for YOUR specific instances.

Manual audit gets you a few good hits and then runs out of attention budget. Static analyzers that flag every Mutex::lock() produce hundreds of false positives. Dynamic tools (TSAN, lockdep) catch what they observe, which is whatever your tests happen to exercise.

3) The Discovery

/deadlock-finder-and-fixer is a structured audit workflow rather than a generic "look for bugs" prompt. The SKILL.md opens with two explicit rules that govern every finding:

The Universal Rule. When you think you found the deadlock and fixed the three instances you could see, there is almost always a fourth. Keep searching until you can prove exhaustively, by code audit, that no hazard remains.

The False-Positive Rule. Every finding must survive: "Can I construct a concrete interleaving of real threads that reaches this state?" If you cannot, it is not a bug; it is a pattern match.

Around those rules sits a Symptom Triage Table mapping observed runtime behavior to bug class, plus separate sections for each class:

  • Process at 0% CPU, threads in futex_wait / __lll_lock_wait → classic AB-BA / self-deadlock
  • Async tasks pending, all tokio workers in epoll_wait → mutex held across .await, channel cycle
  • 100% CPU, futex spam, no progress → livelock, retry storm, broken condvar
  • "database is locked", SQLITE_BUSY, timeouts → SQLite WAL contention, long-transaction writer fight
  • Hang during library load, strlen / malloc hangs → LD_PRELOAD / runtime-init reentrancy

The runtime triage half of the workflow lives in the gdb-for-debugging sibling skill (gdb backtraces of all threads, lock-graph construction, async runtime analysis, TSAN/rr). This skill handles the parts that don't need a running process: taxonomy, static-audit discovery, fix catalog, prevention-by-design.

Importantly, the skill files its findings as beads with explicit [deadlock-audit] or [deadlock-finder] tags, so future code reviewers and AI agents can grep for them. Clean audits are filed too. A bead titled "src/sync/* + src/channel/* deep audit complete — no findings" is a positive assertion that someone audited those files and didn't find issues, which is a real piece of evidence about the code's safety.

4) The Transformation

The canonical worked example is asupersync-0x7fdb. The skill walked src/channel/watch.rs, found the closure-under-write-lock pattern, and filed the bead with that exact wording. The branch tag [br-asupersync-0x7fdb] was used by the agent that took the bead and shipped the fix. Commit 3a6ad1ea8 (touching src/channel/watch.rs):

[br-asupersync-0x7fdb] Fix watch::send_modify deadlock by avoiding closure under write lock

Refactored send_modify to use clone-modify-update pattern instead of calling
user closure while holding the write lock. This prevents deadlocks when user
closures try to access other watch channels.

Old implementation: f(&mut guard.0) called while holding write lock
New implementation: Clone value, call f() without locks, then atomically update

The doc comment on send_modify was updated in the same commit to spell out the new contract for users:

To avoid deadlocks, this method clones the current value, releases the lock, applies the closure to the clone, then reacquires the lock to update the value. This prevents user closures from running while holding the write lock.
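Continuing the illustrative Watch sketch from section 1, the fixed shape the commit describes looks roughly like this (still a sketch, not asupersync's real code; a production version must also serialize concurrent writers, which is elided here):

impl<T: Clone> Watch<T> {
    // Clone under a short read lock, run the user closure with no
    // locks held, then reacquire the write lock only to swap the
    // result in. User code can now touch other channels freely.
    fn send_modify(&self, f: impl FnOnce(&mut T)) {
        let mut cloned = self.value.read().unwrap().0.clone();
        f(&mut cloned); // user closure runs lock-free
        let mut guard = self.value.write().unwrap();
        guard.0 = cloned;
        guard.1 += 1; // bump version; receiver notification elided
    }
}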

That's the canonical shape: bead with a precise location and pattern name, branch tag tying the fix back to the bead, fix that adopts a known-good pattern (clone-modify-update) instead of just sprinkling more locks, doc comment that documents the new invariant for future readers.

5) The Results

Eight skill-attributed beads on asupersync, all closed:

Findings (the skill caught a real concurrency hazard):

  • asupersync-0x7fdb → [deadlock-audit] watch::send_modify runs caller closure inside value.write()
  • asupersync-df28bg → [deadlock-audit] CrashController emits evidence while holding controller state lock
  • asupersync-ryqcl6 → [deadlock-audit] SharedIoDriver invokes on_event while holding inner mutex
  • asupersync-xgujaf → [deadlock-audit] runtime/state.rs:2556-2558 read-then-write on cancel_waker is non-atomic TOCTOU
  • asupersync-iwqn3q → [deadlock-finder] src/lab/runtime.rs:1801-1808 lock-ordering hazard: scheduler.lock held across cx_inner.read()

Clean audits (the skill ran and proved no hazards in those modules):

  • asupersync-3sduke → [deadlock-finder] src/sync/* + src/channel/* deep audit complete — no findings
  • asupersync-drggpw → [deadlock-finder] Sweep async cancel paths for await-holding-lock across channel/, sync/, obligation/, combinator/
  • asupersync-zhrk5y → [deadlock-audit] sync/* + scheduler/three_lane.rs + runtime/state.rs — clean except 1 TOCTOU finding

The TOCTOU finding from asupersync-zhrk5y is the same one filed separately as asupersync-xgujaf, which is what the False-Positive Rule looks like in practice: the skill reports the broader area as clean and surfaces the single real hazard inside it as its own bead, instead of flooding the project with pattern-match noise.

Every one of the five findings has a corresponding fix in git log:

  • asupersync-0x7fdb → 3a6ad1ea8 (clone-modify-update; explicit [br-asupersync-0x7fdb] tag)
  • asupersync-df28bg → 15da98895 (CrashController drops state lock before evidence emission; bead closed with close_reason: "Already fixed in 15da98895")
  • asupersync-ryqcl6 → 99043ae8e fix(io_driver): prevent deadlock in on_event callbacks
  • asupersync-xgujaf → 12187f2a4 fix(runtime/state): atomic single-write clear of cancel_waker on task completion [br-asupersync-xgujaf]
  • asupersync-iwqn3q → dc69ed4e8 fix(lab/runtime): hoist cx_inner.read() out of scheduler.lock() scope to repair lock-ordering inversion [br-asupersync-iwqn3q]

None of these fixes are macro one-liners; each one rewrites the offending code path to a known-good concurrency idiom (wake-outside-lock, clone-then-modify, atomic compare-exchange instead of read-then-write, hoisting an inner lock out of the scope of an outer one). One bead (df28bg) was retroactively credited to a prior commit that already fixed the issue. That's what the bead system is supposed to do: record the audit finding even when the fix landed first.

In context, asupersync's .beads/issues.jsonl carries dozens of additional concurrency-class beads filed across the project's lifetime: lost-wakeup races in the parker, scheduler starvation under continuous write load, mutex wake-under-lock in barrier.rs, sync-primitives cancellation deadlocks, and so on. The eight skill-attributed beads are not the only concurrency work on the project, but they are the ones where the discovery itself can be attributed cleanly to a single audit pass with an explicit rule about what counts as a real finding.

6) The Meta Layer

The author's framing of this skill, in the X post that opened this case study: a "very big, sophisticated skill" alongside the SaaS security audit and the tax preparation skills. The size comes from how much codified concurrency lore the SKILL.md carries (taxonomy, anti-patterns, fix catalog, validation gates) and how many companion skills it composes with. The runtime side of debugging happens in gdb-for-debugging. After a major fix lands, multi-pass-bug-hunting runs as the deeper sweep. cass mines prior concurrency-bug sessions to surface precedent. All ship in the same subscription.

The most valuable secondary effect is the same one the isomorphic-refactor case study highlighted: artifacts. Every closed bead with a [deadlock-audit] or [deadlock-finder] tag is an explicit, searchable record of where someone walked the code and what they found. New contributors can grep the beads file and learn the project's concurrency hazards as a first-class part of onboarding, instead of having to discover them by hanging a CI run.

Source: https://x.com/doodlestein/status/2044648265438654745 · Skill page: https://jeffreys-skills.md/skills/deadlock-finder-and-fixer · Sibling skills: gdb-for-debugging, multi-pass-bug-hunting

36 Isomorphic Refactor Passes Across a Production Rust Codebase

Across one week, /simplify-and-refactor-code-isomorphically ran 36 numbered passes on asupersync (Jeffrey's Tokio replacement), with artifacts in `refactor/artifacts/` and every change gated on cargo test + cargo check + cargo clippy -- -D warnings. The biggest single landed pass deleted 381 net lines from the trace event subsystem with no observable behavior change. The same skill also ran against frankensqlite, franken-engine, franken-node, ntm, and jeffreysprompts across this machine plus the css and csd boxes.

Source: @doodlestein · Date: Apr 24, 2026

36 Isomorphic Refactor Passes Across a Production Rust Codebase

"It basically helps to 'de-slopify' and refactor code that's been written by agents, looking for ways to simplify and reduce the amount of code without changing the behavior."

-- @doodlestein on X (2026-04-24), announcing /simplify-and-refactor-code-isomorphically

1) The Hook

Asupersync, Jeffrey's Tokio replacement and the runtime that powers a growing list of FrankenSuite projects, has gone through 36 numbered isomorphic refactor passes in a single week (isomorphic-pass-001 through isomorphic-pass-036, all sitting in refactor/artifacts/ inside the repo). The biggest single landed pass deleted 381 net lines of code from one file (the trace event subsystem) without changing a single observable behavior. Every change rode through cargo test, cargo check, and cargo clippy -- -D warnings before merging.

That work was driven by /simplify-and-refactor-code-isomorphically, a new skill on the catalog. Asupersync on css is the most concentrated example, with the formal refactor/artifacts/ directory tracking every pass. The same skill has also produced behavior-preserving cleanup commits on frankensqlite locally; the git log there shows the same shape (named commits with explicit isomorphism guarantees) even without the artifacts directory.

2) The Challenge

Code generated by agents grows. Each iteration tends to add wrappers around existing wrappers, near-identical constructors that exist because the agent didn't see the one already there, and helper functions that duplicate logic the agent forgot was already imported. Multiply that across hundreds of commits in a single project and what comes out is what Jeffrey calls "AI slop": code that works, has tests, and is technically correct, but carries substantial extractable redundancy that no individual change felt big enough to address.

The usual responses to that are all bad:

  • Live with it. Tech debt compounds; readers spend more time skimming wrappers than reading logic.
  • Rewrite from scratch. Highest-risk option; you lose the test coverage that proved the old code worked.
  • Ad-hoc cleanup PRs. Easy to introduce regressions because there is no protocol for proving "this change is behavior-identical."

The skill takes that third option and adds the missing protocol on top of it: prove behavior identical, then remove lines. Done well, ad-hoc cleanup turns into a rigorous, reviewable, and reversible workflow.

3) The Discovery

/simplify-and-refactor-code-isomorphically ships a single hard rule:

Prove behavior identical, then remove lines. No proof, no delete.

Around that rule it adds an 8-phase loop with explicit artifacts at each step: an "isomorphism card" per change, a duplication map, an opportunity matrix that scores each candidate as (LOC_saved × Confidence) / Risk against a ≥ 2.0 threshold, a LOC ledger, and a rejection log. Skipping a phase is treated as not having done that phase; the artifacts are the handoff to reviewers, and a "this is cleaner" PR without them is indistinguishable from a drive-by rewrite.

The phases:

0. BOOTSTRAP   → check installed sibling skills (cass, ubs, multi-pass-bug-hunting)
1. BASELINE    → tests green, goldens captured, LOC snapshot, typecheck clean
2. MAP         → duplication scan (jscpd / similarity-ts / scc / rg / ast-grep)
3. MATRIX      → score each candidate; reject anything below the threshold
4. PROVE       → isomorphism card per change BEFORE editing
5. COLLAPSE    → one lever per commit, Edit only, no script-based codemods
6. VERIFY      → tests green, goldens bit-identical, typecheck clean, LOC delta recorded
7. LEDGER      → metrics dashboard, rejection log, per-candidate row
8. REPEAT      → re-scan (new duplicates surface once noise clears)
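The phase-3 MATRIX gate is plain arithmetic. A minimal sketch of the admission check in Rust, with field names invented for illustration (the skill's real artifacts are markdown ledgers, not code):

struct Candidate {
    name: &'static str,
    loc_saved: f64,  // net lines the collapse is expected to delete
    confidence: f64, // 0.0..=1.0: how sure we are the change is isomorphic
    risk: f64,       // >= 1.0: blast radius if the proof turns out wrong
}

impl Candidate {
    // (LOC_saved × Confidence) / Risk must clear 2.0 to proceed;
    // everything below the bar goes to the rejection log instead.
    fn admitted(&self) -> bool {
        self.loc_saved * self.confidence / self.risk >= 2.0
    }
}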

The "no script-based codemods" rule is the unusual one. Most refactor tooling lives or dies on regex/AST search-and-replace, which is exactly how the skill forbids you to work. Edits are made manually (or via parallel subagents), one lever per commit, with the isomorphism proof living next to the diff.

4) The Transformation

Concrete example from asupersync, commit 582b65fdf (2026-04-27):

refactor(trace): macro-generate event constructors

  2 files changed, 243 insertions(+), 624 deletions(-)

The body of that commit is what makes it provable rather than vibes:

Isomorphism: public TraceEvent constructor names, visibility, argument order, attributes, and payload variants are preserved; generated wrappers call the same Self::new or worker_lifecycle paths as the removed hand-written wrappers.

Validation: rustfmt --edition 2024 --check src/trace/event.rs; git diff --check -- src/trace/event.rs refactor/artifacts/2026-04-27-isomorphic-pass-031/ledger.md; rch exec -- cargo test ... worker_lifecycle_constructors_preserve_payload_shape; rch exec -- cargo check ...; rch exec -- cargo clippy ... -- -D warnings.

That's the canonical shape: identify the behavioral surface (constructor names + visibility + argument order + attributes + payload variants), explicitly state what's being preserved, link to the ledger artifact, and list the exact gates the diff cleared. Every one of the 36 passes has the same skeleton, with run IDs incrementing sequentially (isomorphic-pass-001, -002, …, -036).

The supporting commits on asupersync show the same shape applied to other subsystems:

  • 582b65fdf · trace event constructors · −381
  • 7cb69aa50 · cancel reason constructors · net flat (consolidation)
  • fc55322c3 · grpc status constructors · −6
  • fcdf29d34 · length-delimited codec setters · +27 (macro setup, paid back later)
  • d4be3ae08 · http2 settings setters · +18
  • 73d636d8c · obligation resolution transition · +43 (centralized; removed callers in follow-ups)
  • 0062401d0 · runtime epoch GC defer helper · −4
  • 7364f789a · distributed replica assignment · +25 (centralization; net negative across the family of 4 commits)
  • ae3fd0212 · mpsc functional multiset · −2
  • 24003f407 · mpsc producer handle creation · +146 (broke open and reorganized; eliminated 4 sites of duplication)

Two patterns to notice: the small standalone wins (the [simplify] mpsc commits) and the macro/centralization wins where a single commit adds machinery that subsequent commits then exploit to remove duplication elsewhere. The skill's matrix scoring (LOC_saved × Confidence / Risk ≥ 2.0) is what decides which family of moves is worth it.

The frankensqlite repo (a from-scratch SQLite-compatible MVCC engine) shows the same pattern at smaller scale:

  • 527c71ca · morsel-parallel insert schedule indexing math + condense test assertions · −24
  • 2ac8f707 · compress 4 booleans into typed ScenarioFlags bitfield + tighten Eq · +30 (typed scaffolding)
  • f71ceb0e · WAL checksum transform recurrence · +141 (removed ad-hoc sites in follow-up commits)
  • bfc432e5 · clippy-surfaced readability touch-ups across 3 crates · −6

5) The Results

Verified usages from the past week:

  • asupersync (on css): 36 numbered passes with full artifacts in refactor/artifacts/ (2026-04-25-isomorphic-pass-001 through the latest run dated 2026-04-27). The trace pass alone (582b65fdf) saved 381 lines.
  • frankensqlite (local): named simplification commits in git with explicit isomorphism notes (527c71ca, 2ac8f707, f71ceb0e, bfc432e5; see table above) across fsqlite-core, fsqlite-mvcc, fsqlite-pager, the perf harness, and the e2e suite. No formal artifacts directory yet, but the commit-level discipline is present.

The qualitative results match the X announcement post: agent-generated code shrinks substantially without behavior changing. Code review attention shifts from "is this an actual change?" (impossible to tell from a +243 -624 diff without proof) to "does the isomorphism card hold?" (a much more answerable question, because the card lists specific behavioral invariants the diff preserves).

The most important secondary effect: each pass leaves behind artifacts that future refactors can reference. By pass 36, the asupersync repo has a written record of which moves worked, which got rejected (and why), and which deltas in LOC came from which subsystems. That ledger is itself the documentation a new contributor needs to understand what kind of changes are welcomed in the codebase.

6) The Meta Layer

The skill composes well with several siblings on the catalog: cass (mine prior refactor precedent), ubs (surface bug-smells while mapping), multi-pass-bug-hunting (depth bug-hunt after a major pass), and extreme-software-optimization (the mirror-image skill that trades structure for speed). All ship as part of the same subscription. The bootstrap phase detects which siblings are installed and offers jsm install for the missing ones; nothing is a hard prerequisite.

The skill itself was authored by Jeffrey on his private skills repo, then dogfooded across several of his most active projects in the same week. The fact that the artifacts directory exists with 36 sequential run IDs is the strongest evidence that the workflow holds up under sustained use rather than only in demo runs.

The original X announcement frames it precisely: a tool to "de-slopify and refactor code that's been written by agents, looking for ways to simplify and reduce the amount of code without changing the behavior." After 36 documented passes on a production runtime, the claim is no longer hypothetical.

Source: https://x.com/doodlestein/status/2047808489838329993 · Skill page: https://jeffreys-skills.md/skills/simplify-and-refactor-code-isomorphically · Sibling skill: extreme-software-optimization

A Full Docs Site for FrankenTUI in One Shot

Jeffrey ran /documentation-website-for-software-project against his FrankenTUI Rust project the day after the skill shipped. After about an hour of autonomous work, the result was docs.frankentui.com: a complete Nextra site with full-text search, mermaid diagrams, mobile layout, and a live Vercel deployment, all from a single invocation.

Source: @doodlestein · Date: Apr 23, 2026

A Full Docs Site for FrankenTUI in One Shot

"It's the best documentation site I've ever made for one of my projects, with everything explained perfectly, full-text search, mobile-friendly, perfect formatting, diagrams. Including documentation sites that I really labored over manually."

-- @doodlestein on X (2026-04-23), about docs.frankentui.com

1) The Hook

A new skill called /documentation-website-for-software-project shipped on 2026-04-21. The day after, Jeffrey ran it for the first time against FrankenTUI, one of his larger Rust projects. It cranked autonomously for over an hour, then produced a complete, polished, deployed documentation site at docs.frankentui.com. One invocation. No follow-up rounds, no human edits. The result is a Nextra-powered docs site with full-text search, mobile responsiveness, mermaid diagrams, conceptual articles alongside reference pages, and a working production deployment on Vercel.

The author's own assessment: it's better than docs sites he had labored over manually for past projects.

2) The Challenge

Writing project documentation is the most-postponed task in software. You either (a) have one engineer who knows the project well enough to write the docs but is too busy shipping features, or (b) hand it to someone who has the time but doesn't know the codebase. Neither path produces good docs. The result is a README that hasn't been updated since launch, a docs/ folder of half-finished markdown, and an issue tracker full of "documentation" labels.

Worse, "good docs" isn't just a wall of text. A real docs site needs:

  • A logical information architecture (Getting Started → Concepts → Reference → How-to)
  • Code examples that match the current API surface
  • Diagrams for non-trivial concepts
  • Search that actually works
  • A mobile layout that doesn't fall apart on a phone
  • A deploy pipeline so each new commit republishes the site

Most projects ship with maybe two of those.

3) The Discovery

/documentation-website-for-software-project is a skill that produces all of the above as a single autonomous run, starting from a software project on disk and ending with a deployed Vercel URL. It works in three phases:

Phase 1: Research. It uses the same approach as the /codebase-archaeology and /codebase-report skills to map the project: read the README, walk the source tree, identify entry points, summarize each subsystem, extract public APIs, find the test suite to see how things are actually used, and note the project's architectural conventions.

Phase 2: Authoring. From that mental model, it writes documentation pages from scratch in MDX. Not a single template-filled stub per directory; actual prose explaining what each subsystem does, why it exists, how the pieces fit, and how to use them. Conceptual articles ("philosophy", "how the runtime works") sit alongside reference pages. Code examples are pulled from real source paths, not invented.

Phase 3: Site generation + deploy. It clones the Nextra Docs Starter Kit into a working directory, slots the authored MDX in, configures the sidebar, theme, search, and metadata, and (optionally) runs vercel deploy --prod to push the result live. The output is a Next.js + Nextra site with all of Nextra's defaults wired up: client-side full-text search, dark mode, mobile nav, syntax-highlighted code blocks, MDX components for callouts and diagrams.

The skill itself was created using /sc, /sw, and /operationalizing-expertise (the meta-skills for authoring skills) on 2026-04-21 at 21:30 UTC, and tested for the first time the next day.

4) The Transformation

For the FrankenTUI test run:

  1. From inside /data/projects/frankentui, the skill was invoked.
  2. It read every crate in the workspace (ftui-backend, ftui-core, ftui-runtime, ftui-render, ftui-tty, ftui-web, ftui-widgets, ftui-demo-showcase, plus the doctor_frankentui tool), studied the AGENTS.md and README.md, and built an index of public types, functions, and conventions.
  3. It generated a multi-section docs tree covering getting-started flows, conceptual explainers (why FrankenTUI is structured the way it is), feature-by-feature reference, mermaid diagrams of the runtime architecture, and how-to guides for common tasks.
  4. It cloned the Nextra starter, dropped the MDX in, configured the theme + sidebar, and ran vercel deploy.
  5. About an hour after invocation, the alias frankentui-docs.vercel.app resolved to a production deployment. Shortly after that, docs.frankentui.com was attached as the custom domain.

The Vercel deployment record (project: frankentui-docs, deployment alias frankentui-docs-5mj16vdpf) shows the custom domain getting attached on 2026-04-23 around 02:07 UTC, which lines up with the X post that went out about 12 hours later announcing the result.

5) The Results

The live site at https://docs.frankentui.com is the proof. From the X post, by Jeffrey's own count:

  • Best docs site he's made for any of his projects, including ones he previously hand-wrote
  • Full-text search works
  • Mobile-friendly
  • Perfect formatting and diagrams
  • Logical groupings of features and functionality
  • Conceptual articles about philosophy and design, not just reference

Quote from the announcement post: "I was optimistic that it would work well because I put a ton of thought and many different phases into the workflow, but I'm still totally amazed by the results."

The honest qualifier in that same post is also worth surfacing: the skill can be re-run multiple times to polish further, and the FrankenTUI version is what came out of a single shot. The bar for "single shot" was set deliberately to demonstrate the floor of what the skill can do, not the ceiling.

6) The Meta Layer

This is one of the highest-leverage skills in the catalog. Most software projects never get the docs they deserve, because the activation energy of writing them well is too high. A skill that gets you a 90th-percentile Nextra site with one invocation changes that calculus entirely.

A few practical notes for subscribers planning to run it on their own projects:

  • It works on any language. The research phase reads code structurally rather than relying on Rust- or TypeScript-specific tooling.
  • The Vercel deploy step is optional. If you skip it, you get a clean Next.js project ready to commit to your own repo and deploy wherever.
  • The skill page itself documents the full workflow and the underlying Nextra layout primitives the run uses (editLink, feedback, lastUpdated, toc.float, etc.), so a follow-up polish pass over the generated MDX is straightforward when you want to tune the result.

The X announcement closes on a sharper note: "Anyone still saying that skills aren't useful is going to be left in the dust by people who understand how to leverage these effectively." That's the case for the whole subscription, but /documentation-website-for-software-project is a particularly clean demonstration of why.

Source: https://x.com/doodlestein/status/2047325477967106067 · Live result: https://docs.frankentui.com/getting-started/quick-run · Skill page: https://jeffreys-skills.md/skills/documentation-website-for-software-project

Building the Testimonials Wall by Searching Our Own X Archive

How we replaced 28 generic pre-launch testimonials with 40 paying-subscriber-specific quotes by dogfooding the xf skill against Jeffrey's local X data export. 17 paginated search calls, 7,371 unique liked tweets, an afternoon of human curation, all offline.

Source: Jeffrey Emanuel · Date: Apr 27, 2026

How We Built the Testimonials Wall by Searching Our Own X Archive

"Actual (unsponsored) tweets from people paying $20/month for Jeffrey's Skills. Discovered by dogfooding xf on Jeffrey's X archive."

-- the homepage, after this case study landed

1) The Hook

A site selling premium skills should be able to prove itself with quotes from real, paying users, not generic "love your work" praise that could be about any of the author's open-source projects. The wall of 40 testimonials on the jeffreys-skills.md homepage is the proof. Every single one was extracted from Jeffrey's own X archive in an afternoon by running the same xf skill that subscribers receive.

The product collected its own social proof.

2) The Challenge

Testimonials are easy to fake. The original landing-page wall had 28 entries collected before the skills site even launched, so almost all of them were about adjacent tools (beads_viewer, Agent Mail, cass, the prompts site). Useful for general credibility, but a poor fit for a page selling the paid skills subscription specifically. After roughly two months of paying-subscriber feedback rolling in on X, the right call was to throw out the entire pre-launch wall and rebuild it from quotes that actually mention the skills site or its CLI (jsm).

That meant searching across:

  • Tweets and replies authored by anyone responding to Jeffrey
  • Likes (where most third-party endorsements end up)
  • DMs that opted into being public-quoted
  • Grok chats (skipped, since they're never the right shape for a testimonial)

…across an export spanning 2009 through April 2026, totalling 16,953 tweets, 54,725 likes, 7,137 DMs, and 4,068 Grok messages. North of 80,000 documents, all stored locally.

3) The Discovery

xf is one of the included skills. It runs entirely on your own machine, indexes your X data export with a hybrid BM25 + vector search engine (Tantivy under the hood), and answers queries in under a millisecond. The whole index is offline, so there are no network calls and no per-request quotas to worry about. After downloading a fresh archive zip from x.com, one command does the whole rebuild:

xf import --force /path/to/twitter-2026-04-26-<hash>.zip

That single invocation wipes the previous extraction in ~/my_x_history, drops the SQLite metadata DB and the Tantivy index, re-extracts the new zip (3.7 GB into 8,706 files), and rebuilds the whole index in about 17 seconds for 82,650 documents.

Once the index was warm, we ran broad, paginated sweeps over the four query terms most likely to surface skills-site feedback. The default xf search --limit is 500 hits per call, so we used --offset to walk past it:

for off in 0 500 1000 1500 2000; do
  xf search "skill"  --types like --format json --limit 500 --offset "$off" > sk-$off.json
  xf search "skills" --types like --format json --limit 500 --offset "$off" > sks-$off.json
done

for off in 0 500 1000; do
  xf search "jsm" --types like --format json --limit 500 --offset "$off" > jsm-$off.json
done

for off in 0 500 1000 1500; do
  xf search "doodlestein" --types like --format json --limit 500 --offset "$off" > dood-$off.json
done

Seventeen calls total. After dedup across overlapping result sets: 7,371 unique liked tweets to filter through.
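The dedup across those overlapping result sets is mechanical. A minimal sketch in Rust, assuming each sweep file is a JSON array of hits and each hit carries an id field (the actual xf output schema may differ):

use std::collections::HashSet;
use std::fs;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut seen: HashSet<String> = HashSet::new();
    // One file per paginated sweep: sk-0.json, sks-500.json, jsm-0.json, ...
    for path in std::env::args().skip(1) {
        let hits: Vec<serde_json::Value> =
            serde_json::from_str(&fs::read_to_string(&path)?)?;
        for hit in hits {
            if let Some(id) = hit.get("id").and_then(|v| v.as_str()) {
                seen.insert(id.to_owned()); // same tweet may appear in many sweeps
            }
        }
    }
    println!("{} unique liked tweets", seen.len());
    Ok(())
}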

4) The Transformation

Filtering 7,371 likes down to 40 testimonials was the slow part, and the part AI couldn't fully automate. The candidate pool included a lot of noise:

  • Bug reports (positive sentiment, but not testimonials)
  • "skill issue" jokes (zero signal)
  • Praise for skills that aren't part of Jeffrey's catalog (the popular frontend-design skill from another vendor, etc.)
  • Replies from users who hadn't actually subscribed yet ("I swear I'm going to sign up soon!")
  • Quotes with AI-slop fingerprints (em-dash rhythms, three-clause rhythmic listings, pseudo-profound openers)
  • Anything that read as performative rather than genuine

Each surviving quote had to:

  1. Mention the skills site or the JSM CLI specifically (not generic "love your work")
  2. Come from someone who had actually subscribed (or was clearly speaking from inside the product)
  3. Have a punchy, quotable phrase that earns its slot on a wall of social proof

Once the 40 candidates were chosen, the remaining work was deterministic. For every entry we pulled the full canonical metadata via https://api.fxtwitter.com/status/{id} (a sketch of the lookup follows the list):

  • The author's screen_name (the @-handle)
  • The author's name (the real display name; "Gracie Terzian" instead of @gracieterzian)
  • The canonical https://x.com/{handle}/status/{id} URL
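A sketch of that lookup in Rust; the struct fields mirror only what the text says was pulled, and the real fxtwitter response may nest or name things differently:

use serde::Deserialize;

// Assumed response shape: only the fields named above.
#[derive(Deserialize)]
struct FxResponse { tweet: FxTweet }

#[derive(Deserialize)]
struct FxTweet { url: String, author: FxAuthor }

#[derive(Deserialize)]
struct FxAuthor { name: String, screen_name: String }

fn fetch_metadata(id: &str) -> Result<FxTweet, Box<dyn std::error::Error>> {
    let url = format!("https://api.fxtwitter.com/status/{id}");
    let resp: FxResponse = reqwest::blocking::get(&url)?.json()?;
    Ok(resp.tweet)
}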

X archive likes don't carry timestamps, only the original tweet ID. The post date for each like was recovered by decoding the snowflake epoch:

// X snowflake IDs embed a millisecond timestamp in their top bits,
// offset from the custom Twitter epoch (2010-11-04T01:42:54.657Z).
const TWITTER_EPOCH_MS = 1288834974657;
// Shift off the 22 low bits (worker + sequence), then add the epoch back.
const dateMs = Number(BigInt(tweetId) >> 22n) + TWITTER_EPOCH_MS;
const sourceDate = new Date(dateMs);

That gave every card on the wall a real publish date (e.g., Mar 22, 2026; Apr 18, 2026), formatted via a UTC-pinned Intl.DateTimeFormat so the displayed date matches the snowflake-derived ISO timestamp and is identical on server and client. No hydration mismatch for visitors outside UTC.

5) The Results

Forty real testimonials. Every one links back to a public X post that anyone can verify in two clicks. The breakdown:

  • 7 hero quotes, including:
    • "@doodlestein is the GOAT. Subscribe to his skills and support this legend. It's worth $100k of energy and time minimum if not $1M, only $20/month."
    • "J skills are unironically one of the best $20 you can spend."
    • "Jeffreys-skills.md is my new home page on every browser."
    • "I gave it my relatively complex returns and it saved $1500."
  • 11 supporting quotes, including:
    • "I ran it this morning and it found $70k in K-1 disallowed losses that I mistakenly didn't claim in 2024 w/ TurboTax (~$20k refund if I amend). Love your skills site man."
    • "DCG has saved my work and stopped Claude from nuking my VPS so many times. That alone is worth the subscription."
    • "Shoutout to @doodlestein for the insanely fast fix. I hit an install issue getting Jeffrey's Skills running on my NVIDIA DGX Spark because the arm64 Linux binary was missing. He added the aarch64-unknown-linux-gnu target to the build pipeline, cross-compiled it…"
  • 22 social-proof entries, the steady drumbeat of "Subscribed instantly", "Was I the 250th subscriber?", "Just subscribed", "I've recommended the skills to several people", "Big fan", and so on.

Source dates span January 26, 2026 (Taylor Bell's DCG-saved-my-VPS testimonial) through April 25, 2026 (the most recent one captured before the new archive was indexed).

For the cards, optional highlights[] substrings are wrapped in <strong> so the eye lands on the actually-quotable line inside each quote. The matcher walks every highlight phrase in the quote, with longer phrases winning any overlap, and the same phrase is correctly tracked across multiple occurrences via a range-based algorithm that never drops the second match.
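A minimal sketch of that range-based matcher in Rust (the site's real implementation is not shown in the source; this is an illustrative version, assuming highlights are plain, non-empty substrings):

// Wrap every highlight occurrence in <strong>, with longer phrases
// winning any overlap and repeat occurrences all preserved.
fn wrap_highlights(quote: &str, highlights: &[&str]) -> String {
    let mut phrases: Vec<&str> = highlights.to_vec();
    phrases.sort_by_key(|p| std::cmp::Reverse(p.len())); // longer phrases first
    let mut ranges: Vec<(usize, usize)> = Vec::new();
    for phrase in phrases {
        let mut from = 0;
        while let Some(pos) = quote[from..].find(phrase) {
            let (start, end) = (from + pos, from + pos + phrase.len());
            // Claim the range only if no longer phrase already owns it.
            if !ranges.iter().any(|&(s, e)| start < e && s < end) {
                ranges.push((start, end));
            }
            from = end; // keep scanning: later occurrences are kept too
        }
    }
    ranges.sort();
    // Rebuild the quote, emitting <strong> around each claimed range.
    let (mut out, mut cursor) = (String::new(), 0);
    for (start, end) in ranges {
        out.push_str(&quote[cursor..start]);
        out.push_str("<strong>");
        out.push_str(&quote[start..end]);
        out.push_str("</strong>");
        cursor = end;
    }
    out.push_str(&quote[cursor..]);
    out
}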

6) The Meta Layer

This is what dogfooding looks like end-to-end: a paying-subscriber testimonial wall on a site selling skills, where each testimonial was found by running one of those skills against the founder's own data. The same xf install that surfaced these quotes ships to every subscriber on day one.

The companion skill, cass, does local search across saved coding-agent sessions (Claude Code, Codex, Gemini, etc.). We used it to recall what queries had worked in earlier extraction passes and to sanity-check the filtering choices. Both ship as part of the standard subscription and both run fully offline.

The underlying procedure is repeatable: download a new zip from x.com/settings/download_your_data, run xf import --force <zip>, replay the four paginated sweeps, and the curation effort begins again. The xf re-index takes seconds. The filtering takes the rest of the afternoon.

Source: every testimonial on https://jeffreys-skills.md/ links to its underlying X post. Tracker: rebuilt 2026-04-27.

Agent Mail on a Real Project

A developer dogfoods Agent Mail MCP for multi-agent coordination and reports an immediate 'wow' moment on a real project.

Source: @telecasterrok · Date: Nov 8, 2025

Agent Mail on a Real Project: An Immediate "Wow" Moment

"Thanks for sharing all of this, Jeffrey. I used your Agent Mail MCP on a project yesterday and 🤯"

-- @telecasterrok on X (2025-11-08)

1) The Hook

Multi-agent work collapses into chaos faster than people expect. You get duplicated effort and clobbered files, and every few minutes someone has to ask "wait, who changed this?" One developer tried Agent Mail on a real project and the day-after reaction was that single 🤯 emoji. The kind of reaction you only post when something actually worked the first time.

2) The Challenge

When you are working with multiple agents (or even just multiple humans), coordination is the whole game:

  • Who is working on what?
  • Who owns which files right now?
  • How do you avoid two parallel edits fighting each other?
  • How do you keep decisions and context from getting lost between sessions?

Without explicit coordination primitives, the workflow becomes brittle. You can make progress, but the system is always one unlucky merge away from losing time and trust.

3) The Discovery

Agent Mail is a simple idea done aggressively well: give every agent a durable identity, a shared mailbox, and advisory file reservations so agents do not stomp each other while moving fast.

The key shift is making coordination a first-class artifact (messages, threads, reservations), instead of leaving it as tribal knowledge that lives only inside whichever human happens to remember the last conversation.

4) The Transformation

We do not know the author's exact project details, but the minimal workflow that makes Agent Mail click looks like this:

  1. Register an identity for the repo/project.
  2. Reserve the files you intend to edit (exclusive intent, with a TTL).
  3. Post a short thread message when you start and when you land changes.
  4. Release reservations when done.

In practice, this single loop removes most of the "agent collision" failure modes:

  • Fewer merge conflicts (or at least fewer surprising ones)
  • Less duplicated work
  • Clearer ownership while the work is in motion
  • Better handoffs when switching context

5) The Results

The only claim we can make from the source is the reaction itself: the experience was strong enough to warrant a public 🤯 the day after using it.

In day-to-day engineering terms, that usually maps to one (or more) of the following:

  • A coordination problem that used to feel "inevitable" suddenly becomes solvable.
  • The team stops paying the tax of rework from parallel edits.
  • The mental load drops, so more time goes into building and less into negotiating.

6) The Meta Layer

We discovered this quote by running our own discovery workflow: searching an X archive with the xf skill and curating sources into a testimonials dataset. The point isn't marketing cleverness. It's that the tools are useful enough to power their own evidence gathering.

Source: https://x.com/telecasterrok/status/1987266230680510707

The Productivity Stack

A cohesive workflow combining cass, bv, UBS, and Agent Mail to move fast without losing context or shipping regressions.

Source: @mjs527 · Date: Dec 8, 2025

The Productivity Stack: cass + bv + UBS + Agent Mail

"Jeff is one of the most high quality follows around right now -- cass, beads viewer, ultimate bug scanner, agent mail..."

-- @mjs527 on X (2025-12-08)

1) The Hook

Most "AI tooling" writeups read as either hype or one-off novelty. The post above describes something different: a repeatable toolchain (cass, bv, Ultimate Bug Scanner, and Agent Mail) that the author keeps reaching for every day as productivity boosters.

2) The Challenge

Agentic coding creates a new class of problems:

  • You can generate changes quickly, but it is easy to lose the thread of why changes were made.
  • Bugs slip in when you move fast across many files.
  • Backlogs become graphs (dependencies, blockers), not checklists.
  • Multi-agent work amplifies coordination overhead.

If you do not deliberately build a workflow around those failure modes, "speed" turns into churn.

3) The Discovery

Each tool in the stack targets a specific failure mode, and together they form a coherent loop:

  • cass: search across prior sessions and decisions so you do not re-solve problems
  • bv: graph-aware triage so you pick the highest-leverage work next
  • UBS: production-minded scanning so regressions get caught before they ship
  • Agent Mail: identities, messaging, and file reservations that prevent agents from clobbering each other

4) The Transformation

We do not claim these were the author's exact steps, but the "stack" pattern usually looks like:

  1. Find the right next task (bv) instead of guessing.
  2. Recover context (cass) instead of re-reading the whole codebase.
  3. Make a bounded change (small diff, clear ownership).
  4. Scan for mistakes (UBS on changed files).
  5. Coordinate safely (Agent Mail reservations + threaded updates) when multiple agents are involved.

The point isn't to slow down. It's to move fast without rolling the dice on every change.

5) The Results

The strongest part of the original post is the framing: these tools are "productivity boosters" in a literal, recurring sense. They aren't interesting one weekend and then forgotten; they pay back time every working day.

In practice, stacks like this typically deliver:

  • Less time lost to regressions and "oops" fixes
  • Less time wasted figuring out "what should I do next?"
  • Fewer dead ends from missing context
  • Better multi-agent throughput with fewer collisions

6) The Meta Layer

We found this quote by dogfooding our discovery workflow (X archive search via xf). If the workflow is good, it should make it easier to build and validate the product itself.

Source: https://x.com/mjs527/status/1997860328894525463

Beads + bv: Getting Unstuck on What Next?

A user adopts beads + bv and immediately feels the difference: clearer priorities, less thrash, faster execution.

Source: @__preacherman__

Beads + bv: Getting Unstuck on "What Next?"

"Been following your workflow recently. Today I started with beads and bv. Ho lee sheet"

-- @__preacherman__ on X (date unknown)

1) The Hook

"What should I do next?" sounds simple. In real engineering work (especially with AI agents generating options at high speed) it is one of the most expensive questions you can ask. This user tried beads + bv and immediately felt the difference.

2) The Challenge

AI-assisted development changes the bottleneck:

  • You can create tasks faster than you can sequence them.
  • Dependencies are easy to miss.
  • Work-in-progress multiplies (ideas, partial diffs, unfinished threads).
  • The backlog becomes a dependency graph, not a flat TODO list.

If you treat it like a checklist, you get constant context-switching and stalled progress.

3) The Discovery

Beads gives you a local-first issue graph (tasks, dependencies, status). bv layers graph math on top (PageRank, bottleneck detection, critical path) so the backlog itself can tell you where leverage lives.

4) The Transformation

The core loop is straightforward:

  1. Capture work as beads (small issues, explicit dependencies).
  2. Ask bv what is most important right now (robot mode).
  3. Claim one issue and finish it.
  4. Repeat.

The payoff isn't extra planning. It's a sharp reduction in thrash. You stop guessing and start executing.

5) The Results

The original post is short, but the signal is clear: the author had an immediate, visceral reaction after adopting the workflow.

In practice, this tends to show up as:

  • Shorter time-to-first-action in each work session
  • Fewer "half done" threads
  • Faster convergence on the work that actually unlocks everything else

6) The Meta Layer

This quote was collected from an X archive and curated into our testimonials dataset as part of dogfooding the xf workflow. We did not have enough source context to derive the exact timestamp or query provenance for this one yet.

Source: https://x.com/i/web/status/2001856410154602662

Ready to transform your workflow?

Join hundreds of developers using these same skills to ship faster, catch more bugs, and maintain better context.

Cancel renewal anytime. Access stays through your paid period.