mirror of https://github.com/bitcoin/bips.git synced 2026-06-01 17:15:27 +00:00
bips/bip-0450.mediawiki
Yuri S Villas Boas d293ae1a10 BIP450: Formosa—Seed Encoding per Themed Mnemonic Stories (#2108)
* Formosa as BIP

Proposes mnemonic *sentences* instead of words as a forwards- and backwards-compatible expansion of BIP39, itself submitted as a Bitcoin Improvement Proposal.

* Update bip.mediawiki

Co-authored-by: Mark "Murch" Erhardt <murch@murch.one>

* Update bip.mediawiki

Satisfy the requirement that the title be shorter than 50 characters.

* Formosa: address PR #2108 review feedback

Restructure the draft to follow BIP-3 conventions and resolve the issues
raised by reviewers in https://github.com/bitcoin/bips/pull/2108:

- Introduce explicit Specification section with a Terminology subsection
  that distinguishes 'word', 'category', 'theme', 'sentence' and
  'mnemonic' / 'mnemonic story', removing the ambiguity of using
  'sentence' at two different scales.
- Replace the unclear 'if the category is led by another category'
  wording with an explicit LED_BY field description and a step-by-step
  algorithm that covers both the leaderless and led cases.
- Reflow the theme-property list (previously a/b/c/d/e split by an
  intervening paragraph) into a single numbered list so it renders as a
  list rather than as code blocks.
- Add a dedicated Rationale section covering the 33-bit sentence size,
  themed sentences, free-form theme schema, the LED_BY mechanism, the
  re-encoding-through-BIP-39 design, and why custom themes are
  discouraged.
- Add a dedicated Backwards Compatibility section describing
  compatibility at the mnemonic, entropy, and seed levels.
- Add a worked Example section showing a 128-bit entropy being encoded
  into a 4-sentence mnemonic story under a small illustrative theme,
  including bit splitting, FILLING_ORDER vs NATURAL_ORDER, and the
  LED_BY lookup.
- Tighten the Abstract and Motivation; clarify that BIP-39 is itself a
  Formosa theme.

* Formosa: spell out abbreviated table labels

Reviewer on PR #2108 asked for no abbreviations in table labels. Replace:

- ENT / CS / S / MS column headers with 'Initial entropy bits',
  'Checksum bits', 'Total bits', 'Number of sentences', 'Mnemonic
  words (6-word theme)' and 'Mnemonic words (BIP-0039)'.
- 'List size / Bits / Chars to identify / Density (bits/char)' with
  'Wordlist size / Bits per word / Characters to identify / Density
  (bits per character)'.
- ADJ. with ADJECTIVE in the example bit-assignment diagram, and the
  surrounding narrative ENT/MS uses with the spelled-out forms.

The accompanying formulas now use the expanded names too, so the
algorithm description and the table column headers stay consistent.
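
The relationship between the spelled-out column headers can be sketched as below. This is a minimal illustration, not the reference implementation: the function name and dictionary keys are hypothetical, the checksum rule (one checksum bit per 32 entropy bits) is the BIP-0039 rule, and the 33-bit sentence size and 6-word theme are taken from this commit message.

```python
def formosa_sizes(initial_entropy_bits, words_per_sentence=6):
    """Derive the table columns for a given entropy size (hypothetical helper)."""
    if initial_entropy_bits <= 0 or initial_entropy_bits % 32 != 0:
        raise ValueError("initial entropy must be a positive multiple of 32 bits")
    checksum_bits = initial_entropy_bits // 32          # BIP-0039 checksum rule
    total_bits = initial_entropy_bits + checksum_bits   # always a multiple of 33
    number_of_sentences = total_bits // 33              # one sentence per 33 bits
    return {
        "initial_entropy_bits": initial_entropy_bits,
        "checksum_bits": checksum_bits,
        "total_bits": total_bits,
        "number_of_sentences": number_of_sentences,
        "mnemonic_words_6_word_theme": number_of_sentences * words_per_sentence,
        "mnemonic_words_bip39": total_bits // 11,        # 11 bits per BIP-0039 word
    }
```

For 128 bits of initial entropy this yields 4 sentences, matching the 4-sentence worked example mentioned above.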

* Formosa: rebuild Example on the real medieval_fantasy theme

Replace the previous hypothetical 5-category example with one that
mirrors the medieval_fantasy theme actually shipped at
https://github.com/Yuri-SVB/formosa/tree/master/src/mnemonic/themes,
including:

- the real 6 categories with their actual BIT_LENGTHs
  (VERB=5, SUBJECT=6, OBJECT=6, ADJECTIVE=5, WILDCARD=6, PLACE=5,
  summing to 33);
- the real FILLING_ORDER and NATURAL_ORDER;
- the real lead tree (VERB → SUBJECT; SUBJECT → OBJECT and WILDCARD;
  OBJECT → ADJECTIVE; WILDCARD → PLACE), showing that a single
  leader can have several dependent categories;
- a 33-bit block whose decoded indices (28, 32, 63, 27, 46, 29)
  pick existing words and existing sub-list entries: VERB[28]
  =unveil, SUBJECT_under_unveil[32]=king, OBJECT_under_king[63]
  =wine, ADJECTIVE_under_wine[27]=sweet, WILDCARD_under_king[46]
  =queen, PLACE_under_queen[29]=throne_room, yielding the sentence
  'king unveil sweet wine queen throne_room'.

This keeps the worked example faithful to the reference
implementation rather than to a fabricated theme, so that anyone can
reproduce the encoding by parsing medieval_fantasy.json.
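
The bit splitting in the example above can be sketched roughly as follows. The BIT_LENGTHs, indices and words come from this commit message; everything else is an assumption made for illustration: the FILLING_ORDER is assumed to equal the order in which the categories are listed, the NATURAL_ORDER is inferred from the example sentence, bits are assumed to be consumed most-significant first, and the function names are hypothetical. The authoritative values live in medieval_fantasy.json.

```python
BIT_LENGTHS = {"VERB": 5, "SUBJECT": 6, "OBJECT": 6,
               "ADJECTIVE": 5, "WILDCARD": 6, "PLACE": 5}  # sums to 33

# Assumption: filling order equals the listing order in the commit message.
FILLING_ORDER = ["VERB", "SUBJECT", "OBJECT", "ADJECTIVE", "WILDCARD", "PLACE"]
# Inferred from the sentence 'king unveil sweet wine queen throne_room'.
NATURAL_ORDER = ["SUBJECT", "VERB", "ADJECTIVE", "OBJECT", "WILDCARD", "PLACE"]


def pack_indices(indices):
    """Concatenate per-category indices into one 33-bit integer (MSB first)."""
    block = 0
    for category in FILLING_ORDER:
        block = (block << BIT_LENGTHS[category]) | indices[category]
    return block


def split_block(block):
    """Split a 33-bit integer back into per-category indices."""
    indices = {}
    shift = sum(BIT_LENGTHS.values())
    for category in FILLING_ORDER:
        shift -= BIT_LENGTHS[category]
        mask = (1 << BIT_LENGTHS[category]) - 1
        indices[category] = (block >> shift) & mask
    return indices


EXAMPLE_INDICES = {"VERB": 28, "SUBJECT": 32, "OBJECT": 63,
                   "ADJECTIVE": 27, "WILDCARD": 46, "PLACE": 29}
# Words selected by those indices, as stated in the commit message.
EXAMPLE_WORDS = {"VERB": "unveil", "SUBJECT": "king", "OBJECT": "wine",
                 "ADJECTIVE": "sweet", "WILDCARD": "queen", "PLACE": "throne_room"}

block = pack_indices(EXAMPLE_INDICES)
sentence = " ".join(EXAMPLE_WORDS[category] for category in NATURAL_ORDER)
```

Packing and splitting are inverses of each other, so `split_block(block)` recovers the six indices, and arranging the selected words in NATURAL_ORDER reproduces the example sentence.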

* Formosa: explain LED_BY as a primitive next-word predictor

Add a paragraph to the LED_BY rationale clarifying that a Formosa theme
behaves as a primitive language model (next-word predictor): each LED_BY relation
skews the conditional distribution over the next word so that probability
mass falls only on the 2^BIT_LENGTH words compatible with the already-
chosen leader, and zero elsewhere. The theme designer plays the role of
training data, hand-curating which combinations are semantically coherent.
This framing makes explicit why themes produce sentences that 'sound right'
while still covering all 2^33 bit patterns of a sentence.
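
The next-word-predictor framing can be made concrete with a toy sketch. The sub-lists below are entirely hypothetical (two leaders with BIT_LENGTH = 1 each, not taken from any shipped theme), and the uniform probability assumes the underlying entropy bits are uniformly random.

```python
from fractions import Fraction

# Hypothetical toy theme: each leader word owns a sub-list of 2^1 led words.
TOY_LED_LISTS = {
    "king": ["wine", "sword"],
    "dragon": ["gold", "cave"],
}


def next_word_distribution(leader):
    """Conditional distribution over the led category given the leader word."""
    allowed = TOY_LED_LISTS[leader]
    vocabulary = sorted({w for words in TOY_LED_LISTS.values() for w in words})
    # Uniform mass on the compatible words, zero on everything else.
    return {w: (Fraction(1, len(allowed)) if w in allowed else Fraction(0))
            for w in vocabulary}
```

Here `next_word_distribution("king")` assigns probability 1/2 to each of "wine" and "sword" and 0 to "gold" and "cave", which is the skew the LED_BY relation describes.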

* Cite the companion project Mooncake (https://github.com/T3-Infosec/mooncake)
which builds on this property by rendering each Formosa category as an
on-screen table whose rows and columns are permuted per input session.

Combined with the randomized-indexation property,
an attacker watching only the screen still learns nothing without also
recovering the press sequence.

Add a Rationale paragraph explaining a further benefit of splitting the
vocabulary into several short wordlists (32-128 entries each): such tables
fit on a mobile-device screen and admit input via on-screen lookup, which
a single 2048-word list does not.

The randomized indexation:

- defeats pure key-logging (keystrokes alone don't reveal words; the
  attacker also needs the session permutation),
- raises the bar for shoulder surfing (as with key-logging, only the keys
  AND the session's permutation together suffice; either alone is
  uninformative).

This gives an operational, security-focused argument for the
many-small-lists design that complements the existing memorization and
information-density arguments.
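
The randomized-indexation idea can be sketched as below. This is only an illustration of a per-session permutation, not Mooncake's actual code: the function names are hypothetical, and the layout is reduced to a flat list of screen positions.

```python
import secrets


def new_session_permutation(size):
    """Return a fresh random layout: layout[screen_position] = wordlist index."""
    layout = list(range(size))
    secrets.SystemRandom().shuffle(layout)  # OS-backed randomness
    return layout


def word_index_at(layout, screen_position):
    """What the user actually selected, given the session layout."""
    return layout[screen_position]


def screen_position_of(layout, word_index):
    """Where a given word is displayed in this session."""
    return layout.index(word_index)
```

An observer who records only the selected screen positions cannot map them to wordlist indices without the session's `layout`, and an observer who sees only the `layout` does not know which positions were selected.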

* Formosa: document Mooncake's volume-key input on mobile

Add a paragraph to the Mooncake rationale describing the proposed mobile
input mechanism: reuse of the volume-up / volume-down keys as a two-button
binary selector. Because every Formosa category is sized 2^BIT_LENGTH and
the on-screen table is laid out in rows, sub-rows and columns whose counts
are powers of two, narrowing to a single cell takes exactly BIT_LENGTH
presses (5 for a 32-entry category, 6 for 64, 7 for 128). The per-category
press count is invariant, and therefore uninformative, and equals the number
of entropy bits encoded; the 'one bit per press' bound matches the existing
side-channel argument.
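
The two-button selection can be sketched as a plain binary search. This is a simplified model under stated assumptions: the table is flattened to 2^BIT_LENGTH cells, "up" is assumed to keep the first half and "down" the second half, and the function name is hypothetical. The real row / sub-row / column layout may map presses differently.

```python
def select_cell(bit_length, presses):
    """Narrow 2**bit_length cells to one cell using exactly bit_length presses."""
    if len(presses) != bit_length:
        raise ValueError("exactly one press per bit is required")
    low, high = 0, 1 << bit_length  # candidate cells are low .. high - 1
    for press in presses:
        mid = (low + high) // 2
        if press == "up":       # assumption: 'up' keeps the first half
            high = mid
        elif press == "down":   # assumption: 'down' keeps the second half
            low = mid
        else:
            raise ValueError("press must be 'up' or 'down'")
    return low
```

Each press halves the candidate range, so a 32-entry category always takes 5 presses regardless of which cell is chosen.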

Add three concrete reasons why volume-key input on mobile resists visual
shoulder surfing better than an on-screen keyboard:

- Subtler input motions: a single finger pressing a side rocker, much
  harder to read from a distance than multi-finger taps on a glass
  keyboard.
- Easy occlusion with the second hand: both volume keys are on one edge
  of the device, so the free hand (or the holding hand's thumb) can
  cover them without obscuring the screen for the user.
- Pocket input via headphone volume buttons: because the protocol is
  purely binary, headphone volume controls are sufficient, letting the
  user keep the buttons in a pocket while operating them by feel and
  removing the input motion from the observer's field of view entirely.

* Update bip.mediawiki

Fixed typo from "dektop" to "desktop"
Fixed agreement of number from "Those of a mobile device" to "Those of mobile devices"

* Update bip.mediawiki

Substituted triple hyphen for —

Co-authored-by: Murch <murch@murch.one>

* Update bip.mediawiki

Updated title to mention Formosa and be more self-explanatory.

Co-authored-by: Murch <murch@murch.one>

* renamed bip.mediawiki to bip-0450.mediawiki
added 450 as the BIP number in preamble
set assigned date to 2023-05-02 (date of first mention in email group) in preamble
added corresponding entry to README.md table

* fixed assignment date
shortened title

* BIP-450: fix CI lint failures (field order + README filename)

Two issues caused Build-Table-Checks and Diff-Checks to fail on PR #2108:

1. Preamble field order: scripts/buildtable.pl enforces @FieldOrder
   (...License, Discussion, ..., Requires...). The preamble had Requires
   before Discussion, causing buildtable.pl to die "Field order is
   incorrect", which fails Build-Table-Checks and cascades into
   Diff-Checks. Moved the Discussion block above Requires.

2. README table row referenced bip-0450.md, but the file is
   bip-0450.mediawiki. buildtable.pl emits the .mediawiki name, so the
   README row never matched the generated table and Diff-Checks failed.
   Corrected the link target to bip-0450.mediawiki.

Verified locally: buildtable.pl exits 0, diffcheck.sh reports "README
table matches expected table from BIP files", link-format-chk.sh passes.

* bip450: Add dates to discussion header
2026-05-20 12:51:22 -07:00

25 KiB