bitcoin/bips

mirror of https://github.com/bitcoin/bips.git synced 2026-06-01 17:15:27 +00:00

Commit Graph

Author	SHA1	Message	Date

Yuri S Villas Boas

d293ae1a10

BIP450: Formosa—Seed Encoding per Themed Mnemonic Stories (#2108)

* Formosa as BIP

Proposes mnemonic *sentences* instead of words as a forward- and backward-compatible extension of BIP39, submitted as a Bitcoin Improvement Proposal in its own right.

* Update bip.mediawiki

Co-authored-by: Mark "Murch" Erhardt <murch@murch.one>

* Update bip.mediawiki

Satisfy the requirement that the title be fewer than 50 characters.

* Formosa: address PR #2108 review feedback

Restructure the draft to follow BIP-3 conventions and resolve the issues
raised by reviewers in https://github.com/bitcoin/bips/pull/2108:

- Introduce explicit Specification section with a Terminology subsection
that distinguishes 'word', 'category', 'theme', 'sentence' and
'mnemonic' / 'mnemonic story', removing the ambiguity of using
'sentence' at two different scales.
- Replace the unclear 'if the category is led by another category'
wording with an explicit LED_BY field description and a step-by-step
algorithm that covers both the leaderless and led cases.
- Reflow the theme-property list (previously a/b/c/d/e split by an
intervening paragraph) into a single numbered list so it renders as a
list rather than as code blocks.
- Add a dedicated Rationale section covering the 33-bit sentence size,
themed sentences, free-form theme schema, the LED_BY mechanism, the
re-encoding-through-BIP-39 design, and why custom themes are
discouraged.
- Add a dedicated Backwards Compatibility section describing
compatibility at the mnemonic, entropy, and seed levels.
- Add a worked Example section showing a 128-bit entropy being encoded
into a 4-sentence mnemonic story under a small illustrative theme,
including bit splitting, FILLING_ORDER vs NATURAL_ORDER, and the
LED_BY lookup.
- Tighten the Abstract and Motivation; clarify that BIP-39 is itself a
Formosa theme.

* Formosa: spell out abbreviated table labels

Reviewer on PR #2108 asked for no abbreviations in table labels. Replace:

- ENT / CS / S / MS column headers with 'Initial entropy bits',
'Checksum bits', 'Total bits', 'Number of sentences', 'Mnemonic
words (6-word theme)' and 'Mnemonic words (BIP-0039)'.
- 'List size / Bits / Chars to identify / Density (bits/char)' with
'Wordlist size / Bits per word / Characters to identify / Density
(bits per character)'.
- ADJ. with ADJECTIVE in the example bit-assignment diagram, and the
surrounding narrative ENT/MS uses with the spelled-out forms.

The accompanying formulas now use the expanded names too, so the
algorithm description and the table column headers stay consistent.
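The size arithmetic behind those column names can be sketched as below. This is
an illustration, not the BIP text: it assumes the BIP-0039 checksum rule (one
checksum bit per 32 bits of initial entropy) carries over unchanged, and the
function and key names are made up for this sketch.

```python
SENTENCE_BITS = 33       # bits encoded by one sentence (from this commit message)
BIP39_WORD_BITS = 11     # bits encoded by one BIP-0039 word
WORDS_PER_SENTENCE = 6   # the "6-word theme" column

def table_row(initial_entropy_bits: int) -> dict:
    """Compute one row of the size table for a given initial entropy."""
    if initial_entropy_bits <= 0 or initial_entropy_bits % 32 != 0:
        raise ValueError("initial entropy must be a positive multiple of 32 bits")
    checksum_bits = initial_entropy_bits // 32          # assumed BIP-0039 rule
    total_bits = initial_entropy_bits + checksum_bits   # always a multiple of 33
    sentences = total_bits // SENTENCE_BITS
    return {
        "initial_entropy_bits": initial_entropy_bits,
        "checksum_bits": checksum_bits,
        "total_bits": total_bits,
        "number_of_sentences": sentences,
        "mnemonic_words_6_word_theme": sentences * WORDS_PER_SENTENCE,
        "mnemonic_words_bip_0039": total_bits // BIP39_WORD_BITS,
    }

if __name__ == "__main__":
    for bits in (128, 160, 192, 224, 256):
        print(table_row(bits))
```

For 128 bits of initial entropy this gives 4 sentences, which agrees with the
4-sentence worked example mentioned earlier in this message.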

* Formosa: rebuild Example on the real medieval_fantasy theme

Replace the previous hypothetical 5-category example with one that
mirrors the medieval_fantasy theme actually shipped at
https://github.com/Yuri-SVB/formosa/tree/master/src/mnemonic/themes,
including:

- the real 6 categories with their actual BIT_LENGTHs
(VERB=5, SUBJECT=6, OBJECT=6, ADJECTIVE=5, WILDCARD=6, PLACE=5,
summing to 33);
- the real FILLING_ORDER and NATURAL_ORDER;
- the real lead tree (VERB → SUBJECT; SUBJECT → OBJECT and WILDCARD;
OBJECT → ADJECTIVE; WILDCARD → PLACE), showing that a single
leader can have several dependent categories;
- a 33-bit block whose decoded indices (28, 32, 63, 27, 46, 29)
pick existing words and existing sub-list entries:
VERB[28]=unveil, SUBJECT_under_unveil[32]=king,
OBJECT_under_king[63]=wine, ADJECTIVE_under_wine[27]=sweet,
WILDCARD_under_king[46]=queen, PLACE_under_queen[29]=throne_room,
yielding the sentence 'king unveil sweet wine queen throne_room'.

This keeps the worked example faithful to the reference
implementation rather than to a fabricated theme, so that anyone can
reproduce the encoding by parsing medieval_fantasy.json.
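A rough sketch of the bit splitting described above, for readers who want to
check the numbers. The BIT_LENGTHs, indices and words are the ones quoted in
this message. The filling order used here is an assumption (the order in which
the categories are listed above), and the natural order is only inferred from
the example sentence; the real values live in medieval_fantasy.json and are not
reproduced here.

```python
BIT_LENGTHS = {"VERB": 5, "SUBJECT": 6, "OBJECT": 6,
               "ADJECTIVE": 5, "WILDCARD": 6, "PLACE": 5}

# ASSUMPTION: categories consume bits in the order they are listed above.
ASSUMED_FILLING_ORDER = ["VERB", "SUBJECT", "OBJECT",
                         "ADJECTIVE", "WILDCARD", "PLACE"]

# Inferred from the sentence 'king unveil sweet wine queen throne_room'.
INFERRED_NATURAL_ORDER = ["SUBJECT", "VERB", "ADJECTIVE",
                          "OBJECT", "WILDCARD", "PLACE"]

def join_indices(indices: dict) -> str:
    """Concatenate per-category indices into one 33-bit string."""
    return "".join(format(indices[c], f"0{BIT_LENGTHS[c]}b")
                   for c in ASSUMED_FILLING_ORDER)

def split_block(bits: str) -> dict:
    """Split a 33-bit string back into per-category indices."""
    if len(bits) != sum(BIT_LENGTHS.values()):
        raise ValueError("a sentence block must be exactly 33 bits")
    indices, offset = {}, 0
    for category in ASSUMED_FILLING_ORDER:
        width = BIT_LENGTHS[category]
        indices[category] = int(bits[offset:offset + width], 2)
        offset += width
    return indices

if __name__ == "__main__":
    example = {"VERB": 28, "SUBJECT": 32, "OBJECT": 63,
               "ADJECTIVE": 27, "WILDCARD": 46, "PLACE": 29}
    words = {"VERB": "unveil", "SUBJECT": "king", "OBJECT": "wine",
             "ADJECTIVE": "sweet", "WILDCARD": "queen", "PLACE": "throne_room"}
    block = join_indices(example)
    print(block, split_block(block) == example)
    print(" ".join(words[c] for c in INFERRED_NATURAL_ORDER))
```

The word lookup itself is not modelled here; the words are taken as given from
the example, and only the bit bookkeeping and the reordering are shown.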

* Formosa: explain LED_BY as a primitive next-word predictor

Add a paragraph to the LED_BY rationale clarifying that a Formosa theme
behaves as a primitive language model (next-word predictor): each LED_BY relation
skews the conditional distribution over the next word so that probability
mass falls only on the 2^BIT_LENGTH words compatible with the already-
chosen leader, and zero elsewhere. The theme designer plays the role of
training data, hand-curating which combinations are semantically coherent.
This framing makes explicit why themes produce sentences that 'sound right'
while still covering all 2^33 bit patterns of a sentence.
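The next-word-predictor framing can be illustrated with a toy theme. Everything
in this sketch is hypothetical: the two categories, the words, and the "WORDS"
key are invented for illustration and are not the schema of the shipped theme
files. Only the BIT_LENGTH and LED_BY names come from the text above.

```python
# Hypothetical 2-bit toy theme: one leaderless category and one led category.
TOY_THEME = {
    "VERB": {"BIT_LENGTH": 1, "LED_BY": None,
             "WORDS": ["cook", "read"]},
    "OBJECT": {"BIT_LENGTH": 1, "LED_BY": "VERB",
               # One sub-list of 2^BIT_LENGTH words per possible leader word.
               "WORDS": {"cook": ["soup", "bread"],
                         "read": ["book", "letter"]}},
}
TOY_FILLING_ORDER = ["VERB", "OBJECT"]

def encode(bits: str) -> list:
    """Map a bit string to words, restricting each led category to the
    sub-list selected by the word already chosen for its leader."""
    chosen, offset = {}, 0
    for category in TOY_FILLING_ORDER:
        spec = TOY_THEME[category]
        width = spec["BIT_LENGTH"]
        index = int(bits[offset:offset + width], 2)
        offset += width
        if spec["LED_BY"] is None:
            candidates = spec["WORDS"]                          # leaderless
        else:
            candidates = spec["WORDS"][chosen[spec["LED_BY"]]]  # led
        chosen[category] = candidates[index]
    return [chosen[c] for c in TOY_FILLING_ORDER]

if __name__ == "__main__":
    for pattern in ("00", "01", "10", "11"):
        print(pattern, encode(pattern))
```

Every 2-bit pattern maps to a distinct pair, and no pattern can produce an
incoherent pair such as "read soup", which is the property the paragraph above
describes at the scale of 33 bits.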

* Cite the companion project Mooncake (https://github.com/T3-Infosec/mooncake)
which builds on this property by rendering each Formosa category as an
on-screen table whose rows and columns are permuted per input session.

Combined with the randomized-indexation property,
an attacker watching only the screen still learns nothing without also
recovering the press sequence.

Add a Rationale paragraph explaining a further benefit of splitting the
vocabulary into several short wordlists (32-128 entries each): such tables
fit on a mobile-device screen and admit input via on-screen lookup, which
a single 2048-word list does not.

The randomized indexation:

- defeats pure key-logging (keystrokes alone don't reveal words; the
attacker also needs the session permutation),
- raises the bar for shoulder surfing (as with key-logging, only the
keys AND the session's permutation together suffice; either alone is
uninformative).

This gives an operational, security-focused argument for the
many-small-lists design that complements the existing memorization and
information-density arguments.
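A minimal sketch of the randomized-indexation idea, under stated assumptions:
the helper names and the placeholder wordlist are hypothetical, and this is not
Mooncake's actual code. It only shows why a logged on-screen position says
nothing about the word unless the session's permutation is also known.

```python
import secrets

def new_session_permutation(size: int) -> list[int]:
    """Draw a fresh permutation of table positions for one input session."""
    permutation = list(range(size))
    secrets.SystemRandom().shuffle(permutation)  # OS-backed randomness
    return permutation

def word_at_position(wordlist: list[str], permutation: list[int],
                     position: int) -> str:
    """Return the word displayed at a given on-screen position."""
    return wordlist[permutation[position]]

if __name__ == "__main__":
    # Placeholder 32-entry category; real themes use real words.
    wordlist = [f"word{i}" for i in range(32)]
    for session in range(2):
        permutation = new_session_permutation(len(wordlist))
        # The same selected position generally shows a different word.
        print(session, word_at_position(wordlist, permutation, 3))
```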

Formosa: document Mooncake's volume-key input on mobile

Add a paragraph to the Mooncake rationale describing the proposed mobile
input mechanism: reuse of the volume-up / volume-down keys as a two-button
binary selector. Because every Formosa category is sized 2^BIT_LENGTH and
the on-screen table is laid out in rows, sub-rows and columns whose counts
are powers of two, narrowing to a single cell takes exactly BIT_LENGTH
presses (5 for a 32-entry category, 6 for 64, 7 for 128). The per-category
press count is invariant, hence uninformative, and equals the bits of
entropy encoded, so the 'one bit per press' bound matches the existing
side-channel argument.
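The two-button narrowing can be sketched as a halving search. This is an
illustration only: the table is flattened to a single list (the rows, sub-rows
and columns layout is not modelled), and which key keeps which half is an
arbitrary choice made for this sketch.

```python
def select_cell(cells: list, presses: list):
    """Narrow a power-of-two-sized table to one cell, one bit per press."""
    size = len(cells)
    if size == 0 or size & (size - 1):
        raise ValueError("table size must be a power of two")
    bit_length = size.bit_length() - 1
    if len(presses) != bit_length:
        raise ValueError(f"exactly {bit_length} presses are required")
    low, high = 0, size
    for press in presses:
        middle = (low + high) // 2
        if press == "up":        # ASSUMPTION: 'up' keeps the first half
            high = middle
        elif press == "down":    # ASSUMPTION: 'down' keeps the second half
            low = middle
        else:
            raise ValueError("press must be 'up' or 'down'")
    return cells[low]

if __name__ == "__main__":
    table = list(range(32))  # stand-in for a 32-entry category
    print(select_cell(table, ["down", "up", "up", "up", "up"]))
```

Because the number of presses depends only on the table size, every selection
in a category takes the same number of presses regardless of which cell is
chosen.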

Add three concrete reasons why volume-key input on mobile resists visual
shoulder surfing better than an on-screen keyboard:

- Subtler input motions: a single finger pressing a side rocker, much
harder to read from a distance than multi-finger taps on a glass
keyboard.
- Easy occlusion with the second hand: both volume keys are on one edge
of the device, so the free hand (or the holding hand's thumb) can
cover them without obscuring the screen for the user.
- Pocket input via headphone volume buttons: because the protocol is
purely binary, headphone volume controls are sufficient, letting the
user keep the buttons in a pocket while operating them by feel and
removing the input motion from the observer's field of view entirely.

* Update bip.mediawiki

Fixed typo from "dektop" to "desktop"
Fixed agreement of number from "Those of a mobile device" to "Those of mobile devices"

* Update bip.mediawiki

Replaced triple hyphen with —

Co-authored-by: Murch <murch@murch.one>

* Update bip.mediawiki

Updated title to mention Formosa and be more self-explanatory.

Co-authored-by: Murch <murch@murch.one>

* renamed bip.mediawiki to bip-0450.mediawiki
added 450 to BIP number in preamble
added assigned date to 2023-05-02 (date of first mention in email group) in preamble
added corresponding entry to README.md table

* fixed assignment date
shortened title

* BIP-450: fix CI lint failures (field order + README filename)

Two issues caused Build-Table-Checks and Diff-Checks to fail on PR #2108:

1. Preamble field order: scripts/buildtable.pl enforces @FieldOrder
(...License, Discussion, ..., Requires...). The preamble had Requires
before Discussion, causing buildtable.pl to die "Field order is
incorrect", which fails Build-Table-Checks and cascades into
Diff-Checks. Moved the Discussion block above Requires.

2. README table row referenced bip-0450.md, but the file is
bip-0450.mediawiki. buildtable.pl emits the .mediawiki name, so the
README row never matched the generated table and Diff-Checks failed.
Corrected the link target to bip-0450.mediawiki.

Verified locally: buildtable.pl exits 0, diffcheck.sh reports "README
table matches expected table from BIP files", link-format-chk.sh passes.

* bip450: Add dates to discussion header

2026-05-20 12:51:22 -07:00

1 commit