2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								<pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  BIP: 158
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Layer: Peer Services
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Title: Compact Block Filters for Light Clients
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Author: Olaoluwa Osuntokun <laolu32@gmail.com>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								          Alex Akselrod <alex@akselrod.org>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Comments-Summary: None yet
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-0158
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Status: Draft
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Type: Standards Track
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  Created: 2017-05-24
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  License: CC0-1.0
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								</pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Abstract ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								This BIP describes a structure for compact filters on block data, for use in the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								BIP 157 light client protocol<ref>bip-0157.mediawiki</ref>. The filter
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								construction proposed is an alternative to Bloom filters, as used in BIP 37,
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								that minimizes filter size by using Golomb-Rice coding for compression. This
							 
						 
					
						
							
								
									
										
										
										
											2019-09-24 15:05:03 -04:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								document specifies one initial filter type based on this construction that
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								enables basic wallets and applications with more advanced smart contracts.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Motivation ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								[[bip-0157.mediawiki|BIP 157]] defines a light client protocol based on
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								deterministic filters of block content. The filters are designed to
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								minimize the expected bandwidth consumed by light clients, downloading filters
							 
						 
					
						
							
								
									
										
										
										
											2019-06-13 12:19:57 -04:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								and full blocks. This document defines the initial filter type ''basic''
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								that is designed to reduce the filter size for regular wallets.
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Definitions ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<code>[]byte</code> represents a vector of bytes.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<code>[N]byte</code> represents a fixed-size byte array with length N.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								''CompactSize'' is a compact encoding of unsigned integers used in the Bitcoin
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								P2P protocol.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								''Data pushes'' are byte vectors pushed to the stack according to the rules of
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Bitcoin script.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								''Bit streams'' are readable and writable streams of individual bits. The
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								following functions are used in the pseudocode in this document:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* <code>new_bit_stream</code> instantiates a new writable bit stream
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* <code>new_bit_stream(vector)</code> instantiates a new bit stream reading data from <code>vector</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* <code>write_bit(stream, b)</code> appends the bit <code>b</code> to the end of the stream
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* <code>read_bit(stream)</code> reads the next available bit from the stream
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* <code>write_bits_big_endian(stream, n, k)</code> appends the <code>k</code> least significant bits of integer <code>n</code> to the end of the stream in big-endian bit order
							 
						 
					
						
							
								
									
										
										
										
											2018-07-07 15:14:27 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								* <code>read_bits_big_endian(stream, k)</code> reads the next available <code>k</code> bits from the stream and interprets them as the least significant bits of a big-endian integer
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								interpreted as described in RFC 2119.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Specification ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								=== Golomb-Coded Sets ===
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								For each block, compact filters are derived containing sets of items associated
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								with the block (eg. addresses sent to, outpoints spent, etc.). A set of such
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								data objects is compressed into a probabilistic structure called a
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								''Golomb-coded set'' (GCS), which matches all items in the set with probability
							 
						 
					
						
							
								
									
										
										
										
											2018-07-09 23:54:12 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								1, and matches other items with probability <code>1/M</code> for some
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								integer parameter <code>M</code>. The encoding is also parameterized by
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<code>P</code>, the bit length of the remainder code. Each filter defined
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								specifies values for <code>P</code> and <code>M</code>.
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								At a high level, a GCS is constructed from a set of <code>N</code> items by:
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								# hashing all items to 64-bit integers in the range <code>[0, N * M)</code>
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								# sorting the hashed values in ascending order
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								# computing the differences between each value and the previous one
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								# writing the differences sequentially, compressed with Golomb-Rice coding
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The following sections describe each step in greater detail.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Hashing Data Objects ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The first step in the filter construction is hashing the variable-sized raw
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								items in the set to the range <code>[0, F)</code>, where <code>F = N *
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								M</code>. Customarily, <code>M</code> is set to <code>2^P</code>. However, if
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								one is able to select both Parameters independently, then more optimal values
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								can be
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								selected<ref>https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845</ref>.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Set membership queries against the hash outputs will have a false positive rate
							 
						 
					
						
							
								
									
										
										
										
											2018-07-09 23:54:12 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								of <code>M</code>. To avoid integer overflow, the number of items <code>N</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								MUST be <2^32 and <code>M</code> MUST be <2^32.
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The items are first passed through the pseudorandom function ''SipHash'', which
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								takes a 128-bit key <code>k</code> and a variable-sized byte vector and produces
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								a uniformly random 64-bit output. Implementations of this BIP MUST use the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								SipHash parameters <code>c = 2</code> and <code>d = 4</code>.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The 64-bit SipHash outputs are then mapped uniformly over the desired range by
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								multiplying with F and taking the top 64 bits of the 128-bit result. This
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								algorithm is a faster alternative to modulo reduction, as it avoids the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								expensive division
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								operation<ref>https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/</ref>.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Note that care must be taken when implementing this reduction to ensure the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								upper 64 bits of the integer multiplication are not truncated; certain
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								architectures and high level languages may require code that decomposes the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								64-bit multiplication into four 32-bit multiplications and recombines into the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								result.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								hash_to_range(item: []byte, F: uint64, k: [16]byte) -> uint64:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    return (siphash(k, item) * F) >> 64
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								hashed_set_construct(raw_items: [][]byte, k: [16]byte, M: uint) -> []uint64:
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								    let N = len(raw_items)
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								    let F = N * M
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let set_items = []
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    for item in raw_items:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        let set_value = hash_to_range(item, F, k)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        set_items.append(set_value)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    return set_items
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								</pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Golomb-Rice Coding ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Instead of writing the items in the hashed set directly to the filter, greater
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								compression is achieved by only writing the differences between successive
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								items in sorted order. Since the items are distributed uniformly, it can be
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								shown that the differences resemble a geometric
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								distribution<ref>https://en.wikipedia.org/wiki/Geometric_distribution</ref>.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								''Golomb-Rice''
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								''coding''<ref>https://en.wikipedia.org/wiki/Golomb_coding#Rice_coding</ref>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								is a technique that optimally compresses geometrically distributed values.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								With Golomb-Rice, a value is split into a quotient and remainder modulo
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<code>2^P</code>, which are encoded separately. The quotient <code>q</code> is
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								encoded as ''unary'', with a string of <code>q</code> 1's followed by one 0. The
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								remainder <code>r</code> is represented in big-endian by P bits. For example,
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								this is a table of Golomb-Rice coded values using <code>P=2</code>:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								{| class="wikitable"
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								! n !! (q, r) !! c
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 0 || (0, 0) || <code>0 00</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 1 || (0, 1) || <code>0 01</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 2 || (0, 2) || <code>0 10</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 3 || (0, 3) || <code>0 11</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 4 || (1, 0) || <code>10 00</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 5 || (1, 1) || <code>10 01</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 6 || (1, 2) || <code>10 10</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 7 || (1, 3) || <code>10 11</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 8 || (2, 0) || <code>110 00</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| 9 || (2, 1) || <code>110 01</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|}
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								golomb_encode(stream, x: uint64, P: uint):
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let q = x >> P
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    while q > 0:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        write_bit(stream, 1)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        q--
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    write_bit(stream, 0)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    write_bits_big_endian(stream, x, P)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								golomb_decode(stream, P: uint) -> uint64:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let q = 0
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    while read_bit(stream) == 1:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        q++
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let r = read_bits_big_endian(stream, P)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let x = (q << P) + r
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    return x
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								</pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Set Construction ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-07-09 23:54:12 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								A GCS is constructed from four parameters:
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								* <code>L</code>, a vector of <code>N</code> raw items
							 
						 
					
						
							
								
									
										
										
										
											2018-07-09 23:54:12 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								* <code>P</code>, the bit parameter of the Golomb-Rice coding
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* <code>M</code>, the target false positive rate
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								* <code>k</code>, the 128-bit key used to randomize the SipHash outputs
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The result is a byte vector with a minimum size of <code>N * (P + 1)</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								bits.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The raw items in <code>L</code> are first hashed to 64-bit unsigned integers as
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								specified above and sorted. The differences between consecutive values,
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								hereafter referred to as ''deltas'', are encoded sequentially to a bit stream
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								with Golomb-Rice coding. Finally, the bit stream is padded with 0's to the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								nearest byte boundary and serialized to the output byte vector.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<pre>
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								construct_gcs(L: [][]byte, P: uint, k: [16]byte, M: uint) -> []byte:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let set_items = hashed_set_construct(L, k, M)
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    set_items.sort()
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let output_stream = new_bit_stream()
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let last_value = 0
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    for item in set_items:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        let delta = item - last_value
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        golomb_encode(output_stream, delta, P)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        last_value = item
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    return output_stream.bytes()
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								</pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Set Querying/Decompression ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								To check membership of an item in a compressed GCS, one must reconstruct the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								hashed set members from the encoded deltas. The procedure to do so is the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								reverse of the compression: deltas are decoded one by one and added to a
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								cumulative sum. Each intermediate sum represents a hashed value in the original
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								set. The queried item is hashed in the same way as the set members and compared
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								against the reconstructed values. Note that querying does not require the entire
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								decompressed set be held in memory at once.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<pre>
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								gcs_match(key: [16]byte, compressed_set: []byte, target: []byte, P: uint, N: uint, M: uint) -> bool:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let F = N * M
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								    let target_hash = hash_to_range(target, F, k)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    stream = new_bit_stream(compressed_set)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let last_value = 0
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    loop N times:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        let delta = golomb_decode(stream, P)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        let set_item = last_value + delta
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        if set_item == target_hash:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								            return true
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        // Since the values in the set are sorted, terminate the search once
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        // the decoded value exceeds the target.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        if set_item > target_hash:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								            break
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        last_value = set_item
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    return false
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								</pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Some applications may need to check for set intersection instead of membership
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								of a single item. This can be performed far more efficiently than checking each
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								item individually by leveraging the sorted structure of the compressed GCS.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								First the query elements are all hashed and sorted, then compared in order
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								against the decompressed GCS contents. See
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								[[#golomb-coded-set-multi-match|Appendix B]] for pseudocode.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								=== Block Filters ===
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:16:25 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								This BIP defines one initial filter type:
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								* Basic (<code>0x00</code>)
							 
						 
					
						
							
								
									
										
										
										
											2018-08-26 22:05:29 +02:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								** <code>M = 784931</code>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								** <code>P = 19</code>
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Contents ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								The basic filter is designed to contain everything that a light client needs to
							 
						 
					
						
							
								
									
										
										
										
											2018-07-03 19:06:41 -05:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								sync a regular Bitcoin wallet. A basic filter MUST contain exactly the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								following items for each transaction in a block:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* The previous output script (the script being spent) for each input, except
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  for the coinbase transaction.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* The scriptPubKey of each output, aside from all <code>OP_RETURN</code> output
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								  scripts.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-08-27 21:41:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								Any "nil" items MUST NOT be included into the final set of filter elements.
							 
						 
					
						
							
								
									
										
										
										
											2018-07-03 19:06:41 -05:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2019-02-12 21:14:29 -08:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								We exclude all outputs that start with <code>OP_RETURN</code> in order to allow
							 
						 
					
						
							
								
									
										
										
										
											2019-02-12 19:57:51 -08:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								filters to easily be committed to in the future via a soft-fork. A likely area
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								for future commitments is an additional <code>OP_RETURN</code> output in the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								coinbase transaction similar to the current witness commitment
							 
						 
					
						
							
								
									
										
										
										
											2019-09-24 15:05:03 -04:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								<ref>https://github.com/bitcoin/bips/blob/master/bip-0141.mediawiki</ref>. By
							 
						 
					
						
							
								
									
										
										
										
											2018-07-03 19:06:41 -05:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								excluding all <code>OP_RETURN</code> outputs we avoid a circular dependency
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								between the commitment, and the item being committed to.
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Construction ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								The basic type is constructed as Golomb-coded sets with the following
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								parameters.
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								The parameter <code>P</code> MUST be set to <code>19</code>, and the parameter
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<code>M</code> MUST be set to <code>784931</code>. Analysis has shown that if
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								one is able to select <code>P</code> and <code>M</code> independently, then
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								setting <code>M=1.497137 * 2^P</code> is close to optimal
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<ref>https://gist.github.com/sipa/576d5f09c3b86c3b1b75598d799fc845</ref>.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Empirical analysis also shows that was chosen as these parameters minimize the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								bandwidth utilized, considering both the expected number of blocks downloaded
							 
						 
					
						
							
								
									
										
										
										
											2018-08-27 21:41:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								due to false positives and the size of the filters themselves.
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-08-27 21:41:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								The parameter <code>k</code> MUST be set to the first 16 bytes of the hash
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								(in standard little-endian representation) of the block for which the filter is
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								constructed. This ensures the key is deterministic while still varying from
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								block to block.
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Since the value <code>N</code> is required to decode a GCS, a serialized GCS
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								includes it as a prefix, written as a <code>CompactSize</code>. Thus, the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								complete serialization of a filter is:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								* <code>N</code>, encoded as a <code>CompactSize</code>
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								* The bytes of the compressed filter itself
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Signaling ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								This BIP allocates a new service bit:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								{| class="wikitable"
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								|-
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| NODE_COMPACT_FILTERS
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								| style="white-space: nowrap;" | <code>1 << 6</code>
							 
						 
					
						
							
								
									
										
										
										
											2019-09-24 15:05:03 -04:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								| If enabled, the node MUST respond to all BIP 157 messages for filter type <code>0x00</code>
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								|}
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Compatibility ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								This block filter construction is not incompatible with existing software,
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								though it requires implementation of the new filters.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Acknowledgments ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								We would like to thank bfd (from the bitcoin-dev mailing list) for bringing the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								basis of this BIP to our attention, Greg Maxwell for pointing us in the
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								direction of Golomb-Rice coding and fast range optimization, Pieter Wullie for
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								his analysis of optimal GCS parameters, and Pedro
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								Martelletto for writing the initial indexing code for <code>btcd</code>.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								We would also like to thank Dave Collins, JJ Jeffrey, and Eric Lombrozo for
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								useful discussions.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Reference Implementation ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Light client: [https://github.com/lightninglabs/neutrino]
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Full-node indexing: https://github.com/Roasbeef/btcd/tree/segwit-cbf
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2019-06-13 13:52:10 -05:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								Golomb-Rice Coded sets: https://github.com/btcsuite/btcutil/blob/master/gcs
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Appendix A: Alternatives ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								A number of alternative set encodings were considered before Golomb-coded
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								sets were settled upon. In this appendix section, we'll list a few of the
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								alternatives along with our rationale for not pursuing them.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Bloom Filters ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Bloom Filters are perhaps the best known probabilistic data structure for
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								testing set membership, and were introduced into the Bitcoin protocol with BIP
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								37. The size of a Bloom filter is larger than the expected size of a GCS with
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								the same false positive rate, which is the main reason the option was rejected.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Cryptographic Accumulators ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								Cryptographic
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								accumulators<ref>https://en.wikipedia.org/wiki/Accumulator_(cryptography)</ref>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								are a cryptographic data structures that enable (amongst other operations) a one
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								way membership test. One advantage of accumulators are that they are constant
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								size, independent of the number of elements inserted into the accumulator.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								However, current constructions of cryptographic accumulators require an initial
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								trusted set up. Additionally, accumulators based on the Strong-RSA Assumption
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								require mapping set items to prime representatives in the associated group which
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								can be preemptively expensive.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								==== Matrix Based Probabilistic Set Data Structures ====
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								There exist data structures based on matrix solving which are even more space
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								efficient compared to Bloom
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								filters<ref>https://arxiv.org/pdf/0804.1845.pdf</ref>. We instead opted for our
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								GCS-based filters as they have a much lower implementation complexity and are
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								easier to understand.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Appendix B: Pseudocode ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								=== Golomb-Coded Set Multi-Match ===
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<pre>
							 
						 
					
						
							
								
									
										
										
										
											2018-05-30 17:18:24 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								gcs_match_any(key: [16]byte, compressed_set: []byte, targets: [][]byte, P: uint, N: uint, M: uint) -> bool:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let F = N * M
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    // Map targets to the same range as the set hashes.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let target_hashes = []
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    for target in targets:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        let target_hash = hash_to_range(target, F, k)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        target_hashes.append(target_hash)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    // Sort targets so matching can be checked in linear time.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    target_hashes.sort()
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    stream = new_bit_stream(compressed_set)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let value = 0
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let target_idx = 0
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    let target_val = target_hashes[target_idx]
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    loop N times:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        let delta = golomb_decode(stream, P)
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        value += delta
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								        inner loop:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								            if target_val == value:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								                return true
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								            // Move on to the next set value.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								            else if target_val > value:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								                break inner loop
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								            // Move on to the next target value.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								            else if target_val < value:
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								                target_idx++
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								                // If there are no targets left, then there are no matches.
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								                if target_idx == len(targets):
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								                    break outer loop
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								                target_val = target_hashes[target_idx]
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								    return false
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								</pre>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Appendix C: Test Vectors ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-08-27 21:32:22 -07:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								Test vectors for basic block filters on five testnet blocks, including the filters and filter headers, can be found [[bip-0158/testnet-19.json|here]]. The code to generate them can be found [[bip-0158/gentestvectors.go|here]].
							 
						 
					
						
							
								
									
										
										
										
											2017-06-01 11:59:19 -07:00 
										
									 
								 
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== References ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								<references/>
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								== Copyright ==
							 
						 
					
						
							
								
							 
							
								
							 
							
								 
							
							
								
							 
						 
					
						
							
								
									
										
										
										
											2018-08-26 22:05:29 +02:00 
										
									 
								 
							 
							
								
									
										 
								
							 
							
								 
							
							
								This document is licensed under the  Creative Commons CC0 1.0 Universal license.