Bech32 encoder can be coerced into producing a string that can't be decoded #314

jonathanknowles · 2019-05-24T07:56:06Z

Context

When encode is applied to a HumanReadablePart containing one or more upper-case characters, it results in an invalid Bech32 string that cannot be decoded with the decode function.

Steps to Reproduce

$ stack ghci bech32
> lower = fromRight undefined $ mkHumanReadablePart "test"
> upper = fromRight undefined $ mkHumanReadablePart "TEST"

Expected behaviour

> encode lower mempty
Right "test12hrzfj"
> encode upper mempty
Right "test12hrzfj" -- SAME

# Should give the same output:
> decode $ fromRight undefined $ encode lower mempty
Right (HumanReadablePart "test","")
> decode $ fromRight undefined $ encode upper mempty
Right (HumanReadablePart "test","") -- SUCCESS

Actual behaviour

# Should give the same output:
> encode lower mempty
Right "test12hrzfj"
> encode upper mempty
Right "test13jgcyw" -- DIFFERENT

# Should give the same output:
> decode $ fromRight undefined $ encode lower mempty
Right (HumanReadablePart "test","")
> decode $ fromRight undefined $ encode upper mempty
Left (StringToDecodeContainsInvalidChars []) -- FAILURE

Cause

This appears to be caused by the encode function. Internally, it calculates a checksum based on the supplied HumanReadablePart, which can be upper or lower case (or mixed case). Unfortunately, the checksum calculation is case-sensitive. Later on, the human-readable part is converted to lower case before inclusion in the final output. If the original HumanReadablePart contained one or more upper case characters, the checksum will be inconsistent with the human readable part of the final output.

-- | Encode a human-readable string and data payload into a Bech32 string.      
encode :: HumanReadablePart -> ByteString -> Either EncodingError ByteString    
encode hrp@(HumanReadablePart hrpBytes) payload = do    
    let payload5 = toBase32 (BS.unpack payload)
    let payload' = payload5 ++ bech32CreateChecksum hrp payload5    
    let rest = map (word5ToChar Arr.!) payload'
    let output = B8.map toLower hrpBytes <> B8.pack "1" <> B8.pack rest   
    guardE (BS.length output <= encodedStringMaxLength) EncodedStringTooLong    
    return output

Why this is definitely a bug

The Bech32 specification states:

"The lowercase form is used when determining a character's value for checksum purposes."

Also, comparing with another official (JavaScript) implementation bitcoinjs/bech32

We have the following behavior:

> bech32.encode('test', [])
'test12hrzfj'
> bech32.encode('Test', [])
'test12hrzfj'
> bech32.encode('TEST', [])
'test12hrzfj'
> bech32.encode('TEsT', [])
'test12hrzfj'
> bech32.encode('TesT', [])
'test12hrzfj'

> bech32.decode("test12hrzfj")
{ prefix: 'test', words: [] }
> bech32.decode("test13jgcyw")
Error: Invalid checksum for test13jgcyw

which is quite aligned with what we now expect and suggest the existence of a bug in the haskell implementation.

Addendum

This bug appears to also affect the Haskell reference implementation for Bech32.

QA

Added an extra regression test to highlight the bug described above: Codec/Binary/Bech32Spec
Also adjusted our arbitrary generator to produce uppercased and lowercased human-readable-part and have roundtrip properties check for both (was only producing lowercased HRP before).
Behavior is now aligned with another (more maintained) reference implementation a.k.a bitcoinjs/bech32

The text was updated successfully, but these errors were encountered:

jonathanknowles · 2019-05-24T09:20:26Z

Raised this issue as a bug against the Haskell Bech32 reference implementation here: sipa/bech32#49

jonathanknowles · 2019-05-24T09:28:16Z

Fix available in #312

piotr-iohk · 2019-05-24T13:07:37Z

👍

jonathanknowles changed the title ~~Bech32 encode function can be coerced into producing a string that can't be decoded with decode.~~ Bech32 encoder can be coerced into producing a string that can't be decoded May 24, 2019

jonathanknowles mentioned this issue May 24, 2019

Fix Bech32 decoder + encoder #312

Merged

6 tasks

jonathanknowles self-assigned this May 24, 2019

jonathanknowles added the BUG label May 24, 2019

KtorZ added this to the Bugs - Sprint 19-20 milestone May 24, 2019

piotr-iohk closed this as completed May 24, 2019

rvl mentioned this issue Dec 11, 2020

Fix flaky WAI middleware logging test cases #2377

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bech32 encoder can be coerced into producing a string that can't be decoded #314

Bech32 encoder can be coerced into producing a string that can't be decoded #314

jonathanknowles commented May 24, 2019 •

edited by KtorZ

Loading

jonathanknowles commented May 24, 2019

jonathanknowles commented May 24, 2019 •

edited

Loading

piotr-iohk commented May 24, 2019

Bech32 encoder can be coerced into producing a string that can't be decoded #314

Bech32 encoder can be coerced into producing a string that can't be decoded #314

Comments

jonathanknowles commented May 24, 2019 • edited by KtorZ Loading

Context

Steps to Reproduce

Expected behaviour

Actual behaviour

Cause

Why this is definitely a bug

Addendum

QA

jonathanknowles commented May 24, 2019

jonathanknowles commented May 24, 2019 • edited Loading

piotr-iohk commented May 24, 2019

jonathanknowles commented May 24, 2019 •

edited by KtorZ

Loading

jonathanknowles commented May 24, 2019 •

edited

Loading