Code samples for the LCP Automatic Key Retrieval #51

Open
mickael-menu opened this issue Oct 20, 2021 · 13 comments

@mickael-menu
Member

mickael-menu commented Oct 20, 2021

I'm opening this issue to gather code samples showing how to encode or decode the lcp_hashed_passphrase key as described in Readium LCP Automatic Key Retrieval.

Encoding

PHP

Courtesy of the Internet Archive.

$lcpHashedPassphrase = base64_encode(hash('sha256', $passphrase, true));

Decoding

Kotlin

Vanilla JVM

import java.util.Base64

val passphrase = Base64.getDecoder().decode(lcpHashedPassphrase)
   .map { String.format("%02x", it) }
   .joinToString(separator = "")

Android

import android.util.Base64

val passphrase = Base64.decode(lcpHashedPassphrase, Base64.DEFAULT)
    .map { String.format("%02x", it) }
    .joinToString(separator = "")

Swift

import Foundation

// Falls back to the raw input if it is not valid Base64.
let passphrase = Data(base64Encoded: lcpHashedPassphrase)
    .map { [UInt8]($0) }?
    .map { String(format: "%02x", $0) }
    .joined() ?? lcpHashedPassphrase

@mickael-menu
Member Author

Thorium seems to have a heuristic to fall back on if the LCP hashed passphrase is not properly encoded.

https://github.com/edrlab/thorium-reader/blob/b282b1187d47caed8d7f25bf92810aa3d80760db/src/main/converter/opds.ts#L92

@danielweck Did you encounter any issues in the wild?

@danielweck
Member

"demoreader" @ cantookstation adds a base64 layer to the lcp_hashed_passphrase hex encoding.

@mickael-menu
Member Author

Okay, that's expected according to the spec:

Note about the computation of the base64-encoded value of the hashed passphrase: from the hashed value of the passphrase, expressed as an hex-encoded string, calculate a byte array (32-bytes / 256-bits binary buffer); for instance, “4981AA…” becomes [49, 81, 170, …]. The expected value is the Base64 encoding of this byte array. Note that a base64 conversion is usually implicitly applied to byte arrays when converted to json structures.
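
To make the quoted note concrete, here is a minimal Node.js sketch of the hex string to byte array to Base64 step. It assumes the digits in the spec's “4981AA…” example are hexadecimal, so the first three bytes are 0x49, 0x81 and 0xAA (73, 129 and 170 in decimal). The helper name `hexToBase64` is made up for illustration and is not part of the spec.

```javascript
// Hypothetical helper (not from the spec): hex string -> byte array -> Base64.
function hexToBase64(hexStr) {
  // Buffer.from(..., "hex") turns each pair of hex digits into one byte.
  const bytes = Buffer.from(hexStr, "hex");
  return bytes.toString("base64");
}

// First three bytes of the spec's example, read as hexadecimal.
const prefixBytes = Array.from(Buffer.from("4981AA", "hex"));
console.log(prefixBytes); // [ 73, 129, 170 ]
console.log(hexToBase64("4981AA")); // SYGq
```

A real hashed passphrase is 64 hex digits (32 bytes), which yields a 44-character Base64 string.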

@danielweck
Member

Ah yes, I have to update the console messages in Thorium :)
https://readium.org/lcp-specs/notes/lcp-key-retrieval.html#the-lcp_hashed_passphrase-element

We've had this dual hex / base64+hex string handling code in Thorium for a while, because at the time the test OPDS feeds implemented different syntaxes. From memory, I'm not sure which ones used the correct base64+hex method.
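
For readers wondering what such dual handling might look like, here is a rough Node.js sketch. It is not Thorium's actual code: the function name `normalizeHashedPassphrase` and the choice to return `null` on failure are assumptions made for illustration. It accepts either a plain 64-digit hex string or the Base64 encoding of the 32-byte hash, and always returns lowercase hex.

```javascript
// Hypothetical normalizer: NOT Thorium's implementation, only a sketch of the idea.
// Returns the hashed passphrase as a 64-character lowercase hex string, or null.
function normalizeHashedPassphrase(value) {
  // Case 1: the feed supplied the hex string directly (64 hex digits = 32 bytes).
  if (/^[0-9a-fA-F]{64}$/.test(value)) {
    return value.toLowerCase();
  }
  // Case 2: Base64 of the 32-byte hash, as the spec describes.
  // Note: Buffer.from(..., "base64") is lenient and skips invalid characters,
  // so the length check below is what rejects malformed input.
  const bytes = Buffer.from(value, "base64");
  if (bytes.length === 32) {
    return bytes.toString("hex");
  }
  return null;
}

console.log(normalizeHashedPassphrase("7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="));
```

The hex check runs first because a 64-character hex string is also made up entirely of valid Base64 characters, so the order of the two tests matters.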

@danielweck
Member

danielweck commented Oct 20, 2021

Encoding

NodeJS

const b64Str = Buffer.from(hexStr, "hex").toString("base64");

Decoding

NodeJS

const hexStr = Buffer.from(b64Str, "base64").toString("hex");

NodeJS example

(execute on the command line with node filename.js)

const crypto = require("crypto");
const pass = "LCP passphrase この世界の謎 \uD83D\uDE00"; // 0x1F600
const checkSum = crypto.createHash("sha256");
checkSum.update(pass);

// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr = checkSum.digest("hex");

// ENCODE:

// "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="
const b64Str = Buffer.from(hexStr, "hex").toString("base64");
console.log(`--------- ENCODE:
"${pass}"
=> "${hexStr}"
=> "${b64Str}"`);

// DECODE:

// "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
const hexStr_ = Buffer.from(b64Str, "base64").toString("hex");
console.log(`--------- DECODE:
"${b64Str}"
=> "${hexStr_}"`);

@danielweck
Member

danielweck commented Oct 20, 2021

Encoding

Javascript (Web Browser, not NodeJS)

const b64Str = self.btoa((new Uint8Array(hexStr.match(/.{1,2}/g).map(b => parseInt(b, 16)))).reduce((s, b) => s + String.fromCharCode(b), ""));

Decoding

Javascript (Web Browser, not NodeJS)

const hexStr = Array.from(self.atob(b64Str)).map(c => c.charCodeAt(0).toString(16).padStart(2, '0')).join('');

Javascript example (Web Browser, not NodeJS)

(copy+paste into Web Inspector to execute, see console log)

(async () => {
    const pass = "LCP passphrase この世界の謎 \uD83D\uDE00"; // 0x1F600
    const checkSumArrayBuffer = await crypto.subtle.digest('SHA-256', (new TextEncoder("utf-8")).encode(pass));

    // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
    const hexStr = Array.from(new Uint8Array(checkSumArrayBuffer)).map(b => b.toString(16).padStart(2, '0')).join('');

    // ENCODE:

    // "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="

    // Method 1 (from hex string)
    const b64Str1 = self.btoa((new Uint8Array(hexStr.match(/.{1,2}/g).map(b => parseInt(b, 16)))).reduce((s, b) => s + String.fromCharCode(b), ""));

    // Method 2 (from hex buffer, and simplified fromCharCode apply)
    const b64Str2 = self.btoa(String.fromCharCode.apply(null, new Uint8Array(checkSumArrayBuffer)));

    console.log(`--------- ENCODE:
    "${pass}"
    => "${hexStr}"
    => "${b64Str1}"
    => "${b64Str2}"`);

    // DECODE:

    // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
    const hexStr_ = Array.from(self.atob(b64Str1)).map(c => c.charCodeAt(0).toString(16).padStart(2, '0')).join('');
    console.log(`--------- DECODE:
    "${b64Str1}"
    => "${hexStr_}"`);
})();

@mickael-menu
Member Author

Thanks Daniel! These JavaScript code samples could be really useful to build a static HTML page to help validate an implementation with a dynamic form to decode/encode an input.

@danielweck
Member

I think the key takeaway is that the base64 encoding layer applies to the hex buffer (the raw bytes), not to the hex string (an easy mistake to make). Once this is clear, it really just boils down to using a suitable standard library or third-party API to convert the string / byte sequence. There are a few different ways of doing this on the Open Web Platform; if you know a simpler method, please chime in :)
(my current proposal is quite convoluted)

I need to update my NodeJS and WebJS examples with Unicode characters and surrogate pairs (in the original non-hashed LCP passphrase), just to make sure I am converting the byte sequence correctly.
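
The buffer-versus-string distinction above can be illustrated with a short Node.js sketch. The variable names `correct` and `mistaken` are only for illustration. The mistaken variant Base64-encodes the 64 ASCII characters of the hex string instead of the 32 bytes they represent, so its output is twice as long.

```javascript
const hexStr = "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759";

// Correct: parse the hex digits into 32 raw bytes, then Base64-encode those bytes.
const correct = Buffer.from(hexStr, "hex").toString("base64");

// Mistaken: Base64-encode the hex string's own characters (64 bytes of ASCII text).
const mistaken = Buffer.from(hexStr, "utf8").toString("base64");

console.log(correct.length, mistaken.length); // 44 88
```

A length check is therefore a cheap sanity test: a Base64 value much longer than 44 characters suggests the hex string itself was encoded.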

@danielweck
Member

FYI, I updated both the NodeJS and WebJS examples with unicode (i.e. Japanese characters and a smile emoji 0x1F600 represented by its surrogate pair):
"LCP passphrase この世界の謎 \uD83D\uDE00"

@danielweck
Member

danielweck commented Oct 21, 2021

Here is an alternative Kotlin example, using java.util.Base64 instead of android.util.Base64:

import java.util.Base64

val hexStr = Base64.getDecoder().decode(base64Str)
    .map { String.format("%02x", it) }
    .joinToString(separator = "")

Kotlin REPL:

https://pl.kotl.in/G_pMqv90L

import java.util.Base64

fun main() {

    val hexStr = Base64.getDecoder().decode("7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=")
    	.map { String.format("%02x", it) }
    	.joinToString(separator = "")

    println(hexStr) // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"
}

@danielweck
Member

danielweck commented Oct 21, 2021

Swift REPL(it):

https://replit.com/@danielweck/LCPPassBase64Hex

import Foundation

let hexStr = Data(base64Encoded: "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k=")
    .map { [UInt8]($0) }?
    .map { String(format: "%02x", $0) }
    .joined()

// hexStr is an Optional, so unwrap it before printing.
print(hexStr ?? "invalid Base64") // "ee06d6201c88d0e51c123cb0464a68bb8b0d4c6b0c413400320375c36b887759"

@danielweck
Member

PHP REPL(it):

https://replit.com/@danielweck/LCPPassBase64HexPHP

<?php
// The result is the Base64 encoding of the raw SHA-256 bytes, not a hex string.
$b64Str = base64_encode(hash('sha256', "LCP passphrase この世界の謎 😀", true));
echo $b64Str; // "7gbWIByI0OUcEjywRkpou4sNTGsMQTQAMgN1w2uId1k="

@danielweck
Member

A somewhat related issue: #52
