feat(x/data)!: enable off-chain coordination of supported algorithms #2102

Merged
merged 25 commits into from
Jan 30, 2024
Changes from 5 commits
1,943 changes: 1,718 additions & 225 deletions api/regen/data/v1/types.pulsar.go

Large diffs are not rendered by default.

86 changes: 76 additions & 10 deletions proto/regen/data/v1/types.proto
@@ -5,21 +5,39 @@ package regen.data.v1;
option go_package = "github.com/regen-network/regen-ledger/x/data";

// ContentHash specifies a hash-based content identifier for a piece of data.
// Exactly one of its fields must be set so this message behaves like a oneof.
// A protobuf oneof was not used because this caused compatibility issues with
// amino signing.
message ContentHash {

// Deprecated: use RawV2 instead.
// Raw specifies "raw" data which does not specify a deterministic, canonical
// encoding. Users of these hashes MUST maintain a copy of the hashed data
// which is preserved bit by bit. All other content encodings specify a
// deterministic, canonical encoding allowing implementations to choose from a
// variety of alternative formats for transport and encoding while maintaining
// the guarantee that the canonical hash will not change. The media type for
// "raw" data is defined by the MediaType enum.
Raw raw = 1;
Raw raw = 1 [deprecated = true];

// Deprecated: use GraphV2 instead.
// Graph specifies graph data that conforms to the RDF data model.
// The canonicalization algorithm used for an RDF graph is specified by
// GraphCanonicalizationAlgorithm.
Graph graph = 2;
Graph graph = 2 [deprecated = true];

// raw_v2 specifies "raw" data which does not specify a deterministic, canonical
// encoding. Users of these hashes MUST maintain a copy of the hashed data
// which is preserved bit by bit. All other content encodings specify a
// deterministic, canonical encoding allowing implementations to choose from a
// variety of alternative formats for transport and encoding while maintaining
// the guarantee that the canonical hash will not change. The media type for
// "raw" data is defined by the MediaType enum.
RawV2 raw_v2 = 3;

// graph_v2 specifies graph data that conforms to the RDF data model.
// The canonicalization algorithm used for an RDF graph is specified by
// GraphCanonicalizationAlgorithm.
GraphV2 graph_v2 = 4;

// Raw is the content hash type used for raw data.
message Raw {
@@ -50,21 +68,64 @@ message ContentHash {
// merkle_tree is the merkle tree type used for the graph hash, if any.
GraphMerkleTree merkle_tree = 4;
}

// RawV2 is the content hash type used for raw data.
message RawV2 {
// hash represents the hash of the data based on the specified
// digest_algorithm. It must be at least 20 bytes long and at most 64 bytes long.
bytes hash = 1;

// digest_algorithm represents the hash digest algorithm and should be a non-zero value from the DigestAlgorithm enum.
uint32 digest_algorithm = 2;

// file_extension represents the file extension for raw data. It must be
// between 2 and 6 characters long, must be all lower-case, and should represent
// the canonical extension for the media type.
//
// The following canonical extensions SHOULD be used by implementations:
// txt, json, csv, xml, pdf, tiff,
// jpg, png, svg, webp, avif, gif, apng, mpeg, mp4, webm, ogg, heic, raw.
//
// The above list should be updated as new media types come into common usage,
// especially when there are two or more possible extensions (e.g. jpg vs jpeg or tif vs tiff).
string file_extension = 3;
}

// GraphV2 is the content hash type used for RDF graph data.
message GraphV2 {
// hash represents the hash of the data based on the specified
// digest_algorithm. It must be at least 20 bytes long and at most 64 bytes long.
bytes hash = 1;

// digest_algorithm represents the hash digest algorithm and should be a non-zero value from the DigestAlgorithm enum.
uint32 digest_algorithm = 2;

// canonicalization_algorithm represents the RDF graph
// canonicalization algorithm and should be a non-zero value from the GraphCanonicalizationAlgorithm enum.
uint32 canonicalization_algorithm = 3;

// merkle_tree is the merkle tree type used for the graph hash, if any. It should be a value from the
// GraphMerkleTree enum or left unspecified.
uint32 merkle_tree = 4;
}
}
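
Reviewer note: the RawV2 field comments above describe three constraints (hash length, non-zero digest algorithm, file extension shape). A minimal Go sketch of those checks follows. The `RawV2` struct and `ValidateRawV2` function are hypothetical stand-ins written for illustration, not the generated types or the validation code in this PR, and allowing digits in the extension is an assumption based on `mp4` appearing in the canonical list.

```go
package main

import (
	"errors"
	"fmt"
)

// RawV2 is a hypothetical stand-in for the generated message type.
type RawV2 struct {
	Hash            []byte
	DigestAlgorithm uint32
	FileExtension   string
}

// ValidateRawV2 applies the constraints stated in the proto comments.
func ValidateRawV2(r RawV2) error {
	// The hash must be at least 20 and at most 64 bytes long.
	if n := len(r.Hash); n < 20 || n > 64 {
		return fmt.Errorf("hash must be 20-64 bytes long, got %d", n)
	}
	// Zero is DIGEST_ALGORITHM_UNSPECIFIED, which is invalid.
	if r.DigestAlgorithm == 0 {
		return errors.New("digest_algorithm must be non-zero")
	}
	// The extension must be between 2 and 6 characters long.
	if n := len(r.FileExtension); n < 2 || n > 6 {
		return fmt.Errorf("file_extension must be 2-6 characters long, got %d", n)
	}
	for _, c := range r.FileExtension {
		// Assumption: digits are accepted alongside lower-case letters.
		if !(c >= 'a' && c <= 'z') && !(c >= '0' && c <= '9') {
			return fmt.Errorf("file_extension has invalid character %q", c)
		}
	}
	return nil
}

func main() {
	ok := RawV2{Hash: make([]byte, 32), DigestAlgorithm: 1, FileExtension: "csv"}
	fmt.Println(ValidateRawV2(ok))

	bad := RawV2{Hash: make([]byte, 32), DigestAlgorithm: 1, FileExtension: "JPEG"}
	fmt.Println(ValidateRawV2(bad))
}
```

Note that the sketch does not check the digest algorithm against the enum, in line with the stated intent that V2 values are coordinated off-chain.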

// DigestAlgorithm is the hash digest algorithm
//
// With V2 of raw and graph hash, this enum is no longer validated on-chain.
// However, this enum SHOULD still be used and updated as a registry of known digest
// algorithms and all implementations should coordinate on these values.
enum DigestAlgorithm {

// unspecified and invalid
DIGEST_ALGORITHM_UNSPECIFIED = 0;

// BLAKE2b-256
DIGEST_ALGORITHM_BLAKE2B_256 = 1;
}

// Deprecated: use RawV2 instead.
// RawMediaType defines MIME media types to be used with a ContentHash.Raw hash.
enum RawMediaType {

// RAW_MEDIA_TYPE_UNSPECIFIED can be used for raw binary data
RAW_MEDIA_TYPE_UNSPECIFIED = 0;

@@ -127,25 +188,30 @@ enum RawMediaType {
}

// GraphCanonicalizationAlgorithm is the graph canonicalization algorithm
//
// With V2 of the graph hash, this enum is no longer validated on-chain.
// However, this enum SHOULD still be used and updated as a registry of known canonicalization
// algorithms and all implementations should coordinate on these values.
enum GraphCanonicalizationAlgorithm {

// unspecified and invalid
GRAPH_CANONICALIZATION_ALGORITHM_UNSPECIFIED = 0;

// URDNA2015 graph hashing
GRAPH_CANONICALIZATION_ALGORITHM_URDNA2015 = 1;
}

// GraphMerkleTree is the graph merkle tree type used for hashing, if any
// GraphMerkleTree is the graph merkle tree type used for hashing, if any.
//
// With V2 of the graph hash, this enum is no longer validated on-chain.
// However, this enum SHOULD still be used and updated as a registry of known merkle tree
// types and all implementations should coordinate on these values.
enum GraphMerkleTree {

// unspecified and valid
GRAPH_MERKLE_TREE_NONE_UNSPECIFIED = 0;
}

// ContentHashes contains a list of ContentHash values.
message ContentHashes {

// content_hashes is a list of content hashes which the resolver claims to serve.
repeated ContentHash content_hashes = 1;
}
}
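
Reviewer note: since the V2 enum values are no longer validated on-chain, clients have to decide how to treat values they do not recognize. One possible client-side approach is sketched below in Go. The `knownDigestAlgorithms` map and `DescribeDigestAlgorithm` function are hypothetical names introduced for illustration; only the numeric values 0 and 1 come from the `DigestAlgorithm` enum in this diff.

```go
package main

import "fmt"

// Values mirror the DigestAlgorithm enum in types.proto.
const (
	DigestAlgorithmUnspecified uint32 = 0
	DigestAlgorithmBlake2b256  uint32 = 1
)

// knownDigestAlgorithms is a hypothetical client-side copy of the registry.
// It would need updating whenever the enum gains a new value.
var knownDigestAlgorithms = map[uint32]string{
	DigestAlgorithmBlake2b256: "BLAKE2b-256",
}

// DescribeDigestAlgorithm returns a label for v and reports whether this
// client knows how to verify hashes produced with it.
func DescribeDigestAlgorithm(v uint32) (string, bool) {
	if v == DigestAlgorithmUnspecified {
		return "unspecified (invalid)", false
	}
	if name, ok := knownDigestAlgorithms[v]; ok {
		return name, true
	}
	// The chain accepts the value, but this client cannot interpret it.
	return fmt.Sprintf("unknown digest algorithm %d", v), false
}

func main() {
	for _, v := range []uint32{0, 1, 7} {
		name, known := DescribeDigestAlgorithm(v)
		fmt.Printf("%d: %s (known=%t)\n", v, name, known)
	}
}
```

The same pattern would apply to `GraphCanonicalizationAlgorithm` and `GraphMerkleTree`, with the difference that zero is a valid value for the merkle tree type.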
2 changes: 0 additions & 2 deletions scripts/protocgen.sh
@@ -30,6 +30,4 @@ cd ..
cp -r github.com/regen-network/regen-ledger/* ./
rm -rf github.com

go mod tidy

./scripts/protocgen2.sh
2 changes: 1 addition & 1 deletion scripts/protocgen2.sh
@@ -5,7 +5,7 @@ set -eo pipefail
protoc_gen_install() {
go install github.com/cosmos/cosmos-proto/cmd/protoc-gen-go-pulsar@latest #2>/dev/null
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest #2>/dev/null
go install github.com/cosmos/cosmos-sdk/orm/cmd/protoc-gen-go-cosmos-orm@latest #2>/dev/null
go install github.com/cosmos/cosmos-sdk/orm/cmd/protoc-gen-go-cosmos-orm@v1.0.0-alpha.12 #2>/dev/null
}

protoc_gen_install
4 changes: 2 additions & 2 deletions x/data/features/types_content_hash.feature
@@ -35,7 +35,7 @@ Feature: Types
{}
"""
When the content hash is validated
Then expect the error "content hash must be one of raw type or graph type: invalid request"
Then expect the error "exactly one of ContentHash's fields should be set: invalid request"

Scenario: an error is returned if content hash includes both raw type and graph type
Given the content hash
@@ -46,7 +46,7 @@ Feature: Types
}
"""
When the content hash is validated
Then expect the error "content hash must be one of raw type or graph type: invalid request"
Then expect the error "exactly one of ContentHash's fields should be set: invalid request"

Scenario: an error is returned if raw content hash is empty
Given the content hash
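
Reviewer note: the updated scenarios expect one error message both when no field is set and when more than one is set, which suggests a count-based check. A rough Go sketch of such a check is below. The `ContentHash` struct here is a hypothetical stand-in with placeholder pointer fields, not the generated type, and the sketch omits the ": invalid request" suffix that appears in the expected error string in the feature file.

```go
package main

import (
	"errors"
	"fmt"
)

// ContentHash is a hypothetical stand-in for the generated message.
// A nil pointer models an unset field.
type ContentHash struct {
	Raw     *struct{}
	Graph   *struct{}
	RawV2   *struct{}
	GraphV2 *struct{}
}

var errExactlyOne = errors.New("exactly one of ContentHash's fields should be set")

// Validate emulates oneof semantics by counting the fields that are set.
func (ch ContentHash) Validate() error {
	set := 0
	for _, isSet := range []bool{
		ch.Raw != nil, ch.Graph != nil, ch.RawV2 != nil, ch.GraphV2 != nil,
	} {
		if isSet {
			set++
		}
	}
	if set != 1 {
		return errExactlyOne
	}
	return nil
}

func main() {
	fmt.Println(ContentHash{}.Validate())
	fmt.Println(ContentHash{Raw: &struct{}{}, Graph: &struct{}{}}.Validate())
	fmt.Println(ContentHash{RawV2: &struct{}{}}.Validate())
}
```

Counting set fields keeps the check the same size as more variants are added, which matters here because a protobuf oneof could not be used due to the amino signing compatibility issue noted in the proto comments.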