H-3543: Add interface to read closed multi-entity type schemas #5578

Conversation

TimDiekmann
Member

🌟 What is the purpose of this PR?

We can already query closed multi-entity types from the graph, but it would be more helpful if they could be returned alongside the entities when querying entities.

🔍 What does this change?

  • Add an `includeClosedMultiEntityTypes` parameter (default `false`) to the parameters when requesting entities or an entity subgraph
  • Create `getClosedMultiEntityTypesFromResponse` in the hash-graph-sdk, which takes the response object and the entity type ids. It should return the same type that a query to the graph with the same entity type ids would return.

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • does not modify any publishable blocks or libraries, or modifications do not need publishing

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

⚠️ Known issues

The graph implementation might not be the fastest, but there is no point in optimizing this code path until we have benchmarked it.

🛡 What tests cover this?

Tests were added to compare the results of the returned query against the schema returned from the graph when querying the types independently.

@TimDiekmann TimDiekmann self-assigned this Nov 4, 2024
@github-actions github-actions bot added area/apps > hash* Affects HASH (a `hash-*` app) area/libs Relates to first-party libraries/crates/packages (area) type/eng > backend Owned by the @backend team area/tests New or updated tests area/tests > integration New or updated integration tests area/apps area/apps > hash-graph labels Nov 4, 2024
@TimDiekmann TimDiekmann marked this pull request as ready for review November 4, 2024 16:15
@TimDiekmann TimDiekmann requested a review from CiaranMn November 4, 2024 16:15
Comment on lines 481 to 490
  const [firstEntityTypeId, ...restEntityTypesIds] = entityTypesIds.sort();
  let currentClosedMultiEntityTypeMap =
    response.closedMultiEntityTypes[firstEntityTypeId];

  for (const id of restEntityTypesIds) {
    if (!currentClosedMultiEntityTypeMap?.inner) {
      return;
    }
    currentClosedMultiEntityTypeMap = currentClosedMultiEntityTypeMap.inner[id];
  }
Member

@CiaranMn CiaranMn Nov 5, 2024

Would like to understand what is going on here. I would have thought a closed multi-entity type would be keyed under some combination of all its typeIds, but here we have items keyed by a single typeId, and we return whatever item in the map belongs to the last entityTypeId in the list (unless one is missing, in which case we return nothing). Why is it like this, and what happens if there are multiple multi-type entities which have the same entityTypeId amongst their types?

Member

Oh, it's a nested map, nested to as many levels as the entity has types... Is this what we want long-term, or is it just temporary? I thought we'd just sort and concatenate the typeIds or something. Is there a benefit to doing it this way?

Member Author

I don't know if we will keep it long-term. This is why I have "hidden" the implementation behind the response object. I'm currently writing a few comments to explain this more. I'm also trying to rewrite it to use reduce instead.
I used this structure because it's easier to build and avoids reallocating strings when concatenating them. It can be built iteratively in the graph, as it's layered.
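To make the nested layout discussed above concrete, here is a minimal sketch of building and reading such a map. The names `NestedEntry`, `schema`, `insert`, and `lookup` are hypothetical and the shapes are simplified; only the `inner` nesting and the sort-then-descend lookup come from the snippet under review. It also shows that several type sets sharing the same first id can coexist, because they diverge one level down.

```typescript
// Hypothetical, simplified shapes; the real response types come from the graph API.
type TypeId = string;
type ClosedSchema = { title: string };

interface NestedEntry {
  // Assumption: present when the path walked so far is a complete type set.
  schema?: ClosedSchema;
  // Next level, keyed by the next type id in sorted order.
  inner?: Record<TypeId, NestedEntry>;
}
type NestedMap = Record<TypeId, NestedEntry>;

// Insert a schema under the sorted path of its type ids, one level per id.
function insert(map: NestedMap, typeIds: TypeId[], schema: ClosedSchema): void {
  const sorted = [...typeIds].sort();
  let level: NestedMap = map;
  sorted.forEach((id, index) => {
    let entry = level[id];
    if (entry === undefined) {
      entry = {};
      level[id] = entry;
    }
    if (index === sorted.length - 1) {
      entry.schema = schema;
    } else {
      if (entry.inner === undefined) {
        entry.inner = {};
      }
      level = entry.inner;
    }
  });
}

// Walk the same sorted path with reduce; any missing level yields undefined.
function lookup(map: NestedMap, typeIds: TypeId[]): ClosedSchema | undefined {
  const [first, ...rest] = [...typeIds].sort();
  if (first === undefined) {
    return undefined;
  }
  const entry = rest.reduce<NestedEntry | undefined>(
    (current, id) => current?.inner?.[id],
    map[first],
  );
  return entry?.schema;
}

const nested: NestedMap = {};
insert(nested, ["person", "author"], { title: "Author + Person" });
insert(nested, ["employee", "author"], { title: "Author + Employee" });
insert(nested, ["author"], { title: "Author" });

console.log(lookup(nested, ["author", "person"])?.title);
console.log(lookup(nested, ["author", "missing"]));
```

The alternative raised in the review, a flat map keyed by the sorted ids joined into one string, would make lookup a single access at the cost of building a key string per type set.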

return;
}

const [firstEntityTypeId, ...restEntityTypesIds] = entityTypesIds.sort();
Member

I'm not sure it really matters here, but note that sort mutates the original array. We can use the newish toSorted alternative to avoid mutating the input array.

Suggested change
- const [firstEntityTypeId, ...restEntityTypesIds] = entityTypesIds.sort();
+ const [firstEntityTypeId, ...restEntityTypesIds] = entityTypesIds.toSorted();

Member Author

This is one of the reasons why I like Rust: if there is a side effect, it's impossible to have another existing reference to entityTypesIds, so it's possible to just sort the array in place and profit from that later if we call sort() again (I assume O(n) for already-sorted arrays).
I changed it to toSorted but had to add ! to firstEntityTypeId, as toSorted does not retain the exact type of the array.
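A small sketch of the two points in this exchange, under assumptions: `TypeId`, `NonEmpty`, and `splitSorted` are hypothetical names, and the copy is made with `[...ids].sort()`, which behaves like `ids.toSorted()` but also runs on targets older than ES2023. Sorting a copy leaves the caller's array untouched, but the copy is a plain array rather than a non-empty tuple, so with `noUncheckedIndexedAccess` enabled the first destructured element is typed as possibly undefined. An explicit check is one alternative to the `!` assertion.

```typescript
// Hypothetical alias; the real code works with versioned entity type URLs.
type TypeId = string;

// A non-empty list, so the first element is statically known to exist.
type NonEmpty<T> = [T, ...T[]];

// Non-mutating variant: copy first, then sort the copy.
// On ES2023 runtimes `ids.toSorted()` does the same in one call.
function splitSorted(ids: NonEmpty<TypeId>): { first: TypeId; rest: TypeId[] } {
  const sorted: TypeId[] = [...ids].sort();
  const [first, ...rest] = sorted;
  // `sorted` is a plain TypeId[], so the tuple's non-emptiness is lost
  // and `first` may be typed as TypeId | undefined.
  if (first === undefined) {
    throw new Error("expected at least one entity type id");
  }
  return { first, rest };
}

const input: NonEmpty<TypeId> = ["type-b", "type-a", "type-c"];
const result = splitSorted(input);
console.log(result.first, result.rest);
console.log(input); // still in its original order

// In-place variant, for contrast: `sort()` reorders `mutated` itself.
const mutated: TypeId[] = ["type-b", "type-a"];
mutated.sort();
console.log(mutated);
```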

codecov bot commented Nov 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 20.09%. Comparing base (fb84b51) to head (84352e5).
Report is 67 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5578      +/-   ##
==========================================
- Coverage   20.15%   20.09%   -0.06%     
==========================================
  Files         509      514       +5     
  Lines       17319    17368      +49     
  Branches     2538     2545       +7     
==========================================
  Hits         3490     3490              
- Misses      13791    13840      +49     
  Partials       38       38              

| Flag | Coverage Δ |
| --- | --- |
| apps.hash-ai-worker-ts | 1.38% <ø> (ø) |
| apps.hash-api | 1.17% <ø> (-0.02%) ⬇️ |
| local.hash-backend-utils | 8.80% <ø> (ø) |
| local.hash-isomorphic-utils | 1.05% <ø> (-0.01%) ⬇️ |
| local.hash-subgraph | 24.54% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.


Contributor

github-actions bot commented Nov 5, 2024

Benchmark results

@rust/graph-benches – Integrations

scaling_read_entity_complete_one_depth

| Function | Value | Mean | Flame graphs |
| --- | --- | --- | --- |
| entity_by_id | 25 entities | $$75.6 \mathrm{ms} \pm 377 \mathrm{μs}\left({\color{gray}0.606 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 5 entities | $$25.3 \mathrm{ms} \pm 312 \mathrm{μs}\left({\color{gray}-0.543 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 1 entities | $$19.9 \mathrm{ms} \pm 116 \mathrm{μs}\left({\color{gray}-1.571 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 10 entities | $$31.1 \mathrm{ms} \pm 215 \mathrm{μs}\left({\color{lightgreen}-40.786 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 50 entities | $$558 \mathrm{ms} \pm 1.66 \mathrm{ms}\left({\color{red}103 \mathrm{\%}}\right) $$ | Flame Graph |

representative_read_entity

| Function | Value | Mean | Flame graphs |
| --- | --- | --- | --- |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/book/v/1 | $$16.5 \mathrm{ms} \pm 193 \mathrm{μs}\left({\color{gray}0.771 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/block/v/1 | $$16.0 \mathrm{ms} \pm 189 \mathrm{μs}\left({\color{gray}-2.598 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/organization/v/1 | $$16.9 \mathrm{ms} \pm 208 \mathrm{μs}\left({\color{gray}-0.432 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/song/v/1 | $$16.9 \mathrm{ms} \pm 215 \mathrm{μs}\left({\color{gray}2.46 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/page/v/2 | $$17.3 \mathrm{ms} \pm 208 \mathrm{μs}\left({\color{red}6.88 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/person/v/1 | $$16.5 \mathrm{ms} \pm 162 \mathrm{μs}\left({\color{gray}0.819 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/uk-address/v/1 | $$17.4 \mathrm{ms} \pm 189 \mathrm{μs}\left({\color{red}5.28 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/building/v/1 | $$16.7 \mathrm{ms} \pm 185 \mathrm{μs}\left({\color{gray}4.00 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | entity type ID: https://blockprotocol.org/@alice/types/entity-type/playlist/v/1 | $$16.2 \mathrm{ms} \pm 168 \mathrm{μs}\left({\color{gray}1.04 \mathrm{\%}}\right) $$ | Flame Graph |

scaling_read_entity_complete_zero_depth

| Function | Value | Mean | Flame graphs |
| --- | --- | --- | --- |
| entity_by_id | 25 entities | $$3.29 \mathrm{ms} \pm 11.4 \mathrm{μs}\left({\color{gray}2.75 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 5 entities | $$1.94 \mathrm{ms} \pm 7.45 \mathrm{μs}\left({\color{gray}0.561 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 1 entities | $$1.85 \mathrm{ms} \pm 7.59 \mathrm{μs}\left({\color{gray}0.636 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 10 entities | $$2.11 \mathrm{ms} \pm 12.8 \mathrm{μs}\left({\color{gray}0.561 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 50 entities | $$4.24 \mathrm{ms} \pm 16.0 \mathrm{μs}\left({\color{lightgreen}-19.828 \mathrm{\%}}\right) $$ | Flame Graph |

scaling_read_entity_linkless

| Function | Value | Mean | Flame graphs |
| --- | --- | --- | --- |
| entity_by_id | 1 entities | $$1.86 \mathrm{ms} \pm 7.83 \mathrm{μs}\left({\color{gray}-1.345 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 10000 entities | $$13.6 \mathrm{ms} \pm 72.4 \mathrm{μs}\left({\color{gray}-0.247 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 100 entities | $$2.05 \mathrm{ms} \pm 6.84 \mathrm{μs}\left({\color{gray}0.200 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 1000 entities | $$2.77 \mathrm{ms} \pm 11.4 \mathrm{μs}\left({\color{gray}-0.511 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_id | 10 entities | $$1.89 \mathrm{ms} \pm 8.77 \mathrm{μs}\left({\color{gray}-0.094 \mathrm{\%}}\right) $$ | Flame Graph |

representative_read_multiple_entities

| Function | Value | Mean | Flame graphs |
| --- | --- | --- | --- |
| link_by_source_by_property | depths: DT=255, PT=255, ET=255, E=255 | $$108 \mathrm{ms} \pm 476 \mathrm{μs}\left({\color{gray}0.525 \mathrm{\%}}\right) $$ | Flame Graph |
| link_by_source_by_property | depths: DT=0, PT=0, ET=2, E=2 | $$89.5 \mathrm{ms} \pm 312 \mathrm{μs}\left({\color{gray}0.200 \mathrm{\%}}\right) $$ | Flame Graph |
| link_by_source_by_property | depths: DT=2, PT=2, ET=2, E=2 | $$98.5 \mathrm{ms} \pm 474 \mathrm{μs}\left({\color{gray}0.486 \mathrm{\%}}\right) $$ | Flame Graph |
| link_by_source_by_property | depths: DT=0, PT=0, ET=0, E=0 | $$43.2 \mathrm{ms} \pm 197 \mathrm{μs}\left({\color{gray}0.814 \mathrm{\%}}\right) $$ | Flame Graph |
| link_by_source_by_property | depths: DT=0, PT=2, ET=2, E=2 | $$94.4 \mathrm{ms} \pm 361 \mathrm{μs}\left({\color{gray}1.03 \mathrm{\%}}\right) $$ | Flame Graph |
| link_by_source_by_property | depths: DT=0, PT=0, ET=0, E=2 | $$82.5 \mathrm{ms} \pm 364 \mathrm{μs}\left({\color{gray}1.77 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_property | depths: DT=255, PT=255, ET=255, E=255 | $$69.2 \mathrm{ms} \pm 356 \mathrm{μs}\left({\color{gray}0.430 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_property | depths: DT=0, PT=0, ET=2, E=2 | $$50.6 \mathrm{ms} \pm 288 \mathrm{μs}\left({\color{gray}1.16 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_property | depths: DT=2, PT=2, ET=2, E=2 | $$59.0 \mathrm{ms} \pm 263 \mathrm{μs}\left({\color{gray}-0.105 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_property | depths: DT=0, PT=0, ET=0, E=0 | $$40.6 \mathrm{ms} \pm 220 \mathrm{μs}\left({\color{gray}1.39 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_property | depths: DT=0, PT=2, ET=2, E=2 | $$54.7 \mathrm{ms} \pm 234 \mathrm{μs}\left({\color{gray}0.420 \mathrm{\%}}\right) $$ | Flame Graph |
| entity_by_property | depths: DT=0, PT=0, ET=0, E=2 | $$45.0 \mathrm{ms} \pm 179 \mathrm{μs}\left({\color{gray}0.820 \mathrm{\%}}\right) $$ | Flame Graph |

representative_read_entity_type

| Function | Value | Mean | Flame graphs |
| --- | --- | --- | --- |
| get_entity_type_by_id | Account ID: d4e16033-c281-4cde-aa35-9085bf2e7579 | $$1.38 \mathrm{ms} \pm 5.65 \mathrm{μs}\left({\color{gray}-0.706 \mathrm{\%}}\right) $$ | Flame Graph |

@TimDiekmann TimDiekmann added this pull request to the merge queue Nov 7, 2024
Merged via the queue into main with commit 76bc7bd Nov 7, 2024
102 checks passed
@TimDiekmann TimDiekmann deleted the t/h-3543-allow-querying-for-closed-entity-types-alongside-the branch November 7, 2024 10:01
Labels
area/apps > hash* Affects HASH (a `hash-*` app) area/apps > hash-graph area/apps area/libs Relates to first-party libraries/crates/packages (area) area/tests > integration New or updated integration tests area/tests New or updated tests type/eng > backend Owned by the @backend team
2 participants