You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Every time the data is updated the persistent_assignee_disambig and persistent_inventor_disambig tables grow an additional column which seems like an unnecessary burden on users and updaters.
In problem domains where you have the luxury of being able to assign permanent IDs to entities, this problem is fairly easy to solve, but I can understand why this problem domain could make that tricky (impossible?).
Maybe the structure of these tables could be made invariant by using columns that could capture the history without changing the schema: some version of current_id, previous_id, start_date, end_date.
Below I sketch a couple of versions that occur to me. Some require greater/lesser code changes, but the code changes could be weighed against greater ease of updating the data and less burden on users.
Thanks for considering.
Probably easiest from end-users' perspective
Requires update of all records in the history of given id
Sacrifices space efficiency and update ease for end-user ease
create table if not exists patentsview_post_grant.toy_table2(
current_id int,
previous_id int NULL,
start_date date,
end_date date NULL
);
Hi Scott! Thank you for the suggestion. We are currently reviewing the table set up for ALL of our bulk download files and we will consider your ideas as feedback into that review. When/if we do push a redesign, it will likely be sometime closer to this coming summer.
Thanks, Emily.
Nice to hear from you.
Scott
On Wednesday, January 20, 2021, 10:00:07 AM PST, emelluso <notifications@github.com> wrote:
Hi Scott! Thank you for the suggestion. We are currently reviewing the table set up for ALL of our bulk download files and we will consider your ideas as feedback into that review. When/if we do push a redesign, it will likely be sometime closer to this coming summer.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
Every time the data is updated the persistent_assignee_disambig and persistent_inventor_disambig tables grow an additional column which seems like an unnecessary burden on users and updaters.
In problem domains where you have the luxury of being able to assign permanent IDs to entities, this problem is fairly easy to solve, but I can understand why this problem domain could make that tricky (impossible?).
Maybe the structure of these tables could be made invariant by using columns that could capture the history without changing the schema: some version of current_id, previous_id, start_date, end_date.
Below I sketch a couple of versions that occur to me. Some require greater/lesser code changes, but the code changes could be weighed against greater ease of updating the data and less burden on users.
Thanks for considering.
Probably easiest from end-users' perspective
Requires update of all records in the history of given id
Sacrifices space efficiency and update ease for end-user ease
create table if not exists patentsview_post_grant.toy_table2(
current_id int,
previous_id int NULL,
start_date date,
end_date date NULL
);
insert into
patentsview_post_grant.toy_table3(current_id, previous_id, start_date, end_date)
values
(4, 1, '2000-01-01', '2000-01-02'),
(4, 2, '2000-01-02', '2000-01-03'),
(4, 3, '2000-01-03', '2000-01-04'),
(4, 3, '2000-01-04', null);
Seems fairly easy for end-user
Requires only to update the previous record (and current record)
create table if not exists patentsview_post_grant.toy_table3(
id int NOT NULL,
from_id int NULL,
to_id int NULL,
as_of date NOT NULL
);
insert into
patentsview_post_grant.toy_table3(id, from_id, to_id, as_of)
values
(1, null, 2, '2000-01-01'),
(2, 1, 3, '2000-01-02'),
(3, 2, 4, '2000-01-03'),
(4, 3, null, '2000-01-04');
More space-efficient
Only requires update for current record
Drawback is that you can't tell by inspection that the
last record is the final record
Makes me think the last record should (also) go in a separate table:
ie, have a current table and a history table, although that does
involve the additional step of deleting the previous record from
the 'current' table
create table if not exists patentsview_post_grant.toy_table4(
id int NOT NULL,
from_id int NULL,
as_of date NOT NULL
);
insert into
patentsview_post_grant.toy_table4(id, to_id, as_of)
values
(1, null, '2000-01-01'),
(2, 1, '2000-01-02'),
(3, 2, '2000-01-03'),
(4, 3, '2000-01-04');
More space-efficient
Requires update for current record and previous
Can tell by inspection which record is the final record
create table if not exists patentsview_post_grant.toy_table5(
id int NOT NULL,
to_id int NULL,
as_of date NOT NULL
);
insert into
patentsview_post_grant.toy_table5(id, to_id, as_of)
values
(1, 2, '2000-01-01'),
(2, 3, '2000-01-02'),
(3, 4, '2000-01-03'),
(4, null, '2000-01-04');
The text was updated successfully, but these errors were encountered: