Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use signoz ksuid for epoch nano in log id's #489

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

nityanandagohain
Copy link
Member

@nityanandagohain
Copy link
Member Author

The issue with releasing this is that,

till the n'th log we have the old ksuid.
from the n+1 th log we will have the new ksuid.

And there is no guarantee that id of old ksuid < id of the new ksuid.

Now while paginating if we somehow land at a case were the query is id < id(n +1 th log) then the correctness of that query is not gauranteed. It's a rare case but it might happen. I am not sure what to do. From the surface level it's fine that it has a very rare occurance but it can happen.

@srikanthccv any suggestion here ?

@srikanthccv
Copy link
Member

Can you share some examples of such cases with IDs?

@nityanandagohain
Copy link
Member Author

here it is

old

➜  signoz git:(feat/migration-0.63) ✗ ksuid -f inspect 2qLQYqwa6VEthnfn1V24S1IjF9b

REPRESENTATION:

  String: 2qLQYqwa6VEthnfn1V24S1IjF9b
     Raw: 13EF2401C0C97A71DDF0D11679A1D74468CA4327

COMPONENTS:

       Time: 2024-12-17 19:43:45 +0700 +07
  Timestamp: 334439425
    Payload: C0C97A71DDF0D11679A1D74468CA4327

new

➜  ksuid git:(feat/epoch_nano_27_byte) ✗ go run cmd/ksuid/main.go -f inspect 0f3dVsDSIGcEPQwqDhHCUaKYIUV

REPRESENTATION:

  String: 0f3dVsDSIGcEPQwqDhHCUaKYIUV
     Raw: 04A42AE74572B280E8853C66C2E1AA2A66E137DB

COMPONENTS:

       Time: 2024-12-17 19:44:05.123216 +0700 +07
  Timestamp: 334439445123216000
    Payload: E8853C66C2E1AA2A66E137DB

When I compare

>>> '0f3dVsDSIGcEPQwqDhHCUaKYIUV' > '2qLQYqwa6VEthnfn1V24S1IjF9b'
False

@srikanthccv
Copy link
Member

Do we get any particular advantage by using id in the filter? Can I prepare the query to get rid of the id filter and replace it with the time component of the filter, given the table is ordered by (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id)? For instance, if the incoming request is id < 2qLQYqwa6VEthnfn1V24S1IjF9b limit 100 can I transparently replace it with timestamp < X limit 100?

@nityanandagohain
Copy link
Member Author

The reason we have id instead of timestamp is that timestamp is not really unique.

In our case id is basically giving more uniqueness to the timestamp and is allowing sorting on logs based on how they are received in otel collector for logs with similar timestamp.

@srikanthccv
Copy link
Member

I understand the lack of strict uniqueness of timestamp, but that doesn't necessarily mean I can't use the filter by timestamp instead of ID? like give me a scenario where I can't substitute id < 2qLQYqwa6VEthnfn1V24S1IjF9b limit 100 with timestamp < X limit 100, given that the table is already ordered by (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id)

@nityanandagohain
Copy link
Member Author

okay for timestamp < X limit 100 how will you filter the next 100 logs if the timestamp for all of them is same ? use offset ?

@srikanthccv
Copy link
Member

Yes, as long as the order is consistent (something to verify with timestamp, id), can we use limit + offset with timestamp?

@srikanthccv
Copy link
Member

@raj-k-singh, share your thoughts on this.

@raj-k-singh
Copy link
Contributor

like give me a scenario where I can't substitute id < 2qLQYqwa6VEthnfn1V24S1IjF9b limit 100 with timestamp < X limit 100, given that the table is already ordered by (ts_bucket_start, resource_fingerprint, severity_text, timestamp, id)

For example, let's say we have 10 logs with timestamp X and we are paginating through the logs with ascending timestamps. If a page of results ends at the 4th log with timestamp X, we can't construct the query for the next page starting at the 5th log with timestamp X with filter like timestamp >= X limit 100

@raj-k-singh
Copy link
Contributor

@nityanandagohain do we use (timestamp, id) both for pagination filter?

If yes, if we can ensure that the transition to the new ksuid only happens at a timestamp boundary things should work out?

i.e all logs with timestamp >= t0 will have the new ksuid while the ones with timestamp < t0 will have the old ksuid, with that if we use a filter like (timestamp >= t0 and id > XYZ) things should work out?

@srikanthccv
Copy link
Member

For example, let's say we have 10 logs with timestamp X and we are paginating through the logs with ascending timestamps. If a page of results ends at the 4th log with timestamp X, we can't construct the query for the next page starting at the 5th log with timestamp X with filter like timestamp >= X limit 100

If we add offset 4 (wherever the page ended), does it not give the desired result?

@raj-k-singh
Copy link
Contributor

For example, let's say we have 10 logs with timestamp X and we are paginating through the logs with ascending timestamps. If a page of results ends at the 4th log with timestamp X, we can't construct the query for the next page starting at the 5th log with timestamp X with filter like timestamp >= X limit 100

If we add offset 4 (wherever the page ended), does it not give the desired result?

Ya offset should also work 👍

@nityanandagohain
Copy link
Member Author

So as of now ordering is done by timestamp and in the filters we pass the entire timerange and i.e during pagination the timerange remains the same and only the id filter gets updated. but here since we are just using order by timestamp it might be wrong as well as it should have order by id included in it.

I will think more on this. I will pick this up once I am back

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[LOGS][EPIC]: Issue in pagination due to second presion of ksuid.
3 participants