Duplicate dev_nonce's and data loss #566

Closed
danieroux opened this issue Dec 15, 2021 · 3 comments

danieroux commented Dec 15, 2021

Duplicate dev_nonce's, for the same dev_addr and created_at leads to data being lost.

We’ve been seeing devices successfully re-joining and the data not reaching the application-server.

In these cases, no errors are reported in device_error. In particular, “validate dev-nonce error” is not present.

We traced it to the situation below. Whenever we see it, we lose the data. This happens on the latest ChirpStack version as of the 15th of November:

| dev_eui | dev_addr | created_at | dev_nonce |
| --- | --- | --- | --- |
| ffffffffffffffe | 015e5a93 | 2021-11-18T20:58:08.489875Z | 4777 |
| ffffffffffffffe | 009ed597 | 2021-11-18T20:58:08.493642Z | 4777 |

This happens only to devices that can see more than one gateway, so my guess is that the JOIN is not de-duplicated?

This query found a bunch of these:

-- Pair each activation with the previous dev_nonce of the same device.
with prev_nonces as (
  select encode(dev_eui, 'hex') as dev_eui,
         encode(dev_addr, 'hex') as dev_addr,
         created_at,
         dev_nonce,
         lag(dev_nonce) over (partition by dev_eui order by created_at) as previous_dev_nonce
  from device_activation
)
-- Keep only the rows whose dev_nonce repeats the one before it.
select *
from prev_nonces
where dev_nonce = previous_dev_nonce and dev_eui = 'ffffffffffffffe'
order by created_at;
danieroux changed the title from "For the same created_at: Duplicate dev_nonce's," to "Duplicate dev_nonce's and data loss" on Dec 15, 2021
@csanso-limit

This may be a stretch but I think #557 is related to this.
I think I've experienced the same errors as you but with a single gateway, although the problem makes more sense with multiple gateways.

Duplicate dev_nonce's, for the same dev_addr and created_at leads to data being lost.

If you look closely, the created_at is not exactly the same; it's off by 3767 microseconds.
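
For anyone who wants to check that gap, here is a minimal sketch that subtracts the two created_at values from the table in the original post. The helper name `gap_microseconds` is made up for illustration and is not part of ChirpStack.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the created_at column shown above.
FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def gap_microseconds(first: str, second: str) -> int:
    """Return the gap between two created_at values in whole microseconds."""
    delta = datetime.strptime(second, FMT) - datetime.strptime(first, FMT)
    return delta // timedelta(microseconds=1)


gap = gap_microseconds("2021-11-18T20:58:08.489875Z", "2021-11-18T20:58:08.493642Z")
print(gap)  # 3767
```

So the two activations were written under 4 ms apart, which is consistent with two copies of one JOIN rather than two separate JOIN attempts.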

so my guess is that the JOIN is not de-duplicated?

I'm pretty sure that is exactly the problem: they are not getting de-duplicated.
The de-duplication logic probably needs a rethink, at least in the case of JOINs,
because it does seem like JOINs are being left out of de-duplication.
It could also be that de-duplication does not take different timestamps or frequencies into consideration.
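
To make the idea concrete, here is a rough sketch of what de-duplicating JOINs could look like: copies that share a dev_eui and dev_nonce and arrive within a short window are collected into one group, so only one activation would result. This is not ChirpStack's actual implementation. `JoinRequest`, `collect_joins`, the gateway names, and the 0.2 second window are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinRequest:
    dev_eui: str
    dev_nonce: int
    received_at: float  # seconds, arbitrary epoch
    gateway_id: str     # hypothetical gateway label


def collect_joins(requests, window=0.2):
    """Group join requests that share (dev_eui, dev_nonce) and arrive
    within `window` seconds of the first copy in their group."""
    groups = []
    open_groups = {}  # (dev_eui, dev_nonce) -> index of the latest group
    for req in sorted(requests, key=lambda r: r.received_at):
        key = (req.dev_eui, req.dev_nonce)
        idx = open_groups.get(key)
        if idx is not None and req.received_at - groups[idx][0].received_at <= window:
            groups[idx].append(req)  # another gateway's copy of the same JOIN
        else:
            open_groups[key] = len(groups)
            groups.append([req])     # first copy: starts a new group
    return groups


reqs = [
    JoinRequest("ffffffffffffffe", 4777, 8.489875, "gw-a"),
    JoinRequest("ffffffffffffffe", 4777, 8.493642, "gw-b"),
    JoinRequest("ffffffffffffffe", 4778, 68.0, "gw-a"),
]
groups = collect_joins(reqs)
print([len(g) for g in groups])  # [2, 1]
```

With the two rows from the original post, both copies of dev_nonce 4777 land in the same group, so they would produce one dev_addr instead of two.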

@csanso-limit

I am curious to know which dev_addr the device ends up receiving from the network server.
The first JOIN dev_addr, the second JOIN dev_addr or none?

danieroux added a commit to danieroux/chirpstack-network-server that referenced this issue Apr 26, 2022
…nt data loss being experienced

It defaults to false. If set to true, it will ignore the frequency on which the packets come in from the gateways. This allows a ghost packet to be gracefully collected as Just Another Packet to de-duplicate with.

Without this setting, a ghost packet gets in and overrides an established `dev_addr`, leading to data being lost until the next JOIN request, with the edge device unaware that it has lost its JOIN status.

We have not been able to trace where the ghost JOINs come from. This stops those from being a problem for now.

- brocaar#557 (comment) is not what we are experiencing, our devices are far apart
- This fixes brocaar#566 for us
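
The effect of ignoring the frequency, as described in the commit message above, can be sketched like this. It is an illustration only and not the code from the commit: `dedup_key`, `count_distinct_frames`, the `ignore_frequency` parameter name, the payload bytes, and the two frequencies are all hypothetical.

```python
def dedup_key(phy_payload: bytes, frequency_hz: int, ignore_frequency: bool = False) -> tuple:
    """Build the key under which copies of a frame are collected."""
    if ignore_frequency:
        return (phy_payload,)             # same bytes always collapse together
    return (phy_payload, frequency_hz)    # same bytes on another channel stay separate


def count_distinct_frames(receptions, ignore_frequency: bool = False) -> int:
    """Count how many separate frames a list of (payload, frequency) receptions yields."""
    return len({dedup_key(p, f, ignore_frequency) for p, f in receptions})


# One JOIN payload reported on two different frequencies (values are made up).
receptions = [
    (b"\x00join-request", 868_100_000),
    (b"\x00join-request", 868_300_000),
]
print(count_distinct_frames(receptions))                         # 2
print(count_distinct_frames(receptions, ignore_frequency=True))  # 1
```

With the frequency in the key, the ghost copy counts as a second JOIN and triggers a second activation. Without it, the ghost is folded into the first copy.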
@brocaar
Owner

brocaar commented Apr 28, 2022

This has been fixed by 1b50594.

@brocaar brocaar closed this as completed Apr 28, 2022