Setting "trusted hash" to older header breaks pruner and prevents startup #4001

Open
S1nus opened this issue Dec 11, 2024 · 4 comments


S1nus commented Dec 11, 2024

Celestia Node version

Commit: 7af07bd

OS

MacOS

Install tools

No response

Others

No response

Steps to reproduce it

  1. Set TrustedHash to a recent header
  2. Start and sync the node
  3. Stop the node
  4. Set TrustedHash to an older header and start the node again

Expected result

Node should sync the headers in between the newly set TrustedHash and the oldest stored header
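
To make the expected behaviour concrete, here is a minimal Go sketch of the range a node would need to backfill in this scenario. The `heightRange` type and the `missingRange` function are hypothetical names used only for illustration and are not part of celestia-node; plain heights stand in for headers.

```go
package main

import "fmt"

// heightRange is a hypothetical inclusive range of header heights.
type heightRange struct {
	From, To uint64
}

// missingRange returns the heights between a newly trusted header and the
// oldest header already in the store. ok is false when there is nothing to
// backfill, i.e. the trusted header is not older than the oldest stored
// header or is directly adjacent to it.
func missingRange(trusted, oldestStored uint64) (r heightRange, ok bool) {
	if trusted+1 >= oldestStored {
		return heightRange{}, false
	}
	return heightRange{From: trusted + 1, To: oldestStored - 1}, true
}

func main() {
	// Assumed example: the node first synced from height 1000, then
	// TrustedHash was moved back to a header at height 400.
	if r, ok := missingRange(400, 1000); ok {
		fmt.Printf("backfill heights %d..%d\n", r.From, r.To)
	}
}
```

With the assumed heights above, the sketch reports the range 401..999 as the part of the chain that would have to be synced.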

Actual result

2024-12-11T16:53:11.668-0500	ERROR	pruner/service	pruner/service.go:107	failed to get last pruned header	{"height": 1, "err": "header: not found"}
2024-12-11T16:53:11.668-0500	WARN	pruner/service	pruner/service.go:109	exiting pruner service!
2024-12-11T16:53:12.300-0500	INFO	canonical-log	swarm/swarm_dial.go:620	CANONICAL_PEER_STATUS: peer=12D3KooWAAzkuwRBps6qar4Js54E37usSzW57N5QFeDdAVjxt6F1 addr=/ip4/65.21.203.101/tcp/2121 sample_rate=100 connection_status="established" dir="outbound"
panic: invalid type received %!s(<nil>)

goroutine 349 [running]:
github.com/celestiaorg/go-header/p2p.(*subscription[...]).NextHeader(0x10804cc00?, {0x10806e830, 0x140014e0cd0?})
	/Users/redacted/go/pkg/mod/github.com/celestiaorg/go-header@v0.6.3/p2p/subscription.go:47 +0x2ac
github.com/celestiaorg/celestia-node/share/shwap/p2p/shrex/peers.(*Manager).subscribeHeader(0x1400173a6e0, {0x10806e830, 0x140014e0cd0}, {0x10804c790, 0x14001ba5908})
	/Users/redacted/celestia-node/share/shwap/p2p/shrex/peers/manager.go:302 +0xac
created by github.com/celestiaorg/celestia-node/share/shwap/p2p/shrex/peers.(*Manager).Start in goroutine 198
	/Users/redacted/celestia-node/share/shwap/p2p/shrex/peers/manager.go:166 +0x3e8
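
For readers unfamiliar with the `%!s(<nil>)` in the panic message: that is what Go's `fmt` package prints when a nil value is formatted with `%s`, which suggests the subscription received a message with no validated header attached. The sketch below reproduces only the message text. The `message`, `header`, and `nextHeader` names are hypothetical stand-ins, not the go-header code, and the sketch returns an error where the real code panics.

```go
package main

import "fmt"

// message is a hypothetical stand-in for a pubsub message. validatorData is
// expected to be filled in by a validator before the message is delivered.
type message struct {
	validatorData any
}

// header is a hypothetical header type.
type header struct{ Height uint64 }

// nextHeader mimics a subscription that type-asserts the validated payload.
// It reports a missing payload as an error so the sketch stays runnable.
func nextHeader(msg message) (*header, error) {
	h, ok := msg.validatorData.(*header)
	if !ok {
		return nil, fmt.Errorf("invalid type received %s", msg.validatorData)
	}
	return h, nil
}

func main() {
	// No validator ran for this message, so the payload is nil.
	_, err := nextHeader(message{})
	fmt.Println(err) // prints: invalid type received %!s(<nil>)
}
```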

Relevant log output

No response

Is the node "stuck"? Has it stopped syncing?

No response

Notes

No response

S1nus added the bug (Something isn't working) label Dec 11, 2024
github-actions bot added the external (Issues created by non node team members) label Dec 11, 2024

S1nus commented Dec 11, 2024

This might not be related to the pruner. I noticed that the warning is still there even after I nuked .celestia-light and started fresh.

Wondertan (Member) commented

Hey @S1nus. Thanks for reporting. The fix is here with additional details.

@renaynay renaynay assigned renaynay and unassigned renaynay Dec 12, 2024
renaynay (Member) commented

@S1nus you're right, though, that the pruner is not designed to run on a node with a fragmented header chain. I need to investigate this. @Wondertan's fix actually targets the panic from the subscription, but the pruner will still crap out (non-fatally) on a non-contiguous header chain.
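
As a rough illustration of what a "fragmented" or non-contiguous header chain means here, the following Go sketch lists the holes in a set of stored heights. `findGaps` is a hypothetical helper, not something from celestia-node or go-header, and the heights are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// findGaps returns the inclusive [from, to] ranges that are missing from a
// set of stored header heights. The input does not need to be sorted.
func findGaps(heights []uint64) [][2]uint64 {
	// Work on a copy so the caller's slice is left untouched.
	sorted := append([]uint64(nil), heights...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })

	var gaps [][2]uint64
	for i := 1; i < len(sorted); i++ {
		if sorted[i] > sorted[i-1]+1 {
			gaps = append(gaps, [2]uint64{sorted[i-1] + 1, sorted[i] - 1})
		}
	}
	return gaps
}

func main() {
	// Assumed example: an old trusted header at 400 plus a synced tail.
	stored := []uint64{1001, 400, 1000, 1002}
	for _, g := range findGaps(stored) {
		fmt.Printf("missing heights %d..%d\n", g[0], g[1])
	}
}
```

A store with no gaps yields an empty result, which is the situation the pruner appears to assume.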


Wondertan commented Dec 12, 2024

Yeah, I thought the issue was the panic, which we have to fix independently anyway.

Wondertan added a commit to celestiaorg/go-header that referenced this issue Jan 6, 2025
A subscription might be started without the PubSub validator fully
registered, causing nil pointer deref. This was observed as a flake in
tests and sometimes even on the celestia node start. It's time to fix
this issue altogether.

The naive fix would check for nil, discarding a valid header message. On
the other hand, this fix ensures proper order of events, guaranteeing
that a valid message is never processed by subscription before the
user's header verifier is set.

Related to celestiaorg/celestia-node#4001
cristaloleg pushed a commit to celestiaorg/go-header that referenced this issue Jan 23, 2025
renaynay pushed a commit to renaynay/go-header that referenced this issue Jan 23, 2025