-
Notifications
You must be signed in to change notification settings - Fork 284
Update offline message sending and receiving. #1688
base: master
Are you sure you want to change the base?
Conversation
The code currently publishes to the DHT first and then to the pushnodes when it finishes. This creates an opportunity where the node may be killed while it's publishing to the DHT and before it has an opportunity to publish to the pushnodes. This commit changes it to publish to the pushnodes in parrallel to the DHT so as to avoid the potentional problem from above.
@@ -105,57 +105,61 @@ func (m *MessageRetriever) Run() { | |||
peers := time.NewTicker(time.Minute * 10) | |||
defer dht.Stop() | |||
defer peers.Stop() | |||
go m.fetchPointers(true) | |||
go m.fetchPointersFromDHT() | |||
go m.fetchPointersFromPushNodes() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separating these is a great idea! 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes look good to me.
WRT duplicate message handling, I would separate the process into steps: Retrieval (allowing for duplicates to be removed) -> Processing (without concern for dupes). Then you can ensure the Retrieval step can tolerate dupes without being concerned that dupes will leak into the Processing steps.
WRT your questions about testing: as you observe there are many components to the MessageRetriever. When a problem is too complex, I tend to break it into smaller pieces with clear(er) guarantees. The fact that these changes only affect a certain part of the MessageRetriever's dependencies might suggest that there's a subsystem which we might be able to extract. I haven't thoroughly analyzed the code but maybe breaking out a small part of the MR (maybe with a Publishing-specific interface?) would allow simpler unit tests to be constructed around the new behavior without being forced to create the world.
Fetching listings via IPFS appears to take longer with this change, at least here in Australia. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably need to investigate/address the reported performance issues before merging.
@cpacia we need to strip the receiving functionality of this PR into a separate one and then we can merge this as a change to sending ONLY. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed the goroutine is not cancelable by the caller and would run indefinitely. Given we're going to separate the sending and receiving portions, let's be sure the goroutine cancel handling is being applied to the separate portions.
I've also left some comments previously about potential testing/refactoring that should be considered. #1688 (review)
Let's rebase this to ethereum-master before moving forward. |
This PR makes a change to send offline messages to the push nodes in parallel to publishing to the DHT.
It also updates the message retriever to download messages from the push nodes in parallel to the DHT as well.
Open questions:
This change to the retriever that processes the messages in parallel potentially create duplicate message processing. Afaict it only results in a log saying duplicate message when putting to the offline message table, and the rest of our handlers should be able to handle duplicates. But is there a better way to do this that doesn't entail rewriting the file?
Is there any way we can think of to test this class given all the objects it's instantiated with?