-
Notifications
You must be signed in to change notification settings - Fork 616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
change NIP-01 to only allow a single filter per REQ
#1645
base: master
Are you sure you want to change the base?
Conversation
I do use this multiple-filter thing, but not in many cases. I'm not sure the benefit of this change outweighs the cost of breaking things. And I'm a bit afraid relays will not like even more subscriptions per connection. Many relays have unwisely low caps on how many subscriptions you can have open at once. |
I wish you did this 2 years ago. 😂 Then I would have supported it enthusiastically. Now I have mixed feelings. |
What's wrong with that feature is that the |
This is exactly the point. Relays have these stupid limits and the only reason people use multiple filters per I'm pretty sure relays will increase their default limits if we pass this. Arguably they would have done it already and limit by filter, not by |
Are those mixed feelings stemming from the fact that you have now already spent a lot of programming time implementing convoluted batch-then-segregate-filters logic on the client side? |
I am not sure if this solves anything. Keep in mind that having multiple filters within a single subscription does address some problems inherent to Nostr's data model. For instance, to correctly handle EOSE when monitoring multiple event IDs and their associated author information simultaneously (e.g., for author metadata, replies, reactions, etc.), such as in a thread view, each event ID or pubkey-based filter might require its own since and until properties. Since all of these filters relate to the same screen or UI component, it makes sense for them to be part of the same subscription. If this approach is merged, each of these filters would become an individual subscription, which might consume more resources. On Amethyst, we found that it is best to adhere to the lowest common limitations, which generally is:
Because we frequently hit these limits—particularly the character limit for REQ—the app often splits and rotates filters very quickly. This rotation, in turn, hits rate limits. When this happens and Tor is enabled (which is by default), I mitigate the issue by creating two connections to the same relay, one over the clearnet and another over Tor, and splitting the filters between the two. This effectively doubles all limits. It's a hack but it works. If limits are the issue you are trying to solve, we can work on that. The current PR just shifts things around from filters to subs. |
Because they are completely independent, and mashing them up together in the same
They already are, in a way. Making that explicit is good. I can't see any performance downsides, only upsides.
Not only. I am mostly worried about the perspective of relays handling custom queries in a way that isn't just the dumb "give the user everything he wants" and dealing with custom The simplest example, today, is NIP-17 recommending that relays reject The same problem also showed up recently on @staab's implementation of NIP-29 conflicting with my relay assumptions: fiatjaf/relay29#13, and for anyone who expects Nostr relay ecosystem to grow in diversity, specifically relay diversity (like I do and expect and hope) then issues like this will continue to show up. Maybe in these cases the solution could be just to recommend single filter |
I think this "relay diversity" issue goes beyond single filters.
Multi-account clients with NIP-17 downloads require multiple connections to AUTH with a different user. Many relays don't accept multiple connections per IP, so the app has to either rotate (with wasteful since/until +/- 48 hours) or subscribe using Tor with multiple exit nodes, one for each account.
I do not think this is unsolvable. It is hard, but matching incoming filters, no matter how complex they are, with the authorization the logged-in user has should be solvable. Modifying the filter is hard, but running the filter and then applying a second filter on top of the results to filter what the user can see from the resulting event set is a lot easier. @nostr-wine had many of these issues last year when building inbox.nostr.wine. Maybe we can learn from those.
I can't see a future where nostr libraries will not rearrange all these filters/subs by default. There is just no way App devs can deal with the complexity of building and splitting filters/subscriptions/connections. We will need to provide higher-level tools and find great ways to convert them back into what each relay operator wants their Clients to do. If we want to give relays more power to diversify their responses while letting Clients know how to deal with those customizations, we need a new abstraction layer. |
@fiatjaf do you think custom relays should still be called "relays"? As you said, optionality sucks. There should be a pitch somewhere to separate the relay network as we know today, with a simple, well-defined behavior, from new networks of custom relays. Clients could then declare compliance with the main network and/or other networks/schemes. We shouldn't expect every client to be able to deal with ALL of these custom relays and their rules. |
This is a bug on the relay though. A relay that deals with AUTH should accept more than one connection per IP as long as these connections are authenticated. The best course of action here is to talk to the relay maintainer.
This can be done, of course, but my point is that it becomes unnecessarily confusing. As the relay you can't just return a simple |
It's an interesting idea, although I can't imagine it right now. I think the current idioms we have are enough to power a thousand of "custom relays" already, so if make these things separate entities we risk splitting the network too much into different API interfaces or whatever. But I'm not opposed to it, I want to see what you have in mind.
Well, I wasn't expecting all clients to do it, I was expecting clients to hardcode some behaviors specifically tailored to some relays (i.e. DM apps hardcode behavior to NIP-17 relays, group apps to hardcode behavior to NIP-29 relays etc) -- so yeah maybe having more specific "relay APIs" (or whatever you wanna call them) for each use case could be interesting, as long as they are standardized. |
Don't assume users are not getting what they are expecting, getting mad, etc... The current thinking is the opposite. Relays are always removing/hiding events. Users/Client almost never see what they expect to see. We are already used to have to work with whatever we get from each relay. To me, relays SHOULD take any filter but only return what the user can see, with whatever rule they want to apply. Only CLOSE when there is nothing to return/monitor. |
I agree that it would probably only result in performance improvements, since in every relay implementation I've seen the relay just iterates through filters and sends the results anyway, keeping the subscription open slightly longer than if each filter had its own. Also, with batching, subscriptions are held open until the last subscription finishes. Breaking into separate subscriptions would result in better concurrency at the very slight cost of more messages being sent back and forth. Unfortunately, the core architecture of most sophisticated libraries (at least welshman and NDK, and it sounds like Alex has a similar thing going on) is based around splitting/merging, so a lot of work would have to go into taking that out. On the other hand, it would probably make things way simpler. Right now my request flow looks something like this:
I've spent an incredible amount of time developing (and troubleshooting) this logic. Sunk cost notwithstanding, it would probably be better for everyone if we moved away from multiple filters per request. It would simplify the process to:
This would remove the need for the batching timeout, and plenty of performance-intensive and error-prone filter matching/merging. I originally started batching at @pablof7z's suggestion, specifically to deal with relay.nostr.band's very low concurrent subscription limits (8 subs at a time as I recall). So yeah, batching is almost entirely a workaround for stingy relay limits, although I think there are a few cases in which batching makes sense, like when fetching reactions/replies/zaps for a parent note. Instead of injecting that logic into the data loading code, you can just put it in the leaf component, which simplifies things a lot. Of course, it would not be hard to write batching logic for that case which only results in a single filter, so maybe that's a moot point. Another good point fiatjaf made in fiatjaf/relay29#13:
CLOSED runs into the same problem that custom relay policies do, and it's baked into the protocol. Another thing:
I've been saying for a long time that relays are defined by an interface, and differ by policy. Graceful degradation and progressive enhancement through feature support signaling are the way to handle this. NIP 29 (despite the rigidity of the relay29 implementation in that it rejects any non-h-tagged event) only breaks the relay interface in this one place (which is why I had to address the relay implementation rather than work around it), despite its policy being wildly different from most relays on the network. I think this is great. TL;DR I support clients moving away from multiple filters, although I'd like to see how well that works in practice before really committing to the switch. If this really doesn't cause any problems for clients, relays should watch for multi-filter requests to stop coming in before dropping support. |
Yes. I am also depending on it now too. But after looking at my code, it's less than I thought. So, I think it wouldn't be so bad after all.
Why would we accept such shoddy implementations? Of course I generate one SQL query for combined filters. https://gitlab.com/soapbox-pub/nostrify/-/blob/main/packages/db/NPostgres.ts?ref_type=heads#L265 |
Gossip must be pretty dumb by comparison then. I just ask for what I want, when I want it, and if the relays complain I consider it a relay problem. I don't split or merge. I don't try to limit kinds, filters, or subscriptions to small numbers. I don't rate limit other than timing out for 15 seconds if the relay was rude. Then again, I don't think any of my REQs are unreasonable. By far the most common relay complaint is that I didn't AUTH and it insists that I AUTH just to do a REQ. |
But the SQL database underneath probably just iterates through all the filters independently. |
This is good to hear because I have never done any of the batching people are doing and understood why people felt that was so necessary, but I was never sure because none of my clients are heavily used. Gossip works basically perfectly though, and the fact that it is so naïve reinforces my beliefs. It's weird that if I started to hit relay limits so much that I felt the need to start complicating my client code with hacks in order to make basic things work I wouldn't think that was normal and proceed by piling hacks on top of hacks, I would think the protocol was broken and that it needed a serious reboot (because if it doesn't work today, damn, imagine if we had more than 16 users). |
FYI: For most relays in most situations, gossip has only 4 subscriptions ( In coincidences, there may be more than 4 subscriptions: If that relay happens to also be a discovery relay, it may have So in that worst case, that is only 7 subscriptions (with possibly more if you quickly spam the load more button). Every other filter set only gets subscribed if you navigate somewhere else (in which case some of the above ones close): Notably there are things gossip could be subscribing to that we choose not to. For example
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth it.
I always thought it was wrong that you can pass multiple filters in a single REQ. It makes the "limit" property make no sense, and harms your ability to do rate limiting and set timeouts on the relay side.
However, clients still need to request multiple filters sometimes. This responsibility is now shifted to the client. You could argue it always was. But still, clients now need to loop over the filters and possibly aggregate the results. This is actually more powerful for the client, so it probably should have always been a client responsibility.
Finally, it will simplify logs, and simplify queries. By removing this we actually get more predictable outcomes. This will make it easier to improve performance.
Would this change also propagate to COUNT? |
I like this. But I don't see a great path to obsolete multi-filter-REQs. Asking for serious rewrites of critical hard-to-debug parts of big clients for the benefit of relays is a very one-sided deal. Maybe clients could ask for higher rate-limits first, and if relays agreed - clients wouldn't need the batching and the use of multi-filter-REQs would just fade away? Without higher rate limits this change won't fly anyway. |
@nostrband you're right, I guess we have to start first with the limits. If I got it right @hoytech has agreed to change strfry limits to be based on number of filters instead of subscriptions, so 1 subscription with 2 filters will be equivalent to 2 subscriptions with 1 filter each. I don't know when he plans to implement that, but once he does I believe he will also increase the default limit. After that I think we can nudge clients to start using REQs with a single filter, and after that we can deprecate multi-filter REQs softly. |
Ok this sounds promising! |
COUNT seems to be deprecated. nobody uses it. |
@fiatjaf we can support it as well in our relay. i need to take a look at the code but it probably makes the process more lightweight. currently, we aggregate queries based on kind and send separate queries to Mongo DB (we still haven't found the best practice for that and even there is the possibility of moving to another database). this makes it more lightweight i think since we make fewer queries (most of the time) and each query will be searched in a smaller collection. not between all events. but i believe this change must still make it better. also, we need to provide this in the nip-11 document and remove the current max filters thing. |
Yes this is on my TODO list, as soon as I find some spare time. I'm pretty sure we can increase the limits in strfry quite a bit, I just want to do a little bit of testing to make sure, and to choose reasonable default values even for modest hardware. I think only allowing a single filter per REQ would've been a better/simpler design, but I think changing it at this stage is too big of a backwards incompatibility hit (?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also extend EVENT
to be able to signal multiple matching filters as an optimization?
this can be negotiated separately actually, so it's out of scope here |
this breaks the "find last note for a set of pubkeys" algo filter I just added to notedeck :( with low subscription limits on strfry relays, this would be pretty limiting, for why? I don't see how handling multiple filters in one request is more demanding then doing it in multiple requests. |
I'm pretty sure strfry's limits exist because of the risk of handling huge queries with many filters -- but those limits are harder to apply when the complexity of the query is unpredictable so everything ends up broken, like someone making a bunch of small simple queries gets rate-limited while Amethyst creates an arcane incantation that shoves 32 completely unrelated filters into a single query and doesn't get rate-limited. When each query can only have 1 filter then suddenly all queries are much simpler and it becomes easier to calculate limits and deal with them. Your algo could be easily split in multiple queries then, with the same cost for everybody. And in fact if you consider outbox model you would have to split it anyway because for each target pubkey you would have to query a different relay instead of sending all filters to a single relay in one request. It's basically only upsides! |
@jb55 i changed immortal to support only one filter per |
In practice the existence of the possibility of multiple filters in a single
REQ
has no benefit and only serves to hurt performance and increase complexity of codebases.Anything that can be done with a single
REQ
and multiple filters can much more easily be done with multipleREQ
s each with a single filter.I know this is a breaking change, but not a very drastic one. We can easily transition to this by having clients stop being built with this "feature" in mind, and relays can follow later. If a client sends a
REQ
with more than one filter to a relay that only supports one that's also ok, the client will just get less results than it expected.