Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fdtable allocation on new Debian #230

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

okt-sergeyn
Copy link
Contributor

The first commit fixes #229
The second one fixes initialization of citp_log_level.

If EF_FDTABLE_SIZE is not set, RLIMIT_NOFILE hard limit is used as
fdtable size. The value may be too big. For instance, on Debian 13
it is 1073741816=3ffffff8. In such case allocation will fail.
Introduce a new environment variable EF_FDS_MAX to limit maximum
size of fdtable. By default it is 1048576 (default value of RLIMIT_NOFILE
hard limit on Debian/Ubuntu).

Signed-off-by: Sergey Nikitin <sergey.nikitin@oktet.tech>
Initialize citp_log_level variable with user value right after
reading environment variables. Otherwise we may miss some startup
logs, for instance fdtable initialization warnings.

Signed-off-by: Sergey Nikitin <sergey.nikitin@oktet.tech>
@okt-sergeyn okt-sergeyn requested a review from a team as a code owner June 24, 2024 18:29
@ivatet-amd
Copy link
Contributor

Thank you for filing a PR. 👍

I cannot schedule it at the moment, but we are keeping it on our radar.

Presumably, in the absence of a fix in Debian, a workaround for #229 is passing EF_FDTABLE_SIZE to Onload. This PR makes this workaround go away. Is my understanding correct?

What obstacles do you anticipate for the dynamic approach, where the FD tables increase in runtime on demand dynamically (say, unless EF_FDTABLE_SIZE is set, which could prevent the dynamic behaviour to avoid jitter)?

@okt-sergeyn
Copy link
Contributor Author

Presumably, in the absence of a fix in Debian, a workaround for #229 is passing EF_FDTABLE_SIZE to Onload. This PR makes this workaround go away. Is my understanding correct?

Yes, absolutely right.

What obstacles do you anticipate for the dynamic approach, where the FD tables increase in runtime on demand dynamically (say, unless EF_FDTABLE_SIZE is set, which could prevent the dynamic behaviour to avoid jitter)?

It'd be a major modification I guess. Maybe @ol-alexandra has some thoughts.

@ol-alexandra
Copy link
Contributor

What obstacles do you anticipate for the dynamic approach, where the FD tables increase in runtime on demand dynamically (say, unless EF_FDTABLE_SIZE is set, which could prevent the dynamic behaviour to avoid jitter)?

It'd be a major modification I guess. Maybe @ol-alexandra has some thoughts.

I agree that it would be a major modification with unpleasent experience for some users. For example some users run a multithreaded application with EF_FDS_MT_SAFE=1. If one thread re-allocates fdtable, and other thread assumes lockless access to fdtable, it would result in a surprise. You can say that such users must set EF_FDTABLE_SIZE, but it is not an easy task to pass this message to all the users.

You can implement "dynamic approach" which does not include realloc() call (and does not change the user address), but again, it is a major rework of fdtable.

New Debian will probably release next summer; you have some months to decide. At the same time, as this change comes from systemd, the same issue may arise with other distros (Fedora and so on).

@okt-sergeyn
Copy link
Contributor Author

Hello! Friendly asking.
Are there any updates on this issue?

@sianj-xilinx
Copy link
Contributor

Are there any updates on this issue?

Hi Sergey. We have been thinking about this, but haven't yet come to a conclusion. The problem you point out definitely needs addressing, but I've got a few concerns about this. The main issue is that this will change the rlimit by default, which has the potential to silently break (unusual for onload) applications. The benefit of doing this way is obviously that things Just Work for most applications. The new EF_FDS_MAX is also very similar to the existing EF_FDTABLE_SIZE. We're considering what the best approach is, with alternatives including this or a more explicit configuration, for example bailing out with a helpful error message if we fail to allocate the fdtable, in which case the user can either configure the app or system limit, or use the existing EF_FDTABLE_SIZE. In the medium term we'd like to move to a dynamic approach, but in practice this will probably become a problem before we get that work done, so we'll need an interim solution.

@okt-sergeyn
Copy link
Contributor Author

Hi Siân! I got your point. Thanks for detailed answer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

citp_fdtable_ctor: failed to allocate fdtable
4 participants