Add reference tracking to netdev and release netdev reference in nondl on shutdown #255

Open
wants to merge 2 commits into
base: master

zhuyifei1999
Contributor

We are using onload in AF_XDP mode, and we noticed that as long as onload was loaded, shutdown / reboot would loop with the console message:

unregister_netdevice: waiting for eth0 to become free. Usage count = 3
unregister_netdevice: waiting for eth0 to become free. Usage count = 3
unregister_netdevice: waiting for eth0 to become free. Usage count = 3

This is because module cleanup functions are only called from the delete_module syscall, not on reboot. I added reference tracking and fixed the issue by adding a reboot notifier that calls the cleanup functions needed to release the netdev references.
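
To illustrate the reasoning above, here is a minimal userspace sketch, not driver code. It assumes a usage count of 3 made up of one base reference plus two held by the driver, matching the console message. All `model_*` names are hypothetical stand-ins; in the kernel the corresponding hook would be register_reboot_notifier() with a struct notifier_block.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a net device; only the usage count is modelled. */
struct model_netdev {
        int usage_count;
};

typedef void (*model_reboot_cb)(struct model_netdev *dev);

#define MODEL_MAX_NOTIFIERS 4
model_reboot_cb model_notifiers[MODEL_MAX_NOTIFIERS];
size_t model_notifier_count;

/* Hypothetical analogue of registering a reboot notifier. */
int model_register_reboot_notifier(model_reboot_cb cb)
{
        if (model_notifier_count == MODEL_MAX_NOTIFIERS)
                return -1;
        model_notifiers[model_notifier_count++] = cb;
        return 0;
}

/* Stands in for the two cleanup calls: drops the two driver-held references. */
void model_release_refs(struct model_netdev *dev)
{
        dev->usage_count -= 2;
}

/*
 * Reboot path: only registered notifiers run, module exit functions do not.
 * Returns 1 if the device could now be unregistered (count back to 1),
 * 0 if the kernel would keep waiting for it to become free.
 */
int model_reboot(struct model_netdev *dev)
{
        size_t i;

        for (i = 0; i < model_notifier_count; i++)
                model_notifiers[i](dev);
        return dev->usage_count == 1;
}
```

Without a registered notifier, model_reboot() leaves the count at 3 and reports that the device is still busy. Once model_release_refs is registered, the count drops to 1.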

Netdev refcount tracking was added a couple of years back in 5.17 [1],
but onload only uses the untracked versions. This patch adds tracking
to the call sites that are low-hanging fruit.

For sfc, kernel compatibility for netdevice_tracker is already
provided by its kernel_compat.h. Tracking in tc_encap_actions.c
is already upstream [2] so this is backporting from there. For
rx_common.c I just sent a patch upstream [3] and backported it
here.

Outside sfc I added the same support to ci/driver/kernel_compat.h.

[1] https://lore.kernel.org/netdev/20211205042217.982127-4-eric.dumazet@gmail.com/
[2] torvalds/linux@7e5e7d800011ad#diff-9ffb1b01a8a11d0d7b6976a10eede3bebdcfaf256ef26d0d95714fe8aa03c89fR156
[3] https://lore.kernel.org/netdev/20241217224717.1711626-1-zhuyifei@google.com/T/

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
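
As a rough illustration of what tracking adds over a plain count, here is a minimal userspace sketch. It is not the kernel's implementation: every `model_*` name is hypothetical, and the fixed-size table is an assumption made for brevity. The kernel counterparts are netdev_hold() / netdev_put() with a netdevice_tracker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_TRACKERS 8

/* One tracker per outstanding reference; `site` names the holder. */
struct model_tracker {
        const char *site;
};

struct model_netdev {
        int refcnt;                                     /* plain count */
        struct model_tracker *held[MODEL_MAX_TRACKERS]; /* who holds it */
};

/* Take a reference and remember which call site took it. */
void model_hold(struct model_netdev *dev, struct model_tracker *t,
                const char *site)
{
        size_t i;

        dev->refcnt++;
        t->site = site;
        for (i = 0; i < MODEL_MAX_TRACKERS; i++) {
                if (dev->held[i] == NULL) {
                        dev->held[i] = t;
                        return;
                }
        }
        /* Table full: the reference is still counted, just not tracked. */
}

/* Drop a reference and forget its tracker. */
void model_put(struct model_netdev *dev, struct model_tracker *t)
{
        size_t i;

        for (i = 0; i < MODEL_MAX_TRACKERS; i++) {
                if (dev->held[i] == t) {
                        dev->held[i] = NULL;
                        break;
                }
        }
        dev->refcnt--;
}

/* Copy the call sites of outstanding references into `sites`. */
size_t model_outstanding(const struct model_netdev *dev,
                         const char **sites, size_t max)
{
        size_t i, n = 0;

        for (i = 0; i < MODEL_MAX_TRACKERS && n < max; i++) {
                if (dev->held[i] != NULL)
                        sites[n++] = dev->held[i]->site;
        }
        return n;
}
```

With an untracked count, a leak only shows up as a number. With a tracker per hold, the leftover entries name the call sites that still need a matching put.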
@zhuyifei1999 zhuyifei1999 requested a review from a team as a code owner December 18, 2024 23:43
Contributor

@ligallag-amd ligallag-amd left a comment

Hi, thanks for your PR!

Overall, I think the netdev reference counting is a helpful addition. Once your upstream changes are accepted and have been pulled into our sfc-linux-net repo, we will import the net driver with your changes from there.

As for your changes to the reboot notifier, it makes sense to me but I think there is an opportunity for this to race with the delete_module syscall (this is the only place I could find in the kernel source where mod->exit() is called). You have mentioned that shutdown shouldn't be concerned with this but I imagine the race would cause kernel warnings at the least. I'd like some of the other members of my team to weigh in on this.

@ligallag-amd ligallag-amd requested a review from a team December 20, 2024 16:03
@zhuyifei1999
Contributor Author

> You have mentioned that shutdown shouldn't be concerned with this but I imagine the race would cause kernel warnings at the least.

I did give this some thought. Both cleanup functions hold the RTNL lock:

extern void efrm_nondl_unregister(void)
{
#ifdef EFHW_HAS_AF_XDP
        efrm_nondl_unregister_driver(&efrm_nondl_driver);
#endif
}

/* Unregister a driver from the non-driverlink resource manager. */
void efrm_nondl_unregister_driver(struct efrm_nondl_driver *driver)
{
        struct efrm_nondl_device *device, *device_n;

        rtnl_lock();
        EFRM_ASSERT(nondl_driver == driver);
        list_for_each_entry_safe_reverse(device, device_n, &driver->devices,
                                         driver_node)
                efrm_nondl_del_device(device);
        BUG_ON(!list_empty(&driver->devices));
        nondl_driver = NULL;
        rtnl_unlock();
}
EXPORT_SYMBOL(efrm_nondl_unregister_driver);

void efrm_nondl_shutdown(void)
{
        struct efrm_nondl_device *device, *device_n;

        /* If we are being unloaded then any module which depends on
         * us should have been unloaded already, so our driver list
         * should be empty. If not then something went badly wrong.
         *
         * We still might have devices registered with us, though. If
         * so they need to be cleaned up. */
        rtnl_lock();
        BUG_ON(nondl_driver);
        list_for_each_entry_safe_reverse(device, device_n, &nondl_device_list, node)
                efrm_nondl_cleanup_netdev(device);
        BUG_ON(!list_empty(&nondl_device_list));
        rtnl_unlock();
}

So data races aren't possible. Both functions are also in the format of

list_for_each_entry_safe_reverse(...)
   cleanup(...)
BUG_ON(!list_empty(...));

So both functions are essentially no-ops if they have already been called. The only place I can see that could crash or warn is the EFRM_ASSERT(nondl_driver == driver); in efrm_nondl_unregister_driver, which I didn't bother fixing in the initial implementation: in a normal shutdown, PID 1 typically kills all processes before calling the reboot syscall, so no one would be left to call the delete_module syscall. I'm also not sure what the best fix would be.
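
The no-op argument can be checked with a small userspace sketch of the drain-then-assert pattern. This is a model under assumptions, not the driver code: the `model_*` names are hypothetical, the lock is a plain flag rather than the RTNL lock, and a singly linked list popped from the head replaces list_for_each_entry_safe_reverse.

```c
#include <assert.h>
#include <stddef.h>

struct model_device {
        struct model_device *next;
};

struct model_device *model_device_list; /* stands in for nondl_device_list */
int model_lock_held;                    /* stands in for the RTNL lock */
int model_cleanups_run;                 /* counts per-device cleanups */

void model_lock(void)
{
        assert(!model_lock_held);
        model_lock_held = 1;
}

void model_unlock(void)
{
        assert(model_lock_held);
        model_lock_held = 0;
}

void model_add_device(struct model_device *d)
{
        d->next = model_device_list;
        model_device_list = d;
}

/* Drain the list under the lock, then assert that it is empty. */
void model_shutdown(void)
{
        model_lock();
        while (model_device_list != NULL) {
                struct model_device *d = model_device_list;

                model_device_list = d->next;
                d->next = NULL;
                model_cleanups_run++;   /* per-device cleanup goes here */
        }
        assert(model_device_list == NULL);      /* mirrors the BUG_ON */
        model_unlock();
}
```

A second call finds an empty list, runs no cleanups, and passes the emptiness check, so whichever caller arrives second has nothing left to do.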

Contributor

@ligallag-amd ligallag-amd left a comment

Good points.

Regarding EFRM_ASSERT(nondl_driver == driver);, you could check if nondl_driver == NULL and exit early.
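
A minimal userspace sketch of that suggestion, assuming the check sits ahead of the assert. The `model_*` names and the boolean return value are hypothetical. In the real function the check would presumably run with the RTNL lock held, so that it is serialized against the other caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_driver {
        int device_count;
};

/* Stands in for the nondl_driver global. */
struct model_driver *model_registered_driver;

/*
 * Returns true if this call did the unregistering, false if the driver
 * had already been unregistered, for example by a reboot notifier.
 */
bool model_unregister_driver(struct model_driver *driver)
{
        if (model_registered_driver == NULL)
                return false;   /* early exit, the assert is never reached */

        assert(model_registered_driver == driver);
        driver->device_count = 0;       /* stands in for deleting devices */
        model_registered_driver = NULL;
        return true;
}
```

The assert still catches a mismatched driver pointer. Only the already-unregistered case is turned into a quiet no-op.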

On shutdown / reboot, if netdevs don't have a refcount of 1, the
kernel loops with message:
  unregister_netdevice: waiting for eth0 to become free. Usage count = 3
  unregister_netdevice: waiting for eth0 to become free. Usage count = 3
  unregister_netdevice: waiting for eth0 to become free. Usage count = 3
  [...]

With netdev refcount tracking, the traces are visible:
  unregister_netdevice: waiting for eth0 to become free. Usage count = 3
  ref_tracker: eth%d@ff3cf768c8344548 has 1/2 users at
       efrm_nic_add+0x29b/0x460 [sfc_resource]
       efrm_nondl_add_device+0x124/0x1c0 [sfc_resource]
       efrm_nondl_try_add_device+0x27/0xf0 [sfc_resource]
       efrm_nondl_register_netdev+0x106/0x160 [sfc_resource]
       nondl_register_store+0x174/0x1e0 [sfc_resource]
       kernfs_fop_write_iter+0x128/0x1c0
       vfs_write+0x308/0x420
       ksys_write+0x5f/0xe0
       do_syscall_64+0x7b/0x160
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

  ref_tracker: eth%d@ff3cf768c8344548 has 1/2 users at
       efrm_nondl_register_netdev+0xa9/0x160 [sfc_resource]
       nondl_register_store+0x174/0x1e0 [sfc_resource]
       kernfs_fop_write_iter+0x128/0x1c0
       vfs_write+0x308/0x420
       ksys_write+0x5f/0xe0
       do_syscall_64+0x7b/0x160
       entry_SYSCALL_64_after_hwframe+0x76/0x7e

This patch makes sfc_resource register a reboot notifier that calls
efrm_nondl_unregister and efrm_nondl_shutdown, each of which releases one
of the two references. Since the only other caller of those functions
is the cleanup_sfc_resource module exit function, I don't think shutdown
needs to worry about racing with module unload under normal circumstances.
In any case, efrm_nondl_unregister_driver now exits early, before
the assert, in case the race does end up happening.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
@zhuyifei1999
Contributor Author

> Regarding EFRM_ASSERT(nondl_driver == driver);, you could check if nondl_driver == NULL and exit early.

Done.
