Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5, v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50, v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34, v5.15.33 |
|
#
6264f58c |
| 06-Apr-2022 |
Jakub Kicinski <kuba@kernel.org> |
net: extract a few internals from netdevice.h
There's a number of functions and static variables used under net/core/ but not from the outside. We currently dump most of them into netdevice.h. That
net: extract a few internals from netdevice.h
There's a number of functions and static variables used under net/core/ but not from the outside. We currently dump most of them into netdevice.h. That bad for many reasons: - netdevice.h is very cluttered, hard to figure out what the APIs are; - netdevice.h is very long; - we have to touch netdevice.h more which causes expensive incremental builds.
Create a header under net/core/ and move some declarations.
The new header is also a bit of a catch-all but that's fine, if we create more specific headers people will likely over-think where their declaration fit best. And end up putting them in netdevice.h, again.
More work should be done on splitting netdevice.h into more targeted headers, but that'd be more time consuming so small steps.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
Revision tags: v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20, v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7, v5.15.6, v5.15.5, v5.15.4 |
|
#
a387ff8e |
| 19-Nov-2021 |
Jakub Kicinski <kuba@kernel.org> |
dev_addr_list: put the first addr on the tree
Since all netdev->dev_addr modifications go via dev_addr_mod() we can put it on the list. When address is change remove it and add it back.
Signed-off-
dev_addr_list: put the first addr on the tree
Since all netdev->dev_addr modifications go via dev_addr_mod() we can put it on the list. When address is change remove it and add it back.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
d07b26f5 |
| 19-Nov-2021 |
Jakub Kicinski <kuba@kernel.org> |
dev_addr: add a modification check
netdev->dev_addr should only be modified via helpers, but someone may be casting off the const. Add a runtime check to catch abuses.
Signed-off-by: Jakub Kicinski
dev_addr: add a modification check
netdev->dev_addr should only be modified via helpers, but someone may be casting off the const. Add a runtime check to catch abuses.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
5f0b6923 |
| 19-Nov-2021 |
Jakub Kicinski <kuba@kernel.org> |
net: unexport dev_addr_init() & dev_addr_flush()
There are no module callers in-tree and it's hard to justify why anyone would init or flush addresses of a netdev (note the flush is more of a destru
net: unexport dev_addr_init() & dev_addr_flush()
There are no module callers in-tree and it's hard to justify why anyone would init or flush addresses of a netdev (note the flush is more of a destructor, it frees netdev->dev_addr).
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
adeef3e3 |
| 19-Nov-2021 |
Jakub Kicinski <kuba@kernel.org> |
net: constify netdev->dev_addr
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. We converted all users to make
net: constify netdev->dev_addr
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. We converted all users to make modifications via appropriate helpers, make netdev->dev_addr const.
The update helpers need to upcast from the buffer to struct netdev_hw_addr.
Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9 |
|
#
a5b8fd65 |
| 29-Sep-2021 |
Jakub Kicinski <kuba@kernel.org> |
net: dev_addr_list: handle first address in __hw_addr_add_ex
struct dev_addr_list is used for device addresses, unicast addresses and multicast addresses. The first of those needs special handling o
net: dev_addr_list: handle first address in __hw_addr_add_ex
struct dev_addr_list is used for device addresses, unicast addresses and multicast addresses. The first of those needs special handling of the main address - netdev->dev_addr points directly the data of the entry and drivers write to it freely, so we can't maintain it in the rbtree (for now, at least, to be fixed in net-next).
Current work around sprinkles special handling of the first address on the list throughout the code but it missed the case where address is being added. First address will not be visible during subsequent adds.
Syzbot found a warning where unicast addresses are modified without holding the rtnl lock, tl;dr is that team generates the same modification multiple times, not necessarily when right locks are held.
In the repro we have:
macvlan -> team -> veth
macvlan adds a unicast address to the team. Team then pushes that address down to its memebers (veths). Next something unrelated makes team sync member addrs again, and because of the bug the addr entries get duplicated in the veths. macvlan gets removed, removes its addr from team which removes only one of the duplicated addresses from veths. This removal is done under rtnl. Next syzbot uses iptables to add a multicast addr to team (which does not hold rtnl lock). Team syncs veth addrs, but because veths' unicast list still has the duplicate it will also get sync, even though this update is intended for mc addresses. Again, uc address updates need rtnl lock, boom.
Reported-by: syzbot+7a2ab2cdc14d134de553@syzkaller.appspotmail.com Fixes: 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs with IPv6 addresses, performance of changing link state, attaching a VRF, changing an IPv6 address, etc. go down dramtically.") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.14.8, v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61 |
|
#
406f42fa |
| 19-Aug-2021 |
Gilad Naaman <gnaaman@drivenets.com> |
net-next: When a bond have a massive amount of VLANs with IPv6 addresses, performance of changing link state, attaching a VRF, changing an IPv6 address, etc. go down dramtically.
The source of most
net-next: When a bond have a massive amount of VLANs with IPv6 addresses, performance of changing link state, attaching a VRF, changing an IPv6 address, etc. go down dramtically.
The source of most of the slow down is the `dev_addr_lists.c` module, which mainatins a linked list of HW addresses. When using IPv6, this list grows for each IPv6 address added on a VLAN, since each IPv6 address has a multicast HW address associated with it.
When performing any modification to the involved links, this list is traversed many times, often for nothing, all while holding the RTNL lock.
Instead, this patch adds an auxilliary rbtree which cuts down traversal time significantly.
Performance can be seen with the following script:
#!/bin/bash ip netns del test || true 2>/dev/null ip netns add test
echo 1 | ip netns exec test tee /proc/sys/net/ipv6/conf/all/keep_addr_on_down > /dev/null
set -e
ip -n test link add foo type veth peer name bar ip -n test link add b1 type bond ip -n test link add florp type vrf table 10
ip -n test link set bar master b1 ip -n test link set foo up ip -n test link set bar up ip -n test link set b1 up ip -n test link set florp up
VLAN_COUNT=1500 BASE_DEV=b1
echo Creating vlans ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link add link $BASE_DEV name foo.\$i type vlan id \$i; done"
echo Bringing them up ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link set foo.\$i up; done"
echo Assiging IPv6 Addresses ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test address add dev foo.\$i 2000::\$i/64; done"
echo Attaching to VRF ip netns exec test time -p bash -c "for i in \$(seq 1 $VLAN_COUNT); do ip -n test link set foo.\$i master florp; done"
On an Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz machine, the performance before the patch is (truncated):
Creating vlans real 108.35 Bringing them up real 4.96 Assiging IPv6 Addresses real 19.22 Attaching to VRF real 458.84
After the patch:
Creating vlans real 5.59 Bringing them up real 5.07 Assiging IPv6 Addresses real 5.64 Attaching to VRF real 25.37
Cc: David S. Miller <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Lu Wei <luwei32@huawei.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Gilad Naaman <gnaaman@drivenets.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.10.60, v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39, v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27 |
|
#
af825087 |
| 27-Mar-2021 |
Xiongfeng Wang <wangxiongfeng2@huawei.com> |
net: core: Correct function name dev_uc_flush() in the kerneldoc
Fix the following W=1 kernel build warning(s):
net/core/dev_addr_lists.c:732: warning: expecting prototype for dev_uc_flush(). Prot
net: core: Correct function name dev_uc_flush() in the kerneldoc
Fix the following W=1 kernel build warning(s):
net/core/dev_addr_lists.c:732: warning: expecting prototype for dev_uc_flush(). Prototype was for dev_uc_init() instead
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.10.26 |
|
#
897b9fae |
| 25-Mar-2021 |
Lu Wei <luwei32@huawei.com> |
net: core: Fix a typo in dev_addr_lists.c
Modify "funciton" to "function" in net/core/dev_addr_lists.c.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed
net: core: Fix a typo in dev_addr_lists.c
Modify "funciton" to "function" in net/core/dev_addr_lists.c.
Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Lu Wei <luwei32@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14, v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12 |
|
#
1fc70edb |
| 25-Sep-2020 |
Taehee Yoo <ap420073@gmail.com> |
net: core: add nested_level variable in net_device
This patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() o
net: core: add nested_level variable in net_device
This patch is to add a new variable 'nested_level' into the net_device structure. This variable will be used as a parameter of spin_lock_nested() of dev->addr_list_lock.
netif_addr_lock() can be called recursively so spin_lock_nested() is used instead of spin_lock() and dev->lower_level is used as a parameter of spin_lock_nested(). But, dev->lower_level value can be updated while it is being used. So, lockdep would warn a possible deadlock scenario.
When a stacked interface is deleted, netif_{uc | mc}_sync() is called recursively. So, spin_lock_nested() is called recursively too. At this moment, the dev->lower_level variable is used as a parameter of it. dev->lower_level value is updated when interfaces are being unlinked/linked immediately. Thus, After unlinking, dev->lower_level shouldn't be a parameter of spin_lock_nested().
A (macvlan) | B (vlan) | C (bridge) | D (macvlan) | E (vlan) | F (bridge)
A->lower_level : 6 B->lower_level : 5 C->lower_level : 4 D->lower_level : 3 E->lower_level : 2 F->lower_level : 1
When an interface 'A' is removed, it releases resources. At this moment, netif_addr_lock() would be called. Then, netdev_upper_dev_unlink() is called recursively. Then dev->lower_level is updated. There is no problem.
But, when the bridge module is removed, 'C' and 'F' interfaces are removed at once. If 'F' is removed first, a lower_level value is like below. A->lower_level : 5 B->lower_level : 4 C->lower_level : 3 D->lower_level : 2 E->lower_level : 1 F->lower_level : 1
Then, 'C' is removed. at this moment, netif_addr_lock() is called recursively. The ordering is like this. C(3)->D(2)->E(1)->F(1) At this moment, the lower_level value of 'E' and 'F' are the same. So, lockdep warns a possible deadlock scenario.
In order to avoid this problem, a new variable 'nested_level' is added. This value is the same as dev->lower_level - 1. But this value is updated in rtnl_unlock(). So, this variable can be used as a parameter of spin_lock_nested() safely in the rtnl context.
Test commands: ip link add br0 type bridge vlan_filtering 1 ip link add vlan1 link br0 type vlan id 10 ip link add macvlan2 link vlan1 type macvlan ip link add br3 type bridge vlan_filtering 1 ip link set macvlan2 master br3 ip link add vlan4 link br3 type vlan id 10 ip link add macvlan5 link vlan4 type macvlan ip link add br6 type bridge vlan_filtering 1 ip link set macvlan5 master br6 ip link add vlan7 link br6 type vlan id 10 ip link add macvlan8 link vlan7 type macvlan
ip link set br0 up ip link set vlan1 up ip link set macvlan2 up ip link set br3 up ip link set vlan4 up ip link set macvlan5 up ip link set br6 up ip link set vlan7 up ip link set macvlan8 up modprobe -rv bridge
Splat looks like: [ 36.057436][ T744] WARNING: possible recursive locking detected [ 36.058848][ T744] 5.9.0-rc6+ #728 Not tainted [ 36.059959][ T744] -------------------------------------------- [ 36.061391][ T744] ip/744 is trying to acquire lock: [ 36.062590][ T744] ffff8c4767509280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_set_rx_mode+0x19/0x30 [ 36.064922][ T744] [ 36.064922][ T744] but task is already holding lock: [ 36.066626][ T744] ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.068851][ T744] [ 36.068851][ T744] other info that might help us debug this: [ 36.070731][ T744] Possible unsafe locking scenario: [ 36.070731][ T744] [ 36.072497][ T744] CPU0 [ 36.073238][ T744] ---- [ 36.074007][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.075290][ T744] lock(&vlan_netdev_addr_lock_key); [ 36.076590][ T744] [ 36.076590][ T744] *** DEADLOCK *** [ 36.076590][ T744] [ 36.078515][ T744] May be due to missing lock nesting notation [ 36.078515][ T744] [ 36.080491][ T744] 3 locks held by ip/744: [ 36.081471][ T744] #0: ffffffff98571df0 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x236/0x490 [ 36.083614][ T744] #1: ffff8c4767769280 (&vlan_netdev_addr_lock_key){+...}-{2:2}, at: dev_uc_add+0x1e/0x60 [ 36.085942][ T744] #2: ffff8c476c8da280 (&bridge_netdev_addr_lock_key/4){+...}-{2:2}, at: dev_uc_sync+0x39/0x80 [ 36.088400][ T744] [ 36.088400][ T744] stack backtrace: [ 36.089772][ T744] CPU: 6 PID: 744 Comm: ip Not tainted 5.9.0-rc6+ #728 [ 36.091364][ T744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 36.093630][ T744] Call Trace: [ 36.094416][ T744] dump_stack+0x77/0x9b [ 36.095385][ T744] __lock_acquire+0xbc3/0x1f40 [ 36.096522][ T744] lock_acquire+0xb4/0x3b0 [ 36.097540][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.098657][ T744] ? rtmsg_ifinfo+0x1f/0x30 [ 36.099711][ T744] ? __dev_notify_flags+0xa5/0xf0 [ 36.100874][ T744] ? rtnl_is_locked+0x11/0x20 [ 36.101967][ T744] ? __dev_set_promiscuity+0x7b/0x1a0 [ 36.103230][ T744] _raw_spin_lock_bh+0x38/0x70 [ 36.104348][ T744] ? dev_set_rx_mode+0x19/0x30 [ 36.105461][ T744] dev_set_rx_mode+0x19/0x30 [ 36.106532][ T744] dev_set_promiscuity+0x36/0x50 [ 36.107692][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.108929][ T744] dev_set_promiscuity+0x1e/0x50 [ 36.110093][ T744] br_port_set_promisc+0x1f/0x40 [bridge] [ 36.111415][ T744] br_manage_promisc+0x8b/0xe0 [bridge] [ 36.112728][ T744] __dev_set_promiscuity+0x123/0x1a0 [ 36.113967][ T744] ? __hw_addr_sync_one+0x23/0x50 [ 36.115135][ T744] __dev_set_rx_mode+0x68/0x90 [ 36.116249][ T744] dev_uc_sync+0x70/0x80 [ 36.117244][ T744] dev_uc_add+0x50/0x60 [ 36.118223][ T744] macvlan_open+0x18e/0x1f0 [macvlan] [ 36.119470][ T744] __dev_open+0xd6/0x170 [ 36.120470][ T744] __dev_change_flags+0x181/0x1d0 [ 36.121644][ T744] dev_change_flags+0x23/0x60 [ 36.122741][ T744] do_setlink+0x30a/0x11e0 [ 36.123778][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.124929][ T744] ? __nla_validate_parse.part.6+0x45/0x8e0 [ 36.126309][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.127457][ T744] __rtnl_newlink+0x546/0x8e0 [ 36.128560][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.129623][ T744] ? deactivate_slab.isra.85+0x6a1/0x850 [ 36.130946][ T744] ? __lock_acquire+0x92c/0x1f40 [ 36.132102][ T744] ? lock_acquire+0xb4/0x3b0 [ 36.133176][ T744] ? is_bpf_text_address+0x5/0xe0 [ 36.134364][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.135445][ T744] ? rcu_read_lock_sched_held+0x32/0x60 [ 36.136771][ T744] ? kmem_cache_alloc_trace+0x2d8/0x380 [ 36.138070][ T744] ? rtnl_newlink+0x2e/0x70 [ 36.139164][ T744] rtnl_newlink+0x47/0x70 [ ... ]
Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7 |
|
#
e8280338 |
| 26-Jun-2020 |
Cong Wang <xiyou.wangcong@gmail.com> |
net: explain the lockdep annotations for dev_uc_unsync()
The lockdep annotations for dev_uc_unsync() and dev_mc_unsync() are not easy to understand, so add some comments to explain why they are corr
net: explain the lockdep annotations for dev_uc_unsync()
The lockdep annotations for dev_uc_unsync() and dev_mc_unsync() are not easy to understand, so add some comments to explain why they are correct.
Similar for the rest netif_addr_lock_bh() cases, they don't need nested version.
Cc: Taehee Yoo <ap420073@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2 |
|
#
845e0ebb |
| 08-Jun-2020 |
Cong Wang <xiyou.wangcong@gmail.com> |
net: change addr_list_lock back to static key
The dynamic key update for addr_list_lock still causes troubles, for example the following race condition still exists:
CPU 0: CPU 1: (RCU read lock
net: change addr_list_lock back to static key
The dynamic key update for addr_list_lock still causes troubles, for example the following race condition still exists:
CPU 0: CPU 1: (RCU read lock) (RTNL lock) dev_mc_seq_show() netdev_update_lockdep_key() -> lockdep_unregister_key() -> netif_addr_lock_bh()
because lockdep doesn't provide an API to update it atomically. Therefore, we have to move it back to static keys and use subclass for nest locking like before.
In commit 1a33e10e4a95 ("net: partially revert dynamic lockdep key changes"), I already reverted most parts of commit ab92d68fc22f ("net: core: add generic lockdep keys").
This patch reverts the rest and also part of commit f3b0a18bb6cb ("net: remove unnecessary variables and callback"). After this patch, addr_list_lock changes back to using static keys and subclasses to satisfy lockdep. Thanks to dev->lower_level, we do not have to change back to ->ndo_get_lock_subclass().
And hopefully this reduces some syzbot lockdep noises too.
Reported-by: syzbot+f3a0e80c34b3fc28ac5e@syzkaller.appspotmail.com Cc: Taehee Yoo <ap420073@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43, v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6, v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24, v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12, v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12, v5.3.11, v5.3.10, v5.3.9, v5.3.8 |
|
#
f3b0a18b |
| 21-Oct-2019 |
Taehee Yoo <ap420073@gmail.com> |
net: remove unnecessary variables and callback
This patch removes variables and callback these are related to the nested device structure. devices that can be nested have their own nest_level variab
net: remove unnecessary variables and callback
This patch removes variables and callback these are related to the nested device structure. devices that can be nested have their own nest_level variable that represents the depth of nested devices. In the previous patch, new {lower/upper}_level variables are added and they replace old private nest_level variable. So, this patch removes all 'nest_level' variables.
In order to avoid lockdep warning, ->ndo_get_lock_subclass() was added to get lockdep subclass value, which is actually lower nested depth value. But now, they use the dynamic lockdep key to avoid lockdep warning instead of the subclass. So, this patch removes ->ndo_get_lock_subclass() callback.
Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3, v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12, v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13, v5.1.12, v5.1.11, v5.1.10, v5.1.9, v5.1.8, v5.1.7, v5.1.6 |
|
#
2874c5fd |
| 27-May-2019 |
Thomas Gleixner <tglx@linutronix.de> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of th
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.1.5, v5.1.4, v5.1.3, v5.1.2, v5.1.1, v5.0.14, v5.1, v5.0.13, v5.0.12, v5.0.11, v5.0.10, v5.0.9, v5.0.8, v5.0.7, v5.0.6, v5.0.5, v5.0.4, v5.0.3, v4.19.29, v5.0.2, v4.19.28, v5.0.1, v4.19.27, v5.0, v4.19.26, v4.19.25, v4.19.24, v4.19.23, v4.19.22, v4.19.21, v4.19.20, v4.19.19, v4.19.18, v4.19.17, v4.19.16, v4.19.15, v4.19.14, v4.19.13, v4.19.12, v4.19.11, v4.19.10 |
|
#
d59cdf94 |
| 13-Dec-2018 |
Petr Machata <petrm@mellanox.com> |
net: dev: Issue NETDEV_PRE_CHANGEADDR
When a device address is about to be changed, or an address added to the list of device HW addresses, it is necessary to ensure that all interested parties can
net: dev: Issue NETDEV_PRE_CHANGEADDR
When a device address is about to be changed, or an address added to the list of device HW addresses, it is necessary to ensure that all interested parties can support the address. Therefore, send the NETDEV_PRE_CHANGEADDR notification, and if anyone bails on it, do not change the address.
Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v4.19.9, v4.19.8, v4.19.7, v4.19.6, v4.19.5, v4.19.4, v4.18.20, v4.19.3, v4.18.19, v4.19.2, v4.18.18 |
|
#
e7946760 |
| 08-Nov-2018 |
Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> |
net: core: dev_addr_lists: add auxiliary func to handle reference address updates
In order to avoid all table update, and only remove or add new address, the auxiliary function exists, named __hw_ad
net: core: dev_addr_lists: add auxiliary func to handle reference address updates
In order to avoid all table update, and only remove or add new address, the auxiliary function exists, named __hw_addr_sync_dev(). It allows end driver do nothing when nothing changed and add/rm when concrete address is firstly added or lastly removed. But it doesn't include cases when an address of real device or vlan was reused by other vlans or vlan/macval devices.
For handaling events when address was reused/unreused the patch adds new auxiliary routine - __hw_addr_ref_sync_dev(). It allows to do nothing when nothing was changed and do updates only for an address being added/reused/deleted/unreused. Thus, clone address changes for vlans can be mirrored in the table. The function is exclusive with __hw_addr_sync_dev(). It's responsibility of the end driver to identify address vlan device, if it needs so.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v4.18.17, v4.19.1, v4.19, v4.18.16, v4.18.15, v4.18.14, v4.18.13, v4.18.12, v4.18.11, v4.18.10, v4.18.9, v4.18.7, v4.18.6, v4.18.5, v4.17.18, v4.18.4, v4.18.3, v4.17.17, v4.18.2, v4.17.16, v4.17.15, v4.18.1, v4.18, v4.17.14, v4.17.13, v4.17.12, v4.17.11, v4.17.10, v4.17.9, v4.17.8, v4.17.7, v4.17.6, v4.17.5, v4.17.4, v4.17.3, v4.17.2, v4.17.1, v4.17 |
|
#
bb9aaaa1 |
| 17-Apr-2018 |
sunlianwen <sunlw.fnst@cn.fujitsu.com> |
net: change the comment of dev_mc_init
The comment of dev_mc_init() is wrong. which use dev_mc_flush instead of dev_mc_init.
Signed-off-by: Lianwen Sun <sunlw.fnst@cn.fujitsu.com Signed-off-by: Dav
net: change the comment of dev_mc_init
The comment of dev_mc_init() is wrong. which use dev_mc_flush instead of dev_mc_init.
Signed-off-by: Lianwen Sun <sunlw.fnst@cn.fujitsu.com Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
77d36398 |
| 07-Apr-2018 |
Eric Dumazet <edumazet@google.com> |
net: fix uninit-value in __hw_addr_add_ex()
syzbot complained :
BUG: KMSAN: uninit-value in memcmp+0x119/0x180 lib/string.c:861 CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.16.0+ #82 Hardware name
net: fix uninit-value in __hw_addr_add_ex()
syzbot complained :
BUG: KMSAN: uninit-value in memcmp+0x119/0x180 lib/string.c:861 CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: ipv6_addrconf addrconf_dad_work Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 memcmp+0x119/0x180 lib/string.c:861 __hw_addr_add_ex net/core/dev_addr_lists.c:60 [inline] __dev_mc_add+0x1c2/0x8e0 net/core/dev_addr_lists.c:670 dev_mc_add+0x6d/0x80 net/core/dev_addr_lists.c:687 igmp6_group_added+0x2db/0xa00 net/ipv6/mcast.c:662 ipv6_dev_mc_inc+0xe9e/0x1130 net/ipv6/mcast.c:914 addrconf_join_solict net/ipv6/addrconf.c:2078 [inline] addrconf_dad_begin net/ipv6/addrconf.c:3828 [inline] addrconf_dad_work+0x427/0x2150 net/ipv6/addrconf.c:3954 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2113 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2247 kthread+0x539/0x720 kernel/kthread.c:239
Fixes: f001fde5eadd ("net: introduce a list of device addresses dev_addr_list (v6)") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v4.16, v4.15, v4.13.16, v4.14, v4.13.5, v4.13, v4.12, v4.10.17, v4.10.16, v4.10.15, v4.10.14, v4.10.13, v4.10.12, v4.10.11, v4.10.10, v4.10.9, v4.10.8, v4.10.7, v4.10.6, v4.10.5, v4.10.4, v4.10.3, v4.10.2, v4.10.1, v4.10, v4.9, openbmc-4.4-20161121-1, v4.4.33, v4.4.32, v4.4.31, v4.4.30, v4.4.29, v4.4.28, v4.4.27, v4.7.10, openbmc-4.4-20161021-1, v4.7.9, v4.4.26, v4.7.8, v4.4.25, v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22, v4.4.21, v4.7.4, v4.7.3, v4.4.20, v4.7.2, v4.4.19, openbmc-4.4-20160819-1, v4.7.1, v4.4.18, v4.4.17, openbmc-4.4-20160804-1, v4.4.16, v4.7, openbmc-4.4-20160722-1, openbmc-20160722-1, openbmc-20160713-1, v4.4.15, v4.6.4, v4.6.3, v4.4.14, v4.6.2, v4.4.13, openbmc-20160606-1, v4.6.1, v4.4.12, openbmc-20160521-1, v4.4.11, openbmc-20160518-1, v4.6, v4.4.10, openbmc-20160511-1, openbmc-20160505-1, v4.4.9, v4.4.8, v4.4.7, openbmc-20160329-2, openbmc-20160329-1, openbmc-20160321-1, v4.4.6, v4.5, v4.4.5, v4.4.4, v4.4.3, openbmc-20160222-1, v4.4.2, openbmc-20160212-1, openbmc-20160210-1, openbmc-20160202-2, openbmc-20160202-1, v4.4.1, openbmc-20160127-1, openbmc-20160120-1, v4.4, openbmc-20151217-1, openbmc-20151210-1, openbmc-20151202-1, openbmc-20151123-1, openbmc-20151118-1, openbmc-20151104-1, v4.3, openbmc-20151102-1, openbmc-20151028-1, v4.3-rc1, v4.2, v4.2-rc8, v4.2-rc7, v4.2-rc6, v4.2-rc5, v4.2-rc4, v4.2-rc3, v4.2-rc2, v4.2-rc1, v4.1, v4.1-rc8, v4.1-rc7, v4.1-rc6, v4.1-rc5, v4.1-rc4, v4.1-rc3, v4.1-rc2, v4.1-rc1, v4.0, v4.0-rc7, v4.0-rc6, v4.0-rc5, v4.0-rc4, v4.0-rc3, v4.0-rc2, v4.0-rc1, v3.19, v3.19-rc7, v3.19-rc6, v3.19-rc5, v3.19-rc4, v3.19-rc3, v3.19-rc2, v3.19-rc1, v3.18, v3.18-rc7, v3.18-rc6 |
|
#
1d2398dc |
| 17-Nov-2014 |
Fabian Frederick <fabf@skynet.be> |
net: fix spelling for synchronized
Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: David S. Miller <davem@davemloft.net>
|
Revision tags: v3.18-rc5, v3.18-rc4, v3.18-rc3, v3.18-rc2, v3.18-rc1, v3.17, v3.17-rc7, v3.17-rc6, v3.17-rc5, v3.17-rc4, v3.17-rc3, v3.17-rc2, v3.17-rc1, v3.16, v3.16-rc7, v3.16-rc6, v3.16-rc5, v3.16-rc4, v3.16-rc3, v3.16-rc2, v3.16-rc1, v3.15, v3.15-rc8 |
|
#
670e5b8e |
| 28-May-2014 |
Alexander Duyck <alexander.h.duyck@intel.com> |
net: Add support for device specific address syncing
This change provides a function to be used in order to break the ndo_set_rx_mode call into a set of address add and remove calls. The code is ba
net: Add support for device specific address syncing
This change provides a function to be used in order to break the ndo_set_rx_mode call into a set of address add and remove calls. The code is based on the implementation of dev_uc_sync/dev_mc_sync. Since they essentially do the same thing but with only one dev I simply named my functions __dev_uc_sync/__dev_mc_sync.
I also implemented an unsync version of the functions as well to allow for cleanup on close.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v3.15-rc7, v3.15-rc6, v3.15-rc5, v3.15-rc4, v3.15-rc3, v3.15-rc2, v3.15-rc1, v3.14, v3.14-rc8, v3.14-rc7, v3.14-rc6, v3.14-rc5, v3.14-rc4, v3.14-rc3, v3.14-rc2, v3.14-rc1 |
|
#
6ef7b8a2 |
| 22-Jan-2014 |
Vlad Yasevich <vyasevic@redhat.com> |
net: Correctly sync addresses from multiple sources to single device
When we have multiple devices attempting to sync the same address to a single destination, each device should be permitted to syn
net: Correctly sync addresses from multiple sources to single device
When we have multiple devices attempting to sync the same address to a single destination, each device should be permitted to sync it once. To accomplish this, pass the 'sync_cnt' of the source address when adding the addresss to the lower device. 'sync_cnt' tracks how many time a given address has been succefully synced. This way, we know that if the 'sync_cnt' passed in is 0, we should sync this address.
Also, turn 'synced' member back into the counter as was originally done in commit 4543fbefe6e06a9e40d9f2b28d688393a299f079. net: count hw_addr syncs so that unsync works properly. It tracks how many time a given address has been added via a 'sync' operation. For every successfull 'sync' the counter is incremented, and for ever 'unsync', the counter is decremented. This makes sure that the address will be properly removed from the the lower device when all the upper devices have removed it.
Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru> CC: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru> CC: Alexandra N. Kossovsky <Alexandra.Kossovsky@oktetlabs.ru> CC: Konstantin Ushakov <Konstantin.Ushakov@oktetlabs.ru> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v3.13, v3.13-rc8, v3.13-rc7, v3.13-rc6, v3.13-rc5, v3.13-rc4 |
|
#
477bb933 |
| 13-Dec-2013 |
stephen hemminger <shemming@brocade.com> |
net: remove dead code for add/del multiple
These function to manipulate multiple addresses are not used anywhere in current net-next tree. Some out of tree code maybe using these but too bad; they s
net: remove dead code for add/del multiple
These function to manipulate multiple addresses are not used anywhere in current net-next tree. Some out of tree code maybe using these but too bad; they should submit their code upstream..
Also, make __hw_addr_flush local since only used by dev_addr_lists.c
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v3.13-rc3, v3.13-rc2, v3.13-rc1, v3.12 |
|
#
cdfb97bc |
| 28-Oct-2013 |
Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> |
net, mc: fix the incorrect comments in two mc-related functions
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
Revision tags: v3.12-rc7, v3.12-rc6, v3.12-rc5, v3.12-rc4, v3.12-rc3, v3.12-rc2, v3.12-rc1, v3.11, v3.11-rc7, v3.11-rc6, v3.11-rc5, v3.11-rc4, v3.11-rc3, v3.11-rc2, v3.11-rc1, v3.10, v3.10-rc7, v3.10-rc6, v3.10-rc5, v3.10-rc4 |
|
#
b190a508 |
| 31-May-2013 |
Jay Vosburgh <fubar@us.ibm.com> |
net/core: dev_mc_sync_multiple calls wrong helper
The dev_mc_sync_multiple function is currently calling __hw_addr_sync, and not __hw_addr_sync_multiple. This will result in addresses only being sy
net/core: dev_mc_sync_multiple calls wrong helper
The dev_mc_sync_multiple function is currently calling __hw_addr_sync, and not __hw_addr_sync_multiple. This will result in addresses only being synced to the first device from the set.
Corrected by calling the _multiple variant.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
29ca2f8f |
| 31-May-2013 |
Jay Vosburgh <fubar@us.ibm.com> |
net/core: __hw_addr_sync_one / _multiple broken
Currently, __hw_addr_sync_one is called in a loop by __hw_addr_sync_multiple to sync each of a "from" device's hw addresses to a "to" device. __hw_ad
net/core: __hw_addr_sync_one / _multiple broken
Currently, __hw_addr_sync_one is called in a loop by __hw_addr_sync_multiple to sync each of a "from" device's hw addresses to a "to" device. __hw_addr_sync_one calls __hw_addr_add_ex to attempt to add each address. __hw_addr_add_ex is called with global=false, and sync=true.
__hw_addr_add_ex checks to see if the new address matches an address already on the list. If so, it tests global and sync. In this case, sync=true, and it then checks if the address is already synced, and if so, returns 0.
This 0 return causes __hw_addr_sync_one to increment the sync_cnt and refcount for the "from" list's address entry, even though the address is already synced and has a reference and sync_cnt. This will cause the sync_cnt and refcount to increment without bound every time an addresses is added to the "from" device and synced to the "to" device.
The fix here has two parts:
First, when __hw_addr_add_ex finds the address already exists and is synced, return -EEXIST instead of 0.
Second, __hw_addr_sync_one checks the error return for -EEXIST, and if so, it (a) does not add a refcount/sync_cnt, and (b) returns 0 itself so that __hw_addr_sync_multiple will not return an error.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Reviewed-by: Vlad Yasevich <vyasevic@redhat.com> Tested-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|