Revision tags: v6.6.25, v6.6.24, v6.6.23, v6.6.16, v6.6.15, v6.6.14, v6.6.13, v6.6.12, v6.6.11, v6.6.10, v6.6.9, v6.6.8, v6.6.7, v6.6.6, v6.6.5, v6.6.4, v6.6.3, v6.6.2, v6.5.11, v6.6.1, v6.5.10, v6.6, v6.5.9, v6.5.8, v6.5.7, v6.5.6, v6.5.5 |
|
#
4b2b6060 |
| 22-Sep-2023 |
Hangbin Liu <liuhangbin@gmail.com> |
ipv4/fib: send notify when delete source address routes
After deleting an interface address in fib_del_ifaddr(), the function scans the fib_info list for stray entries and calls fib_flush() and fib_
ipv4/fib: send notify when delete source address routes
After deleting an interface address in fib_del_ifaddr(), the function scans the fib_info list for stray entries and calls fib_flush() and fib_table_flush(). Then the stray entries will be deleted silently and no RTM_DELROUTE notification will be sent.
This lack of notification can make routing daemons, or monitor like `ip monitor route` miss the routing changes. e.g.
+ ip link add dummy1 type dummy + ip link add dummy2 type dummy + ip link set dummy1 up + ip link set dummy2 up + ip addr add 192.168.5.5/24 dev dummy1 + ip route add 7.7.7.0/24 dev dummy2 src 192.168.5.5 + ip -4 route 7.7.7.0/24 dev dummy2 scope link src 192.168.5.5 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5 + ip monitor route + ip addr del 192.168.5.5/24 dev dummy1 Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5 Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5 Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5
As Ido reminded, fib_table_flush() isn't only called when an address is deleted, but also when an interface is deleted or put down. The lack of notification in these cases is deliberate. And commit 7c6bb7d2faaf ("net/ipv6: Add knob to skip DELROUTE message on device down") introduced a sysctl to make IPv6 behave like IPv4 in this regard. So we can't send the route delete notify blindly in fib_table_flush().
To fix this issue, let's add a new flag in "struct fib_info" to track the deleted prefer source address routes, and only send notify for them.
After update: + ip monitor route + ip addr del 192.168.5.5/24 dev dummy1 Deleted 192.168.5.0/24 dev dummy1 proto kernel scope link src 192.168.5.5 Deleted broadcast 192.168.5.255 dev dummy1 table local proto kernel scope link src 192.168.5.5 Deleted local 192.168.5.5 dev dummy1 table local proto kernel scope host src 192.168.5.5 Deleted 7.7.7.0/24 dev dummy2 scope link src 192.168.5.5
Suggested-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230922075508.848925-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
show more ...
|
Revision tags: v6.5.4, v6.5.3, v6.5.2, v6.1.51, v6.5.1, v6.1.50 |
|
#
8aae7625 |
| 30-Aug-2023 |
Florian Westphal <fw@strlen.de> |
net: fib: avoid warn splat in flow dissector
New skbs allocated via nf_send_reset() have skb->dev == NULL.
fib*_rules_early_flow_dissect helpers already have a 'struct net' argument but its not pas
net: fib: avoid warn splat in flow dissector
New skbs allocated via nf_send_reset() have skb->dev == NULL.
fib*_rules_early_flow_dissect helpers already have a 'struct net' argument but its not passed down to the flow dissector core, which will then WARN as it can't derive a net namespace to use:
WARNING: CPU: 0 PID: 0 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0xa91/0x1cd0 [..] ip_route_me_harder+0x143/0x330 nf_send_reset+0x17c/0x2d0 [nf_reject_ipv4] nft_reject_inet_eval+0xa9/0xf2 [nft_reject_inet] nft_do_chain+0x198/0x5d0 [nf_tables] nft_do_chain_inet+0xa4/0x110 [nf_tables] nf_hook_slow+0x41/0xc0 ip_local_deliver+0xce/0x110 ..
Cc: Stanislav Fomichev <sdf@google.com> Cc: David Ahern <dsahern@kernel.org> Cc: Ido Schimmel <idosch@nvidia.com> Fixes: 812fa71f0d96 ("netfilter: Dissect flow after packet mangling") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217826 Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230830110043.30497-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
show more ...
|
Revision tags: v6.5, v6.1.49, v6.1.48, v6.1.46, v6.1.45, v6.1.44, v6.1.43, v6.1.42, v6.1.41, v6.1.40, v6.1.39, v6.1.38, v6.1.37, v6.1.36, v6.4, v6.1.35, v6.1.34, v6.1.33, v6.1.32, v6.1.31, v6.1.30, v6.1.29, v6.1.28, v6.1.27, v6.1.26, v6.3, v6.1.25, v6.1.24, v6.1.23, v6.1.22, v6.1.21, v6.1.20, v6.1.19, v6.1.18, v6.1.17, v6.1.16, v6.1.15, v6.1.14, v6.1.13, v6.2, v6.1.12, v6.1.11, v6.1.10, v6.1.9, v6.1.8, v6.1.7, v6.1.6, v6.1.5, v6.0.19, v6.0.18, v6.1.4, v6.1.3, v6.0.17, v6.1.2, v6.0.16, v6.1.1, v6.0.15, v6.0.14, v6.0.13, v6.1, v6.0.12, v6.0.11, v6.0.10, v5.15.80, v6.0.9, v5.15.79, v6.0.8, v5.15.78, v6.0.7, v5.15.77, v5.15.76, v6.0.6, v6.0.5, v5.15.75, v6.0.4, v6.0.3, v6.0.2, v5.15.74, v5.15.73, v6.0.1, v5.15.72, v6.0, v5.15.71, v5.15.70, v5.15.69, v5.15.68, v5.15.67, v5.15.66, v5.15.65, v5.15.64, v5.15.63, v5.15.62, v5.15.61, v5.15.60, v5.15.59, v5.19, v5.15.58, v5.15.57, v5.15.56, v5.15.55, v5.15.54, v5.15.53, v5.15.52, v5.15.51, v5.15.50, v5.15.49, v5.15.48, v5.15.47, v5.15.46, v5.15.45, v5.15.44, v5.15.43, v5.15.42, v5.18, v5.15.41, v5.15.40, v5.15.39, v5.15.38, v5.15.37, v5.15.36, v5.15.35, v5.15.34 |
|
#
568a3f33 |
| 08-Apr-2022 |
Guillaume Nault <gnault@redhat.com> |
ipv4: Use dscp_t in struct fib_entry_notifier_info
Use the new dscp_t type to replace the tos field of struct fib_entry_notifier_info. This ensures ECN bits are ignored and makes it compatible with
ipv4: Use dscp_t in struct fib_entry_notifier_info
Use the new dscp_t type to replace the tos field of struct fib_entry_notifier_info. This ensures ECN bits are ignored and makes it compatible with the dscp field of struct fib_rt_info.
This also allows sparse to flag potential incorrect uses of DSCP and ECN bits.
Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
#
888ade8f |
| 08-Apr-2022 |
Guillaume Nault <gnault@redhat.com> |
ipv4: Use dscp_t in struct fib_rt_info
Use the new dscp_t type to replace the tos field of struct fib_rt_info. This ensures ECN bits are ignored and makes it compatible with the fa_dscp field of str
ipv4: Use dscp_t in struct fib_rt_info
Use the new dscp_t type to replace the tos field of struct fib_rt_info. This ensures ECN bits are ignored and makes it compatible with the fa_dscp field of struct fib_alias.
This also allows sparse to flag potential incorrect uses of DSCP and ECN bits.
Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
Revision tags: v5.15.33, v5.15.32, v5.15.31, v5.17, v5.15.30, v5.15.29, v5.15.28, v5.15.27, v5.15.26, v5.15.25, v5.15.24, v5.15.23, v5.15.22, v5.15.21, v5.15.20 |
|
#
f55fbb6a |
| 04-Feb-2022 |
Guillaume Nault <gnault@redhat.com> |
ipv4: Reject routes specifying ECN bits in rtm_tos
Use the new dscp_t type to replace the fc_tos field of fib_config, to ensure IPv4 routes aren't influenced by ECN bits when configured with non-zer
ipv4: Reject routes specifying ECN bits in rtm_tos
Use the new dscp_t type to replace the fc_tos field of fib_config, to ensure IPv4 routes aren't influenced by ECN bits when configured with non-zero rtm_tos.
Before this patch, IPv4 routes specifying an rtm_tos with some of the ECN bits set were accepted. However they wouldn't work (never match) as IPv4 normally clears the ECN bits with IPTOS_RT_MASK before doing a FIB lookup (although a few buggy code paths don't).
After this patch, IPv4 routes specifying an rtm_tos with any ECN bit set is rejected.
Note: IPv6 routes ignore rtm_tos altogether, any rtm_tos is accepted, but treated as if it were 0.
Signed-off-by: Guillaume Nault <gnault@redhat.com> Acked-by: David Ahern <dsahern@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
Revision tags: v5.15.19, v5.15.18, v5.15.17, v5.4.173, v5.15.16, v5.15.15, v5.16, v5.15.10, v5.15.9, v5.15.8, v5.15.7 |
|
#
e44b14eb |
| 06-Dec-2021 |
Eric Dumazet <edumazet@google.com> |
inet: add net device refcount tracker to struct fib_nh_common
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
#
213f5f8f |
| 01-Dec-2021 |
Eric Dumazet <edumazet@google.com> |
ipv4: convert fib_num_tclassid_users to atomic_t
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") changes to net->ipv4.fib_num_tclassid_users were protected by RTNL.
After the
ipv4: convert fib_num_tclassid_users to atomic_t
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") changes to net->ipv4.fib_num_tclassid_users were protected by RTNL.
After the change, this is no longer the case, as free_fib_info_rcu() runs after rcu grace period, without rtnl being held.
Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
170739c4 |
| 01-Dec-2021 |
Eric Dumazet <edumazet@google.com> |
ipv4: convert fib_num_tclassid_users to atomic_t
commit 213f5f8f31f10aa1e83187ae20fb7fa4e626b724 upstream.
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") changes to net->ipv4
ipv4: convert fib_num_tclassid_users to atomic_t
commit 213f5f8f31f10aa1e83187ae20fb7fa4e626b724 upstream.
Before commit faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") changes to net->ipv4.fib_num_tclassid_users were protected by RTNL.
After the change, this is no longer the case, as free_fib_info_rcu() runs after rcu grace period, without rtnl being held.
Fixes: faa041a40b9f ("ipv4: Create cleanup helper for fib_nh") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
Revision tags: v5.15.6, v5.15.5, v5.15.4, v5.15.3, v5.15.2, v5.15.1, v5.15, v5.14.14, v5.14.13, v5.14.12, v5.14.11, v5.14.10, v5.14.9, v5.14.8 |
|
#
597aa16c |
| 23-Sep-2021 |
Xiao Liang <shaw.leon@gmail.com> |
net: ipv4: Fix rtnexthop len when RTA_FLOW is present
Multipath RTA_FLOW is embedded in nexthop. Dump it in fib_add_nexthop() to get the length of rtnexthop correct.
Fixes: b0f60193632e ("ipv4: Ref
net: ipv4: Fix rtnexthop len when RTA_FLOW is present
Multipath RTA_FLOW is embedded in nexthop. Dump it in fib_add_nexthop() to get the length of rtnexthop correct.
Fixes: b0f60193632e ("ipv4: Refactor nexthop attributes in fib_dump_info") Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.14.7, v5.14.6, v5.10.67, v5.10.66, v5.14.5, v5.14.4, v5.10.65, v5.14.3, v5.10.64, v5.14.2, v5.10.63, v5.14.1, v5.10.62, v5.14, v5.10.61, v5.10.60 |
|
#
79976892 |
| 29-Jul-2021 |
Yajun Deng <yajun.deng@linux.dev> |
net: convert fib_treeref from int to refcount_t
refcount_t type should be used instead of int when fib_treeref is used as a reference counter,and avoid use-after-free risks.
Signed-off-by: Yajun De
net: convert fib_treeref from int to refcount_t
refcount_t type should be used instead of int when fib_treeref is used as a reference counter,and avoid use-after-free risks.
Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20210729071350.28919-1-yajun.deng@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
show more ...
|
Revision tags: v5.10.53, v5.10.52, v5.10.51, v5.10.50, v5.10.49, v5.13, v5.10.46, v5.10.43, v5.10.42, v5.10.41, v5.10.40, v5.10.39 |
|
#
ce5c9c20 |
| 17-May-2021 |
Ido Schimmel <idosch@OSS.NVIDIA.COM> |
ipv4: Add a sysctl to control multipath hash fields
A subsequent patch will add a new multipath hash policy where the packet fields used for multipath hash calculation are determined by user space.
ipv4: Add a sysctl to control multipath hash fields
A subsequent patch will add a new multipath hash policy where the packet fields used for multipath hash calculation are determined by user space. This patch adds a sysctl that allows user space to set these fields.
The packet fields are represented using a bitmask and are common between IPv4 and IPv6 to allow user space to use the same numbering across both protocols. For example, to hash based on standard 5-tuple:
# sysctl -w net.ipv4.fib_multipath_hash_fields=0x0037 net.ipv4.fib_multipath_hash_fields = 0x0037
The kernel rejects unknown fields, for example:
# sysctl -w net.ipv4.fib_multipath_hash_fields=0x1000 sysctl: setting key "net.ipv4.fib_multipath_hash_fields": Invalid argument
More fields can be added in the future, if needed.
Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.119, v5.10.36, v5.10.35, v5.10.34, v5.4.116, v5.10.33, v5.12, v5.10.32, v5.10.31, v5.10.30, v5.10.27, v5.10.26, v5.10.25, v5.10.24, v5.10.23, v5.10.22, v5.10.21, v5.10.20, v5.10.19, v5.4.101, v5.10.18, v5.10.17, v5.11, v5.10.16, v5.10.15, v5.10.14 |
|
#
36c5100e |
| 07-Feb-2021 |
Amit Cohen <amcohen@nvidia.com> |
IPv4: Add "offload failed" indication to routes
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily
IPv4: Add "offload failed" indication to routes
After installing a route to the kernel, user space receives an acknowledgment, which means the route was installed in the kernel, but not necessarily in hardware.
The asynchronous nature of route installation in hardware can lead to a routing daemon advertising a route before it was actually installed in hardware. This can result in packet loss or mis-routed packets until the route is installed in hardware.
To avoid such cases, previous patch set added the ability to emit RTM_NEWROUTE notifications whenever RTM_F_OFFLOAD/RTM_F_TRAP flags are changed, this behavior is controlled by sysctl.
With the above mentioned behavior, it is possible to know from user-space if the route was offloaded, but if the offload fails there is no indication to user-space. Following a failure, a routing daemon will wait indefinitely for a notification that will never come.
This patch adds an "offload_failed" indication to IPv4 routes, so that users will have better visibility into the offload process.
'struct fib_alias', and 'struct fib_rt_info' are extended with new field that indicates if route offload failed. Note that the new field is added using unused bit and therefore there is no need to increase structs size.
Signed-off-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.10, v5.8.17, v5.8.16, v5.8.15, v5.9, v5.8.14, v5.8.13, v5.8.12, v5.8.11, v5.8.10, v5.8.9, v5.8.8, v5.8.7, v5.8.6, v5.4.62, v5.8.5, v5.8.4, v5.4.61, v5.8.3, v5.4.60, v5.8.2, v5.4.59, v5.8.1, v5.4.58, v5.4.57, v5.4.56, v5.8, v5.7.12, v5.4.55, v5.7.11, v5.4.54, v5.7.10, v5.4.53, v5.4.52, v5.7.9, v5.7.8, v5.4.51, v5.4.50, v5.7.7, v5.4.49, v5.7.6, v5.7.5, v5.4.48, v5.7.4, v5.7.3, v5.4.47, v5.4.46, v5.7.2, v5.4.45, v5.7.1, v5.4.44, v5.7, v5.4.43 |
|
#
1fd1c768 |
| 26-May-2020 |
David Ahern <dsahern@gmail.com> |
ipv4: nexthop version of fib_info_nh_uses_dev
Similar to the last path, need to fix fib_info_nh_uses_dev for external nexthops to avoid referencing multiple nh_grp structs. Move the device check in
ipv4: nexthop version of fib_info_nh_uses_dev
Similar to the last path, need to fix fib_info_nh_uses_dev for external nexthops to avoid referencing multiple nh_grp structs. Move the device check in fib_info_nh_uses_dev to a helper and create a nexthop version that is called if the fib_info uses an external nexthop.
Fixes: 430a049190de ("nexthop: Add support for nexthop groups") Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
af7888ad |
| 26-May-2020 |
David Ahern <dsahern@gmail.com> |
ipv4: Refactor nhc evaluation in fib_table_lookup
FIB lookups can return an entry that references an external nexthop. While walking the nexthop struct we do not want to make multiple calls into the
ipv4: Refactor nhc evaluation in fib_table_lookup
FIB lookups can return an entry that references an external nexthop. While walking the nexthop struct we do not want to make multiple calls into the nexthop code which can result in 2 different structs getting accessed - one returning the number of paths the rest of the loop seeing a different nh_grp struct. If the nexthop group shrunk, the result is an attempt to access a fib_nh_common that does not exist for the new nh_grp struct but did for the old one.
To fix that move the device evaluation code to a helper that can be used for inline fib_nh path as well as external nexthops.
Update the existing check for fi->nh in fib_table_lookup to call a new helper, nexthop_get_nhc_lookup, which walks the external nexthop with a single rcu dereference.
Fixes: 430a049190de ("nexthop: Add support for nexthop groups") Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
41b4bd98 |
| 20-May-2020 |
Sabrina Dubroca <sd@queasysnail.net> |
net: don't return invalid table id error when we fall back to PF_UNSPEC
In case we can't find a ->dumpit callback for the requested (family,type) pair, we fall back to (PF_UNSPEC,type). In effect, w
net: don't return invalid table id error when we fall back to PF_UNSPEC
In case we can't find a ->dumpit callback for the requested (family,type) pair, we fall back to (PF_UNSPEC,type). In effect, we're in the same situation as if userspace had requested a PF_UNSPEC dump. For RTM_GETROUTE, that handler is rtnl_dump_all, which calls all the registered RTM_GETROUTE handlers.
The requested table id may or may not exist for all of those families. commit ae677bbb4441 ("net: Don't return invalid table id error when dumping all families") fixed the problem when userspace explicitly requests a PF_UNSPEC dump, but missed the fallback case.
For example, when we pass ipv6.disable=1 to a kernel with CONFIG_IP_MROUTE=y and CONFIG_IP_MROUTE_MULTIPLE_TABLES=y, the (PF_INET6, RTM_GETROUTE) handler isn't registered, so we end up in rtnl_dump_all, and listing IPv6 routes will unexpectedly print:
# ip -6 r Error: ipv4: MR table does not exist. Dump terminated
commit ae677bbb4441 introduced the dump_all_families variable, which gets set when userspace requests a PF_UNSPEC dump. However, we can't simply set the family to PF_UNSPEC in rtnetlink_rcv_msg in the fallback case to get dump_all_families == true, because some messages types (for example RTM_GETRULE and RTM_GETNEIGH) only register the PF_UNSPEC handler and use the family to filter in the kernel what is dumped to userspace. We would then export more entries, that userspace would have to filter. iproute does that, but other programs may not.
Instead, this patch removes dump_all_families and updates the RTM_GETROUTE handlers to check if the family that is being dumped is their own. When it's not, which covers both the intentional PF_UNSPEC dumps (as dump_all_families did) and the fallback case, ignore the missing table id error.
Fixes: cb167893f41e ("net: Plumb support for filtering ipv4 and ipv6 multicast route dumps") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.42, v5.4.41, v5.4.40, v5.4.39, v5.4.38, v5.4.37, v5.4.36, v5.4.35, v5.4.34, v5.4.33, v5.4.32, v5.4.31, v5.4.30, v5.4.29, v5.6 |
|
#
faee6769 |
| 27-Mar-2020 |
Alexander Aring <alex.aring@gmail.com> |
net: add net available in build_state
The build_state callback of lwtunnel doesn't contain the net namespace structure yet. This patch will add it so we can check on specific address configuration a
net: add net available in build_state
The build_state callback of lwtunnel doesn't contain the net namespace structure yet. This patch will add it so we can check on specific address configuration at creation time of rpl source routes.
Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.28, v5.4.27, v5.4.26, v5.4.25, v5.4.24 |
|
#
a5311060 |
| 02-Mar-2020 |
Gustavo A. R. Silva <gustavo@embeddedor.com> |
net: ip_fib: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to decla
net: ip_fib: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99:
struct foo { int stuff; struct boo array[]; };
By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this change:
"Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.23, v5.4.22, v5.4.21, v5.4.20, v5.4.19, v5.4.18, v5.4.17, v5.4.16, v5.5, v5.4.15, v5.4.14, v5.4.13, v5.4.12 |
|
#
90b93f1b |
| 14-Jan-2020 |
Ido Schimmel <idosch@mellanox.com> |
ipv4: Add "offload" and "trap" indications to routes
When performing L3 offload, routes and nexthops are usually programmed into two different tables in the underlying device. Therefore, the fact th
ipv4: Add "offload" and "trap" indications to routes
When performing L3 offload, routes and nexthops are usually programmed into two different tables in the underlying device. Therefore, the fact that a nexthop resides in hardware does not necessarily mean that all the associated routes also reside in hardware and vice-versa.
While the kernel can signal to user space the presence of a nexthop in hardware (via 'RTNH_F_OFFLOAD'), it does not have a corresponding flag for routes. In addition, the fact that a route resides in hardware does not necessarily mean that the traffic is offloaded. For example, unreachable routes (i.e., 'RTN_UNREACHABLE') are programmed to trap packets to the CPU so that the kernel will be able to generate the appropriate ICMP error packet.
This patch adds an "offload" and "trap" indications to IPv4 routes, so that users will have better visibility into the offload process.
'struct fib_alias' is extended with two new fields that indicate if the route resides in hardware or not and if it is offloading traffic from the kernel or trapping packets to it. Note that the new fields are added in the 6 bytes hole and therefore the struct still fits in a single cache line [1].
Capable drivers are expected to invoke fib_alias_hw_flags_set() with the route's key in order to set the flags.
The indications are dumped to user space via a new flags (i.e., 'RTM_F_OFFLOAD' and 'RTM_F_TRAP') in the 'rtm_flags' field in the ancillary header.
v2: * Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set()
[1] struct fib_alias { struct hlist_node fa_list; /* 0 16 */ struct fib_info * fa_info; /* 16 8 */ u8 fa_tos; /* 24 1 */ u8 fa_type; /* 25 1 */ u8 fa_state; /* 26 1 */ u8 fa_slen; /* 27 1 */ u32 tb_id; /* 28 4 */ s16 fa_default; /* 32 2 */ u8 offload:1; /* 34: 0 1 */ u8 trap:1; /* 34: 1 1 */ u8 unused:6; /* 34: 2 1 */
/* XXX 5 bytes hole, try to pack */
struct callback_head rcu __attribute__((__aligned__(8))); /* 40 16 */
/* size: 56, cachelines: 1, members: 12 */ /* sum members: 50, holes: 1, sum holes: 5 */ /* sum bitfield members: 8 bits (1 bytes) */ /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */ /* last cacheline: 56 bytes */ } __attribute__((__aligned__(8)));
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
1e301fd0 |
| 14-Jan-2020 |
Ido Schimmel <idosch@mellanox.com> |
ipv4: Encapsulate function arguments in a struct
fib_dump_info() is used to prepare RTM_{NEW,DEL}ROUTE netlink messages using the passed arguments. Currently, the function takes 11 arguments, 6 of w
ipv4: Encapsulate function arguments in a struct
fib_dump_info() is used to prepare RTM_{NEW,DEL}ROUTE netlink messages using the passed arguments. Currently, the function takes 11 arguments, 6 of which are attributes of the route being dumped (e.g., prefix, TOS).
The next patch will need the function to also dump to user space an indication if the route is present in hardware or not. Instead of passing yet another argument, change the function to take a struct containing the different route attributes.
v2: * Name last argument of fib_dump_info() * Move 'struct fib_rt_info' to include/net/ip_fib.h so that it could later be passed to fib_alias_hw_flags_set()
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.4.11, v5.4.10, v5.4.9, v5.4.8, v5.4.7, v5.4.6, v5.4.5, v5.4.4, v5.4.3, v5.3.15, v5.4.2, v5.4.1, v5.3.14, v5.4, v5.3.13, v5.3.12 |
|
#
c43c3d76 |
| 20-Nov-2019 |
Paolo Abeni <pabeni@redhat.com> |
ipv4: move fib4_has_custom_rules() helper to public header
So that we can use it in the next patch. Additionally constify the helper argument.
Suggested-by: David Ahern <dsahern@gmail.com> Signed-o
ipv4: move fib4_has_custom_rules() helper to public header
So that we can use it in the next patch. Additionally constify the helper argument.
Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.3.11, v5.3.10, v5.3.9, v5.3.8, v5.3.7, v5.3.6, v5.3.5, v5.3.4, v5.3.3 |
|
#
b7a59557 |
| 03-Oct-2019 |
Jiri Pirko <jiri@mellanox.com> |
net: fib_notifier: propagate extack down to the notifier block callback
Since errors are propagated all the way up to the caller, propagate possible extack of the caller all the way down to the noti
net: fib_notifier: propagate extack down to the notifier block callback
Since errors are propagated all the way up to the caller, propagate possible extack of the caller all the way down to the notifier block callback.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
55c894f7 |
| 03-Oct-2019 |
Jiri Pirko <jiri@mellanox.com> |
net: fib_notifier: propagate possible error during fib notifier registration
Unlike events for registered notifier, during the registration, the errors that happened for the block being registered a
net: fib_notifier: propagate possible error during fib notifier registration
Unlike events for registered notifier, during the registration, the errors that happened for the block being registered are not propagated up to the caller. Make sure the error is propagated for FIB rules and entries.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
#
7c550daf |
| 03-Oct-2019 |
Jiri Pirko <jiri@mellanox.com> |
net: fib_notifier: make FIB notifier per-netns
Currently all users of FIB notifier only cares about events in init_net. Later in this patchset, users get interested in other namespaces too. However,
net: fib_notifier: make FIB notifier per-netns
Currently all users of FIB notifier only cares about events in init_net. Later in this patchset, users get interested in other namespaces too. However, for every registered block user is interested only about one namespace. Make the FIB notifier registration per-netns and avoid unnecessary calls of notifier block for other namespaces.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.3.2, v5.3.1, v5.3, v5.2.14, v5.3-rc8, v5.2.13, v5.2.12 |
|
#
7bdf4de1 |
| 04-Sep-2019 |
Donald Sharp <sharpd@cumulusnetworks.com> |
net: Properly update v4 routes with v6 nexthop
When creating a v4 route that uses a v6 nexthop from a nexthop group. Allow the kernel to properly send the nexthop as v6 via the RTA_VIA attribute.
B
net: Properly update v4 routes with v6 nexthop
When creating a v4 route that uses a v6 nexthop from a nexthop group. Allow the kernel to properly send the nexthop as v6 via the RTA_VIA attribute.
Broken behavior:
$ ip nexthop add via fe80::9 dev eth0 $ ip nexthop show id 1 via fe80::9 dev eth0 scope link $ ip route add 4.5.6.7/32 nhid 1 $ ip route show default via 10.0.2.2 dev eth0 4.5.6.7 nhid 1 via 254.128.0.0 dev eth0 10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15 $
Fixed behavior:
$ ip nexthop add via fe80::9 dev eth0 $ ip nexthop show id 1 via fe80::9 dev eth0 scope link $ ip route add 4.5.6.7/32 nhid 1 $ ip route show default via 10.0.2.2 dev eth0 4.5.6.7 nhid 1 via inet6 fe80::9 dev eth0 10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15 $
v2, v3: Addresses code review comments from David Ahern
Fixes: dcb1ecb50edf (“ipv4: Prepare for fib6_nh from a nexthop object”) Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|
Revision tags: v5.2.11, v5.2.10, v5.2.9, v5.2.8, v5.2.7, v5.2.6, v5.2.5, v5.2.4, v5.2.3, v5.2.2, v5.2.1, v5.2, v5.1.16, v5.1.15, v5.1.14, v5.1.13 |
|
#
564c91f7 |
| 21-Jun-2019 |
Stefano Brivio <sbrivio@redhat.com> |
fib_frontend, ip6_fib: Select routes or exceptions dump from RTM_F_CLONED
The following patches add back the ability to dump IPv4 and IPv6 exception routes, and we need to allow selection of regular
fib_frontend, ip6_fib: Select routes or exceptions dump from RTM_F_CLONED
The following patches add back the ability to dump IPv4 and IPv6 exception routes, and we need to allow selection of regular routes or exceptions.
Use RTM_F_CLONED as filter to decide whether to dump routes or exceptions: iproute2 passes it in dump requests (except for IPv6 cache flush requests, this will be fixed in iproute2) and this used to work as long as exceptions were stored directly in the FIB, for both IPv4 and IPv6.
Caveat: if strict checking is not requested (that is, if the dump request doesn't go through ip_valid_fib_dump_req()), we can't filter on protocol, tables or route types.
In this case, filtering on RTM_F_CLONED would be inconsistent: we would fix 'ip route list cache' by returning exception routes and at the same time introduce another bug in case another selector is present, e.g. on 'ip route list cache table main' we would return all exception routes, without filtering on tables.
Keep this consistent by applying no filters at all, and dumping both routes and exceptions, if strict checking is not requested. iproute2 currently filters results anyway, and no unwanted results will be presented to the user. The kernel will just dump more data than needed.
v7: No changes
v6: Rebase onto net-next, no changes
v5: New patch: add dump_routes and dump_exceptions flags in filter and simply clear the unwanted one if strict checking is enabled, don't ignore NLM_F_MATCH and don't set filter_set if NLM_F_MATCH is set. Skip filtering altogether if no strict checking is requested: selecting routes or exceptions only would be inconsistent with the fact we can't filter on tables.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
show more ...
|