#
c1048aff |
| 10-Apr-2017 |
Erez Shitrit <erezsh@mellanox.com> |
IB/IPoIB: Use defined function for netdev_priv function
Make ipoib_priv point to netdev_priv where the code calls netdev_priv.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
7ce1a3ee |
| 10-Apr-2017 |
Erez Shitrit <erezsh@mellanox.com> |
IB/IPoIB: Separate control from HW operation on ipoib_open/stop ndo
This patch prepares the netdev part of the IPoIB driver to be able to use the ipoib_options.
It deals with the two flows from the .ndo: ipoib_open and ipoib_stop.
The code is rearranged as follows:
* All operations which deal with the hardware resources (for example change QP state, post-receive etc.) are performed in one place.
* All operations that are control oriented (like restart multicast task, start the reap_ah etc.) are performed in a separate place.
The functions that deal with the hardware resources are now located in __ipoib_ib_dev_open for the ipoib_open flow and __ipoib_ib_dev_stop for the ipoib_stop flow.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
515ed4f3 |
| 10-Apr-2017 |
Erez Shitrit <erezsh@mellanox.com> |
IB/IPoIB: Separate control and data related initializations
This patch prepares init and teardown flows so we can call them through ipoib_options function pointers.
It arranges that area of code as follows:
* All operations which deal with the resource allocation/deletion are performed in one place.
* All operations that are control oriented, meaning that they are not connected to a specific hardware, are performed in a separate place.
The operations for allocation of hardware resources are now in the function ipoib_dev_init_default, and the deletion of all the resources is in ipoib_dev_uninit_default.
The only exception is the creation of the PD object, which is used both for resource allocation (create QP etc.) and for control flows like creating AH.
It also does:
* Move creation of rx_ring and tx_ring to be in the resources allocation area.
* Move the function ipoib_ib_dev_open that does the open device to the control area instead of the dev_init which creates resources.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
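The idea of reaching init and teardown through function pointers can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct ops_sketch`, `struct dev_sketch`, `default_ops_sketch` and every function below are hypothetical names, not the driver's actual ipoib_options layout. Only the split between resource allocation and control-oriented steps comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

struct ops_sketch;

/* Hypothetical stand-in for the per-device private data. */
struct dev_sketch {
    int rings_allocated;            /* set by the resource-allocation step */
    int control_started;            /* set by the control-oriented step */
    const struct ops_sketch *ops;   /* resource init/teardown callbacks */
};

/* Hypothetical ops table: resource handling is reached via pointers. */
struct ops_sketch {
    int  (*init)(struct dev_sketch *dev);
    void (*uninit)(struct dev_sketch *dev);
};

/* Resource area: allocate rings (modelled here as a flag). */
static int dev_init_default_sketch(struct dev_sketch *dev)
{
    dev->rings_allocated = 1;
    return 0;
}

/* Resource area: release everything the init step allocated. */
static void dev_uninit_default_sketch(struct dev_sketch *dev)
{
    dev->rings_allocated = 0;
}

const struct ops_sketch default_ops_sketch = {
    .init = dev_init_default_sketch,
    .uninit = dev_uninit_default_sketch,
};

/* Control area: pick the ops table, run resource init, then start control work. */
int dev_setup_sketch(struct dev_sketch *dev, const struct ops_sketch *ops)
{
    assert(dev != NULL && ops != NULL && ops->init != NULL);
    dev->ops = ops;
    int err = dev->ops->init(dev);
    if (err)
        return err;
    dev->control_started = 1;
    return 0;
}

/* Control area: stop control work first, then release resources. */
void dev_teardown_sketch(struct dev_sketch *dev)
{
    dev->control_started = 0;
    dev->ops->uninit(dev);
}
```

A different ops table could be passed to `dev_setup_sketch` without touching the control code, which is the point of the separation.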
|
#
771a5258 |
| 29-Mar-2017 |
Shamir Rabinovitch <shamir.rabinovitch@oracle.com> |
IB/IPoIB: ibX: failed to create mcg debug file
When udev renames the netdev devices, ipoib debugfs entries do not get renamed. As a result, if a subsequent probe of an ipoib device reuses the name, then creating a debugfs entry for the new device would fail.
Also, move ipoib_create_debug_files and ipoib_delete_debug_files into the ipoib event handling in order to avoid any race condition between them.
Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs) Cc: stable@vger.kernel.org # 2.6.15+ Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
Revision tags: v4.10.4, v4.10.3, v4.10.2, v4.10.1, v4.10 |
|
#
2b084176 |
| 01-Feb-2017 |
Erez Shitrit <erezsh@mellanox.com> |
IB/IPoIB: Add destination address when re-queue packet
When sending a packet to a destination that was not yet resolved via path query, the driver keeps the skb and tries to re-send it when the path is resolved.
But when re-sending via dev_queue_xmit the kernel doesn't call dev_hard_header, so IPoIB needs to keep 20 bytes in the skb and put the destination address inside them.
In that way the dev_start_xmit will have the correct destination, and the driver won't take the destination from skb->data while nothing exists there, which causes the packet to be dropped.
The test flow is:
1. Run the SM on a remote node.
2. Restart the driver.
3. Ping some destination.
4. Observe that the first ICMP request will be dropped.
Fixes: fc791b633515 ("IB/ipoib: move back IB LL address into the hard header") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
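The "keep 20 bytes and put the destination address inside them" step can be pictured with a much simplified packet buffer. This is a hedged userspace sketch, not skb code: `struct pkt_sketch` and both functions are hypothetical, and the only detail taken from the commit message is the 20-byte destination address placed in front of the payload before the packet is queued again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LL_ADDR_LEN 20   /* link-layer address length named in the text */

/* Hypothetical packet buffer with headroom reserved at the front. */
struct pkt_sketch {
    unsigned char buf[128];
    size_t data_off;   /* offset of the first valid byte in buf */
    size_t len;        /* number of valid bytes starting at data_off */
};

/* Store the payload and leave LL_ADDR_LEN bytes of headroom before it. */
void pkt_init_sketch(struct pkt_sketch *p, const void *payload, size_t n)
{
    assert(p != NULL && n <= sizeof(p->buf) - LL_ADDR_LEN);
    p->data_off = LL_ADDR_LEN;
    p->len = n;
    memcpy(p->buf + p->data_off, payload, n);
}

/* Prepend the destination address so the transmit path finds it in the data.
 * Returns -1 when there is no headroom left (for example, already pushed). */
int pkt_push_daddr_sketch(struct pkt_sketch *p,
                          const unsigned char daddr[LL_ADDR_LEN])
{
    if (p->data_off < LL_ADDR_LEN)
        return -1;
    p->data_off -= LL_ADDR_LEN;
    p->len += LL_ADDR_LEN;
    memcpy(p->buf + p->data_off, daddr, LL_ADDR_LEN);
    return 0;
}
```

With the address pushed, a consumer that reads the destination from the start of the data sees a valid address instead of payload bytes.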
|
#
5c37077f |
| 18-Jan-2017 |
Zhu Yanjun <yanjun.zhu@oracle.com> |
IB/ipoib: Remove the unnecessary error check
The functions ipoib_mcast_start_thread and ipoib_ib_dev_up always return zero. As such, in the function ipoib_open, err_stop will never be reached. So remove this err_stop and change the return type of ipoib_mcast_start_thread and ipoib_ib_dev_up to void.
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
f7534f45 |
| 05-Jan-2017 |
Zhu Yanjun <yanjun.zhu@oracle.com> |
IB/ipoib: Remove unnecessary returned value check
In the function ipoib_set_dev_features, the returned value is always 0. As such, it is not necessary to check the returned value. This is not a bug. It is a trivial problem.
Reviewed-by: Guanglei Li <guanglei.li@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
db97ed0a |
| 20-Jan-2017 |
Bart Van Assche <bart.vanassche@sandisk.com> |
IB/IPoIB: Switch from dma_device to dev.parent
Prepare for removal of ib_device.dma_device.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
c586071d |
| 28-Dec-2016 |
Feras Daoud <ferasda@mellanox.com> |
IB/ipoib: Replace list_del of the neigh->list with list_del_init
In order to resolve a situation where several processes delete the same list element in sequence and cause a panic, list_del is replaced with list_del_init. In this case, if the first process that calls list_del releases the lock before acquiring it again, other processes that can acquire the lock will call list_del_init.
Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
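Why a second deletion becomes harmless with list_del_init can be shown with a tiny userspace re-implementation of a circular doubly linked list. The names below carry a `_sketch` suffix because they are illustrative stand-ins and not the kernel's list.h. The kernel's list_del leaves the removed entry with poisoned pointers, so unlinking it again dereferences garbage, whereas an entry re-initialised to point at itself can be unlinked any number of times.

```c
#include <assert.h>
#include <stddef.h>

struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

/* An empty list (or a detached entry) points at itself. */
void list_init_sketch(struct list_head_sketch *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert n right after head. */
void list_add_sketch(struct list_head_sketch *n, struct list_head_sketch *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink e and leave it self-linked, so a repeated call is a no-op. */
void list_del_init_sketch(struct list_head_sketch *e)
{
    assert(e != NULL);
    e->prev->next = e->next;
    e->next->prev = e->prev;
    list_init_sketch(e);
}

int list_empty_sketch(const struct list_head_sketch *h)
{
    return h->next == h;
}
```

Calling `list_del_init_sketch` twice on the same entry only rewrites the entry's own pointers the second time, which mirrors the situation the commit describes.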
|
#
d32b9a81 |
| 28-Dec-2016 |
Feras Daoud <ferasda@mellanox.com> |
IB/ipoib: Add detailed error message to dev_queue_xmit call
Add a detailed return code to the dev_queue_xmit call when requeueing a packet via __skb_dequeue.
Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
0a0007f2 |
| 28-Dec-2016 |
Feras Daoud <ferasda@mellanox.com> |
IB/ipoib: Fix deadlock between rmmod and set_mode
When calling set_mode from sysfs, the call flow takes the sysfs lock first and then tries to take the rtnl_lock (when calling ipoib_set_mode). On the other hand, the rmmod call flow takes the rtnl_lock first (when calling unregister_netdev) and then tries to take the sysfs lock. Deadlock a->b, b->a.
The problem starts when ipoib_set_mode frees its rtnl_lock and tries to get it after that.
set_mod:
[<ffffffff8104f2bd>] ? check_preempt_curr+0x6d/0x90
[<ffffffff814fee8e>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff81448655>] ? __rtnl_unlock+0x15/0x20
[<ffffffff814fed2b>] mutex_lock+0x2b/0x50
[<ffffffff81448675>] rtnl_lock+0x15/0x20
[<ffffffffa02ad807>] ipoib_set_mode+0x97/0x160 [ib_ipoib]
[<ffffffffa02b5f5b>] set_mode+0x3b/0x80 [ib_ipoib]
[<ffffffff8134b840>] dev_attr_store+0x20/0x30
[<ffffffff811f0fe5>] sysfs_write_file+0xe5/0x170
[<ffffffff8117b068>] vfs_write+0xb8/0x1a0
[<ffffffff8117ba81>] sys_write+0x51/0x90
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
rmmod:
[<ffffffff81279ffc>] ? put_dec+0x10c/0x110
[<ffffffff8127a2ee>] ? number+0x2ee/0x320
[<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
[<ffffffff8127cc04>] ? vsnprintf+0x484/0x5f0
[<ffffffff8127b550>] ? string+0x40/0x100
[<ffffffff814fe323>] wait_for_common+0x123/0x180
[<ffffffff81060250>] ? default_wake_function+0x0/0x20
[<ffffffff8119661e>] ? ifind_fast+0x5e/0xb0
[<ffffffff814fe43d>] wait_for_completion+0x1d/0x20
[<ffffffff811f2e68>] sysfs_addrm_finish+0x228/0x270
[<ffffffff811f2fb3>] sysfs_remove_dir+0xa3/0xf0
[<ffffffff81273f66>] kobject_del+0x16/0x40
[<ffffffff8134cd14>] device_del+0x184/0x1e0
[<ffffffff8144e59b>] netdev_unregister_kobject+0xab/0xc0
[<ffffffff8143c05e>] rollback_registered+0xae/0x130
[<ffffffff8143c102>] unregister_netdevice+0x22/0x70
[<ffffffff8143c16e>] unregister_netdev+0x1e/0x30
[<ffffffffa02a91b0>] ipoib_remove_one+0xe0/0x120 [ib_ipoib]
[<ffffffffa01ed95f>] ib_unregister_device+0x4f/0x100 [ib_core]
[<ffffffffa021f5e1>] mlx4_ib_remove+0x41/0x180 [mlx4_ib]
[<ffffffffa01ab771>] mlx4_remove_device+0x71/0x90 [mlx4_core]
Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks") Cc: <stable@vger.kernel.org> # v3.6+ Cc: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
80b5b35a |
| 28-Dec-2016 |
Feras Daoud <ferasda@mellanox.com> |
IB/ipoib: Set device connection mode only when needed
When changing the connection mode, the ipoib_set_mode function did not check whether the previous connection mode equals the new one. This commit adds the required check and returns 0 if the new mode equals the previous one.
Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support") Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
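The early return described above amounts to a single comparison before any reconfiguration work. The sketch below is a userspace illustration with hypothetical names (`struct dev_mode_sketch`, `set_mode_sketch`, `reconfig_count`). The accepted strings "datagram\n" and "connected\n" and the -1 error value are assumptions made for the example, not quotations from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum mode_sketch { MODE_DATAGRAM, MODE_CONNECTED };

/* Hypothetical device state; reconfig_count counts expensive mode switches. */
struct dev_mode_sketch {
    enum mode_sketch mode;
    int reconfig_count;
};

/* Returns 0 on success (including "nothing to do"), -1 on an unknown mode. */
int set_mode_sketch(struct dev_mode_sketch *dev, const char *buf)
{
    enum mode_sketch want;

    assert(dev != NULL && buf != NULL);
    if (strcmp(buf, "connected\n") == 0)
        want = MODE_CONNECTED;
    else if (strcmp(buf, "datagram\n") == 0)
        want = MODE_DATAGRAM;
    else
        return -1;

    /* The added check: same mode as before, so skip the reconfiguration. */
    if (dev->mode == want)
        return 0;

    dev->mode = want;
    dev->reconfig_count++;   /* stands in for the real mode-switch work */
    return 0;
}
```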
|
#
29da686d |
| 28-Dec-2016 |
Feras Daoud <ferasda@mellanox.com> |
IB/ipoib: When given an invalid UD MTU, give debug msg
In datagram mode, the IB UD (Unreliable Datagram) transport is used so the MTU of the interface is equal to the IB L2 MTU minus the IPoIB encapsulation header. Any request to change the MTU value above the maximum range will change the MTU to the max allowed, but will not show any warning message. An ipoib_warn is issued in such cases, letting the user know that even though the value is legal, it can't be currently applied.
Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
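The datagram-mode rule (interface MTU equals the IB L2 MTU minus the encapsulation header, with larger requests clamped and reported) can be sketched as follows. This is a userspace illustration only: `ud_change_mtu_sketch` is a hypothetical function, `fprintf` stands in for ipoib_warn, and the 4-byte header size is stated here as an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define ENCAP_LEN_SKETCH 4   /* assumed IPoIB encapsulation header size */

/* Apply a requested MTU in datagram mode.
 * Writes the MTU actually applied to *applied.
 * Returns 1 when the request was clamped (and a warning printed), else 0. */
int ud_change_mtu_sketch(int ib_l2_mtu, int requested, int *applied)
{
    assert(applied != NULL);
    int max_mtu = ib_l2_mtu - ENCAP_LEN_SKETCH;

    if (requested > max_mtu) {
        /* Stand-in for the warning the commit adds. */
        fprintf(stderr, "MTU %d exceeds the current maximum, using %d\n",
                requested, max_mtu);
        *applied = max_mtu;
        return 1;
    }
    *applied = requested;
    return 0;
}
```

For an IB L2 MTU of 2048 the sketch caps the interface MTU at 2044, so a request for 4092 is applied as 2044 and reported.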
|
Revision tags: v4.9, openbmc-4.4-20161121-1, v4.4.33, v4.4.32, v4.4.31 |
|
#
74226649 |
| 03-Nov-2016 |
Leon Romanovsky <leon@kernel.org> |
IB/ipoib: Remove and fix debug prints after allocation failure
The prints after [k|v][m|z|c]alloc() functions are not needed, because in case of failure the allocator will print its own internal error messages anyway.
Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
Revision tags: v4.4.30, v4.4.29, v4.4.28, v4.4.27, v4.7.10, openbmc-4.4-20161021-1 |
|
#
b3e3893e |
| 20-Oct-2016 |
Jarod Wilson <jarod@redhat.com> |
net: use core MTU range checking in misc drivers
firewire-net: - set min/max_mtu - remove fwnet_change_mtu
nes: - set max_mtu - clean up nes_netdev_change_mtu
xpnet: - set min/max_mtu - remove xpnet_dev_change_mtu
hippi: - set min/max_mtu - remove hippi_change_mtu
batman-adv: - set max_mtu - remove batadv_interface_change_mtu - initialization is a little async, not 100% certain that max_mtu is set in the optimal place, don't have hardware to test with
rionet: - set min/max_mtu - remove rionet_change_mtu
slip: - set min/max_mtu - streamline sl_change_mtu
um/net_kern: - remove pointless ndo_change_mtu
hsi/clients/ssi_protocol: - use core MTU range checking - remove now redundant ssip_pn_set_mtu
ipoib: - set a default max MTU value - Note: ipoib's actual max MTU can vary, depending on if the device is in connected mode or not, so we'll just set the max_mtu value to the max possible, and let the ndo_change_mtu function continue to validate any new MTU change requests with checks for CM or not. Note that ipoib has no min_mtu set, and thus, the network core's mtu > 0 check is the only lower bounds here.
mptlan: - use net core MTU range checking - remove now redundant mpt_lan_change_mtu
fddi: - min_mtu = 21, max_mtu = 4470 - remove now redundant fddi_change_mtu (including export)
fjes: - min_mtu = 8192, max_mtu = 65536 - The max_mtu value is actually one over IP_MAX_MTU here, but the idea is to get past the core net MTU range checks so fjes_change_mtu can validate a new MTU against what it supports (see fjes_support_mtu in fjes_hw.c)
hsr: - min_mtu = 0 (calls ether_setup, max_mtu is 1500)
f_phonet: - min_mtu = 6, max_mtu = 65541
u_ether: - min_mtu = 14, max_mtu = 15412
phonet/pep-gprs: - min_mtu = 576, max_mtu = 65530 - remove redundant gprs_set_mtu
CC: netdev@vger.kernel.org CC: linux-rdma@vger.kernel.org CC: Stefan Richter <stefanr@s5r6.in-berlin.de> CC: Faisal Latif <faisal.latif@intel.com> CC: linux-rdma@vger.kernel.org CC: Cliff Whickman <cpw@sgi.com> CC: Robin Holt <robinmholt@gmail.com> CC: Jes Sorensen <jes@trained-monkey.org> CC: Marek Lindner <mareklindner@neomailbox.ch> CC: Simon Wunderlich <sw@simonwunderlich.de> CC: Antonio Quartulli <a@unstable.cc> CC: Sathya Prakash <sathya.prakash@broadcom.com> CC: Chaitra P B <chaitra.basappa@broadcom.com> CC: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com> CC: MPT-FusionLinux.pdl@broadcom.com CC: Sebastian Reichel <sre@kernel.org> CC: Felipe Balbi <balbi@kernel.org> CC: Arvid Brodin <arvid.brodin@alten.se> CC: Remi Denis-Courmont <courmisch@gmail.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
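The division of labour described for ipoib (a permissive max_mtu in the core range check, with the driver's own callback still doing the mode-dependent validation) can be sketched like this. Everything here is a hypothetical userspace illustration: `struct netdev_mtu_sketch`, `change_mtu_sketch` and `datagram_check_sketch` are invented names, and the numeric limits used in the example are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of a net device: the range fields plus current MTU. */
struct netdev_mtu_sketch {
    unsigned int min_mtu;   /* 0 means no lower bound beyond "not negative" */
    unsigned int max_mtu;   /* 0 means no upper bound set */
    unsigned int mtu;
};

/* Core-style range check first, then an optional driver-specific check.
 * Returns 0 and updates dev->mtu on success, -1 otherwise. */
int change_mtu_sketch(struct netdev_mtu_sketch *dev, int new_mtu,
                      int (*driver_check)(int new_mtu))
{
    assert(dev != NULL);
    if (new_mtu < 0 || (unsigned int)new_mtu < dev->min_mtu)
        return -1;
    if (dev->max_mtu > 0 && (unsigned int)new_mtu > dev->max_mtu)
        return -1;
    if (driver_check != NULL && driver_check(new_mtu) != 0)
        return -1;
    dev->mtu = (unsigned int)new_mtu;
    return 0;
}

/* Example driver check: pretend the device is in datagram mode with a
 * 2044-byte limit (an assumed value for illustration). */
int datagram_check_sketch(int new_mtu)
{
    return new_mtu <= 2044 ? 0 : -1;
}
```

The core check alone lets anything up to max_mtu through, and the driver callback narrows that down according to the current mode.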
|
Revision tags: v4.7.9, v4.4.26 |
|
#
e0e79c8e |
| 17-Oct-2016 |
David Ahern <dsa@cumulusnetworks.com> |
IB/ipoib: Flip to new dev walk API
Convert ipoib_get_net_dev_match_addr to the new upper device walk API. This is just a code conversion; no functional change is intended.
v2 - removed typecast of data
Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
Revision tags: v4.7.8, v4.4.25 |
|
#
fc791b63 |
| 13-Oct-2016 |
Paolo Abeni <pabeni@redhat.com> |
IB/ipoib: move back IB LL address into the hard header
After the commit 9207f9d45b0a ("net: preserve IP control block during GSO segmentation"), the GSO CB and the IPoIB CB conflict. That destroys the IPoIB address information cached there, causing a severe performance regression, as better described here:
http://marc.info/?l=linux-kernel&m=146787279825501&w=2
This change moves the data cached by the IPoIB driver from the skb control block into the IPoIB hard header, as done before the commit 936d7de3d736 ("IPoIB: Stop lying about hard_header_len and use skb->cb to stash LL addresses"). In order to avoid GRO issues, on packet reception the IPoIB driver stashes a dummy pseudo header into the skb, so that the received packets actually have a hard header matching the declared length. To avoid changing the connected mode maximum MTU, the allocated head buffer size is increased by the pseudo header length.
After this commit, IPoIB performance is back to its pre-regression value.
v2 -> v3: rebased
v1 -> v2: avoid changing the max mtu, increasing the head buf size
Fixes: 9207f9d45b0a ("net: preserve IP control block during GSO segmentation") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
|
Revision tags: v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22, v4.4.21, v4.7.4, v4.7.3, v4.4.20, v4.7.2, v4.4.19, openbmc-4.4-20160819-1, v4.7.1, v4.4.18 |
|
#
855cda68 |
| 15-Aug-2016 |
Bhaktipriya Shridhar <bhaktipriya96@gmail.com> |
IB/ipoib: Remove deprecated create_singlethread_workqueue
alloc_ordered_workqueue() replaces deprecated create_singlethread_workqueue().
The workqueue "ipoib_workqueue" is used for all flush operations for the device.
WQ_MEM_RECLAIM has been set since the flush operations may need to complete in order for other network functions to continue, and the memory reclaim operation might need the network functioning in order to make progress.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
546481c2 |
| 28-Aug-2016 |
Erez Shitrit <erezsh@mellanox.com> |
IB/ipoib: Fix memory corruption in ipoib cm mode connect flow
When a new CM connection is being requested, the ipoib driver copies data from the path pointer in the CM/tx object. The path object might be invalid at that point, and memory corruption will happen later when the CM driver tries to use that data.
The following scenario demonstrates it:
neigh_add_path --> ipoib_cm_create_tx --> queue_work (pointer to path is in the cm/tx struct)
#while the work is still in the queue,
#the port goes down and causes the ipoib_flush_paths:
ipoib_flush_paths --> path_free --> kfree(path)
#at this point the work scheduled starts.
ipoib_cm_tx_start --> copy from the (invalid)path pointer:
(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
-> memory corruption.
To fix that the driver now starts the CM/tx connection only if that specific path exists in the general paths database. This check is protected with the relevant locks, and uses the gid from the neigh member in the CM/tx object which is valid according to the ref count that was taken by the CM/tx.
Fixes: 839fcaba35 ('IPoIB: Connected mode experimental support') Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
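The fix replaces "trust the stored pointer" with "look the path up again by GID and copy only if it is still there". A hedged userspace sketch of that lookup follows. The fixed-size table, `struct path_sketch`, the integer `pathrec` and all function names are hypothetical simplifications, and the locking that the commit mentions is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GID_LEN_SKETCH 16
#define DB_SIZE_SKETCH 4

/* Hypothetical path entry; pathrec is reduced to one integer. */
struct path_sketch {
    unsigned char dgid[GID_LEN_SKETCH];
    int pathrec;
    int valid;   /* cleared when the path is flushed */
};

/* Hypothetical general paths database. */
struct path_db_sketch {
    struct path_sketch entries[DB_SIZE_SKETCH];
};

const struct path_sketch *path_find_sketch(const struct path_db_sketch *db,
                                           const unsigned char *gid)
{
    for (size_t i = 0; i < DB_SIZE_SKETCH; i++) {
        const struct path_sketch *p = &db->entries[i];
        if (p->valid && memcmp(p->dgid, gid, GID_LEN_SKETCH) == 0)
            return p;
    }
    return NULL;
}

/* Start the connection only if the path for the neighbour's GID still
 * exists. The real driver would do this while holding the relevant locks. */
int cm_tx_start_sketch(const struct path_db_sketch *db,
                       const unsigned char *neigh_gid, int *pathrec_out)
{
    assert(db != NULL && neigh_gid != NULL && pathrec_out != NULL);
    const struct path_sketch *p = path_find_sketch(db, neigh_gid);
    if (p == NULL)
        return -1;            /* path was flushed: do not touch stale data */
    *pathrec_out = p->pathrec;
    return 0;
}
```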
|
Revision tags: v4.4.17, openbmc-4.4-20160804-1, v4.4.16, v4.7, openbmc-4.4-20160722-1, openbmc-20160722-1 |
|
#
5faba546 |
| 20-Jul-2016 |
Yuval Shaia <yuval.shaia@oracle.com> |
IB/ipoib: Report SG feature regardless of HW UD CSUM capability
Decouple SG support from HW ability to do UD checksum. This coupling is for historical reasons and removed with 'commit ec5f06156423 ("net: Kill link between CSUM and SG features.")'
During driver load it is assumed that the device does not support SG. The final decision is taken after creating the UD QP, based on device capability.
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
Revision tags: openbmc-20160713-1, v4.4.15, v4.6.4, v4.6.3, v4.4.14, v4.6.2, v4.4.13, openbmc-20160606-1 |
|
#
61c78eea |
| 04-Jun-2016 |
Erez Shitrit <erezsh@mellanox.com> |
IB/IPoIB: Don't update neigh validity for unresolved entries
ipoib_neigh_get unconditionally updates the "alive" variable member on any packet send. This prevents the neighbor garbage collection from cleaning out a dead neighbor entry if we are still queueing packets for it. If the queue for this neighbor is full, then don't update the alive timestamp. That way the neighbor can time out even if packets are still being queued as long as none of them are being sent.
Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup in xmit path") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
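The rule "do not refresh the alive timestamp while the neighbour's queue is full" is small enough to sketch directly. The structure, the function names and the queue limit of 3 below are hypothetical choices for a userspace illustration, and time is modelled as a plain counter.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUE_SKETCH 3   /* assumed per-neighbour queue limit */

/* Hypothetical neighbour entry. */
struct neigh_sketch {
    unsigned long alive;      /* last time the entry was considered in use */
    unsigned int queue_len;   /* packets waiting for path resolution */
};

/* Called on every lookup for transmit: refresh the timestamp only when the
 * queue still has room, so an unresolved neighbour can eventually time out. */
void neigh_touch_sketch(struct neigh_sketch *n, unsigned long now)
{
    assert(n != NULL);
    if (n->queue_len >= MAX_QUEUE_SKETCH)
        return;
    n->alive = now;
}

/* Garbage-collection test: true when the entry has not been refreshed
 * within the timeout. */
int neigh_expired_sketch(const struct neigh_sketch *n, unsigned long now,
                         unsigned long timeout)
{
    return now - n->alive > timeout;
}
```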
|
#
9b29953b |
| 04-Jun-2016 |
Mark Bloch <markb@mellanox.com> |
IB/IPoIB: Disable bottom half when dealing with device address
Align locking usage when touching device address with rest of the kernel. Lock the bottom half when doing so using netif_addr_lock_bh.
This also solves the following case as reported by lockdep:
       CPU0                    CPU1
       ----                    ----
  lock(_xmit_INFINIBAND);
                               local_irq_disable();
                               lock(&(&mc->mca_lock)->rlock);
                               lock(_xmit_INFINIBAND);
  <Interrupt>
    lock(&(&mc->mca_lock)->rlock);
*** DEADLOCK ***
Fixes: 492a7e67ff83 ("IB/IPoIB: Allow setting the device address") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
#
198b12f7 |
| 04-Jun-2016 |
Erez Shitrit <erezsh@mellanox.com> |
IB/IPoIB: Fix race between ipoib_remove_one to sysfs functions
In ipoib_remove_one the driver holds the rtnl_lock and tries to perform operations such as dev_change_flags or unregister_netdev, while a sysfs callback such as ipoib_vlan_delete holds the sysfs mutex and tries to take the rtnl_lock via rtnl_trylock(), calling restart_syscall() if the lock is not free. Meanwhile ipoib_remove_one tries to get the sysfs lock in order to free its sysfs directory, and we get an a->b, b->a deadlock.
Trace like the following:
schedule+0x37/0x80
schedule_preempt_disabled+0xe/0x10
__mutex_lock_slowpath+0xb5/0x120
mutex_lock+0x23/0x40
rtnl_lock+0x15/0x20
netdev_run_todo+0x17c/0x320
rtnl_unlock+0xe/0x10
ipoib_vlan_delete+0x11b/0x1b0 [ib_ipoib]
delete_child+0x54/0x80 [ib_ipoib]
dev_attr_store+0x18/0x30
sysfs_kf_write+0x37/0x40
mutex_lock+0x16/0x40
SyS_write+0x55/0xc0
entry_SYSCALL_64_fastpath+0x16/0x75
And
schedule+0x37/0x80
__kernfs_remove+0x1a8/0x260
? wake_atomic_t_function+0x60/0x60
kernfs_remove+0x25/0x40
sysfs_remove_dir+0x50/0x80
kobject_del+0x18/0x50
device_del+0x19f/0x260
netdev_unregister_kobject+0x6a/0x80
rollback_registered_many+0x1fd/0x340
rollback_registered+0x3c/0x70
unregister_netdevice_queue+0x55/0xc0
unregister_netdev+0x20/0x30
ipoib_remove_one+0x114/0x1b0 [ib_ipoib]
ib_unregister_client+0x4a/0x170 [ib_core]
? find_module_all+0x71/0xa0
ipoib_cleanup_module+0x10/0x94 [ib_ipoib]
SyS_delete_module+0x1b5/0x210
entry_SYSCALL_64_fastpath+0x16/0x75
The fix checks the flag IPOIB_FLAG_INTF_ON_DESTROY in order to get out of the sysfs function.
Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks") Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
|
Revision tags: v4.6.1, v4.4.12, openbmc-20160521-1, v4.4.11 |
|
#
492a7e67 |
| 18-May-2016 |
Mark Bloch <markb@mellanox.com> |
IB/IPoIB: Allow setting the device address
In IB networks, and specifically in IPoIB/rdmacm traffic, the device address of an IPoIB interface is used as a means to exchange information between nodes needed for communication.
Currently an IPoIB interface will always be created with a device address based on its node GUID without a way to change that.
This change adds the ability to set the device address of an IPoIB interface by value. We use the set mac address ndo to do that.
The flow should be broken down to two:
1) The GID value is already in the GID table, in this case the interface will be able to set carrier up.
2) The GID value is not yet in the GID table, in this case the interface won't try to join the multicast group and will wait (listen on GID_CHANGE event) until the GID is inserted.
In order to track those changes, we add a new flag: * IPOIB_FLAG_DEV_ADDR_SET.
When set, it means the dev_addr is based on a value in the GID table. This bit will be cleared upon a dev_addr change triggered by the user and set after validation.
Per the IB spec the port GUID can't change if the module is loaded. The port GUID is the basis for the GID at index 0, which is the basis for the default device address of an IPoIB interface.
The issue is that there are devices that don't follow the spec: they change the port GUID while the HCA is powered on. So, in order not to break userspace applications, we need to check if the user wanted to control the device address, and we assume that if he sets the device address back to be based on GID index 0, he no longer wishes to control it.
In order to track this, we add an additional flag: * IPOIB_FLAG_DEV_ADDR_CTRL
When setting the device address, there is no validation of the upper twelve bytes of the device address (flags, qpn, subnet prefix) as those bytes are not under the control of the user.
Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
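Since the upper twelve bytes (flags, qpn, subnet prefix) are not under the user's control, one way to picture a "set address" request is that only the remaining eight bytes of the 20-byte address are taken from it. The sketch below is a userspace illustration of that reading, with hypothetical names and offsets spelled out as assumptions. It is not the driver's set-mac-address handler, and it skips the GID table validation described in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HW_ADDR_LEN_SKETCH   20   /* full IPoIB device address */
#define FIXED_PREFIX_SKETCH  12   /* flags + qpn + subnet prefix (assumed layout) */

/* Apply a user-requested address: keep the first twelve bytes of the
 * current address and take only the trailing eight bytes from the request. */
void set_dev_addr_sketch(unsigned char *dev_addr, const unsigned char *requested)
{
    assert(dev_addr != NULL && requested != NULL);
    memcpy(dev_addr + FIXED_PREFIX_SKETCH,
           requested + FIXED_PREFIX_SKETCH,
           HW_ADDR_LEN_SKETCH - FIXED_PREFIX_SKETCH);
}
```

Whatever the caller writes into the first twelve bytes of the request has no effect in this sketch, which matches the statement that those bytes are not validated.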
|
#
3b561130 |
| 25-May-2016 |
Erez Shitrit <erezsh@mellanox.com> |
IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join
Check (via an SA query) whether the SM supports the new option for SendOnly multicast joins. If the SM supports that option, the driver uses the new join state to create such a multicast group. In other words, if SendOnlyFullMember is supported, the driver no longer uses a faked FullMember join for a SendOnly MCG and uses the correct state instead.
This check is performed at every invocation of mcast_restart task, to be sure that the driver stays in sync with the current state of the SM.
Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
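The decision described above has two parts: refresh the SM capability on every multicast restart, then pick the join state from it. A minimal userspace sketch follows. The enum, the context structure, the callback type and the two stub query functions are all hypothetical, and a real SA query is replaced by a function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical join states; only the two relevant to the text are modelled. */
enum join_state_sketch {
    JOIN_FULL_MEMBER,            /* the "faked" state used without SM support */
    JOIN_SENDONLY_FULL_MEMBER    /* the new state, used when the SM supports it */
};

/* Hypothetical driver context caching the last capability answer. */
struct mcast_ctx_sketch {
    int sm_supports_sendonly_full;
};

/* Re-check the capability at every restart so the cached answer follows
 * the current SM. query_sm stands in for the SA query. */
void mcast_restart_sketch(struct mcast_ctx_sketch *ctx, int (*query_sm)(void))
{
    assert(ctx != NULL && query_sm != NULL);
    ctx->sm_supports_sendonly_full = (query_sm() != 0);
}

/* Choose the join state for a group, given whether it is send-only. */
enum join_state_sketch pick_join_state_sketch(const struct mcast_ctx_sketch *ctx,
                                              int send_only)
{
    if (send_only && ctx->sm_supports_sendonly_full)
        return JOIN_SENDONLY_FULL_MEMBER;
    return JOIN_FULL_MEMBER;
}

/* Stub queries used to exercise the sketch. */
int sm_yes_sketch(void) { return 1; }
int sm_no_sketch(void)  { return 0; }
```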
|