History log of /openbmc/linux/drivers/infiniband/ulp/ipoib/ipoib_main.c (Results 101 – 125 of 630)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
# c1048aff 10-Apr-2017 Erez Shitrit <erezsh@mellanox.com>

IB/IPoIB: Use defined function for netdev_priv function

Make ipoib_priv point to netdev_priv where the code calls netdev_priv.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Ve

IB/IPoIB: Use defined function for netdev_priv function

Make ipoib_priv point to netdev_priv where the code calls netdev_priv.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 7ce1a3ee 10-Apr-2017 Erez Shitrit <erezsh@mellanox.com>

IB/IPoIB: Separate control from HW operation on ipoib_open/stop ndo

This patch is preparing the netdev part at the IPoIB driver to be able
to use the ipoib_options.

It deals with the two flows from

IB/IPoIB: Separate control from HW operation on ipoib_open/stop ndo

This patch is preparing the netdev part at the IPoIB driver to be able
to use the ipoib_options.

It deals with the two flows from the .ndo: ipoib_open and ipoib_stop.

The code is rearranged as follows:
* All operations which deal with the hardware resources, (for example
change QP state, post-receive etc.) are performed in one place.
* All operations that are control oriented (like restart multicast task,
start the reap_ah etc.) are performed in separate place.

The functions that deal with the hardware resources now located at
__ipoib_ib_dev_open for the ipoib_open flow and __ipoib_ib_dev_stop
for ipoib_stop.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 515ed4f3 10-Apr-2017 Erez Shitrit <erezsh@mellanox.com>

IB/IPoIB: Separate control and data related initializations

This patch prepares init and teardown flows so we can call them
through ipoib_options function pointers.

It arranges that area of code as

IB/IPoIB: Separate control and data related initializations

This patch prepares init and teardown flows so we can call them
through ipoib_options function pointers.

It arranges that area of code as the following:
* All operations which deal with the resource allocation/deletion
are performed in one place.
* All operations that are control oriented, meaning that they are not
connected to a specific hardware, are performed in a separate place.

The operations for allocation of hardware resources are now in the
function ipoib_dev_init_default, and the deletion of all the resources
are in ipoib_dev_uninit_default

The only exception is the creation of the PD object,
which is used both for resource allocation (create QP etc.)
and for control flows like creating AH.

It also does:
* Move creation of rx_ring and tx_ring to be in the resources
allocation area.
* Move the function ipoib_ib_dev_open that does the open device
to the control area instead of the dev_init which creates resources.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 771a5258 29-Mar-2017 Shamir Rabinovitch <shamir.rabinovitch@oracle.com>

IB/IPoIB: ibX: failed to create mcg debug file

When udev renames the netdev devices, ipoib debugfs entries does not
get renamed. As a result, if subsequent probe of ipoib device reuse the
name then

IB/IPoIB: ibX: failed to create mcg debug file

When udev renames the netdev devices, ipoib debugfs entries does not
get renamed. As a result, if subsequent probe of ipoib device reuse the
name then creating a debugfs entry for the new device would fail.

Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part
of ipoib event handling in order to avoid any race condition between these.

Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs)
Cc: stable@vger.kernel.org # 2.6.15+
Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


Revision tags: v4.10.4, v4.10.3, v4.10.2, v4.10.1, v4.10
# 2b084176 01-Feb-2017 Erez Shitrit <erezsh@mellanox.com>

IB/IPoIB: Add destination address when re-queue packet

When sending packet to destination that was not resolved yet
via path query, the driver keeps the skb and tries to re-send it
again when the pa

IB/IPoIB: Add destination address when re-queue packet

When sending packet to destination that was not resolved yet
via path query, the driver keeps the skb and tries to re-send it
again when the path is resolved.

But when re-sending via dev_queue_xmit the kernel doesn't call
to dev_hard_header, so IPoIB needs to keep 20 bytes in the skb
and to put the destination address inside them.

In that way the dev_start_xmit will have the correct destination,
and the driver won't take the destination from the skb->data, while
nothing exists there, which causes to packet be be dropped.

The test flow is:
1. Run the SM on remote node,
2. Restart the driver.
4. Ping some destination,
3. Observe that first ICMP request will be dropped.

Fixes: fc791b633515 ("IB/ipoib: move back IB LL address into the hard header")
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 5c37077f 18-Jan-2017 Zhu Yanjun <yanjun.zhu@oracle.com>

IB/ipoib: Remove the unnecessary error check

The function ipoib_mcast_start_thread/ipoib_ib_dev_up always return zero.
As such, in the function ipoib_open, err_stop will never be reached.
So remove

IB/ipoib: Remove the unnecessary error check

The function ipoib_mcast_start_thread/ipoib_ib_dev_up always return zero.
As such, in the function ipoib_open, err_stop will never be reached.
So remove this err_stop and change the return type of the function
ipoib_mcast_start_thread/ipoib_ib_dev_up to void.

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# f7534f45 05-Jan-2017 Zhu Yanjun <yanjun.zhu@oracle.com>

IB/ipoib: Remove unnecessary returned value check

In the function ipoib_set_dev_features, the returned value is always 0.
As such, it is not necessary to check the returned value.
This is not a bug.

IB/ipoib: Remove unnecessary returned value check

In the function ipoib_set_dev_features, the returned value is always 0.
As such, it is not necessary to check the returned value.
This is not a bug. It is a trivial problem.

Reviewed-by: Guanglei Li <guanglei.li@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# db97ed0a 20-Jan-2017 Bart Van Assche <bart.vanassche@sandisk.com>

IB/IPoIB: Switch from dma_device to dev.parent

Prepare for removal of ib_device.dma_device.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.

IB/IPoIB: Switch from dma_device to dev.parent

Prepare for removal of ib_device.dma_device.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# c586071d 28-Dec-2016 Feras Daoud <ferasda@mellanox.com>

IB/ipoib: Replace list_del of the neigh->list with list_del_init

In order to resolve a situation where a few process delete
the same list element in sequence and cause panic, list_del
is replaced wi

IB/ipoib: Replace list_del of the neigh->list with list_del_init

In order to resolve a situation where a few process delete
the same list element in sequence and cause panic, list_del
is replaced with list_del_init. In this case if the first
process that calls list_del releases the lock before acquiring
it again, other processes who can acquire the lock will call
list_del_init.

Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# d32b9a81 28-Dec-2016 Feras Daoud <ferasda@mellanox.com>

IB/ipoib: Add detailed error message to dev_queue_xmit call

Add a detailed return code to dev_queue_xmit function when
calling to requeue packet via __skb_dequeue.

Signed-off-by: Feras Daoud <feras

IB/ipoib: Add detailed error message to dev_queue_xmit call

Add a detailed return code to dev_queue_xmit function when
calling to requeue packet via __skb_dequeue.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 0a0007f2 28-Dec-2016 Feras Daoud <ferasda@mellanox.com>

IB/ipoib: Fix deadlock between rmmod and set_mode

When calling set_mode from sys/fs, the call flow locks the sys/fs lock
first and then tries to lock rtnl_lock (when calling ipoib_set_mod).
On the o

IB/ipoib: Fix deadlock between rmmod and set_mode

When calling set_mode from sys/fs, the call flow locks the sys/fs lock
first and then tries to lock rtnl_lock (when calling ipoib_set_mod).
On the other hand, the rmmod call flow takes the rtnl_lock first
(when calling unregister_netdev) and then tries to take the sys/fs
lock. Deadlock a->b, b->a.

The problem starts when ipoib_set_mod frees it's rtnl_lck and tries
to get it after that.

set_mod:
[<ffffffff8104f2bd>] ? check_preempt_curr+0x6d/0x90
[<ffffffff814fee8e>] __mutex_lock_slowpath+0x13e/0x180
[<ffffffff81448655>] ? __rtnl_unlock+0x15/0x20
[<ffffffff814fed2b>] mutex_lock+0x2b/0x50
[<ffffffff81448675>] rtnl_lock+0x15/0x20
[<ffffffffa02ad807>] ipoib_set_mode+0x97/0x160 [ib_ipoib]
[<ffffffffa02b5f5b>] set_mode+0x3b/0x80 [ib_ipoib]
[<ffffffff8134b840>] dev_attr_store+0x20/0x30
[<ffffffff811f0fe5>] sysfs_write_file+0xe5/0x170
[<ffffffff8117b068>] vfs_write+0xb8/0x1a0
[<ffffffff8117ba81>] sys_write+0x51/0x90
[<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

rmmod:
[<ffffffff81279ffc>] ? put_dec+0x10c/0x110
[<ffffffff8127a2ee>] ? number+0x2ee/0x320
[<ffffffff814fe6a5>] schedule_timeout+0x215/0x2e0
[<ffffffff8127cc04>] ? vsnprintf+0x484/0x5f0
[<ffffffff8127b550>] ? string+0x40/0x100
[<ffffffff814fe323>] wait_for_common+0x123/0x180
[<ffffffff81060250>] ? default_wake_function+0x0/0x20
[<ffffffff8119661e>] ? ifind_fast+0x5e/0xb0
[<ffffffff814fe43d>] wait_for_completion+0x1d/0x20
[<ffffffff811f2e68>] sysfs_addrm_finish+0x228/0x270
[<ffffffff811f2fb3>] sysfs_remove_dir+0xa3/0xf0
[<ffffffff81273f66>] kobject_del+0x16/0x40
[<ffffffff8134cd14>] device_del+0x184/0x1e0
[<ffffffff8144e59b>] netdev_unregister_kobject+0xab/0xc0
[<ffffffff8143c05e>] rollback_registered+0xae/0x130
[<ffffffff8143c102>] unregister_netdevice+0x22/0x70
[<ffffffff8143c16e>] unregister_netdev+0x1e/0x30
[<ffffffffa02a91b0>] ipoib_remove_one+0xe0/0x120 [ib_ipoib]
[<ffffffffa01ed95f>] ib_unregister_device+0x4f/0x100 [ib_core]
[<ffffffffa021f5e1>] mlx4_ib_remove+0x41/0x180 [mlx4_ib]
[<ffffffffa01ab771>] mlx4_remove_device+0x71/0x90 [mlx4_core]

Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks")
Cc: <stable@vger.kernel.org> # v3.6+
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 80b5b35a 28-Dec-2016 Feras Daoud <ferasda@mellanox.com>

IB/ipoib: Set device connection mode only when needed

When changing the connection mode, the ipoib_set_mode function
did not check if the previous connection mode equals to the
new one. This commit

IB/ipoib: Set device connection mode only when needed

When changing the connection mode, the ipoib_set_mode function
did not check if the previous connection mode equals to the
new one. This commit adds the required check and return 0 if the new
mode equals to the previous one.

Fixes: 839fcaba355a ("IPoIB: Connected mode experimental support")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 29da686d 28-Dec-2016 Feras Daoud <ferasda@mellanox.com>

IB/ipoib: When given an invalid UD MTU, give debug msg

In datagram mode, the IB UD (Unreliable Datagram) transport is used
so the MTU of the interface is equal to the IB L2 MTU minus the
IPoIB encap

IB/ipoib: When given an invalid UD MTU, give debug msg

In datagram mode, the IB UD (Unreliable Datagram) transport is used
so the MTU of the interface is equal to the IB L2 MTU minus the
IPoIB encapsulation header. Any request to change the MTU value
above the maximum range will change the MTU to the max allowed, but
will not show any warning message. An ipoib_warn is issued in such
cases, letting the user know that even though the value is legal,
it can't be currently applied.

Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


Revision tags: v4.9, openbmc-4.4-20161121-1, v4.4.33, v4.4.32, v4.4.31
# 74226649 03-Nov-2016 Leon Romanovsky <leon@kernel.org>

IB/ipoib: Remove and fix debug prints after allocation failure

The prints after [k|v][m|z|c]alloc() functions are not needed,
because in case of failure, allocator will print their internal
error pr

IB/ipoib: Remove and fix debug prints after allocation failure

The prints after [k|v][m|z|c]alloc() functions are not needed,
because in case of failure, allocator will print their internal
error prints anyway.

Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


Revision tags: v4.4.30, v4.4.29, v4.4.28, v4.4.27, v4.7.10, openbmc-4.4-20161021-1
# b3e3893e 20-Oct-2016 Jarod Wilson <jarod@redhat.com>

net: use core MTU range checking in misc drivers

firewire-net:
- set min/max_mtu
- remove fwnet_change_mtu

nes:
- set max_mtu
- clean up nes_netdev_change_mtu

xpnet:
- set min/max_mtu
- remove xpn

net: use core MTU range checking in misc drivers

firewire-net:
- set min/max_mtu
- remove fwnet_change_mtu

nes:
- set max_mtu
- clean up nes_netdev_change_mtu

xpnet:
- set min/max_mtu
- remove xpnet_dev_change_mtu

hippi:
- set min/max_mtu
- remove hippi_change_mtu

batman-adv:
- set max_mtu
- remove batadv_interface_change_mtu
- initialization is a little async, not 100% certain that max_mtu is set
in the optimal place, don't have hardware to test with

rionet:
- set min/max_mtu
- remove rionet_change_mtu

slip:
- set min/max_mtu
- streamline sl_change_mtu

um/net_kern:
- remove pointless ndo_change_mtu

hsi/clients/ssi_protocol:
- use core MTU range checking
- remove now redundant ssip_pn_set_mtu

ipoib:
- set a default max MTU value
- Note: ipoib's actual max MTU can vary, depending on if the device is in
connected mode or not, so we'll just set the max_mtu value to the max
possible, and let the ndo_change_mtu function continue to validate any new
MTU change requests with checks for CM or not. Note that ipoib has no
min_mtu set, and thus, the network core's mtu > 0 check is the only lower
bounds here.

mptlan:
- use net core MTU range checking
- remove now redundant mpt_lan_change_mtu

fddi:
- min_mtu = 21, max_mtu = 4470
- remove now redundant fddi_change_mtu (including export)

fjes:
- min_mtu = 8192, max_mtu = 65536
- The max_mtu value is actually one over IP_MAX_MTU here, but the idea is to
get past the core net MTU range checks so fjes_change_mtu can validate a
new MTU against what it supports (see fjes_support_mtu in fjes_hw.c)

hsr:
- min_mtu = 0 (calls ether_setup, max_mtu is 1500)

f_phonet:
- min_mtu = 6, max_mtu = 65541

u_ether:
- min_mtu = 14, max_mtu = 15412

phonet/pep-gprs:
- min_mtu = 576, max_mtu = 65530
- remove redundant gprs_set_mtu

CC: netdev@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: Stefan Richter <stefanr@s5r6.in-berlin.de>
CC: Faisal Latif <faisal.latif@intel.com>
CC: linux-rdma@vger.kernel.org
CC: Cliff Whickman <cpw@sgi.com>
CC: Robin Holt <robinmholt@gmail.com>
CC: Jes Sorensen <jes@trained-monkey.org>
CC: Marek Lindner <mareklindner@neomailbox.ch>
CC: Simon Wunderlich <sw@simonwunderlich.de>
CC: Antonio Quartulli <a@unstable.cc>
CC: Sathya Prakash <sathya.prakash@broadcom.com>
CC: Chaitra P B <chaitra.basappa@broadcom.com>
CC: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
CC: MPT-FusionLinux.pdl@broadcom.com
CC: Sebastian Reichel <sre@kernel.org>
CC: Felipe Balbi <balbi@kernel.org>
CC: Arvid Brodin <arvid.brodin@alten.se>
CC: Remi Denis-Courmont <courmisch@gmail.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.7.9, v4.4.26
# e0e79c8e 17-Oct-2016 David Ahern <dsa@cumulusnetworks.com>

IB/ipoib: Flip to new dev walk API

Convert ipoib_get_net_dev_match_addr to the new upper device walk API.
This is just a code conversion; no functional change is intended.

v2
- removed typecast of

IB/ipoib: Flip to new dev walk API

Convert ipoib_get_net_dev_match_addr to the new upper device walk API.
This is just a code conversion; no functional change is intended.

v2
- removed typecast of data

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.7.8, v4.4.25
# fc791b63 13-Oct-2016 Paolo Abeni <pabeni@redhat.com>

IB/ipoib: move back IB LL address into the hard header

After the commit 9207f9d45b0a ("net: preserve IP control block
during GSO segmentation"), the GSO CB and the IPoIB CB conflict.
That destroy th

IB/ipoib: move back IB LL address into the hard header

After the commit 9207f9d45b0a ("net: preserve IP control block
during GSO segmentation"), the GSO CB and the IPoIB CB conflict.
That destroy the IPoIB address information cached there,
causing a severe performance regression, as better described here:

http://marc.info/?l=linux-kernel&m=146787279825501&w=2

This change moves the data cached by the IPoIB driver from the
skb control lock into the IPoIB hard header, as done before
the commit 936d7de3d736 ("IPoIB: Stop lying about hard_header_len
and use skb->cb to stash LL addresses").
In order to avoid GRO issue, on packet reception, the IPoIB driver
stash into the skb a dummy pseudo header, so that the received
packets have actually a hard header matching the declared length.
To avoid changing the connected mode maximum mtu, the allocated
head buffer size is increased by the pseudo header length.

After this commit, IPoIB performances are back to pre-regression
value.

v2 -> v3: rebased
v1 -> v2: avoid changing the max mtu, increasing the head buf size

Fixes: 9207f9d45b0a ("net: preserve IP control block during GSO segmentation")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

show more ...


Revision tags: v4.4.24, v4.7.7, v4.8, v4.4.23, v4.7.6, v4.7.5, v4.4.22, v4.4.21, v4.7.4, v4.7.3, v4.4.20, v4.7.2, v4.4.19, openbmc-4.4-20160819-1, v4.7.1, v4.4.18
# 855cda68 15-Aug-2016 Bhaktipriya Shridhar <bhaktipriya96@gmail.com>

IB/ipoib: Remove deprecated create_singlethread_workqueue

alloc_ordered_workqueue() replaces deprecated
create_singlethread_workqueue().

The workqueue "ipoib_workqueue" that is used for all flush o

IB/ipoib: Remove deprecated create_singlethread_workqueue

alloc_ordered_workqueue() replaces deprecated
create_singlethread_workqueue().

The workqueue "ipoib_workqueue" that is used for all flush operations
for the device.

WQ_MEM_RECLAIM has been set since the flush operations may need to
complete in order for other network functions to continue, and
the memory reclaim operation might need the network functioning in
order to make progress.

Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 546481c2 28-Aug-2016 Erez Shitrit <erezsh@mellanox.com>

IB/ipoib: Fix memory corruption in ipoib cm mode connect flow

When a new CM connection is being requested, ipoib driver copies data
from the path pointer in the CM/tx object, the path object might b

IB/ipoib: Fix memory corruption in ipoib cm mode connect flow

When a new CM connection is being requested, ipoib driver copies data
from the path pointer in the CM/tx object, the path object might be
invalid at the point and memory corruption will happened later when now
the CM driver will try using that data.

The next scenario demonstrates it:
neigh_add_path --> ipoib_cm_create_tx -->
queue_work (pointer to path is in the cm/tx struct)
#while the work is still in the queue,
#the port goes down and causes the ipoib_flush_paths:
ipoib_flush_paths --> path_free --> kfree(path)
#at this point the work scheduled starts.
ipoib_cm_tx_start --> copy from the (invalid)path pointer:
(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
-> memory corruption.

To fix that the driver now starts the CM/tx connection only if that
specific path exists in the general paths database.
This check is protected with the relevant locks, and uses the gid from
the neigh member in the CM/tx object which is valid according to the ref
count that was taken by the CM/tx.

Fixes: 839fcaba35 ('IPoIB: Connected mode experimental support')
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


Revision tags: v4.4.17, openbmc-4.4-20160804-1, v4.4.16, v4.7, openbmc-4.4-20160722-1, openbmc-20160722-1
# 5faba546 20-Jul-2016 Yuval Shaia <yuval.shaia@oracle.com>

IB/ipoib: Report SG feature regardless of HW UD CSUM capability

Decouple SG support from HW ability to do UD checksum.
This coupling is for historical reasons and removed with 'commit
ec5f06156423 (

IB/ipoib: Report SG feature regardless of HW UD CSUM capability

Decouple SG support from HW ability to do UD checksum.
This coupling is for historical reasons and removed with 'commit
ec5f06156423 ("net: Kill link between CSUM and SG features.")'

During driver load it is assumed that device does not supports SG. The
final decision is taken after creating UD QP based on device capability.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


Revision tags: openbmc-20160713-1, v4.4.15, v4.6.4, v4.6.3, v4.4.14, v4.6.2, v4.4.13, openbmc-20160606-1
# 61c78eea 04-Jun-2016 Erez Shitrit <erezsh@mellanox.com>

IB/IPoIB: Don't update neigh validity for unresolved entries

ipoib_neigh_get unconditionally updates the "alive" variable member on
any packet send. This prevents the neighbor garbage collection fr

IB/IPoIB: Don't update neigh validity for unresolved entries

ipoib_neigh_get unconditionally updates the "alive" variable member on
any packet send. This prevents the neighbor garbage collection from
cleaning out a dead neighbor entry if we are still queueing packets
for it. If the queue for this neighbor is full, then don't update the
alive timestamp. That way the neighbor can time out even if packets
are still being queued as long as none of them are being sent.

Fixes: b63b70d87741 ("IPoIB: Use a private hash table for path lookup in xmit path")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 9b29953b 04-Jun-2016 Mark Bloch <markb@mellanox.com>

IB/IPoIB: Disable bottom half when dealing with device address

Align locking usage when touching device address with rest
of the kernel. Lock the bottom half when doing so using
netif_addr_lock_bh.

IB/IPoIB: Disable bottom half when dealing with device address

Align locking usage when touching device address with rest
of the kernel. Lock the bottom half when doing so using
netif_addr_lock_bh.

This also solves the following case as reported by lockdep:
CPU0 CPU1
---- ----
lock(_xmit_INFINIBAND);
local_irq_disable();
lock(&(&mc->mca_lock)->rlock);
lock(_xmit_INFINIBAND);
<Interrupt>
lock(&(&mc->mca_lock)->rlock);

*** DEADLOCK ***

Fixes: 492a7e67ff83 ("IB/IPoIB: Allow setting the device address")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 198b12f7 04-Jun-2016 Erez Shitrit <erezsh@mellanox.com>

IB/IPoIB: Fix race between ipoib_remove_one to sysfs functions

In ipoib_remove_one the driver holds the rtnl_lock and tries to do some
operation like dev_change_flags or unregister_netdev, while sys

IB/IPoIB: Fix race between ipoib_remove_one to sysfs functions

In ipoib_remove_one the driver holds the rtnl_lock and tries to do some
operation like dev_change_flags or unregister_netdev, while sysfs
callback like ipoib_vlan_delete holds sysfs mutex and tries to hold the
rtnl_lock via rtnl_trylock() and restart_syscall() if the lock is not
free, meanwhile ipoib_remove_one tries to get the sysfs lock in order to
free its sysfs directory, and we will get a->b, b->a deadlock.

Trace like the following:

schedule+0x37/0x80
schedule_preempt_disabled+0xe/0x10
__mutex_lock_slowpath+0xb5/0x120
mutex_lock+0x23/0x40
rtnl_lock+0x15/0x20
netdev_run_todo+0x17c/0x320
rtnl_unlock+0xe/0x10
ipoib_vlan_delete+0x11b/0x1b0 [ib_ipoib]
delete_child+0x54/0x80 [ib_ipoib]
dev_attr_store+0x18/0x30
sysfs_kf_write+0x37/0x40
mutex_lock+0x16/0x40
SyS_write+0x55/0xc0
entry_SYSCALL_64_fastpath+0x16/0x75
And
schedule+0x37/0x80
__kernfs_remove+0x1a8/0x260
? wake_atomic_t_function+0x60/0x60
kernfs_remove+0x25/0x40
sysfs_remove_dir+0x50/0x80
kobject_del+0x18/0x50
device_del+0x19f/0x260
netdev_unregister_kobject+0x6a/0x80
rollback_registered_many+0x1fd/0x340
rollback_registered+0x3c/0x70
unregister_netdevice_queue+0x55/0xc0
unregister_netdev+0x20/0x30
ipoib_remove_one+0x114/0x1b0 [ib_ipoib]
ib_unregister_client+0x4a/0x170 [ib_core]
? find_module_all+0x71/0xa0
ipoib_cleanup_module+0x10/0x94 [ib_ipoib]
SyS_delete_module+0x1b5/0x210
entry_SYSCALL_64_fastpath+0x16/0x75

The fix is by checking the flag IPOIB_FLAG_INTF_ON_DESTROY in order to
get out from the sysfs function.

Fixes: 862096a8bbf8 ("IB/ipoib: Add more rtnl_link_ops callbacks")
Fixes: 9baa0b036410 ("IB/ipoib: Add rtnl_link_ops support")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


Revision tags: v4.6.1, v4.4.12, openbmc-20160521-1, v4.4.11
# 492a7e67 18-May-2016 Mark Bloch <markb@mellanox.com>

IB/IPoIB: Allow setting the device address

In IB networks, and specifically in IPoIB/rdmacm traffic, the device
address of an IPoIB interface is used as a means to exchange information
between nodes

IB/IPoIB: Allow setting the device address

In IB networks, and specifically in IPoIB/rdmacm traffic, the device
address of an IPoIB interface is used as a means to exchange information
between nodes needed for communication.

Currently an IPoIB interface will always be created with a device
address based on its node GUID without a way to change that.

This change adds the ability to set the device address of an IPoIB
interface by value. We use the set mac address ndo to do that.

The flow should be broken down to two:
1) The GID value is already in the GID table,
in this case the interface will be able to set carrier up.

2) The GID value is not yet in the GID table,
in this case the interface won't try to join the multicast group
and will wait (listen on GID_CHANGE event) until the GID is inserted.

In order to track those changes, we add a new flag:
* IPOIB_FLAG_DEV_ADDR_SET.

When set, it means the dev_addr is a based on a value in the gid
table. this bit will be cleared upon a dev_addr change triggered
by the user and set after validation.

Per IB spec the port GUID can't change if the module is loaded.
port GUID is the basis for GID at index 0 which is the basis for
the default device address of a ipoib interface.

The issue is that there are devices that don't follow the spec,
they change the port GUID while HCA is powered on, so in order
not to break userspace applications. We need to check if the
user wanted to control the device address and we assume that
if he sets the device address back to be based on GID index 0,
he no longer wishs to control it.

In order to track this, we add an additional flag:
* IPOIB_FLAG_DEV_ADDR_CTRL

When setting the device address, there is no validation of the upper
twelve bytes of the device address (flags, qpn, subnet prefix) as those
bytes are not under the control of the user.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


# 3b561130 25-May-2016 Erez Shitrit <erezsh@mellanox.com>

IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join

Check (via an SA query) if the SM supports the new option for SendOnly
multicast joins.
If the SM supports that option it will use the new

IB/ipoib: Support SendOnlyFullMember MCG for SendOnly join

Check (via an SA query) if the SM supports the new option for SendOnly
multicast joins.
If the SM supports that option it will use the new join state to create
such multicast group.
If SendOnlyFullMember is supported, we wouldn't use faked FullMember state
join for SendOnly MCG, use the correct state if supported.

This check is performed at every invocation of mcast_restart task, to be
sure that the driver stays in sync with the current state of the SM.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

show more ...


12345678910>>...26