- Jun 04, 2014
-
-
Chuck Lever authored
Clean up: rpcrdma_ep_destroy() returns a value that is used only to print a debugging message. rpcrdma_ep_destroy() already prints debugging messages in all error cases. Make rpcrdma_ep_destroy() return void instead. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Tested-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Clean up: All remaining callers of rpcrdma_deregister_external() pass NULL as the last argument, so remove that argument. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Tested-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
If the selected memory registration mode is not supported by the underlying provider/HCA, the NFS mount command reports that there was an invalid mount option, and fails. This is misleading. Reporting a problem allocating memory is a lot closer to the truth. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
An audit of in-kernel RDMA providers that do not support the FRMR memory registration shows that several of them support MTHCAFMR. Prefer MTHCAFMR when FRMR is not supported. If MTHCAFMR is not supported, only then choose ALLPHYSICAL. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
All kernel RDMA providers except amso1100 support either MTHCAFMR or FRMR, both of which are faster than REGISTER. amso1100 can continue to use ALLPHYSICAL. The only other ULP consumer in the kernel that uses the reg_phys_mr verb is Lustre. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
The MEMWINDOWS and MEMWINDOWS_ASYNC memory registration modes were intended as stop-gap modes before the introduction of FRMR. They are now considered obsolete. MEMWINDOWS_ASYNC is also considered unsafe because it can leave client memory registered and exposed for an indeterminant time after each I/O. At this point, the MEMWINDOWS modes add needless complexity, so remove them. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Tested-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Clean up: This memory registration mode is slow and was never meant for use in production environments. Remove it to reduce implementation complexity. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Tested-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
An IB provider can invoke rpcrdma_conn_func() in an IRQ context, thus rpcrdma_conn_func() cannot be allowed to directly invoke generic RPC functions like xprt_wake_pending_tasks(). Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Tested-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Allen Andrews authored
Two memory region leaks were found during testing: 1. rpcrdma_buffer_create: While allocating RPCRDMA_FRMR's ib_alloc_fast_reg_mr is called and then ib_alloc_fast_reg_page_list is called. If ib_alloc_fast_reg_page_list returns an error it bails out of the routine dropping the last ib_alloc_fast_reg_mr frmr region creating a memory leak. Added code to dereg the last frmr if ib_alloc_fast_reg_page_list fails. 2. rpcrdma_buffer_destroy: While cleaning up, the routine will only free the MR's on the rb_mws list if there are rb_send_bufs present. However, in rpcrdma_buffer_create while the rb_mws list is being built if one of the MR allocation requests fail after some MR's have been allocated on the rb_mws list the routine never gets to create any rb_send_bufs but instead jumps to the rpcrdma_buffer_destroy routine which will never free the MR's on rb_mws list because the rb_send_bufs were never created. This leaks all the MR's on the rb_mws list that were created prior to one of the MR allocations failing. Issue(2) was seen during testing. Our adapter had a finite number of MR's available and we created enough connections to where we saw an MR allocation failure on our Nth NFS connection request. After the kernel cleaned up the resources it had allocated for the Nth connection we noticed that FMR's had been leaked due to the coding error described above. Issue(1) was seen during a code review while debugging issue(2). Signed-off-by:
Allen Andrews <allen.andrews@emulex.com> Reviewed-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Steve Wise authored
Some rdma devices don't support a fast register page list depth of at least RPCRDMA_MAX_DATA_SEGS. So xprtrdma needs to chunk its fast register regions according to the minimum of the device max supported depth or RPCRDMA_MAX_DATA_SEGS. Signed-off-by:
Steve Wise <swise@opengridcomputing.com> Reviewed-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- Feb 21, 2013
-
-
Shani Michaeli authored
This patch enhances the IB core support for Memory Windows (MWs). MWs allow an application to have better/flexible control over remote access to memory. Two types of MWs are supported, with the second type having two flavors: Type 1 - associated with PD only Type 2A - associated with QPN only Type 2B - associated with PD and QPN Applications can allocate a MW once, and then repeatedly bind the MW to different ranges in MRs that are associated to the same PD. Type 1 windows are bound through a verb, while type 2 windows are bound by posting a work request. The 32-bit memory key is composed of a 24-bit index and an 8-bit key. The key is changed with each bind, thus allowing more control over the peer's use of the memory key. The changes introduced are the following: * add memory window type enum and a corresponding parameter to ib_alloc_mw. * type 2 memory window bind work request support. * create a struct that contains the common part of the bind verb struct ibv_mw_bind and the bind work request into a single struct. * add the ib_inc_rkey helper function to advance the tag part of an rkey. Consumer interface details: * new device capability flags IB_DEVICE_MEM_WINDOW_TYPE_2A and IB_DEVICE_MEM_WINDOW_TYPE_2B are added to indicate device support for these features. Devices can set either IB_DEVICE_MEM_WINDOW_TYPE_2A or IB_DEVICE_MEM_WINDOW_TYPE_2B if it supports type 2A or type 2B memory windows. It can set neither to indicate it doesn't support type 2 windows at all. * modify existing provides and consumers code to the new param of ib_alloc_mw and the ib_mw_bind_info structure Signed-off-by:
Haggai Eran <haggaie@mellanox.com> Signed-off-by:
Shani Michaeli <shanim@mellanox.com> Signed-off-by:
Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by:
Roland Dreier <roland@purestorage.com>
-
- Mar 21, 2012
-
-
Tom Tucker authored
The xprtrdma FRMR mapping logic assumes that a segment is <= PAGE_SIZE. This is not true for NFS4. Signed-off-by:
Tom Tucker <tom@ogc.us> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- Jun 07, 2011
-
-
Alexey Dobriyan authored
* remove interrupt.g inclusion from netdevice.h -- not needed * fixup fallout, add interrupt.h and hardirq.h back where needed. Signed-off-by:
Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 25, 2011
-
-
Sean Hefty authored
The RDMA CM currently infers the QP type from the port space selected by the user. In the future (eg with RDMA_PS_IB or XRC), there may not be a 1-1 correspondence between port space and QP type. For netlink export of RDMA CM state, we want to export the QP type to userspace, so it is cleaner to explicitly associate a QP type to an ID. Modify rdma_create_id() to allow the user to specify the QP type, and use it to make our selections of datagram versus connected mode. Signed-off-by:
Sean Hefty <sean.hefty@intel.com> Signed-off-by:
Roland Dreier <roland@purestorage.com>
-
- Mar 16, 2011
-
-
Randy Dunlap authored
Fix printk format build warning: net/sunrpc/xprtrdma/verbs.c:1463: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'dma_addr_t' Signed-off-by:
Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- Mar 11, 2011
-
-
Tom Tucker authored
When the rpc_memreg_strategy is 5, FRMR are used to map RPC data. This mode uses an FRMR to map the RPC data, then invalidates (i.e. unregisers) the data in xprt_rdma_free. These FRMR are used across connections on the same mount, i.e. if the connection goes away on an idle timeout and reconnects later, the FRMR are not destroyed and recreated. This creates a problem for transport errors because the WR that invalidate an FRMR may be flushed (i.e. fail) leaving the FRMR valid. When the FRMR is later used to map an RPC it will fail, tearing down the transport and starting over. Over time, more and more of the FRMR pool end up in the wrong state resulting in seemingly random disconnects. This fix keeps track of the FRMR state explicitly by setting it's state based on the successful completion of a reg/inv WR. If the FRMR is ever used and found to be in the wrong state, an invalidate WR is prepended, re-syncing the FRMR state and avoiding the connection loss. Signed-off-by:
Tom Tucker <tom@ogc.us> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- Aug 11, 2010
-
-
Tom Tucker authored
This patch updates the computation to include the worst case situation where three FRMR are required to map a single RPC REQ. Signed-off-by:
Tom Tucker <tom@ogc.us> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Steve Wise authored
A bad cast causes the iova_start, which in this case is a 64b DMA bus address, to be truncated on 32b systems. This breaks frmrs on 32b systems. No cast is needed. Signed-off-by:
Steve Wise <swise@opengridcomputing.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- Mar 30, 2010
-
-
Tejun Heo authored
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include bloc...
-
- Nov 30, 2009
-
-
Joe Perches authored
Not including net/atm/ Compiled tested x86 allyesconfig only Added a > 80 column line or two, which I ignored. Existing checkpatch plaints willfully, cheerfully ignored. Signed-off-by:
Joe Perches <joe@perches.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- May 26, 2009
-
-
Vu Pham authored
mlx4/connectX FRMR requires local write enable together with remote rdma write enable. This fixes NFS/RDMA operation over the ConnectX Infiniband HCA in the default memreg mode. Signed-off-by:
Vu Pham <vu@mellanox.com> Signed-off-by:
Tom Talpey <tmtalpey@gmail.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- Nov 26, 2008
-
-
Ingo Molnar authored
fix this warning: net/sunrpc/xprtrdma/verbs.c: In function ‘rpcrdma_conn_upcall’: net/sunrpc/xprtrdma/verbs.c:279: warning: unused variable ‘addr’ Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 31, 2008
-
-
Harvey Harrison authored
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u can be replaced with %pI4 Signed-off-by:
Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- Oct 10, 2008
-
-
Tom Talpey authored
The RPC/RDMA connection logic could return early from reconnection attempts, leading to additional spurious retries. Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Talpey authored
Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Talpey authored
Add defensive timeouts to wait_for_completion() calls in RDMA address resolution, and make them interruptible. Fix the timeout units to milliseconds (formerly jiffies) and move to private header. Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Talpey authored
The RPC/RDMA code can leak RDMA connection manager endpoints in certain error cases on connect. Don't signal unwanted events, and be certain to destroy any allocated qp. Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Talpey authored
RDMA disconnects yield an upcall from the RDMA connection manager, which can race with rpc transport close, e.g. on ^C of a mount. Ensure any rdma cm_id and qp are fully destroyed before continuing. Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Tucker authored
This logic sets the connection parameter that configures the local device and informs the remote peer how many concurrent incoming RDMA_READ requests are supported. The original logic didn't really do what was intended for two reasons: - The max number supported by the device is typically smaller than any one factor in the calculation used, and - The field in the connection parameter structure where the value is stored is a u8 and always overflows for the default settings. So what really happens is the value requested for responder resources is the left over 8 bits from the "desired value". If the desired value happened to be a multiple of 256, the result was zero and it wouldn't connect at all. Given the above and the fact that max_requests is almost always larger than the max responder resources supported by the adapter, this patch simplifies this logic and simply requests the max supported by the device, subject to a reasonable limit. This bug was found by Jim Schutt at Sandia. Signed-off-by:
Tom Tucker <tom@opengridcomputing.com> Acked-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Talpey authored
Configure, detect and use "fastreg" support from IB/iWARP verbs layer to perform RPC/RDMA memory registration. Make FRMR the default memreg mode (will fall back if not supported by the selected RDMA adapter). This allows full and optimal operation over the cxgb3 adapter, and others. Signed-off-by:
Tom Talpey <talpey@netapp.com> Acked-by:
Tom Tucker <tom@opengridcomputing.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Talpey authored
At transport creation, check for, and use, any local dma lkey. Then, check that the selected memory registration mode is in fact supported by the RDMA adapter selected for the mount. Fall back to best alternative if not. Signed-off-by:
Tom Talpey <talpey@netapp.com> Acked-by:
Tom Tucker <tom@opengridcomputing.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Tom Talpey authored
Refactor the memory registration and deregistration routines. This saves stack space, makes the code more readable and prepares to add the new FRMR registration methods. Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- Apr 17, 2008
-
-
Roland Dreier authored
Add a new IB_WR_SEND_WITH_INV send opcode that can be used to mark a "send with invalidate" work request as defined in the iWARP verbs and the InfiniBand base memory management extensions. Also put "imm_data" and a new "invalidate_rkey" member in a new "ex" union in struct ib_send_wr. The invalidate_rkey member can be used to pass in an R_Key/STag to be invalidated. Add this new union to struct ib_uverbs_send_wr. Add code to copy the invalidate_rkey field in ib_uverbs_post_send(). Fix up low-level drivers to deal with the change to struct ib_send_wr, and just remove the imm_data initialization from net/sunrpc/xprtrdma/, since that code never does any send with immediate operations. Also, move the existing IB_DEVICE_SEND_W_INV flag to a new bit, since the iWARP drivers currently in the tree set the bit. The amso1100 driver at least will silently fail to honor the IB_SEND_INVALIDATE bit if passed in as part of userspace send requests (since it does not implement kernel bypass work request queueing). Remove the flag from all existing drivers that set it until we know which ones are OK. The values chosen for the new flag is not consecutive to avoid clashing with flags defined in the XRC patches, which are not merged yet but which are already in use and are likely to be merged soon. This resurrects a patch sent long ago by Mikkel Hagen <mhagen@iol.unh.edu>. Signed-off-by:
Roland Dreier <rolandd@cisco.com>
-
- Jan 30, 2008
-
-
Chuck Lever authored
Minor: Replace an empty if statement with a debugging dprintk. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Cc: Thomas Talpey <Thomas.Talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- Oct 16, 2007
-
-
Andrew Morton authored
sparc64: net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 3) net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 4) Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "David S. Miller" <davem@davemloft.net> Cc: "J. Bruce Fields" <bfields@fieldses.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Oct 09, 2007
-
-
\"Talpey, Thomas\ authored
This implements the interface from rpcrdma to the RDMA verbs interface supported by Infniband and iWARP. Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
James Lentini <jlentini@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
\"Talpey, Thomas\ authored
This implements the configuration and building of the core transport switch implementation of the rpcrdma transport. Stubs are provided for the rpcrdma protocol handling, and the infiniband/iwarp verbs interface. These are provided in following patches. Signed-off-by:
Tom Talpey <talpey@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-