- Feb 20, 2019
-
-
zhangliguang authored
This fixes the typo in comments of nfs_readdir_alloc_pages(). Because nfs_readdir_large_page and nfs_readdir_free_pagearray had been renamed. Signed-off-by:
Liguang Zhang <zhangliguang@linux.alibaba.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
zhangliguang authored
This removes redundant semicolon for ending code. Fixes: c7944ebb ("NFSv4: Fix lookup revalidate of regular files") Signed-off-by:
Liguang Zhang <zhangliguang@linux.alibaba.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
luanshi authored
When listing very large directories via NFS, clients may take a long time to complete. There are about three factors involved: First of all, ls and practically every other method of listing a directory including python os.listdir and find rely on libc readdir(). However readdir() only reads 32K of directory entries at a time, which means that if you have a lot of files in the same directory, it is going to take an insanely long time to read all the directory entries. Secondly, libc readdir() reads 32K of directory entries at a time, in kernel space 32K buffer split into 8 pages. One NFS readdirplus rpc will be called for one page, which introduces many readdirplus rpc calls. Lastly, one NFS readdirplus rpc asks for 32K data (filled by nfs_dentry) to fill one page (filled by dentry), we found that nearly one third of data was wasted. To solve above problems, pagecache mechanism was introduced. One NFS readdirplus rpc will ask for a large data (more than 32k), the data can fill more than one page, the cached pages can be used for next readdir call. This can reduce many readdirplus rpc calls and improve readdirplus performance. TESTING: When listing very large directories(include 300 thousand files) via NFS time ls -l /nfs_mount | wc -l without the patch: 300001 real 1m53.524s user 0m2.314s sys 0m2.599s with the patch: 300001 real 0m23.487s user 0m2.305s sys 0m2.558s Improved performance: 79.6% readdirplus rpc calls decrease: 85% Signed-off-by:
Liguang Zhang <zhangliguang@linux.alibaba.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Eric W. Biederman authored
In the rare and unsupported case of a hostname list nfs_parse_devname will modify dev_name. There is no need to modify dev_name as the all that is being computed is the length of the hostname, so the computed length can just be shorted. Fixes: dc045898 ("NFS: Use common device name parsing logic for NFSv4 and NFSv2/v3") Signed-off-by:
"Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Julia Lawall authored
Drop LIST_HEAD where the variable it declares has never been used. The semantic patch that fixes this problem is as follows: (http://coccinelle.lip6.fr/ ) // <smpl> @@ identifier x; @@ - LIST_HEAD(x); ... when != x // </smpl> Fixes: 0e20162e ("NFSv4.1 Use MDS auth flavor for data server connection") Signed-off-by:
Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Fix up some compiler warnings about function parameters, etc not being correctly described or formatted. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
All the allocations that we can hit in the NFS layer and sunrpc layers themselves are already marked as GFP_NOFS, but we need to ensure that any calls to generic kernel functionality do the right thing as well. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
Allow the caller to pass error information when cleaning up a failed I/O request so that we can conditionally take action to cancel the request altogether if the error turned out to be fatal. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
In several places we're just moving the struct nfs_page from one list to another by first removing from the existing list, then adding to the new one. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com>
-
Trond Myklebust authored
If the I/O completion failed with a fatal error, then we should just exit nfs_pageio_complete_mirror() rather than try to recoalesce. Fixes: a7d42ddb ("nfs: add mirroring support to pgio layer") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.0+
-
Trond Myklebust authored
Whether we need to exit early, or just reprocess the list, we must not lost track of the request which failed to get recoalesced. Fixes: 03d5eb65 ("NFS: Fix a memory leak in nfs_do_recoalesce") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.0+
-
Trond Myklebust authored
When we fail to add the request to the I/O queue, we currently leave it to the caller to free the failed request. However since some of the requests that fail are actually created by nfs_pageio_add_request() itself, and are not passed back the caller, this leads to a leakage issue, which can again cause page locks to leak. This commit addresses the leakage by freeing the created requests on error, using desc->pg_completion_ops->error_cleanup() Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Fixes: a7d42ddb ("nfs: add mirroring support to pgio layer") Cc: stable@vger.kernel.org # v4.0: c18b96a1: nfs: clean up rest of reqs Cc: stable@vger.kernel.org # v4.0: d600ad1f: NFS41: pop some layoutget Cc: stable@vger.kernel.org # v4.0+
-
- Feb 15, 2019
-
-
David Howells authored
In the request_key() upcall mechanism there's a dependency loop by which if a key type driver overrides the ->request_key hook and the userspace side manages to lose the authorisation key, the auth key and the internal construction record (struct key_construction) can keep each other pinned. Fix this by the following changes: (1) Killing off the construction record and using the auth key instead. (2) Including the operation name in the auth key payload and making the payload available outside of security/keys/. (3) The ->request_key hook is given the authkey instead of the cons record and operation name. Changes (2) and (3) allow the auth key to naturally be cleaned up if the keyring it is in is destroyed or cleared or the auth key is unlinked. Fixes: 7ee02a316600 ("keys: Fix dependency loop between construction record and auth key") Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
James Morris <james.morris@microsoft.com>
-
- Feb 12, 2019
-
-
Benjamin Coddington authored
If nfs_page_async_flush() removes the page from the mapping, then we can't use page_file_mapping() on it as nfs_updatepate() is wont to do when receiving an error. Instead, push the mapping to the stack before the page is possibly truncated. Fixes: 8fc75bed ("NFS: Fix up return value on fatal errors in nfs_page_async_flush()") Signed-off-by:
Benjamin Coddington <bcodding@redhat.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- Jan 29, 2019
-
-
Trond Myklebust authored
Ensure that we return the fatal error value that caused us to exit nfs_page_async_flush(). Fixes: c373fff7 ("NFSv4: Don't special case "launder"") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Cc: stable@vger.kernel.org # v4.12+ Reviewed-by:
Benjamin Coddington <bcodding@redhat.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- Jan 28, 2019
-
-
Yao Liu authored
There is a NULL pointer dereference of dev_name in nfs_parse_devname() The oops looks something like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 ... RIP: 0010:nfs_fs_mount+0x3b6/0xc20 [nfs] ... Call Trace: ? ida_alloc_range+0x34b/0x3d0 ? nfs_clone_super+0x80/0x80 [nfs] ? nfs_free_parsed_mount_data+0x60/0x60 [nfs] mount_fs+0x52/0x170 ? __init_waitqueue_head+0x3b/0x50 vfs_kern_mount+0x6b/0x170 do_mount+0x216/0xdc0 ksys_mount+0x83/0xd0 __x64_sys_mount+0x25/0x30 do_syscall_64+0x65/0x220 entry_SYSCALL_64_after_hwframe+0x49/0xbe Fix this by adding a NULL check on dev_name Signed-off-by:
Yao Liu <yotta.liu@ucloud.cn> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- Jan 15, 2019
-
-
Olga Kornievskaia authored
Currently nfs42_proc_copy_file_range() can not return EAGAIN. Fixes: e4648aa4 ("NFS recover from destination server reboot for copies") Signed-off-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- Jan 02, 2019
-
-
Santosh kumar pradhan authored
Multipathing: In case of NFSv3, rpc_clnt_test_and_add_xprt() adds the xprt to xprt switch (i.e. xps) if rpc_call_null_helper() returns success. But in case of NFSv4.1, it needs to do EXCHANGEID to verify the path along with check for session trunking. Add the xprt in nfs4_test_session_trunk() only when nfs4_detect_session_trunking() returns success. Also release refcount hold by rpc_clnt_setup_test_and_add_xprt(). Signed-off-by:
Santosh kumar pradhan <santoshkumar.pradhan@wdc.com> Tested-by:
Suresh Jayaraman <suresh.jayaraman@wdc.com> Reported-by:
Aditya Agnihotri <aditya.agnihotri@wdc.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
NeilBrown authored
As gte_current_cred() cannot return an error, this test is not necessary. It hasn't been necessary for years, but it wasn't so obvious before. Reported-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Olga Kornievskaia authored
Original commit (e4648aa4 "NFS recover from destination server reboot for copies") used memcmp() and then it was changed to use nfs4_stateid_match_other() but that function returns opposite of memcmp. As the result, recovery can't find the copy leading to copy hanging. Fixes: 80f42368 ("NFSv4: Split out NFS v4.2 copy completion functions") Fixes: cb7a8384 ("NFS: Split out the body of nfs4_reclaim_open_state") Signed-of-by:
Olga Kornievskaia <kolga@netapp.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
These symbolic values were not being displayed in string form. TRACE_DEFINE_ENUM was missing in many cases. It also turns out that __print_symbolic wants an unsigned long in the first field... Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Chuck Lever authored
Having to specify "proto=rdma,port=20049" is cumbersome. RFC 8267 Section 6.3 requires NFSv4 clients to use "the alternative well-known port number", which is 20049. Make the use of the well- known port number automatic, just as it is for NFS/TCP and port 2049. For NFSv2/3, Section 4.2 allows clients to simply choose 20049 as the default or use rpcbind. I don't know of an NFS/RDMA server implementation that registers it's NFS/RDMA service with rpcbind, so automatically choosing 20049 seems like the better choice. The other widely-deployed NFS/RDMA client, Solaris, also uses 20049 as the default port. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
- Dec 31, 2018
-
-
Vasily Averin authored
Patch fixes compilation error in nfs_callback_up_net() serv->sv_bc_enabled is defined under enabled CONFIG_SUNRPC_BACKCHANNEL, however nfs_callback_up_net() can access it even if this config option was not set. Fixes: a289ce53 (sunrpc: replace svc_serv->sv_bc_xprt by boolean flag) Reported-by:
kbuild test robot <lkp@intel.com> Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com>
-
- Dec 28, 2018
-
-
Arun KS authored
totalram_pages and totalhigh_pages are made static inline function. Main motivation was that managed_page_count_lock handling was complicating things. It was discussed in length here, https://lore.kernel.org/patchwork/patch/995739/#1181785 So it seemes better to remove the lock and convert variables to atomic, with preventing poteintial store-to-read tearing as a bonus. [akpm@linux-foundation.org: coding style fixes] Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org Signed-off-by:
Arun KS <arunks@codeaurora.org> Suggested-by:
Michal Hocko <mhocko@suse.com> Suggested-by:
Vlastimil Babka <vbabka@suse.cz> Reviewed-by:
Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by:
Pavel Tatashin <pasha.tatashin@soleen.com> Acked-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Vasily Averin authored
Closing ")" was lost in debug message. Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com>
-
Vasily Averin authored
svc_serv-> sv_bc_xprt is netns-unsafe and cannot be used as pointer. To prevent its misuse in future it is replaced by new boolean flag. Signed-off-by:
Vasily Averin <vvs@virtuozzo.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com>
-
- Dec 21, 2018
-
-
Chris Perl authored
This patch removes the check from nfs_compare_mount_options to see if a `sec' option was passed for the current mount before comparing auth flavors and instead just always compares auth flavors. Consider the following scenario: You have a server with the address 192.168.1.1 and two exports /export/a and /export/b. The first export supports `sys' and `krb5' security, the second just `sys'. Assume you start with no mounts from the server. The following results in EIOs being returned as the kernel nfs client incorrectly thinks it can share the underlying `struct nfs_server's: $ mkdir /tmp/{a,b} $ sudo mount -t nfs -o vers=3,sec=krb5 192.168.1.1:/export/a /tmp/a $ sudo mount -t nfs -o vers=3 192.168.1.1:/export/b /tmp/b $ df >/dev/null df: ‘/tmp/b’: Input/output error Signed-off-by:
Chris Perl <cperl@janestreet.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
Al Viro authored
Adding options to growing mnt_opts. NFS kludge with passing context= down into non-text-options mount switched to it, and with that the last use of ->sb_parse_opts_str() is gone. Reviewed-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Keep void * instead, allocate on demand (in parse_str_opts, at the moment). Eventually both selinux and smack will be better off with private structures with several strings in those, rather than this "counter and two pointers to dynamically allocated arrays" ugliness. This commit allows to do that at leisure, without disrupting anything outside of given module. Changes: * instead of struct security_mnt_opt use an opaque pointer initialized to NULL. * security_sb_eat_lsm_opts(), security_sb_parse_opts_str() and security_free_mnt_opts() take it as var argument (i.e. as void **); call sites are unchanged. * security_sb_set_mnt_opts() and security_sb_remount() take it by value (i.e. as void *). * new method: ->sb_free_mnt_opts(). Takes void *, does whatever freeing that needs to be done. * ->sb_set_mnt_opts() and ->sb_remount() might get NULL as mnt_opts argument, meaning "empty". Reviewed-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
* if mount(2) passes something like "context=foo" with MS_REMOUNT in flags (/sbin/mount.nfs will _not_ do that - you need to issue the syscall manually), you'll get leaked copies for LSM options. The reason is that instead of nfs_{alloc,free}_parsed_mount_data() nfs_remount() uses kzalloc/kfree, which lacks the needed cleanup. * selinux options are not changed on remount (as for any other fs), but in case of NFS the failure is quiet - they are not compared to what we used to have, with complaint in case of attempted changes. Trivially fixed by converting to use of security_sb_remount(). Reviewed-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
combination of alloc_secdata(), security_sb_copy_data(), security_sb_parse_opt_str() and free_secdata(). Reviewed-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- Dec 19, 2018
-
-
NeilBrown authored
SUNRPC has two sorts of credentials, both of which appear as "struct rpc_cred". There are "generic credentials" which are supplied by clients such as NFS and passed in 'struct rpc_message' to indicate which user should be used to authorize the request, and there are low-level credentials such as AUTH_NULL, AUTH_UNIX, AUTH_GSS which describe the credential to be sent over the wires. This patch replaces all the generic credentials by 'struct cred' pointers - the credential structure used throughout Linux. For machine credentials, there is a special 'struct cred *' pointer which is statically allocated and recognized where needed as having a special meaning. A look-up of a low-level cred will map this to a machine credential. Signed-off-by:
NeilBrown <neilb@suse.com> Acked-by:
J. Bruce Fields <bfields@redhat.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
NeilBrown authored
Use the common 'struct cred' to pass credentials for readdir. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
NeilBrown authored
Rather than keying the access cache with 'struct rpc_cred', use 'struct cred'. Then use cred_fscmp() to compare credentials rather than comparing the raw pointer. A benefit of this approach is that in the common case we avoid the rpc_lookup_cred_nonblock() call which can be slow when the cred cache is large. This also keeps many fewer items pinned in the rpc cred cache, so the cred cache is less likely to get large. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
NeilBrown authored
NFS needs to know when a credential is about to expire so that it can modify write-back behaviour to finish the write inside the expiry time. It currently uses functions in SUNRPC code which make use of a fairly complex callback scheme and flags in the generic credientials. As I am working to discard the generic credentials, this has to change. This patch moves the logic into NFS, in part by finding and caching the low-level credential in the open_context. We then make direct cred-api calls on that. This makes the code much simpler and removes a dependency on generic rpc credentials. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
NeilBrown authored
When NFS creates a machine credential, it is a "generic" credential, not tied to any auth protocol, and is really just a container for the princpal name. This doesn't get linked to a genuine credential until rpcauth_bindcred() is called. The lookup always succeeds, so various places that test if the machine credential is NULL, are pointless. As a step towards getting rid of generic credentials, this patch gets rid of generic machine credentials. The nfs_client and rpc_client just hold a pointer to a constant principal name. When a machine credential is wanted, a special static 'struct rpc_cred' pointer is used. rpcauth_bindcred() recognizes this, finds the principal from the client, and binds the correct credential. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-
NeilBrown authored
This lock is no longer necessary. If nfs4_get_renew_cred() needs to hunt through the open-state creds for a user cred, it still takes the lock to stablize the rbtree, but otherwise there are no races. Note that this completely removes the lock from nfs4_renew_state(). It appears that the original need for the locking here was removed long ago, and there is no longer anything to protect. Signed-off-by:
NeilBrown <neilb@suse.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com>
-