Merge tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull netfs updates from Christian Brauner:
 "This contains the work to improve read/write performance for the new
  netfs library.

  The main performance enhancing changes are:

   - Define a structure, struct folio_queue, and a new iterator type,
     ITER_FOLIOQ, to hold a buffer as a replacement for ITER_XARRAY. See
     that patch for questions about naming and form.

     The problem with an xarray is that accessing it requires the use of
     a lock (typically the RCU read lock), which means that we can't
     supply iterate_and_advance() with a step function that might sleep
     (crypto for example) without having to drop the lock between pages.
     ITER_FOLIOQ is the iterator for a chain of folio_queue structs,
     where each folio_queue holds a small list of folios. A folio_queue
     struct is a simpler structure than xarray and is not subject to
     concurrent manipulation by the VM. folio_queue is used rather than
     a bvec[] as it can form lists of indefinite size, adding to one end
     and removing from the other on the fly.

   - Provide a copy_folio_from_iter() wrapper.

   - Make cifs RDMA support ITER_FOLIOQ.

   - Use folio queues in the write-side helpers instead of xarrays.

   - Add a function to reset the iterator in a subrequest.

   - Simplify the write-side helpers to use sheaves to skip gaps rather
     than trying to work out where gaps are.

   - In afs, make the read subrequests asynchronous, putting them into
     work items to allow the next patch to do progressive
     unlocking/reading.

   - Overhaul the read-side helpers to improve performance.

   - Fix the caching of a partial block at the end of a file.

   - Allow a store to be cancelled.

  Then some changes for cifs to make it use folio queues instead of
  xarrays for crypto bufferage:

   - Use raw iteration functions rather than manually coding iteration
     when hashing data.

   - Switch to using folio_queue for crypto buffers.

   - Remove the xarray bits.

  Make some adjustments to the /proc/fs/netfs/stats file:

   - All the netfs stats lines currently begin 'Netfs:'; change this
     prefix to something a bit more useful.

   - Add a couple of stats counters to track the numbers of skips and
     waits on the per-inode writeback serialisation lock to make it
     easier to check for this as a source of performance loss.

  Miscellaneous work:

   - Ensure that the sb_writers lock is taken around
     vfs_{set,remove}xattr() in the cachefiles code.

   - Reduce the number of conditional branches in netfs_perform_write().

   - Move the CIFS_INO_MODIFIED_ATTR flag to the netfs_inode struct and
     remove cifs_post_modify().

   - Move the max_len/max_nr_segs members from netfs_io_subrequest to
     netfs_io_stream as they're only needed for one subreq at a time.

   - Add an 'unknown' source value for tracing purposes.

   - Remove NETFS_COPY_TO_CACHE as it's no longer used.

   - Set the request work function up front at allocation time.

   - Use bh-disabling spinlocks for rreq->lock as cachefiles completion
     may be run from block-filesystem DIO completion in softirq context.

   - Remove fs/netfs/io.c"

* tag 'vfs-6.12.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
  docs: filesystems: corrected grammar of netfs page
  cifs: Don't support ITER_XARRAY
  cifs: Switch crypto buffer to use a folio_queue rather than an xarray
  cifs: Use iterate_and_advance*() routines directly for hashing
  netfs: Cancel dirty folios that have no storage destination
  cachefiles, netfs: Fix write to partial block at EOF
  netfs: Remove fs/netfs/io.c
  netfs: Speed up buffered reading
  afs: Make read subreqs async
  netfs: Simplify the writeback code
  netfs: Provide an iterator-reset function
  netfs: Use new folio_queue data type and iterator instead of xarray iter
  cifs: Provide the capability to extract from ITER_FOLIOQ to RDMA SGEs
  iov_iter: Provide copy_folio_from_iter()
  mm: Define struct folio_queue and ITER_FOLIOQ to handle a sequence of folios
  netfs: Use bh-disabling spinlocks for rreq->lock
  netfs: Set the request work function upon allocation
  netfs: Remove NETFS_COPY_TO_CACHE
  netfs: Reserve netfs_sreq_source 0 as unset/unknown
  netfs: Move max_len/max_nr_segs from netfs_io_subrequest to netfs_io_stream
  ...
torvalds committed Sep 16, 2024
2 parents 9020d0d + 4b40d43 commit 35219bc
Showing 42 changed files with 3,523 additions and 1,986 deletions.
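
The folio_queue description in the pull message can be illustrated with a small userspace model. This is only a sketch under assumptions: `struct seg`, `SEG_SLOTS`, `chain_push()` and `chain_pop()` are hypothetical names, the slots hold plain integers rather than folio pointers, and none of it is the kernel's struct folio_queue API. It shows the property the message calls out: a chain of small fixed-size segments can grow at one end and shrink at the other on the fly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SEG_SLOTS 4	/* hypothetical: slots per segment */

/* One segment in the chain; stands in for a folio_queue-like node. */
struct seg {
	struct seg *next;
	unsigned int head;	/* next slot to consume */
	unsigned int tail;	/* next slot to fill */
	int slots[SEG_SLOTS];	/* stand-in for folio pointers */
};

struct chain {
	struct seg *first;	/* consume from here */
	struct seg *last;	/* append here */
};

/* Append a value at the tail, adding a segment when the last one is full. */
static int chain_push(struct chain *c, int v)
{
	if (!c->last || c->last->tail == SEG_SLOTS) {
		struct seg *s = calloc(1, sizeof(*s));

		if (!s)
			return -1;
		if (c->last)
			c->last->next = s;
		else
			c->first = s;
		c->last = s;
	}
	c->last->slots[c->last->tail++] = v;
	return 0;
}

/* Remove a value from the head, freeing segments that are used up. */
static int chain_pop(struct chain *c, int *v)
{
	struct seg *s = c->first;

	if (!s || s->head == s->tail)
		return -1;		/* nothing queued */
	*v = s->slots[s->head++];
	if (s->head == SEG_SLOTS) {	/* segment fully consumed */
		c->first = s->next;
		if (!c->first)
			c->last = NULL;
		free(s);
	}
	return 0;
}
```

A producer calls `chain_push()` while a consumer calls `chain_pop()`; in this model a consumed segment is released as soon as its last slot has been read, so the chain never needs a fixed upper size.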
2 changes: 1 addition & 1 deletion Documentation/filesystems/netfs_library.rst
@@ -116,7 +116,7 @@ The following services are provided:
 * Handle local caching, allowing cached data and server-read data to be
 interleaved for a single request.
 
-* Handle clearing of bufferage that aren't on the server.
+* Handle clearing of bufferage that isn't on the server.
 
 * Handle retrying of reads that failed, switching reads from the cache to the
 server as necessary.
11 changes: 8 additions & 3 deletions fs/9p/vfs_addr.c
@@ -68,17 +68,22 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq)
 {
 	struct netfs_io_request *rreq = subreq->rreq;
 	struct p9_fid *fid = rreq->netfs_priv;
+	unsigned long long pos = subreq->start + subreq->transferred;
 	int total, err;
 
-	total = p9_client_read(fid, subreq->start + subreq->transferred,
-			       &subreq->io_iter, &err);
+	total = p9_client_read(fid, pos, &subreq->io_iter, &err);
 
 	/* if we just extended the file size, any portion not in
 	 * cache won't be on server and is zeroes */
 	if (subreq->rreq->origin != NETFS_DIO_READ)
 		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
+	if (pos + total >= i_size_read(rreq->inode))
+		__set_bit(NETFS_SREQ_HIT_EOF, &subreq->flags);
 
-	netfs_subreq_terminated(subreq, err ?: total, false);
+	if (!err)
+		subreq->transferred += total;
+
+	netfs_read_subreq_terminated(subreq, err, false);
 }
 
 /**
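
The hunk above changes who accounts for the bytes read: the filesystem now adds to subreq->transferred itself, passes only the error to the termination call, and flags end-of-file when the read reaches i_size. A minimal userspace sketch of that convention follows. It assumes simplified stand-in types: `struct sreq`, `SREQ_HIT_EOF` and `finish_read()` are hypothetical and are not kernel definitions.

```c
#include <assert.h>

#define SREQ_HIT_EOF 0x1u	/* hypothetical stand-in for NETFS_SREQ_HIT_EOF */

/* Hypothetical, much reduced stand-in for a read subrequest. */
struct sreq {
	unsigned long long start;	/* file offset of the subrequest */
	unsigned long long transferred;	/* bytes read so far */
	unsigned int flags;
	int error;			/* what would be handed to termination */
};

/*
 * Record the outcome of one read of 'total' bytes, in the same order as the
 * hunk: compute pos first, flag EOF, then account the bytes only on success.
 */
static void finish_read(struct sreq *s, unsigned int total, int err,
			unsigned long long i_size)
{
	unsigned long long pos = s->start + s->transferred;

	if (pos + total >= i_size)
		s->flags |= SREQ_HIT_EOF;
	if (!err)
		s->transferred += total;
	s->error = err;
}
```

In this model the byte count and the error travel separately, so a short but successful read and a failed read can no longer be confused through a single overloaded return value.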
30 changes: 23 additions & 7 deletions fs/afs/file.c
@@ -16,6 +16,7 @@
 #include <linux/mm.h>
 #include <linux/swap.h>
 #include <linux/netfs.h>
+#include <trace/events/netfs.h>
 #include "internal.h"
 
 static int afs_file_mmap(struct file *file, struct vm_area_struct *vma);
@@ -242,9 +243,10 @@ static void afs_fetch_data_notify(struct afs_operation *op)
 
 	req->error = error;
 	if (subreq) {
-		if (subreq->rreq->origin != NETFS_DIO_READ)
-			__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
-		netfs_subreq_terminated(subreq, error ?: req->actual_len, false);
+		subreq->rreq->i_size = req->file_size;
+		if (req->pos + req->actual_len >= req->file_size)
+			__set_bit(NETFS_SREQ_HIT_EOF, &subreq->flags);
+		netfs_read_subreq_terminated(subreq, error, false);
 		req->subreq = NULL;
 	} else if (req->done) {
 		req->done(req);
@@ -262,6 +264,12 @@ static void afs_fetch_data_success(struct afs_operation *op)
 	afs_fetch_data_notify(op);
 }
 
+static void afs_fetch_data_aborted(struct afs_operation *op)
+{
+	afs_check_for_remote_deletion(op);
+	afs_fetch_data_notify(op);
+}
+
 static void afs_fetch_data_put(struct afs_operation *op)
 {
 	op->fetch.req->error = afs_op_error(op);
@@ -272,7 +280,7 @@ static const struct afs_operation_ops afs_fetch_data_operation = {
 	.issue_afs_rpc = afs_fs_fetch_data,
 	.issue_yfs_rpc = yfs_fs_fetch_data,
 	.success = afs_fetch_data_success,
-	.aborted = afs_check_for_remote_deletion,
+	.aborted = afs_fetch_data_aborted,
 	.failed = afs_fetch_data_notify,
 	.put = afs_fetch_data_put,
 };
@@ -294,7 +302,7 @@ int afs_fetch_data(struct afs_vnode *vnode, struct afs_read *req)
 	op = afs_alloc_operation(req->key, vnode->volume);
 	if (IS_ERR(op)) {
 		if (req->subreq)
-			netfs_subreq_terminated(req->subreq, PTR_ERR(op), false);
+			netfs_read_subreq_terminated(req->subreq, PTR_ERR(op), false);
 		return PTR_ERR(op);
 	}
 
@@ -305,14 +313,15 @@ int afs_fetch_data(struct afs_vnode *vnode, struct afs_read *req)
 	return afs_do_sync_operation(op);
 }
 
-static void afs_issue_read(struct netfs_io_subrequest *subreq)
+static void afs_read_worker(struct work_struct *work)
 {
+	struct netfs_io_subrequest *subreq = container_of(work, struct netfs_io_subrequest, work);
 	struct afs_vnode *vnode = AFS_FS_I(subreq->rreq->inode);
 	struct afs_read *fsreq;
 
 	fsreq = afs_alloc_read(GFP_NOFS);
 	if (!fsreq)
-		return netfs_subreq_terminated(subreq, -ENOMEM, false);
+		return netfs_read_subreq_terminated(subreq, -ENOMEM, false);
 
 	fsreq->subreq = subreq;
 	fsreq->pos = subreq->start + subreq->transferred;
@@ -321,10 +330,17 @@ static void afs_issue_read(struct netfs_io_subrequest *subreq)
 	fsreq->vnode = vnode;
 	fsreq->iter = &subreq->io_iter;
 
+	trace_netfs_sreq(subreq, netfs_sreq_trace_submit);
 	afs_fetch_data(fsreq->vnode, fsreq);
 	afs_put_read(fsreq);
 }
 
+static void afs_issue_read(struct netfs_io_subrequest *subreq)
+{
+	INIT_WORK(&subreq->work, afs_read_worker);
+	queue_work(system_long_wq, &subreq->work);
+}
+
 static int afs_symlink_read_folio(struct file *file, struct folio *folio)
 {
 	struct afs_vnode *vnode = AFS_FS_I(folio->mapping->host);
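
The afs change above turns the issue function into a thin wrapper that queues a work item, and the worker recovers its subrequest from the embedded work member with container_of(). The sketch below models that shape in userspace. It rests on assumptions: `struct work`, `struct subreq`, `issue_read()`, `read_worker()` and `run_pending()` are hypothetical, and the fixed array is a toy stand-in for a kernel workqueue, not a model of its scheduling.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace rendering of the container_of() idea. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-ins for work_struct and a subrequest embedding one. */
struct work {
	void (*func)(struct work *w);
};

struct subreq {
	int id;
	int issued;
	struct work work;	/* embedded, as subreq->work is in the hunk */
};

#define MAX_PENDING 8
static struct work *pending[MAX_PENDING];	/* toy queue, not a workqueue */
static int nr_pending;

static void read_worker(struct work *w)
{
	/* Recover the owning subrequest from the embedded work item. */
	struct subreq *s = container_of(w, struct subreq, work);

	s->issued = 1;	/* a real worker would perform the fetch here */
}

/* Issue path: only records the work and returns immediately. */
static int issue_read(struct subreq *s)
{
	if (nr_pending == MAX_PENDING)
		return -1;
	s->work.func = read_worker;
	pending[nr_pending++] = &s->work;
	return 0;
}

/* Stand-in for a worker thread draining the queue later. */
static void run_pending(void)
{
	for (int i = 0; i < nr_pending; i++)
		pending[i]->func(pending[i]);
	nr_pending = 0;
}
```

Because `issue_read()` returns before any I/O happens, the caller in this model can go on to issue further subrequests while earlier ones are still waiting to run.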
9 changes: 7 additions & 2 deletions fs/afs/fsclient.c
@@ -304,6 +304,7 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
 	struct afs_vnode_param *vp = &op->file[0];
 	struct afs_read *req = op->fetch.req;
 	const __be32 *bp;
+	size_t count_before;
 	int ret;
 
 	_enter("{%u,%zu,%zu/%llu}",
@@ -345,10 +346,14 @@ static int afs_deliver_fs_fetch_data(struct afs_call *call)
 
 		/* extract the returned data */
 	case 2:
-		_debug("extract data %zu/%llu",
-		       iov_iter_count(call->iter), req->actual_len);
+		count_before = call->iov_len;
+		_debug("extract data %zu/%llu", count_before, req->actual_len);
 
 		ret = afs_extract_data(call, true);
+		if (req->subreq) {
+			req->subreq->transferred += count_before - call->iov_len;
+			netfs_read_subreq_progress(req->subreq, false);
+		}
 		if (ret < 0)
 			return ret;
 
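
The delivery hunk above reports progress incrementally: it snapshots the remaining byte count, extracts whatever has arrived, and credits the difference to the subrequest. A small sketch of that arithmetic, assuming iov_len behaves as a remaining-byte counter (which is how the hunk uses it). `struct call`, `extract_some()` and `deliver_step()` are hypothetical names for this model only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the call state; only the remaining byte count. */
struct call {
	size_t iov_len;		/* bytes still expected in the current buffer */
};

/* Pretend 'got' bytes arrived, as one extraction pass might consume them. */
static void extract_some(struct call *c, size_t got)
{
	c->iov_len -= (got < c->iov_len) ? got : c->iov_len;
}

/*
 * One delivery step: snapshot the remaining count, extract, and return how
 * many bytes this step consumed (count_before minus what is left).
 */
static size_t deliver_step(struct call *c, size_t got)
{
	size_t count_before = c->iov_len;

	extract_some(c, got);
	return count_before - c->iov_len;
}
```

Summing the return values across steps gives the running transferred total, which is what lets a caller act on partially received data instead of waiting for the whole reply.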
4 changes: 3 additions & 1 deletion fs/afs/write.c
@@ -89,10 +89,12 @@ static const struct afs_operation_ops afs_store_data_operation = {
  */
 void afs_prepare_write(struct netfs_io_subrequest *subreq)
 {
+	struct netfs_io_stream *stream = &subreq->rreq->io_streams[subreq->stream_nr];
+
 	//if (test_bit(NETFS_SREQ_RETRYING, &subreq->flags))
 	//	subreq->max_len = 512 * 1024;
 	//else
-	subreq->max_len = 256 * 1024 * 1024;
+	stream->sreq_max_len = 256 * 1024 * 1024;
 }
 
 /*
9 changes: 7 additions & 2 deletions fs/afs/yfsclient.c
@@ -355,6 +355,7 @@ static int yfs_deliver_fs_fetch_data64(struct afs_call *call)
 	struct afs_vnode_param *vp = &op->file[0];
 	struct afs_read *req = op->fetch.req;
 	const __be32 *bp;
+	size_t count_before;
 	int ret;
 
 	_enter("{%u,%zu, %zu/%llu}",
@@ -391,10 +392,14 @@ static int yfs_deliver_fs_fetch_data64(struct afs_call *call)
 
 		/* extract the returned data */
 	case 2:
-		_debug("extract data %zu/%llu",
-		       iov_iter_count(call->iter), req->actual_len);
+		count_before = call->iov_len;
+		_debug("extract data %zu/%llu", count_before, req->actual_len);
 
 		ret = afs_extract_data(call, true);
+		if (req->subreq) {
+			req->subreq->transferred += count_before - call->iov_len;
+			netfs_read_subreq_progress(req->subreq, false);
+		}
 		if (ret < 0)
 			return ret;
 
19 changes: 17 additions & 2 deletions fs/cachefiles/io.c
@@ -627,11 +627,12 @@ static void cachefiles_prepare_write_subreq(struct netfs_io_subrequest *subreq)
 {
 	struct netfs_io_request *wreq = subreq->rreq;
 	struct netfs_cache_resources *cres = &wreq->cache_resources;
+	struct netfs_io_stream *stream = &wreq->io_streams[subreq->stream_nr];
 
 	_enter("W=%x[%x] %llx", wreq->debug_id, subreq->debug_index, subreq->start);
 
-	subreq->max_len = MAX_RW_COUNT;
-	subreq->max_nr_segs = BIO_MAX_VECS;
+	stream->sreq_max_len = MAX_RW_COUNT;
+	stream->sreq_max_segs = BIO_MAX_VECS;
 
 	if (!cachefiles_cres_file(cres)) {
 		if (!fscache_wait_for_operation(cres, FSCACHE_WANT_WRITE))
@@ -647,6 +648,7 @@ static void cachefiles_issue_write(struct netfs_io_subrequest *subreq)
 	struct netfs_cache_resources *cres = &wreq->cache_resources;
 	struct cachefiles_object *object = cachefiles_cres_object(cres);
 	struct cachefiles_cache *cache = object->volume->cache;
+	struct netfs_io_stream *stream = &wreq->io_streams[subreq->stream_nr];
 	const struct cred *saved_cred;
 	size_t off, pre, post, len = subreq->len;
 	loff_t start = subreq->start;
@@ -660,6 +662,7 @@ static void cachefiles_issue_write(struct netfs_io_subrequest *subreq)
 	if (off) {
 		pre = CACHEFILES_DIO_BLOCK_SIZE - off;
 		if (pre >= len) {
+			fscache_count_dio_misfit();
 			netfs_write_subrequest_terminated(subreq, len, false);
 			return;
 		}
@@ -670,10 +673,22 @@ static void cachefiles_issue_write(struct netfs_io_subrequest *subreq)
 	}
 
 	/* We also need to end on the cache granularity boundary */
+	if (start + len == wreq->i_size) {
+		size_t part = len % CACHEFILES_DIO_BLOCK_SIZE;
+		size_t need = CACHEFILES_DIO_BLOCK_SIZE - part;
+
+		if (part && stream->submit_extendable_to >= need) {
+			len += need;
+			subreq->len += need;
+			subreq->io_iter.count += need;
+		}
+	}
+
 	post = len & (CACHEFILES_DIO_BLOCK_SIZE - 1);
 	if (post) {
 		len -= post;
 		if (len == 0) {
+			fscache_count_dio_misfit();
 			netfs_write_subrequest_terminated(subreq, post, false);
 			return;
 		}
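
The end-of-file handling added above is easiest to follow with numbers. The sketch below reproduces only the length arithmetic, under stated assumptions: the block size is taken to be 4096, the start of the write is assumed to be block aligned already, and `aligned_write_len()` with its `at_eof` and `extendable` parameters is a hypothetical helper standing in for the checks against wreq->i_size and stream->submit_extendable_to.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 4096u	/* assumed value of CACHEFILES_DIO_BLOCK_SIZE */

/*
 * Given a write of 'len' bytes whose start is already block aligned, return
 * the length that would be submitted to the cache. 'at_eof' says the write
 * ends at i_size; 'extendable' is how far the buffer may be extended.
 */
static size_t aligned_write_len(size_t len, int at_eof, size_t extendable)
{
	if (at_eof) {
		size_t part = len % BLOCK;
		size_t need = BLOCK - part;

		/* Round a partial final block up when there is room. */
		if (part && extendable >= need)
			len += need;
	}

	/* Whatever still overhangs a block boundary is trimmed off. */
	return len - (len & (BLOCK - 1));
}
```

For a 5000-byte write ending at EOF, part is 904 and need is 3192, so with enough extendable buffer the write grows to 8192 bytes and the partial block is cached. Without that room the tail is trimmed and only 4096 bytes are written, which is the situation the fix in this series addresses.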
34 changes: 26 additions & 8 deletions fs/cachefiles/xattr.c
@@ -64,9 +64,15 @@ int cachefiles_set_object_xattr(struct cachefiles_object *object)
 	memcpy(buf->data, fscache_get_aux(object->cookie), len);
 
 	ret = cachefiles_inject_write_error();
-	if (ret == 0)
-		ret = vfs_setxattr(&nop_mnt_idmap, dentry, cachefiles_xattr_cache,
-				   buf, sizeof(struct cachefiles_xattr) + len, 0);
+	if (ret == 0) {
+		ret = mnt_want_write_file(file);
+		if (ret == 0) {
+			ret = vfs_setxattr(&nop_mnt_idmap, dentry,
+					   cachefiles_xattr_cache, buf,
+					   sizeof(struct cachefiles_xattr) + len, 0);
+			mnt_drop_write_file(file);
+		}
+	}
 	if (ret < 0) {
 		trace_cachefiles_vfs_error(object, file_inode(file), ret,
 					   cachefiles_trace_setxattr_error);
@@ -151,8 +157,14 @@ int cachefiles_remove_object_xattr(struct cachefiles_cache *cache,
 	int ret;
 
 	ret = cachefiles_inject_remove_error();
-	if (ret == 0)
-		ret = vfs_removexattr(&nop_mnt_idmap, dentry, cachefiles_xattr_cache);
+	if (ret == 0) {
+		ret = mnt_want_write(cache->mnt);
+		if (ret == 0) {
+			ret = vfs_removexattr(&nop_mnt_idmap, dentry,
+					      cachefiles_xattr_cache);
+			mnt_drop_write(cache->mnt);
+		}
+	}
 	if (ret < 0) {
 		trace_cachefiles_vfs_error(object, d_inode(dentry), ret,
 					   cachefiles_trace_remxattr_error);
@@ -208,9 +220,15 @@ bool cachefiles_set_volume_xattr(struct cachefiles_volume *volume)
 	memcpy(buf->data, p, volume->vcookie->coherency_len);
 
 	ret = cachefiles_inject_write_error();
-	if (ret == 0)
-		ret = vfs_setxattr(&nop_mnt_idmap, dentry, cachefiles_xattr_cache,
-				   buf, len, 0);
+	if (ret == 0) {
+		ret = mnt_want_write(volume->cache->mnt);
+		if (ret == 0) {
+			ret = vfs_setxattr(&nop_mnt_idmap, dentry,
+					   cachefiles_xattr_cache,
+					   buf, len, 0);
+			mnt_drop_write(volume->cache->mnt);
+		}
+	}
 	if (ret < 0) {
 		trace_cachefiles_vfs_error(NULL, d_inode(dentry), ret,
 					   cachefiles_trace_setxattr_error);
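
All three xattr hunks share one shape: bail out on an injected error, take write access, perform the operation, and drop write access regardless of the operation's result. The userspace sketch below makes that bracket observable with counters. Every function in it is a hypothetical stub (`fake_want_write()`, `fake_drop_write()`, `fake_setxattr()`, `guarded_setxattr()`); none of them is the kernel's mnt_want_write() or vfs_setxattr().

```c
#include <assert.h>

/* Hypothetical stubs: counters let the bracket be observed in userspace. */
static int write_refs;		/* outstanding "want write" holds */
static int want_result;		/* value the fake want-write call returns */
static int op_result;		/* value the fake xattr operation returns */
static int op_calls;		/* how often the operation actually ran */

static int fake_want_write(void)
{
	if (want_result == 0)
		write_refs++;
	return want_result;
}

static void fake_drop_write(void)
{
	write_refs--;
}

static int fake_setxattr(void)
{
	op_calls++;
	return op_result;
}

/*
 * Same shape as the hunks: skip everything on an injected error, take write
 * access, run the operation, and drop write access whether or not the
 * operation succeeded.
 */
static int guarded_setxattr(int injected_error)
{
	int ret = injected_error;

	if (ret == 0) {
		ret = fake_want_write();
		if (ret == 0) {
			ret = fake_setxattr();
			fake_drop_write();
		}
	}
	return ret;
}
```

The property to check in such a bracket is that the hold count returns to zero on every path and that the operation never runs when write access was refused.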