Merge tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcac…
…hefs

Pull bcachefs updates from Kent Overstreet:

 - Subvolume children btree; this is needed for providing a userspace
   interface for walking subvolumes, which will come later

 - Lots of improvements to directory structure checking

 - Improved journal pipelining, significantly improving performance on
   high iodepth write workloads

 - Discard path improvements: the discard path is more efficient, and no
   longer flushes the journal unnecessarily

 - Buffered write path can now avoid taking the inode lock

 - new mm helper: memalloc_flags_{save|restore}

 - mempool now does kvmalloc mempools

* tag 'bcachefs-2024-03-13' of https://evilpiepirate.org/git/bcachefs: (128 commits)
  bcachefs: time_stats: shrink time_stat_buffer for better alignment
  bcachefs: time_stats: split stats-with-quantiles into a separate structure
  bcachefs: mean_and_variance: put struct mean_and_variance_weighted on a diet
  bcachefs: time_stats: add larger units
  bcachefs: pull out time_stats.[ch]
  bcachefs: reconstruct_alloc cleanup
  bcachefs: fix bch_folio_sector padding
  bcachefs: Fix btree key cache coherency during replay
  bcachefs: Always flush write buffer in delete_dead_inodes()
  bcachefs: Fix order of gc_done passes
  bcachefs: fix deletion of indirect extents in btree_gc
  bcachefs: Prefer struct_size over open coded arithmetic
  bcachefs: Kill unused flags argument to btree_split()
  bcachefs: Check for writing superblocks with nonsense member seq fields
  bcachefs: fix bch2_journal_buf_to_text()
  lib/generic-radix-tree.c: Make nodes more reasonably sized
  bcachefs: copy_(to|from)_user_errcode()
  bcachefs: Split out bkey_types.h
  bcachefs: fix lost journal buf wakeup due to improved pipelining
  bcachefs: intercept mountoption value for bool type
  ...
torvalds committed Mar 15, 2024
2 parents e5eb28f + be28368 commit 32a5054
Showing 95 changed files with 3,770 additions and 2,253 deletions.
30 changes: 30 additions & 0 deletions Documentation/filesystems/bcachefs/errorcodes.rst
@@ -0,0 +1,30 @@
.. SPDX-License-Identifier: GPL-2.0
bcachefs private error codes
----------------------------

In bcachefs, as a hard rule we do not throw or directly use standard error
codes (-EINVAL, -EBUSY, etc.). Instead, we define private error codes as needed
in fs/bcachefs/errcode.h.

This gives us much better error messages and makes debugging much easier. Any
direct uses of standard error codes you see in the source code are simply old
code that has yet to be converted - feel free to clean it up!

Private error codes may subtype another error code. This allows for grouping of
related errors that should be handled similarly (e.g. transaction restart
errors), as well as specifying which standard error code should be returned at
the bcachefs module boundary.

At the module boundary, we use bch2_err_class() to convert to a standard error
code; this also emits a trace event so that the original error code can be
recovered even if it wasn't logged.

Do not reuse error codes! Generally speaking, a private error code should only
be thrown in one place. That means that when we see it in a log message we can
see, unambiguously, exactly which file and line number it was returned from.

Try to give error codes names that describe the error as precisely as is
reasonable. Frequently, the error will be logged at a place far removed from
where the error was generated; good names for error codes mean much more
descriptive and useful error messages.
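
The subtyping and module-boundary conversion described in this document can be sketched in userspace C. This is a minimal illustration, not the contents of fs/bcachefs/errcode.h: every `demo_` and `DEMO_` name is hypothetical, the numeric base is arbitrary, and the parent chosen for the restart group (EINTR) is picked only for the example. The trace event emitted by bch2_err_class() is not modelled.

```c
#include <errno.h>

/* Hypothetical private codes, numbered above the standard errno range. */
enum demo_errcode {
	DEMO_ERR_START = 2048,
	DEMO_ERR_transaction_restart,		/* parent of a group */
	DEMO_ERR_transaction_restart_relock,	/* subtype of the group */
	DEMO_ERR_ENOSPC_disk_reservation,	/* subtype of a standard code */
	DEMO_ERR_MAX,
};

#define DEMO_IDX(e)	((e) - DEMO_ERR_START - 1)

/* Parent of each private code: another private code or a standard errno. */
static const int demo_err_parent[] = {
	[DEMO_IDX(DEMO_ERR_transaction_restart)]	= EINTR, /* arbitrary */
	[DEMO_IDX(DEMO_ERR_transaction_restart_relock)]	= DEMO_ERR_transaction_restart,
	[DEMO_IDX(DEMO_ERR_ENOSPC_disk_reservation)]	= ENOSPC,
};

/* Returns the parent of a positive private code, or 0 if it has none. */
static int demo_parent_of(int err)
{
	if (err <= DEMO_ERR_START || err >= DEMO_ERR_MAX)
		return 0;
	return demo_err_parent[DEMO_IDX(err)];
}

/* True if err is class itself or a direct or indirect subtype of it. */
int demo_err_matches(int err, int class)
{
	if (err < 0)
		err = -err;
	if (class < 0)
		class = -class;

	while (err) {
		if (err == class)
			return 1;
		err = demo_parent_of(err);
	}
	return 0;
}

/* Walk up the parents until a standard error code is reached. */
int demo_err_class(int err)
{
	int e = err < 0 ? -err : err;

	while (e > DEMO_ERR_START)
		e = demo_parent_of(e);
	return -e;
}
```

Callers inside the module would test for a group with `demo_err_matches()`, and only the outermost entry points would call `demo_err_class()`, so the specific code stays available for logging until the last moment.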
1 change: 1 addition & 0 deletions MAINTAINERS
@@ -3555,6 +3555,7 @@ R: Brian Foster <bfoster@redhat.com>
L: linux-bcachefs@vger.kernel.org
S: Supported
C: irc://irc.oftc.net/bcache
T: git https://evilpiepirate.org/git/bcachefs.git
F: fs/bcachefs/

BDISP ST MEDIA DRIVER
4 changes: 4 additions & 0 deletions fs/bcachefs/Makefile
@@ -82,6 +82,7 @@ bcachefs-y := \
super-io.o \
sysfs.o \
tests.o \
time_stats.o \
thread_with_file.o \
trace.o \
two_state_shared_lock.o \
@@ -90,3 +91,6 @@ bcachefs-y := \
xattr.o

obj-$(CONFIG_MEAN_AND_VARIANCE_UNIT_TEST) += mean_and_variance_test.o

# Silence "note: xyz changed in GCC X.X" messages
subdir-ccflags-y += $(call cc-disable-warning, psabi)
219 changes: 177 additions & 42 deletions fs/bcachefs/alloc_background.c
@@ -29,6 +29,8 @@
#include <linux/sched/task.h>
#include <linux/sort.h>

static void bch2_discard_one_bucket_fast(struct bch_fs *c, struct bpos bucket);

/* Persistent alloc info: */

static const unsigned BCH_ALLOC_V1_FIELD_BYTES[] = {
@@ -860,23 +862,28 @@ int bch2_trigger_alloc(struct btree_trans *trans,
*bucket_gen(ca, new.k->p.offset) = new_a->gen;

bch2_dev_usage_update(c, ca, old_a, new_a, journal_seq, false);
percpu_up_read(&c->mark_lock);

#define eval_state(_a, expr) ({ const struct bch_alloc_v4 *a = _a; expr; })
#define statechange(expr) !eval_state(old_a, expr) && eval_state(new_a, expr)
#define bucket_flushed(a) (!a->journal_seq || a->journal_seq <= c->journal.flushed_seq_ondisk)

if (new_a->data_type == BCH_DATA_free &&
(!new_a->journal_seq || new_a->journal_seq < c->journal.flushed_seq_ondisk))
if (statechange(a->data_type == BCH_DATA_free) &&
bucket_flushed(new_a))
closure_wake_up(&c->freelist_wait);

if (new_a->data_type == BCH_DATA_need_discard &&
(!bucket_journal_seq || bucket_journal_seq < c->journal.flushed_seq_ondisk))
bch2_do_discards(c);
if (statechange(a->data_type == BCH_DATA_need_discard) &&
!bch2_bucket_is_open(c, new.k->p.inode, new.k->p.offset) &&
bucket_flushed(new_a))
bch2_discard_one_bucket_fast(c, new.k->p);

if (old_a->data_type != BCH_DATA_cached &&
new_a->data_type == BCH_DATA_cached &&
if (statechange(a->data_type == BCH_DATA_cached) &&
!bch2_bucket_is_open(c, new.k->p.inode, new.k->p.offset) &&
should_invalidate_buckets(ca, bch2_dev_usage_read(ca)))
bch2_do_invalidates(c);

if (new_a->data_type == BCH_DATA_need_gc_gens)
if (statechange(a->data_type == BCH_DATA_need_gc_gens))
bch2_do_gc_gens(c);
percpu_up_read(&c->mark_lock);
}

if ((flags & BTREE_TRIGGER_GC) &&
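
The `statechange()` and `bucket_flushed()` macros in this hunk make the follow-up work edge-triggered: it fires only when a bucket enters a state, not every time a bucket already in that state is updated. A minimal userspace sketch of that idea follows, written with plain functions instead of statement-expression macros. All `demo_` names and the simplified struct are assumptions for illustration, and the `bch2_bucket_is_open()` check from the hunk is left out.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the bucket data types. */
enum demo_data_type {
	DEMO_DATA_free,
	DEMO_DATA_need_discard,
	DEMO_DATA_cached,
	DEMO_DATA_user,
};

/* Hypothetical, reduced stand-in for struct bch_alloc_v4. */
struct demo_alloc {
	enum demo_data_type	data_type;
	unsigned long long	journal_seq;
};

/* True only on the transition into type: old was not in it, new is. */
bool demo_entered_state(const struct demo_alloc *old_a,
			const struct demo_alloc *new_a,
			enum demo_data_type type)
{
	return old_a->data_type != type && new_a->data_type == type;
}

/* Same test as bucket_flushed(): never journalled, or already on disk. */
bool demo_bucket_flushed(const struct demo_alloc *a,
			 unsigned long long flushed_seq)
{
	return !a->journal_seq || a->journal_seq <= flushed_seq;
}

/* Combined condition for kicking off a discard of this bucket. */
bool demo_should_discard(const struct demo_alloc *old_a,
			 const struct demo_alloc *new_a,
			 unsigned long long flushed_seq)
{
	return demo_entered_state(old_a, new_a, DEMO_DATA_need_discard) &&
	       demo_bucket_flushed(new_a, flushed_seq);
}
```

Under this sketch, rewriting a key that is already in `need_discard` does not queue more work, which is the property the edge test is there to provide.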
@@ -1045,14 +1052,13 @@ int bch2_check_alloc_key(struct btree_trans *trans,
if (ret)
goto err;

if (k.k->type != discard_key_type &&
(c->opts.reconstruct_alloc ||
fsck_err(c, need_discard_key_wrong,
"incorrect key in need_discard btree (got %s should be %s)\n"
" %s",
bch2_bkey_types[k.k->type],
bch2_bkey_types[discard_key_type],
(bch2_bkey_val_to_text(&buf, c, alloc_k), buf.buf)))) {
if (fsck_err_on(k.k->type != discard_key_type,
c, need_discard_key_wrong,
"incorrect key in need_discard btree (got %s should be %s)\n"
" %s",
bch2_bkey_types[k.k->type],
bch2_bkey_types[discard_key_type],
(bch2_bkey_val_to_text(&buf, c, alloc_k), buf.buf))) {
struct bkey_i *update =
bch2_trans_kmalloc(trans, sizeof(*update));

@@ -1076,15 +1082,14 @@ int bch2_check_alloc_key(struct btree_trans *trans,
if (ret)
goto err;

if (k.k->type != freespace_key_type &&
(c->opts.reconstruct_alloc ||
fsck_err(c, freespace_key_wrong,
"incorrect key in freespace btree (got %s should be %s)\n"
" %s",
bch2_bkey_types[k.k->type],
bch2_bkey_types[freespace_key_type],
(printbuf_reset(&buf),
bch2_bkey_val_to_text(&buf, c, alloc_k), buf.buf)))) {
if (fsck_err_on(k.k->type != freespace_key_type,
c, freespace_key_wrong,
"incorrect key in freespace btree (got %s should be %s)\n"
" %s",
bch2_bkey_types[k.k->type],
bch2_bkey_types[freespace_key_type],
(printbuf_reset(&buf),
bch2_bkey_val_to_text(&buf, c, alloc_k), buf.buf))) {
struct bkey_i *update =
bch2_trans_kmalloc(trans, sizeof(*update));

@@ -1108,14 +1113,13 @@ int bch2_check_alloc_key(struct btree_trans *trans,
if (ret)
goto err;

if (a->gen != alloc_gen(k, gens_offset) &&
(c->opts.reconstruct_alloc ||
fsck_err(c, bucket_gens_key_wrong,
"incorrect gen in bucket_gens btree (got %u should be %u)\n"
" %s",
alloc_gen(k, gens_offset), a->gen,
(printbuf_reset(&buf),
bch2_bkey_val_to_text(&buf, c, alloc_k), buf.buf)))) {
if (fsck_err_on(a->gen != alloc_gen(k, gens_offset),
c, bucket_gens_key_wrong,
"incorrect gen in bucket_gens btree (got %u should be %u)\n"
" %s",
alloc_gen(k, gens_offset), a->gen,
(printbuf_reset(&buf),
bch2_bkey_val_to_text(&buf, c, alloc_k), buf.buf))) {
struct bkey_i_bucket_gens *g =
bch2_trans_kmalloc(trans, sizeof(*g));

@@ -1167,14 +1171,13 @@ int bch2_check_alloc_hole_freespace(struct btree_trans *trans,

*end = bkey_min(k.k->p, *end);

if (k.k->type != KEY_TYPE_set &&
(c->opts.reconstruct_alloc ||
fsck_err(c, freespace_hole_missing,
"hole in alloc btree missing in freespace btree\n"
" device %llu buckets %llu-%llu",
freespace_iter->pos.inode,
freespace_iter->pos.offset,
end->offset))) {
if (fsck_err_on(k.k->type != KEY_TYPE_set,
c, freespace_hole_missing,
"hole in alloc btree missing in freespace btree\n"
" device %llu buckets %llu-%llu",
freespace_iter->pos.inode,
freespace_iter->pos.offset,
end->offset)) {
struct bkey_i *update =
bch2_trans_kmalloc(trans, sizeof(*update));

@@ -1604,6 +1607,36 @@ int bch2_check_alloc_to_lru_refs(struct bch_fs *c)
return ret;
}

static int discard_in_flight_add(struct bch_fs *c, struct bpos bucket)
{
int ret;

mutex_lock(&c->discard_buckets_in_flight_lock);
darray_for_each(c->discard_buckets_in_flight, i)
if (bkey_eq(*i, bucket)) {
ret = -EEXIST;
goto out;
}

ret = darray_push(&c->discard_buckets_in_flight, bucket);
out:
mutex_unlock(&c->discard_buckets_in_flight_lock);
return ret;
}

static void discard_in_flight_remove(struct bch_fs *c, struct bpos bucket)
{
mutex_lock(&c->discard_buckets_in_flight_lock);
darray_for_each(c->discard_buckets_in_flight, i)
if (bkey_eq(*i, bucket)) {
darray_remove_item(&c->discard_buckets_in_flight, i);
goto found;
}
BUG();
found:
mutex_unlock(&c->discard_buckets_in_flight_lock);
}

struct discard_buckets_state {
u64 seen;
u64 open;
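
The two helpers added in this hunk keep a small set of buckets that currently have a discard in progress, so the same bucket is never discarded twice at once. The sketch below shows the same add and remove logic in userspace C under several assumptions: the `demo_` names are hypothetical, a fixed-size array stands in for the kernel darray (so a full set is reported as -ENOMEM), and the mutex is reduced to comments where the kernel code takes and drops `discard_buckets_in_flight_lock`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_IN_FLIGHT	16

/* Hypothetical stand-in for struct bpos: device index plus bucket offset. */
struct demo_bucket {
	unsigned		dev;
	unsigned long long	offset;
};

struct demo_in_flight {
	size_t			nr;
	struct demo_bucket	d[DEMO_MAX_IN_FLIGHT];
};

static int demo_bucket_eq(struct demo_bucket a, struct demo_bucket b)
{
	return a.dev == b.dev && a.offset == b.offset;
}

/* Returns 0 if the bucket was claimed, -EEXIST if a discard is in flight. */
int demo_in_flight_add(struct demo_in_flight *s, struct demo_bucket b)
{
	/* the kernel code takes the mutex here */
	for (size_t i = 0; i < s->nr; i++)
		if (demo_bucket_eq(s->d[i], b))
			return -EEXIST;

	if (s->nr == DEMO_MAX_IN_FLIGHT)
		return -ENOMEM;

	s->d[s->nr++] = b;
	return 0;
	/* ...and drops it on every return path */
}

/* Removing a bucket that was never added is a bug, as in the source. */
void demo_in_flight_remove(struct demo_in_flight *s, struct demo_bucket b)
{
	for (size_t i = 0; i < s->nr; i++)
		if (demo_bucket_eq(s->d[i], b)) {
			/* close the gap, preserving the order of the rest */
			memmove(&s->d[i], &s->d[i + 1],
				(s->nr - i - 1) * sizeof(s->d[0]));
			s->nr--;
			return;
		}

	assert(!"bucket was not in flight");
}
```

A linear scan is a reasonable fit only while the set stays small, which is the working assumption here since entries exist just for the duration of a single discard.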
@@ -1642,6 +1675,7 @@ static int bch2_discard_one_bucket(struct btree_trans *trans,
struct bch_dev *ca;
struct bkey_i_alloc_v4 *a;
struct printbuf buf = PRINTBUF;
bool discard_locked = false;
int ret = 0;

ca = bch_dev_bkey_exists(c, pos.inode);
@@ -1709,6 +1743,11 @@ static int bch2_discard_one_bucket(struct btree_trans *trans,
goto out;
}

if (discard_in_flight_add(c, SPOS(iter.pos.inode, iter.pos.offset, true)))
goto out;

discard_locked = true;

if (!bkey_eq(*discard_pos_done, iter.pos) &&
ca->mi.discard && !c->opts.nochanges) {
/*
@@ -1740,6 +1779,8 @@ static int bch2_discard_one_bucket(struct btree_trans *trans,
count_event(c, bucket_discard);
s->discarded++;
out:
if (discard_locked)
discard_in_flight_remove(c, iter.pos);
s->seen++;
bch2_trans_iter_exit(trans, &iter);
percpu_ref_put(&ca->io_ref);
@@ -1779,6 +1820,93 @@ void bch2_do_discards(struct bch_fs *c)
bch2_write_ref_put(c, BCH_WRITE_REF_discard);
}

static int bch2_clear_bucket_needs_discard(struct btree_trans *trans, struct bpos bucket)
{
struct btree_iter iter;
bch2_trans_iter_init(trans, &iter, BTREE_ID_alloc, bucket, BTREE_ITER_INTENT);
struct bkey_s_c k = bch2_btree_iter_peek_slot(&iter);
int ret = bkey_err(k);
if (ret)
goto err;

struct bkey_i_alloc_v4 *a = bch2_alloc_to_v4_mut(trans, k);
ret = PTR_ERR_OR_ZERO(a);
if (ret)
goto err;

SET_BCH_ALLOC_V4_NEED_DISCARD(&a->v, false);
a->v.data_type = alloc_data_type(a->v, a->v.data_type);

ret = bch2_trans_update(trans, &iter, &a->k_i, 0);
err:
bch2_trans_iter_exit(trans, &iter);
return ret;
}

static void bch2_do_discards_fast_work(struct work_struct *work)
{
struct bch_fs *c = container_of(work, struct bch_fs, discard_fast_work);

while (1) {
bool got_bucket = false;
struct bpos bucket;
struct bch_dev *ca;

mutex_lock(&c->discard_buckets_in_flight_lock);
darray_for_each(c->discard_buckets_in_flight, i) {
if (i->snapshot)
continue;

ca = bch_dev_bkey_exists(c, i->inode);

if (!percpu_ref_tryget(&ca->io_ref)) {
darray_remove_item(&c->discard_buckets_in_flight, i);
continue;
}

got_bucket = true;
bucket = *i;
i->snapshot = true;
break;
}
mutex_unlock(&c->discard_buckets_in_flight_lock);

if (!got_bucket)
break;

if (ca->mi.discard && !c->opts.nochanges)
blkdev_issue_discard(ca->disk_sb.bdev,
bucket.offset * ca->mi.bucket_size,
ca->mi.bucket_size,
GFP_KERNEL);

int ret = bch2_trans_do(c, NULL, NULL,
BCH_WATERMARK_btree|
BCH_TRANS_COMMIT_no_enospc,
bch2_clear_bucket_needs_discard(trans, bucket));
bch_err_fn(c, ret);

percpu_ref_put(&ca->io_ref);
discard_in_flight_remove(c, bucket);

if (ret)
break;
}

bch2_write_ref_put(c, BCH_WRITE_REF_discard_fast);
}

static void bch2_discard_one_bucket_fast(struct bch_fs *c, struct bpos bucket)
{
struct bch_dev *ca = bch_dev_bkey_exists(c, bucket.inode);

if (!percpu_ref_is_dying(&ca->io_ref) &&
!discard_in_flight_add(c, bucket) &&
bch2_write_ref_tryget(c, BCH_WRITE_REF_discard_fast) &&
!queue_work(c->write_ref_wq, &c->discard_fast_work))
bch2_write_ref_put(c, BCH_WRITE_REF_discard_fast);
}

static int invalidate_one_bucket(struct btree_trans *trans,
struct btree_iter *lru_iter,
struct bkey_s_c lru_k,
@@ -2210,9 +2338,16 @@ void bch2_dev_allocator_add(struct bch_fs *c, struct bch_dev *ca)
set_bit(ca->dev_idx, c->rw_devs[i].d);
}

void bch2_fs_allocator_background_exit(struct bch_fs *c)
{
darray_exit(&c->discard_buckets_in_flight);
}

void bch2_fs_allocator_background_init(struct bch_fs *c)
{
spin_lock_init(&c->freelist_lock);
mutex_init(&c->discard_buckets_in_flight_lock);
INIT_WORK(&c->discard_work, bch2_do_discards_work);
INIT_WORK(&c->discard_fast_work, bch2_do_discards_fast_work);
INIT_WORK(&c->invalidate_work, bch2_do_invalidates_work);
}
1 change: 1 addition & 0 deletions fs/bcachefs/alloc_background.h
@@ -269,6 +269,7 @@ u64 bch2_min_rw_member_capacity(struct bch_fs *);
void bch2_dev_allocator_remove(struct bch_fs *, struct bch_dev *);
void bch2_dev_allocator_add(struct bch_fs *, struct bch_dev *);

void bch2_fs_allocator_background_exit(struct bch_fs *);
void bch2_fs_allocator_background_init(struct bch_fs *);

#endif /* _BCACHEFS_ALLOC_BACKGROUND_H */