Merge tag 'f2fs-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this series, we've implemented transparent compression
  experimentally. It supports LZO and LZ4, but will add more later as we
  investigate in the field more.

  At this point, the feature doesn't expose compressed space to user
  directly in order to guarantee potential data updates later to the
  space. Instead, the main goal is to reduce data writes to flash disk
  as much as possible, resulting in extending disk life time as well as
  relaxing IO congestion.

  Alternatively, we're also considering to add ioctl() to reclaim
  compressed space and show it to user after putting the immutable bit.

  Enhancements:
   - add compression support
   - avoid unnecessary locks in quota ops
   - harden power-cut scenario for zoned block devices
   - use private bio_set to avoid IO congestion
   - replace GC mutex with rwsem to serialize callers

  Bug fixes:
   - fix dentry consistency and memory corruption in rename()'s error case
   - fix wrong swap extent reports
   - fix casefolding bugs
   - change lock coverage to avoid deadlock
   - avoid GFP_KERNEL under f2fs_lock_op

  And, we've cleaned up sysfs entries to prepare no debugfs"

* tag 'f2fs-for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (31 commits)
  f2fs: fix race conditions in ->d_compare() and ->d_hash()
  f2fs: fix dcache lookup of !casefolded directories
  f2fs: Add f2fs stats to sysfs
  f2fs: delete duplicate information on sysfs nodes
  f2fs: change to use rwsem for gc_mutex
  f2fs: update f2fs document regarding to fsync_mode
  f2fs: add a way to turn off ipu bio cache
  f2fs: code cleanup for f2fs_statfs_project()
  f2fs: fix miscounted block limit in f2fs_statfs_project()
  f2fs: show the CP_PAUSE reason in checkpoint traces
  f2fs: fix deadlock allocating bio_post_read_ctx from mempool
  f2fs: remove unneeded check for error allocating bio_post_read_ctx
  f2fs: convert inline_dir early before starting rename
  f2fs: fix memleak of kobject
  f2fs: fix to add swap extent correctly
  f2fs: run fsck when getting bad inode during GC
  f2fs: support data compression
  f2fs: free sysfs kobject
  f2fs: declare nested quota_sem and remove unnecessary sems
  f2fs: don't put new_page twice in f2fs_rename
  ...
torvalds committed Jan 30, 2020
2 parents 0196be1 + 80f2388 commit 6e135ba
Showing 22 changed files with 3,463 additions and 648 deletions.
280 changes: 167 additions & 113 deletions Documentation/ABI/testing/sysfs-fs-f2fs

Large diffs are not rendered by default.

216 changes: 52 additions & 164 deletions Documentation/filesystems/f2fs.txt
@@ -235,6 +235,17 @@ checkpoint=%s[:%u[%]] Set to "disable" to turn off checkpointing. Set to "en
hide up to all remaining free space. The actual space that
would be unusable can be viewed at /sys/fs/f2fs/<disk>/unusable
This space is reclaimed once checkpoint=enable.
compress_algorithm=%s Control compress algorithm, currently f2fs supports "lzo"
and "lz4" algorithm.
compress_log_size=%u Support configuring compress cluster size, the size will
be 4KB * (1 << %u), 16KB is minimum size, also it's
default size.
compress_extension=%s Support adding specified extension, so that f2fs can enable
compression on those corresponding files, e.g. if all files
with '.ext' has high compression rate, we can set the '.ext'
on compression extension list and enable compression on
these file by default rather than to enable it via ioctl.
For other files, we can still enable compression via ioctl.
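
The compress_log_size arithmetic above can be sketched in a few lines of C. This is only an illustration of the documented formula (4KB * (1 << %u), with 16KB as the minimum and default); the macro and function names are hypothetical and are not taken from the f2fs sources, and no upper bound is modelled because the option text does not state one.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names only; none of these come from the f2fs sources. */
#define SKETCH_PAGE_SIZE	4096u	/* the 4KB unit named in the option text */
#define SKETCH_MIN_LOG_SIZE	2u	/* 4KB * (1 << 2) = 16KB, the documented minimum */

/*
 * Cluster size in bytes for a given compress_log_size value.
 * Returns 0 when the value is below the documented 16KB minimum.
 */
static uint64_t cluster_size_bytes(unsigned int log_size)
{
	assert(log_size < 32);	/* keep the shift well defined in this sketch */
	if (log_size < SKETCH_MIN_LOG_SIZE)
		return 0;
	return (uint64_t)SKETCH_PAGE_SIZE * (1u << log_size);
}

/* Number of 4KB logical pages that make up one cluster. */
static unsigned int cluster_pages(unsigned int log_size)
{
	return (unsigned int)(cluster_size_bytes(log_size) / SKETCH_PAGE_SIZE);
}
```

With the default value of 2, the sketch yields a 16KB cluster of four 4KB pages; a value of 3 yields 32KB.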

================================================================================
DEBUGFS ENTRIES
@@ -259,170 +270,6 @@ The files in each per-device directory are shown in table below.

Files in /sys/fs/f2fs/<devname>
(see also Documentation/ABI/testing/sysfs-fs-f2fs)
..............................................................................
File Content

gc_urgent_sleep_time This parameter controls sleep time for gc_urgent.
500 ms is set by default. See above gc_urgent.

gc_min_sleep_time This tuning parameter controls the minimum sleep
time for the garbage collection thread. Time is
in milliseconds.

gc_max_sleep_time This tuning parameter controls the maximum sleep
time for the garbage collection thread. Time is
in milliseconds.

gc_no_gc_sleep_time This tuning parameter controls the default sleep
time for the garbage collection thread. Time is
in milliseconds.

gc_idle This parameter controls the selection of victim
policy for garbage collection. Setting gc_idle = 0
(default) will disable this option. Setting
gc_idle = 1 will select the Cost Benefit approach
& setting gc_idle = 2 will select the greedy approach.

gc_urgent This parameter controls triggering background GCs
urgently or not. Setting gc_urgent = 0 [default]
makes back to default behavior, while if it is set
to 1, background thread starts to do GC by given
gc_urgent_sleep_time interval.

reclaim_segments This parameter controls the number of prefree
segments to be reclaimed. If the number of prefree
segments is larger than the number of segments
in the proportion to the percentage over total
volume size, f2fs tries to conduct checkpoint to
reclaim the prefree segments to free segments.
By default, 5% over total # of segments.

main_blkaddr This value gives the first block address of
MAIN area in the partition.

max_small_discards This parameter controls the number of discard
commands that consist small blocks less than 2MB.
The candidates to be discarded are cached until
checkpoint is triggered, and issued during the
checkpoint. By default, it is disabled with 0.

discard_granularity This parameter controls the granularity of discard
command size. It will issue discard commands iif
the size is larger than given granularity. Its
unit size is 4KB, and 4 (=16KB) is set by default.
The maximum value is 128 (=512KB).

reserved_blocks This parameter indicates the number of blocks that
f2fs reserves internally for root.

batched_trim_sections This parameter controls the number of sections
to be trimmed out in batch mode when FITRIM
conducts. 32 sections is set by default.

ipu_policy This parameter controls the policy of in-place
updates in f2fs. There are five policies:
0x01: F2FS_IPU_FORCE, 0x02: F2FS_IPU_SSR,
0x04: F2FS_IPU_UTIL, 0x08: F2FS_IPU_SSR_UTIL,
0x10: F2FS_IPU_FSYNC.

min_ipu_util This parameter controls the threshold to trigger
in-place-updates. The number indicates percentage
of the filesystem utilization, and used by
F2FS_IPU_UTIL and F2FS_IPU_SSR_UTIL policies.

min_fsync_blocks This parameter controls the threshold to trigger
in-place-updates when F2FS_IPU_FSYNC mode is set.
The number indicates the number of dirty pages
when fsync needs to flush on its call path. If
the number is less than this value, it triggers
in-place-updates.

min_seq_blocks This parameter controls the threshold to serialize
write IOs issued by multiple threads in parallel.

min_hot_blocks This parameter controls the threshold to allocate
a hot data log for pending data blocks to write.

min_ssr_sections This parameter adds the threshold when deciding
SSR block allocation. If this is large, SSR mode
will be enabled early.

ram_thresh This parameter controls the memory footprint used
by free nids and cached nat entries. By default,
1 is set, which indicates 10 MB / 1 GB RAM.

ra_nid_pages When building free nids, F2FS reads NAT blocks
ahead for speed up. Default is 0.

dirty_nats_ratio Given dirty ratio of cached nat entries, F2FS
determines flushing them in background.

max_victim_search This parameter controls the number of trials to
find a victim segment when conducting SSR and
cleaning operations. The default value is 4096
which covers 8GB block address range.

migration_granularity For large-sized sections, F2FS can stop GC given
this granularity instead of reclaiming entire
section.

dir_level This parameter controls the directory level to
support large directory. If a directory has a
number of files, it can reduce the file lookup
latency by increasing this dir_level value.
Otherwise, it needs to decrease this value to
reduce the space overhead. The default value is 0.

cp_interval F2FS tries to do checkpoint periodically, 60 secs
by default.

idle_interval F2FS detects system is idle, if there's no F2FS
operations during given interval, 5 secs by
default.

discard_idle_interval F2FS detects the discard thread is idle, given
time interval. Default is 5 secs.

gc_idle_interval F2FS detects the GC thread is idle, given time
interval. Default is 5 secs.

umount_discard_timeout When unmounting the disk, F2FS waits for finishing
queued discard commands which can take huge time.
This gives time out for it, 5 secs by default.

iostat_enable This controls to enable/disable iostat in F2FS.

readdir_ra This enables/disabled readahead of inode blocks
in readdir, and default is enabled.

gc_pin_file_thresh This indicates how many GC can be failed for the
pinned file. If it exceeds this, F2FS doesn't
guarantee its pinning state. 2048 trials is set
by default.

extension_list This enables to change extension_list for hot/cold
files in runtime.

inject_rate This controls injection rate of arbitrary faults.

inject_type This controls injection type of arbitrary faults.

dirty_segments This shows # of dirty segments.

lifetime_write_kbytes This shows # of data written to the disk.

features This shows current features enabled on F2FS.

current_reserved_blocks This shows # of blocks currently reserved.

unusable If checkpoint=disable, this shows the number of
blocks that are unusable.
If checkpoint=enable it shows the number of blocks
that would be unusable if checkpoint=disable were
to be set.

encoding This shows the encoding used for casefolding.
If casefolding is not enabled, returns (none)

================================================================================
USAGE
@@ -840,3 +687,44 @@ zero or random data, which is useful to the below scenario where:
4. address = fibmap(fd, offset)
5. open(blkdev)
6. write(blkdev, address)

Compression implementation
--------------------------

- New term named cluster is defined as basic unit of compression, file can
be divided into multiple clusters logically. One cluster includes 4 << n
(n >= 0) logical pages, compression size is also cluster size, each of
cluster can be compressed or not.

- In cluster metadata layout, one special block address is used to indicate
cluster is compressed one or normal one, for compressed cluster, following
metadata maps cluster to [1, 4 << n - 1] physical blocks, in where f2fs
stores data including compress header and compressed data.

- In order to eliminate write amplification during overwrite, F2FS only
support compression on write-once file, data can be compressed only when
all logical blocks in file are valid and cluster compress ratio is lower
than specified threshold.

- To enable compression on regular inode, there are three ways:
* chattr +c file
* chattr +c dir; touch dir/file
* mount w/ -o compress_extension=ext; touch file.ext
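
For readers who want to set the flag from a program rather than through chattr, the following is a minimal sketch. It assumes that `chattr +c` maps to the generic FS_IOC_GETFLAGS / FS_IOC_SETFLAGS ioctls with FS_COMPR_FL from <linux/fs.h>, and that the file lives on an f2fs mount with compression support enabled; the helper name set_compress_flag is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Set the "compressed" inode flag on path, as `chattr +c path` would.
 * Returns 0 on success, -1 on failure with errno describing the error.
 * Hypothetical helper for illustration; not part of the patch.
 */
static int set_compress_flag(const char *path)
{
	int flags = 0;
	int saved_errno;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* Read the current inode flags, add FS_COMPR_FL, write them back. */
	if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0)
		goto fail;
	flags |= FS_COMPR_FL;
	if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0)
		goto fail;

	close(fd);
	return 0;

fail:
	saved_errno = errno;	/* close() may overwrite errno */
	close(fd);
	errno = saved_errno;
	return -1;
}
```

On a filesystem that does not support the flag, the second ioctl is expected to fail and the helper returns -1 without changing the file.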

Compress metadata layout:
[Dnode Structure]
+-----------------------------------------------+
| cluster 1 | cluster 2 | ......... | cluster N |
+-----------------------------------------------+
. . . .
. . . .
. Compressed Cluster . . Normal Cluster .
+----------+---------+---------+---------+ +---------+---------+---------+---------+
|compr flag| block 1 | block 2 | block 3 | | block 1 | block 2 | block 3 | block 4 |
+----------+---------+---------+---------+ +---------+---------+---------+---------+
. .
. .
. .
+-------------+-------------+----------+----------------------------+
| data length | data chksum | reserved | compressed data |
+-------------+-------------+----------+----------------------------+
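
The header diagram and the [1, 4 << n - 1] block range described above can be illustrated with a small C sketch. The field widths are assumptions: the diagram names the fields (data length, data chksum, reserved, compressed data) but not their sizes, so the 32-bit fields and the four reserved words below are placeholders, as are all identifiers.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLOCK_SIZE	4096u	/* 4KB block, as used elsewhere in the document */
#define SKETCH_RESERVED_WORDS	4	/* assumed; the diagram gives no width */

/* Assumed layout of the data stored in a compressed cluster's blocks. */
struct sketch_compress_header {
	uint32_t data_len;	/* length of the compressed payload in bytes */
	uint32_t data_chksum;	/* checksum field from the diagram */
	uint32_t reserved[SKETCH_RESERVED_WORDS];
	uint8_t  data[];	/* compressed payload follows the header */
};

/* Physical blocks needed to hold the header plus compressed_len bytes. */
static unsigned int blocks_needed(uint32_t compressed_len)
{
	uint64_t total = (uint64_t)offsetof(struct sketch_compress_header, data)
			 + compressed_len;

	return (unsigned int)((total + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE);
}

/*
 * A cluster of cluster_pages logical pages is worth storing compressed
 * only if it occupies between 1 and cluster_pages - 1 physical blocks.
 */
static int fits_compressed(uint32_t compressed_len, unsigned int cluster_pages)
{
	unsigned int n = blocks_needed(compressed_len);

	return n >= 1 && n <= cluster_pages - 1;
}
```

Under these assumptions the header takes 24 bytes, so a four-page (16KB) cluster is kept compressed only when the payload is at most 3 * 4096 - 24 bytes.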
27 changes: 26 additions & 1 deletion fs/f2fs/Kconfig
@@ -22,7 +22,7 @@ config F2FS_FS

config F2FS_STAT_FS
bool "F2FS Status Information"
-depends on F2FS_FS && DEBUG_FS
+depends on F2FS_FS
default y
help
/sys/kernel/debug/f2fs/ contains information about all the partitions
@@ -93,3 +93,28 @@ config F2FS_FAULT_INJECTION
Test F2FS to inject faults such as ENOMEM, ENOSPC, and so on.

If unsure, say N.

config F2FS_FS_COMPRESSION
bool "F2FS compression feature"
depends on F2FS_FS
help
Enable filesystem-level compression on f2fs regular files,
multiple back-end compression algorithms are supported.

config F2FS_FS_LZO
bool "LZO compression support"
depends on F2FS_FS_COMPRESSION
select LZO_COMPRESS
select LZO_DECOMPRESS
default y
help
Support LZO compress algorithm, if unsure, say Y.

config F2FS_FS_LZ4
bool "LZ4 compression support"
depends on F2FS_FS_COMPRESSION
select LZ4_COMPRESS
select LZ4_DECOMPRESS
default y
help
Support LZ4 compress algorithm, if unsure, say Y.
1 change: 1 addition & 0 deletions fs/f2fs/Makefile
@@ -9,3 +9,4 @@ f2fs-$(CONFIG_F2FS_FS_XATTR) += xattr.o
f2fs-$(CONFIG_F2FS_FS_POSIX_ACL) += acl.o
f2fs-$(CONFIG_F2FS_IO_TRACE) += trace.o
f2fs-$(CONFIG_FS_VERITY) += verity.o
f2fs-$(CONFIG_F2FS_FS_COMPRESSION) += compress.o
6 changes: 3 additions & 3 deletions fs/f2fs/checkpoint.c
@@ -1509,10 +1509,10 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
f2fs_wait_on_all_pages_writeback(sbi);

/*
-* invalidate intermediate page cache borrowed from meta inode
-* which are used for migration of encrypted inode's blocks.
+* invalidate intermediate page cache borrowed from meta inode which are
+* used for migration of encrypted or verity inode's blocks.
*/
-if (f2fs_sb_has_encrypt(sbi))
+if (f2fs_sb_has_encrypt(sbi) || f2fs_sb_has_verity(sbi))
invalidate_mapping_pages(META_MAPPING(sbi),
MAIN_BLKADDR(sbi), MAX_BLKADDR(sbi) - 1);
