Merge branch 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6
* 'for_linus' of git://git.infradead.org/~dedekind/ubifs-2.6:
  UBIFS: include to compilation
  UBIFS: add new flash file system
  UBIFS: add brief documentation
  MAINTAINERS: add UBIFS section
  do_mounts: allow UBI root device name
  VFS: export sync_sb_inodes
  VFS: move inode_lock into sync_sb_inodes
torvalds committed Jul 16, 2008
2 parents 42fdd14 + 0d7eff8 commit 9c1be0c
Showing 41 changed files with 33,055 additions and 11 deletions.
164 changes: 164 additions & 0 deletions Documentation/filesystems/ubifs.txt
@@ -0,0 +1,164 @@
Introduction
=============

UBIFS stands for UBI File System. UBI stands for "Unsorted Block
Images". UBIFS is a flash file system, which means it is designed to
work with flash devices. It is important to understand that UBIFS is
completely different from any traditional file-system in Linux, like
Ext2, XFS, JFS, etc. UBIFS represents a separate class of file-systems
which work with MTD devices, not block devices. The other Linux
file-system of this class is JFFS2.

To make this clearer, here is a small comparison of MTD devices and
block devices.

1 MTD devices represent flash devices and they consist of eraseblocks of
  rather large size, typically about 128KiB. Block devices consist of
  small blocks, typically 512 bytes.
2 MTD devices support 3 main operations - read from some offset within an
  eraseblock, write to some offset within an eraseblock, and erase a whole
  eraseblock. Block devices support 2 main operations - read a whole
  block and write a whole block.
3 The whole eraseblock has to be erased before it becomes possible to
  re-write its contents. Blocks may be just re-written.
4 Eraseblocks become worn out after some number of erase cycles -
  typically 100K-1G for SLC NAND and NOR flashes, and 1K-10K for MLC
  NAND flashes. Blocks do not have the wear-out property.
5 Eraseblocks may become bad (only on NAND flashes) and software should
  deal with this. Blocks on hard drives typically do not become bad,
  because hardware has mechanisms to substitute bad blocks, at least in
  modern LBA disks.
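
The erase-before-rewrite rule from items 2-4 can be illustrated with a toy
model. This is only a sketch and not real MTD code: the class name, the
16-byte default size and the rule "a write may only touch bytes that still
read back as 0xFF" are simplifying assumptions made for illustration.

```python
class EraseBlock:
    """Toy model of one flash eraseblock (illustrative, not real MTD code)."""

    ERASED = 0xFF  # assumption: erased flash reads back as all ones

    def __init__(self, size=16):
        self.data = bytearray([self.ERASED] * size)
        self.erase_count = 0  # grows with every erase, modelling wear

    def read(self, offset, length):
        # Reading from any offset within the eraseblock is always allowed.
        return bytes(self.data[offset:offset + length])

    def write(self, offset, payload):
        # Writing is allowed at any offset, but only into erased bytes.
        end = offset + len(payload)
        if end > len(self.data):
            raise ValueError("write past end of eraseblock")
        if any(b != self.ERASED for b in self.data[offset:end]):
            raise OSError("region not erased; erase the whole eraseblock first")
        self.data[offset:end] = payload

    def erase(self):
        # Erasing always affects the whole eraseblock and wears it out.
        self.data = bytearray([self.ERASED] * len(self.data))
        self.erase_count += 1
```

A second write to the same offset fails until erase() has been called, and
every erase() increments the wear counter, which is the property that
wear-leveling has to manage.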

It should be quite obvious why UBIFS is very different from traditional
file-systems.

UBIFS works on top of UBI. UBI is a separate software layer which may be
found in drivers/mtd/ubi. UBI is basically a volume management and
wear-leveling layer. It provides so-called UBI volumes, which are a
higher-level abstraction than MTD devices. The programming model of UBI
devices is very similar to that of MTD devices - they still consist of
large eraseblocks and they have read/write/erase operations, but UBI
devices are devoid of limitations like wear and bad blocks (items 4 and 5
in the above list).
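
One way to picture how a layer can hide wear and bad blocks is a map from
logical to physical eraseblocks. The sketch below is a toy under stated
assumptions, not the real UBI algorithm: the class ToyUbi, its method names
and the "pick the least-worn free block" policy are all hypothetical.

```python
class ToyUbi:
    """Toy logical-to-physical eraseblock map (not the real UBI algorithm)."""

    def __init__(self, erase_counts, bad=()):
        self.erase_counts = list(erase_counts)  # one counter per physical block
        self.bad = set(bad)                     # physical blocks marked bad
        self.mapping = {}                       # logical block -> physical block

    def _free_blocks(self):
        used = set(self.mapping.values())
        return [p for p in range(len(self.erase_counts))
                if p not in self.bad and p not in used]

    def map_leb(self, leb):
        """Map a logical eraseblock to the least-worn free physical one."""
        if leb in self.mapping:
            return self.mapping[leb]
        free = self._free_blocks()
        if not free:
            raise RuntimeError("no free physical eraseblocks")
        peb = min(free, key=lambda p: self.erase_counts[p])
        self.mapping[leb] = peb
        return peb

    def erase_leb(self, leb):
        """Unmap a logical block and erase the physical block behind it."""
        peb = self.mapping.pop(leb)
        self.erase_counts[peb] += 1
```

The user of such a map only ever names logical eraseblocks, so bad physical
blocks are never handed out and erases are spread over the less-worn ones.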

In a sense, UBIFS is the next generation of the JFFS2 file-system, but it
is very different from and incompatible with JFFS2. The following are the
main differences.

* JFFS2 works on top of MTD devices, UBIFS depends on UBI and works on
  top of UBI volumes.
* JFFS2 does not have an on-media index and has to build it while
  mounting, which requires a full media scan. UBIFS maintains the FS
  indexing information on the flash media and does not require a full
  media scan, so it mounts many times faster than JFFS2.
* JFFS2 is a write-through file-system, while UBIFS supports write-back,
  which makes UBIFS much faster on writes.

Similarly to JFFS2, UBIFS supports on-the-fly compression which makes
it possible to fit quite a lot of data onto the flash.

Similarly to JFFS2, UBIFS is tolerant of unclean reboots and power-cuts.
It does not need tools like fsck.ext2. UBIFS automatically replays its
journal and recovers from crashes, ensuring that the on-flash data
structures are consistent.

UBIFS scales logarithmically (most of the data structures it uses are
trees), so the mount time and memory consumption do not depend linearly
on the flash size, as they do in the case of JFFS2. This is because UBIFS
maintains the FS index on the flash media. However, UBIFS depends on
UBI, which scales linearly. So overall the UBI/UBIFS stack scales
linearly. Nevertheless, UBI/UBIFS scales considerably better than JFFS2.

The authors of UBIFS believe that it is possible to develop UBI2 which
would scale logarithmically as well. UBI2 would support the same API as UBI,
but it would be binary incompatible with UBI. So UBIFS would not need to be
changed to use UBI2.


Mount options
=============

(*) == default.

norm_unmount (*)  commit on unmount; the journal is committed
                  when the file-system is unmounted so that the
                  next mount does not have to replay the journal
                  and it becomes very fast;
fast_unmount      do not commit on unmount; this option makes
                  unmount faster, but the next mount slower
                  because of the need to replay the journal.


Quick usage instructions
========================

The UBI volume to mount is specified using "ubiX_Y" or "ubiX:NAME" syntax,
where "X" is UBI device number, "Y" is UBI volume number, and "NAME" is
UBI volume name.

Mount volume 0 on UBI device 0 to /mnt/ubifs:
$ mount -t ubifs ubi0_0 /mnt/ubifs

Mount "rootfs" volume of UBI device 0 to /mnt/ubifs ("rootfs" is volume
name):
$ mount -t ubifs ubi0:rootfs /mnt/ubifs

The following is an example of the kernel boot arguments to attach mtd0
to UBI and mount volume "rootfs":
ubi.mtd=0 root=ubi0:rootfs rootfstype=ubifs
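
The two volume naming forms described above can be told apart mechanically.
The following is a minimal sketch, assuming only the "ubiX_Y" and
"ubiX:NAME" forms from this section; parse_ubi_volume is a hypothetical
helper written for illustration and is not how the kernel parses the name.

```python
import re

# Hypothetical patterns for the two forms documented above.
_BY_NUMBER = re.compile(r"^ubi(\d+)_(\d+)$")   # ubiX_Y
_BY_NAME = re.compile(r"^ubi(\d+):(.+)$")      # ubiX:NAME


def parse_ubi_volume(spec):
    """Return (device_number, volume); volume is an int or a name string."""
    m = _BY_NUMBER.match(spec)
    if m:
        return int(m.group(1)), int(m.group(2))
    m = _BY_NAME.match(spec)
    if m:
        return int(m.group(1)), m.group(2)
    raise ValueError("not a UBI volume specification: %r" % spec)
```

With this helper, "ubi0_0" yields device 0 and volume number 0, while
"ubi0:rootfs" yields device 0 and the volume name "rootfs".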


Module Parameters for Debugging
===============================

When UBIFS has been compiled with debugging enabled, there are 3 module
parameters that are available to control aspects of testing and debugging.
The parameters are unsigned integers where each bit controls an option.
The parameters are:

debug_msgs  Selects which debug messages to display, as follows:

            Message Type                        Flag value

            General messages                    1
            Journal messages                    2
            Mount messages                      4
            Commit messages                     8
            LEB search messages                 16
            Budgeting messages                  32
            Garbage collection messages         64
            Tree Node Cache (TNC) messages      128
            LEB properties (lprops) messages    256
            Input/output messages               512
            Log messages                        1024
            Scan messages                       2048
            Recovery messages                   4096

debug_chks  Selects extra checks that UBIFS can do while running:

            Check                               Flag value

            General checks                      1
            Check Tree Node Cache (TNC)         2
            Check indexing tree size            4
            Check orphan area                   8
            Check old indexing tree             16
            Check LEB properties (lprops)       32
            Check leaf nodes and inodes         64

debug_tsts  Selects a mode of testing, as follows:

            Test mode                           Flag value

            Force in-the-gaps method            2
            Failure mode for recovery testing   4

For example, set debug_msgs to 5 to display General messages and Mount
messages.
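
Because each parameter is a bit mask, a value is the bitwise OR of the flag
values listed above. The sketch below shows that arithmetic for debug_msgs;
the flag values come from the table, while the short key names and the two
helper functions are hypothetical labels chosen for this example.

```python
# Flag values from the debug_msgs table; the key names are made up here.
DEBUG_MSGS = {
    "general": 1, "journal": 2, "mount": 4, "commit": 8, "leb_search": 16,
    "budgeting": 32, "gc": 64, "tnc": 128, "lprops": 256, "io": 512,
    "log": 1024, "scan": 2048, "recovery": 4096,
}


def debug_mask(*names, table=DEBUG_MSGS):
    """OR together the flag values for the given message types."""
    mask = 0
    for name in names:
        mask |= table[name]
    return mask


def decode_mask(mask, table=DEBUG_MSGS):
    """List the message types whose bits are set in mask."""
    return [name for name, bit in table.items() if mask & bit]
```

Here debug_mask("general", "mount") evaluates to 5, matching the example
above, and decode_mask(5) gives back the same two message types.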


References
==========

UBIFS documentation and FAQ/HOWTO at the MTD web site:
http://www.linux-mtd.infradead.org/doc/ubifs.html
http://www.linux-mtd.infradead.org/faq/ubifs.html
10 changes: 10 additions & 0 deletions MAINTAINERS
@@ -2336,6 +2336,16 @@ L: linux-mtd@lists.infradead.org
W: http://www.linux-mtd.infradead.org/doc/jffs2.html
S: Maintained

UBI FILE SYSTEM (UBIFS)
P: Artem Bityutskiy
M: dedekind@infradead.org
P: Adrian Hunter
M: ext-adrian.hunter@nokia.com
L: linux-mtd@lists.infradead.org
T: git git://git.infradead.org/~dedekind/ubifs-2.6.git
W: http://www.linux-mtd.infradead.org/doc/ubifs.html
S: Maintained

JFS FILESYSTEM
P: Dave Kleikamp
M: shaggy@austin.ibm.com
3 changes: 3 additions & 0 deletions fs/Kconfig
@@ -1375,6 +1375,9 @@ config JFFS2_CMODE_FAVOURLZO

endchoice

# UBIFS File system configuration
source "fs/ubifs/Kconfig"

config CRAMFS
tristate "Compressed ROM file system support (cramfs)"
depends on BLOCK
1 change: 1 addition & 0 deletions fs/Makefile
@@ -101,6 +101,7 @@ obj-$(CONFIG_NTFS_FS) += ntfs/
obj-$(CONFIG_UFS_FS) += ufs/
obj-$(CONFIG_EFS_FS) += efs/
obj-$(CONFIG_JFFS2_FS) += jffs2/
obj-$(CONFIG_UBIFS_FS) += ubifs/
obj-$(CONFIG_AFFS_FS) += affs/
obj-$(CONFIG_ROMFS_FS) += romfs/
obj-$(CONFIG_QNX4FS_FS) += qnx4/
22 changes: 12 additions & 10 deletions fs/fs-writeback.c
@@ -424,8 +424,6 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
  * WB_SYNC_HOLD is a hack for sys_sync(): reattach the inode to sb->s_dirty so
  * that it can be located for waiting on in __writeback_single_inode().
  *
- * Called under inode_lock.
- *
  * If `bdi' is non-zero then we're being asked to writeback a specific queue.
  * This function assumes that the blockdev superblock's inodes are backed by
  * a variety of queues, so all inodes are searched. For other superblocks,
@@ -441,11 +439,12 @@ __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
  * on the writer throttling path, and we get decent balancing between many
  * throttled threads: we don't want them all piling up on inode_sync_wait.
  */
-static void
-sync_sb_inodes(struct super_block *sb, struct writeback_control *wbc)
+void generic_sync_sb_inodes(struct super_block *sb,
+				struct writeback_control *wbc)
 {
 	const unsigned long start = jiffies; /* livelock avoidance */

+	spin_lock(&inode_lock);
 	if (!wbc->for_kupdate || list_empty(&sb->s_io))
 		queue_io(sb, wbc->older_than_this);

@@ -524,8 +523,16 @@ sync_sb_inodes(struct super_block *sb, struct writeback_control *wbc)
 		if (!list_empty(&sb->s_more_io))
 			wbc->more_io = 1;
 	}
+	spin_unlock(&inode_lock);
 	return; /* Leave any unwritten inodes on s_io */
 }
+EXPORT_SYMBOL_GPL(generic_sync_sb_inodes);
+
+static void sync_sb_inodes(struct super_block *sb,
+				struct writeback_control *wbc)
+{
+	generic_sync_sb_inodes(sb, wbc);
+}

 /*
  * Start writeback of dirty pagecache data against all unlocked inodes.
@@ -565,11 +572,8 @@ writeback_inodes(struct writeback_control *wbc)
 		 * be unmounted by the time it is released.
 		 */
 		if (down_read_trylock(&sb->s_umount)) {
-			if (sb->s_root) {
-				spin_lock(&inode_lock);
+			if (sb->s_root)
 				sync_sb_inodes(sb, wbc);
-				spin_unlock(&inode_lock);
-			}
 			up_read(&sb->s_umount);
 		}
 		spin_lock(&sb_lock);
@@ -607,9 +611,7 @@ void sync_inodes_sb(struct super_block *sb, int wait)
 			(inodes_stat.nr_inodes - inodes_stat.nr_unused) +
 			nr_dirty + nr_unstable;
 	wbc.nr_to_write += wbc.nr_to_write / 2; /* Bit more for luck */
-	spin_lock(&inode_lock);
 	sync_sb_inodes(sb, &wbc);
-	spin_unlock(&inode_lock);
 }

 /*
72 changes: 72 additions & 0 deletions fs/ubifs/Kconfig
@@ -0,0 +1,72 @@
config UBIFS_FS
	tristate "UBIFS file system support"
	select CRC16
	select CRC32
	select CRYPTO if UBIFS_FS_ADVANCED_COMPR
	select CRYPTO if UBIFS_FS_LZO
	select CRYPTO if UBIFS_FS_ZLIB
	select CRYPTO_LZO if UBIFS_FS_LZO
	select CRYPTO_DEFLATE if UBIFS_FS_ZLIB
	depends on MTD_UBI
	help
	  UBIFS is a file system for flash devices which works on top of UBI.

config UBIFS_FS_XATTR
	bool "Extended attributes support"
	depends on UBIFS_FS
	help
	  This option enables support of extended attributes.

config UBIFS_FS_ADVANCED_COMPR
	bool "Advanced compression options"
	depends on UBIFS_FS
	help
	  This option allows you to explicitly choose which compressors, if
	  any, are enabled in UBIFS. Removing compressors means inability to
	  read existing file systems.

	  If unsure, say 'N'.

config UBIFS_FS_LZO
	bool "LZO compression support" if UBIFS_FS_ADVANCED_COMPR
	depends on UBIFS_FS
	default y
	help
	  The LZO compressor is generally faster than zlib but compresses
	  worse. Say 'Y' if unsure.

config UBIFS_FS_ZLIB
	bool "ZLIB compression support" if UBIFS_FS_ADVANCED_COMPR
	depends on UBIFS_FS
	default y
	help
	  Zlib compresses better than LZO but it is slower. Say 'Y' if unsure.

# Debugging-related stuff
config UBIFS_FS_DEBUG
	bool "Enable debugging"
	depends on UBIFS_FS
	select DEBUG_FS
	select KALLSYMS_ALL
	help
	  This option enables UBIFS debugging.

config UBIFS_FS_DEBUG_MSG_LVL
	int "Default message level (0 = no extra messages, 3 = lots)"
	depends on UBIFS_FS_DEBUG
	default "0"
	help
	  This controls the amount of debugging messages produced by UBIFS.
	  If reporting bugs, please try to have available a full dump of the
	  messages at level 1 while the misbehaviour was occurring. Level 2
	  may become necessary if level 1 messages were not enough to find the
	  bug. Generally Level 3 should be avoided.

config UBIFS_FS_DEBUG_CHKS
	bool "Enable extra checks"
	depends on UBIFS_FS_DEBUG
	help
	  If extra checks are enabled, UBIFS will check the consistency of its
	  internal data structures during operation. However, UBIFS performance
	  is dramatically slower when this option is selected, especially if
	  the file system is large.
9 changes: 9 additions & 0 deletions fs/ubifs/Makefile
@@ -0,0 +1,9 @@
obj-$(CONFIG_UBIFS_FS) += ubifs.o

ubifs-y += shrinker.o journal.o file.o dir.o super.o sb.o io.o
ubifs-y += tnc.o master.o scan.o replay.o log.o commit.o gc.o orphan.o
ubifs-y += budget.o find.o tnc_commit.o compress.o lpt.o lprops.o
ubifs-y += recovery.o ioctl.o lpt_commit.o tnc_misc.o

ubifs-$(CONFIG_UBIFS_FS_DEBUG) += debug.o
ubifs-$(CONFIG_UBIFS_FS_XATTR) += xattr.o