Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6

* 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (34 commits)
  HWPOISON: Remove stray phrase in a comment
  HWPOISON: Try to allocate migration page on the same node
  HWPOISON: Don't do early filtering if filter is disabled
  HWPOISON: Add a madvise() injector for soft page offlining
  HWPOISON: Add soft page offline support
  HWPOISON: Undefine short-hand macros after use to avoid namespace conflict
  HWPOISON: Use new shake_page in memory_failure
  HWPOISON: Use correct name for MADV_HWPOISON in documentation
  HWPOISON: mention HWPoison in Kconfig entry
  HWPOISON: Use get_user_page_fast in hwpoison madvise
  HWPOISON: add an interface to switch off/on all the page filters
  HWPOISON: add memory cgroup filter
  memcg: add accessor to mem_cgroup.css
  memcg: rename and export try_get_mem_cgroup_from_page()
  HWPOISON: add page flags filter
  mm: export stable page flags
  HWPOISON: limit hwpoison injector to known page types
  HWPOISON: add fs/device filters
  HWPOISON: return 0 to indicate success reliably
  HWPOISON: make semantics of IGNORED/DELAYED clear
  ...
torvalds committed Dec 16, 2009
2 parents 61cf693 + f2c03de commit d4220f9
Showing 19 changed files with 922 additions and 126 deletions.
44 changes: 44 additions & 0 deletions Documentation/ABI/testing/sysfs-memory-page-offline
@@ -0,0 +1,44 @@
What: /sys/devices/system/memory/soft_offline_page
Date: Sep 2009
KernelVersion: 2.6.33
Contact: andi@firstfloor.org
Description:
Soft-offline the memory page containing the physical address
written into this file. Input is a hex number specifying the
physical address of the page. The kernel will then attempt
to soft-offline it, by moving the contents elsewhere or
dropping it if possible. The kernel will then be placed
on the bad page list and never be reused.

The offlining is done in kernel specific granuality.
Normally it's the base page size of the kernel, but
this might change.

The page must be still accessible, not poisoned. The
kernel will never kill anything for this, but rather
fail the offline. Return value is the size of the
number, or a error when the offlining failed. Reading
the file is not allowed.

What: /sys/devices/system/memory/hard_offline_page
Date: Sep 2009
KernelVersion: 2.6.33
Contact: andi@firstfloor.org
Description:
Hard-offline the memory page containing the physical
address written into this file. Input is a hex number
specifying the physical address of the page. The
kernel will then attempt to hard-offline the page, by
trying to drop the page or killing any owner or
triggering IO errors if needed. Note this may kill
any processes owning the page. The kernel will avoid
to access this page assuming it's poisoned by the
hardware.

The offlining is done in kernel specific granuality.
Normally it's the base page size of the kernel, but
this might change.

Return value is the size of the number, or a error when
the offlining failed.
Reading the file is not allowed.
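
As a rough user-space illustration of the ABI text above, the sketch below writes a physical address as a hex number into one of the offline files. The helper name `request_page_offline` is hypothetical and not part of this commit; the only assumptions are that the caller is root and that the path is one of the two files documented above.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the commit): write a physical address as a
 * hex number into soft_offline_page or hard_offline_page.
 * Returns 0 on success or a negative errno value on failure.
 */
int request_page_offline(const char *path, unsigned long long phys_addr)
{
	FILE *f = fopen(path, "w");
	int err = 0;

	if (!f)
		return -errno;
	/* The ABI asks for a hex number; the "0x" prefix makes the base explicit. */
	if (fprintf(f, "0x%llx\n", phys_addr) < 0)
		err = -EIO;
	/* The buffered write is flushed here, so a kernel-side error shows up now. */
	if (fclose(f) != 0 && !err)
		err = -errno;
	return err;
}
```

A call such as `request_page_offline("/sys/devices/system/memory/soft_offline_page", addr)` would then report a refused offline as a negative errno value rather than killing anything, which matches the soft-offline description.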
52 changes: 49 additions & 3 deletions Documentation/vm/hwpoison.txt
@@ -92,16 +92,62 @@ PR_MCE_KILL_GET

Testing:

-madvise(MADV_POISON, ....)
+madvise(MADV_HWPOISON, ....)
(as root)
Poison a page in the process for testing


hwpoison-inject module through debugfs
-/sys/debug/hwpoison/corrupt-pfn

-Inject hwpoison fault at PFN echoed into this file
+/sys/debug/hwpoison/

corrupt-pfn

Inject hwpoison fault at PFN echoed into this file. This does
some early filtering to avoid corrupted unintended pages in test suites.

unpoison-pfn

Software-unpoison page at PFN echoed into this file. This
way a page can be reused again.
This only works for Linux injected failures, not for real
memory failures.

Note these injection interfaces are not stable and might change between
kernel versions

corrupt-filter-dev-major
corrupt-filter-dev-minor

Only handle memory failures to pages associated with the file system defined
by block device major/minor. -1U is the wildcard value.
This should be only used for testing with artificial injection.

corrupt-filter-memcg

Limit injection to pages owned by memgroup. Specified by inode number
of the memcg.

Example:
mkdir /cgroup/hwpoison

usemem -m 100 -s 1000 &
echo `jobs -p` > /cgroup/hwpoison/tasks

memcg_ino=$(ls -id /cgroup/hwpoison | cut -f1 -d' ')
echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg

page-types -p `pidof init` --hwpoison # shall do nothing
page-types -p `pidof usemem` --hwpoison # poison its pages

corrupt-filter-flags-mask
corrupt-filter-flags-value

When specified, only poison pages if ((page_flags & mask) == value).
This allows stress testing of many kinds of pages. The page_flags
are the same as in /proc/kpageflags. The flag bits are defined in
include/linux/kernel-page-flags.h and documented in
Documentation/vm/pagemap.txt
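
The mask/value rule for corrupt-filter-flags-mask and corrupt-filter-flags-value can be sketched in user space as follows. The bit numbers are copied from include/linux/kernel-page-flags.h as added by this commit; the helper names `kpf_bit` and `flags_filter_match` are hypothetical and only illustrate the documented comparison, not the kernel's actual filter code.

```c
#include <stdint.h>

/* Bit numbers as listed in include/linux/kernel-page-flags.h in this commit. */
#define KPF_UPTODATE	3
#define KPF_DIRTY	4
#define KPF_LRU		5
#define KPF_ANON	12

/* Hypothetical helper: turn a KPF_* bit number into a 64-bit mask. */
static inline uint64_t kpf_bit(int nr)
{
	return (uint64_t)1 << nr;
}

/* Nonzero when the page passes the documented test (page_flags & mask) == value. */
static inline int flags_filter_match(uint64_t page_flags, uint64_t mask,
				     uint64_t value)
{
	return (page_flags & mask) == value;
}
```

For example, a mask of LRU|ANON|DIRTY with a value of LRU|ANON would select anonymous LRU pages that are not dirty: bits outside the mask (such as UPTODATE) are ignored, while a set DIRTY bit makes the comparison fail.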

Architecture specific MCE injector

15 changes: 13 additions & 2 deletions Documentation/vm/page-types.c
@@ -1,11 +1,22 @@
/*
* page-types: Tool for querying page flags
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; version 2.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should find a copy of v2 of the GNU General Public License somewhere on
* your Linux system; if not, write to the Free Software Foundation, Inc., 59
* Temple Place, Suite 330, Boston, MA 02111-1307 USA.
*
* Copyright (C) 2009 Intel corporation
*
* Authors: Wu Fengguang <fengguang.wu@intel.com>
*
* Released under the General Public License (GPL).
*/

#define _LARGEFILE64_SOURCE
9 changes: 9 additions & 0 deletions MAINTAINERS
@@ -2377,6 +2377,15 @@ W: http://www.kernel.org/pub/linux/kernel/people/fseidel/hdaps/
S: Maintained
F: drivers/hwmon/hdaps.c

HWPOISON MEMORY FAILURE HANDLING
M: Andi Kleen <andi@firstfloor.org>
L: linux-mm@kvack.org
L: linux-kernel@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6.git hwpoison
S: Maintained
F: mm/memory-failure.c
F: mm/hwpoison-inject.c

HYPERVISOR VIRTUAL CONSOLE DRIVER
L: linuxppc-dev@ozlabs.org
S: Odd Fixes
61 changes: 61 additions & 0 deletions drivers/base/memory.c
@@ -341,6 +341,64 @@ static inline int memory_probe_init(void)
}
#endif

#ifdef CONFIG_MEMORY_FAILURE
/*
* Support for offlining pages of memory
*/

/* Soft offline a page */
static ssize_t
store_soft_offline_page(struct class *class, const char *buf, size_t count)
{
int ret;
u64 pfn;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
if (strict_strtoull(buf, 0, &pfn) < 0)
return -EINVAL;
pfn >>= PAGE_SHIFT;
if (!pfn_valid(pfn))
return -ENXIO;
ret = soft_offline_page(pfn_to_page(pfn), 0);
return ret == 0 ? count : ret;
}

/* Forcibly offline a page, including killing processes. */
static ssize_t
store_hard_offline_page(struct class *class, const char *buf, size_t count)
{
int ret;
u64 pfn;
if (!capable(CAP_SYS_ADMIN))
return -EPERM;
if (strict_strtoull(buf, 0, &pfn) < 0)
return -EINVAL;
pfn >>= PAGE_SHIFT;
ret = __memory_failure(pfn, 0, 0);
return ret ? ret : count;
}

static CLASS_ATTR(soft_offline_page, 0644, NULL, store_soft_offline_page);
static CLASS_ATTR(hard_offline_page, 0644, NULL, store_hard_offline_page);

static __init int memory_fail_init(void)
{
int err;

err = sysfs_create_file(&memory_sysdev_class.kset.kobj,
&class_attr_soft_offline_page.attr);
if (!err)
err = sysfs_create_file(&memory_sysdev_class.kset.kobj,
&class_attr_hard_offline_page.attr);
return err;
}
#else
static inline int memory_fail_init(void)
{
return 0;
}
#endif

/*
* Note that phys_device is optional. It is here to allow for
* differentiation between which *physical* devices each
@@ -471,6 +529,9 @@ int __init memory_dev_init(void)
}

err = memory_probe_init();
if (!ret)
ret = err;
err = memory_fail_init();
if (!ret)
ret = err;
err = block_size_init();
45 changes: 3 additions & 42 deletions fs/proc/page.c
@@ -8,6 +8,7 @@
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/hugetlb.h>
#include <linux/kernel-page-flags.h>
#include <asm/uaccess.h>
#include "internal.h"

@@ -71,52 +72,12 @@ static const struct file_operations proc_kpagecount_operations = {
* physical page flags.
*/

/* These macros are used to decouple internal flags from exported ones */

#define KPF_LOCKED 0
#define KPF_ERROR 1
#define KPF_REFERENCED 2
#define KPF_UPTODATE 3
#define KPF_DIRTY 4
#define KPF_LRU 5
#define KPF_ACTIVE 6
#define KPF_SLAB 7
#define KPF_WRITEBACK 8
#define KPF_RECLAIM 9
#define KPF_BUDDY 10

/* 11-20: new additions in 2.6.31 */
#define KPF_MMAP 11
#define KPF_ANON 12
#define KPF_SWAPCACHE 13
#define KPF_SWAPBACKED 14
#define KPF_COMPOUND_HEAD 15
#define KPF_COMPOUND_TAIL 16
#define KPF_HUGE 17
#define KPF_UNEVICTABLE 18
#define KPF_HWPOISON 19
#define KPF_NOPAGE 20

#define KPF_KSM 21

/* kernel hacking assistances
* WARNING: subject to change, never rely on them!
*/
#define KPF_RESERVED 32
#define KPF_MLOCKED 33
#define KPF_MAPPEDTODISK 34
#define KPF_PRIVATE 35
#define KPF_PRIVATE_2 36
#define KPF_OWNER_PRIVATE 37
#define KPF_ARCH 38
#define KPF_UNCACHED 39

static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
{
return ((kflags >> kbit) & 1) << ubit;
}

-static u64 get_uflags(struct page *page)
+u64 stable_page_flags(struct page *page)
{
u64 k;
u64 u;
@@ -219,7 +180,7 @@ static ssize_t kpageflags_read(struct file *file, char __user *buf,
else
ppage = NULL;

-if (put_user(get_uflags(ppage), out)) {
+if (put_user(stable_page_flags(ppage), out)) {
ret = -EFAULT;
break;
}
1 change: 1 addition & 0 deletions include/asm-generic/mman-common.h
@@ -40,6 +40,7 @@
#define MADV_DONTFORK 10 /* don't inherit across fork */
#define MADV_DOFORK 11 /* do inherit across fork */
#define MADV_HWPOISON 100 /* poison a page for testing */
#define MADV_SOFT_OFFLINE 101 /* soft offline page for testing */

#define MADV_MERGEABLE 12 /* KSM may merge identical pages */
#define MADV_UNMERGEABLE 13 /* KSM may not merge identical pages */
46 changes: 46 additions & 0 deletions include/linux/kernel-page-flags.h
@@ -0,0 +1,46 @@
#ifndef LINUX_KERNEL_PAGE_FLAGS_H
#define LINUX_KERNEL_PAGE_FLAGS_H

/*
* Stable page flag bits exported to user space
*/

#define KPF_LOCKED 0
#define KPF_ERROR 1
#define KPF_REFERENCED 2
#define KPF_UPTODATE 3
#define KPF_DIRTY 4
#define KPF_LRU 5
#define KPF_ACTIVE 6
#define KPF_SLAB 7
#define KPF_WRITEBACK 8
#define KPF_RECLAIM 9
#define KPF_BUDDY 10

/* 11-20: new additions in 2.6.31 */
#define KPF_MMAP 11
#define KPF_ANON 12
#define KPF_SWAPCACHE 13
#define KPF_SWAPBACKED 14
#define KPF_COMPOUND_HEAD 15
#define KPF_COMPOUND_TAIL 16
#define KPF_HUGE 17
#define KPF_UNEVICTABLE 18
#define KPF_HWPOISON 19
#define KPF_NOPAGE 20

#define KPF_KSM 21

/* kernel hacking assistances
* WARNING: subject to change, never rely on them!
*/
#define KPF_RESERVED 32
#define KPF_MLOCKED 33
#define KPF_MAPPEDTODISK 34
#define KPF_PRIVATE 35
#define KPF_PRIVATE_2 36
#define KPF_OWNER_PRIVATE 37
#define KPF_ARCH 38
#define KPF_UNCACHED 39

#endif /* LINUX_KERNEL_PAGE_FLAGS_H */
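
To show how these exported bit numbers are typically consumed, here is a hedged user-space sketch that reads the 64-bit flags word for one PFN from a kpageflags-style file (one 64-bit word per page frame, indexed by PFN) and tests KPF_HWPOISON. The helper names `read_page_flags` and `page_is_hwpoisoned` are hypothetical; the path is passed in by the caller, normally /proc/kpageflags read as root.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define KPF_HWPOISON 19	/* from include/linux/kernel-page-flags.h */

/*
 * Hypothetical helper: fetch the flags word for one PFN.
 * Returns 0 on success or a negative errno value; a short read is -EIO.
 */
int read_page_flags(const char *path, uint64_t pfn, uint64_t *flags)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;
	int err = 0;

	if (fd < 0)
		return -errno;
	/* Each page frame owns one 64-bit entry, so the offset is pfn * 8. */
	n = pread(fd, flags, sizeof(*flags), (off_t)(pfn * sizeof(*flags)));
	if (n < 0)
		err = -errno;
	else if (n != (ssize_t)sizeof(*flags))
		err = -EIO;
	close(fd);
	return err;
}

/* Returns 1 when the exported flags word has the HWPOISON bit set, else 0. */
int page_is_hwpoisoned(uint64_t flags)
{
	return (int)((flags >> KPF_HWPOISON) & 1);
}
```

Taking the path as a parameter keeps the sketch testable against an ordinary file with the same layout.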
13 changes: 13 additions & 0 deletions include/linux/memcontrol.h
@@ -73,6 +73,7 @@ extern unsigned long mem_cgroup_isolate_pages(unsigned long nr_to_scan,
extern void mem_cgroup_out_of_memory(struct mem_cgroup *mem, gfp_t gfp_mask);
int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *mem);

extern struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page);
extern struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);

static inline
@@ -85,6 +86,8 @@ int mm_match_cgroup(const struct mm_struct *mm, const struct mem_cgroup *cgroup)
return cgroup == mem;
}

extern struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem);

extern int
mem_cgroup_prepare_migration(struct page *page, struct mem_cgroup **ptr);
extern void mem_cgroup_end_migration(struct mem_cgroup *mem,
@@ -202,6 +205,11 @@ mem_cgroup_move_lists(struct page *page, enum lru_list from, enum lru_list to)
{
}

static inline struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
{
return NULL;
}

static inline int mm_match_cgroup(struct mm_struct *mm, struct mem_cgroup *mem)
{
return 1;
@@ -213,6 +221,11 @@ static inline int task_in_mem_cgroup(struct task_struct *task,
return 1;
}

static inline struct cgroup_subsys_state *mem_cgroup_css(struct mem_cgroup *mem)
{
return NULL;
}

static inline int
mem_cgroup_prepare_migration(struct page *page, struct mem_cgroup **ptr)
{