Merge branches 'sched/rt' and 'sched/urgent' into sched/core
Ingo Molnar committed Feb 8, 2009
3 parents 34cb613 + ceacc2c + 483b4ee commit 140573d
Showing 1,930 changed files with 35,750 additions and 15,986 deletions.
11 changes: 4 additions & 7 deletions CREDITS
@@ -3786,14 +3786,11 @@ S: The Netherlands

N: David Woodhouse
E: dwmw2@infradead.org
D: ARCnet stuff, Applicom board driver, SO_BINDTODEVICE,
D: some Alpha platform porting from 2.0, Memory Technology Devices,
D: Acquire watchdog timer, PC speaker driver maintenance,
D: JFFS2 file system, Memory Technology Device subsystem,
D: various other stuff that annoyed me by not working.
S: c/o Red Hat Engineering
S: Rustat House
S: 60 Clifton Road
S: Cambridge. CB1 7EG
S: c/o Intel Corporation
S: Pipers Way
S: Swindon. SN3 1RJ
S: England

N: Chris Wright
4 changes: 3 additions & 1 deletion Documentation/Changes
@@ -33,10 +33,12 @@ o Gnu make 3.79.1 # make --version
o binutils 2.12 # ld -v
o util-linux 2.10o # fdformat --version
o module-init-tools 0.9.10 # depmod -V
o e2fsprogs 1.29 # tune2fs
o e2fsprogs 1.41.4 # e2fsck -V
o jfsutils 1.1.3 # fsck.jfs -V
o reiserfsprogs 3.6.3 # reiserfsck -V 2>&1|grep reiserfsprogs
o xfsprogs 2.6.0 # xfs_db -V
o squashfs-tools 4.0 # mksquashfs -version
o btrfs-progs 0.18 # btrfsck
o pcmciautils 004 # pccardctl -V
o quota-tools 3.09 # quota -V
o PPP 2.4.0 # pppd --version
18 changes: 13 additions & 5 deletions Documentation/CodingStyle
@@ -483,17 +483,25 @@ values. To do the latter, you can stick the following in your .emacs file:
(* (max steps 1)
c-basic-offset)))

(add-hook 'c-mode-common-hook
(lambda ()
;; Add kernel style
(c-add-style
"linux-tabs-only"
'("linux" (c-offsets-alist
(arglist-cont-nonempty
c-lineup-gcc-asm-reg
c-lineup-arglist-tabs-only))))))

(add-hook 'c-mode-hook
(lambda ()
(let ((filename (buffer-file-name)))
;; Enable kernel mode for the appropriate files
(when (and filename
(string-match "~/src/linux-trees" filename))
(string-match (expand-file-name "~/src/linux-trees")
filename))
(setq indent-tabs-mode t)
(c-set-style "linux")
(c-set-offset 'arglist-cont-nonempty
'(c-lineup-gcc-asm-reg
c-lineup-arglist-tabs-only))))))
(c-set-style "linux-tabs-only")))))

This will make emacs go better with the kernel coding style for C
files below ~/src/linux-trees.
11 changes: 5 additions & 6 deletions Documentation/DMA-API.txt
@@ -5,7 +5,7 @@

This document describes the DMA API. For a more gentle introduction
phrased in terms of the pci_ equivalents (and actual examples) see
DMA-mapping.txt
Documentation/PCI/PCI-DMA-mapping.txt.

This API is split into two pieces. Part I describes the API and the
corresponding pci_ API. Part II describes the extensions to the API
@@ -170,16 +170,15 @@ Returns: 0 if successful and a negative error if not.
u64
dma_get_required_mask(struct device *dev)

After setting the mask with dma_set_mask(), this API returns the
actual mask (within that already set) that the platform actually
requires to operate efficiently. Usually this means the returned mask
This API returns the mask that the platform requires to
operate efficiently. Usually this means the returned mask
is the minimum required to cover all of memory. Examining the
required mask gives drivers with variable descriptor sizes the
opportunity to use smaller descriptors as necessary.

Requesting the required mask does not alter the current mask. If you
wish to take advantage of it, you should issue another dma_set_mask()
call to lower the mask again.
wish to take advantage of it, you should issue a dma_set_mask()
call to set the mask to the value returned.
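
The descriptor-size decision described above can be sketched in plain C. This is a user-space model only: `SKETCH_DMA_BIT_MASK`, `choose_desc_format()` and `choose_dma_mask()` are hypothetical names, and clamping the result to the device's own mask is an assumption added for safety rather than something the text requires. In a driver, the `required_mask` argument would come from `dma_get_required_mask(dev)` and the chosen mask would be passed to `dma_set_mask()`.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n), restated so this builds in user space. */
#define SKETCH_DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical descriptor formats of a device with variable descriptor sizes. */
enum desc_format { DESC_32BIT, DESC_64BIT };

/* Use the small descriptors whenever the platform needs no more than 32 address bits. */
static enum desc_format choose_desc_format(uint64_t required_mask)
{
	return required_mask <= SKETCH_DMA_BIT_MASK(32) ? DESC_32BIT : DESC_64BIT;
}

/*
 * Pick the mask to hand to dma_set_mask(): the required mask, but
 * never wider than what the device itself can address (assumption).
 */
static uint64_t choose_dma_mask(uint64_t device_mask, uint64_t required_mask)
{
	assert(device_mask != 0);
	return required_mask < device_mask ? required_mask : device_mask;
}
```

A device that supports 64-bit addressing on a platform whose memory fits below 4 GB would end up with `DESC_32BIT` and a 32-bit mask under this model.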


Part Id - Streaming DMA mappings
88 changes: 88 additions & 0 deletions Documentation/DocBook/uio-howto.tmpl
@@ -41,6 +41,12 @@ GPL version 2.
</abstract>

<revhistory>
<revision>
<revnumber>0.7</revnumber>
<date>2008-12-23</date>
<authorinitials>hjk</authorinitials>
<revremark>Added generic platform drivers and offset attribute.</revremark>
</revision>
<revision>
<revnumber>0.6</revnumber>
<date>2008-12-05</date>
@@ -312,6 +318,16 @@ interested in translating it, please email me
pointed to by addr.
</para>
</listitem>
<listitem>
<para>
<filename>offset</filename>: The offset, in bytes, that has to be
added to the pointer returned by <function>mmap()</function> to get
to the actual device memory. This is important if the device's memory
is not page aligned. Remember that pointers returned by
<function>mmap()</function> are always page aligned, so it is good
style to always add this offset.
</para>
</listitem>
</itemizedlist>
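
A minimal user-space sketch of how the offset attribute might be applied, assuming the attribute text is a number such as "0x10" followed by a newline (the exact format is an assumption here). `uio_parse_offset()` and `uio_device_base()` are hypothetical helper names; the text would be read from the map's `offset` file in sysfs and `mapped` would be the pointer returned by `mmap()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the text read from a map's "offset" attribute into a byte
 * count. Returns 0 on success, -1 on malformed input.
 */
static int uio_parse_offset(const char *text, size_t *offset)
{
	char *end;
	unsigned long long value;

	if (text == NULL || *text == '\0')
		return -1;
	value = strtoull(text, &end, 0);	/* base 0 accepts "0x..." and decimal */
	if (end == text)
		return -1;
	while (*end == '\n' || *end == ' ')
		end++;
	if (*end != '\0')
		return -1;
	*offset = (size_t)value;
	return 0;
}

/* Add the offset to the page-aligned pointer returned by mmap(). */
static volatile uint8_t *uio_device_base(void *mapped, size_t offset)
{
	assert(mapped != NULL);
	return (volatile uint8_t *)mapped + offset;
}
```

Always going through `uio_device_base()` costs nothing when the offset is zero and avoids a silent misread when the device memory is not page aligned.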

<para>
@@ -594,6 +610,78 @@ framework to set up sysfs files for this region. Simply leave it alone.
</para>
</sect1>

<sect1 id="using_uio_pdrv">
<title>Using uio_pdrv for platform devices</title>
<para>
In many cases, UIO drivers for platform devices can be handled in a
generic way. In the same place where you define your
<varname>struct platform_device</varname>, you simply also implement
your interrupt handler and fill your
<varname>struct uio_info</varname>. A pointer to this
<varname>struct uio_info</varname> is then used as
<varname>platform_data</varname> for your platform device.
</para>
<para>
You also need to set up an array of <varname>struct resource</varname>
containing addresses and sizes of your memory mappings. This
information is passed to the driver using the
<varname>.resource</varname> and <varname>.num_resources</varname>
elements of <varname>struct platform_device</varname>.
</para>
<para>
You now have to set the <varname>.name</varname> element of
<varname>struct platform_device</varname> to
<varname>"uio_pdrv"</varname> to use the generic UIO platform device
driver. This driver will fill the <varname>mem[]</varname> array
according to the resources given, and register the device.
</para>
<para>
The advantage of this approach is that you only have to edit a file
you need to edit anyway. You do not have to create an extra driver.
</para>
</sect1>

<sect1 id="using_uio_pdrv_genirq">
<title>Using uio_pdrv_genirq for platform devices</title>
<para>
Especially in embedded devices, you frequently find chips where the
irq pin is tied to its own dedicated interrupt line. In such cases,
where you can be really sure the interrupt is not shared, we can take
the concept of <varname>uio_pdrv</varname> one step further and use a
generic interrupt handler. That's what
<varname>uio_pdrv_genirq</varname> does.
</para>
<para>
The setup for this driver is the same as described above for
<varname>uio_pdrv</varname>, except that you do not implement an
interrupt handler. The <varname>.handler</varname> element of
<varname>struct uio_info</varname> must remain
<varname>NULL</varname>. The <varname>.irq_flags</varname> element
must not contain <varname>IRQF_SHARED</varname>.
</para>
<para>
You will set the <varname>.name</varname> element of
<varname>struct platform_device</varname> to
<varname>"uio_pdrv_genirq"</varname> to use this driver.
</para>
<para>
The generic interrupt handler of <varname>uio_pdrv_genirq</varname>
will simply disable the interrupt line using
<function>disable_irq_nosync()</function>. After doing its work,
userspace can reenable the interrupt by writing 0x00000001 to the UIO
device file. The driver already implements an
<function>irq_control()</function> to make this possible; you must not
implement your own.
</para>
<para>
Using <varname>uio_pdrv_genirq</varname> not only saves a few lines of
interrupt handler code. You also do not need to know anything about
the chip's internal registers to create the kernel part of the driver.
All you need to know is the irq number of the pin the chip is
connected to.
</para>
</sect1>
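
The user-space half of the uio_pdrv_genirq handshake could look roughly like the sketch below. `uio_irq_enable()` and `uio_irq_wait()` are hypothetical helper names, and `fd` is assumed to be an open descriptor for the UIO device file. The blocking `read()` of a 32-bit event count is standard UIO behaviour described elsewhere in the howto, not in this section.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Re-enable the interrupt line by writing the 32-bit value 0x00000001
 * to the UIO device file. Returns 0 on success, -1 on failure.
 */
static int uio_irq_enable(int fd)
{
	int32_t enable = 1;
	ssize_t n;

	assert(fd >= 0);
	n = write(fd, &enable, sizeof(enable));
	return n == (ssize_t)sizeof(enable) ? 0 : -1;
}

/*
 * Block until the next interrupt arrives. On success the 32-bit
 * value read from the device file is stored in *count.
 */
static int uio_irq_wait(int fd, int32_t *count)
{
	ssize_t n;

	assert(fd >= 0 && count != NULL);
	n = read(fd, count, sizeof(*count));
	return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A typical loop would call `uio_irq_wait()`, service the device through its mapped registers, and then call `uio_irq_enable()`, because the generic handler leaves the line disabled after each interrupt.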

</chapter>

<chapter id="userspace_driver" xreflabel="Writing a driver in user space">
4 changes: 2 additions & 2 deletions Documentation/IO-mapping.txt
@@ -1,6 +1,6 @@
[ NOTE: The virt_to_bus() and bus_to_virt() functions have been
superseded by the functionality provided by the PCI DMA
interface (see Documentation/DMA-mapping.txt). They continue
superseded by the functionality provided by the PCI DMA interface
(see Documentation/PCI/PCI-DMA-mapping.txt). They continue
to be documented below for historical purposes, but new code
must not use them. --davidm 00/12/12 ]

4 changes: 4 additions & 0 deletions Documentation/accounting/getdelays.c
@@ -392,6 +392,10 @@ int main(int argc, char *argv[])
goto err;
}
}
if (!maskset && !tid && !containerset) {
usage();
goto err;
}

do {
int i;
11 changes: 6 additions & 5 deletions Documentation/block/biodoc.txt
@@ -186,8 +186,9 @@ a virtual address mapping (unlike the earlier scheme of virtual address
do not have a corresponding kernel virtual address space mapping) and
low-memory pages.

Note: Please refer to DMA-mapping.txt for a discussion on PCI high mem DMA
aspects and mapping of scatter gather lists, and support for 64 bit PCI.
Note: Please refer to Documentation/PCI/PCI-DMA-mapping.txt for a discussion
on PCI high mem DMA aspects and mapping of scatter gather lists, and support
for 64 bit PCI.

Special handling is required only for cases where i/o needs to happen on
pages at physical memory addresses beyond what the device can support. In these
@@ -953,14 +954,14 @@ elevator_allow_merge_fn called whenever the block layer determines
results in some sort of conflict internally,
this hook allows it to do that.

elevator_dispatch_fn fills the dispatch queue with ready requests.
elevator_dispatch_fn* fills the dispatch queue with ready requests.
I/O schedulers are free to postpone requests by
not filling the dispatch queue unless @force
is non-zero. Once dispatched, I/O schedulers
are not allowed to manipulate the requests -
they belong to generic dispatch queue.

elevator_add_req_fn called to add a new request into the scheduler
elevator_add_req_fn* called to add a new request into the scheduler

elevator_queue_empty_fn returns true if the merge queue is empty.
Drivers shouldn't use this, but rather check
@@ -990,7 +991,7 @@ elevator_activate_req_fn Called when device driver first sees a request.
elevator_deactivate_req_fn Called when device driver decides to delay
a request by requeueing it.

elevator_init_fn
elevator_init_fn*
elevator_exit_fn Allocate and free any elevator specific storage
for a queue.

63 changes: 63 additions & 0 deletions Documentation/block/queue-sysfs.txt
@@ -0,0 +1,63 @@
Queue sysfs files
=================

This text file will detail the queue files that are located in the sysfs tree
for each block device. Note that stacked devices typically do not export
any settings, since their queue merely functions as a remapping target.
These files are the ones found in the /sys/block/xxx/queue/ directory.

Files denoted with a RO postfix are readonly and the RW postfix means
read-write.

hw_sector_size (RO)
-------------------
This is the hardware sector size of the device, in bytes.

max_hw_sectors_kb (RO)
----------------------
This is the maximum number of kilobytes supported in a single data transfer.

max_sectors_kb (RW)
-------------------
This is the maximum number of kilobytes that the block layer will allow
for a filesystem request. Must be smaller than or equal to the maximum
size allowed by the hardware.

nomerges (RW)
-------------
This enables the user to disable the lookup logic involved with IO merging
requests in the block layer. Merging may still occur through a direct
1-hit cache, since that comes for (almost) free. The IO scheduler will not
waste cycles doing tree/hash lookups for merges if nomerges is 1. Defaults
to 0, enabling all merges.

nr_requests (RW)
----------------
This controls how many requests may be allocated in the block layer for
read or write requests. Note that the total allocated number may be twice
this amount, since it applies only to reads or writes (not the accumulated
sum).

read_ahead_kb (RW)
------------------
Maximum number of kilobytes to read-ahead for filesystems on this block
device.

rq_affinity (RW)
----------------
If this option is enabled, the block layer will migrate request completions
to the CPU that originally submitted the request. For some workloads
this provides a significant reduction in CPU cycles due to caching effects.

scheduler (RW)
--------------
When read, this file will display the current and available IO schedulers
for this block device. The currently active IO scheduler will be enclosed
in [] brackets. Writing an IO scheduler name to this file will switch
control of this block device to that new IO scheduler. Note that writing
an IO scheduler name to this file will attempt to load that IO scheduler
module, if it isn't already present in the system.



Jens Axboe <jens.axboe@oracle.com>, February 2009
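
As an illustration of the scheduler file format described above, the following sketch extracts the name enclosed in [] from a line read from that file. `active_scheduler()` is a hypothetical helper, and the sample scheduler names used with it are only examples of what a system might list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the scheduler name enclosed in [] from a line such as
 * "noop deadline [cfq]" into out. Returns 0 on success, -1 if no
 * bracketed name is found or out is too small to hold it.
 */
static int active_scheduler(const char *line, char *out, size_t outlen)
{
	const char *start, *end;
	size_t len;

	assert(line != NULL && out != NULL);
	start = strchr(line, '[');
	if (start == NULL)
		return -1;
	end = strchr(start + 1, ']');
	if (end == NULL)
		return -1;
	len = (size_t)(end - (start + 1));
	if (len == 0 || len >= outlen)
		return -1;
	memcpy(out, start + 1, len);
	out[len] = '\0';
	return 0;
}
```

Switching schedulers needs no parsing at all: writing one of the listed names to the same file is enough.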
5 changes: 3 additions & 2 deletions Documentation/cgroups/cgroups.txt
@@ -1,7 +1,8 @@
CGROUPS
-------

Written by Paul Menage <menage@google.com> based on Documentation/cpusets.txt
Written by Paul Menage <menage@google.com> based on
Documentation/cgroups/cpusets.txt

Original copyright statements from cpusets.txt:
Portions Copyright (C) 2004 BULL SA.
@@ -68,7 +69,7 @@ On their own, the only use for cgroups is for simple job
tracking. The intention is that other subsystems hook into the generic
cgroup support to provide new attributes for cgroups, such as
accounting/limiting the resources which processes in a cgroup can
access. For example, cpusets (see Documentation/cpusets.txt) allows
access. For example, cpusets (see Documentation/cgroups/cpusets.txt) allows
you to associate a set of CPUs and a set of memory nodes with the
tasks in each cgroup.

File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,12 +1,12 @@
Memory Resource Controller(Memcg) Implementation Memo.
Last Updated: 2008/12/15
Base Kernel Version: based on 2.6.28-rc8-mm.
Last Updated: 2009/1/19
Base Kernel Version: based on 2.6.29-rc2.

Because VM is getting complex (one of reasons is memcg...), memcg's behavior
is complex. This is a document for memcg's internal behavior.
Please note that implementation details can be changed.

(*) Topics on API should be in Documentation/controllers/memory.txt)
(*) Topics on API should be in Documentation/cgroups/memory.txt)

0. How to record usage ?
2 objects are used.
@@ -340,3 +340,23 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
# mount -t cgroup none /cgroup -o cpuset,memory,cpu,devices

and do task move, mkdir, rmdir etc...under this.

9.7 swapoff.
Besides management of swap being one of the complicated parts of memcg,
the call path of swap-in at swapoff is not the same as the usual swap-in path.
It is worth testing explicitly.

For example, a test like the following is good.
(Shell-A)
# mount -t cgroup none /cgroup -o memory
# mkdir /cgroup/test
# echo 40M > /cgroup/test/memory.limit_in_bytes
# echo 0 > /cgroup/test/tasks
Run malloc(100M) program under this. You'll see 60M of swaps.
(Shell-B)
# move all tasks in /cgroup/test to /cgroup
# /sbin/swapoff -a
# rmdir /cgroup/test
# kill malloc task.

Of course, tmpfs v.s. swapoff test should be tested, too.
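
The "malloc(100M) program" in the test above is not given; one possible sketch of its core is shown here. `alloc_and_touch()` is a hypothetical helper. It writes one byte per page-size step so that the allocation is actually backed by memory and, under the 40M limit, pushed out to swap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of memory and write one byte at every page-size
 * step so the kernel has to back the region with real pages. Returns
 * the buffer (or NULL on allocation failure) and stores the number of
 * writes performed in *touched when touched is not NULL.
 */
static char *alloc_and_touch(size_t bytes, size_t *touched)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t step = page > 0 ? (size_t)page : 4096;
	size_t i, n = 0;
	char *buf;

	assert(bytes > 0);
	buf = malloc(bytes);
	if (buf == NULL)
		return NULL;
	for (i = 0; i < bytes; i += step) {
		buf[i] = 1;
		n++;
	}
	if (touched != NULL)
		*touched = n;
	return buf;
}
```

A test program would call `alloc_and_touch(100UL << 20, NULL)` from `main()` and then block, for example in `pause()`, so that the task is still alive when Shell-B moves it and runs swapoff.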
File renamed without changes.
File renamed without changes.