Commit 4b06042

Mike Travis authored and torvalds committed
bitmap, irq: add smp_affinity_list interface to /proc/irq
Manually adjusting the smp_affinity for IRQ's becomes unwieldy when the
cpu count is large.

Setting smp affinity to cpus 256 to 263 would be:

    echo 000000ff,00000000,00000000,00000000,00000000,00000000,00000000,00000000 > smp_affinity

instead of:

    echo 256-263 > smp_affinity_list

Think about what it looks like for cpus around say, 4088 to 4095.

We already have many alternate "list" interfaces:

/sys/devices/system/cpu/cpuX/indexY/shared_cpu_list
/sys/devices/system/cpu/cpuX/topology/thread_siblings_list
/sys/devices/system/cpu/cpuX/topology/core_siblings_list
/sys/devices/system/node/nodeX/cpulist
/sys/devices/pci***/***/local_cpulist

Add a companion interface, smp_affinity_list to use cpu lists instead of
cpu maps. This conforms to other companion interfaces where both a map
and a list interface exists.

This required adding a bitmap_parselist_user() function in a manner
similar to the bitmap_parse_user() function.

[akpm@linux-foundation.org: make __bitmap_parselist() static]
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 parent e50c1f6 commit 4b06042

6 files changed: +188 -23 lines changed

Documentation/IRQ-affinity.txt

Lines changed: 13 additions & 4 deletions
@@ -4,10 +4,11 @@ ChangeLog:
 
 SMP IRQ affinity
 
-/proc/irq/IRQ#/smp_affinity specifies which target CPUs are permitted
-for a given IRQ source. It's a bitmask of allowed CPUs. It's not allowed
-to turn off all CPUs, and if an IRQ controller does not support IRQ
-affinity then the value will not change from the default 0xffffffff.
+/proc/irq/IRQ#/smp_affinity and /proc/irq/IRQ#/smp_affinity_list specify
+which target CPUs are permitted for a given IRQ source. It's a bitmask
+(smp_affinity) or cpu list (smp_affinity_list) of allowed CPUs. It's not
+allowed to turn off all CPUs, and if an IRQ controller does not support
+IRQ affinity then the value will not change from the default of all cpus.
 
 /proc/irq/default_smp_affinity specifies default affinity mask that applies
 to all non-active IRQs. Once IRQ is allocated/activated its affinity bitmask
@@ -54,3 +55,11 @@ round-trip min/avg/max = 0.1/0.5/585.4 ms
 This time around IRQ44 was delivered only to the last four processors.
 i.e counters for the CPU0-3 did not change.
 
+Here is an example of limiting that same irq (44) to cpus 1024 to 1031:
+
+[root@moon 44]# echo 1024-1031 > smp_affinity
+[root@moon 44]# cat smp_affinity
+1024-1031
+
+Note that to do this with a bitmask would require 32 bitmasks of zero
+to follow the pertinent one.
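
The "32 bitmasks of zero" figure follows from the mask layout: each comma-separated group holds 32 cpus and the rightmost group is cpus 0-31, so cpu 1024 falls in group 1024 / 32 = 32. A minimal userspace sketch of that arithmetic is below. The helper name cpu_range_to_mask() is hypothetical and is not part of this commit or of the kernel; it only builds the shortest mask string that covers the range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not a kernel function: write the smp_affinity-style
 * hex mask for cpus first..last into out as comma-separated 32-bit groups,
 * most significant group first. Returns 0 on success, -1 if the range is
 * reversed or the buffer is too small.
 */
static int cpu_range_to_mask(unsigned first, unsigned last,
			     char *out, size_t outlen)
{
	unsigned top, w;
	size_t used = 0;

	assert(out != NULL);
	if (first > last || outlen == 0)
		return -1;

	top = last / 32;			/* index of the highest group needed */
	for (w = top + 1; w-- > 0; ) {
		uint32_t word = 0;
		unsigned bit;
		int n;

		/* Set every bit of this group whose cpu number is in range. */
		for (bit = 0; bit < 32; bit++) {
			unsigned cpu = w * 32 + bit;

			if (cpu >= first && cpu <= last)
				word |= (uint32_t)1 << bit;
		}

		n = snprintf(out + used, outlen - used, "%s%08x",
			     w == top ? "" : ",", (unsigned)word);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

Calling cpu_range_to_mask(1024, 1031, buf, sizeof(buf)) fills buf with 000000ff followed by 32 groups of 00000000, which is the string the list form 1024-1031 replaces.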

Documentation/filesystems/proc.txt

Lines changed: 9 additions & 2 deletions
@@ -574,6 +574,12 @@ The contents of each smp_affinity file is the same by default:
 > cat /proc/irq/0/smp_affinity
 ffffffff
 
+There is an alternate interface, smp_affinity_list which allows specifying
+a cpu range instead of a bitmask:
+
+> cat /proc/irq/0/smp_affinity_list
+1024-1031
+
 The default_smp_affinity mask applies to all non-active IRQs, which are the
 IRQs which have not yet been allocated/activated, and hence which lack a
 /proc/irq/[0-9]* directory.
@@ -583,12 +589,13 @@ reports itself as being attached. This hardware locality information does not
 include information about any possible driver locality preference.
 
 prof_cpu_mask specifies which CPUs are to be profiled by the system wide
-profiler. Default value is ffffffff (all cpus).
+profiler. Default value is ffffffff (all cpus if there are only 32 of them).
 
 The way IRQs are routed is handled by the IO-APIC, and it's Round Robin
 between all the CPUs which are allowed to handle it. As usual the kernel has
 more info than you and does a better job than you, so the defaults are the
-best choice for almost everyone.
+best choice for almost everyone. [Note this applies only to those IO-APIC's
+that support "Round Robin" interrupt distribution.]
 
 There are three more important subdirectories in /proc: net, scsi, and sys.
 The general rule is that the contents, or even the existence of these

include/linux/bitmap.h

Lines changed: 4 additions & 1 deletion
@@ -55,7 +55,8 @@
  * bitmap_parse(buf, buflen, dst, nbits)	Parse bitmap dst from kernel buf
  * bitmap_parse_user(ubuf, ulen, dst, nbits)	Parse bitmap dst from user buf
  * bitmap_scnlistprintf(buf, len, src, nbits)	Print bitmap src as list to buf
- * bitmap_parselist(buf, dst, nbits)		Parse bitmap dst from list
+ * bitmap_parselist(buf, dst, nbits)		Parse bitmap dst from kernel buf
+ * bitmap_parselist_user(buf, dst, nbits)	Parse bitmap dst from user buf
  * bitmap_find_free_region(bitmap, bits, order)	Find and allocate bit region
  * bitmap_release_region(bitmap, pos, order)	Free specified bit region
  * bitmap_allocate_region(bitmap, pos, order)	Allocate specified bit region
@@ -129,6 +130,8 @@ extern int bitmap_scnlistprintf(char *buf, unsigned int len,
 			const unsigned long *src, int nbits);
 extern int bitmap_parselist(const char *buf, unsigned long *maskp,
 			int nmaskbits);
+extern int bitmap_parselist_user(const char __user *ubuf, unsigned int ulen,
+			unsigned long *dst, int nbits);
 extern void bitmap_remap(unsigned long *dst, const unsigned long *src,
 		const unsigned long *old, const unsigned long *new, int bits);
 extern int bitmap_bitremap(int oldbit,

include/linux/cpumask.h

Lines changed: 15 additions & 0 deletions
@@ -546,6 +546,21 @@ static inline int cpumask_parse_user(const char __user *buf, int len,
 	return bitmap_parse_user(buf, len, cpumask_bits(dstp), nr_cpumask_bits);
 }
 
+/**
+ * cpumask_parselist_user - extract a cpumask from a user string
+ * @buf: the buffer to extract from
+ * @len: the length of the buffer
+ * @dstp: the cpumask to set.
+ *
+ * Returns -errno, or 0 for success.
+ */
+static inline int cpumask_parselist_user(const char __user *buf, int len,
+				     struct cpumask *dstp)
+{
+	return bitmap_parselist_user(buf, len, cpumask_bits(dstp),
+							nr_cpumask_bits);
+}
+
 /**
  * cpulist_scnprintf - print a cpumask into a string as comma-separated list
  * @buf: the buffer to sprintf into

kernel/irq/proc.c

Lines changed: 50 additions & 4 deletions
@@ -19,7 +19,7 @@ static struct proc_dir_entry *root_irq_dir;
 
 #ifdef CONFIG_SMP
 
-static int irq_affinity_proc_show(struct seq_file *m, void *v)
+static int show_irq_affinity(int type, struct seq_file *m, void *v)
 {
 	struct irq_desc *desc = irq_to_desc((long)m->private);
 	const struct cpumask *mask = desc->irq_data.affinity;
@@ -28,7 +28,10 @@ static int irq_affinity_proc_show(struct seq_file *m, void *v)
 	if (irqd_is_setaffinity_pending(&desc->irq_data))
 		mask = desc->pending_mask;
 #endif
-	seq_cpumask(m, mask);
+	if (type)
+		seq_cpumask_list(m, mask);
+	else
+		seq_cpumask(m, mask);
 	seq_putc(m, '\n');
 	return 0;
 }
@@ -59,7 +62,18 @@ static int irq_affinity_hint_proc_show(struct seq_file *m, void *v)
 #endif
 
 int no_irq_affinity;
-static ssize_t irq_affinity_proc_write(struct file *file,
+static int irq_affinity_proc_show(struct seq_file *m, void *v)
+{
+	return show_irq_affinity(0, m, v);
+}
+
+static int irq_affinity_list_proc_show(struct seq_file *m, void *v)
+{
+	return show_irq_affinity(1, m, v);
+}
+
+
+static ssize_t write_irq_affinity(int type, struct file *file,
 		const char __user *buffer, size_t count, loff_t *pos)
 {
 	unsigned int irq = (int)(long)PDE(file->f_path.dentry->d_inode)->data;
@@ -72,7 +86,10 @@ static ssize_t irq_affinity_proc_write(struct file *file,
 	if (!alloc_cpumask_var(&new_value, GFP_KERNEL))
 		return -ENOMEM;
 
-	err = cpumask_parse_user(buffer, count, new_value);
+	if (type)
+		err = cpumask_parselist_user(buffer, count, new_value);
+	else
+		err = cpumask_parse_user(buffer, count, new_value);
 	if (err)
 		goto free_cpumask;
 
@@ -100,11 +117,28 @@ static ssize_t irq_affinity_proc_write(struct file *file,
 	return err;
 }
 
+static ssize_t irq_affinity_proc_write(struct file *file,
+		const char __user *buffer, size_t count, loff_t *pos)
+{
+	return write_irq_affinity(0, file, buffer, count, pos);
+}
+
+static ssize_t irq_affinity_list_proc_write(struct file *file,
+		const char __user *buffer, size_t count, loff_t *pos)
+{
+	return write_irq_affinity(1, file, buffer, count, pos);
+}
+
 static int irq_affinity_proc_open(struct inode *inode, struct file *file)
 {
 	return single_open(file, irq_affinity_proc_show, PDE(inode)->data);
 }
 
+static int irq_affinity_list_proc_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, irq_affinity_list_proc_show, PDE(inode)->data);
+}
+
 static int irq_affinity_hint_proc_open(struct inode *inode, struct file *file)
 {
 	return single_open(file, irq_affinity_hint_proc_show, PDE(inode)->data);
@@ -125,6 +159,14 @@ static const struct file_operations irq_affinity_hint_proc_fops = {
 	.release	= single_release,
 };
 
+static const struct file_operations irq_affinity_list_proc_fops = {
+	.open		= irq_affinity_list_proc_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+	.write		= irq_affinity_list_proc_write,
+};
+
 static int default_affinity_show(struct seq_file *m, void *v)
 {
 	seq_cpumask(m, irq_default_affinity);
@@ -289,6 +331,10 @@ void register_irq_proc(unsigned int irq, struct irq_desc *desc)
 	proc_create_data("affinity_hint", 0400, desc->dir,
 			 &irq_affinity_hint_proc_fops, (void *)(long)irq);
 
+	/* create /proc/irq/<irq>/smp_affinity_list */
+	proc_create_data("smp_affinity_list", 0600, desc->dir,
+			 &irq_affinity_list_proc_fops, (void *)(long)irq);
+
 	proc_create_data("node", 0444, desc->dir,
 			 &irq_node_proc_fops, (void *)(long)irq);
 #endif
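
On the read side, show_irq_affinity() hands the mask to seq_cpumask_list() when type is nonzero, so the file shows runs of set bits as ranges. The userspace sketch below illustrates that list format only. It is not the kernel's implementation, and the names test_bit_ul() and bitmap_to_list() are made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: return nonzero if the given bit is set in map. */
static int test_bit_ul(const unsigned long *map, unsigned bit)
{
	return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/*
 * Hypothetical helper: render the set bits of map as a cpu list such as
 * "0,2-4". Returns 0 on success, -1 if out is too small.
 */
static int bitmap_to_list(const unsigned long *map, unsigned nbits,
			  char *out, size_t outlen)
{
	size_t used = 0;
	unsigned cur = 0;

	assert(map != NULL && out != NULL);
	if (outlen == 0)
		return -1;
	out[0] = '\0';

	while (cur < nbits) {
		unsigned start, end;
		int n;

		if (!test_bit_ul(map, cur)) {
			cur++;
			continue;
		}

		/* Extend the run for as long as the next bit is also set. */
		start = cur;
		while (cur + 1 < nbits && test_bit_ul(map, cur + 1))
			cur++;
		end = cur++;

		if (start == end)
			n = snprintf(out + used, outlen - used, "%s%u",
				     used ? "," : "", start);
		else
			n = snprintf(out + used, outlen - used, "%s%u-%u",
				     used ? "," : "", start, end);
		if (n < 0 || (size_t)n >= outlen - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

With bits 1024 through 1031 set in a 2048-bit map, bitmap_to_list() produces the string 1024-1031, matching the documentation example in this commit.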

lib/bitmap.c

Lines changed: 97 additions & 12 deletions
@@ -571,8 +571,11 @@ int bitmap_scnlistprintf(char *buf, unsigned int buflen,
 EXPORT_SYMBOL(bitmap_scnlistprintf);
 
 /**
- * bitmap_parselist - convert list format ASCII string to bitmap
+ * __bitmap_parselist - convert list format ASCII string to bitmap
  * @bp: read nul-terminated user string from this buffer
+ * @buflen: buffer size in bytes. If string is smaller than this
+ *    then it must be terminated with a \0.
+ * @is_user: location of buffer, 0 indicates kernel space
  * @maskp: write resulting mask here
  * @nmaskbits: number of bits in mask to be written
  *
@@ -587,20 +590,63 @@ EXPORT_SYMBOL(bitmap_scnlistprintf);
  *    %-EINVAL: invalid character in string
  *    %-ERANGE: bit number specified too large for mask
  */
-int bitmap_parselist(const char *bp, unsigned long *maskp, int nmaskbits)
+static int __bitmap_parselist(const char *buf, unsigned int buflen,
+		int is_user, unsigned long *maskp,
+		int nmaskbits)
 {
 	unsigned a, b;
+	int c, old_c, totaldigits;
+	const char __user *ubuf = buf;
+	int exp_digit, in_range;
 
+	totaldigits = c = 0;
 	bitmap_zero(maskp, nmaskbits);
 	do {
-		if (!isdigit(*bp))
-			return -EINVAL;
-		b = a = simple_strtoul(bp, (char **)&bp, BASEDEC);
-		if (*bp == '-') {
-			bp++;
-			if (!isdigit(*bp))
+		exp_digit = 1;
+		in_range = 0;
+		a = b = 0;
+
+		/* Get the next cpu# or a range of cpu#'s */
+		while (buflen) {
+			old_c = c;
+			if (is_user) {
+				if (__get_user(c, ubuf++))
+					return -EFAULT;
+			} else
+				c = *buf++;
+			buflen--;
+			if (isspace(c))
+				continue;
+
+			/*
+			 * If the last character was a space and the current
+			 * character isn't '\0', we've got embedded whitespace.
+			 * This is a no-no, so throw an error.
+			 */
+			if (totaldigits && c && isspace(old_c))
+				return -EINVAL;
+
+			/* A '\0' or a ',' signal the end of a cpu# or range */
+			if (c == '\0' || c == ',')
+				break;
+
+			if (c == '-') {
+				if (exp_digit || in_range)
+					return -EINVAL;
+				b = 0;
+				in_range = 1;
+				exp_digit = 1;
+				continue;
+			}
+
+			if (!isdigit(c))
 				return -EINVAL;
-			b = simple_strtoul(bp, (char **)&bp, BASEDEC);
+
+			b = b * 10 + (c - '0');
+			if (!in_range)
+				a = b;
+			exp_digit = 0;
+			totaldigits++;
 		}
 		if (!(a <= b))
 			return -EINVAL;
@@ -610,13 +656,52 @@ int bitmap_parselist(const char *bp, unsigned long *maskp, int nmaskbits)
 			set_bit(a, maskp);
 			a++;
 		}
-		if (*bp == ',')
-			bp++;
-	} while (*bp != '\0' && *bp != '\n');
+	} while (buflen && c == ',');
 	return 0;
 }
+
+int bitmap_parselist(const char *bp, unsigned long *maskp, int nmaskbits)
+{
+	char *nl = strchr(bp, '\n');
+	int len;
+
+	if (nl)
+		len = nl - bp;
+	else
+		len = strlen(bp);
+
+	return __bitmap_parselist(bp, len, 0, maskp, nmaskbits);
+}
 EXPORT_SYMBOL(bitmap_parselist);
 
+
+/**
+ * bitmap_parselist_user()
+ *
+ * @ubuf: pointer to user buffer containing string.
+ * @ulen: buffer size in bytes. If string is smaller than this
+ *    then it must be terminated with a \0.
+ * @maskp: pointer to bitmap array that will contain result.
+ * @nmaskbits: size of bitmap, in bits.
+ *
+ * Wrapper for bitmap_parselist(), providing it with user buffer.
+ *
+ * We cannot have this as an inline function in bitmap.h because it needs
+ * linux/uaccess.h to get the access_ok() declaration and this causes
+ * cyclic dependencies.
+ */
+int bitmap_parselist_user(const char __user *ubuf,
+			unsigned int ulen, unsigned long *maskp,
+			int nmaskbits)
+{
+	if (!access_ok(VERIFY_READ, ubuf, ulen))
+		return -EFAULT;
+	return __bitmap_parselist((const char *)ubuf,
+					ulen, 1, maskp, nmaskbits);
+}
+EXPORT_SYMBOL(bitmap_parselist_user);
+
+
 /**
  * bitmap_pos_to_ord - find ordinal of set bit at given position in bitmap
  * @buf: pointer to a bitmap
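
The character-at-a-time loop in __bitmap_parselist() can be tried out in userspace with a sketch like the one below. It mirrors the exp_digit / in_range state machine from the hunk but is not the kernel code: parse_cpu_list() is a hypothetical name, the __get_user() path is dropped, and two deliberate differences are marked in the comments (a group with no digits is rejected, and the range check runs per digit so that b cannot grow without bound).

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical userspace sketch of the list parser. maskp must have room
 * for nbits bits. Returns 0, -EINVAL or -ERANGE.
 */
static int parse_cpu_list(const char *buf, size_t buflen,
			  unsigned long *maskp, unsigned nbits)
{
	unsigned a, b;
	int c = 0, old_c, totaldigits = 0;
	int exp_digit, in_range;

	assert(buf != NULL && maskp != NULL);
	memset(maskp, 0, (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD *
			 sizeof(unsigned long));
	do {
		exp_digit = 1;		/* a digit must come next */
		in_range = 0;		/* no '-' seen in this group yet */
		a = b = 0;

		/* Get the next cpu number or range of cpu numbers. */
		while (buflen) {
			old_c = c;
			c = (unsigned char)*buf++;
			buflen--;
			if (isspace(c))
				continue;

			/* Whitespace between two tokens is an error. */
			if (totaldigits && c && isspace(old_c))
				return -EINVAL;

			/* '\0' or ',' ends the current number or range. */
			if (c == '\0' || c == ',')
				break;

			if (c == '-') {
				if (exp_digit || in_range)
					return -EINVAL;
				b = 0;
				in_range = 1;
				exp_digit = 1;
				continue;
			}

			if (!isdigit(c))
				return -EINVAL;

			b = b * 10 + (unsigned)(c - '0');
			/* Difference from the hunk: range check per digit. */
			if (b >= nbits)
				return -ERANGE;
			if (!in_range)
				a = b;
			exp_digit = 0;
			totaldigits++;
		}

		/* Difference from the hunk: reject a group with no digits. */
		if (exp_digit)
			return -EINVAL;
		if (!(a <= b))
			return -EINVAL;
		while (a <= b) {
			maskp[a / BITS_PER_WORD] |= 1UL << (a % BITS_PER_WORD);
			a++;
		}
	} while (buflen && c == ',');
	return 0;
}
```

Passing the ten bytes of "1024-1031\n" (what echo would write) sets bits 1024 through 1031; the trailing newline is skipped as whitespace. A reversed range such as "3-1" or embedded whitespace such as "1 2" returns -EINVAL.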
