WIP: Seq file #232

adamrk · 2021-04-28T10:02:52Z

Add a Rust trait equivalent to the C seq_operations interface and the ability to create a /proc file from a type implementing that trait. Instead of just translating each function in the interface to Rust I used an iterator instead since it maps pretty cleanly. But happy to change it if we thing it makes more sense to keep things close to the C side.

A couple notes:

I split this into two commits, where the second contains only CI/CD changes, so the first commit can be eventually upstreamed, but not the second. Does this make sense or should we be doing something else?
In the sample driver I called some panicking functions, but I can't remember what notation we decided on for for the comments on why they won't panic (just used NO PANIC).
There's one point in seq_file.rs where I needed to allocate a string, but couldn't get what I needed using try_reserve, so just left a TODO about fixing it when we have more fallible options.
Added some testing in the CI that actually uses the sample driver instead of just loading/unloading it. I needed this for my own testing, but it seems like we're not doing it for any other sample drivers. Is there a reason for that?

samples/rust/Kconfig

ojeda · 2021-04-28T11:40:46Z

rust/kernel/seq_file.rs

+/// Create a `/proc` file.
+///
+/// The returned value must not be dropped until the module is unloaded.
+pub fn proc_create<T: SeqOperations + Sync>(


Do we need #[cfg] on proc etc.?

(Not necessarily on the function itself, it could be on the implementation instead so that callers do not need to care, i.e. skipping the bindings call if PROC is not there -- please check what the kernel does currently for PROC-related calls).

Hmmm... It looks like other places have an #ifdef CONFIG_PROC_FS block around all the code for setting up the seq_file. Not sure which way makes more sense?

Ended up splitting this into seq_file.rs and proc_fs.rs. The proc specific parts are all under #[cfg(CONFIG_PROC_FS)], but the seq_file parts aren't so we can later add e.g. debugfs seq_files.

samples/rust/rust_seq_file.rs

.github/workflows/ci.yaml

ojeda · 2021-04-28T11:53:05Z

I split this into two commits, where the second contains only CI/CD changes, so the first commit can be eventually upstreamed, but not the second. Does this make sense or should we be doing something else?

Thanks for being mindful of that! It does make it easier for me when pushing to rust-next, but I take care of that manually otherwise.

In the sample driver I called some panicking functions, but I can't remember what notation we decided on for for the comments on why they won't panic (just used NO PANIC).

NOPANIC -- let me document that today, I will send a PR. Thanks a lot for starting its usage!

There's one point in seq_file.rs where I needed to allocate a string, but couldn't get what I needed using try_reserve, so just left a TODO about fixing it when we have more fallible options.

That's perfectly fine!

Added some testing in the CI that actually uses the sample driver instead of just loading/unloading it. I needed this for my own testing, but it seems like we're not doing it for any other sample drivers. Is there a reason for that?

We have some tests for the printing sample, for instance. But yeah, no particular reason, some tests that could be there were never added -- feel free to send a PR; otherwise I will do it at some point.

wedsonaf

Comments from a quick pass. I'll take a closer look tomorrow.

rust/kernel/seq_file.rs

samples/rust/rust_seq_file.rs

wedsonaf · 2021-04-28T15:10:35Z

rust/kernel/seq_file.rs

+        // SAFETY: Calling a C function. `s` is guaranteed to be null terminated
+        // because we explicitly constructed it just above.
+        unsafe {
+            bindings::seq_puts(m, s.as_ptr() as *const u8 as *const c_types::c_char);


Can we use seq_printf here and specify the string length so that we don't need to depend on to_string and concatenation?

Looked into it a bit, but I can't really see how to do it (at least not in a way that's generic over Display). Here are some options I found:

If we use seq_printf with the length in the format string, we'll still need to allocate a String to calculate the length and get the pointer to pass. It would get rid of the concat though.

Instead of having Item: Display, we could require a fn(&Item) -> &str. Then the implementer can decide if they want to allocate a String or not.

We could have the trait define a print function that take Item and seq_file and let the implementer use the various seq print functions. But then they'll need to use unsafe and they'd have the same problem as us if they want to print a string or type that doesn't have a printf format.

Instead of having Item: Display we could make our own trait for types that can be formatted with printf and write safe wrappers around seq_printf for each of those.

There is the VaList type which seems to handle varargs, so we could have that type instead of the generic Item, but it seems more for defining Rust code that can be called on the C side with varargs, not for generating a Rust type that can be passed to C functions expecting varargs.

I don't think there's any support for putting a dynamically sized string on the stack.

Went with 2.

I think it's fine for now. In the future if there's a need for it, I think it would make sense to do a variant of option 3, where we would provide abstractions that would allow users to use the various seq_file write variants without having to call unsafe functions.

Happy to make a follow-up PR for that.

adamrk · 2021-04-28T18:51:13Z

Just fixed the compiler error in the sample by making SeqFile Sync. Why does KernelModule need to be Sync by the way?

ojeda · 2021-04-30T04:13:56Z

samples/rust/rust_seq_file.rs

+        let template = "rust_seq_file: device opened this many times: ";
+        message.try_reserve_exact(template.len() + 4).ok()?;
+        // NOPANIC: We reserved space for `template` above.
+        message.push_str(template);
+        if *count < 1000 {
+            // NOPANIC: There are 4 characters remaining in the string which
+            // leaves space for a 3 digit number and the newline.
+            write!(&mut message, "{}\n", *count).ok()?;
+        }


Something more explicit for the 3, 4 and 1000 constants:

const MAX_DIGITS: u32 = 3; const MAX_LENGTH: u32 = MAX_DIGITS + 1; const MAX_COUNT: u32 = 10u32.pow(MAX_DIGITS);

I would put them local to the function, given they are not used elsewhere.

rust/kernel/seq_file.rs

wedsonaf · 2021-05-02T13:22:09Z

rust/kernel/seq_file.rs

+    /// `m` must have been created from a proc file generated by calling
+    /// `proc_create::<T>`.
+    unsafe fn convert(m: *mut bindings::seq_file) -> *const T {
+        let reg = crate::container_of!((*m).op, Self, seq_ops);


You should use (*m).private to save state. This way op can be constant shared by everyone.

Had to make some changes to get this to work. I couldn't see how to stare the data in (*m).private using proc_create_seq_private because that always uses the default seq_open which doesn't give us the chance to mutate the seq_file created on open.

So I switched it to use proc_create_data which let's us use our own open.

Looking at how others have done this, it looks like it's done this way: call proc_create_seq_private with your state in data, then retrieve it using PDE_DATA(file_inode(seq->file)).

Thanks, got that working now.

adamrk · 2021-05-06T08:10:42Z

Think I've addressed everything, but it required quite a few changes so probably needs a fresh review.

wedsonaf · 2021-05-11T01:02:38Z

rust/kernel/bindings_helper.h

@@ -14,6 +16,10 @@
 #include <linux/mm.h>
 #include <uapi/linux/android/binder.h>

+#ifdef CONFIG_PROC_FS
+#include "../../fs/proc/internal.h"


Why do we need this?

I don't think this would be acceptable to upstream maintainers. We shouldn't be peeking into internal stuff.

I was using it to get access to the data field of proc_dir_entry so that we can drop the data when dropping the entry.

Removed this by having the Rust ProcDirEntry store it's own copy of the data pointer so that we don't need to extract it from the proc_dir_entry.

wedsonaf · 2021-05-11T15:06:26Z

rust/kernel/proc_fs.rs

+/// # Invariants
+///
+/// The pointer [`ProcDirEntry::proc_dir_entry`] is a valid pointer and
+/// it's field [`bindings::proc_dir_entry::data`] is a valid pointer to


typo: its field

wedsonaf · 2021-05-11T15:09:30Z

rust/kernel/proc_fs.rs

+        unsafe {
+            bindings::proc_remove(self.proc_dir_entry);
+        }
+        drop(data);


nit: this drop is implicit, we don't really need to call it explicitly.

wedsonaf · 2021-05-11T15:14:35Z

rust/kernel/proc_fs.rs

+    proc_ops: &'static bindings::proc_ops,
+    data: T,
+) -> KernelResult<ProcDirEntry<T>> {
+    let data_ptr = Box::into_raw(Box::try_new(data)?) as *mut c_types::c_void;


I think we need to improve this abstraction.

Forcing data to move into a Box means that it cannot contain any self-referential data structures, including mutexes. In the end, I think we'll need this to use PointerWrapper so that users can choose what wrapper they want (if any).

Makes sense. Now the user is responsible for choosing the PointerWrapper and passing in the data already wrapped.

wedsonaf · 2021-05-11T15:25:31Z

rust/kernel/seq_file.rs

+    type Iterator: Iterator<Item = Self::Item>;
+
+    /// Called once each time the `seq_file` is opened.
+    fn start(arg: &Self) -> Option<Box<Peekable<Self::Iterator>>>;


Here too I think we may want to consider PointerWrapper rather than Box. (Although here it seems less problematic than in the previous usage.)

Still looking into this one. The one thing I'm thinking about is that we'll also need to mutate the iterator on the inside, so we can't just make it generic over any PointerWrapper, right?

Switched this one to using PointerWrapper.

wedsonaf · 2021-05-11T15:32:11Z

rust/kernel/seq_file.rs

+        *pos += 1;
+    }
+
+    iterator.next();


We may want to check the return value of next here, mostly because the documentation says:

Advances the iterator and returns the next value.

Returns None when iteration is finished. Individual iterator implementations may choose to resume iteration, and so calling next() again may or may not eventually start returning Some(Item) again at some point.

That is, if we are at the end already (e.g., because the underlying data structure changed in the case of an iterator that allows mutation), then we risk restarting the iteration process if we don't check the return value.

Thanks, I didn't realize that iterators could resume like that.

I changed it, but I think in this case we might actually be ok without checking next. Since we have a Peekable and we've already called peek on it and checked for None (in an earlier call to next_callback or start_callback) so next will always return Some here. See impl of next for Peekable.

adamrk · 2021-05-13T19:59:42Z

Thanks for the review @wedsonaf 👍

adamrk · 2021-05-27T07:55:34Z

Hi @wedsonaf, just letting you know I think everything's been addressed now.

adamrk · 2021-06-23T19:09:51Z

Ping @wedsonaf - any further comments on this one?

The usage of PointerWrapper with calls to from_pointer followed by forget are strictly speaking a violation of the API because you're creating more than one mutable reference to an object, which leads to UB if they're ever called concurrently (which the C API allows).

This could occur if the file is opened once and then there are concurrent reads on the same file descriptor?

wedsonaf · 2021-06-24T11:08:52Z

Ping @wedsonaf - any further comments on this one?

The usage of PointerWrapper with calls to from_pointer followed by forget are strictly speaking a violation of the API because you're creating more than one mutable reference to an object, which leads to UB if they're ever called concurrently (which the C API allows).

This could occur if the file is opened once and then there are concurrent reads on the same file descriptor?

from_pointer is called on a shared resource, so it doesn't need to happen on the same file descriptor. For example, if multiple processes open and read from the same proc file concurrently, there's a window where they can all call from_pointer concurrently and each have a supposedly unique instance of the same object.

Also note start and the matching stop are called on read, not when the file is opened (and released), see https://elixir.bootlin.com/linux/v5.12.13/source/fs/seq_file.c#L222. You may want to update the documentation to reflect this.

adamrk · 2021-06-29T01:39:31Z

LGTM. Suggestion for next time: break the PR up into smaller chunks to make it easier to review.

The usage of PointerWrapper with calls to from_pointer followed by forget are strictly speaking a violation of the API because you're creating more than one mutable reference to an object, which leads to UB if they're ever called concurrently (which the C API allows).

I think #386 improves this, allowing you to borrow the contents without having to call from_pointer, and #387 I think allows for a simpler sample.

Once those are in, I think we should update your implementation. I'll leave iy up to you whether you want to wait for them or do it in a subsequent PR; if the latter, would you mind opening an issue so that we don't lose track of it?

I opened #394 which contains these changes and uses the new PointerWrapper so that we don't do from_pointer -> forget in the start callback. I don't think this will exactly work for the cases where we want to borrow the iterator in show and next since we actually need mutable borrows in those cases. Opened #395 to discuss that (and so we don't forget to fix those).

ksquirrel · 2021-06-29T08:39:40Z

Review of 2fc05b510fdd:

⚠️ This PR changes more than 20 files.
⚠️ This PR adds/deletes more than 500 lines.
✔️ Commit 4f0aeff: Looks fine!
✔️ Commit 2fc05b5: Looks fine!

nbdd0121 · 2021-09-15T18:32:15Z

You could use the Display implementation and print using seq_printf("%pA", &format_args!("{}", item_to_display)).

This way you would just need to put type Item: Display and would not need the display method anymore!

Adds a trait which allows Rust modules to implement the seq_operations interface and use it to create a `/proc` file. Signed-off-by: Adam Bratschi-Kaye <ark.email@gmail.com>

Signed-off-by: Adam Bratschi-Kaye <ark.email@gmail.com>

The namespace percpu counter protects pending I/O, and we can only safely diable the namespace once the counter drop to zero. Otherwise we end up with a crash when running blktests/nvme/058 (eg for loop transport): [ 2352.930426] [ T53909] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [Rust-for-Linux#1] PREEMPT SMP KASAN PTI [ 2352.930431] [ T53909] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] [ 2352.930434] [ T53909] CPU: 3 UID: 0 PID: 53909 Comm: kworker/u16:5 Tainted: G W 6.13.0-rc6 Rust-for-Linux#232 [ 2352.930438] [ T53909] Tainted: [W]=WARN [ 2352.930440] [ T53909] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 [ 2352.930443] [ T53909] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop] [ 2352.930449] [ T53909] RIP: 0010:blkcg_set_ioprio+0x44/0x180 as the queue is already torn down when calling submit_bio(); So we need to init the percpu counter in nvmet_ns_enable(), and wait for it to drop to zero in nvmet_ns_disable() to avoid having I/O pending after the namespace has been disabled. Fixes: 74d1696 ("nvmet-loop: avoid using mutex in IO hotpath") Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>

adamrk mentioned this pull request Apr 28, 2021

Dropping chrdev/miscdev::Registration and similar #233

Closed

adamrk force-pushed the seq-file branch from a7b7759 to d1e5c8c Compare April 28, 2021 11:43

ojeda reviewed Apr 28, 2021

View reviewed changes

wedsonaf reviewed Apr 28, 2021

View reviewed changes

adamrk force-pushed the seq-file branch from d1e5c8c to 80628f9 Compare April 28, 2021 18:49

adamrk force-pushed the seq-file branch 2 times, most recently from 231f3b4 to dbb38aa Compare April 28, 2021 20:40

ojeda reviewed Apr 30, 2021

View reviewed changes

adamrk force-pushed the seq-file branch from dbb38aa to f66d622 Compare May 2, 2021 10:35

wedsonaf suggested changes May 2, 2021

View reviewed changes

adamrk force-pushed the seq-file branch 3 times, most recently from 15244a0 to d96b676 Compare May 6, 2021 08:07

adamrk force-pushed the seq-file branch 2 times, most recently from 0aeaa42 to 6f05b79 Compare May 7, 2021 16:43

wedsonaf reviewed May 11, 2021

View reviewed changes

adamrk force-pushed the seq-file branch from 6f05b79 to 12ec4b2 Compare May 13, 2021 19:51

adamrk force-pushed the seq-file branch 2 times, most recently from 1d5d31a to d53b853 Compare May 13, 2021 21:11