Introduce API to safely initialize Packets #3533

steviez · 2024-11-07T23:11:33Z

Problem

We currently abuse uninitialized data with the Packet type, which leaves us open to undefined behavior (UB) in several places. A common pattern is:

Initialize a PacketBatch (mostly a Vec<Packet>) with capacity N
Call set_len(N) on that PacketBatch
Access individual Packets as if they are properly initialized

agave/entry/src/entry.rs

Lines 546 to 569 in f621667

    
    let mut packet_batch = PacketBatch::new_with_recycler(
        &verify_recyclers.packet_recycler,
        num_transactions,
        "entry-sig-verify",
    );
    // We use set_len here instead of resize(num_txs, Packet::default()), to save
    // memory bandwidth and avoid writing a large amount of data that will be overwritten
    // soon afterwards. As well, Packet::default() actually leaves the packet data
    // uninitialized, so the initialization would simply write junk into
    // the vector anyway.
    unsafe {
        packet_batch.set_len(num_transactions);
    }
    let transaction_iter = transaction_chunk
        .iter()
        .map(|tx| tx.to_versioned_transaction());
    let res = packet_batch
        .iter_mut()
        .zip(transaction_iter)
        .all(|(packet, tx)| {
            *packet.meta_mut() = Meta::default();
            Packet::populate_packet(packet, None, &tx).is_ok()
        });

As that comment suggests, we did this to avoid writing data / 0's that we will immediately overwrite. However, the manner in which we're doing it is not safe.

Summary of Changes

Introduce PacketWriter to help fill the buffer of a MaybeUninit<Packet> through pointers instead of references
Add several methods to Packet to initialize from some serializable data or a regular byte stream
Update abusers in entry.rs and ShredFetchStage

Subsequent PRs

There is some other work that I have in mind but would like to defer to subsequent PRs, to keep each individual PR as small as possible.

Remaining packet_batch.set_len()

There is still one more location where we do set_len() and access the item:

agave/streamer/src/nonblocking/quic.rs

Lines 960 to 972 in f621667

    
    unsafe {
        packet_batch.set_len(packet_batch.len() + 1);
    }
    let i = packet_batch.len() - 1;
    *packet_batch[i].meta_mut() = packet_accumulator.meta;
    let num_chunks = packet_accumulator.chunks.len();
    let mut offset = 0;
    for chunk in packet_accumulator.chunks {
        packet_batch[i].buffer_mut()[offset..offset + chunk.len()]
            .copy_from_slice(&chunk);
        offset += chunk.len();
    }

This one is slightly different because the writes could arrive out of order, so I was thinking of refactoring it in a different PR.

clippy::uninit_assumed_init()

Another instance where we may immediately access uninitialized data is here:

agave/sdk/packet/src/lib.rs

Lines 210 to 219 in f621667

    
    #[allow(clippy::uninit_assumed_init)]
    impl Default for Packet {
        fn default() -> Self {
            let buffer = std::mem::MaybeUninit::<[u8; PACKET_DATA_SIZE]>::uninit();
            Self {
                buffer: unsafe { buffer.assume_init() },
                meta: Meta::default(),
            }
        }
    }

I would like to address this separately as well, as I think the proper solution is to deprecate Packet::default(). However, Packet lives in sdk so we have to be a little more careful with that one


apfitzge

Left a few initial comments.

@alessandrod all this work to avoid setting zeros. Additional motivation for Packet!

apfitzge · 2024-11-08T15:33:40Z

sdk/packet/src/lib.rs

+    fn init_packet_meta(packet: &mut mem::MaybeUninit<Packet>, meta: Meta) {
+        // SAFETY: Access the field by pointer as creating a reference to
+        // and/or within the uninitialized Packet is undefined behavior
+        unsafe { ptr::addr_of_mut!((*packet.as_mut_ptr()).meta).write(meta) };


Wasn't there a new syntax for getting ptr of a field introduced in 1.82?

Iirc all this concern came from the upgrade to that version


apfitzge · 2024-11-08T15:36:22Z

sdk/packet/src/lib.rs

@@ -224,6 +292,61 @@ impl PartialEq for Packet {
    }
 }

+/// A custom implementation of io::Write to facilitate safe (non-UB)
+/// initialization of a MaybeUninit<Packet>
+struct PacketWriter {


Why not a simple wrapper around the MaybeUninit packet?

We know the capacity of the buffer, and can determine remaining bytes from the current length and the fixed capacity


apfitzge · 2024-11-08T15:46:33Z

Not convinced right now that keeping set_len is the ideal safe API.

Can we not do some sort of 'push_data'? It could call set_len internally, but the function itself would be safe.

steviez

Not convinced right now that keeping set_len is the ideal safe API.
Can we not do some sort of 'push_data'? It could call set_len internally, but the function itself would be safe.

Would push_data() do anything else besides call set_len()? I'd be open to demoting set_len() to private in favor of something else where we do the set_len() + add comments to hopefully avoid future abuse. Just want to make sure I follow your suggestion

steviez · 2024-11-08T15:47:45Z

sdk/packet/src/lib.rs

+    fn init_packet_meta(packet: &mut mem::MaybeUninit<Packet>, meta: Meta) {
+        // SAFETY: Access the field by pointer as creating a reference to
+        // and/or within the uninitialized Packet is undefined behavior
+        unsafe { ptr::addr_of_mut!((*packet.as_mut_ptr()).meta).write(meta) };


Wasn't there a new syntax for getting ptr of a field introduced in 1.82?

Oh nice, didn't know about this; thanks for mentioning and will read up on it

Iirc all this concern came from the upgrade to that version

Yep, validator was panicking with 1.82. The panic was addressed with #3325, so we could hypothetically go back to 1.82 to take advantage of the new syntax (&raw) in 1.82
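To make the pointer-based field write concrete, here is a minimal sketch. The `Meta` and `Packet` types below are simplified, hypothetical stand-ins (not the real sdk definitions), and `init_packet_zeroed` is an illustrative helper that is not part of the PR. The sketch uses `ptr::addr_of_mut!`; the newer spelling discussed above is only mentioned in a comment.

```rust
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical, simplified stand-ins for the real `Meta` and `Packet` types.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Meta {
    pub size: usize,
}

pub struct Packet {
    pub buffer: [u8; 8],
    pub meta: Meta,
}

/// Writes `meta` into an uninitialized `Packet` without creating a reference
/// to the (possibly uninitialized) field.
pub fn init_packet_meta(packet: &mut MaybeUninit<Packet>, meta: Meta) {
    // SAFETY: `packet.as_mut_ptr()` is valid for writes, and `addr_of_mut!`
    // computes the field address without an intermediate reference.
    // On Rust 1.82+ the same place can be written as `&raw mut (*p).meta`.
    unsafe { ptr::addr_of_mut!((*packet.as_mut_ptr()).meta).write(meta) };
}

/// Illustrative helper (an assumption of this sketch): initializes every
/// field so that `assume_init()` is sound afterwards.
pub fn init_packet_zeroed(packet: &mut MaybeUninit<Packet>, meta: Meta) {
    // SAFETY: same reasoning as above, applied to the `buffer` field.
    unsafe { ptr::addr_of_mut!((*packet.as_mut_ptr()).buffer).write([0u8; 8]) };
    init_packet_meta(packet, meta);
}
```

Using `write` through the raw pointer also means no destructor runs on the old, uninitialized `Meta`, which is what a plain assignment through `meta_mut()` would do.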

steviez · 2024-11-08T15:59:39Z

sdk/packet/src/lib.rs

@@ -224,6 +292,61 @@ impl PartialEq for Packet {
    }
 }

+/// A custom implementation of io::Write to facilitate safe (non-UB)
+/// initialization of a MaybeUninit<Packet>
+struct PacketWriter {


We know the capacity of the buffer, and can determine remaining bytes from the current length and the fixed capacity

The main motivation for writing this wrapper was to have something that implements std::io::Write that we could pass to bincode::serialize_into().

If you drill down into bincode, writer.write() might get called repeatedly for one invocation of bincode::serialize_into(). Thus, we need to track how many bytes we have written after each call to write; we don't have the ability to update packet.meta (which may not have been initialized yet) as we go.
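A rough sketch of that idea, assuming a hypothetical `UninitWriter` over a small fixed-size buffer (the real PacketWriter in the PR may differ in names and details). It shows why the writer itself has to remember how many bytes have been written across repeated `write` calls.

```rust
use std::io::{self, Write};
use std::mem::MaybeUninit;

// Hypothetical capacity; stands in for PACKET_DATA_SIZE.
const CAPACITY: usize = 16;

/// Hypothetical writer that fills an uninitialized byte buffer through raw
/// pointers and tracks how many bytes have been written so far.
pub struct UninitWriter<'a> {
    buffer: &'a mut MaybeUninit<[u8; CAPACITY]>,
    written: usize,
}

impl<'a> UninitWriter<'a> {
    pub fn new(buffer: &'a mut MaybeUninit<[u8; CAPACITY]>) -> Self {
        Self { buffer, written: 0 }
    }

    /// Number of bytes at the start of the buffer that are initialized.
    pub fn written(&self) -> usize {
        self.written
    }
}

impl Write for UninitWriter<'_> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let spare = CAPACITY - self.written;
        if buf.len() > spare {
            return Err(io::Error::new(io::ErrorKind::WriteZero, "buffer full"));
        }
        // SAFETY: `written + buf.len() <= CAPACITY`, so the destination range
        // stays inside the buffer; `buf` cannot overlap the exclusively
        // borrowed buffer.
        unsafe {
            let dst = (self.buffer.as_mut_ptr() as *mut u8).add(self.written);
            std::ptr::copy_nonoverlapping(buf.as_ptr(), dst, buf.len());
        }
        self.written += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

A serializer that takes any `io::Write` can then be pointed at this writer, and the caller reads `written()` once at the end to record the final size.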

apfitzge · 2024-11-08T19:28:02Z

Would push_data() do anything else besides call set_len()? I'd be open to demoting set_len() to private in favor of something else where we do the set_len() + add comments to hopefully avoid future abuse. Just want to make sure I follow your suggestion

I was imagining something like this:

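fn try_push_data(&mut self, bytes: &[u8], addr: Option<&SocketAddr>) -> bool {
    if self.len() == self.capacity() {
        return false;
    }

    // Borrow the first spare slot; a MaybeUninit<Packet> cannot be moved out of the slice
    let uninitialized_packet = &mut self.x.spare_capacity_mut()[0];
    // Only grow the length if initialization succeeded (assumes an io::Result return)
    if Packet::init_packet_from_bytes(uninitialized_packet, bytes, addr).is_err() {
        return false;
    }
    // SAFETY: the slot at the old length was initialized just above, and the
    // capacity check guarantees len + 1 <= capacity
    unsafe { self.set_len(self.len() + 1) };

    true
}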
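For reference, a self-contained sketch of the same shape that compiles on its own. Everything here is a hypothetical stand-in: `Packet` is reduced to a small buffer plus a size, `PacketBatch` tracks its own capacity, and the address parameter is dropped. The point is only that the unsafe pieces live inside one safe function.

```rust
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical size; stands in for PACKET_DATA_SIZE.
const DATA_SIZE: usize = 8;

// Simplified stand-in for the real Packet type.
pub struct Packet {
    buffer: [u8; DATA_SIZE],
    size: usize,
}

impl Packet {
    pub fn data(&self) -> &[u8] {
        &self.buffer[..self.size]
    }

    /// Fully initializes `slot` from `bytes`. Returns false (leaving the slot
    /// untouched) if the payload does not fit.
    fn init_from_bytes(slot: &mut MaybeUninit<Packet>, bytes: &[u8]) -> bool {
        if bytes.len() > DATA_SIZE {
            return false;
        }
        let packet = slot.as_mut_ptr();
        // SAFETY: all writes go through raw pointers into memory owned by
        // `slot`; the payload fits, and the tail is zeroed so every byte of
        // `buffer` ends up initialized.
        unsafe {
            let buffer = ptr::addr_of_mut!((*packet).buffer).cast::<u8>();
            ptr::copy_nonoverlapping(bytes.as_ptr(), buffer, bytes.len());
            buffer.add(bytes.len()).write_bytes(0, DATA_SIZE - bytes.len());
            ptr::addr_of_mut!((*packet).size).write(bytes.len());
        }
        true
    }
}

// Simplified stand-in for the real PacketBatch type.
pub struct PacketBatch {
    packets: Vec<Packet>,
    capacity: usize,
}

impl PacketBatch {
    pub fn with_capacity(capacity: usize) -> Self {
        Self { packets: Vec::with_capacity(capacity), capacity }
    }

    pub fn packets(&self) -> &[Packet] {
        &self.packets
    }

    /// Safe API: initializes the next spare slot, then grows the length.
    pub fn try_push_data(&mut self, bytes: &[u8]) -> bool {
        if self.packets.len() == self.capacity {
            return false;
        }
        let slot = &mut self.packets.spare_capacity_mut()[0];
        if !Packet::init_from_bytes(slot, bytes) {
            return false;
        }
        let new_len = self.packets.len() + 1;
        // SAFETY: the slot at the old length was fully initialized above, and
        // new_len <= capacity because of the check at the top.
        unsafe { self.packets.set_len(new_len) };
        true
    }
}
```

Callers never see `set_len` or `MaybeUninit`; a failed push leaves the batch length unchanged.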

behzadnouri

Can you please clarify where exactly the undefined behavior happens with the existing code?

behzadnouri · 2024-11-10T23:59:00Z

core/src/shred_fetch_stage.rs

-        unsafe {
-            packet_batch.set_len(PACKETS_PER_BATCH);
-        };


This is removing one unsafe but then adding two new ones.
Why is the new code better than the old one?


behzadnouri · 2024-11-11T00:23:42Z

entry/src/entry.rs

-                    packet_batch.set_len(num_transactions);
-                }
+
+                let uninitialized_packets = packet_batch.spare_capacity_mut().iter_mut();


don't we need assume_init somewhere below?


behzadnouri · 2024-11-11T00:30:21Z

sdk/packet/src/lib.rs

+        addr: Option<&SocketAddr>,
+    ) -> io::Result<()> {
+        let mut writer = PacketWriter::new_from_uninit_packet(packet);
+        let num_bytes_written = writer.write(bytes)?;


This should probably use Write::write_all.
Write::write is not meant to write the entire buffer.
And there is no need to rely on the implementation details of PacketWriter.


behzadnouri · 2024-11-11T00:35:39Z

sdk/packet/src/lib.rs

+        // SAFETY: We previously verified that buf.len() <= self.spare_capacity
+        // so this write will not push us past the end of the buffer. Likewise,
+        // we can update self.spare_capacity without fear of overflow
+        unsafe {


all these new instances of unsafe are not ideal.


steviez

Can you please clarify where exactly the undefined behavior happens with the existing code?

Basically, we're doing something the docs tell us not to:
https://doc.rust-lang.org/beta/std/mem/union.MaybeUninit.html#initialization-invariant

For example, when we do:

*packet.meta_mut() = Meta::default();

On a packet that wasn't actually initialized, drop will get called on the old Meta that wasn't actually initialized.

The particular areas this PR changes don't currently appear to be having an observable effect, but this PR was made to be proactive instead of reactive in light of the issue that #3325 addressed

steviez · 2024-11-11T06:01:00Z

core/src/shred_fetch_stage.rs

-        unsafe {
-            packet_batch.set_len(PACKETS_PER_BATCH);
-        };


https://doc.rust-lang.org/std/vec/struct.Vec.html#method.set_len

Safety

new_len must be less than or equal to capacity().

The elements at old_len..new_len must be initialized.

We are not initializing the data first so we are violating the second bullet

steviez · 2024-11-11T06:11:27Z

entry/src/entry.rs

-                    packet_batch.set_len(num_transactions);
-                }
+
+                let uninitialized_packets = packet_batch.spare_capacity_mut().iter_mut();


We do not; set_len() does the work for us at the end:

assume_init() yields a T from a MaybeUninit<T>
We are starting with a Vec<Packet> (the type under the hood for PacketBatch)
packet_batch.spare_capacity_mut() allows us to access elements at index i where vec_length <= i < vec_capacity
But, our access to these is of type MaybeUninit<Packet>
We initialize the elements in place, so calling set_len() is saying "these are valid elements of the Vec now and can be accessed normally; also drop them normally when dropping the Vec"
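The sequence above can be sketched on a plain `Vec<String>` (chosen here only because `String` has a real destructor; the `fill_spare` helper is hypothetical and not part of the PR):

```rust
/// Initializes up to `inputs.len()` elements in the spare capacity of `out`
/// and only afterwards extends the length. Returns how many were appended.
pub fn fill_spare(out: &mut Vec<String>, inputs: &[&str]) -> usize {
    let mut initialized = 0;
    // Each slot is a `&mut MaybeUninit<String>`; `write` initializes it in
    // place without reading or dropping whatever bytes were there before.
    for (slot, input) in out.spare_capacity_mut().iter_mut().zip(inputs) {
        slot.write(input.to_uppercase());
        initialized += 1;
    }
    let new_len = out.len() + initialized;
    // SAFETY: exactly `initialized` slots directly after the old length were
    // written, and they came from the spare capacity, so new_len <= capacity.
    unsafe { out.set_len(new_len) };
    initialized
}
```

No `assume_init()` appears anywhere: the elements never leave the Vec's allocation, and `set_len` is the point where the Vec takes responsibility for them. If something panicked inside the loop, the length would still be the old one, so nothing uninitialized would be dropped.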

steviez · 2024-11-11T06:17:20Z

sdk/packet/src/lib.rs

+        addr: Option<&SocketAddr>,
+    ) -> io::Result<()> {
+        let mut writer = PacketWriter::new_from_uninit_packet(packet);
+        let num_bytes_written = writer.write(bytes)?;


And there is no need to rely on the implementation details of PacketWriter.

Yeah, that's fair and also simplifies things on the caller side (ie no longer need the debug_assert); will make this change
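To illustrate the difference between the two calls, here is a small sketch with a hypothetical `ChunkedWriter` that deliberately accepts only a few bytes per `write` call, which the `Write` contract permits:

```rust
use std::io::{self, Write};

/// Hypothetical writer that accepts at most `chunk` bytes per `write` call.
pub struct ChunkedWriter {
    pub data: Vec<u8>,
    pub chunk: usize,
}

impl Write for ChunkedWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // A short write: report how many bytes were actually consumed.
        let n = buf.len().min(self.chunk);
        self.data.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Single `write` call: may consume only a prefix of `bytes`.
pub fn write_once(chunk: usize, bytes: &[u8]) -> io::Result<Vec<u8>> {
    let mut writer = ChunkedWriter { data: Vec::new(), chunk };
    writer.write(bytes)?;
    Ok(writer.data)
}

/// `write_all`: keeps calling `write` until everything is consumed or an
/// error occurs.
pub fn write_everything(chunk: usize, bytes: &[u8]) -> io::Result<Vec<u8>> {
    let mut writer = ChunkedWriter { data: Vec::new(), chunk };
    writer.write_all(bytes)?;
    Ok(writer.data)
}
```

With `write_all`, the caller no longer has to compare the returned byte count against the input length.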

steviez · 2024-11-11T06:19:03Z

sdk/packet/src/lib.rs

+        // SAFETY: We previously verified that buf.len() <= self.spare_capacity
+        // so this write will not push us past the end of the buffer. Likewise,
+        // we can update self.spare_capacity without fear of overflow
+        unsafe {


I agree that we should be very stingy with our use of unsafe. However, our current code has the potential for UB, which is even less ideal than unsafe blocks

steviez · 2024-11-11T06:31:03Z

I was imagining something like this:
...

Gotcha, we could do something like that and I don't feel too strongly either way. To confirm, is the idea to concentrate as much as possible (if not all) of the unsafe in PacketBatch? If so, this seems reasonable and I'm open to making that adjustment

behzadnouri · 2024-11-11T14:59:08Z

If I understand this correctly rust says doing assume_init on an uninitialized memory which is not written to yet is undefined behavior. But I do not understand yet if that is still a problem if you never read that uninitialized memory unless you first write to it. Is this clarified anywhere?

To avoid assume_init then write issue, this code is trying to first write to that uninitialized memory then do assume_init. However even with the new code, of the 1232 bytes of the packet buffer, you may only write to a couple of bytes of it but you do assume_init on the whole packet. Why is that ok in the new code but not ok in the old code?

alessandrod · 2024-11-11T15:38:34Z

If I understand this correctly rust says doing assume_init on an uninitialized memory which is not written to yet is undefined behavior. But I do not understand yet if that is still a problem if you never read that uninitialized memory unless you first write to it. Is this clarified anywhere?

The issue is Drop. Say that you have let v = Vec::<u8>::with_capacity(1024). Here v.spare_capacity_mut() returns 1024 bytes of uninitialized memory. But it's safe, and drop(v) is safe because it only drops v.len() elements - it won't drop uninitialized elements. This is why you can't do v.set_len(x) before actually having written to x elements: if between the set_len and writing the elements you panic, then you have UB.
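A small sketch of the "drop only covers `len()` elements" point, using a hypothetical `Tracked` type whose destructor bumps a counter:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical element type that counts how many times it is dropped.
pub struct Tracked(Arc<AtomicUsize>);

impl Drop for Tracked {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Allocates room for `capacity` elements, initializes only the first
/// `initialized` of them, and returns how many destructors ran when the Vec
/// was dropped.
pub fn drops_on_vec_drop(capacity: usize, initialized: usize) -> usize {
    assert!(initialized <= capacity);
    let counter = Arc::new(AtomicUsize::new(0));
    let mut v: Vec<Tracked> = Vec::with_capacity(capacity);
    for slot in v.spare_capacity_mut().iter_mut().take(initialized) {
        slot.write(Tracked(Arc::clone(&counter)));
    }
    // SAFETY: the first `initialized` slots were written just above.
    unsafe { v.set_len(initialized) };
    // Dropping the Vec runs destructors for `len()` elements only; the rest
    // of the allocation is freed without being touched.
    drop(v);
    counter.load(Ordering::SeqCst)
}
```

Calling `set_len` with a larger value than the number of written slots would make the Vec run `Tracked::drop` on uninitialized memory, which is the situation described above.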

However even with the new code, of the 1232 bytes of the packet buffer, you may only write to a couple of bytes of it but you do assume_init on the whole packet. Why is that ok in the new code but not ok in the old code?

This is never ok. Because Packet::buffer is an array, and drop(array) always drops the whole thing (unlike vec which has variable length), the spare capacity must be initialized with .fill(0). So you save the memset cost on the actual payload size, the rest must be memset.

steviez · 2024-11-11T15:55:51Z

However even with the new code, of the 1232 bytes of the packet buffer, you may only write to a couple of bytes of it but you do assume_init on the whole packet. Why is that ok in the new code but not ok in the old code?

This is never ok. Because Packet::buffer is an array, and drop(array) always drops the whole thing (unlike vec which has variable length), the spare capacity must be initialized with .fill(0). So you save the memset cost on the actual payload size, the rest must be memset.

I now understand the issue you're calling out Behzad & thanks for the elaboration Alessandro; the fill-rest-with-zero is something I didn't consider / account for. So I guess we have several options:

Continue to have possible UB by not filling rest of buffer
Update code to do ptr.write_bytes(0, capacity - num_written)

I don't immediately see any other options without a much more major change to avoid the ptr.write_bytes(0, ...) to fill the rest of the buffer; certainly open to ideas 😄
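The second option could look roughly like the sketch below. `CAPACITY` and `init_buffer` are hypothetical names, and returning the array by value is only for illustration (the PR initializes in place):

```rust
use std::mem::MaybeUninit;
use std::ptr;

// Hypothetical capacity; stands in for PACKET_DATA_SIZE.
const CAPACITY: usize = 16;

/// Copies `payload` into the front of an uninitialized buffer and zero-fills
/// the remainder, so the whole array is initialized before `assume_init()`.
pub fn init_buffer(payload: &[u8]) -> Option<[u8; CAPACITY]> {
    if payload.len() > CAPACITY {
        return None;
    }
    let mut buffer = MaybeUninit::<[u8; CAPACITY]>::uninit();
    let base = buffer.as_mut_ptr() as *mut u8;
    // SAFETY: payload.len() <= CAPACITY, so both writes stay inside the
    // buffer, and together they cover all CAPACITY bytes.
    unsafe {
        ptr::copy_nonoverlapping(payload.as_ptr(), base, payload.len());
        base.add(payload.len()).write_bytes(0, CAPACITY - payload.len());
        Some(buffer.assume_init())
    }
}
```

Only `CAPACITY - payload.len()` bytes are memset, so the saving relative to zeroing the whole buffer up front is the payload size.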

behzadnouri · 2024-11-11T15:56:27Z

The issue is Drop.

But is the drop implementation for type Packet (or u8 for the buffer field) going to read any of the memory?
I don't think it does, and if it doesn't why should that still be a problem?

alessandrod · 2024-11-11T16:11:47Z

Continue to have possible UB by not filling rest of buffer
Update code to do ptr.write_bytes(0, capacity - num_written)

The latter is what we should do

EDIT:

OR we create our custom array type that doesn't drop the uninitialized part

But is the drop implementation for type Packet (or u8 for the buffer field) going to read any of the memory?
I don't think it does, and if it doesn't why should that still be a problem?

drop of [T; N] is drop(each T in the array) and Drop::drop() takes &mut self so it's like doing uninitialized_u8.drop()

behzadnouri · 2024-11-11T16:19:37Z

But is the drop implementation for type Packet (or u8 for the buffer field) going to read any of the memory?
I don't think it does, and if it doesn't why should that still be a problem?

drop of [T; N] is drop(each T in the array) and Drop::drop() takes &mut self so it's like doing uninitialized_u8.drop()

okay, but where does this emit any instructions to read the memory?
and if it doesn't, what is the problem then?

alessandrod · 2024-11-11T16:26:13Z

But is the drop implementation for type Packet (or u8 for the buffer field) going to read any of the memory?
I don't think it does, and if it doesn't why should that still be a problem?

drop of [T; N] is drop(each T in the array) and Drop::drop() takes &mut self so it's like doing uninitialized_u8.drop()

okay, but where does this emit any instructions to read the memory? and if it doesn't, what is the problem then?

The problem is that this is not C: the compiler is telling you "what happens if you make me call drop on something that isn't initialized is undefined, and I might crash or sell your SOL". See the SIGILL bug we just fixed.

apfitzge · 2024-11-11T17:39:08Z

Continue to have possible UB by not filling rest of buffer
Update code to do ptr.write_bytes(0, capacity - num_written)

Please feel free to tell me I'm dumb. Can we not make Packet store a MaybeUninit buffer?
We can write into it, and whenever we read we assume_init just the slice we know we've written?
This seems like it should be safe and avoids us setting zeroes.
I'm not sure this avoids the Drop of the array though, as alessandro described.

steviez · 2024-11-11T17:49:08Z

Please feel free to tell me I'm dumb. Can we not make Packet store a MaybeUninit buffer? We can write into it, and whenever we read we assume_init just the slice we know we've written? This seems like it should be safe and avoids us setting zeroes. I'm not sure this avoids the Drop of the array though, as alessandro described.

I was thinking along the same line (MaybeUninit<[u8; PACKET_DATA_SIZE]>) so will continue reading & see if anyone spots any issues right out of the gate. We would have to do more pointer manipulation within Packet, but I think we could use stuff like std::slice::from_raw_parts() to still return a &[u8] to a caller wanting to read the contents.

By having the entire thing be MaybeUninit, I believe that drop would not get called at all on the array. But if the array is u8's, I don't think this is a problem?
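A minimal sketch of that direction, assuming a simplified, hypothetical `Packet` with a `size` field in place of the real Meta (names and layout are illustrative only):

```rust
use std::mem::MaybeUninit;
use std::{ptr, slice};

// Hypothetical size; stands in for PACKET_DATA_SIZE.
const DATA_SIZE: usize = 16;

/// Simplified stand-in for Packet whose buffer is never assumed to be fully
/// initialized. Invariant: the first `size` bytes of `buffer` are initialized.
pub struct Packet {
    buffer: MaybeUninit<[u8; DATA_SIZE]>,
    size: usize,
}

impl Packet {
    /// Copies `bytes` into the front of the buffer; the tail stays
    /// uninitialized and is never exposed.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() > DATA_SIZE {
            return None;
        }
        let mut buffer = MaybeUninit::<[u8; DATA_SIZE]>::uninit();
        // SAFETY: bytes.len() <= DATA_SIZE, so the copy stays in bounds.
        unsafe {
            ptr::copy_nonoverlapping(bytes.as_ptr(), buffer.as_mut_ptr() as *mut u8, bytes.len());
        }
        Some(Self { buffer, size: bytes.len() })
    }

    /// Returns only the initialized prefix as a regular byte slice.
    pub fn data(&self) -> &[u8] {
        // SAFETY: by the struct invariant, the first `size` bytes are
        // initialized, and `size <= DATA_SIZE`.
        unsafe { slice::from_raw_parts(self.buffer.as_ptr() as *const u8, self.size) }
    }
}
```

Because `MaybeUninit<T>` never runs a destructor for its contents, dropping this `Packet` does not touch the uninitialized tail, and no zero-fill is needed.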

alessandrod · 2024-11-11T17:58:29Z

I was thinking along the same line (MaybeUninit<[u8; PACKET_DATA_SIZE]>) so will continue reading & see if anyone spots any issues right out of the gate. We would have to do more pointer manipulation within Packet, but I think we could use stuff like std::slice::from_raw_parts() to still return a &[u8] to a caller wanting to read the contents.

By having the entire thing be MaybeUninit, I believe that drop would not get called at all on the array. But if the array is u8's, I don't think this is a problem?

all correct!

behzadnouri · 2024-11-11T22:04:33Z

okay, but where does this emit any instructions to read the memory? and if it doesn't, what is the problem then?

The problem is that this is not C: the compiler is telling you "what happens if you make me call drop on something that isn't initialized is undefined, and I might crash or sell your SOL".

Okay, so there is no uninitialized memory read but this is still considered UB arbitrarily by the compiler. Is that what you are saying? and do you have any reference for this?

See the SIGILL bug we just fixed.

That was different though. There was a pointer there and a pointer initialized with garbage can be invalid.
But there is no pointer here and whatever garbage you write to the memory allocated to Packet, the packet is still valid.

Trying this out on godbolt, I get a ud2 only if I uncomment the last line, which uses a pointer:
https://godbolt.org/z/bd6EcGM7b

steviez force-pushed the packet_maybe_uninit_interface branch 4 times, most recently from 5db3dcd to 9cdcfc6 Compare November 8, 2024 00:02

steviez added 8 commits November 7, 2024 23:01

Introduce API to initialize MaybeUninit<Packet>

140909f

Rewrite Packet::from_data() in terms of Packet::init_packet()

ad3760f

Update PacketBatch to support .spare_capacity_mut()

792f154

Update entry.rs to use Packet MaybeUninit interface

29814b2

This avoids the potential UB caused by calling .set_len() on the PacketBatch before the items have been properly initialized

Add another helper to init packet from [&u8]

111a86b

Update ShredFetchStage to use Packet MaybeUninit interface

adda120

Use saturating math

bf90368

Run test_packet_buffer_writer through miri

c904a9b

steviez force-pushed the packet_maybe_uninit_interface branch from 9cdcfc6 to c904a9b Compare November 8, 2024 05:01

steviez requested review from alessandrod, behzadnouri and apfitzge November 8, 2024 15:16

apfitzge reviewed Nov 8, 2024


steviez commented Nov 8, 2024


Use ptr::copy_nonoverlapping instead of ptr::copy

0856264

behzadnouri reviewed Nov 11, 2024


steviez commented Nov 11, 2024

