ASLR, W^X, etc. #651

Open
copumpkin opened this issue Aug 15, 2015 · 7 comments

Comments

@copumpkin

Intuitively, it seems like these measures still apply to something like OSv. Grepping the source tree revealed no nontrivial matches for 'ASLR', so I'm wondering if:

  1. I'm wrong and the protections don't make sense in this paradigm
  2. They do make sense, but there are significant difficulties to implementing them properly so it hasn't been done yet
  3. They do make sense, should be easy, but nobody's done it yet
  4. They do make sense, and are already implemented in OSv; I'm just grepping for the wrong thing!

Any pointers?

@nyh
Contributor

nyh commented Aug 15, 2015

I agree with your observations, and I think the answer is 3 - they do make sense, should be easy (or at least, not exceedingly difficult), but nobody's done it yet.

ASLR (address-space layout randomization) does make sense even in OSv: One might argue that OSv's single-application philosophy means that if someone breaks into the application he cannot do any sort of "privilege escalation" and break into other applications, because there are none on this VM. But nevertheless, it is always better, if there is an exploit which can break into an application, that the result of this break will be an application crash (denial of service) rather than execution of the attacker's code. This is why things like ASLR do make sense even in OSv.

OSv is already very much "prepared" for ASLR: most importantly, all executables are position-independent and so can be moved around. But more work is needed for a full ASLR implementation:

  1. We need to randomize the base address of loading PIEs and shared objects. Right now it's quite predictable.
  2. We need to randomize the locations of stacks, of results of mmap, and similar things. These are already "somewhat" random, because they depend on the order in which threads run, etc., but probably not random enough.
  3. We also need to randomize the location of the kernel (which in OSv also includes the C and C++ libraries). See also issue #190 ("Allow running a single unmodified regular (non-PIE) Linux executable"), which discusses moving the kernel to a location determined at run-time. If we can do that, randomizing the location is another small step.
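
As a rough illustration of step 1, here is a minimal sketch of picking a random, aligned load base inside an address window. The helper name `pick_random_base`, the window bounds, and the use of `std::mt19937_64` are all assumptions made for this example; none of it is OSv code, and a real implementation would need a stronger entropy source than a general-purpose PRNG.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper, not part of OSv: pick a random load base in [lo, hi)
// that is a multiple of `align`. `align` must be a power of two, for example
// the 4 KiB page size.
inline std::uintptr_t pick_random_base(std::uintptr_t lo, std::uintptr_t hi,
                                       std::uintptr_t align,
                                       std::mt19937_64& rng)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    // Round lo up to the first aligned address in the window.
    std::uintptr_t first = (lo + align - 1) & ~(align - 1);
    assert(first < hi);
    // Count the aligned addresses that fall inside [first, hi).
    std::uintptr_t slots = (hi - first + align - 1) / align;
    // Choose one slot uniformly. mt19937_64 is NOT cryptographically secure;
    // it only stands in for a proper entropy source in this sketch.
    std::uniform_int_distribution<std::uintptr_t> dist(0, slots - 1);
    return first + dist(rng) * align;
}
```

A loader could call something like this once per PIE or shared object and then map the object at the returned address, after checking that the object's full size still fits in the window.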

See https://en.wikipedia.org/wiki/Address_space_layout_randomization#Linux for some more notes about how Linux did ASLR.

About W^X, I agree it's also useful to have. Our ELF object loading code already supports non-writable text pages, and even pages that are temporarily writable while the object is being loaded and are made unwritable once relocations are done. This is the so-called "relro" feature, which OSv supports, including the "full relro" variant described in http://tk-blog.blogspot.co.il/2009/02/relro-not-so-well-known-memory.html. But I guess we could also offer a stricter "W^X" feature, where every page marked writable is automatically marked non-executable (the NX bit on x86_64), including (for example) stacks. I don't know if all applications can run this way, but it wouldn't hurt to allow it for applications that can. By the way, OSv itself modifies its own code to enable tracepoints, which might break W^X in the kernel. But since this is the only place it does this (beyond the usual relro pages, which are written only at load time), I think we could start with simply not supporting tracepoints with W^X.
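
The strict W^X rule described above can be sketched as a small permission filter. The `perm_*` bit values below are made up for the example and are not OSv's actual `mmu::perm_*` definitions; `violates_wx` and `enforce_wx` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical permission bits for illustration only; OSv's real mmu::perm_*
// constants may be defined differently.
enum perm : unsigned {
    perm_read  = 1u << 0,
    perm_write = 1u << 1,
    perm_exec  = 1u << 2,
};

// True if a mapping with these permissions is both writable and executable.
constexpr bool violates_wx(unsigned p)
{
    return (p & perm_write) && (p & perm_exec);
}

// Strict W^X: a writable page is never executable, so drop the exec bit
// whenever the write bit is set. Other permission sets pass through unchanged.
constexpr unsigned enforce_wx(unsigned p)
{
    return violates_wx(p) ? (p & ~static_cast<unsigned>(perm_exec)) : p;
}
```

Applying such a filter at the point where mappings are created would cover stacks and mmap results alike, at the cost of breaking applications that genuinely need writable and executable memory (JIT compilers, for example).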

@copumpkin
Author

About W^X, I agree it's also useful to have. Our ELF object loading code already supports non-writable text pages, and even pages which are temporarily writable during the object's load and after doing relocations, made unwritable - this is the so-called "relro" feature, which OSv supports and even supports the "full relro" variant described in http://tk-blog.blogspot.co.il/2009/02/relro-not-so-well-known-memory.html. But I guess we could also offer a more strict "W^X" feature - where every page marked writable is automatically marked non-executable (the NX bit on x86_64), including (for example) stacks.

Is protecting the ELF loading process alone sufficient? I don't know the OSv code nearly well enough, but is the kernel itself loaded via the ELF loader? If so, that's probably fine, as long as newly allocated memory also enforces the separation. If the kernel is loaded in a special manner, I'd be concerned about the safely W^X-loaded program file somehow modifying (or convincing the kernel to modify) kernel memory for nefarious purposes during an exploit.

Re: the tracepoints, making the options mutually exclusive seems fine.

@nyh
Contributor

nyh commented Sep 2, 2015

One small note about the state of non-executable stacks in OSv:

OSv allocates stacks in two ways:

  1. Threads created by pthread_create - including all the application's threads - have their stacks allocated in pthread::allocate_stack(), using mmap (actually, our internal mmu::map_anon() is called directly), and the permission mmu::perm_rw - in other words, these stacks are not executable, so this is good news.
  2. OSv's internal threads (not created with the pthread API) default to having a small stack (65K by default - see sched::thread::init_stack()), which is allocated with malloc(). Our malloc() returns memory from the linear map, which (see linear_page_mapper) is configured with mmu::perm_rwx, i.e., is executable by default. We cannot "mprotect()" this stack because the linear map usually uses 2 MB pages ("huge pages") for efficiency, and the 65 KB stack is just part of one such page.
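
The huge-page limitation in the second case comes down to alignment arithmetic, which the sketch below spells out. The function name `can_protect_alone`, the example address, and the 64 KiB stack size are assumptions for illustration; protections are set per page, so a region can only get its own permissions if it covers whole pages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t KiB = 1024;
constexpr std::size_t MiB = 1024 * KiB;

// Illustrative only: page permissions apply to whole pages, so a region can
// be protected independently of its neighbours only if it starts and ends on
// page boundaries for the page size used by the mapping.
constexpr bool can_protect_alone(std::uintptr_t addr, std::size_t size,
                                 std::size_t page_size)
{
    return size != 0 && addr % page_size == 0 && size % page_size == 0;
}
```

With 2 MiB pages, a small malloc()ed stack fails this check, so removing the exec bit would also change every other allocation sharing that huge page. With 4 KiB pages the same stack could pass, provided it is page-aligned.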

@n03l

n03l commented Nov 12, 2015

Out of curiosity, how many internal threads are there?

@nyh
Contributor

nyh commented Nov 15, 2015

On Fri, Nov 13, 2015 at 1:02 AM, n03l notifications@github.com wrote:

Out of curiosity, how many internal threads are there?

OSv uses threads liberally in the kernel, because they have very little
overhead besides the memory use of the stack. In particular, sleeping
threads do not slow down scheduling at all.

To list the running threads, you can use gdb's "osv info threads" or
scripts/top.py.

As an example I ran "make image=rogue; scripts/run.py -c1" and got 116
threads, only one of which is the application thread. Many of these threads,
perhaps too many (see issue #247), belong to ZFS. But the vast majority
are idle, and their only overhead is the memory their stacks take.

@nyh
Contributor

nyh commented Apr 13, 2023

Interesting discussion on ASLR and W^X support (and non-support) on OSv and other unikernels: https://x41-dsec.de/news/missing-or-weak-mitigations-in-various-unikernels/. The table suggests that OSv already supports W^X but it is contradicted by the text which says that their "test scripts" verified that it isn't implemented on OSv.

@wkozaczuk
Collaborator

wkozaczuk commented Apr 13, 2023

I did come across this article some time ago as well. There is actually a bit of work to accomplish your 3 original parts. Especially the kernel randomization (KASLR) would not be that easy given some of that would require tweaking assembly for both x64 and aarch64, probably writing some code to update some bits of kernel ELF to update addresses, etc.

So looking at your original 3 steps, I think the 1st one is the easiest one:
Randomize the base address of loading PIEs and shared objects.

The 2nd one is nice to have, but given that it is somewhat random already, maybe it is of lower priority:
Randomize the locations of stacks, results of mmap, and similar things. (what are other things?)

The 3rd is the most complicated one:
Randomize the location of the kernel.

It would also be nice to implement W^X for the kernel ELF. Right now I believe it runs with all permissions on, right? You mention tracepoints that modify some portions of kernel text. But I think there is also some memcpy-related code (look at arch/x64/string.cc) that picks the best implementation depending on cpuid, which is similar. I wonder if we can change the kernel code memory protections to X during boot, after we enable all tracepoints but before we load the apps.
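
The "patch first, then seal" ordering suggested above can be modelled in a few lines. This is a toy model, not kernel code: the class `text_region` and its methods are hypothetical, and a real implementation would change page-table permissions in `seal()` rather than flip a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a kernel text region. It may be patched (tracepoints,
// cpuid-selected routines) only until it is sealed; afterwards writes are
// refused, mimicking text pages that have lost their write permission.
class text_region {
public:
    // Fill the region with 0x90 (the x86 NOP opcode) as placeholder content.
    explicit text_region(std::size_t size) : _bytes(size, 0x90) {}

    // Returns false and changes nothing if the region is sealed or the
    // offset is out of range.
    bool patch(std::size_t offset, std::uint8_t value)
    {
        if (_sealed || offset >= _bytes.size()) {
            return false;
        }
        _bytes[offset] = value;
        return true;
    }

    // In a real kernel this is where the write bit would be dropped.
    void seal() { _sealed = true; }

    bool sealed() const { return _sealed; }
    std::uint8_t at(std::size_t offset) const { return _bytes.at(offset); }

private:
    std::vector<std::uint8_t> _bytes;
    bool _sealed = false;
};
```

The design consequence is the one already noted in the thread: anything that patches kernel text after boot, such as enabling a tracepoint at run time, would have to be disabled or would need a controlled way to unseal temporarily.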

I wonder if we should create 4 new (or more) finer granularity issues and kill this one.

4 participants