
Compress our elf image #98

Closed
glommer opened this issue Nov 21, 2013 · 6 comments

@glommer
Contributor

glommer commented Nov 21, 2013

One of the ways to boot faster is to read a smaller image. We have open issues that aim at reducing the size of the image itself, but compression helps even after all those optimizations are applied.

BZIP2 is BSD-licensed, so we could just include the source in our tree. It reduces the image size from 9.5 MB to 4.6 MB on my system. Maybe we can fit into 1 MB someday...

We can compress the code in our build process and decompress it as the bootloader's last step. I am suggesting bz2, but other formats could be used as well.
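As a rough illustration of the compression step, here is a sketch using Python's standard-library bz2 module to do the memory-to-memory compression a build step would perform. The data below is a synthetic stand-in for the ELF image, not OSv's actual build code:

```python
import bz2

# Simulate compressing a kernel image at build time: bzip2 works
# purely memory-to-memory here, no file I/O required.
image = bytes(range(256)) * 4096          # stand-in for the ELF image (1 MiB)
compressed = bz2.compress(image, compresslevel=9)

ratio = len(compressed) / len(image)
print(f"original: {len(image)} bytes, compressed: {len(compressed)} bytes "
      f"(ratio {ratio:.2f})")

# Round-trip check: decompression restores the image exactly.
assert bz2.decompress(compressed) == image
```

The real ratio depends on the image contents; code and symbol tables typically compress well, which is consistent with the roughly 2:1 reduction reported above.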

@avikivity
Member

On 11/21/2013 01:22 PM, Glauber Costa wrote:

One of the ways to boot faster is to read a smaller image. We have open issues that aim at reducing the size of the image itself, but compression helps even after all those optimizations are applied.

BZIP2 is BSD-licensed, so we could just include the source in our tree. It reduces the image size from 9.5 MB to 4.6 MB on my system. Maybe we can fit into 1 MB someday...

We can compress the code in our build process and decompress it as the bootloader's last step. I am suggesting bz2, but other formats could be used as well.

We'll need a static library for that, and one with a minimal dependency
on libc, since the decompressor won't have any support code around it.

@glommer
Contributor Author

glommer commented Nov 21, 2013

Absolutely. That is why I considered using bzip2: since it is BSD-licensed, we can pull the source in as a submodule and build the code the way we want as part of our build process. We make it as tiny as we can and load it together with our binary.

@nyh
Contributor

nyh commented Nov 24, 2013

On Thu, Nov 21, 2013 at 1:55 PM, Glauber Costa notifications@github.com wrote:

Absolutely. That is why I considered using bzip2: since it is BSD-licensed, we can pull the source in as a submodule and build the code the way we want as part of our build process. We make it as tiny as we can and load it together with our binary.

You want to decompress our binary, so the decompressor cannot be part of our binary, nor can it use any of the normal standard-library facilities we offer (even malloc(), read(), etc.). So I think you'll need to make quite a few modifications to the bzip2 decompressor to use it. But I didn't look at it to see how difficult this would be.
As another option to consider, I'm guessing that FreeBSD and others also have some kernel compression support, so we can look at what they are doing.

I definitely agree that the general direction - of compressing the kernel -
is a good one.

penberg pushed a commit that referenced this issue Jan 13, 2014
See

  scripts/trace.py prof-wait -h

The command uses the sched_wait and sched_wait_ret tracepoints to calculate the amount of time a thread was waiting. Samples are collected and presented in the form of a call-graph tree.

By default, callees are closer to the root. To invert the order, pass -r|--caller-oriented.

If there is too much output, it can be narrowed down using the --max-levels and --min-duration options.

The presented time range can be narrowed down using the --since and --until options, which accept timestamps.
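The wait-time accounting described above can be sketched as pairing sched_wait/sched_wait_ret events per thread and attributing each interval to the blocking call path. The event tuples below are a hypothetical simplified format, not the real OSv trace layout:

```python
from collections import defaultdict

# Hypothetical simplified trace: (timestamp_ns, thread_id, event, backtrace)
events = [
    (100, 1, "sched_wait",     ("condvar_wait", "cv_timedwait")),
    (350, 1, "sched_wait_ret", ()),
    (400, 1, "sched_wait",     ("condvar_wait", "txg_wait_open")),
    (900, 1, "sched_wait_ret", ()),
]

open_waits = {}            # thread_id -> (start_ts, backtrace of the wait)
totals = defaultdict(int)  # backtrace -> accumulated wait time in ns

for ts, tid, name, bt in events:
    if name == "sched_wait":
        open_waits[tid] = (ts, bt)     # wait starts; remember where we blocked
    elif name == "sched_wait_ret" and tid in open_waits:
        start, bt = open_waits.pop(tid)
        totals[bt] += ts - start       # charge the duration to the call path

for bt, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{total} ns  {' <- '.join(bt)}")
```

Building the tree is then a matter of merging these per-path totals by common backtrace prefix, root-first by default or leaf-first with -r|--caller-oriented.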

Example:

  scripts/trace.py prof-wait --max-levels 3 trace-file

=== Thread 0xffffc0003eaeb010 ===

12.43 s (100.00%, #7696) All
 |-- 12.43 s (99.99%, #7658) sched::thread::do_wait_until
 |    |-- 10.47 s (84.22%, #6417) condvar::wait(lockfree::mutex*, unsigned long)
 |    |    condvar_wait
 |    |    |-- 6.47 s (52.08%, #6250) cv_timedwait
 |    |    |    txg_delay
 |    |    |    dsl_pool_tempreserve_space
 |    |    |    dsl_dir_tempreserve_space
 |    |    |    dmu_tx_try_assign
 |    |    |    dmu_tx_assign
 |    |    |
 |    |    |-- 2.37 s (19.06%, #24) arc_read_nolock
 |    |    |    arc_read
 |    |    |    dsl_read
 |    |    |    traverse_visitbp
 |    |    |
 |    |    |-- 911.75 ms (7.33%, #3) txg_wait_open
 |    |    |    dmu_tx_wait
 |    |    |    zfs_write
 |    |    |    vfs_file::write(uio*, int)
 |    |    |    sys_write
 |    |    |    pwritev
 |    |    |    writev
 |    |    |    __stdio_write
 |    |    |    __fwritex
 |    |    |    fwrite
 |    |    |    0x100000005a5f
 |    |    |    osv::run(std::string, int, char**, int*)

By default every thread has a separate tree, because duration is best interpreted in the context of a particular thread. There is, however, an option to merge samples from all threads into one tree: -m|--merge-threads. It may be useful if you want to inspect all paths going into or out of a particular function. The direction can be changed with the -r|--caller-oriented option. The function name is passed via the --function parameter.

Example: check where zfs_write() blocks:

  scripts/trace.py prof-wait -rm --function=zfs_write trace-file

7.46 s (100.00%, #7314) All
 zfs_write
 |-- 6.48 s (86.85%, #6371) dmu_tx_assign
 |    |-- 6.47 s (86.75%, #6273) dmu_tx_try_assign
 |    |    dsl_dir_tempreserve_space
 |    |    |-- 6.47 s (86.75%, #6248) dsl_pool_tempreserve_space
 |    |    |    txg_delay
 |    |    |    cv_timedwait
 |    |    |    condvar_wait
 |    |    |    condvar::wait(lockfree::mutex*, unsigned long)
 |    |    |    sched::thread::do_wait_until
 |    |    |
 |    |    |-- 87.87 us (0.00%, #24) mutex_lock
 |    |    |    sched::thread::do_wait_until
 |    |    |
 |    |    \-- 6.40 us (0.00%, #1) dsl_dir_tempreserve_impl
 |    |         mutex_lock
 |    |         sched::thread::do_wait_until
 |    |
 |    \-- 7.32 ms (0.10%, #98) mutex_lock
 |         sched::thread::do_wait_until
 |
 |-- 911.75 ms (12.22%, #3) dmu_tx_wait
 |    txg_wait_open
 |    condvar_wait
 |    condvar::wait(lockfree::mutex*, unsigned long)
 |    sched::thread::do_wait_until

Signed-off-by: Tomasz Grabiec <tgrabiec@cloudius-systems.com>
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
@efpiva
Contributor

efpiva commented Jan 22, 2014

Just for the record, I'm working on this issue...

I already have a bzloader that has the core of the bunzip library without any syscalls, uncompressing loader-stripped.bz2, which is available as a char[].

The size was reduced by half.

I'll start putting everything together tonight; I may have something to post next week.
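Generating the char[] that holds the compressed image can be sketched like this. This is a hypothetical Python helper; the actual build may use objcopy, xxd, or similar instead:

```python
import bz2

def to_c_array(data: bytes, name: str) -> str:
    """Emit a C char[] definition holding the raw bytes of `data`."""
    body = ",".join(str(b) for b in data)
    return (f"const unsigned char {name}[] = {{{body}}};\n"
            f"const unsigned long {name}_len = {len(data)};\n")

# Stand-in for loader-stripped; the real input would be the stripped ELF.
compressed = bz2.compress(b"\x7fELF" + b"\x00" * 64)
print(to_c_array(compressed, "loader_stripped_bz2")[:80], "...")
```

The bzloader can then link against the generated array and hand it straight to the in-memory decompressor, with no filesystem involved.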

@nyh
Contributor

nyh commented Jan 26, 2014

On Wed, Jan 22, 2014 at 7:33 PM, efpiva notifications@github.com wrote:

Just for the record, I'm working on this issue...

I already have a bzloader that has the core of the bunzip library without any syscalls, uncompressing loader-stripped.bz2, which is available as a char[].

The size was reduced by half.

I'll start putting everything together tonight; I may have something to post next week.

Looking forward to seeing this, I'm a fan of the block sorting algorithm...

What is the "bunzip library"? Is it this one: http://www.bzip.org/1.0.5/bzip2-manual-1.0.5.html? According to what I read about that library, its "low-level" API doesn't do any I/O, just memory-to-memory decompression, so supposedly it can work without any system calls, right out of the box.

Nadav Har'El
nyh@cloudius-systems.com
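The memory-to-memory usage nyh describes can be illustrated with Python's incremental bz2 decompressor; libbzip2's low-level C API (a bz_stream driven by BZ2_bzDecompressInit/BZ2_bzDecompress) works analogously on caller-supplied buffers:

```python
import bz2

payload = b"kernel image bytes " * 100
blob = bz2.compress(payload)

# Incremental, buffer-to-buffer decompression: feed input chunks in,
# collect output chunks; no file descriptors or read()/write() involved.
dec = bz2.BZ2Decompressor()
out = bytearray()
for i in range(0, len(blob), 64):
    out += dec.decompress(blob[i:i + 64])

assert bytes(out) == payload
print("round-trip ok:", len(blob), "->", len(out), "bytes")
```

In a freestanding bootloader the remaining work is supplying the small amount of support code the C library expects (an allocator or static buffers), since no libc is available at that stage.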

@nyh
Contributor

nyh commented May 7, 2014

Eduardo Piva already added a kernel-compression feature a few months ago - see commit 597db05

So @glommer or someone with permissions - please close this bug.

@glommer glommer closed this as completed May 7, 2014