Gzip profile files #71
Conversation
Read multiple words at once if we know the exact number of words to read. Speedup on a 60MiB profile: PyPy: 1.24s vs 0.84s (1.48x) CPython: 7.85s vs 3.05s (2.57x)
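The batched read could look roughly like this minimal sketch, assuming the profile stores little-endian 8-byte words; the helper name is illustrative, not vmprof's actual API:

```python
import struct

def read_words(fileobj, count, word_size=8):
    """Read `count` little-endian signed words with a single read()
    and a single struct.unpack(), instead of one read() per word."""
    data = fileobj.read(count * word_size)
    return struct.unpack("<%dq" % count, data)
```

Doing one large `read()` plus one `unpack()` avoids per-word Python call overhead, which is where the CPython speedup comes from.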
Force Python to re-use existing integer objects instead of allocating multiple objects for the same integer value. This can save a ton of memory depending on your profile. In my benchmarks the savings were somewhere in the range of 30-70%.
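The interning trick can be sketched with a tiny cache (hypothetical class, not vmprof's actual code): every equal integer value is mapped back to the first object that was seen, so repeated instruction pointers share one Python object instead of each allocating a new one.

```python
class IntCache:
    """Return the same object for equal integer values, so repeated
    values (e.g. instruction pointers) don't each allocate a new int."""

    def __init__(self):
        self._cache = {}

    def get(self, value):
        # setdefault returns the stored (first-seen) object if the
        # key already exists, otherwise stores and returns `value`.
        return self._cache.setdefault(value, value)
```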
This can reduce profile file size by multiple orders of magnitude. In my benchmarks I found the compressed files to be up to 97% (!) smaller than uncompressed ones. The main reason for this is that traces contain the same instruction pointers again and again; thus they are very well suited for compression. Old (uncompressed) profile files are still supported. In fact, the profile file format did not change, it only got a gzip wrapper. In the reader we simply check if a given profile is gzip-compressed using the gzip magic bytes "0x1f 0x8b"; if it is compressed, we decompress it and then continue parsing the now-uncompressed profile just like we parse an uncompressed profile.
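The magic-byte check described above can be sketched as follows (an illustrative helper, not the actual vmprof reader): peek at the first two bytes and wrap the file in a gzip decompressor only when they match `0x1f 0x8b`.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"

def open_profile(path):
    """Open a profile file, transparently decompressing it if the
    first two bytes are the gzip magic bytes."""
    f = open(path, "rb")
    magic = f.read(2)
    f.seek(0)  # rewind so the parser sees the file from the start
    if magic == GZIP_MAGIC:
        return gzip.GzipFile(fileobj=f)
    return f
```

Because the decision is made per file, old uncompressed profiles keep working with the same reader.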
|
Two concerns: one, Travis failed; two, are we sure gzwrite is signal-safe (e.g. that it can't allocate memory)? If it's not, then we need to introduce an async pipe that will do the gzipping (or even a pre-analysis). |
|
I'll have a look at the signal-safety problem soon; my current guess is that it's not safe. |
|
While I haven't found any official documentation about whether gzwrite allocates memory, I also print-debugged allocations (by modifying the allocation routines zlib uses in the zlib source code), which confirmed that no allocations are done. I will check with the zlib mailing list to be 100% sure. |
|
Thanks! Signal-safety is more involved than just allocations, though, so please ask about that too. |
|
gzip output may interfere with SIGPROF. |
|
Yes, I'm far more keen to pipe it out to some other program (then you can do accumulation and other tricks) |
|
Thing is that for large profiles the compression step can easily take 30s to a few minutes, which is terrible... We can make gzip optional, and you can always lower the sample frequency if it slows down your program too much or you get the impression that the profile isn't accurate enough (or accurate profiles are of high importance to you). If you want, I can also try to collect some data on the sample recording duration variance etc. |
|
I think you misunderstood the idea - the idea is not to first finish vmprof and then run gzip, but instead run vmprof to a pipe that will be gzipping it on the fly. |
|
I see. In terms of performance this is probably a bit worse. In terms of code architecture it could be done very similarly to what is done in this PR; we could simply spawn a new gzip process from within vmprof, get the pipe file descriptor, and write to that (as opposed to writing to the output file directly). This doesn't require many changes in vmprof (only in the CLI module) and it's very convenient to use, as we can integrate it with the CLI etc. Otherwise we could simply use the |
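The "spawn gzip, write to its pipe fd" idea can be sketched like this (a hypothetical helper, assuming a `gzip` binary is on PATH; not the prototype linked below):

```python
import subprocess

def open_gzip_pipe(path):
    """Spawn `gzip -c` writing to `path`, and return the process
    whose stdin pipe can be used as the profile output fd, so
    compression happens on the fly in a separate process."""
    outfile = open(path, "wb")
    proc = subprocess.Popen(["gzip", "-c"],
                            stdin=subprocess.PIPE,
                            stdout=outfile)
    return proc, outfile
```

vmprof would then write raw profile data to `proc.stdin.fileno()` instead of a plain file descriptor; when profiling stops, closing stdin and waiting for the process flushes the compressed output.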
|
I agree (note that the performance can be better because you're suddenly using two cores). |
|
Another pro of piping: the pipe buffer acts as a write buffer. |
|
I made a prototype "gzip as a subprocess" implementation. Please have a look here and give me feedback: https://github.com/jonashaag/vmprof-python/tree/gzip-subproc |
|
Bump |
|
Yes, sorry I'll get back to you
|
Hi, I'm a bit unhappy about packing the gzip stuff into C. I can be convinced it's a good idea, but a pure-Python solution would make it magically work with PyPy too, for example. It doesn't seem too hard; if you have opinions, I'd like to know them. |
|
It's just a matter of performance, although I haven't done any benchmarking with a pure-Python solution. It shouldn't be a problem if you have multiple cores and large enough pipe buffers, so I guess a pure-Python solution would be fine as well! |
|
Let's move discussion to #88 |
Depends on #70
Gzip profile files, reducing file size by up to 97%. Backwards compatible. Increases parse duration by about 5-10%. (Note that my other PR speeds up parsing by roughly 1.5-2.5x.)
I also tried other file size reduction techniques:
- Use smaller integer types (`short` instead of `long`): map 32-bit IPs to 16-bit IPs using an associative array when writing the profile. This works well and isn't too difficult to implement. It cuts profile file size by up to 50%.
- De-duplicate consecutive identical instruction pointers into an `(IP, count)` tuple. Easy to implement; saves 10% in my benchmarks (depending, of course, on the program being profiled).

An idea I did not look into is de-duplicating whole traces, i.e. for each trace, check whether an equivalent trace has already been written and, if so, only write a pointer to that other trace.
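The `(IP, count)` run-length idea mentioned above can be sketched in a few lines (an illustrative function, not part of this PR):

```python
def dedup_runs(ips):
    """Collapse consecutive repeated instruction pointers into
    (ip, count) pairs, e.g. [1, 1, 2] -> [(1, 2), (2, 1)]."""
    out = []
    for ip in ips:
        if out and out[-1][0] == ip:
            # extend the current run
            out[-1] = (ip, out[-1][1] + 1)
        else:
            # start a new run
            out.append((ip, 1))
    return out
```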
Gzipping everything is probably the easiest and most robust solution to reducing file sizes.