Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add collapser for built in Visual Studio profiler #218

Merged
merged 6 commits into from
Apr 24, 2022
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,11 @@ name = "inferno-collapse-vtune"
path = "src/bin/collapse-vtune.rs"
required-features = ["cli"]

[[bin]]
name = "inferno-collapse-vsprof"
path = "src/bin/collapse-vsprof.rs"
required-features = ["cli"]

[[bin]]
name = "inferno-collapse-guess"
path = "src/bin/collapse-guess.rs"
Expand Down
52 changes: 52 additions & 0 deletions src/bin/collapse-vsprof.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
use std::io;
use std::path::PathBuf;

use clap::Parser;
use env_logger::Env;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: we should really move over to tracing.

use inferno::collapse::vsprof::Folder;
use inferno::collapse::Collapse;

#[derive(Debug, Parser)]
#[clap(
name = "inferno-collapse-vsprof",
about,
after_help = "\
[1] This processes the call tree summary of the built in Visual Studio profiler"
)]
struct Opt {
// ************* //
// *** FLAGS *** //
// ************* //
/// Silence all log output
#[clap(short = 'q', long = "quiet")]
quiet: bool,

/// Verbose logging mode (-v, -vv, -vvv)
#[clap(short = 'v', long = "verbose", parse(from_occurrences))]
verbose: usize,

// ************ //
// *** ARGS *** //
// ************ //
#[clap(value_name = "PATH")]
/// Call tree summary file from the built in Visual Studio profiler, or STDIN if not specified
infile: Option<PathBuf>,
}

fn main() -> io::Result<()> {
let opt = Opt::parse();

// Initialize logger
if !opt.quiet {
env_logger::Builder::from_env(Env::default().default_filter_or(match opt.verbose {
0 => "warn",
1 => "info",
2 => "debug",
_ => "trace",
}))
.format_timestamp(None)
.init();
}

Folder::default().collapse_file_to_stdout(opt.infile)
}
6 changes: 4 additions & 2 deletions src/collapse/guess.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ use std::io::{self, Cursor};

use log::{error, info};

use crate::collapse::{self, dtrace, perf, sample, vtune, Collapse};
use crate::collapse::{self, dtrace, perf, sample, vsprof, vtune, Collapse};

const LINES_PER_ITERATION: usize = 10;

Expand Down Expand Up @@ -69,10 +69,11 @@ impl Collapse for Folder {
};
let mut sample = sample::Folder::default();
let mut vtune = vtune::Folder::default();
let mut vsprof = vsprof::Folder::default();

// Each Collapse impl gets its own flag in this array.
// It gets set to true when the impl has been ruled out.
let mut not_applicable = [false; 4];
let mut not_applicable = [false; 5];

let mut buffer = String::new();
loop {
Expand Down Expand Up @@ -106,6 +107,7 @@ impl Collapse for Folder {
try_collapse_impl!(dtrace, 1);
try_collapse_impl!(sample, 2);
try_collapse_impl!(vtune, 3);
try_collapse_impl!(vsprof, 4);

if eof {
break;
Expand Down
7 changes: 7 additions & 0 deletions src/collapse/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,13 @@ pub mod sample;
/// [crate-level documentation]: ../../index.html
pub mod vtune;

/// Stack collapsing for the output of the [Visual Studio built-in profiler](https://docs.microsoft.com/en-us/visualstudio/profiling/profiling-feature-tour?view=vs-2019).
///
/// See the [crate-level documentation] for details.
///
/// [crate-level documentation]: ../../index.html
pub mod vsprof;

// DEFAULT_NTHREADS is public because we use it in the help text of the binaries,
// but it doesn't need to be exposed to library users, hence #[doc(hidden)].
#[doc(hidden)]
Expand Down
288 changes: 288 additions & 0 deletions src/collapse/vsprof.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,288 @@
use std::{cmp::Ordering, io};

use log::warn;

use crate::collapse::{common::Occurrences, Collapse};

static START_LINE: &str = "Level,Function Name,Number of Calls,Elapsed Inclusive Time %,Elapsed Exclusive Time %,Avg Elapsed Inclusive Time,Avg Elapsed Exclusive Time,Module Name,";
jonhoo marked this conversation as resolved.
Show resolved Hide resolved

/// A stack collapser for the output of the Visual Studio built in profiler.
#[derive(Default)]
pub struct Folder {
/// Function entries on the stack in this entry thus far.
stack: Vec<(String, usize)>,
}

impl Collapse for Folder {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want a slight challenge (definitely not required for the PR to land!), try implementing the CollapsePrivate trait instead:

/// Private trait for internal library authors.
///
/// If you implement this trait, your type will implement the public-facing
/// `Collapse` trait as well. Implementing this trait gives you parallelism
/// for free as long as you adhere to the requirements described in the
/// comments below.
pub trait CollapsePrivate: Send + Sized {

That will give you multi-core file processing, which can speed up collapsing quite a lot.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This point still stands :)

fn collapse<R, W>(&mut self, mut reader: R, writer: W) -> io::Result<()>
where
R: std::io::BufRead,
W: std::io::Write,
{
// Skip the header
KnapSac marked this conversation as resolved.
Show resolved Hide resolved
let mut line = Vec::new();
if reader.read_until(b'\n', &mut line)? == 0 {
warn!("File ended before start of call graph");
return Ok(());
};

let header = String::from_utf8_lossy(&line).to_string();
if !line_matches_start_line(&header) {
return invalid_data_error!(
"Expected first line to be header line\n {}\nbut instead got\n {}",
START_LINE,
header
);
}

// Process the data
let mut occurences = Occurrences::new(1);
loop {
line.clear();
if reader.read_until(b'\n', &mut line)? == 0 {
break;
}
let l = String::from_utf8_lossy(&line);
let line = l.trim_end();
if line.is_empty() {
continue;
} else {
self.on_line(line, &mut occurences)?;
}
}

self.write_stack(&mut occurences);

// Write the results
occurences.write_and_clear(writer)?;

// Reset the state
self.stack.clear();
Ok(())
}

fn is_applicable(&mut self, input: &str) -> Option<bool> {
let line = input
.lines()
.next()
.expect("there is always at least one line (even if empty)");

Some(line_matches_start_line(line))
}
}

impl Folder {
// Parse lines with values in the order as specified by `START_LINE`, comma delimited.
// Level,Function Name,Number of Calls,...
// 6,"System.String.IsNullOrEmpty(string)",4,0.00,0.00,0.00,0.00,"mscorlib.dll",
jonhoo marked this conversation as resolved.
Show resolved Hide resolved
fn on_line(&mut self, line: &str, occurences: &mut Occurrences) -> io::Result<()> {
let (depth, remainder) = get_next_number(line)?;

// Function names are always wrapped in quotes. By trimming the leading double quote, we
// know that the next double quote is the double quote closing the function name. Splitting
jonhoo marked this conversation as resolved.
Show resolved Hide resolved
// on this double quote, we get the function name and the remainder of the line.
let split = if let Some(remainder) = remainder.strip_prefix('"') {
remainder.split_once('"')
} else {
return invalid_data_error!("Unable to parse function name from line:\n{}", line);
};

if let Some((function_name, remainder)) = split {
let (number_of_calls, _) = get_next_number(remainder)?;

let prev_depth = self.stack.len();
// There are 3 separate cases to handle regarding the depth:
// 1. prev_depth + 1 == depth -> a new function is called, we only need to
// store the function name and the number of times it is called from the
// outer function
// 2. prev_depth == depth -> the previous function call was a leaf node, so we
// need to save the current stack and replace the top node with our node
// call
// 3. prev_depth > depth -> the previous function call was a leaf node, so we
// need to save the current stack and than we need to pop the top nodes
// until the top node is our parent (i.e. the function which called us)
match prev_depth.cmp(&depth) {
// Case 1
jonhoo marked this conversation as resolved.
Show resolved Hide resolved
Ordering::Less => {
assert_eq!(prev_depth + 1, depth);
self.stack
.push((function_name.to_string(), number_of_calls));
}
// Case 2
Ordering::Equal => {
self.write_stack(occurences);
self.stack.pop();
self.stack
.push((function_name.to_string(), number_of_calls));
}
// Case 3
Ordering::Greater => {
// The Visual Studio profiler outputs the number of times a function is called.
//
// Let's say we have a function `A()` which is called 500 times, and which
// calls a function `B()` 300 times. If we didn't do anything special here,
// this would result in `A()` being assigned 800 samples, giving the impression
// that `A()` only calls `B()` less than 50% of the time, while in fact it is
// called more than 50% of the time.
//
// To prevent this from happening, we instead subtract the number of calls from
// the previous node (in this case `B()`) from the current node (in this case
// `A()`. This leaves `A()` with the correct number of 500 samples.
//
// The top node is always written, but while walking down the stack, if the
// previous number of calls is equal to the current number of calls, we don't
// want to write the current top node, because that would duplicate the number
// of samples for the current node.
let mut prev_number_of_calls = 0;
for _ in 0..(prev_depth - depth + 1) {
if prev_number_of_calls != self.stack.last().unwrap().1 {
self.write_stack(occurences);
}
prev_number_of_calls = self.stack.pop().unwrap().1;

if self.stack.is_empty() {
break;
}

let last = self.stack.len() - 1;
let number_of_calls = &self.stack[last].1;
if prev_number_of_calls < *number_of_calls {
self.stack[last].1 -= prev_number_of_calls;
}
}

self.stack
.push((function_name.to_string(), number_of_calls));
}
}
} else {
return invalid_data_error!("Unable to parse function name from line:\n{}", line);
}

Ok(())
}

// Store the current stack in `occurences`
fn write_stack(&self, occurrences: &mut Occurrences) {
if let Some(nsamples) = self.stack.last().map(|(_, n)| *n).filter(|n| *n > 0) {
let functions: Vec<_> = self.stack.iter().map(|(f, _)| &f[..]).collect();
occurrences.insert(functions.join(";"), nsamples);
}
}
}

/// Gets the number from the start of the line. This can either be a number <1000, in which case the
/// line doesn't contain double quotes, or the number can be >1000, in which case the line does
/// contain double quotes. In both cases `line` may start with a leading comma, which will be
/// ignored.
jonhoo marked this conversation as resolved.
Show resolved Hide resolved
///
/// ### Example inputs
/// - Number <1000: `471,91.25,18.39,401.92,81.02,"Raytracer.exe",`
/// - Number >1000: `"2,893,824",54.37,4.21,0.04,0.00,"Raytracer.exe",`
fn get_next_number(line: &str) -> io::Result<(usize, &str)> {
// Trim the leading comma, if any
let line = line.strip_prefix(',').unwrap_or(line);

let mut remove_leading_comma = false;
let field = if let Some(line) = line.strip_prefix('"') {
remove_leading_comma = true;
line.split_once('"')
} else {
line.split_once(',')
jonhoo marked this conversation as resolved.
Show resolved Hide resolved
};

if let Some((num, remainder)) = field {
// Parse the number
let thousands = num.split(|c: char| !c.is_ascii_digit());
jonhoo marked this conversation as resolved.
Show resolved Hide resolved
let mut n = 0;
for thousand in thousands {
if n == 0 {
if thousand.len() > 3 {
return invalid_data_error!(
"Expected thousands separators for number bigger than 1000, found '{}'",
num
);
}
} else if thousand.len() != 3 {
return invalid_data_error!(
"Floating point numbers are not valid here, expected an integer, found '{}'",
num
);
}

n *= 1000;
if let Ok(num) = thousand.parse::<usize>() {
n += num;
} else {
return invalid_data_error!(
"Unable to parse number '{}', expected an integer",
num
);
}
}

if remove_leading_comma {
// `remainder` still has a leading comma, because the number is >1000. We need to
// remove it so we are consistent regardless of whether the number was wrapped in
// double quotes or not.
if let Some(remainder) = remainder.strip_prefix(',') {
return Ok((n, remainder));
}
jonhoo marked this conversation as resolved.
Show resolved Hide resolved
}

return Ok((n, remainder));
}

invalid_data_error!("Invalid number in line:\n{}", line)
}

/// Some files may start with the <U+FEFF> character (zero width no-break space). This
/// causes the call to `starts_with` to return false, which in this case isn't what we want.
/// As this character has no influence on the rest of the file, we can safely ignore it.
fn line_matches_start_line(line: &str) -> bool {
line.trim()
.trim_start_matches('\u{feff}')
.starts_with(START_LINE)
}

#[cfg(test)]
mod tests {
use super::get_next_number;

#[test]
fn get_next_number_default() {
let result = get_next_number(r#"471,91.25,18.39,401.92,81.02,"Raytracer.exe","#);
assert!(result.is_ok());

let (result, _) = result.unwrap();
assert_eq!(result, 471);
}

#[test]
fn get_next_number_with_leading_comma() {
let result = get_next_number(r#",471,91.25,18.39,401.92,81.02,"Raytracer.exe","#);
assert!(result.is_ok());

let (result, _) = result.unwrap();
assert_eq!(result, 471);
}

#[test]
fn get_next_number_with_thousands_sep() {
let result = get_next_number(r#""2,893,824",54.37,4.21,0.04,0.00,"Raytracer.exe","#);
assert!(result.is_ok());

let (result, _) = result.unwrap();
assert_eq!(result, 2_893_824);
}

#[test]
fn get_next_number_missing_thousands_sep() {
assert!(get_next_number(r#""2893824",54.37,4.21,0.04,0.00,"Raytracer.exe","#).is_err());
}

#[test]
fn get_next_number_with_float() {
assert!(get_next_number(r#""2,893.82",54.37,4.21,0.04,0.00,"Raytracer.exe","#).is_err());
}
}
Loading