[llvm-cov] Add support to 'llvm-cov show' HTML output to strip out the absolute base source directory path #138350

@bartlettroscoe

Description

The current implementation of llvm-cov show -format html [other options] -o <cov-output-dir> produces HTML files and subdirectories that show the absolute path to the source files. For example, for a coverage build of Trilinos, the base index.html file shows the absolute directory path /scratch/rabartl/Trilinos.base/Trilinos/ like:

Image

I realize that this allows reporting coverage from source files all over the directory tree, but this is not ideal in most cases because:

  1. Most coverage reporting will be for defined projects under a single base source directory
  2. We don't want to clutter the HTML output with the full absolute directory path
  3. The full absolute directory path may actually be considered sensitive information in some contexts where the relative directory path would not be
  4. The generated directory tree under <cov-output-dir>/coverage/ has a stack of useless, otherwise-empty directories starting from the root /.

Proposed solution

One potential solution would be to add a new argument to llvm-cov show like --base-project-path <base-project-path> which would strip the base path off of the sources specified by --sources <src1> <src2> .... For example, for Trilinos we could run something like:

llvm-cov show [other args] -o <cov-output-dir> --base-project-path /scratch/rabartl/Trilinos.base/Trilinos \
    --sources /scratch/rabartl/Trilinos.base/Trilinos/packages/teuchos

This would produce HTML files that show the base path as Trilinos/packages/teuchos, and the subdirectory path under <cov-output-dir>/coverage/ would be:

<cov-output-dir>/coverage/packages/teuchos/

and so on under that.

So the output HTML coverage file would look like:

Image

NOTE: The justification for keeping the last subdir of <base-project-path> (which is Trilinos in this example) is that this subdir name is typically the project name, and therefore makes a meaningful name for the report and the page.

NOTE: The llvm-cov show command should assert that <base-project-path> is a base directory for all of the paths passed into --sources <src1> <src2> .... Otherwise, not all source paths could be represented.
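
The mapping and the base-directory check described above can be sketched in shell. This is only an illustration of the proposed behavior, not llvm-cov code: the --base-project-path option does not exist in llvm-cov today, and the function name map_source_path is hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of the path mapping that the proposed
# --base-project-path option would perform (not actual llvm-cov behavior).

# map_source_path <base-project-path> <source-path>
# Prints two lines: the path to display in the HTML report, and the
# subdirectory to use under <cov-output-dir>/coverage/.
# Returns 1 if <source-path> is not under <base-project-path>.
map_source_path() {
  local base=${1%/} src=${2%/}   # drop any trailing slash
  local projName rel
  projName=$(basename "${base}")
  if [[ "${src}" == "${base}" ]]; then
    rel=""
  elif [[ "${src}" == "${base}/"* ]]; then
    rel=${src#"${base}/"}
  else
    echo "error: '${src}' is not under '${base}'" >&2
    return 1
  fi
  echo "${projName}${rel:+/${rel}}"   # displayed path
  echo "${rel}"                       # subdir under coverage/
}

# Prints "Trilinos/packages/teuchos" and then "packages/teuchos"
map_source_path /scratch/rabartl/Trilinos.base/Trilinos \
  /scratch/rabartl/Trilinos.base/Trilinos/packages/teuchos
```

When the source path equals the base path, the displayed path is just the project name and the subdirectory is empty, which corresponds to the whole-project case.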

For all of Trilinos, one would call llvm-cov show where --base-project-path and --sources would point to the same directory path as:

llvm-cov show [other args] -o <cov-output-dir> --base-project-path /scratch/rabartl/Trilinos.base/Trilinos \
    --sources /scratch/rabartl/Trilinos.base/Trilinos

and this would produce the base HTML index.html file with Trilinos/ as the base that looks like:

Image

Workaround

One can work around this by post-processing the generated HTML files and directories to achieve the desired result.

Here is a fairly simple shell script that makes the desired modifications:

post-process-llvm-coverage-html-files.sh
#!/bin/bash
#
# Post-process all of the generated HTML files from 'llvm-cov show' to make
# them more reasonable:
#
#   post-process-llvm-coverage-html-files.sh <abs-proj-src-dir> \
#     <llvm-cov-show-output-dir>
#
# This script performs several tasks:
#
# * Replaces the <abs-proj-src-dir> base directory in the displayed HTML
#   file paths with the last subdir in <abs-proj-src-dir> (which should be
#   close to the name of the project?).
#
# * Removes the silly stack of absolute path subdirs
#   <llvm-cov-show-output-dir>/coverage/<abs-proj-src-dir> in the generated
#   HTML directory and file tree and adjusts the HTML links for this move.
#
# This makes the generated HTML agnostic as to the absolute file directory
# path for the source code on a given machine.
#
# NOTE: Requires GNU sed ('sed -i' without a backup suffix).
#
# See the issue https://github.com/llvm/llvm-project/issues/138350
#
set -euo pipefail  # Stop on the first error (e.g., never 'rm' after a failed 'mv')

usage="Usage: $0 <abs-proj-src-dir> <llvm-cov-show-output-dir>"
absProjSrcDir=${1:?${usage}}; shift
llvmCovShowOutputDir=${1:?${usage}}; shift

# Drop a trailing '/' so that the slash count below is not off by one
absProjSrcDir=${absProjSrcDir%/}

# Use the last subdir in /a/b/c/d/.../<projName> for the name of the project
projName=$(basename "${absProjSrcDir}")

# Escape regex metacharacters (e.g., the '.' in 'Trilinos.base') for sed
absProjSrcDirRegex=$(printf '%s\n' "${absProjSrcDir}" | sed 's/[][\.*^$]/\\&/g')

echo "Replacing '${absProjSrcDir}' with '${projName}' in displayed paths in HTML files ..."
find "${llvmCovShowOutputDir}" -name "*.html" \
  -exec sed -i "s|>${absProjSrcDirRegex}|>${projName}|g" {} +
# NOTE: Above replacement must come before below path replacements involving
# ${absProjSrcDir}!

echo "Removing absolute path directories for base '${absProjSrcDir}' ..."
mv "${llvmCovShowOutputDir}/coverage${absProjSrcDir}"/* "${llvmCovShowOutputDir}/coverage/"
firstAbsProjSrcDir=$(echo "${absProjSrcDir}" | cut -d '/' -f 2)  # First subdir after /
# The ':?' guard aborts rather than removing all of coverage/ if this is empty
rm -r "${llvmCovShowOutputDir}/coverage/${firstAbsProjSrcDir:?}/"

echo "Adjusting links in HTML files for move of HTML directories ..."

# Get the relative path back for removal of abs project base subdirs
# (e.g., ../../../), with the dots escaped for use in a sed regex
numDirs=$(echo "${absProjSrcDir}" | tr -cd '/' | wc -c)
relPathRegex=""
for ((i = 0; i < numDirs; i++)); do
    relPathRegex+='\.\./'
done
#echo "relPathRegex= '${relPathRegex}'"

find "${llvmCovShowOutputDir}" -name "*.html" \
  -exec sed -i "s|${absProjSrcDirRegex}/|/|g" {} +
find "${llvmCovShowOutputDir}" -name "*.html" \
  -exec sed -i "s|'${relPathRegex}|'|g" {} +
# NOTE: Above, the ' in the regex is critical so that only the leading
# relative back path (e.g., ../../../) is removed and nothing more!
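
As a worked example of the link adjustment, the snippet below reproduces the script's computation of the relative back path for the Trilinos directory used in this issue. The helper name back_path is hypothetical and exists only for this illustration. The underlying assumption is that each HTML file moves up by one level per '/' in <abs-proj-src-dir>, so each link that climbs back toward the output root needs that many fewer ../ components.

```shell
#!/bin/bash
# back_path <abs-dir>: print one "../" for each '/' in <abs-dir>
# (hypothetical helper mirroring the numDirs loop in the script above).
back_path() {
  local absDir=${1%/}   # ignore a trailing slash
  local numDirs relPath="" i
  numDirs=$(printf '%s' "${absDir}" | tr -cd '/' | wc -c)
  for ((i = 0; i < numDirs; i++)); do
    relPath+="../"
  done
  echo "${relPath}"
}

# Four '/' characters in the path, so this prints ../../../../
back_path /scratch/rabartl/Trilinos.base/Trilinos
```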

NOTE: The screenshots for the proposed solution were produced with this script.
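
One way to confirm that the post-processed output no longer mentions the absolute source path is to search the HTML files for it. The following is a minimal sketch, assuming a grep that supports -r and --include (GNU and BSD grep both do); the function name list_html_with_abs_path is hypothetical and not part of the script above.

```shell
#!/bin/bash
# list_html_with_abs_path <output-dir> <abs-proj-src-dir>
# Prints every *.html file under <output-dir> that still contains
# <abs-proj-src-dir> as a fixed string. Returns 1 if any are found.
list_html_with_abs_path() {
  local outDir=$1 absDir=${2%/}
  local matches
  # -F: match the path literally, so '.' in 'Trilinos.base' is not a wildcard
  matches=$(grep -rlF --include='*.html' -- "${absDir}" "${outDir}" || true)
  if [[ -n "${matches}" ]]; then
    echo "${matches}"
    return 1
  fi
}
```

For the example in this issue, it could be run as list_html_with_abs_path <cov-output-dir> /scratch/rabartl/Trilinos.base/Trilinos after the post-processing script finishes.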
