Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

makereport.sh fails when having more than one result file within subdirectory #225

Open
marcemq opened this issue Oct 17, 2019 · 2 comments
Labels

Comments

@marcemq
Copy link
Contributor

marcemq commented Oct 17, 2019

What
makereport.sh fails when having more than one result file within subdirectory, the system details table gets generated and only the final PDF report which reports some errors, see the generated attached (metrics_report.pdf)

Steps to reproduce

  1. Have at list two result json file within directory, i.e.:
$ tree results/
results/
├── Env.R
└── scaling
    ├── k8s-scaling-many-nodes.json
    └── k8s-scaling.json

1 directory, 3 files
  1. Launch report execution:
$ ./report/makereport.sh
...
[1] "html.md"
Executing: pandoc -t html -o 'html.html' 'html.md'
[1] "html.html"
PNGs of graphs and tables can be found in the output directory.
The report, named metrics_report.pdf, can be found in the output directory
-rw-r--r-- 1 root root   7886 Oct 17 14:44 /<user>/cloud-native-setup/metrics/report/output/dut-1.png
-rw-r--r-- 1 root root 163725 Oct 17 14:44 /<user>/cloud-native-setup/metrics/report/output/metrics_report.pdf
@marcemq
Copy link
Contributor Author

marcemq commented Oct 18, 2019

Debugging the code the error occurs in tidy_scaling.R, a file that gets called from makereport.sh and it turns out that it fails even having one output file because there is a mismatch in the naming of the result file and the test name inside it at this point, i.e.:

$ tree results
results
├── Env.R
└── scaling
    └── k8s-scaling-many.json

when calling makereport.sh with debug mode:

> source('/scripts/tidy_scaling.R')
> source('/scripts/tidy_scaling.R')
[1] "Output json file:"
[1] "/inputdir/scaling/k8s-scaling-many.json"
[1] "Content of fdata:"
NULL
Error in do.call("rbind", br$launched_pods) :
  second argument must be a list

From the print log, tidy_scaling.R can process any json output file, but the test name into the results is k8s-scaling, so the questions to be answered to fix this issue:

  • should we enable the functionality to ever-ride TEST_NAME (k8s scaling) when calling k8s_scale.sh? Currently it's hard-coded. By enabling this feature, then the name of the result file will be used for the test name used.
  • even if we add the TEST_NAME functionality, what would be the expected behavior when having more than one output file in subdirectory? should the results be kind of merged in the images generated? Or should we force to have only one file per subdirectory?

@dklyle dklyle added the scaling label Oct 18, 2019
marcemq pushed a commit to marcemq/cloud-native-setup that referenced this issue Oct 18, 2019
Generate a final report for k8s-scaling.json result in diferent
subdirectories which follows the current hard-coded test name
k8s-scaling.

Signed-off-by: Morales Quispe, Marcela <marcela.morales.quispe@intel.com>
marcemq pushed a commit to marcemq/cloud-native-setup that referenced this issue Oct 18, 2019
Generate a final report for k8s-scaling.json result in diferent
subdirectories which follows the current hard-coded test name
k8s-scaling.

Signed-off-by: Morales Quispe, Marcela <marcela.morales.quispe@intel.com>
@grahamwhaley
Copy link

Hi @marcemq . First, thanks for reporting the pdf generation failure :-)
I think the failure comes from two things:

  1. I think you have copied a .json scaling file to a different name (k8s-scaling-many.json) into the same subdir as the original results file - that is not how the data is meant to be 'structured'. Each set of results should go in a seprate subdir - so, you would have the subdirs named maybe onenode and manynodes, each with a single k8s-scaling.json file within it.
  2. but, there does indeed look to be a bug in the tidy_scaling.R script, which then fell over this directory/file structure... ideally it should have ignored the 'extra' file in the subdir.

Looking at the tidy_scaling.R code, we can see the file match line at the top:

testnames=c(
        "k8s-scaling.*"
)

and then further down the file regexp matching code of:

                matchfile=paste(testname, '\\.json', sep="")
                files=list.files(matchdir, pattern=matchfile)

That produces the matchfile pattern of k8s-scaling.*\\.json, which I think is incorrect. That first .* is a greedy match, and thus matches the file k8s-scaling-many.json, when that was not the intention. It should only match files called k8s-scaling.json. I suspect this was a copy/inherit issue from another R file that did do many-file matching....

Sooo, I think the simple fix is to:

diff --git a/metrics/report/report_dockerfile/tidy_scaling.R b/metrics/report/report_dockerfile/tidy_scaling.R
index 3e56d99..27842a9 100755
--- a/metrics/report/report_dockerfile/tidy_scaling.R
+++ b/metrics/report/report_dockerfile/tidy_scaling.R
@@ -14,7 +14,7 @@ suppressMessages(library(scales))                     # For de-science notation of axis
 library(tibble)                                                # tibbles for tidy data

 testnames=c(
-       "k8s-scaling.*"
+       "k8s-scaling"
 )

which results in a matchfile pattern of k8s-scaling\\.json, which then does the correct thing.

Let me know what you think - I think the above fix is correct, but bolstering the robustness of the code is never a bad thing, so please assess if we still want to add your other patches. Feel free to add the above to your commits - and again, thanks for spotting this - there was a real bug there ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants