[Task] adding a version control for excel file and the regression testing for a excel file #2283

Open
wants to merge 7 commits into base: devel
Changes from 5 commits
10 changes: 10 additions & 0 deletions .gitattributes
@@ -2,3 +2,13 @@ include/contrib/* linguist-vendored
src/contrib/* linguist-vendored
crow/contrib/* linguist-vendored
framework/contrib/* linguist-vendored

*.xla diff=exceldiff
*.xlam diff=exceldiff
*.xls diff=exceldiff
*.xlsb diff=exceldiff
*.xlsm diff=exceldiff
*.xlsx diff=exceldiff
*.xlt diff=exceldiff
*.xltm diff=exceldiff
*.xltx diff=exceldiff
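
A `diff=exceldiff` attribute only takes effect once a matching `diff.exceldiff` driver is defined in the git configuration, which this hunk does not show. When git runs an external diff command it passes seven arguments: path, old-file, old-hex, old-mode, new-file, new-hex, new-mode. The sketch below is a hypothetical wrapper, not part of this PR, that picks the file paths out of that argument list. The names `DiffArgs` and `parse_git_diff_args` are assumptions made for illustration.

```python
from typing import NamedTuple, Sequence


class DiffArgs(NamedTuple):
    """The files an external diff command is asked to compare."""
    path: str      # path of the file inside the repository
    old_file: str  # file holding the old version
    new_file: str  # file holding the new version


def parse_git_diff_args(argv: Sequence[str]) -> DiffArgs:
    """Extract file paths from the seven arguments git passes to a diff command.

    Expected order: path old-file old-hex old-mode new-file new-hex new-mode.
    """
    if len(argv) != 7:
        raise ValueError(f"expected 7 arguments from git, got {len(argv)}")
    path, old_file, _old_hex, _old_mode, new_file, _new_hex, _new_mode = argv
    return DiffArgs(path=path, old_file=old_file, new_file=new_file)
```

A wrapper script would typically call `parse_git_diff_args(sys.argv[1:])` and hand `old_file` and `new_file` to the comparison routine.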
13 changes: 13 additions & 0 deletions Diff_Results.txt
@@ -0,0 +1,13 @@
[Inputs]
Collaborator

This file seems to be in the base level of RAVEN, but should it be?

Collaborator

I am okay based on Dylan's description of the diffing script's location, but I still believe this one is probably not in the right location.

Collaborator Author

I removed this file and am in the process of testing the best location for the outputs. I am thinking of executing the diff through an alias and putting the output in the same folder, but I am not sure whether that will work for a custom diff.

Diff_values(row, column, new value, old value, new formula, old formula):(2, 3, Decimal('400000000'), Decimal('100000000'), '400000000', '100000000')
[Simluation_Tool_Example]
Diff_values(row, column, new value, old value, new formula, old formula):(2, 2, 400000000.0, 100000000.0, '=Inputs!C2', '=Inputs!C2')
Diff_values(row, column, new value, old value, new formula, old formula):(2, 5, Decimal('-400000000'), Decimal('-100000000'), '=C2-B2', '=C2-B2')
Diff_values(row, column, new value, old value, new formula, old formula):(23, 2, Decimal('2230416199.7481'), Decimal('1930416199.7481'), '=B2+NPV(Inputs!$C$8,B3:B22)', '=B2+NPV(Inputs!$C$8,B3:B22)')
Diff_values(row, column, new value, old value, new formula, old formula):(24, 2, Decimal('1.6615'), Decimal('1.438'), '=B23/D23', '=B23/D23')
[Outputs]
Diff_values(row, column, new value, old value, new formula, old formula):(3, 3, Decimal('2230416199.7481'), Decimal('1930416199.7481'), '=Simluation_Tool_Example!B23', '=Simluation_Tool_Example!B23')
Diff_values(row, column, new value, old value, new formula, old formula):(4, 3, Decimal('1.6615'), Decimal('1.438'), '=Simluation_Tool_Example!B24', '=Simluation_Tool_Example!B24')
Read time: 4.3417253494262695
Loop Checking Time:1.5703620910644531
Save time: 0.2653176784515381
65 changes: 65 additions & 0 deletions scripts/Excel_diff.py
@@ -0,0 +1,65 @@
import os
Collaborator

Should this be in raven/tests/scripts/TestHarness/testers?

Collaborator

@PaulTalbot-INL since it's a general script for use outside of the testing harness, I instructed @wenchichenginl to put it in raven/scripts/ but let me know if you think it should only live within the testing harness.

import json
import sys
import xlwings as xw
import time
from pathlib import Path # Core Python Module

def compare_json(file1, file2):
  """
    Compares two Excel files sheet by sheet and cell by cell.
    @ In, file1, str, path to the initial version of the Excel file
    @ In, file2, str, path to the updated version of the Excel file
    @ Out, None, writes Diff_Results.txt next to file1 and prints the read, loop, and save times
  """
  start = time.time()
  initial_version = Path.cwd() / file1
  updated_version = Path.cwd() / file2
  # resolve the output folder against the working directory (also works when file1 has no folder part)
  out_dir = Path.cwd() / os.path.dirname(file1)
  copy_path = out_dir / ("new_" + os.path.basename(file1))
  with xw.App(visible=False) as app:
    # work on a saved copy of the initial version
    initial_wb_2 = app.books.open(initial_version)
    initial_wb_2.save(copy_path)
    initial_wb_2.close()
    # open each workbook once instead of once per sheet
    initial_wb = app.books.open(copy_path)
    updated_wb = app.books.open(updated_version)
    num_sheet = len(updated_wb.sheet_names)
    with open(out_dir / "Diff_Results.txt", mode="wt", encoding="utf-8") as f:
      read = time.time()
      for i in range(num_sheet):
        initial_ws = initial_wb.sheets[i]
        updated_ws = updated_wb.sheets[i]
        f.write("[" + str(initial_wb.sheet_names[i]) + "]\n")
        # compare the sheets cell by cell over the used range of the updated sheet
        for cell in updated_ws.used_range:
          OV = initial_ws.range((cell.row, cell.column)).value
          OF = initial_ws.range((cell.row, cell.column)).formula
          if cell.formula != OF or cell.value != OV:
            f.write("Diff_values(row, column, new value, old value, new formula, old formula):")
            f.write(str((cell.row, cell.column, cell.value, OV, cell.formula, OF)))
            f.write("\n")
      start_check = time.time()
      initial_wb.close()
      updated_wb.close()
      end = time.time()
      f.write("Read time: " + str(read - start) + "\n")
      f.write("Loop Checking Time:" + str(start_check - read) + "\n")
      f.write("Save time: " + str(end - start_check) + "\n")
  # remove the temporary copy once Excel has released it
  os.remove(copy_path)
  print("Read", read - start)
  print("Loop", start_check - read)
  print("Save", end - start_check)

if __name__ == "__main__":
  compare_json(sys.argv[1], sys.argv[2])
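
The cell-by-cell comparison can be exercised without Excel. The following is a minimal sketch, assuming each sheet has already been read into a dictionary that maps `(row, column)` to a `(value, formula)` pair. That data layout and the name `diff_cells` are assumptions for illustration and are not part of the script above.

```python
def diff_cells(old, new):
    """Return one tuple per cell whose value or formula differs.

    old, new: dict mapping (row, column) -> (value, formula).
    Only cells present in `new` are visited, so a cell that exists
    solely in `old` is not reported.
    """
    diffs = []
    for (row, col), (new_value, new_formula) in sorted(new.items()):
        # a cell missing from the old sheet is treated as empty
        old_value, old_formula = old.get((row, col), (None, None))
        if new_value != old_value or new_formula != old_formula:
            diffs.append((row, col, new_value, old_value, new_formula, old_formula))
    return diffs
```

The tuple order matches the `Diff_values(row, column, new value, old value, new formula, old formula)` header written to the results file.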
Binary file not shown.
@@ -0,0 +1,13 @@
[Inputs]
Diff_values(row, column, new value, old value, new formula, old formula):(2, 3, Decimal('400000000'), Decimal('100000000'), '400000000', '100000000')
[Simluation_Tool_Example]
Diff_values(row, column, new value, old value, new formula, old formula):(2, 2, 400000000.0, 100000000.0, '=Inputs!C2', '=Inputs!C2')
Diff_values(row, column, new value, old value, new formula, old formula):(2, 5, Decimal('-400000000'), Decimal('-100000000'), '=C2-B2', '=C2-B2')
Diff_values(row, column, new value, old value, new formula, old formula):(23, 2, Decimal('2230416199.7481'), Decimal('1930416199.7481'), '=B2+NPV(Inputs!$C$8,B3:B22)', '=B2+NPV(Inputs!$C$8,B3:B22)')
Diff_values(row, column, new value, old value, new formula, old formula):(24, 2, Decimal('1.6615'), Decimal('1.438'), '=B23/D23', '=B23/D23')
[Outputs]
Diff_values(row, column, new value, old value, new formula, old formula):(3, 3, Decimal('2230416199.7481'), Decimal('1930416199.7481'), '=Simluation_Tool_Example!B23', '=Simluation_Tool_Example!B23')
Diff_values(row, column, new value, old value, new formula, old formula):(4, 3, Decimal('1.6615'), Decimal('1.438'), '=Simluation_Tool_Example!B24', '=Simluation_Tool_Example!B24')
Read time: 4.793208599090576
Loop Checking Time:1.8191144466400146
Save time: 0.2797675132751465
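
A results file in this layout can be read back for further checks. The sketch below assumes the format is exactly the one shown above: a `[sheet name]` line followed by `Diff_values(...)` lines. The function name `parse_diff_results` is hypothetical. `Decimal('...')` entries are returned as plain strings because they are not Python literals.

```python
import ast
import re

PREFIX = "Diff_values(row, column, new value, old value, new formula, old formula):"


def parse_diff_results(text):
    """Group the Diff_values tuples by sheet name."""
    sheets = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[") and line.endswith("]"):
            # a section header such as [Inputs] starts a new sheet
            current = line[1:-1]
            sheets[current] = []
        elif line.startswith(PREFIX) and current is not None:
            payload = line[len(PREFIX):]
            # turn Decimal('1.5') into the string '1.5' so literal_eval accepts it
            payload = re.sub(r"Decimal\('([^']*)'\)", r"'\1'", payload)
            sheets[current].append(ast.literal_eval(payload))
    return sheets
```

Timing lines such as `Read time: ...` match neither pattern and are ignored.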
27 changes: 27 additions & 0 deletions tests/reg_self_tests/D_01_Excel_Version_Control_Tool/ReadMe.txt
@@ -0,0 +1,27 @@
• Prerequisite:
○ The user needs to install RAVEN and put the Excel file to be version-controlled (in one of the formats .xlsx, .xlsb, .xla, .xlam, .xlsm, .xlt, .xltm, .xltx) under \projects\raven\tests\reg_self_tests\D_01_Excel_Version_Control_Tool.
○ Save and commit any changes for the tool.
• Objectives:
○ This tool helps users control the versions of Excel files. The differences between the current version and the previous version are documented so that the user can decide whether to keep the modifications or return to a previous version at any time.
○ Track the file using "git add excel_file_name" to put the Excel file under version control.
• Steps for version control
○ You must close the Excel file before executing "git diff".
○ There are two ways to compare versions. The first compares the Excel files within the same branch [1]; the second compares the Excel files in different branches [2]. Please follow the instructions below to diff the two Excel files:
§ [1] In the same branch
□ Make changes and save the Excel workbook
□ Run "git status" to make sure some changes have been made
□ Run "git diff <PathToFile/file_name>" and wait until it finishes
□ Check "Diff_Results.txt" in the same folder
§ [2] In different branches
□ Create a new branch using "git branch new"
□ Switch from the "master" branch to the "new" branch
□ Open the Excel file and make changes. Then save and close the file.
□ Use "git add "file name"" to stage the changes to the file
□ Commit the changes using "git commit -m "adding text over here""
□ Use "git diff master...HEAD" (compares the file in the new branch with the file in the master branch)
□ Optionally, merge into the master branch
® git checkout master
® git merge new
® *Note that after these steps, the git diff will be empty.
○ The output of "git diff" (Diff_Results.txt) shows the differences between the two versions of the Excel file if you made changes. Review the changes row by row and decide whether you agree with them. If you do, type "git add excel_file_name" and then "git commit -m "text to commit"". The commit message is saved in the git log of the repository.
If you do not agree with the changes, type "git restore excel_file_name" to discard them and return to the original version.
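
Before running "git diff", it can help to confirm that a file name matches one of the patterns this PR adds to `.gitattributes`. The following is a minimal sketch with the pattern list copied from that hunk. The helper name `uses_excel_diff` is hypothetical, and the match here is case-sensitive, which is an assumption of the sketch.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# patterns copied from the .gitattributes hunk in this PR
EXCEL_PATTERNS = (
    "*.xla", "*.xlam", "*.xls", "*.xlsb", "*.xlsm",
    "*.xlsx", "*.xlt", "*.xltm", "*.xltx",
)


def uses_excel_diff(path):
    """Return True if the file name matches one of the exceldiff patterns."""
    name = PurePath(path).name
    return any(fnmatchcase(name, pattern) for pattern in EXCEL_PATTERNS)
```

For example, `uses_excel_diff("tests/example.xlsx")` is true, while a `.csv` file is not matched and would go through git's regular text diff.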
Binary file not shown.
@@ -0,0 +1,63 @@
# The Interface between Python and Excel
## Warning: Please close the tool before running this code
import os
import json
import numpy as np
import xlwings as xw
import time
import sys
import argparse
from pathlib import Path # Core Python Module
import pandas as pd

# Read the file name of the excel from the tests file
if __name__ == "__main__":
  parser = argparse.ArgumentParser(description="pass the excel file name to python script")
  # Inputs from the user
  parser.add_argument("xlsb_python", help="name of the excel file to be tested")
  args = parser.parse_args()
  file_name = args.xlsb_python
  wr = xw.Book(r'./' + file_name)
  wr2 = xw.Book(r'./Gold_' + file_name)

  # Read the inputs and outputs from the gold file
  sht_inp2 = wr2.sheets['Inputs']
  input_old = pd.DataFrame(sht_inp2.range('Inputs_table').value, columns=['Parameter', 'Unit', 'value'])
  sht_out2 = wr2.sheets['Outputs']
  output_old = pd.DataFrame(sht_out2.range('Outputs_table').value, columns=['Parameter', 'Unit', 'value'])

  # Write the gold inputs to the existing file
  sht_inp = wr.sheets['Inputs']
  # store the inputs to be restored
  input_restore = sht_inp.range('Inputs_table').value
  input_new = pd.DataFrame(input_restore, columns=['Parameter', 'Unit', 'value'])
  sht_inp.range('Inputs_table').value = sht_inp2.range('Inputs_table').value
  sht_out = wr.sheets['Outputs']
  output_new = pd.DataFrame(sht_out.range('Outputs_table').value, columns=['Parameter', 'Unit', 'value'])
  # T counts passed outputs, F counts failed outputs
  T = 0
  F = 0
  for i in range(len(output_new.iloc[:, 2])):
    # check that the error is at most 0.1% for each output;
    # comparing against abs(gold) also handles zero and negative gold values
    if abs(output_new.iloc[i, 2] - output_old.iloc[i, 2]) <= 0.001 * abs(output_old.iloc[i, 2]):
      T = T + 1
      print('Test passed for calculating ' + str(output_new.iloc[i, 0]))
    else:
      F = F + 1
      print('Test failed for calculating ' + str(output_new.iloc[i, 0]))

  # put the user's original inputs back
  sht_inp.range('Inputs_table').value = input_restore

  # save, close both books, and quit the Excel instance that holds them
  app = wr.app
  wr.save()
  wr.close()
  wr2.close()
  app.quit()
  # exit code 0 when every output passed, 1 otherwise
  sys.exit(0 if F == 0 else 1)
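
The pass/fail rule is a 0.1% relative-error check against the gold outputs. A rule written as a division by the gold value would raise an error for a gold value of zero and would always pass for a negative one, so this standalone sketch compares against `rel_tol * abs(gold)` instead. The names `within_tolerance` and `failed_outputs` are hypothetical and the `(name, value)` input layout is an assumption.

```python
def within_tolerance(new, gold, rel_tol=0.001):
    """True when `new` is within rel_tol (relative) of `gold`."""
    return abs(new - gold) <= rel_tol * abs(gold)


def failed_outputs(new_outputs, gold_outputs, rel_tol=0.001):
    """Return the names of outputs that differ from gold by more than rel_tol.

    Both arguments are lists of (name, value) pairs in the same order.
    """
    failures = []
    for (name, new_value), (_gold_name, gold_value) in zip(new_outputs, gold_outputs):
        if not within_tolerance(new_value, gold_value, rel_tol):
            failures.append(name)
    return failures
```

An empty list from `failed_outputs` corresponds to exit code 0 in the script; any entry corresponds to exit code 1.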
Binary file not shown.
@@ -0,0 +1,16 @@
• Prerequisite:
○ You need to install RAVEN (https://github.com/idaholab/raven) to be able to run the tests.
○ The Excel file to be tested must have "Inputs" and "Outputs" tabs, and it is the user's responsibility to connect the inputs and outputs to their own calculations. There is an example file inside this folder for reference.
• Objective
○ If the user changes the Excel file, this tool helps test whether a formula inside the calculation has been broken and now generates a different set of outputs.
• Assumptions:
○ The outputs of the initial version of the Excel file are assumed to be the gold values for comparison.
○ The current testing tool only works if the order and the total number of the inputs and outputs do not change.
• Execution of the testing
○ Go to project/raven
○ ./run_tests --re=D_02_Excel_Python_Regression_Testing_Tool
• Example cases:
○ If the file is unchanged, the test must pass.
○ Modify the input values in the Excel sheet and you should get "pass" if you did not change any formula.
○ Modify a formula inside the "Simluation_Tool_Example" tab and you should get "failure" if the formula you modified affects the outputs.
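
The assumption that the order and number of inputs and outputs stay fixed can be checked before any values are compared. The sketch below assumes each table has been read into a list of `(Parameter, Unit, value)` rows, matching the column names used by the regression script. The function name `layout_mismatches` is hypothetical.

```python
def layout_mismatches(gold_rows, new_rows):
    """Describe differences in row count or parameter order between two tables.

    Each row is a (parameter, unit, value) tuple; values are not compared.
    """
    problems = []
    if len(gold_rows) != len(new_rows):
        problems.append(
            f"row count differs: gold has {len(gold_rows)}, current has {len(new_rows)}"
        )
    # compare parameter names position by position over the shared rows
    for index, (gold, new) in enumerate(zip(gold_rows, new_rows)):
        if gold[0] != new[0]:
            problems.append(
                f"row {index}: expected parameter {gold[0]!r}, found {new[0]!r}"
            )
    return problems
```

An empty list means the two tables line up and a value-by-value comparison is meaningful.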

@@ -0,0 +1,9 @@
[Tests]
  [./D_02_Excel_Python_Regression_Testing_Tool]
    required_libraries='xlwings'
    type = RavenPython
    input = 'Excel_regression_testing.py 02_example_file_for_testing.xlsx'
  [../]
[]