Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test AddressHelper::convertFormulaToA1 #2086

Merged
merged 7 commits into from
May 20, 2021
Merged

Test AddressHelper::convertFormulaToA1 #2086

merged 7 commits into from
May 20, 2021

Conversation

ndench
Copy link
Contributor

@ndench ndench commented May 11, 2021

This is:

- [x] a bugfix
- [ ] a new feature

Checklist:

Why this change is needed?

As discussed in #2076, the `AddressHelper::convertFormulaToA1 function was not tested. This is quite a logically intensive function and as a result I actually identified a bug while writing the tests. The bug was identified with this test:

1) PhpOffice\PhpSpreadsheetTests\Cell\AddressHelperTest::testConvertFormulaToA1FromR1C1Relative with data set #5 ('=E3-E7-G5-C5-E5-5', '=R[-2]C-R[2]C-RC[2]-RC[-2]-RC-5', 5, 5)
PhpOffice\PhpSpreadsheet\Exception: Invalid R1C1-format Cell Reference, Value out of range

As you can see from this regex101 example the old regex fails for relative R1C1 conversion when the column is the current column and the reference is followed by a minus sign, eg R[-2]C-R[2]C.

image

I updated the regex to use non-capturing groups to enforce that the R or C is followed by either:

  • A positive number
  • A number inside square brackets with an optional negation sign

Which you can see from this regex101 example that it extracts the references correctly.

image

I decided against splitting this function into two different functions (as it handles both R1C1 and SpreadsheetXML formats) in this PR as I thought it best to get some functioning tests first. Then we can split it up later.

@@ -72,7 +72,7 @@ public static function convertFormulaToA1(
foreach ($temp as &$value) {
// Only replace in alternate array entries (i.e. non-quoted blocks)
if ($key = !$key) {
preg_match_all('/(R(\[?-?\d*\]?))(C(\[?-?\d*\]?))/', $value, $cellReferences, PREG_SET_ORDER + PREG_OFFSET_CAPTURE);
preg_match_all('/(R((?:\[-?\d*\])|(?:\d*))?)(C((?:\[-?\d*\])|(?:\d*))?)/i', $value, $cellReferences, PREG_SET_ORDER + PREG_OFFSET_CAPTURE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As with our previous PR, a class constant is useful for regexp, even if it's only used in the one place.... there's a number of other regexp features (e.g. the /x switch) that can be used to help document them, and having them as a class constant is the easiest place to do this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good point. I mainly kept it here so you could see the change I made easily. But it's certainly better as a constant. Fixed up 👍

['=D3*F7*G4*C6*5', '=R3C4*R7C6*R4C7*R6C3*5'],
['=D3/F7/G4/C6/5', '=R3C4/R7C6/R4C7/R6C3/5'],
// Formulas
['=SUM(E1:E5)', '=SUM(R1C5:R5C5)'],
Copy link
Member

@MarkBaker MarkBaker May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a useful little technique for your data provider array.... using associative indexes for each test can make it easier to recognise which tests are failing, and also has benefits if you report tests using additional phpunit tools
e.g.

return [
    'Sum with relative range' => ['=SUM(E1:E5)', '=SUM(R1C5:R5C5)'],
    'Sum with relative range and additional value' => ['=SUM(E1:E5, D5)', '=SUM(R1C5:R5C5, R5C4)'],
];

When a test does fail, it's reported by the name that you've given it, rather than simply it's offset number (with data set #5), which can make it a lot easier to identify in large data providers.

And not every test case needs to be named, so the array can be a mix of enumerated and associative

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I completely forgot about this. I've added in proper names for each data set.

Copy link
Member

@MarkBaker MarkBaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you ever had one of those moments, when reviewing a PR, when you notice a bug in the code... not a new bug, not one that you've introduced, but something that has been in the code for years, and nobody has ever noticed.... one that I introduced when I created the original SpreadsheetML Reader.

Consider the formula

of:=1.23+2.34

or

of:=[.D3]+[.F7]+[.G4]+[.C6]+5.67

in a SpreadsheetML file... it'll process cleanly and with no error through the convertor, but it won't be the same formula.

Don't worry about it; I'll implement a fix as a separate PR, with some additional unit tests.... I'm just very, very annoyed at myself

@ndench
Copy link
Contributor Author

ndench commented May 16, 2021

Ah yes, I can say I've definitely had that before 😂. Should we merge this PR then? So that we can add that as a unit test case? I'm happy to do it if you'd like a hand?

I've rebased this onto master as well, so that it's not blocked.

@MarkBaker MarkBaker merged commit e6a4442 into PHPOffice:master May 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants