Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R1C1 conversion should handle absolute A1 references #2060

Merged
merged 4 commits into from
May 7, 2021
Merged

R1C1 conversion should handle absolute A1 references #2060

merged 4 commits into from
May 7, 2021

Conversation

ndench
Copy link
Contributor

@ndench ndench commented May 4, 2021

This is:

- [x] a bugfix
- [ ] a new feature

Checklist:

Why this change is needed?

A1 references can be made absolute in Excel by preceding the column or row with a $.
image

The current AddressHelper::convertToR1C1 function will ignore the $ and convert to relative R1C1 if the $currentRowNumber or $currentColumnNumber are specified, and to absolute R1C1 if not.

If you enable the R1C1 format in Excel, any reference with the $ will be converted to absolute R1C1 format.

This fix handles these references the same as Excel. Meaning that the R1C1 reference will use an absolute row or column if the $ is specified. For example $O$18 is always R18C15 regardless of whether $currentRowNumber or $currentColumnNumber are specified.

$columnId = Coordinate::columnIndexFromString($cellReference[1]);
$rowId = (int) $cellReference[2];
if ($cellReference[1][0] === '$') {
$columnId = Coordinate::columnIndexFromString(substr($cellReference[1], 1));
Copy link
Member

@MarkBaker MarkBaker May 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be easier to extract the $ for absolutes, the column address and row id directly through individual capture groups in the regexp rather than using substring? Then you can also eliminate the else condition in these if statements, which reduces the complexity of the code for path coverage

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I originally went this way to not increase the complexity of the regex, but I think your way is cleaner. I'll push an update today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed that change now. Much cleaner 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those changes to the regex don't add any significant cost to execution, as you're already executing the regex anyway; but do eliminate the substr() function calls.

The only thing I'd be tempted to change now would be to use named capture groups in the regex, so you could reference

$rowId = (int) $cellReference['row_id'];

or

if ($cellReference['absolute_column_ref'] === '$') {

as associative rather than enumerated, which adds to the readability of the code.

I've only recently started using named capture groups myself. It gives the benefit of improved readability of purpose, but It does add to the complexity of the expression. I can understand if you wouldn't want to use them though.

One last change I'd suggest would be to extract the regex to a class constant, which can then be made public, because potentially there's other places in the codebase that could use it (such as the coordinateFromString() method in Cell\Coordinate... it might be better located there as a public class constant anyway).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that named groups are much more readable, I just stuck with non-named to match the existing code style.

I've updated it to use named groups and also moved the regex into a constant and reused it in Cell\Coordinate. Which is outside of scope for this PR, but I think the regexes should be reused constants which will minimize future bugs. Happy to undo this change if you want, but the tests still pass so it should be fine?

Copy link
Member

@MarkBaker MarkBaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, and I'm glad you were comfortable enough to do the named capture groups

@MarkBaker MarkBaker merged commit f28eea7 into PHPOffice:master May 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants