Skip to content

Added splitWords and various code quality improvements #58

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Apr 20, 2023

Conversation

vanderlee
Copy link
Owner

@vanderlee vanderlee commented Apr 1, 2023

Fixes: #65

Martijn van der Lee and others added 2 commits April 2, 2023 17:24
Using the explicit @Covers annotation in tests reduces the coverage
result because it seems to focus on the manual assignments rather than
the automatically detected covered pieces of code. Relying on
automatically determined coverage generally appears to be less
error-prone.
- Unify wording.
- Make clear that splitWord() does not support punctuation.
- Add large text test.
- Reword variables to become more verbose.
- Reduce expression overhead.
- splitWords(): Fix algorithm. For example
  "A syllable is a unit of organization."
  was split into
  [
      ['A'],
      ['syl', 'la', 'ble'],
      ['is', 'ga', 'ni', 'za', 'tion.'],
  ]
- splitWords(): Split text punctuation into specific array.
  For example
  "A syllable is a unit of organization."
  is split into
  [
      ['A'],
      [' '],
      ['syl', 'la', 'ble'],
      [' '],
      ['is'],
      [' '],
      ['a'],
      [' '],
      ['unit'],
      [' '],
      ['of'],
      [' '],
      ['or', 'ga', 'ni', 'za', 'tion'],
      ['.']
  ]
This may be caused by an insufficient pattern in the TeX language
file affected or by a bug of Syllable::parseWord().
@alexander-nitsche
Copy link
Collaborator

Hi Martijn,

i dived into the Syllable class for the first time and made some adjustments regarding this Syllable::splitWord(), splitWords() and splitText(). What remains would be to generate a new PHPDoc for docs/ from the Syllable class. Not sure, how to best achieve that.

As to your splitWord() question: I would keep splitWord() but i would outline in the documentation, that it does handle only simple words, no text punctuation. I added according notes in the code, tests and README.md.

As to the splitWords() concept: The implementation had some bugs that i fixed. In addition there is one decision to make: what is the purpose of splitWords()? For now i separated the text punctuation from the words (see the tests), but this has to be checked against the purpose of this method.

The profiler shows some slight performance gains for the adjusted splitText() implementation, but nothing fundamentally. I did the changes to understand the function and strip it down to the minimum required code lines to improve copy&paste&adapt for splitWords().

I removed the @cover annotations from the test classes again, as they have reduced the lines covered by tests a lot. What is the purpose of adding the covers annotation? It might be error-prone to use it in this package as PHPUnit seems to be perfectly able to link tests to covered code currently.

Last but not least, i accidentally pushed a rebased state to this PR and therefore somehow have overwritten you as committer of the first commit. Sorry for that.

Greetings, Alex

@alexander-nitsche
Copy link
Collaborator

Relates to #28 and #57.

@alexander-nitsche
Copy link
Collaborator

Closes: #30

@alexander-nitsche
Copy link
Collaborator

The code coverage reports with @Covers annotation:
syllable-test-coverage-with-annotation
and without:
syllable-test-coverage-without-annotation

@alexander-nitsche
Copy link
Collaborator

Hi Martijn,

i have added a documentation generator that will update the API documentation in README.md if

./build/generate-docs

is executed on the console. You should run it as soon as the Syllable class is changed. I have run it once. Let's try to find a final format for the method signature: in the previous manually maintained API documentation there were some inconsistencies in the format, e.g. adding or skipping the keyword "function", the number of spaces, etc:).

Greetings, Alex

@alexander-nitsche alexander-nitsche force-pushed the master_splitWords branch 2 times, most recently from 03f705a to f8ffcf3 Compare April 4, 2023 09:53
@alexander-nitsche
Copy link
Collaborator

Hi Martijn,

i have added a GitHub Action that checks if the API documentation in README.md is up-to-date.

Greetings, Alex

@alexander-nitsche
Copy link
Collaborator

Remaining questions from my side would be:

  1. What is the purpose of Syllable::splitWords()?
  2. Do you agree with the API reference format?

@alexander-nitsche
Copy link
Collaborator

  1. I would not just rename API variables and methods, like Syllable::setCacheDir() to Syllable::setDirectoryCache and the like, as it will break all uses of the package out there and require at least a minor release.
  2. Maybe Codacy analysis can be disabled / dropped for now, as it currently fails and counteracts StyleCI requirements. Maybe one format guard is sufficient :)?

Previously, only changes to files of the current directory
and subdirectories were committed.
- Fix API documentation generation.
- Use " = " instead of "=" for assigning the default value.
- Check the API reference in GitHub Action.
- Cache Composer dependencies for reuse in different jobs to speed up
  workflows and save bandwidth. Mind the global warming.
This should be sufficient to find all version-dependent errors. Mind
the global warming.
@alexander-nitsche
Copy link
Collaborator

Did some small unifications and beautifications.

@alexander-nitsche
Copy link
Collaborator

The latest two commits fix #65 .

- Addition of caching tests.
- Support for nested directory structures in the test directory.
This helps to prevent PHP from encoding the version as a high precision
floating point number in the cache file, e.g. 1.399999999999 instead of
1.4.
@alexander-nitsche
Copy link
Collaborator

Hi @vanderlee ,

how about

  1. reverting the renaming of API variables and methods of this PR, like Syllable::setDirectoryCache() and the like, to not break all uses of the package out there, and
  2. dropping the Codacy analysis,

and getting this PR merged?

Greetings
Alex

@vanderlee
Copy link
Owner Author

Hi @alexander-nitsche. I'll look into dropping Codacy.

With regards to @Covers; I feel coverage is a tool rather than a goal by itself. Removing @Covers might report covered lines but does not provide any incentive to tailor testcases to the functionality that those lines represent.

Anything API should probably be reverted for backwards compatibility.

Haven't looked into the API reference format but any formal documentation will be better than the manual stuff we have now.

@alexander-nitsche
Copy link
Collaborator

Ok, i will revert the API names and you the Codacy settings and we'll meet here soon again :)

As for @Covers: Let's move this topic to another PR, ok?

Revert it to not break all uses of the package out there.

Updated README.md via
```
./build/generate-docs
```
Copy link
Collaborator

@alexander-nitsche alexander-nitsche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready to be merged, from my side.

@vanderlee
Copy link
Owner Author

Once I have some spare time (i.e. not soon), I'll try to improve coverage and reinstate the @Covers. For now I'm okay to merge

@vanderlee vanderlee merged commit 275df07 into master Apr 20, 2023
@vanderlee vanderlee deleted the master_splitWords branch April 20, 2023 12:51
@alexander-nitsche
Copy link
Collaborator

@vanderlee : Should we create another release? If so, minor or patch release? Minor would be fine as we are introducing a new API method splitWords() and patch would be fine, as i do not know yet the use case of splitWords() and maybe it has no use case at all :)

@vanderlee
Copy link
Owner Author

According to semver, it should be a minor because we did add a new feature. So 1.6.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Cache version in JSON cache file can be infinite decimal
2 participants