Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ugrep format option %p stopped working for some archives at version 3.11.2+ (commit f6311d2) #299

Closed
acelticsfan opened this issue Sep 27, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@acelticsfan
Copy link

acelticsfan commented Sep 27, 2023

These are the options I'm using to search .tgz files with gzipped files in them.
ugrep -sIza --zmax=2 --format-open='%u%p/%a:%z%~' --format='%u%O%~'

When run like this, the header line in the output that is printed for the file path is missing the path to the archive of the '%p' option.

For that changeset, there were a small number of changes introduced for hyperlink improvements. But, the changes broke the %p --format option for me in the way used above.

To demonstrate the problem, follow these steps:

cd /tmp
mkdir -p a/b/c
cp file.txt a/b/c
gzip a/b/c/file.txt
tar zcvf abc.tgz a/

Then grep for a string in the file using the following command:
ugrep -sIza --zmax=2 --format-open='%u%p/%a:%z%~' --format='%u%O%~' 'search_string' /tmp/abc.tgz
It will find the string, but the first line, the header line, will be missing the path to the archive.

Failing example

As an example, use this /etc/hosts file with this contents.
/etc/hosts:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

Then run these commands:

cd /tmp
mkdir -p a/b/c
cp /etc/hosts a/b/c
gzip a/b/c/hosts
tar zcvf abc.tgz a/

ugrep -sIza --zmax=2 --format-open='%u%p/%a:%z%~' --format='%u%O%~'  host /tmp/abc.tgz
/:a/b/c/hosts.gz           <--- NOTE: missing archive name at the start of the line
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

Working example using old release of ugrep 3.8.3

ugrep.3.8.3 -sIza --zmax=2 --format-open='%u%p/%a:%z%~' --format='%u%O%~'  host /tmp/abc.tgz
/tmp/abc.tgz:a/b/c/hosts.gz
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

Change isolation

I ran 'git bisect' to find when the breaking change occurred working back from current release. It is the commit mentioned in the subject: f6311d2, which occurred just after 3.11.2 was released.

These are the specific changes that were made in that commit from the previous.
git diff --compact-summary 4211c2a f6311d2

git diff --compact-summary 4211c2a f6311d2237505bb6ca903334371554cd5194c18e
Author: Robert van Engelen <genivia-inc@users.noreply.github.com>
Date:   Fri Apr 7 12:52:21 2023 -0400

    update 3.11.2 for #252
    option --hyperlink improvements

 bin/win32/ugrep.exe | Bin 1671168 -> 1671680 bytes
 bin/win64/ugrep.exe | Bin 1932288 -> 1933312 bytes
 src/output.cpp      | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------------------------------
 src/output.hpp      |   6 +++---
 src/ugrep.cpp       | 143 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------
 5 files changed, 177 insertions(+), 141 deletions(-)

I think the change that caused the issue is in output.cpp in the Output::format function, although it might be the Output::header function too.

For Output::format, we see this change:

@@ -846,7 +869,7 @@ void Output::format(const char *format, const char *& pathname, const std::strin
         break;

       case 'p':
-        if (pathname != NULL)
+        if (heading)
         {
           const char *basename = strrchr(pathname, PATHSEPCHR);
           if (basename != NULL)

I think heading might be false here causing it to skip the code for the option, although I'm not entirely sure. Hopefully this is clear. Let me know if you have questions.

@acelticsfan acelticsfan changed the title ugrep format option %p stopped working for some archives at version 3.11.2+ (commit f6311d2237505bb6ca903334371554cd5194c18e) ugrep format option %p stopped working for some archives at version 3.11.2+ (commit f6311d2) Sep 27, 2023
@acelticsfan
Copy link
Author

acelticsfan commented Sep 28, 2023

Of course, some of the paths that I work with are much longer than this, and so this becomes a problem when I can't tell which directory/path a search hit was found in.

@genivia-inc
Copy link
Member

Thank you for your feedback.

The change was related to another series of changes that made pathname always set and never NULL and a change in the way --only-line-number works with formatting instead of a custom branch to handle its search and output. But %p should not be conditional on --heading. So the change for switch case 'p' can be reverted as well as the change for case 'a'.

The change made in the past attempts to replicate --heading when formatted output is specified. That only works alright when --format-open is specified also. However, it may not be as transparent how that works as I had hoped. So I am inclined to revert back to what we had before.

@genivia-inc
Copy link
Member

I've updated src/output.cpp in this repo with a revision to revert part of the changes so that the old behavior is restored. The %+ format still works too (which was new), but the logic is more carefully designed to avoid confusion and problems with the format fields. More specifically, if %+ is not used then the formats always work like they did before in 3.8.3.

I'm working on an update of ugrep for the next release with new features. This may take a couple of days.

@genivia-inc genivia-inc added the bug Something isn't working label Sep 28, 2023
@acelticsfan
Copy link
Author

acelticsfan commented Sep 28, 2023

Thank you very much for quick response and updates !!

Fix works for me. Appreciate it.

stdedos pushed a commit to stdedos/ugrep that referenced this issue Oct 16, 2023
# By Robert van Engelen (38) and Ashish SHUKLA (1)
# Via GitHub (1) and Robert van Engelen (1)
* tag 'v4.3.0':
  released 4.3.0
  release 4.3.0
  released 4.3.0
  released 4.3.0
  revert part of output.cpp to fix Genivia#299
  updated README with MacPorts installation instructions
  updated README
  updated README
  released 4.2
  updated README
  updated README
  update README and man page
  update README
  updated README
  released 4.1.0
  updated README
  updated README
  updated README
  update README
  Update README.md
  updated README
  released 4.0.5
  released 4.0.5
  updated README
  released 4.0.4
  released 4.0.4
  released 4.0.4
  released 4.0.3
  Delete .travis.yml
  updated README
  updated README
  released 4.0.2
  update makemake
  update docs
  update
  updated README
  updated README
  updated README
  Fix --enable-hidden option

Signed-off-by: Stavros Ntentos <3527706-stdedos@users.noreply.gitlab.com>

# Conflicts:
#	src/ugrep.cpp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants