Skip to content

Conversation

@nc7s
Copy link
Contributor

@nc7s nc7s commented Oct 14, 2023

As proposed in #208 (comment), produce diffs of changes, instead of printing entire files.

This is a rough WIP that copies the terminal-inline example of similar and barely works. I intend to make it produce patches that are actually applicable.

example diff (note that coloring is provided by bat) example diff (note that coloring is provided by bat)

@CosmicHorrorDev CosmicHorrorDev marked this pull request as draft October 16, 2023 22:51
@CosmicHorrorDev
Copy link
Collaborator

Thanks for the PR! I really like how the output looks. I've gone ahead and marked this as a draft while this is still a WIP, but just let me know if you need anything

As for early feedback I think my only concern is that it shouldn't replace the existing output, but should be added as another option. Maybe through a --preview-diff flag or by changing --preview to take take a value like --preview <plain|diff>?

@nc7s
Copy link
Contributor Author

nc7s commented Oct 17, 2023

Thanks for the PR! I really like how the output looks. I've gone ahead and marked this as a draft while this is still a WIP, but just let me know if you need anything

Didn't know you can do that ;)

As for early feedback I think my only concern is that it shouldn't replace the existing output, but should be added as another option. Maybe through a --preview-diff flag or by changing --preview to take take a value like --preview <plain|diff>?

Indeed, the word preview has a meaning. Maybe just --diff? Since with such a flag it seems semantically incorrect to overwrite the file anyway.

@CosmicHorrorDev
Copy link
Collaborator

Sure --diff seems good!

@nc7s

This comment was marked as resolved.

@CosmicHorrorDev
Copy link
Collaborator

Hmmm, I think I would be all right with that. Just push changes and ping me whenever you want feedback

I am planning on pushing for a release over the next few days

@nc7s nc7s mentioned this pull request Nov 2, 2023
@nc7s nc7s changed the title WIP: diff output for --preview WIP: diff output Nov 3, 2023
@nc7s

This comment was marked as outdated.

@alexilyaev
Copy link

Can you share an example of how it would look like?

Got here after searching how to make --preview only show the matched strings or lines that would change and not print the whole file.

Example

This will print every single line of the file and then only at the end prints the change in color.
In large files, this makes --preview unusable.

echo "111\n222\n333\n444\n555\n666\n777\n888\n999" | sd -p '(999)' '000'
image

So, I'm wondering if the diff mode will help with that scenario?

Similar open issues

I suppose one of these would cover the large file use case as well

@nc7s
Copy link
Contributor Author

nc7s commented Dec 19, 2023

Can you share an example of how it would look like?

You can click the triangle on the left of "example" in the topic. Later updates differ a little in detail, but largely it's the same. The Wikipedia entry on diff is probably also worth a read.

Got here after searching how to make --preview only show the matched strings or lines that would change and not print the whole file.

So, I'm wondering if the diff mode will help with that scenario?

Probably. I never thought of diffs in this perspective, but it does work in this way (showing only changes, and limited surrounding context).

@alexilyaev
Copy link

@nc7s Oh, didn't notice the ▶ next to the example diff ☺️👍🏼
How did you get it to show colors? (I'm on macOS, using Zsh in iTerm)

Ok, so I'm testing -d with a change on consecutive lines and it does show me only the changed lines.
But, it doesn't show me the change for each line, rather a big chunk of "before" lines and then a big chunk of "after" lines:

echo "123\n123\n123\n123\n123\n123\n123\n123\n123\n123\n" | cargo run -- -d '2' '9'
image

If the changes are not in consecutive line, then it works as expected:

echo "123\n000\n123\n000\n123\n000\n123\n000\n123\n000\n" | cargo run -- -d '2' '9'
image

I still think that -p should have a flag to show only the affected lines and not print the whole file.

@nc7s
Copy link
Contributor Author

nc7s commented Dec 21, 2023

@alexilyaev:

How did you get it to show colors? (I'm on macOS, using Zsh in iTerm)

bat. Now spelled in the <summary>.

Ok, so I'm testing -d with a change on consecutive lines and it does show me only the changed lines. But, it doesn't show me the change for each line, rather a big chunk of "before" lines and then a big chunk of "after" lines:

I still think that -p should have a flag to show only the affected lines and not print the whole file.

This is expected behavior of diffs. Recommended read: https://en.wikipedia.org/wiki/Diff

Sometimes you won't know where and what the changed lines are if no surrounding context is provided. That's why diffs have contexts - to match surrounding lines. They may otherwise update an unrelated but identical line, which is totally possible for short lines of common words.

If you are confident, context radius could be reduced, down to zero. I'm considering an option to set that, but haven't decided on its naming yet.

@CosmicHorrorDev: copied over from earlier, maybe covered in noise:

Although, it lacks a context radius argument. I tend to not use a generic name like --context for a diff. But --diff-context is too long. I also don't know how to name the "file" when input is through stdin, so it's currently empty; maybe the user knows better, so we also provide an argument like --diff-filename?

@chmln
Copy link
Owner

chmln commented Dec 23, 2023

This is great as a prototype, excellent work. I have long wanted to implement something like this.

Not sure if this is actually possible but it would be great if we could integrate one of the rust pagers as a library so that users don't necessarily need to have e.g. bat installed and piped into, to get a nice looking diff output

@nc7s
Copy link
Contributor Author

nc7s commented Dec 23, 2023

@chmln:

Not sure if this is actually possible but it would be great if we could integrate one of the rust pagers as a library so that users don't necessarily need to have e.g. bat installed and piped into, to get a nice looking diff output

My thought then was that it feels unnecessary for every CLI program to implement its own colorization, esp. when the output is in a pretty common format. Feels like bloat.

Copy link
Owner

@chmln chmln left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nits :)

/// format are likely to change in the future).
pub preview: bool,

#[arg(short, long)]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think diff is short enough to not warrant a short flag, -d might be more useful for other things in the future

Copy link
Contributor Author

@nc7s nc7s Dec 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about -D? Even sed doesn't have many flags. (I impl'd this after all ;)

And what do you think about an option specifying context radius? And filename for piped diffing?

path: Option<&Path>,
) -> Result<String> {
let diff = TextDiff::from_lines(old.as_ref(), new.as_ref());
let path = if let Some(path) = path {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should just convert to a lossy string here, not sure its worth erroring out. Having the user see squareish symbols is better than not being able to see a diff at all

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed when acting as an end user interface, not so when the output is used as a proper diff. Or I'm overestimating the latter use case?

@CosmicHorrorDev
Copy link
Collaborator

For me the big thing is that an integrated pager can be more aware of its contents. An external pager is usually very naïve when it comes to searching or even determining terminal width

@chmln
Copy link
Owner

chmln commented Dec 23, 2023

My thought then was that it feels unnecessary for every CLI program to implement its own colorization, esp. when the output is in a pretty common format.

That's a valid point, although somewhat more of a philosphical one. Ultimately, the users want to see a pretty diff. If we can make that happen without having them install bat and configure a good looking theme..I think that's a clear win.

And if someone really wants to use their own diff program, we can detect that our diff output is being piped and just produce the raw version as you suggest.

Addendum: nevertheless, shipping this 'raw' diff mode is still a big win for all the power users. Just wanted to provide some ideas and food for thought.

@CosmicHorrorDev
Copy link
Collaborator

Also sorry for taking so long to give feedback. I've been stretched pretty thin lately. I likely won't have time to take a look till the new year

@nc7s
Copy link
Contributor Author

nc7s commented Dec 23, 2023

@CosmicHorrorDev:

For me the big thing is that an integrated pager can be more aware of its contents. An external pager is usually very naïve when it comes to searching or even determining terminal width

I'm no expert in this field, but fear that properly setting up an integrated one requires serious work (you gotta know the ins and outs). Plus, we've already seen GIT_PAGER and QUILT_PAGER, to name a few.

Also sorry for taking so long to give feedback. I've been stretched pretty thin lately. I likely won't have time to take a look till the new year

No worries ;) we all have a bunch of work at year end.

@chmln:

My thought then was that it feels unnecessary for every CLI program to implement its own colorization, esp. when the output is in a pretty common format.

That's a valid point, although somewhat more of a philosphical one. Ultimately, the users want to see a pretty diff. If we can make that happen without having them install bat and configure a good looking theme..I think that's a clear win.

Diff coloring is almost too easy to not implement, certainly can be made to work. The "do one thing" principle sometimes has a vague boundary…

And if someone really wants to use their own diff program, we can detect that our diff output is being piped and just produce the raw version as you suggest.

Yeah, configurable colorization was talked about in #268.

So, combined, maybe we can have colorless diffs first, then implement proper colorization later in one go.

@oriongonza
Copy link
Collaborator

Hello @chmln, @nc7s

With @CosmicHorrorDev stepping down what's the status on this? I think it's possitive if either noctis or me would have merge permissions. As I said before don't want to wholely maintain but I wouldn't mind reviewing PRs and doing basic fixes at least until another maintainer shows up.

@nc7s nc7s marked this pull request as ready for review January 17, 2024 22:51
@nc7s nc7s changed the title WIP: diff output Basic diff output Jan 17, 2024
@oriongonza
Copy link
Collaborator

@nc7s
Hello, I'm revisiting old PRs...
What if instead of this we go with the good ol unix philosophy and just use a dedicated differ?

It can be achieved today very easily

diff <(sd foo bar my_file --preview) my_file

And it has the full power of diff (or difftastic, or whatever differ you like)
Instead of poorly reinventing the wheel why not try to make it more convenient to get called from diff?

@nc7s
Copy link
Contributor Author

nc7s commented Apr 19, 2025

That's one step even more than my proposal of not providing paging ;)

It's actually always possible, from the beginning, to do as you suggested, by virtue of being a replacement utility: the user has the original file, sd provides a replaced file, it's almost natural to diff them. But that's extra work. Compared, turn on a switch and the diff is there for you.

It also involves named pipe, which differs on different shells, and some shells might not support it: on those, the user needs to further handle a replacement file themselves; and a more advanced usage of the shell, which I'll admit only until very recently did I get to know. Harder than a simple switch.

From an efficiency and "correctness" perspective, we mostly still use line based diffing, not syntactic, so a diffing library isn't worse than diff(1).

Indeed, the Unix philosophy says one tool should do only one thing and do it well. But then, like I said in a previous comment about paging, the boundary of "one thing" is sometimes vague. Rather, we should strike the balance between that philosophy and practical ease of use & usefulness. An extreme example: should top/htop/btop/etc. split their components monitoring different resources into different programs?

Indeed, it's possible and probably even preferable by some. But then I argue, we already made the effort to provide a colored preview (-p/--preview); why not hand that off too? That's the very point I developed this from, and arguably harder to review than a diff for longer files.

To some degree, this is additional possibility: take fonts as an example, it's better to have ligature then let the user disable it than not having it at all.

Above is obviously defending from my biased point of view as the one pushing and implementing it ;) Feel free to counter my points.

@willhansen
Copy link

Came here looking for this feature.

This syntax would meet my needs: sd A B file.txt --diff [optional custom diff tool] (and I don't even need the optional part)

As mentioned above, whatever_diff_tool <(sd A B file.txt --preview) file.txt would probably be least code to implement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants