Add option to delete Tag #4

Closed
Warden20 opened this issue Feb 20, 2021 · 14 comments
Labels
enhancement New feature or request

Comments

@Warden20

No description provided.

@pjeby pjeby closed this as completed Feb 20, 2021
@Heniker

Heniker commented Mar 4, 2021

Was this implemented?

@pjeby
Owner

pjeby commented Mar 4, 2021

No, and it will not be. (I closed the issue without comment since it was opened without comment.)

I've previously discussed the reasoning for not including this (on the Obsidian forums); the short version is that 1) it's an inherently destructive operation that can't be undone, and 2) just deleting the tag without considering the surrounding text may introduce formatting errors into the note, that require potentially endless special cases to handle.

So, the recommended approach to delete a tag is to merge it into a designated "trash" tag (name it whatever you like), then delete that tag from notes individually when the opportunity arises. (Since as a human you can fix the formatting of surrounding text in an intelligent way.)

@Heniker

Heniker commented Mar 4, 2021

Thanks for the detailed answer. I did not consider formatting issues, because the only way I use tags is in the metadata section of a note.
In that case, it's probably worth removing the mention of the ability to 'delete tags' from the project's info.

My current approach is to use sed to delete tags, which works just fine in most cases.

@pjeby pjeby added the enhancement New feature or request label Apr 9, 2021
@pjeby pjeby mentioned this issue Aug 20, 2021
@pjeby pjeby pinned this issue Nov 2, 2021
@rapatel0

rapatel0 commented Apr 25, 2022

Devil's advocate

Why not have an option to make the tag no longer a tag?

#Trash/tag1/subtag

to

Trash/tag1/subtag

Alternatively, the following:
#Trash/tag1/subtag
to
TAG_REMOVED (Trash/tag1/subtag)

The latter is a non-destructive transformation that shouldn't introduce any obvious corner cases.

Personally, I mostly want to clean it out of the UI layer and not necessarily the note. This could be accomplished with some sed, but I trust your parser more than some random CLI tool.

@pjeby
Owner

pjeby commented Apr 26, 2022

That wouldn't work for front matter tags, since the # is optional for them. Note that if you instead rename the tag to a string that's known to be unique in your vault, you can then use e.g. sed to do the postprocessing without collisions or parsing problems.
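A minimal sketch of that rename-then-postprocess workflow, assuming the tag has already been renamed to a marker string. The marker `zzTrashTag9f3`, the throwaway vault, and the sample note are hypothetical, and the front matter is assumed to use the plain one-tag-per-line list form. Every occurrence is listed first so collisions can be checked before anything is deleted.

```shell
#!/bin/sh
# Hypothetical marker: a tag name assumed to appear nowhere else in the vault.
marker='zzTrashTag9f3'

# Throwaway vault so the sketch is safe to run anywhere.
vault=$(mktemp -d)
printf '%s\n' '---' 'tags:' '  - project' "  - $marker" '---' \
  "Body text with #$marker inline." > "$vault/note.md"

# 1. Review every occurrence before touching anything.
grep -rn "$marker" "$vault"

# 2. Remove the marker. Output goes to a temp file rather than through sed -i,
#    whose syntax differs between GNU and BSD sed.
find "$vault" -type f -name '*.md' | while IFS= read -r f; do
  sed -e "/^[[:space:]]*-[[:space:]]*$marker[[:space:]]*\$/d" \
      -e "s/[[:space:]]*#$marker//g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat "$vault/note.md"
```

This only covers the two forms shown (a list item in front matter and an inline `#` mention). Other front matter layouts, and any punctuation left behind in the body, still need checking by hand.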

@rapatel0

Yeah, I ended up doing that manually but ran into an edge case that required a vault restore. Attaching a simple script here for others who need to do this manually.

Note: REALLY DANGEROUS DELETION OF DATA. PLEASE BACK UP prior to use.
Also, there is probably a better way of doing it.

# Mark removal of tag in body text
find . -type f -name "*.md" -exec /usr/local/bin/gsed -i 's|\#ArchivedTags/|TAG_REMOVED_|g' {} +

# Delete from YAML (list format and then comma format) 
find . -type f -name "*.md" -exec /usr/local/bin/gsed -i '/^  - ArchivedTags.*$/d' {} +
find . -type f -name "*.md" -exec /usr/local/bin/gsed -i 's|ArchivedTags[\/a-zA-Z]*,||g' {} +
find . -type f -name "*.md" -exec /usr/local/bin/gsed -i 's/,\{2,\}/,/g' {} +

@wealthychef1

But why?

I currently use the system outlined here of modifying a tag and then using sed to delete it. What I'm wondering right now is: if you can rename a tag safely without corrupting the data, why can't you rename it to ""?
I understand you have two objections to this:

  1. it's an inherently destructive operation that can't be undone

I would say that objection #1 is moot, since you are already changing the text of tags in the body of notes, which could remove characters (transforming a tag #aa to #a, for example). The only difference between deleting a tag and renaming it to a shorter name is that one of the characters being removed might be a # character, and that it can leave a blank space, unless I'm missing a case, which I may be. I'm not clear on this objection, honestly. Could you say more about it?

  2. just deleting the tag without considering the surrounding text may introduce formatting errors into the note, that require potentially endless special cases to handle.

#2 seems more relevant to me, but I would say this is cleanly handled with a warning to the user that it will delete the tag text from the notes. I can't think of an actual example of deleting a legal tag where this would cause damage; can you give an example?
I think all valid tags can be found with two or three well-chosen regexes. You must already be doing something like this in your plugin, no?
Here are the sed expressions I use in my shell script for each $tag:
sed+="-e 's!#${tag}[[:space:]]!!g' -e 's!#${tag}\$!!g' -e '/^[Tt]ags:[ ]/ s!([[:space:]\[]+)${tag}!\1!g' "
I agree this is complex and was hard to invent -- this is why it belongs in your plugin, solved once and for all by a professional. :-)
The first catches non-YAML tags by requiring a #tag followed by whitespace.
The second catches a tag at the end of a line, including a tag alone on a line by itself.
The third catches tags in YAML by looking for a proper [Tt]ags: line.
Apologies if I'm missing important edge cases.

Why do I care?

Since this is what I currently do myself, I'm interested to know what you know about the dangers here as well as just trying to make my life easier by having you enhance a plugin for me. :-)

@pjeby
Owner

pjeby commented Nov 3, 2022

Regexes cannot safely parse YAML. Note for example that YAML tags can appear like this:

---
Tag:
  - a/b
  - "c, d/e"
  - '#f'
---

(Where a/b, c, d/e, and f are all valid tags. Obsidian also supports tAgS or TAG as the names of the tag field, among other variants, and the value can be JSON encoded too, with embedded lists, e.g. [ tag1, '#tag2, tag3' ], possibly split on multiple lines.)

As for the body, your regex also doesn't handle comma-separated tags or leading spaces.

As I've said previously, I consider tag deletion to be something that requires manual attention for all but the simplest and most regular of cases. I find it hard to come up with an algorithm that could correctly delete a tag from all of my notes, let alone anyone else's.
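As a rough illustration of the YAML point, the sketch below runs the YAML-oriented expression from the earlier script against the front matter example above, one tag at a time. The expression is adapted only by swapping the `!` delimiter for `|`. The helper name `strip_yaml_tag` is hypothetical.

```shell
#!/bin/sh
# Front matter copied from the example above.
frontmatter=$(printf '%s\n' '---' 'Tag:' '  - a/b' '  - "c, d/e"' "  - '#f'" '---')

# Third expression from the earlier script, applied for a single tag
# (-E because it uses an unescaped group and +).
strip_yaml_tag() {
  printf '%s\n' "$frontmatter" |
    sed -E "/^[Tt]ags:[ ]/ s|([[:space:]\[]+)$1|\1|g"
}

for tag in a/b c d/e f; do
  if [ "$(strip_yaml_tag "$tag")" = "$frontmatter" ]; then
    echo "$tag: not removed"
  else
    echo "$tag: front matter changed"
  fi
done
```

All four tags are reported as not removed: the address only matches lines that begin with `tags: ` or `Tags: `, while this front matter uses the key `Tag:` and puts each value on its own list line.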

@wealthychef1

Thanks, yes, good points on the tag arrays. However, there are sed programs that will neatly handle those as well, using lesser-known sed commands that manipulate the pattern space to handle multiple lines at a time. But it is no doubt tricky.

But I keep wondering out loud: haven't you already solved the same problem, that of identifying the various tag possibilities, when changing the name? I would think that deletion is the most trivial of the "change this string" operations once the tag string identifier is located in the right markdown file through some plugin magic that you are already using to rename.

@pjeby
Owner

pjeby commented Nov 3, 2022

Removing the tag is trivial - removing it without breaking what's around the tag is the problem, especially in YAML but also in the note text. A trivial example: "It's important to note that #xyz, #abc, and #def are the tags in this note" -- see what your regexes do to that line of text. Now consider what happens when they are in a YAML list (either single or multi-line) and watch the syntax break.
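To make the body-text case concrete, here is a small sketch that applies the first two expressions from the earlier script to that sentence. The delimiter is changed from `!` to `|`, and the helper name `strip_body_tag` is hypothetical.

```shell
#!/bin/sh
line="It's important to note that #xyz, #abc, and #def are the tags in this note"

# First two expressions from the earlier script, applied for a single tag.
strip_body_tag() {
  printf '%s\n' "$line" |
    sed -e "s|#$1[[:space:]]||g" -e "s|#$1\$||g"
}

strip_body_tag def   # tag is removed, but the sentence is left broken
strip_body_tag xyz   # followed by a comma, so nothing matches at all
```

For `def` the result reads "... #xyz, #abc, and are the tags in this note". For `xyz` the line comes back unchanged, because the tag is followed by a comma rather than whitespace or end of line.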

> But it is no doubt tricky.

No, it's actually not possible with sed - regular expressions and even sed constructs are insufficient to correctly parse YAML, and nothing less than a full YAML parser will do. (There are multiple Obsidian plugins that attempt to manipulate YAML using regular expressions, and without exception they either fail to read some forms of perfectly valid YAML or else they write out broken YAML in its place when they make a change.)

A simple example (from issue #13):

---
Quality: &a C/2
RLEVEL: &b
Utility: &c
Field: &d
Type: &e ⚪/Figure
tags: [*a, *b, *c, *d, *e]
---

This is an example YAML snippet from an actual user of this plugin, as presented in issue #13. And it's not the craziest YAML I've seen, not by a long shot.
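A short sketch of why text matching struggles with this snippet. It assumes, per standard YAML alias semantics, that `*a` in the tag list resolves to the anchored value `C/2`. The variable names are hypothetical.

```shell
#!/bin/sh
# Front matter copied from the example above.
frontmatter=$(printf '%s\n' '---' 'Quality: &a C/2' 'RLEVEL: &b' 'Utility: &c' \
  'Field: &d' 'Type: &e ⚪/Figure' 'tags: [*a, *b, *c, *d, *e]' '---')

# The tags line holds only aliases, so searching it for the tag text finds nothing.
# (grep -c prints 0 but exits non-zero when there is no match, hence "|| true".)
tag_line_hits=$(printf '%s\n' "$frontmatter" | grep '^tags:' | grep -c 'C/2' || true)
echo "matches on the tags line: $tag_line_hits"

# The text itself lives on the line that defines the anchor, under another key.
anchor_line=$(printf '%s\n' "$frontmatter" | grep -n 'C/2')
echo "defined at: $anchor_line"
```

Deleting the matched text would edit the `Quality` field, not the tag list. Taking the tag out of the list means removing the `*a` alias instead, which a search for the tag name never sees.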

@Heniker

Heniker commented Nov 3, 2022

YAML is actually pretty hard to read & replace even with a parser if you care about preserving user formatting.

@wealthychef1

This comment was marked as off-topic.

@wealthychef1

> YAML is actually pretty hard to read & replace even with a parser if you care about preserving user formatting.

I find it's even harder in the body. But I think my script above does a good job. I'd love to hear any comments on it.

@pjeby
Owner

pjeby commented Nov 3, 2022

> so your example does not have any valid tags in it

That's not true. It has two valid tags in it, which I will leave you as an exercise for understanding how YAML works.

> The following script would work on your example well.

No, it has two immediately obvious bugs, one of which indicates you don't understand YAML, the other that you didn't pay attention to what I said above about how Obsidian processes YAML tag field names.

And that's not counting "-" lists, or various other YAML features you've ignored (such as "|" strings, or flow lists). I also didn't thoroughly inspect your code for any other bugs it might have -- there were just two that jumped out at me at a glance, so I haven't bothered to look through it in any further detail.

> I find it's even harder in the body.

Yes, which is another big reason why Tag Wrangler doesn't support it. The entire point of this discussion is that Tag Wrangler cannot handle all the edge cases (such as inline tag discussion in the text body) and handling the edge cases correctly in the metadata is just as difficult, if not more so.

That's why, if I ever add this feature, it will be something interactive that simply takes you to each instance in turn so you can remove it by hand. No script is going to be able to handle deleting #def from "note that #xyz, #abc, and #def are tags in this note" without mangling the sentence structure - i.e. "note that #xyz, #abc, and are tags".
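Until something interactive exists, a rough command-line approximation of "visit each instance by hand" is to list every inline occurrence with its file and line number and then edit each one manually. This is only a sketch under stated assumptions: the sample vault and the helper name `list_tag` are hypothetical, `grep -r` is assumed to be available (GNU and BSD grep both have it), and front matter tags written without `#` are not covered.

```shell
#!/bin/sh
# Hypothetical sample vault so the sketch is safe to run.
vault=$(mktemp -d)
printf '%s\n' 'Note that #xyz, #abc, and #def are tags in this note.' \
  'Unrelated line about #default values.' > "$vault/a.md"
printf '%s\n' 'Nothing tagged here.' '#def' > "$vault/b.md"

# Print file:line:text for each inline occurrence of one tag. The trailing
# group keeps #def from matching longer names such as #default or #def/sub.
list_tag() {
  grep -rnE "#$1([^[:alnum:]_/-]|\$)" "$vault" | sort
}

list_tag def
```

Nothing is modified. The output is a worklist of two locations, and the `#default` line is not included.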

Repository owner locked and limited conversation to collaborators Nov 3, 2022
Repository owner unlocked this conversation May 16, 2023