V9: clean wikitext replace for bot-plugins #411

Open
Cobertos opened this issue Dec 21, 2020 · 6 comments

Comments

@Cobertos

Cobertos commented Dec 21, 2020

Is there any way to get the original wikitext for a specific model? Not converting it back like wtf-plugin-wikitext does, but like a startIndex, endIndex into the originally parsed wikitext string?

I want to be able to convert back to wikitext while keeping the originally desired user formatting, just changing the text in one or two rows of a table.

Almost something like

const [brandTable] = tldDoc.sections('Brand top-level domains').tables();
const newRows = brandTable.data
  .map(r => {
    // Assuming cells are Sentence objects (as the assignment below implies), compare on their text
    if (r.Name.text() === '.somebrand') {
      r.Name = new Sentence('');
      return r.wikitext(); // It's a table _row_ so it doesn't really have that, but I can manage on my end
    }
    else {
      return r.originalWikiText(); // Hypothetical: just returns the original wikitext for that row
    }
  })
  .join('\n');

...where r.originalWikiText() would basically just return the original wikitext for that row.

Is this even possible? Could the parser decorate the models with this information? Is this something that would be valuable in a PR? Trying to write a Wikipedia bot that updates a table...
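To make the offsets idea concrete, here is a minimal sketch in plain JavaScript. It assumes each row could report a start and end index into the original string. `rowSpan` and `replaceSpans` are hypothetical helpers for illustration, not part of wtf_wikipedia, and the offsets are found here with a naive `indexOf` rather than recorded by a parser.

```javascript
// Sketch only: `rowSpan` and `replaceSpans` are hypothetical helpers,
// not part of wtf_wikipedia.
const wikitext = [
  '{| class="wikitable"',
  '! Name !! Entity',
  '|-',
  '| .somebrand   ||  Some Brand Inc.',
  '|-',
  '| .other || Other Co.',
  '|}',
].join('\n');

// Stand-in for parser-recorded offsets: locate the row's source text.
function rowSpan(text, rowSource) {
  const start = text.indexOf(rowSource);
  if (start === -1) return null;
  return { start, end: start + rowSource.length };
}

// Apply edits from the end of the string backwards so that
// earlier offsets stay valid while later spans change length.
function replaceSpans(text, edits) {
  const sorted = [...edits].sort((a, b) => b.start - a.start);
  let out = text;
  for (const { start, end, replacement } of sorted) {
    out = out.slice(0, start) + replacement + out.slice(end);
  }
  return out;
}

const span = rowSpan(wikitext, '| .somebrand   ||  Some Brand Inc.');
const updated = replaceSpans(wikitext, [
  { ...span, replacement: '|   ||  Some Brand Inc.' },
]);
```

Only the targeted row changes; every other line, including its spacing, is carried over byte for byte, which is what keeps the resulting diff small.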

@spencermountain
Owner

hey Samantha, this is a great question. I'd like this feature too.
Let me think on this a bit - it's not possible right now.

Also - I've been waiting for someone to use this library as a wikipedia bot. It makes perfect sense. It would be beautiful if we could make simple edits to a template, then inject it back into the page safely - with a minimal diff. In addition to offsets, are there other features that would make bot-authorship easier? I know it's very hard to do right now.

I'm in the middle of some breaking changes on dev, so now is a good time to shoehorn this idea into the library somehow.
cheers

@Cobertos
Author

Cobertos commented Dec 21, 2020

Good to know, thanks for the wonderful library btw

Offsets would be a good foundation. Imo, two non-enumerable properties on each model would be enough. Kind of like .data, maybe just .index and .length? Core might be able to use them too, replacing the .wiki that I see passed around. The rest could be implemented as a plugin.

I would also need to implement something like:

  • originalWikitext() helper to return the string slice from the original wikitext based on the offsets. Or similar name, originalWikitext feels wordy. Something where I don't have to manually slice all the time, it'd make the code more readable.
  • wtf-plugin-wikitext being able to .wikitext() just a single row (say, for adding a new row). I figured I'd just pull the code out of table.js into my local file to make it work, cause you can only wikitext() a full table rn. It would also help to implement any bot-specific wikitext styling on my end if wikitext formatting for rows and other more granular items were exposed.

But the bullets are much smaller/easier to implement on my end too if the offsets are there. That seemed to be the big foundational change that I'd need.
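A rough sketch of what the two non-enumerable properties plus a slice helper could look like, using a plain object as a stand-in for a model. `attachOffsets` is a hypothetical name, and nothing here reflects how wtf_wikipedia actually stores its data.

```javascript
// Sketch only: `attachOffsets` is hypothetical and the model is a plain object.
function attachOffsets(model, wiki, index, length) {
  Object.defineProperties(model, {
    // Non-enumerable, so they stay out of Object.keys() and JSON output.
    index: { value: index, enumerable: false },
    length: { value: length, enumerable: false },
    originalWikitext: {
      value() {
        // Slice the untouched source string using the stored offsets.
        return wiki.slice(this.index, this.index + this.length);
      },
      enumerable: false,
    },
  });
  return model;
}

const sourceWiki = 'Intro.\n| a  ||  b\nOutro.';
const rowSource = '| a  ||  b';
const row = attachOffsets(
  { cells: ['a', 'b'] },
  sourceWiki,
  sourceWiki.indexOf(rowSource),
  rowSource.length
);
```

With this shape, `row.originalWikitext()` returns the row exactly as the user wrote it, and serializing the model is unaffected by the extra properties.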

@spencermountain
Owner

yep - amazing. PR welcome.

You're right that the wikitext is getting tossed-around to each class, and accessing that would be great.
I'm concerned that there may be situations where the wikitext is 'dirty' though, or has been transformed by something else already, and won't map back (either by match or offset) very well.
We can deal with that, if you run into it. Right now the cascading order-of-parsers is making this sort of thing harder than it should be.

Feel-free to change things - especially on plugin-wikitext - which was made on a lark, but could be really useful for cases like this.

(plz branch off of dev - it's green right now, but I may mess-around with some of the templates over the holidays.)

Let me know if I can help!
if the final goal is to make clean wikitext diffs, maybe it belongs in a wikitext-edit or plugin-bot or something like that.
cheers

@Cobertos
Author

Cobertos commented Dec 24, 2020

Okay, don't have a shit ton of time but I'll start taking a stab at trying to add the indexes https://github.com/Cobertos/wtf_wikipedia

Noting my findings

  • new Document() will preProcess the wikitext before searching, which means that all indexes back into the wikitext will either have to refer to the preprocessed wikitext or preprocess will have to be removed.
  • Going to try just removing preProcess for now to test feasibility (I need self-closing refs for the page I want to edit as well)
  • That won't work, breaks a ton of tests off the bat, but it looks like mostly just because they use text length (might be able to be updated)
  • index would clash with .index() in Section, gonna mess with wikiStart and wikiLength for now
  • Already a pain with Section, have to manually find each match instead of split()ing, but it's sort of working for that model so far.
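As an illustration of finding each match by hand instead of split()ing, here is a small standalone sketch that records a wikiStart and wikiLength for each section. The heading regex is a simplification written for this example, not the one wtf_wikipedia uses, and `sectionSpans` is a hypothetical name.

```javascript
// Sketch only: simplified heading regex, hypothetical `sectionSpans` helper.
function sectionSpans(wiki) {
  // Matches lines like "== Title ==" up to "====== Title ======".
  const heading = /^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$/gm;
  const matches = [...wiki.matchAll(heading)];
  const spans = [];

  // Anything before the first heading is the untitled lead section.
  const firstStart = matches.length > 0 ? matches[0].index : wiki.length;
  if (firstStart > 0) {
    spans.push({ title: '', wikiStart: 0, wikiLength: firstStart });
  }

  // Each section runs from its heading to the next heading (or end of text).
  matches.forEach((m, i) => {
    const end = i + 1 < matches.length ? matches[i + 1].index : wiki.length;
    spans.push({ title: m[2], wikiStart: m.index, wikiLength: end - m.index });
  });
  return spans;
}

const page = 'Intro text.\n== History ==\nOld stuff.\n=== Early ===\nMore.\n== Usage ==\nHow.';
const spans = sectionSpans(page);
```

Unlike split(), `matchAll` keeps the index of every heading, so the spans tile the original string exactly. These offsets only hold against the string they were computed on, so they would refer to the preprocessed wikitext if preProcess has already run.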

@spencermountain
Owner

yep - I've been thinking about this too.
Let me take a crack at it, i've got some time over the holidays.
I don't have any clear ideas, but maybe it will help to just try a few things.
will let you know how it goes

@spencermountain changed the title from "Original wikitext for a model?" to "V9: clean wikitext replace for bot-plugins" on Dec 29, 2020
@spencermountain
Owner

hey @Cobertos - I haven't got a solution to this, but haven't forgotten.

Plan is to do a fuzzy-match on the original wikitext, then allow replacing chunks of it - hard part is that the model isn't built-well for reproducing subsets of the document, like the table use-case you mentioned.
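One possible reading of "fuzzy-match, then replace a chunk", sketched under a strong assumption: the only fuzziness handled is whitespace. The thread does not say which matching strategy is planned, and `normalizeWithMap`, `fuzzyFind`, and `replaceChunk` are hypothetical names.

```javascript
// Sketch only: whitespace-insensitive matching; all helper names are hypothetical.

// Collapse whitespace runs to one space and remember, for each character
// of the normalized string, where it came from in the original text.
function normalizeWithMap(text) {
  let out = '';
  const map = [];
  let prevSpace = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (/\s/.test(ch)) {
      if (!prevSpace && out.length > 0) {
        out += ' ';
        map.push(i);
      }
      prevSpace = true;
    } else {
      out += ch;
      map.push(i);
      prevSpace = false;
    }
  }
  return { out, map };
}

// Find `needle` in `original` ignoring whitespace differences and
// return offsets into the original (non-normalized) string.
function fuzzyFind(original, needle) {
  const { out, map } = normalizeWithMap(original);
  const target = normalizeWithMap(needle).out.trim();
  if (target.length === 0) return null;
  const at = out.indexOf(target);
  if (at === -1) return null;
  return { start: map[at], end: map[at + target.length - 1] + 1 };
}

// Swap the matched chunk, leaving everything else untouched.
function replaceChunk(original, needle, replacement) {
  const span = fuzzyFind(original, needle);
  if (span === null) return original;
  return original.slice(0, span.start) + replacement + original.slice(span.end);
}
```

For example, `replaceChunk(text, '| .somebrand || Some Brand', '| .newbrand || New')` would find the row even if the page uses different spacing around `||`. A real implementation would likely need to tolerate more than whitespace, such as text already transformed by earlier parsing steps.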

I will almost-certainly be over-thinking this over the winter, but should scale-back my promises for it.
cheers
