Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Daily notable summary - good idea? #1194

Closed
mhdawson opened this issue Mar 24, 2022 · 39 comments
Closed

Daily notable summary - good idea? #1194

mhdawson opened this issue Mar 24, 2022 · 39 comments

Comments

@mhdawson
Copy link
Member

I was talking with @joesepi about how it's hard to keep up with everything going on in the project. People are still being surprised by some relatively important PRs late in the process and sometimes even after they have landed, even though they follow all of our guidelines for waiting time, approvals etc.

Would it make any sense to try and tag PR/issues as notable and send out a daily summary? I suspect we could build a GitHub action to run a query and send out an email. I'm thinking an email as I don't think GitHub actions can be configured to call out something specific like that in a way that would be visible (at least I have no idea how to do that).

The existing notable-change label would not be the right thing as it may be added later on if something should be called out in the changelog (based on the description for it) versus a tag added up front.

The main ask on collaborators would be to add the notable label to appropriate PRs that they open or that they review.

@Trott
Copy link
Member

Trott commented Mar 24, 2022

If we do this, please don't use notable as it is guaranteed that people will (understandably) not know the difference between notable and notable-change. Let's set people up for success.

Who would get the email? TSC? (Then maybe tsc-alert?) People who subscribe? (email-digest maybe?)

I think a daily email is going to be something that people will ignore but I'll leave that up to the people that would actually sign up for such a thing.

Before we do all this, though, what are some examples of the problem we're trying to solve? What are the important PRs you are referring to and who exactly was surprised by them?

@ljharb
Copy link
Member

ljharb commented Mar 25, 2022

I was very surprised by the test runner PR. It was only open a week, and clearly the maintainers of the well-known test runners in the ecosystem weren’t all asked for feedback (given that i'm one of them) - before the PR nor during it (some were, ofc).

For a notable PR to get attention, it has to sit open for months, not days.

@gireeshpunathil
Copy link
Member

definitely a great idea. I have signed up for multiple repos and many times I miss out important ones because sometimes I cannot open every notification mails and examine each PRs, so I opt-in based on the title that comes as the subject line of the mail. Often, this approach leads to failure - partly due to my poor judgement and partly the PR title does not reflect the content enough.

the challenge may be how to qualify notable PRs consistently. Some may be super-interesting for one and less for another?

@RaisinTen
Copy link
Contributor

People are still being surprised by some relatively important PRs late in the process and sometimes even after they have landed, even though they follow all of our guidelines for waiting time, approvals etc.

Is there anything wrong with using the notable-changes label for both of these sorts of PRs:

  • the ones that need to be highlighted on the changelog
  • the important ones referred to in this issue

The existing notable-change label would not be the right thing as it may be added later on if something should be called out in the changelog (based on the description for it) versus a tag added up front.

We could change the description to fit both cases.

Also, if there are PRs that are important but they don't deserve to be highlighted on the changelog, we can introduce a separate label for them, perhaps named "changelog-highlights"?

It was only open a week, and clearly the maintainers of the well-known test runners in the ecosystem weren’t all asked for feedback (given that i'm one of them) - before the PR nor during it (some were, ofc).

Collaborators do have the ability to block (clicking on "request changes") a certain PR if they want someone else to take a look before it lands.

@ljharb
Copy link
Member

ljharb commented Mar 25, 2022

@RaisinTen yes, but I’m not a collaborator, and either way, ecosystem experts probably shouldn’t have to be a collaborator to be consulted prior to landing something in their domain.

@RaisinTen
Copy link
Contributor

@ljharb in that case, I think you can post a comment asking a collaborator to block the PR with the reasoning behind it?

@ljharb
Copy link
Member

ljharb commented Mar 25, 2022

First, that’s pretty aggressive to ask to block a PR solely because i want time to review it, and second, the PR landed before i was aware if it.

Dramatically slowing down velocity is the only way to increase the chances of notification and feedback. You can choose to prefer velocity, but the tradeoff is worsening those, and vice versa.

@RaisinTen
Copy link
Contributor

that’s pretty aggressive to ask to block a PR solely because i want time to review it

Isn't that the main ask here - that we want to make sure that a PR, that people think to be important, doesn't land until some folks get time to review it?

@ljharb
Copy link
Member

ljharb commented Mar 25, 2022

Yes, but I’m also suggesting that talking to domain experts should happen before a single line of code is written - because the api design might change drastically given the right information.

@aduh95
Copy link
Contributor

aduh95 commented Mar 25, 2022

The conversation is getting a bit off rails, discussion regarding changing the minimal wait time, or the review threshold for a PR to land should go on its own thread.


I'm not convinced that daily notable summary suggestion would solve the original issue (some PRs would still slip through that notable filter, and we would be in the same situation as today), and would give us more triaging work. I'm not opposed to it if other folks are supportive, count me as -0.

@bnb
Copy link
Contributor

bnb commented Mar 29, 2022

heh, funnily enough I decided to start working on something literally exactly like this recently.

I used to go through the repository every week on Monday and read through all the activity and put them into a blog post, but haven't had the ability to do that as often recently. I worked a bit ago on encites for my team at Microsoft to report what we're doing in Open Source, and figured the concepts wouldn't be too hard to transfer over to filtering data from an org.

It turns out there's a lot of opacity in what GitHub counts as "updated" and it's a bit harder to filter just for that. I paired with a friend at GitHub and built out a GraphQL query that gets most public "activity" in the org, just need to convert it to Markdown (should be pretty easy) and I can share. My plan was to share this code and propose it run weekly and auto-generate a blog post that can go on nodejs.org, but I happened across this issue and figured I could share that I was working on this now. If folks do want to do a label, that's totally fine too - I can set issues/PRs/discussions with those labels apart in the markdown for easier visibility, too.

@mhdawson
Copy link
Member Author

I can set issues/PRs/discussions with those labels apart in the markdown for easier visibility, too.
+1 as I think that might make the list more manageable.

@mmarchini
Copy link
Contributor

I don't know how they do it, but LLVM (a much bigger project) has a weekly curated digest: https://llvmweekly.org/, which is handy if you want to follow developments in the project but can't keep an eye on everything that is happening. Something similar in design (where users can subscribe to instead of just being available to TSC or collaborators) would be ideal IMO.

@tniessen
Copy link
Member

Years ago, I was thinking of an RSS feed or something similar that would highlight all semver-minor (and semver-major) changes. Of course, RSS feeds aren't exactly the most popular technology and semver-minor and semver-major PRs are not the only relevant activity.

There have been numerous cases in which I wished I'd had time to weigh in on a PR but either wasn't aware of it or didn't have time before it landed. Some PRs that make significant changes have been rushed, and others that make small but important technical changes are neglected for days or weeks.

@mhdawson
Copy link
Member Author

@bnb when do you think you will have something running that we can try out?

@bnb
Copy link
Contributor

bnb commented Apr 4, 2022

Here's a first pass at output. Might still be a few bugs, and I'd like to give more granularity on Discussions, but this should be the past week of activity categorized by repo, type, and status: https://gist.github.com/bnb/469da4b71ca5a1932dd77ef64d7ce6fa

I'll try to publish the module this week and get a cron running in a personal repo.

@WaleedAshraf
Copy link

As I see it, the issue is that people who have the interest and skills are not tagged on the relevant issue and PR. For example, @ljharb mentioned "Testing".

Would it be possible to create a list of all top collaborators with their interests and skills in specific areas. All collaborators agree on the listed members (with their interests/skills), and we update the issue template to always tag relevant members from the list.

@ljharb
Copy link
Member

ljharb commented Apr 5, 2022

Even more than that, though, is writing code before attempting to gather relevant expertise. Design takes time, and lots of consultation with others, and code should wait for design.

@RaisinTen
Copy link
Contributor

Would it be possible to create a list of all top collaborators with their interests and skills in specific areas.

We do already have a table for this - https://github.com/nodejs/node/blob/master/doc/contributing/collaborator-guide.md#who-to-cc-in-the-issue-tracker. I think someone should add a new row for the newly added @nodejs/test_runner team.

we update the issue template to always tag relevant members from the list.

Sometimes collaborators have to manually ping the teams and sometimes @nodejs-github-bot does it automatically but I don't know why the bot doesn't handle all cases.

@Trott
Copy link
Member

Trott commented Apr 5, 2022

Even more than that, though, is writing code before attempting to gather relevant expertise. Design takes time, and lots of consultation with others, and code should wait for design.

@ljharb The issue proposing and discussing the test runner module was open for months before a pull request adding the module was opened. It can certainly be debated how long an issue like that should be open, what sufficient efforts to gather feedback look like, and how to know when sufficient feedback has been attained. But stating that code gets written "before attempting to gather relevant expertise" is contrary to fact and a bit inflammatory.

@ljharb
Copy link
Member

ljharb commented Apr 5, 2022

That’s a fair counter, thanks for clarifying.

@asb
Copy link

asb commented Apr 5, 2022

I don't know how they do it, but LLVM (a much bigger project) has a weekly curated digest: https://llvmweekly.org/, which is handy if you want to follow developments in the project but can't keep an eye on everything that is happening. Something similar in design (where users can subscribe to instead of just being available to TSC or collaborators) would be ideal IMO.

Hi - one of my colleagues pointed me to this thread. I'm the author/editor of LLVM Weekly so thought I'd share a little bit about how I do it, in case it's helpful.

  • In terms of how it started - I was doing quite a bit of commit and mailing list reading in order to keep up with developments for my downstream work, and figured I could provide a useful service without excessive additional investment of my own time - so I went ahead and gave it a go.
  • Typically I write it on a Sunday, and it always gets sent on a Monday. It's quite a time commitment (varying between 2-4 hours a week)
  • I do invite people to highlight blog posts / commits / mailing list threads of particular interest. This very rarely happens (though is always welcome!).
  • Every week I skim through every commit, and make an editorial decision about which ones are notable enough to mention. I use tig for this (and have a custom key-binding to stick an appropriate markdown-formatted link in my clipboard). I write the commit summaries as I go.
  • For the mailing list / Discourse section, I skim through every updated thread over the past week, first I identify the threads I'd like to mention, then go through and write them up. Depending on how much summarisation is done, this could take a lot of time. Typically I don't do much other than point to interesting threads / responses. But sometimes I will dive in and try to summarise a more complex debate, linking to various people's position. That kind of summarisation is something that could be outsourced, though I do think that the LLVM community would be better served overall if RFC authors more frequently took on responsibility of summarising responses to their threads (I can then copy / link to this, but I think it would be good practice independent of LLVM Weekly).
  • I don't trawl through patches that are in review as this would be too time consuming. Major changes typically have a companion RFC thread that will get picked up anyway. In response to concerns about some patches getting stuck I did trial an LLVM Weekly "Review Corner" where I'd invite people to submit patches that were stuck in review. This got a bit of engagement, but not enough for it to be worth continuing with.
  • It's become a bit of a thing for me that LLVM Weekly goes out every Monday without fail (haven't missed one yet in over 8 years). It's obviously easy for things to slip and newsletters to start becoming very irregular. I'm sure there's a happy middle ground that allows for the odd week off!
  • In terms of mailing list subscriptions and distribution - I've got a horribly hacky python script to handle subscriptions with mailgun (using double opt-in).

Note: By no means is the only way to run a newsletter, it's just what I've done. In particular, I suspect a more collaborative approach with multiple contributors could work. Though I think something like this is always going to get written close to the ship date, and someone would need to be prepared to take on the bulk of the work if necessary, if there haven't been many contributions.

I hope some of that is useful to you all - just let me know if you have any questions.

@WaleedAshraf
Copy link

I think that the blog post approach @bnb suggested wouldn't be much effective. I think it'll be too long that people will start ignoring it, OR if we make it smart and concise, it will be too short and people will ignore it as it'd not include issues they are interested in.

I think it'd be the same case for creating a new tag like notable-change. People will pay attention to issues tagged with it, but in the longer run (a year) it will be ignored. The reason is that we can't define hard and fast rules on what is a notable change and what isn't. It can also be subjective, as something notable for one person might not be for others.

Maybe the approach here should be to educate/guide people on how to effectively use GitHub Notifications. How to configure them. How to use/configure email notifications. Improve and build automation/bots logic within GitHub org. Good starting steps could be to ask top collaborators,

  • How do they have configured GitHub Notifications.
  • How/If they use email notifications.
  • How/If they have email rules configured or are using email aliases.
  • How do they make sure to not miss anything.
  • Improve the tagging logic for @nodejs-github-bot
  • Check if we can automate the logic to add new teams in "Who to CC list"

Gather this feedback and document it in the project in a common place.
Even better to conduct a session on "How to use GitHub notifications effectively". This can be done once/twice a year, or also at Collab Summit.

@bnb
Copy link
Contributor

bnb commented Apr 5, 2022

I think that the blog post approach @bnb suggested wouldn't be much effective. I think it'll be too long that people will start ignoring it, OR if we make it smart and concise, it will be too short and people will ignore it as it'd not include issues they are interested in.

I frankly disagree. The point isn't to get clicks and have people read through it, it's to provide an overview for those who want it. This is something that's for those who are active enough to want insight into everything, but might miss something. This is a quick supplemental reference that can help with that without adding an additional burden.

I think it'd be the same case for creating a new tag like notable-change. People will pay attention to issues tagged with it, but in the longer run (a year) it will be ignored. The reason is that we can't define hard and fast rules on what is a notable change and what isn't. It can also be subjective, as something notable for one person might not be for others.

We've tried labels in the past often, and they've almost never worked (see the non-agenda awareness labels for TSC and CommComm). Whether people don't apply them, or whether people don't read them, I've never seen them actually be effective as a surfacing tool.

Maybe the approach here should be to educate/guide people on how to effectively use GitHub Notifications

GitHub notifications do not work well at the scale that Node.js operates if you're active. I'd rather we invest in our own project rather than in GitHub.

Further, the rest of your suggestions add a massive content maintenance burden that is not obviously beneficial, given the context. The people asking for this both now and historically have usually been folks who are extremely active maintainers that still struggle with the onslaught of notifications.

@Trott
Copy link
Member

Trott commented Apr 6, 2022

I think the key thing to take from @asb's very helpful comment is: Doing this well requires at least one very dedicated person to put in a good chunk of time on a regular ongoing basis.

I suspect that anything we do that doesn't include that element will end up having all the same problems as GitHub notifications do for our project right now: Too much stuff a person doesn't care about making it inevitable that they'll miss the stuff they do care about.

For me, the question isn't whether a notable summary is a good idea. It's whether we have someone willing to put in sufficient time to make it actually useful. In the distant past, this is totally the type of thing @williamkapke would have done, probably without anyone even asking him to. I don't think we have anyone bringing quite that same kind of energy to these things right now, but if someone wants to just start doing the work, that would be amazing.

@mhdawson
Copy link
Member Author

mhdawson commented Apr 7, 2022

I'm kind of discouraged that we seem to quickly run into "it won't do any good" versus trying things out or coming up with an alternative. I understand that we need to balance asking people to do work (like adding a label) versus what value we'll get out of it.

I guess I asked the question if was a good idea or not, maybe I should have asked for those that would be interested instead of people who had concerns/objections.

While I agree that having somebody who could do it manually would be most effective I think making that the minimum bar for doing something means we won't do anything (I'd love to be wrong and if somebody does want to do it, I'd be super happ). Many thanks to @asb for the great overview and sharing an experience from another project

I'd be happy to see what @bnb's tool will generate by default. If not ideal it would still likely be something we could iterate to improve (for example with a label or something).

As a side note, I don't believe that this is the same use case as the other labels. In those cases people still had to run their own queries, go looking for issues tagged that way which I think is a significant contributor for why they did not help. I think push is a different model, if I'd received a daily email with the issues tagged with those labels I think I would have been much more likely to be aware/look at them.

@WaleedAshraf
Copy link

Maybe my suggestions felt like objections. I did raise my concerns with other approaches, but I didn't mean to discourage or object to anyone. If @bnb or anyone else has some ideas to try out, I'd be more than happy, regardless if it works or not. 🎉

@mhdawson
Copy link
Member Author

mhdawson commented Apr 8, 2022

And sorry for being a bit grumpy on this, just hoping we can try out something.

@bnb
Copy link
Contributor

bnb commented Apr 20, 2022

Published tooling here and created a template repository for automating content generation here. You can find a sample PR here.

@mhdawson
Copy link
Member Author

@bnb thanks for the update surprised to see that only 6 PRs were merged last week.

@bnb
Copy link
Contributor

bnb commented Apr 20, 2022

@mhdawson since that is a manual run, that report is for the past 7 days from today, meaning it's also including the weekend which may have some impact on that. It's intentionally set up to run relative to the current day and have the cron manage triggering automatically.

Also could absolutely be something from the query being off (org:nodejs updated:2022-04-13..2022-04-20 you can find the query in retrogen's source, where owner is nodejs and the start/end dates are 7 days ago and today, respectively), though as far as I can tell I'm getting that correct :)

@WaleedAshraf
Copy link

According to pulse, 33 PRs were merged in the last 7 days (13th to 20th). The number of open issues is also different.
The query is not returning everything? 🤔

https://github.com/nodejs/node/pulse#merged-pull-requests

@bnb
Copy link
Contributor

bnb commented Apr 20, 2022

happy to poke at it more, you can also find the GraphQL query in retrogen if you're interested in poking at it @WaleedAshraf. GitHub's GraphQL Explorer can be used to test it out, I'm just outputting what I'm getting from the API :)

@WaleedAshraf
Copy link

@bnb I did some testing in the Explorer, and it looks like it does return all the PRs.
In the source code, I don't see where are you fetching the next page?

@mhdawson
Copy link
Member Author

I think discussion has peetered out and I'm not going to push for now so closing.

@bnb
Copy link
Contributor

bnb commented Jul 21, 2022

FWIW my tool has been perpetually running and I've been merging updates. PRs welcome, and I'm also happy to move it into the org if that's helpful.

@WaleedAshraf
Copy link

@mhdawson I gave it a shot using the GitHub repo insights page.
https://github.com/WaleedAshraf/github-weekly-digest

@mhdawson
Copy link
Member Author

mhdawson commented Sep 2, 2022

@WaleedAshraf thanks, subscribed to check it out.

@WaleedAshraf
Copy link

WaleedAshraf commented Sep 7, 2022

@mhdawson Did it work for you? More importantly, does it help with keeping track issues/PRs?

If yes, I can reach out to more people to test it out. Happy to move it to Nodejs org if we want to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests