Collecting analytics for #active_questions #660

Open
surajkumar opened this issue Oct 25, 2022 · 18 comments

@surajkumar
Contributor

Recently, a question was asked by a member of the server, along the lines of "How long does it take for a question to get answered?". This information is unknown.

We could gather analytics based on the data in #active_questions.

Data such as:

  • Time from when a question is opened to the time a question is closed.
  • The time it takes for somebody to respond to a question
  • The number of questions asked per day
  • The number of questions asked by a specific Discord user
  • The category questions are asked under

.. and any more that you can think of ..

We can then calculate averages based on this information to answer questions like "on average, how long does it take for a question to be answered?".
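
As a rough illustration of the averaging step, here is a minimal Java sketch. It assumes each question has already been reduced to an open and a close timestamp; `QuestionRecord` and `AnswerTimeStats` are hypothetical names, not existing bot classes.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Optional;

// Hypothetical record: one entry per question; closedAt is null while the question is still open.
record QuestionRecord(Instant openedAt, Instant closedAt) {}

final class AnswerTimeStats {
    private AnswerTimeStats() {}

    // Average open-to-close duration over closed questions only.
    // Returns an empty Optional if no question has been closed yet.
    static Optional<Duration> averageTimeToClose(List<QuestionRecord> questions) {
        long closedCount = 0;
        long totalSeconds = 0;
        for (QuestionRecord question : questions) {
            if (question.closedAt() == null) {
                continue; // still open, so it has no duration yet
            }
            totalSeconds += Duration.between(question.openedAt(), question.closedAt()).getSeconds();
            closedCount++;
        }
        return closedCount == 0
                ? Optional.empty()
                : Optional.of(Duration.ofSeconds(totalSeconds / closedCount));
    }
}
```

Open questions are skipped rather than counted as zero, so they do not pull the average down.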

The information could be displayed in a dedicated analytics channel and, for marketing purposes, on a website (or as a dynamic image to throw onto advert boards, should you wish).

To get the initial dataset, a quick scrub should take no time at all.
Going forward, it might require a lot more effort to keep the data up-to-date after the initial scrub.

Opinions, please.

@surajkumar surajkumar added the enhancement New feature or request label Oct 25, 2022
@Tais993
Member

Tais993 commented Oct 25, 2022

Something to take a look at: Prometheus.
Is it possible to have this run in the bot itself, maybe exporting to a database? Gotta figure this out.

Alternatively, we might need install scripts that install Prometheus, but I hope that's not required.

Or docker, or we disable stats if there's no active Prometheus instance.

Or a completely different approach: handling stats ourselves, so we wouldn't use a library but would create our own "mini-tool". This has to be thought through.

@Nxllpointer
Contributor

I really like the idea of this feature!

@surajkumar
Contributor Author

Something to take a look at: Prometheus. Is it possible to have this run in the bot itself, maybe exporting to a database? Gotta figure this out.

Alternatively, we might need install scripts that install Prometheus, but I hope that's not required.

I was planning on having the bot do all the work, starting with an initial scrub that saves everything to a database. The scrub would use JDA to get all the message history from the #active_questions channel and rip out everything we need.

In JDA this would be something like:

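// Assumes imports: java.time.OffsetDateTime, java.util.List, net.dv8tion.jda.api.entities.Message
channel.getThreadChannels().forEach(thread -> {
  long id = thread.getIdLong();
  String name = thread.getName();
  OffsetDateTime createdAt = thread.getTimeCreated();
  // getRetrievedHistory() stays empty until messages are fetched, so retrieve them first (at most 100 per call)
  List<Message> messages = thread.getHistory().retrievePast(100).complete();
  database.save(...);
});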

Then, moving forward, whenever a new thread is created (or /ask is invoked), add a new entry to the database. When /help-thread-close is invoked, update a status column.

We can calculate what we want to see from there.
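
The ongoing collection described above could look roughly like the following Java sketch. It is only an illustration: `HelpThreadTracker` and its methods are hypothetical names, and an in-memory map stands in for the database table so the example stays self-contained.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the database table discussed above.
final class HelpThreadTracker {
    enum Status { OPEN, CLOSED }

    // One row per help thread; closedAt is null while the thread is open.
    record Entry(long threadId, String title, Instant createdAt, Status status, Instant closedAt) {}

    private final Map<Long, Entry> entries = new HashMap<>();

    // Would be called when a thread is created or /ask is invoked.
    void onThreadCreated(long threadId, String title, Instant createdAt) {
        entries.putIfAbsent(threadId, new Entry(threadId, title, createdAt, Status.OPEN, null));
    }

    // Would be called when /help-thread-close is invoked.
    // Returns false if the thread is unknown or was already closed.
    boolean onThreadClosed(long threadId, Instant closedAt) {
        Entry existing = entries.get(threadId);
        if (existing == null || existing.status() == Status.CLOSED) {
            return false;
        }
        entries.put(threadId,
                new Entry(threadId, existing.title(), existing.createdAt(), Status.CLOSED, closedAt));
        return true;
    }

    long countWithStatus(Status status) {
        return entries.values().stream().filter(entry -> entry.status() == status).count();
    }
}
```

In the bot, the two `on...` methods would be wired to the relevant slash commands or thread events; how that wiring looks depends on the existing command framework.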

@Tais993
Member

Tais993 commented Oct 25, 2022

Well yes, and Prometheus would be the database in my case. If you'd install Grafana you'd have amazing graphical diagrams and whatever else you can think of related to data.

(Prometheus doesn't guess that you want to know how many threads are open, etc. You still have to tell it what to track, and it would then show you how many threads get opened over time, and more.)

@surajkumar
Contributor Author

Thought about this some more. I'm not sure there is any point in having a database. We could either scrub the #active_questions channel periodically (e.g. once a day) and still get the result we want, or invoke the scrub manually with a command. This would make the solution self-sufficient and portable.

@surajkumar
Contributor Author

surajkumar commented Oct 29, 2022

I don't think it's possible to get archived thread channels using JDA. Tried literally everything.
The only alternative would be to start collecting data moving forward, but we won't have any information from the previous years and it could take ages for future data to be of any use.

@Nxllpointer
Contributor

I mean, a month's worth of info also says a lot.

@surajkumar
Contributor Author

I mean, a month's worth of info also says a lot.

Complexity still rises, as we would have to hook into all the relevant slash commands (e.g. /ask and /help-thread-close), monitor onMessageReceived events, introduce a new table (and set up a DB if one isn't present), and handle anything else needed for the data collection part. That comes on top of the existing task of figuring out how we are going to display the information and do the calculations we need.

Lots of testing will then need to be done so that the changes to existing functionality do not break.

Unless somebody is volunteering to do the work, I would suggest closing this with the reason of it being too big of a job.

@Nxllpointer
Contributor

Hooking the slash commands should not be an issue since the BotCore calls the events somewhere. I would not close this yet.

@surajkumar
Contributor Author

Feel free to look into data collection.

@Zabuzard
Member

I would suggest closing this with the reason of it being too big of a job

then just reduce the scope. you guys are planning this task to be way too complex. step back and do something simple.

we already have a database collection meant for stats, the help-threads table. just add a few columns to it and start collecting.
first, figure out which metrics are actually interesting and helpful to know. then figure out how to collect them and then plan accordingly.

for example, let's say we want to know how many channels are closed with a RED-activity indicator, then we can simply add two columns to the database table:

  • current activity indicator
  • closed yes/no

and based on that we can already retrieve the metric. for starters, we could simply have a slash command /help-metrics which just outputs a simple list:

  • help-threads closed with little activity = 12

don't overcomplicate. prometheus, grafana and all that shit is really cool, but way too complex for the first steps. first figure out what metrics are needed and then build a small PoC as explained. we can always improve afterwards and make graphs out of it.
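
A small PoC along those lines might look like the Java sketch below. It assumes the two proposed columns are already available as row values; `HelpMetrics`, `HelpThreadRow` and the `GREEN`/`YELLOW` levels are hypothetical (only RED is mentioned in this thread), and the real rows would come from the help-threads table.

```java
import java.util.List;

// Hypothetical sketch of the metric behind a /help-metrics command.
final class HelpMetrics {
    // Assumption: only RED is named in the discussion; the other levels are placeholders.
    enum Activity { GREEN, YELLOW, RED }

    // Mirrors the two proposed columns: current activity indicator and closed yes/no.
    record HelpThreadRow(long channelId, Activity activity, boolean closed) {}

    private HelpMetrics() {}

    // Counts threads that were closed while their indicator was RED.
    static long closedWithLowActivity(List<HelpThreadRow> rows) {
        return rows.stream()
                .filter(HelpThreadRow::closed)
                .filter(row -> row.activity() == Activity.RED)
                .count();
    }

    // Renders the one-line list entry the command would reply with.
    static String render(List<HelpThreadRow> rows) {
        return "help-threads closed with little activity = " + closedWithLowActivity(rows);
    }
}
```

Further metrics can be added as extra static methods and appended to the rendered list without touching the collection side.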

@Tais993
Member

Tais993 commented Nov 2, 2022

I don't think it's possible to get archived thread channels using JDA. Tried literally everything. The only alternative would be to start collecting data moving forward, but we won't have any information from the previous years and it could take ages for future data to be of any use.

Go to the javadoc and look up "retrievearchive"

You can retrieve archived channels

@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label, comment or add the valid label or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Dec 26, 2022
@marko-radosavljevic marko-radosavljevic added valid This issue/PR is validated and ready to be picked. This auto adds items to TJ project board. and removed stale labels Dec 26, 2022
@derrykid

derrykid commented Mar 7, 2023

I agree with @Zabuzard

just reduce the scope. you guys are planning this task to be way too complex. step back and do something simple.

I think the following columns might already be available in the database:

  • thread title
  • thread created time
  • thread close time

These 3 columns can help us create a simple model and answer the question: which type of question gets the quickest response on average?

We can build a simple text processing model, e.g. keyword extraction that counts the occurrences of each word in the thread titles.

You can expect something like:
"how to start learning java", "how to add new item to Arraylist?" get the lowest response time.

Based on this simple model, if the question is very common, we can reply with something like:

The question regarding arraylist will usually be answered in 5 mins

For an unseen or difficult question, one the model has never encountered, we can give a response like:

This type of question is a sparse case; it might take more than 30 mins to receive a response. Please wait patiently.
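
To make the idea concrete, here is a minimal Java sketch of such a word-based model. It is an assumption-laden illustration, not a proposal for the final design: `TitleResponseModel` is a hypothetical class, there is no stop-word filtering, and an estimate is just the mean of the per-word averages for the words the model has seen before.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalDouble;
import java.util.stream.Collectors;

// Hypothetical word-count model over thread titles.
final class TitleResponseModel {
    // word -> [sum of response minutes, number of titles containing the word]
    private final Map<String, double[]> statsByWord = new HashMap<>();

    // Records one past thread: its title and how long the first response took.
    void observe(String title, double responseMinutes) {
        for (String word : tokenize(title)) {
            double[] stats = statsByWord.computeIfAbsent(word, w -> new double[2]);
            stats[0] += responseMinutes;
            stats[1] += 1;
        }
    }

    // Mean of the per-word averages for known words; empty for a "sparse case" with no known word.
    OptionalDouble estimateMinutes(String title) {
        return tokenize(title).stream()
                .filter(statsByWord::containsKey)
                .mapToDouble(word -> statsByWord.get(word)[0] / statsByWord.get(word)[1])
                .average();
    }

    // Lowercases the title and splits it into distinct alphanumeric words.
    private static List<String> tokenize(String title) {
        return Arrays.stream(title.toLowerCase(Locale.ROOT).split("[^a-z0-9]+"))
                .filter(word -> !word.isEmpty())
                .distinct()
                .collect(Collectors.toList());
    }
}
```

An empty result from `estimateMinutes` maps to the "sparse case" reply, while a present value can be rounded into a message like the ArrayList example. Common words such as "how" or "to" would dominate in practice, so a real version would likely need a stop-word list.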

@marko-radosavljevic marko-radosavljevic pinned this issue May 8, 2023
@ankitsmt211
Member

I'd like to work on this one if we can reduce the scope. How about just "Tickets open" and "Tickets close" for a start, and probably some stats on which category each belongs to? I'm sure we can extend it to include more stats later.
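
A reduced scope like that could be sketched in Java as follows. The names `TicketStats` and `Ticket` are hypothetical, and the tickets are assumed to be loaded from wherever the bot already stores help-thread data.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch of the reduced-scope stats: open/closed counts plus a per-category breakdown.
final class TicketStats {
    // One help thread ("ticket") with its category and whether it has been closed.
    record Ticket(String category, boolean closed) {}

    private TicketStats() {}

    static long countOpen(List<Ticket> tickets) {
        return tickets.stream().filter(ticket -> !ticket.closed()).count();
    }

    static long countClosed(List<Ticket> tickets) {
        return tickets.stream().filter(Ticket::closed).count();
    }

    // Number of tickets per category, sorted by category name for stable output.
    static Map<String, Long> countPerCategory(List<Ticket> tickets) {
        return tickets.stream()
                .collect(Collectors.groupingBy(Ticket::category, TreeMap::new, Collectors.counting()));
    }
}
```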

@Zabuzard
Member

Sure, go ahead :)

@ankitsmt211 ankitsmt211 self-assigned this Nov 4, 2023
This was referenced Dec 17, 2023
@tj-wazei tj-wazei unpinned this issue Mar 24, 2024
@ankitsmt211 ankitsmt211 removed the valid This issue/PR is validated and ready to be picked. This auto adds items to TJ project board. label May 18, 2024
@Bryce72

Bryce72 commented Jun 1, 2024

I would like to work on this. Have there been any updates, such as "Tickets open" and "Tickets close"?

@ankitsmt211
Member

@Bryce72 PR #990 should set up the base with metadata. I'll try to get it merged today.


8 participants