Harvesting document links from Slack #689
bill-anderson started this conversation in Ideas
We post a lot of useful links into Slack. Going back and finding them is difficult. I propose building a DDW table to allow better access to this resource.
Slack has an API, but for the purposes of this proposal I have worked with its export facility.
All data from all public channels can be exported in a single pull. Export instructions are available here: https://slack.com/intl/en-gb/help/articles/201658943-Export-your-workspace-data
The export is a zip file containing one folder per channel; each folder holds one JSON file per day.
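As a rough sketch of how that layout could be walked, the snippet below reads every per-day file straight from the zip. It assumes that each day file is named `<YYYY-MM-DD>.json` and holds a JSON array of message objects, and that any top-level JSON files in the archive should be skipped; `iter_messages` is a name made up for this example, not part of any Slack tooling.

```python
import json
import zipfile
from pathlib import PurePosixPath
from typing import Iterator, Tuple


def iter_messages(zip_path: str) -> Iterator[Tuple[str, str, dict]]:
    """Yield (channel, date, message) for each message in an export zip."""
    with zipfile.ZipFile(zip_path) as archive:
        for name in sorted(archive.namelist()):
            path = PurePosixPath(name)
            # Keep only <channel>/<day>.json entries. Directory entries and
            # top-level files have a different number of path parts.
            if len(path.parts) != 2 or path.suffix != ".json":
                continue
            channel, date = path.parts[0], path.stem
            with archive.open(name) as handle:
                # Assumption: each day file is a JSON array of messages.
                for message in json.load(handle):
                    yield channel, date, message
```

Taking the channel from the folder name and the date from the file name means neither value has to be looked up inside the message itself.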
Links that Slack has been able to read and render with an image and summary can be identified by an "attachments" array on the message. Each entry in this array contains:
"from_url" - the URL of the link (NB "/" is represented as "\/")
"title" - title of the article
"service_name" - the name of the publication or site
Links that Slack cannot render can be identified in the "blocks" array.
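A minimal sketch of both lookups follows. The sample message is invented for illustration. The nesting shown under "blocks" (a "rich_text" block whose elements include items of type "link" with a "url" key) is an assumption about the export format and should be checked against a real file. The helper names are hypothetical. Parsing with `json.loads` turns the escaped `\/` back into a plain `/`.

```python
import json
from typing import Dict, List

# Invented sample message; real exports carry many more keys.
RAW_MESSAGE = r'''
{
  "user_profile": {"real_name": "Example User"},
  "attachments": [
    {"from_url": "https:\/\/example.com\/article",
     "title": "An example article",
     "service_name": "Example Site"}
  ],
  "blocks": [
    {"type": "rich_text", "elements": [
      {"type": "rich_text_section", "elements": [
        {"type": "link", "url": "https:\/\/example.com\/article"},
        {"type": "text", "text": " and "},
        {"type": "link", "url": "https:\/\/example.org\/raw"}
      ]}
    ]}
  ]
}
'''


def rendered_links(message: dict) -> List[Dict[str, str]]:
    """Links Slack rendered, taken from the "attachments" array."""
    links = []
    for attachment in message.get("attachments", []):
        url = attachment.get("from_url")
        if url:
            links.append({
                "url": url,
                "title": attachment.get("title"),
                "publication": attachment.get("service_name"),
            })
    return links


def block_urls(message: dict) -> List[str]:
    """Every URL found on a "link" element anywhere under "blocks"."""
    found: List[str] = []

    def walk(node) -> None:
        if isinstance(node, dict):
            # Assumption: link elements look like {"type": "link", "url": ...}.
            if node.get("type") == "link" and "url" in node:
                found.append(node["url"])
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(message.get("blocks", []))
    return found


def unrendered_urls(message: dict) -> List[str]:
    """Block URLs with no matching attachment (exact string match)."""
    rendered = {link["url"] for link in rendered_links(message)}
    return [url for url in block_urls(message) if url not in rendered]


message = json.loads(RAW_MESSAGE)
print(rendered_links(message))
print(unrendered_urls(message))
```

Walking the "blocks" structure recursively avoids hard-coding a nesting depth, which seems safer while the exact shape of the export is still unconfirmed.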
The DDW table should contain the following fields:
channel
date
url
title
publication ("service_name")
saved_by ("real_name" in "user_profile")
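One way the field list above might map onto a load file is sketched below. `LinkRow`, `rows_for_message` and `to_csv` are hypothetical names, CSV is only an assumed staging format, and the sketch covers rendered links only; how the DDW actually ingests data is not specified here.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class LinkRow:
    channel: str
    date: str
    url: str
    title: Optional[str]
    publication: Optional[str]
    saved_by: Optional[str]


def rows_for_message(channel: str, date: str, message: dict) -> List[LinkRow]:
    """Build one row per rendered link in a single exported message."""
    saved_by = message.get("user_profile", {}).get("real_name")
    return [
        LinkRow(
            channel=channel,
            date=date,
            url=attachment["from_url"],
            title=attachment.get("title"),
            publication=attachment.get("service_name"),
            saved_by=saved_by,
        )
        for attachment in message.get("attachments", [])
        if attachment.get("from_url")
    ]


def to_csv(rows: List[LinkRow]) -> str:
    """Serialise rows to CSV text with a header matching the field list."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(LinkRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()
```

Here `channel` and `date` are passed in by the caller, on the assumption that they come from the folder and file names in the export rather than from the message body.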
The above provides a basic load. We could then add some ETL logic to build a keyword index.
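The proposal does not say how such an index would be built. As one possible starting point, the sketch below builds a simple inverted index from title words to URLs. The stop-word list, the three-character minimum and the function name are all arbitrary choices made for this example.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set

# Arbitrary, deliberately short stop-word list for illustration.
STOP_WORDS = {"and", "the", "for", "with", "from", "that", "this"}


def build_keyword_index(rows: Iterable[dict]) -> Dict[str, Set[str]]:
    """Map each keyword found in a row's title to the URLs that use it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for row in rows:
        title = (row.get("title") or "").lower()
        for word in re.findall(r"[a-z0-9]+", title):
            # Skip very short words and common stop words.
            if len(word) > 2 and word not in STOP_WORDS:
                index[word].add(row["url"])
    return index
```

A lookup such as `build_keyword_index(rows)["census"]` would then return the set of saved URLs whose titles mention that word.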