Issues: archivesunleashed/aut
Issues list
PDF binary object extraction
DataFrames
enhancement
feature
Scala
#302
by ruebot
was closed Aug 12, 2019
Add EscapeHTML Function for ExtractLinks
enhancement
feature
#266
by ianmilligan1
was closed Sep 13, 2018
More complete Twitter Ingestion
enhancement
feature
#194
by greebie
was closed Jul 15, 2019
10 of 16 tasks
Implement Python versions of Serializable APIs
DataFrames
documentation
feature
PySpark
Python
#410
by ruebot
was closed May 20, 2020
Video binary object extraction
DataFrames
enhancement
feature
Scala
#306
by ruebot
was closed Aug 13, 2019
Spreadsheet binary object extraction
DataFrames
enhancement
feature
Scala
#303
by ruebot
was closed Aug 16, 2019
Add webgraph, imagegraph, webpages, etc. to command line app
DataFrames
feature
rdd
Scala
#431
by ruebot
was closed Apr 7, 2020
Include last modified date for a resource
App
DataFrames
feature
Scala
#546
by ruebot
was closed Nov 7, 2022
Extract gzip data from transfer-encoded WARC
bug
feature
#493
by ianmilligan1
was closed May 24, 2022
Doc binary object extraction
DataFrames
enhancement
feature
Scala
#304
by ruebot
was closed Aug 16, 2019
Audio binary object extraction
DataFrames
enhancement
feature
Scala
#307
by ruebot
was closed Aug 13, 2019
Powerpoint binary object extraction
DataFrames
enhancement
feature
Scala
#305
by ruebot
was closed Aug 16, 2019
Add parquet as an app format option
App
DataFrames
feature
Scala
#448
by ruebot
was closed Apr 22, 2020
Add datathon derivatives to app (binary info, web pages, web graph)
App
DataFrames
feature
Scala
#447
by ruebot
was closed Apr 21, 2020
Implement Python versions of App utilities
DataFrames
feature
PySpark
Python
#409
by ruebot
was closed May 27, 2020
Implement Python versions of Matchbox utilities
DataFrames
feature
PySpark
Python
#408
by ruebot
was closed May 19, 2020
Add graphml output to DomainGraphExtractor
DataFrames
feature
Scala
#435
by ruebot
was closed Apr 11, 2020
For extractor (spark-submit) job, set Spark app name to be the extractor job name.
App
feature
Scala
#458
by ruebot
was closed May 4, 2020