Issues: archivesunleashed/aut
Issues list
PySpark support for core AUT functionality. #12, #13.
enhancement
#100
by MapleOx
was closed Dec 5, 2017
PDF binary object extraction
DataFrames
enhancement
feature
Scala
#302
by ruebot
was closed Aug 12, 2019
Migration of all RDD functionality over to DataFrames
DataFrames
enhancement
#223
by lintool
was closed Apr 21, 2020
Replace hashing of unique ids with .zipWithUniqueId()
enhancement
#243
by greebie
was closed Nov 22, 2018
Discussion: Restyle UDFs in the context of DataFrames
DataFrames
enhancement
rdd
Scala
#425
by lintool
was closed Mar 18, 2020
Add EscapeHTML Function for ExtractLinks
enhancement
feature
#266
by ianmilligan1
was closed Sep 13, 2018
Use Tika's detected MIME type instead of ArchiveRecord getMimeType?
DataFrames
enhancement
Scala
#342
by jrwiebe
was closed Aug 14, 2019
More complete Twitter Ingestion
enhancement
feature
#194
by greebie
was closed Jul 15, 2019
10 of 16 tasks
Plain Text UDF that combines RemoveHTML + RemoveHttpHeader
enhancement
rdd
wontfix
#270
by ianmilligan1
was closed Oct 1, 2018
Add method for unknown extensions in binary extractions
DataFrames
enhancement
resolve before 0.18.0
Scala
#343
by ruebot
was closed Aug 18, 2019
Test aut with Apache Spark 2.4.0
discussion
enhancement
on hold
#295
by ruebot
was closed Jul 17, 2019
Method to perform finer-grained selection of ARCs and WARCs
enhancement
in progress
RA-Task
#247
by lintool
was closed May 24, 2022
feature request: log when loadArchives opens and closes warc files in a dir
enhancement
RA-Task
#156
by dportabella
was closed Jan 31, 2019
UDFs that filter on url should also filter on src
DataFrames
enhancement
Scala
#418
by ruebot
was closed Feb 12, 2020
Changing keepDate to allow multiple dates, would close #108
enhancement
#161
by ianmilligan1
was merged Jan 8, 2018
Video binary object extraction
DataFrames
enhancement
feature
Scala
#306
by ruebot
was closed Aug 13, 2019