Issues: archivesunleashed/aut
Issues list
Include last modified date for a resource
App
DataFrames
feature
Scala
#546
by ruebot
was closed Nov 7, 2022
DomainGraph should use YYYYMMDD not YYYYMMDDHHMMSS
App
bug
DataFrames
in progress
#544
by ruebot
was closed Oct 31, 2022
Remove HTTP headers and HTML on webpages()
bug
DataFrames
enhancement
#538
by ruebot
was closed May 30, 2022
Remove ExtractImageDetailsDF.scala
App
clean-up
DataFrames
Scala
#464
by ruebot
was closed May 24, 2020
Update PlainTextExtractor to just extract text
App
DataFrames
enhancement
Scala
#452
by ruebot
was closed Apr 22, 2020
Remove RDD options from app
App
clean-up
DataFrames
discussion
Scala
#449
by ruebot
was closed Apr 20, 2020
Add parquet as an app format option
App
DataFrames
feature
Scala
#448
by ruebot
was closed Apr 22, 2020
Add datathon derivatives to app (binary info, web pages, web graph)
App
DataFrames
feature
Scala
#447
by ruebot
was closed Apr 21, 2020
DomainGraphExtractor produces different output in RDD vs DF
bug
DataFrames
Scala
#436
by ruebot
was closed Apr 8, 2020
Add graphml output to DomainGraphExtractor
DataFrames
feature
Scala
#435
by ruebot
was closed Apr 11, 2020
Add webgraph, imagegraph, webpages, etc. to command line app
DataFrames
feature
rdd
Scala
#431
by ruebot
was closed Apr 7, 2020
Discussion: Restyle UDFs in the context of DataFrames
DataFrames
enhancement
rdd
Scala
#425
by lintool
was closed Mar 18, 2020
Add alt text column to imageGraph (imageLinks)
DataFrames
enhancement
Scala
#420
by ruebot
was closed Feb 10, 2020
UDFs that filter on url should also filter on src
DataFrames
enhancement
Scala
#418
by ruebot
was closed Feb 12, 2020
Add crawl_date to binary DataFrames and imageLinks
DataFrames
enhancement
Scala
#413
by ruebot
was closed Jan 18, 2020
Implement Python versions of Serializable APIs
DataFrames
documentation
feature
PySpark
Python
#410
by ruebot
was closed May 20, 2020
Implement Python versions of App utilities
DataFrames
feature
PySpark
Python
#409
by ruebot
was closed May 27, 2020
Implement Python versions of Matchbox utilities
DataFrames
feature
PySpark
Python
#408
by ruebot
was closed May 19, 2020
DataFrame error with text files: java.net.MalformedURLException: unknown protocol: filedesc
bug
DataFrames
Scala
#362
by ruebot
was closed Dec 18, 2019
Add method for unknown extensions in binary extractions
DataFrames
enhancement
resolve before 0.18.0
Scala
#343
by ruebot
was closed Aug 18, 2019