Document the command line app. #51

Merged
ianmilligan1 merged 3 commits into master from issue-14
Apr 7, 2020

Conversation

ruebot
Member

@ruebot ruebot commented Apr 7, 2020

@lintool @ianmilligan1 @SamFritz here is a first crack at documenting the command line app. Please let me know how this approach works.

Member Author

ruebot commented Apr 7, 2020

Member

@SamFritz SamFritz left a comment

@ruebot, this is looking really good! I like the flow you've given to all the documentation. I have a few surface level suggestions. My only major comment is related to how to start off with the scripts - hopefully I'm not overthinking things.

Look forward to working with this more! :)

current/aut-spark-submit-app.md
@@ -0,0 +1,128 @@
# Using the Toolkit with spark-submit

Member

Do these configuration options need to be used with a specific launch of the toolkit (e.g. package, uberjar, etc.)? At first glance, I'm a little unsure where to start, or where in the workflow this script would be introduced (e.g. inside or outside of spark-shell?).

Member Author

The extraction jobs have a basic outline of:

`spark-submit --class io.archivesunleashed.app.CommandLineAppRunner PATH_TO_AUT_JAR --extractor EXTRACTOR --input INPUT DIRECTORY --output OUTPUT DIRECTORY`
Member

Is it possible to put in an example of how this script would look if one of us were to run it, below the basic outline? I find that when I have an example, it's a bit easier to see what needs to be changed in line.

Member Author

That's what all the examples are below.

Member

Oh yeah, I realize the examples further down are more detailed. I was just going down a different line of thinking, so disregard my original comment.
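
For readers following this thread without the rest of the documentation, here is a minimal sketch of how the basic outline quoted above might be filled in. It is only an illustration, not one of the examples from the pull request: the jar path, the input and output directories, and the extractor name `DomainFrequencyExtractor` are placeholder assumptions, and the runner class is written as `CommandLineAppRunner`, which is assumed to be the intended spelling. The script prints the command as a dry run, so it can be checked before anything is submitted to Spark.

```shell
#!/bin/sh
# Sketch only: every value below is a placeholder assumption,
# not a path or extractor taken from the pull request.
AUT_JAR="${AUT_JAR:-/path/to/aut-fatjar.jar}"
EXTRACTOR="${EXTRACTOR:-DomainFrequencyExtractor}"
INPUT_DIR="${INPUT_DIR:-/path/to/warcs}"
OUTPUT_DIR="${OUTPUT_DIR:-/path/to/output}"

# Print the full command for review without running it (dry run).
print_aut_command() {
  printf 'spark-submit --class %s %s --extractor %s --input %s --output %s\n' \
    "io.archivesunleashed.app.CommandLineAppRunner" \
    "$AUT_JAR" "$EXTRACTOR" "$INPUT_DIR" "$OUTPUT_DIR"
}

# Submit the job; call this only where spark-submit is installed.
run_aut_command() {
  spark-submit --class io.archivesunleashed.app.CommandLineAppRunner \
    "$AUT_JAR" --extractor "$EXTRACTOR" \
    --input "$INPUT_DIR" --output "$OUTPUT_DIR"
}

print_aut_command
```

Overriding the variables on the command line (for example `EXTRACTOR=... sh script.sh`) changes what is printed. Replacing the final `print_aut_command` with `run_aut_command` would submit the job instead.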

@ianmilligan1 ianmilligan1 merged commit dbe71c3 into master Apr 7, 2020
@ianmilligan1 ianmilligan1 deleted the issue-14 branch April 7, 2020 18:52
Development

Successfully merging this pull request may close these issues.

Document command line app
3 participants