Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: Remove unnecessary appendix in SSS doc, add missing links #1390

Merged
merged 2 commits into from
Sep 13, 2020

Conversation

HeartSaVioR
Copy link
Contributor

Appendix in SSS page was actually to deduplicate code in maintenance page, but it didn't move to maintenance page once we extract the content from SSS page.

I also add a link for compacting data files in SSS page.

@@ -130,3 +130,25 @@ table.rewriteManifests()

See the [`RewriteManifestsAction` Javadoc](/javadoc/master/org/apache/iceberg/actions/RewriteManifestsAction.html) to see more configuration options.

## Appendix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this here? It seems like we should link to the API sections that cover this instead of duplicating the content.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel there's a slight gap between content in Create a table in java-api-quickstart page and here, but it would be OK to just link the section if we think end users can infer they need to call loadTable instead. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about the API quickstart. Now that Spark SQL supports DDL operations, we should start in that section with loading a table. At the top, we can point to the "Getting started" page with SQL create table and the section on creating tables, but after that talk about how to load tables with the API.

And linking to the quickstart is good for this, since we want to point to the right docs, not copy information. Thanks!

@@ -76,31 +76,9 @@ Each micro-batch written to a table produces a new snapshot, which are tracked i

### Compacting data files

The amount of data written in a micro batch is typically small, which can cause the table metadata to track lots of small files. Compacting small files into larger files reduces the metadata needed by the table, and increases query efficiency.
The amount of data written in a micro batch is typically small, which can cause the table metadata to track lots of small files. [Compacting small files into larger files](../maintenance#compact-data-files) reduces the metadata needed by the table, and increases query efficiency.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, thank you!

@HeartSaVioR HeartSaVioR changed the title DOC: Move appendix in SSS doc to maintenance, add missing link DOC: Remove unnecessary appendix in SSS doc, add missing links Sep 1, 2020
@HeartSaVioR
Copy link
Contributor Author

I've changed the approach to point to Java quickstart page as suggested.

Screen Shot 2020-09-01 at 11 01 31 PM

Screen Shot 2020-09-01 at 11 02 02 PM

Could you please take a look again? Thanks in advance!

@@ -41,6 +41,9 @@ import org.apache.iceberg.catalog.TableIdentifier;

TableIdentifier name = TableIdentifier.of("logging", "logs");
Table table = catalog.createTable(name, schema, spec);

// or to load an existing table, use the following line
// Table table = catalog.loadTable(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add a section for this instead of adding this block in multiple places. I'm happy to do that if you want to keep it separate from this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes I did this because having a new section for loading table would duplicate the code here. It'd be easier for us to have a different PR (or a PR against my fork) if you have better idea. Thanks!

@rdblue rdblue merged commit 3bc8eb9 into apache:master Sep 13, 2020
@rdblue
Copy link
Contributor

rdblue commented Sep 13, 2020

Thanks, @HeartSaVioR! I'm merging this and will follow up with a PR to better explain how to load a table.

@HeartSaVioR
Copy link
Contributor Author

Thanks for reviewing and merging!

@HeartSaVioR HeartSaVioR deleted the MINOR-DOC-FIX-maintenance branch September 14, 2020 01:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants