Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(backup): Add restore indices and restore backup tasks #2779

Merged

Conversation

dexter-mh-lee
Copy link
Contributor

Use the datahub-upgrade framework to restore indices and restore backup.

  • Restore indices: Get the latest version of each aspect from local db and produce mae events for each of them
  • Restore backup: use EbeansBackupReader to get each row of the local db. Update local db and produce mae events. (For now only supports local parquet)

Checklist

  • The PR conforms to DataHub's Contributing Guideline (particularly Commit Message Format)
  • Links to related issues (if applicable)
  • Tests for the changes have been added/updated (if applicable)
  • Docs related to the changes have been added/updated (if applicable)

@@ -61,6 +61,7 @@ project.ext.externalDependency = [
'guice': 'com.google.inject:guice:4.2.2',
'guava': 'com.google.guava:guava:27.0.1-jre',
'h2': 'com.h2database:h2:1.4.196',
'hadoopClient': 'org.apache.hadoop:hadoop-client:3.1.0',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm: Have we run any sec scans since adding these deps?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good

@@ -0,0 +1,119 @@
{{- if .Values.datahubUpgrade.enabled -}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Why does this depend on datahubUpgrade flag?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All k8s files under datahub-upgrade are all enabled by the same flag

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabling it doesn't do much until you actually run the job since these cronjobs are suspended by default

/**
* Returns whether or not to skip the step based on the UpgradeContext
*/
default boolean skip(UpgradeContext context) {
Copy link
Collaborator

@jjoyce0510 jjoyce0510 Jun 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Why do we need this API if the step can simply check the context and no-op inside its "execute" method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I wanted to mention in the logs that it was skipped.

Right now it says that DeleteAspectV2TableStep ran successfully which to me implies that the table was deleted. But it actually wasn't, and was rather just skipped. I think it's good to make this explicit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay makes sense.. agree its useful to know

if (_alwaysRun) {
return false;
}
if (context.parsedArgs().containsKey(NoCodeUpgrade.CLEAN_ARG_NAME)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a "cleanup has been requested" report line here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That log is added in the execute function. I can move it to here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no lets just keep that


@Override
public String id() {
return "GMSEnableWriteModeStep";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be GMSDisableWriteModeStep?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OMG making this mistake way too many times


final UpgradeStepResult stepResult = executeStepInternal(context, step);
stepResults.add(stepResult);

// Apply Actions
if (UpgradeStepResult.Action.ABORT.equals(stepResult.action())) {
upgradeReport.addLine(String.format("Step with id %s requested an abort of the in-progress update. Aborting the upgrade...", step.id()));
upgradeReport.addLine(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My formatter :(

return new DefaultUpgradeResult(UpgradeResult.Result.ABORTED, upgradeReport);
}

// Handle Results
if (UpgradeStepResult.Result.FAILED.equals(stepResult.result())) {
if (step.isOptional()) {
upgradeReport.addLine(String.format("Failed Step %s/%s: %s. Step marked as optional. Proceeding with upgrade...", i + 1, steps.size(), step.id()));
upgradeReport.addLine(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!!!!

UpgradeStepResult.Result.FAILED);
context.report()
.addLine(
String.format("Caught exception during attempt %s of Step with id %s: %s", i, step.id(), e.toString()));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making this class better!!!

final List<UpgradeStep> steps = new ArrayList<>();
steps.add(new RemoveAspectV2TableStep(server));
steps.add(new GMSQualificationStep());
steps.add(new UpgradeQualificationStep(server));
steps.add(new CreateAspectTableStep(server));
steps.add(new IngestDataPlatformsStep(entityService));
steps.add(new DataMigrationStep(server, entityService, entityRegistry));
steps.add(new GMSEnableWriteModeStep());
steps.add(new GMSEnableWriteModeStep(entityClient));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever disable write mode??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During backup restore!

return "ClearAspectV2TableStep";
}

@Override
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just confirming: No skip state required here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I haven't seen any use case for it. The other ones were used by the RestoreIndices task which doesn't require clearing the MySQL table.


@Override
public List<UpgradeCleanupStep> cleanupSteps() {
return ImmutableList.of();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: Collections.emptyList()

}

EbeanAspectBackupIterator iterator = _backupReaders.get(backupReaderName.get()).getBackupIterator(context);
if (iterator == null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't immediately obvious that the iterator could be null. Should the iterator have some method on it for this?

ie. iterator.hasData()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard because there is no null iterator. I'll make this Optional to make this more explicit.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty

public interface BackupReader {
String getName();

EbeanAspectBackupIterator getBackupIterator(UpgradeContext context);
Copy link
Collaborator

@jjoyce0510 jjoyce0510 Jun 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish this interface didn't have to know about UpgradeContext... does it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was mainly because of the logging. If we can just log.info, then we don't need to pass in the context.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgot about this -- we should be able to just use the logger as well

return new ParquetEbeanAspectBackupIterator(reader);
} catch (IOException e) {
context.report().addLine(String.format("Failed to build ParquetReader: %s", e));
return null;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we throw here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when an exception is thrown here? Does it say the step failed?


/**
* Iterator to retrieve EbeanAspectV2 objects from the ParquetReader
* Converts the avro GeneralRecord object into EbeanAspectV2
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: GeneralRecord -> GenericRecord

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

private EbeanAspectV2 convertRecord(GenericRecord record) {
if (record == null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Case is already covered above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. removed


@Override
public List<UpgradeCleanupStep> cleanupSteps() {
return ImmutableList.of();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: Collections.emptyList()


context.report().addLine("Sending MAE from local DB...");
final int rowCount = _server.find(EbeanAspectV2.class).where().eq(EbeanAspectV2.VERSION_COLUMN, 0).findCount();
context.report().addLine(String.format("Found %s rows in legacy aspects table", rowCount));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the legacy table right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

int count = getBatchSize(context.parsedArgs());
while (start < rowCount) {

context.report()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

./docker/datahub-upgrade/datahub-upgrade.sh -u RestoreIndices
```

If you need to clear the search and graph indices before restoring, add `-a clean` to the end of the command.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a note about the default env variables used and how to change them to suit your deployment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a link to the datahub-upgrade readme section on env variables. Also updated that doc


## Kubernetes

Run `kubectl get cronjobs` to see if the restoration job template has been deployed. If you see results like below, you
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuf!!!

@@ -18,6 +19,7 @@
private ElasticSearchService elasticSearchService;

@Bean(name = "searchService")
@Primary
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just curious, why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure :( Build failed because it said there were two instances of SearchService one from here and one from ElasticSearchServiceFactory. I have no idea why this is like this while graph service works fine.

@@ -204,10 +205,10 @@
*/
@Action(name = "setWritable")
@Nonnull
public Task<Void> setWriteable() {
public Task<Void> setWriteable(@ActionParam(PARAM_VALUE) @Optional("true") @Nonnull Boolean value) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making this change!

_canWrite = canWrite;
}

public void disableWrite() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can delete this now right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@Nonnull final RelationshipFilter relationshipFilter,
final int offset,
final int count) {
public List<String> findRelatedUrns(@Nullable final String sourceType, @Nonnull final Filter sourceEntityFilter,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ughhhhh hate these reformattings

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is much less readable imo. maybe we need to change the styling rules

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too :( I'm going to revert this file and add back the one function I added.

@@ -30,6 +30,10 @@ public void configure() {
indexBuilders.buildAll();
}

@Override
public void clear() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do something here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

Copy link
Collaborator

@jjoyce0510 jjoyce0510 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, LGTM.

Thanks for addressing the comments!

Copy link
Contributor

@shirshanka shirshanka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@shirshanka shirshanka merged commit 8f0f322 into datahub-project:master Jun 30, 2021
wan54 pushed a commit to wan54/datahub that referenced this pull request Jul 15, 2021
* fix(react): update ispartofbuilderfromdataflow, update ui in datajob header (datahub-project#2619)

* fix(ingestion): improve robustness of glue ingestion source (datahub-project#2626)

fixes: datahub-project#2625

Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>

* feat(react): custom properties are now sortable by name in the UI (datahub-project#2622)

* fix(ci): update trigger to always generate docker images (datahub-project#2636)

* fix(ingest): include urn as key for kafka emitter (datahub-project#2634)

* fix(react): url encoding urns and tag profile fix (datahub-project#2623)

* feat(react): replace user urn with username (datahub-project#2599)

Co-authored-by: shubham.garg <shubham.garg@thoughtworks.com>

* fix(ingest): improve redshift ingestion performance (datahub-project#2635)

* docs: update roadmap with accomplished items (datahub-project#2617)

* feat: No Code Metadata Modeling (datahub-project#2629)

Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* docs(no-code): removing unused snapshot annotations (datahub-project#2642)

* feat(datahub-upgrade): Use the "head" image tag.  (datahub-project#2641)

* fix(datahub-upgrade): remove debug flag (datahub-project#2640)

* fix(build): don't purge ingestion codegen files on gradle clean (datahub-project#2645)

* fix(analytics): fix index names (datahub-project#2647)

* fix(nocode): Fix docker image tag for run_upgrade script (datahub-project#2648)

* docs(nocode): Update migration docs with workaround for stuck upgrade(datahub-project#2649)

* fix: use head tag by default in quickstart (datahub-project#2650)

* fix(datahub-upgrade): fixing mis-spelled schema registry url (datahub-project#2652)

* fix(gms): Return empty snapshots when one does not exist (datahub-project#2646)

* docs(nocode): fix links to datahub-upgrade (datahub-project#2651)

* feat(datahub-upgrade): improve no code upgrade logging (datahub-project#2653)

* fix(docker): pin containers to golden hash for release (datahub-project#2654)

* feat(nocode): Add datahub-upgrade job to helm chart and set version to v0.8.1 (datahub-project#2643)

* docs(nocode): Adding documentation for no-migration upgrade option (datahub-project#2656)

* revert: "fix(docker): pin containers to golden hash for release (datahub-project#2654)" (datahub-project#2659)

This reverts commit a483933 and moves
us back to using HEAD in quickstart on the master branch.

* fix(docker): use debug tag in local dev images (datahub-project#2658)

* fix(ingest): support mssql encryption via ODBC (datahub-project#2657)

* fix(NoCode): Update snapshot json to latest (datahub-project#2655)

* fix(ingest): exclude mssql-odbc from "all" extra (datahub-project#2660)

* docs(nocode): adding documentation to handle backwards incompatibility issues (datahub-project#2662)

* docs(nocode): fix typo in autocomplete annotation (datahub-project#2664)

* build(docker): test docker builds in pull request CI (datahub-project#2661)

* fix(ingest): fix MyPy stubs (datahub-project#2666)

* feat(ingest): Feast ingestion integration (datahub-project#2605)

* Add feast testing setup

* Init Feast test script

* Add feast to dependencies

* Update feast descriptors

* Sort integrations

* Working feast pytest

* Clean up feast docker-compose file

* Expand Feast tests

* Setup feast classes

* Add continuous and bytes data to feature types

* Update field type mapping

* Add PDLs

* Add MLFeatureSetUrn.java

* Comment out feast setup

* Add snapshot file and update inits

* Init Feast golden files generation

* Clean up Feast ingest

* Feast testing comments

* Yield Feature snapshots

* Fix Feature URN naming

* Update feast MCE

* Update Feature URN prefix

* Add MLEntity

* Update golden files with entities

* Specify feast sources

* Add feast source configs

* Working feast docker ingestion

* List entities and features before adding tables

* Add featureset names

* Remove unused

* Rename feast image

* Update README

* Add env to feast URNs

* Fix URN naming

* Remove redundant URN names

* Fix enum backcompatibility

* Move feast testing to docker

* Move URN generators to mce_builder

* Add source for features

* Switch TypeClass -> enum_type

* Rename source -> sourceDataset

* Add local Feast ingest image builds

* Rename Entity -> MLPrimaryKey

* Restore features and keys for each featureset

* Do not json encode source configs

* Remove old source properties from feature sets

* Regenerate golden file

* Fix race condition with Feast tests

* Exclude unknown source

* Update feature datatype enum

* Update README and fix typos

* Fix Entity typo

* Fix path to local docker image

* Specify feast config and version

* Fix feast env variables

* PR fixes

* Refactor feast ingest constants

* Make feature sources optional for back-compatibility

* Remove unused GCP files

* adding docker publish workflow

* Simplify name+namespace in PrimaryKeys

* adding docker publish workflow

* debug

* final attempt

* final final attempt

* final final final commit

* Switch to published ingestion image

* Update name and namespace in java files

* Rename FeatureSet -> FeatureTable

* Regenerate codegen

* Fix initial generation errors

* Update snapshot jsons

* Regenerated schemas

* Fix URN formats

* Revise builds

* Clean up feast URN builders

* Fix naming typos

* Fix Feature Set -> Feature Table

* Fix comments

* PR fixes

* All you need is Urn

* Regenerate snapshots and update validation

* Add UNKNOWN data type

* URNs for source types

* Add note on docker requirement

* Fix typo

* Reorder aspect unions

* Refactor feast ingest functions

* Update snapshot jsons

* Rebuild

Co-authored-by: Shirshanka Das <shirshanka@apache.org>

* fix(no-code): Adding Chart input relationship annotations (datahub-project#2669)

* chart input relationship pdl fix

* commiting schema.avsc changes

* Add get_identifier to hive source in metadata ingestion (datahub-project#2667)

* feat(k8s): Add imagePullSecrets to all K8's jobs (datahub-project#2671)

* fix(analytics): Support sasl authentication to kafka (datahub-project#2675)

* feat(ingest): headers for codegen Python scripts (datahub-project#2637)

* fix(ingest): pin to new mypy version (datahub-project#2670)

* Fix 2592 neo4j connection options (datahub-project#2632)

* fix(ingest): upgrade acryl-pyhive to use sasl3 instead of sasl (datahub-project#2684)

* feat(ingest): support Oracle service names (datahub-project#2676)

* feat(sql_views): added views as datasets for SQLAlchemy DBs (datahub-project#2663)

* removing whitespace from service aspect (datahub-project#2688)

* Only work with dbt catalog data if load_catalog is False (datahub-project#2686)

* feat(docs): Docs for S3 ingestion with AWS Glue (datahub-project#2672)

* feat(datahub cli): DataHub CLI Quickstart  (datahub-project#2689)

* feat(gms): Merge MAE, MCE consumers into GMS (datahub-project#2690)

* fix(NoCode): Fix product analytics (datahub-project#2697)

* feat(gms):  support basic auth header when connecting to elasticSearch (datahub-project#2701)

* Add support for different oidc client authentication methods (datahub-project#2691)

* feat(aspects): support fetching of versioned aspects (datahub-project#2677)

* feat(entities): add markdown description update/viewer feature in dataset, datajob, dataflow, chart and dashboard, update ui/ux (datahub-project#2707)

* feat(ingest): Add test case and docs for SQL view ingestion (datahub-project#2709)

* fix(ingest): use `looker` data platform (datahub-project#2708)

* test(ingest): simplify docker cleanup commands (datahub-project#2699)

* feat: Adding annotation option to frontend service (datahub-project#2674)

* feat(ingest): expose additional types to Python via codegen (datahub-project#2712)

* Removed key not in catalog filter from dbt and added node filter using AllowDenyPattern (datahub-project#2702)

* added node_type_pattern in dbt yaml file (datahub-project#2705)

* fix(editable descriptions): adding indexing for editable descriptions (datahub-project#2710)

* feat(quickstart): use versioned docker images when on a release tag (datahub-project#2715)

* fix(noCode): Improving efficiency of EntityService "listLatestAspects" API  (datahub-project#2711)

* fix(react): update schema description edit behavior (datahub-project#2718)

* Update version for helm (datahub-project#2719)

* fixing docs sidebar (datahub-project#2721)

* fix(docs): update extending-the-metadata-model.md (datahub-project#2727)

* fix(docker): update default tags to head (datahub-project#2730)

* fix(looker): fix invalid URN syntax error (datahub-project#2737)

* fix(ci): increase Feast docker setup timeout (datahub-project#2722)

* fix(ingest): types for dbt (datahub-project#2716)

* feat(ingest): add support for Glue ETL jobs (datahub-project#2687)

* fix(ingest): fix lookml platform URN (datahub-project#2742)

* fix(ingest): update lookml test (datahub-project#2741)

* fix(gms): fixes for version aspect fetching (datahub-project#2723)

* feat(graph): support using elasticsearch as graph backend. (datahub-project#2726)

* feat(ingest): print docker logs on timeout (datahub-project#2743)

* fix(ci): increase wait-for-it timeout to fix flaky feast test (datahub-project#2747)

* feat(docker): reduce quickstart footprint (datahub-project#2744)

* feat(quickstart): remove orphaned docker containers on quickstart through cli (datahub-project#2748)

* fix(gms): add rest.li validation in gms (datahub-project#2745)

* fix(datahub-upgrade): add support for postgres migration (datahub-project#2740)

* fix(docker): modernize docker images and fix vulnerabilities (datahub-project#2746)

* feat(ingest): ingest last-modified from dbt sources.json (datahub-project#2729)

* feat(ingest): add option to specify source platform database in lookml ingestion (datahub-project#2749)

* fix(gms): make get return entity type (datahub-project#2739)

* fix(elastic-as-graph): adding elasticsearch setup back in (datahub-project#2752)

* fix(browse): sort by doc count descending (datahub-project#2751)

* Revert "fix(gms): add rest.li validation in gms (datahub-project#2745)" (datahub-project#2754)

This reverts commit f0b4adc.

* fix(docker): use head tag for datahub-ingestion (datahub-project#2760)

* feat(elastic-as-graph): defaulting to elastic in quickstart (datahub-project#2753)

* fix(react): reverse result of topological sort, autofocus in add tag modal (datahub-project#2759)

* feat: usage stats (part 1) (datahub-project#2750)

Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>

* feat: usage stats (part 2) (datahub-project#2762)

Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>

* docs: update for Jun townhall (datahub-project#2764)

* fix(frontend): auth session ttl specified in hours instead of days. M… (datahub-project#2695)

Fixes: datahub-project#2680

Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>

* fix(react): move percent sign after number and update meta tag (datahub-project#2765)

* fix(docker): Fix dependency vulnerability (datahub-project#2763)

* fix(datahub-upgrade): removing the CleanupStep from datahub upgrade (datahub-project#2728)

* docs(ingest): move usage stats docs into the "sources" section (datahub-project#2766)

* fix(react): update the query frequency text label (datahub-project#2767)

* feat(k8s): add GCP deploy recipe (datahub-project#2768)

* fix(react): fix graphql apollo cache update issue cause of usagestats (datahub-project#2769)

* feat(logs): improve logging in GMS and datahub-frontend (datahub-project#2761)

* fix(nocode): removing service PDL (datahub-project#2772)

* fix(react): update platform text in dataset profile header (datahub-project#2771)

* feat(logs): add thresholding, misc cleanup (datahub-project#2773)

* docs: upgrade docusaurus, minor ingestion updates (datahub-project#2774)

* feat(ingest): add non-random sampling for mongo (datahub-project#2778)

* feat(react-graphql-mock): mock graphql queries and mutations with miragejs

* feat(react): configure Cypress + MirageJS + GraphQL mock for functional testing plus a couple of example tests

* fix(react): fix mutation helper for updateentitylink

* feat(react): update graphql issues with mock data

* fix(docker): Upgrade to 4.1.44 netty-all library (datahub-project#2784)

* refactor(ingest): use common get_sys_time method (datahub-project#2782)

* fix(docs): links to Feast entities (datahub-project#2780)

* feat(k8s): Upgrade to v0.8.4 (datahub-project#2792)

* fix(frontend): making nested SSL configs optional (datahub-project#2785)

* fix(ingest): use correct platform for MongoDB ingestion (datahub-project#2783)

* Update quickstart to include system prune (datahub-project#2794)

* docs(website): add releases page (datahub-project#2776)

* refactor(ingest): remove deprecated methods and warn on deprecated import (datahub-project#2797)

* fix(usage-stats): add indices target for usage stats query (datahub-project#2798)

* fix(ingest): handle case when view definition handler is not implemented (datahub-project#2796)

* fix(ingest): quote table names in hive (datahub-project#2801)

* fix(datahub-upgrade): add runtime dependency on logbackClassic (datahub-project#2799)

* fix(search): Filter out "removed" entities from autocomplete and analytics (datahub-project#2781)

* feat(ingest): Improve lookml sql derived tables detection, add cascading derived tables to lineage (datahub-project#2770)

* feat(ingest): SageMaker feature store ingestion (datahub-project#2758)

* fix(ingest): better warnings and error handling for rest sink (datahub-project#2800)

* fix(usage-stats): Add usage stats factory to mae processor config (datahub-project#2803)

* refactor(ingest): extract sqlalchemy uri generation logic (datahub-project#2786)

* fix(quickstart): Fixing manual mysql quickstart (datahub-project#2811)

* feat(ingest): support ingesting from multiple snowflake dbs (datahub-project#2793)

* feat(backup): Add restore indices and restore backup tasks (datahub-project#2779)

* docs(website): hide outdated FAQs page (datahub-project#2809)

* feat(ingest): refactor mce comparison and add pytest update golden files option (datahub-project#2812)

* fix(k8s): upgrade helm version (datahub-project#2810)

* feat(ingest): basic support for complex hive types (datahub-project#2804)

* fix(datahub-upgrade): fix vulnerabilities (datahub-project#2813)

* chore(helm): upgrade version to v0.8.5 (datahub-project#2814)

* feat(datahub-frontend): Adding basic file-based authentication to datahub-frontend (datahub-project#2818)

* fix(ingest): view handling resilience for redshift (datahub-project#2816)

* docs(ingest): add extra info for Redshift behind a proxy (datahub-project#2817)

* fix(docs): update OIDC docs  (datahub-project#2823)

* docs(docker): fix check command reference (datahub-project#2822)

Fixes datahub-project#2820.

* docs(elastic-for-graph): Add migrating from neo4j to elastic instructions (datahub-project#2826)

* fix(ingest): convert superset timestamps to micros (datahub-project#2827)

* Previous values were in seconds, which rendered incorrectly in the UI.

* feat(ci): separate metadata-ingestion into a separate workflow (datahub-project#2828)

* fix(cli): change docker nuke to also remove stopped containers (datahub-project#2825)

* fix(mae-consumer): support standalone mae consumer without neo4j (datahub-project#2824)

* docs: update H2 2021 roadmap (datahub-project#2829)

* docs(ingest): update links to Kafka docs (datahub-project#2834)

* fix(ingest): mask password in info-level logs (datahub-project#2835)

* fix(ingest): various BigQuery source fixes (datahub-project#2836)

* docs(ingest): clarify that the Kafka options are pass-through (datahub-project#2837)

* fix(ingest): do not fail dbt ingestion when encountering missing nodes  (datahub-project#2833)

* feat (helm datahub-gms): add ingress template to datahub-gms helm chart (datahub-project#2832)

Co-authored-by: christopher.coulson <christopher.coulson@higpagesgroup.com.au>

* feat(react): fix apollo client cache issues in entities profile pages (datahub-project#2841)

* fix(ingest): avoid setting timestamps unless source system provides it (datahub-project#2843)

* fix(docs): fix some broken links in the docs (datahub-project#2846)

* fix(ingest): Fix glob pattern and handle possible recursion in lookml (datahub-project#2851)

* feat(ingest): prettify stack traces in CLI (datahub-project#2845)

* fix(analytics): Fix SSL issue with analytics on frontend (datahub-project#2840)

* fix(ingest): check for dbt materialization before proceeding (datahub-project#2842)

* feat(docs): throttle and retry requests in doc generation (datahub-project#2847)

* feat(ingest): SageMaker jobs and models (datahub-project#2830)

* docs(ingest): remove hanging sentence from docs (datahub-project#2853)

* fix(ingest): delete pycache files when running clean (datahub-project#2852)

* fix(ingest): handle 'fields' list missing in bigquery-usage (datahub-project#2844)

* feat(k8s): Extract helm charts into a separate repo (datahub-project#2848)

* fix(react): fix searchbar width issue in small screen (datahub-project#2856)

* feat(react): implement visualizing historical versions of dataset schema (datahub-project#2855)

* feat(docs): carousels for videos and articles (datahub-project#2857)

* fix(frontend): handling null aspects in AspectType (datahub-project#2860)

* fix: fixing lint issue (datahub-project#2861)

* feat(ingest): support dynamic imports for transfomer methods (datahub-project#2858)

* feat(docs): swap Medium and videos sections (datahub-project#2859)

* feat(ingest): add browse paths + dataplatform for Feast features (datahub-project#2849)

* fix(build): increase retries for dependency fetches (datahub-project#2867)

* build(ingest): separate metadata-ingestion build workflow fully (datahub-project#2862)

* fix(quickstart): update compose spec version (datahub-project#2873)

Also update the defaults to point at the head tag instead of latest.

* docs(quickstart): add default password to quickstart (datahub-project#2875)

* feat(ingest): extract dbt meta fields (datahub-project#2876)

* feat(ingest): update golden files only when diff fails (datahub-project#2869)

* build(ingestion): add version prompt to release script (datahub-project#2866)

* fix(react): fix bug in description update modal (datahub-project#2874)

* fix(react): update dataset graphql mock

Co-authored-by: Thomas Larsson <thomas.p.larsson@gmail.com>
Co-authored-by: thomas.larsson <thomas.larsson@klarna.com>
Co-authored-by: Rickard Cardell <rickard.cardell@klarna.com>
Co-authored-by: Dexter Lee <dexter@acryl.io>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
Co-authored-by: shubham garg <gargshubham49@gmail.com>
Co-authored-by: shubham.garg <shubham.garg@thoughtworks.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: Kevin Hu <6051736+kevinhu@users.noreply.github.com>
Co-authored-by: zack3241 <zmjlawson@gmail.com>
Co-authored-by: Vincenzo Lavorini <vincenzo.lavorini@protonmail.ch>
Co-authored-by: Remi <salmon.remi@gmail.com>
Co-authored-by: JeffSkinner <Jeff@RigaBee.com>
Co-authored-by: Peter Mortier <kwark@users.noreply.github.com>
Co-authored-by: kuntalkumarbasu <42655948+kuntalkumarbasu@users.noreply.github.com>
Co-authored-by: vijayan-nallasami-curve <85870918+vijayan-nallasami-curve@users.noreply.github.com>
Co-authored-by: wan <wan@noreply.gitlab.com>
Co-authored-by: datascienceChris <coulson.christopher@gmail.com>
Co-authored-by: christopher.coulson <christopher.coulson@higpagesgroup.com.au>
Co-authored-by: Fredrik Sannholm <fredrik.sannholm@wolt.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants