Skip to content

Add SpannerMetadataModule with ExtensionLoader #15959

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

123-komal
Copy link

No description provided.

@@ -2845,6 +2845,11 @@
<value>/opt/cdap/master/ext/log-publisher</value>
</property>

<property>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a flag to determine if we want to use spanner instead of elastic search?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just for extension (not specific to spanner) - I think we might not need a flag here.

Komal is anyways testing the default flow for ES implementation. So should be safe.

Constants.Metrics.STORAGE_METRICS_TAGS);
metricsCollector.increment(Constants.Metrics.MetadataStorage.METRICS_PREFIX + metricSuffix, 1L);
Constants.Metrics.STORAGE_METRICS_TAGS);
metricsCollector.increment(Constants.Metrics.MetadataStorage.METRICS_PREFIX + metricSuffix,1L);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these lines changed? because of formatting? I think the earlier formatting was correct are you using Google formatter?

this.cConf = cConf;
this.extensionLoader = extensionLoader;
this.extensionLoader.getAll();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove new line

@@ -179,7 +179,7 @@ public MetadataStorage get() {
return injector.getInstance(DatasetMetadataStorage.class);
}
if (Constants.Metadata.STORAGE_PROVIDER_ELASTICSEARCH.equalsIgnoreCase(config)) {
return injector.getInstance(ElasticsearchMetadataStorage.class);
return injector.getInstance(DefaultMetadataStorageProvider.class);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is loading the extenssion for spanner. i.e. after your change the new instances will not use elastic search is that expected? we dont have any control plane changes ready?

I think we should have a cconf flag to guard this change and when everything is okay we can turn that on from the control plane

@Override
public MetadataChange apply(MetadataMutation mutation, MutationOptions options)
throws IOException {
return getDelegate().apply(mutation, options);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need to call getDelegate everytime or we can cache the result ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is advisable to call getDelegate() each time as that method is rightly synchronized. Similar thing is done in DefaultStorageProvider and DelegatingMessagingService.

String propertiesPrefix =
Constants.Dataset.STORAGE_EXTENSION_PROPERTY_PREFIX + storageImpl + ".";
this.cConf = Collections.unmodifiableMap(cConf.getPropsWithPrefix(propertiesPrefix));
this.properties = Collections.unmodifiableMap(cConf.getPropsWithPrefix(propertiesPrefix));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both these variables are having same values - do we need 2 separate vars here?

import java.util.stream.Collectors;
import org.apache.tephra.TransactionSystemClient;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take care while committing changes - the ones unrelated to your PR should not be changed (even if Google formatter updates few other lines) Otherwise you will mess up the blame / git history for the original changes.

* A simple class to pass around a Spanner Mutation, along with the metadata
* change that it effects.
*/
public class RequestAndChange implements ChangeRequest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually naming convention is like : For spanner impl you can call this class as SpannerChangeRequest

For ES -> ElasticSearchChangeRequest / ESChangeRequest.

import java.util.Map;

public interface MetadataStorageContext {
Map<String, String> getProperties();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check if this one is needed.

pom.xml Outdated
@@ -48,6 +48,9 @@
<organizationUrl>http://www.cdap.io</organizationUrl>
</developer>
</developers>
<modules>
<module>cdap-metadata-ext-spanner</module>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can move this to place where other modules are added.

@123-komal 123-komal force-pushed the es-spanner_feature branch from 366ff67 to fe0dfb3 Compare May 30, 2025 03:42
import java.io.IOException;
import java.util.List;

public class DefaultMetadataStorageProvider implements MetadataStorage {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we should call this as Provider.

DefaultStorageProvider implements StorageProvider (in case of storage tables).
For Metadata, we do not have an interface like MetadataStorageProvider.

Maybe we can call it as DefaultMetadataStorage / DelegatingMetadataStorage?

@bhardwaj-priyanshu - wdyt?


public String getName() {
return cConf.get(Constants.Metadata.STORAGE_PROVIDER_SPANNER);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just read the MetadataStorage property from cConf.

This function will check if the property value matches the getName() in ES or getName in Spanner impl.

Or you can even follow something llike this : https://github.com/cdapio/cdap/blob/develop/cdap-data-fabric/src/main/java/io/cdap/cdap/spi/data/common/DefaultStorageProvider.java#L53

and then have switch case like : https://github.com/cdapio/cdap/blob/develop/cdap-data-fabric/src/main/java/io/cdap/cdap/spi/data/common/DefaultStorageProvider.java#L96

Both the cases work.

<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-spanner</artifactId>
<version>6.12.1</version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the spanner version to use latest. Actually the spanner package comes from the google cloud library. You need to override the google cloud library in this extension to use the latest version like this - https://github.com/cdapio/cdap/blob/develop/cdap-storage-ext-spanner/pom.xml#L41.

</dependency>
<dependency>
<groupId>io.cdap.cdap</groupId>
<artifactId>cdap-common</artifactId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not have dependency on cdap-common ideally. There is a chance that some mvn packages will have versions incompatible to the latest spanner ones.

Please check if there is a way to remove it.

@123-komal 123-komal force-pushed the es-spanner_feature branch 3 times, most recently from 18e8057 to 05ce05e Compare May 30, 2025 09:46
@@ -2845,6 +2845,11 @@
<value>/opt/cdap/master/ext/log-publisher</value>
</property>

<property>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just for extension (not specific to spanner) - I think we might not need a flag here.

Komal is anyways testing the default flow for ES implementation. So should be safe.

@@ -179,7 +179,7 @@ public MetadataStorage get() {
return injector.getInstance(DatasetMetadataStorage.class);
}
if (Constants.Metadata.STORAGE_PROVIDER_ELASTICSEARCH.equalsIgnoreCase(config)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the if case as discussed earlier.
Let the DefaultStorageProvider handle the if else logic for ES to use or Spanner

@@ -83,6 +84,16 @@ public void setAuditPublisher(AuditPublisher auditPublisher) {
this.auditPublisher = auditPublisher;
}

@Override
public String getName() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please verify how this is called since the default value of metadata.storage.implementation=elastic

if (metadataStorage != null) {
return metadataStorage;
}
LOG.info(extensionLoader.get(getName()).toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove extra loggers that are used for debugging right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will run into NPE if getName is empty string - we should not run into NPEs.

@@ -74,6 +74,16 @@ public class DatasetMetadataStorage extends SearchHelper implements MetadataStor
super(txClient, tableDefinition);
}

@Override
public String getName(){
return "DatasetMetadataStorage";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can instead use .getClass().getSimpleName() instead

<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.google.cloud</groupId>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the entire google cloud version
This will eventually be overridden when google cloud package is imported

https://github.com/cdapio/cdap/blob/develop/cdap-storage-ext-spanner/pom.xml#L41

You can try mvn dependency tree command to find out which version is pulled.


@Override
public void initialize(MetadataStorageContext context) throws Exception {
Log.info("Spanner Initialized");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


@Override
public String getName() {
return Constants.Metadata.STORAGE_PROVIDER_SPANNER;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably this is adding the cdap-common dependency.
Please feel free to remove this constant to avoid that dependency.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see the META-INF for this. Please check - I guess it would be needed for Extension loader.

Format Changes

changes

Changes

name-changes

formatting-changes

elastic-changes
@123-komal 123-komal force-pushed the es-spanner_feature branch from 05ce05e to b2a0e2b Compare May 30, 2025 16:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants