Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/oap/cluster plugin #1440

Merged
merged 11 commits into from
Jul 11, 2018

Conversation

peng-yongsheng
Copy link
Member

Please answer these questions before submitting the pull request

  • Why submit this pull request?
  • Bug fix
  • New feature provided
  • Improve performance

New feature or improvement

  • Describe the details and related test reports.

Cluster management implementation.

@peng-yongsheng peng-yongsheng added core feature Core and important feature. Sometimes, break backwards compatibility. backend OAP backend related. labels Jul 10, 2018
@peng-yongsheng peng-yongsheng added this to the 6.0.0-preview milestone Jul 10, 2018
@peng-yongsheng
Copy link
Member Author

Some test cases unfinished, I will supplement them later.

Copy link
Member

@wu-sheng wu-sheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions and questions have been added to the comments.

@@ -0,0 +1,39 @@
# Observability Analysis Platform Cluster Management
OAP(Observability Analysis Platform) server is a distributed system, services need to find each
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the document, we are using backend to represent the collector or server. Let's keep it same.

OAP(Observability Analysis Platform) server is a distributed system, services need to find each
other. i.e. a web service needs to find query service, level 1 aggregate service need to find
level 2 aggregate service. Cluster management is just a client-side implementation. It must work
together with a distributed coordination server, (i.e. Zookeeper, Consul, Kubernetes.) unless
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be distributed coordination service. Service is Zookeeper says for itself. Ref https://zookeeper.apache.org/


### Cluster Management Plugins
By default, OAP server provides two implementations for cluster management, which are standalone
and zookeeper. When the applications being monitored are small, you can choose the standalone mode.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the scale of services, which are under monitoring, is small,

### Cluster Management Plugins
By default, OAP server provides two implementations for cluster management, which are standalone
and zookeeper. When the applications being monitored are small, you can choose the standalone mode.
If they are big, you must choose the cluster mode by zookeeper plugin, or you can implement another
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If they are big - > Otherwise
  2. It(Scale) is big.
  3. Remove another

There are two interfaces defined beforehand in the OAP server core, which are Module register and
module query, all the cluster management plugins must implement those two interfaces.

* Module Register: When any modules which need to provide services for each other, those modules
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When any module needs to provide services for others, the module must do register through this interface.
The first half of adjustment is about the language only, the second half is that Cluster Management Interface can't guarantee the register will be into service discovery service. Like standalone is no service existed outside, Kubernetes is doing service instance management by itself, no real register happens.

* Module Register: When any modules which need to provide services for each other, those modules
must invoke this interface to register itself into the service discovery server.
* Module Query: When any modules which need to call each other, those modules must retrieve the
module set by this interface.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. When any modules which need to call each other -> Same as above. No need which. each other means in the same group or set, but in this case, they are not.
  2. The descriptions of Register and Query should be from the provider side. Such as When any module needs to find other services, it can use this interface to retrieve the service instance list in the certain order.

* Module Query: When any modules which need to call each other, those modules must retrieve the
module set by this interface.

### Process Flow Between Client and Cluster Management
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the following description:
The client has two ways to connect the backend, one, use the direct link by a set of backend instance endpoint list, or you can use naming service, which considers your given list is just the seed nodes of the whole cluster. The following graph is showing you how naming service works. You need to check the probe documents to know which way(s) is(are) supported.

private final Map<String, ServiceCache<InstanceDetails>> serviceCacheMap;

public ServiceCacheManager() {
this.serviceCacheMap = new HashMap<>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure about the OP on put is synchronized already? No concurrency?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just can know there should be multiple service names in this Map, but do they just be added during startup?

public List<InstanceDetails> query(String moduleName, String providerName) throws ServiceRegisterException {
List<ServiceInstance<InstanceDetails>> serviceInstances = cacheManager.get(NodeNameBuilder.build(moduleName, providerName)).getInstances();

List<InstanceDetails> instanceDetails = new ArrayList<>(serviceInstances.size());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a high-frequency call? Will query be called everytime you do secondary aggregation? If so, be careful of this ArrayList.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ArrayList is a fixed size set.

cluster:
zookeeper:
hostPort: localhost:2181
sessionTimeout: 100000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any purpose to change this?

@coveralls
Copy link

coveralls commented Jul 11, 2018

Coverage Status

Coverage increased (+0.6%) to 25.07% when pulling b1b5013 on peng-yongsheng:feature/oap/cluster_plugin into a26e6af on apache:6.0.

@wu-sheng wu-sheng merged commit dcdfeb1 into apache:6.0 Jul 11, 2018
@peng-yongsheng peng-yongsheng deleted the feature/oap/cluster_plugin branch July 30, 2018 02:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend OAP backend related. core feature Core and important feature. Sometimes, break backwards compatibility.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants