OOM due to large number of requests in TransportService.clientHandlers

**Elasticsearch version** (`bin/elasticsearch --version`): 6.3.2

**Plugins installed**: [repository-hdfs]

**JVM version** (`java -version`): 10.0.2

**OS version** (`uname -a` if on a Unix-like system): CentOS 7.2

**Description of the problem including expected versus actual behavior**:
Our cluster has 24 data nodes (31 GB heap and 4 × 1.7 TB SSD disks each) and 3 master nodes (8 GB heap), with a write load of about 8000 TPS. We performed a rolling upgrade (6.3.2 to 6.3.2) of the cluster using the following steps:
① Set allocation to none.
② Restart the data node.
③ Set allocation to all.
Then wait for the cluster health to go from yellow to green.
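
For reference, steps ① and ③ can be sketched as a cluster settings update. This is a minimal sketch, assuming the steps toggle the `cluster.routing.allocation.enable` setting through `PUT _cluster/settings` as a transient setting, and assuming a node reachable at `http://localhost:9200`. The address and the helper names `allocation_request` and `set_allocation` are hypothetical and only for illustration.

```python
import json
import urllib.request

# Hypothetical node address, not taken from the report above.
ES_URL = "http://localhost:9200"


def allocation_request(mode, base_url=ES_URL):
    """Build (but do not send) a PUT _cluster/settings request.

    Assumption: "set allocation to none/all" means changing
    cluster.routing.allocation.enable as a transient setting.
    """
    if mode not in ("all", "primaries", "new_primaries", "none"):
        raise ValueError(f"unsupported allocation mode: {mode}")
    body = {"transient": {"cluster.routing.allocation.enable": mode}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def set_allocation(mode, base_url=ES_URL):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(allocation_request(mode, base_url)) as resp:
        return json.load(resp)
```

With a reachable cluster, `set_allocation("none")` would correspond to step ① and `set_allocation("all")` to step ③; the restart in step ② happens outside this sketch.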
After upgrading some of the data nodes, I waited a while and found that the master node was doing old-generation GC and later ran out of memory (OOM).
So I loaded a heap dump from the master node into Eclipse Memory Analyzer and found that 87.57% of memory was used by the `TransportService.clientHandlers` hash map.
Most of the `RequestHolder` entries were for actions such as `indices:monitor/stats[n]`, `indices:monitor/recovery[n]`, or `cluster:monitor/stats[n]`. Below are screenshots of the heap dump:
![client](https://user-images.githubusercontent.com/34452367/70923617-1c7c9800-2063-11ea-8f7f-ece1f61a3a67.png)
![1](https://user-images.githubusercontent.com/34452367/70923169-66b14980-2062-11ea-9d1a-2f20ff506b91.png)
![2](https://user-images.githubusercontent.com/34452367/70923174-687b0d00-2062-11ea-975c-66555a62ec2c.png)

I then used the OQL query
`SELECT toString(action) FROM org.elasticsearch.transport.TransportService$RequestHolder`
to count the pending requests by action. The result is as follows:

![action](https://user-images.githubusercontent.com/34452367/70923478-e6d7af00-2062-11ea-8e84-10a109158b5f.png)
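
In case it helps anyone reproduce the tally, the per-action counts can be computed from the OQL output with a few lines of Python. This is only a sketch, assuming the OQL result has been exported as plain text with one action string per line; the helper name `count_actions` and the sample lines are hypothetical and are not real data from the heap dump.

```python
from collections import Counter


def count_actions(lines):
    """Count how often each action string occurs, ignoring blank lines."""
    actions = (line.strip() for line in lines)
    return Counter(action for action in actions if action)


# Hypothetical sample input; the action names are the ones mentioned above,
# but the number of lines here is made up for illustration.
sample = [
    "indices:monitor/stats[n]",
    "indices:monitor/recovery[n]",
    "indices:monitor/stats[n]",
    "cluster:monitor/stats[n]",
]

# Print the actions from most to least frequent.
for action, count in count_actions(sample).most_common():
    print(f"{count}\t{action}")
```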


So, is this a bug, or is the master node overloaded?
