Skip to content

DubboProtocol.getSharedClient 方法并发环境下 可能出现 NPE 问题 #6444

Closed
@lxk-kk

Description

@lxk-kk
  • I have searched the issues of this repository and believe that this is not a duplicate.
  • I have checked the FAQ of this repository and believe that this is not a duplicate.

Environment

  • Dubbo version: 2.7.7 及其之前的版本

Steps to reproduce this issue

  1. 贴出源码
private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    private List<ReferenceCountExchangeClient> getSharedClient(URL url, int connectNum) {
        String key = url.getAddress();
        // 略去无关代码

        // 【1】
        locks.putIfAbsent(key, new Object());

       // 【2】
        synchronized (locks.get(key)) {
            clients = referenceClientMap.get(key);
            if (checkClientCanUse(clients)) {
                batchClientRefIncr(clients);
                return clients;
            }
            connectNum = Math.max(connectNum, 1);
            if (CollectionUtils.isEmpty(clients)) {
                clients = buildReferenceCountExchangeClientList(url, connectNum);
                referenceClientMap.put(key, clients);
            } else {
                for (int i = 0; i < clients.size(); i++) {
                    ReferenceCountExchangeClient referenceCountExchangeClient = clients.get(i);
                    if (referenceCountExchangeClient == null || referenceCountExchangeClient.isClosed()) {
                        clients.set(i, buildReferenceCountExchangeClient(url));
                        continue;
                    }
                    referenceCountExchangeClient.incrementAndGetCount();
                }
            }

            // 【3】
            locks.remove(key);
            return clients;
        }
    }

  1. 问题描述:
  • 注意上述 【1】、【2】、【3】处代码
  • 假设这种情况:并发环境下,线程 A、B 持有相同的 key,执行到该方法时。如果 A 执行到 同步块中,并在 locks.remove(key) 真正 remove 之前,线程 B 刚好执行到 locks.putIfAbsent(key, new Object())
  • 当 B 执行完该语句后,在 B 执行 locks.get(key) 之前,正好 A remove 完成。
  • 此时,B 执行到 locks.get(key) 将拿到 null,此时 抛出 NPE
  1. 问题提出:
  • 查看过历史 issue 后,使用 ConcurrentHashMap 解决了历史版本中 key.intern() 潜在性的死锁问题,但是随之也可能会有上述的 NPE 问题,所以这是被允许的吗?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions