bonding: do not collect slave's stats
When a stats query is requested (dev_get_stats()), a bonding interface
collects the stats of its slave interfaces, folds the delta value,
"new_stats - old_stats", into its own counters, and updates
bond->bond_stats.
But this mechanism has some problems.

1. It needs a lock to protect "bond->bond_stats".
Bonding interfaces can be nested, so this lock can be nested as well;
either spin_lock_nested() or a dynamic lockdep class key was used.
In the spin_lock_nested() case, the correct nesting level value is
needed, and this value changes when master/nomaster operations
(ip link set bond0 master bond1) are executed.
The value is protected by the RTNL mutex, but dev_get_stats() can be
called outside of RTNL, so an unbalanced lock/unlock can happen.
The other case, using a dynamic lockdep class key, has the same problem:
the key is protected by the RTNL mutex and must be updated when
master/nomaster operations are executed.
Since dev_get_stats() can be called outside of RTNL, an unbalanced
lock/unlock can happen here too.

2. Correct stats values can't be shown when slave interfaces are used
directly.

Test commands:
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link set veth1 netns nst
    ip link add bond0 type bond
    ip link set veth0 master bond0
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip a a 192.168.100.2/24 dev veth1
    ip a a 192.168.100.1/24 dev bond0
    ip link set veth0 up
    ip link set bond0 up
    ping 192.168.100.2 -I veth0 -c 10
    ip -s link show bond0
    ip -s link show veth0

Before:
26: bond0:
RX: bytes  packets  errors  dropped overrun mcast
656        8        0       0       0       0
TX: bytes  packets  errors  dropped carrier collsns
1340       22       0       0       0       0
~~~~~~~~~~~~

25: veth0@if24:
RX: bytes  packets  errors  dropped overrun mcast
656        8        0       0       0       0
TX: bytes  packets  errors  dropped carrier collsns
1340       22       0       0       0       0
~~~~~~~~~~~~

After:
19: bond0:
RX: bytes  packets  errors  dropped overrun mcast
544        8        0       0       0       8
TX: bytes  packets  errors  dropped carrier collsns
746        9        0       0       0       0
~~~~~~~~~~~

18: veth0@if17:
link/ether 76:14:ee:f1:7d:8e brd ff:ff:ff:ff:ff:ff link-netns nst
RX: bytes  packets  errors  dropped overrun mcast
656        8        0       0       0       0
TX: bytes  packets  errors  dropped carrier collsns
1250       21       0       0       0       0
~~~~~~~~~~~~

Only the veth0 interface is used directly by the ping process; bond0 is
not used, so bond0's stats should not increase. But bond0 collects the
stats of its slave interface, so bond0's stats increase as well.

In order to fix the above problems, this patch makes bonding interfaces
record their own stats, like other interfaces do.
The patch is modeled on the team driver's stats logic.

There is another problem.
When master/nomaster operations are executed, the dynamic lockdep class
key needs to be updated, but bonding doesn't update it, so a lockdep
warning occurs.
This problem also disappears with this patch, because the patch removes
stats_lock and the dynamic lockdep class key for it, stats_lock_key.

Test commands:
    ip link add bond0 type bond
    ip link add bond1 type bond
    ip link set bond0 master bond1
    ip link set bond0 nomaster
    ip link set bond1 master bond0

Splat looks like:
[  316.354460][ T1170] WARNING: possible circular locking dependency detected
[  316.355240][ T1170] 5.5.0+ #366 Not tainted
[  316.355720][ T1170] ------------------------------------------------------
[  316.357345][ T1170] ip/1170 is trying to acquire lock:
[  316.358007][ T1170] ffff8880ca79acd8 (&bond->stats_lock_key#2){+.+.}, at: bond_get_stats+0x90/0x4d0 [bonding]
[  316.359544][ T1170]
[  316.359544][ T1170] but task is already holding lock:
[  316.360578][ T1170] ffff8880b12f2cd8 (&bond->stats_lock_key){+.+.}, at: bond_get_stats+0x90/0x4d0 [bonding]
[  316.361992][ T1170]
[  316.361992][ T1170] which lock already depends on the new lock.
[  316.361992][ T1170]
[  316.363446][ T1170]
[  316.363446][ T1170] the existing dependency chain (in reverse order) is:
[  316.364739][ T1170]
[  316.364739][ T1170] -> #1 (&bond->stats_lock_key){+.+.}:
[  316.366686][ T1170]        _raw_spin_lock+0x30/0x70
[  316.367394][ T1170]        bond_get_stats+0x90/0x4d0 [bonding]
[  316.368202][ T1170]        dev_get_stats+0x1ec/0x270
[  316.368890][ T1170]        bond_get_stats+0x1a5/0x4d0 [bonding]
[  316.370573][ T1170]        dev_get_stats+0x1ec/0x270
[  316.371227][ T1170]        rtnl_fill_stats+0x44/0xbe0
[  316.371891][ T1170]        rtnl_fill_ifinfo+0xeb2/0x3720
[  316.372619][ T1170]        rtmsg_ifinfo_build_skb+0xca/0x170
[  316.373371][ T1170]        rtmsg_ifinfo_event.part.33+0x1b/0xb0
[  316.374161][ T1170]        rtnetlink_event+0xcd/0x120
[  316.375112][ T1170]        notifier_call_chain+0x90/0x160
[  316.375680][ T1170]        netdev_change_features+0x74/0xa0
[  316.376417][ T1170]        bond_compute_features.isra.45+0x4e6/0x6f0 [bonding]
[  316.378357][ T1170]        bond_enslave+0x3639/0x47b0 [bonding]
[  316.379138][ T1170]        do_setlink+0xaab/0x2ef0
[  316.379757][ T1170]        __rtnl_newlink+0x9c5/0x1270
[  316.380486][ T1170]        rtnl_newlink+0x65/0x90
[  316.381271][ T1170]        rtnetlink_rcv_msg+0x4a8/0x890
[  316.382138][ T1170]        netlink_rcv_skb+0x121/0x350
[  316.382793][ T1170]        netlink_unicast+0x42e/0x610
[  316.383507][ T1170]        netlink_sendmsg+0x65a/0xb90
[  316.384398][ T1170]        ____sys_sendmsg+0x5ce/0x7a0
[  316.385084][ T1170]        ___sys_sendmsg+0x10f/0x1b0
[  316.385778][ T1170]        __sys_sendmsg+0xc6/0x150
[  316.386469][ T1170]        do_syscall_64+0x99/0x4f0
[  316.387185][ T1170]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  316.388000][ T1170]
[  316.388000][ T1170] -> #0 (&bond->stats_lock_key#2){+.+.}:
[  316.389019][ T1170]        __lock_acquire+0x2d8d/0x3de0
[  316.392598][ T1170]        lock_acquire+0x164/0x3b0
[  316.393309][ T1170]        _raw_spin_lock+0x30/0x70
[  316.393962][ T1170]        bond_get_stats+0x90/0x4d0 [bonding]
[  316.394787][ T1170]        dev_get_stats+0x1ec/0x270
[  316.395496][ T1170]        bond_get_stats+0x1a5/0x4d0 [bonding]
[  316.396285][ T1170]        dev_get_stats+0x1ec/0x270
[  316.396949][ T1170]        rtnl_fill_stats+0x44/0xbe0
[  316.400420][ T1170]        rtnl_fill_ifinfo+0xeb2/0x3720
[  316.401122][ T1170]        rtmsg_ifinfo_build_skb+0xca/0x170
[  316.401933][ T1170]        rtmsg_ifinfo_event.part.33+0x1b/0xb0
[  316.402739][ T1170]        rtnetlink_event+0xcd/0x120
[  316.403477][ T1170]        notifier_call_chain+0x90/0x160
[  316.404187][ T1170]        netdev_change_features+0x74/0xa0
[  316.404957][ T1170]        bond_compute_features.isra.45+0x4e6/0x6f0 [bonding]
[  316.405978][ T1170]        bond_enslave+0x3639/0x47b0 [bonding]
[  316.406780][ T1170]        do_setlink+0xaab/0x2ef0
[  316.407441][ T1170]        __rtnl_newlink+0x9c5/0x1270
[  316.408152][ T1170]        rtnl_newlink+0x65/0x90
[  316.408788][ T1170]        rtnetlink_rcv_msg+0x4a8/0x890
[  316.411550][ T1170]        netlink_rcv_skb+0x121/0x350
[  316.412230][ T1170]        netlink_unicast+0x42e/0x610
[  316.412897][ T1170]        netlink_sendmsg+0x65a/0xb90
[  316.413703][ T1170]        ____sys_sendmsg+0x5ce/0x7a0
[  316.414448][ T1170]        ___sys_sendmsg+0x10f/0x1b0
[  316.415129][ T1170]        __sys_sendmsg+0xc6/0x150
[  316.415801][ T1170]        do_syscall_64+0x99/0x4f0
[  316.416491][ T1170]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  316.417407][ T1170]
[  316.417407][ T1170] other info that might help us debug this:
[  316.417407][ T1170]
[  316.418728][ T1170]  Possible unsafe locking scenario:
[  316.418728][ T1170]
[  316.419693][ T1170]        CPU0                    CPU1
[  316.420400][ T1170]        ----                    ----
[  316.421096][ T1170]   lock(&bond->stats_lock_key);
[  316.421821][ T1170]                                lock(&bond->stats_lock_key#2);
[  316.422819][ T1170]                                lock(&bond->stats_lock_key);
[  316.424906][ T1170]   lock(&bond->stats_lock_key#2);
[  316.425596][ T1170]
[  316.425596][ T1170]  *** DEADLOCK ***
[  316.425596][ T1170]
[  316.426726][ T1170] 3 locks held by ip/1170:
[  316.427351][ T1170]  #0: ffffffffa2cf60f0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x457/0x890
[  316.428556][ T1170]  #1: ffff8880b12f2cd8 (&bond->stats_lock_key){+.+.}, at: bond_get_stats+0x90/0x4d0 [bonding]
[  316.431170][ T1170]  #2: ffffffffa29254c0 (rcu_read_lock){....}, at: bond_get_stats+0x5/0x4d0 [bonding]
[  316.432411][ T1170]
[  316.432411][ T1170] stack backtrace:
[  316.433341][ T1170] CPU: 3 PID: 1170 Comm: ip Not tainted 5.5.0+ #366
[  316.434208][ T1170] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  316.435367][ T1170] Call Trace:
[  316.435846][ T1170]  dump_stack+0x96/0xdb
[  316.436449][ T1170]  check_noncircular+0x371/0x450
[  316.437101][ T1170]  ? print_circular_bug.isra.35+0x310/0x310
[  316.437895][ T1170]  ? hlock_class+0x130/0x130
[  316.438531][ T1170]  ? __lock_acquire+0x2d8d/0x3de0
[  316.439241][ T1170]  __lock_acquire+0x2d8d/0x3de0
[  316.439889][ T1170]  ? register_lock_class+0x14d0/0x14d0
[  316.440602][ T1170]  ? check_chain_key+0x236/0x5d0
[  316.442135][ T1170]  lock_acquire+0x164/0x3b0
[  316.442789][ T1170]  ? bond_get_stats+0x90/0x4d0 [bonding]
[  316.443524][ T1170]  _raw_spin_lock+0x30/0x70
[  316.444112][ T1170]  ? bond_get_stats+0x90/0x4d0 [bonding]
[  316.445571][ T1170]  bond_get_stats+0x90/0x4d0 [bonding]
[  316.446320][ T1170]  ? bond_neigh_init+0x2d0/0x2d0 [bonding]
[  316.447330][ T1170]  ? rcu_read_lock_held+0x90/0xa0
[  316.448003][ T1170]  ? rcu_read_lock_sched_held+0xc0/0xc0
[  316.450802][ T1170]  ? bond_get_stats+0x5/0x4d0 [bonding]
[  316.451569][ T1170]  dev_get_stats+0x1ec/0x270
[  316.452206][ T1170]  bond_get_stats+0x1a5/0x4d0 [bonding]
[ ... ]

Fixes: 089bca2 ("bonding: use dynamic lockdep key instead of subclass")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
TaeheeYoo authored and intel-lab-lkp committed Feb 16, 2020
1 parent e083cae commit f4f0c0c
Showing 4 changed files with 138 additions and 117 deletions.
14 changes: 7 additions & 7 deletions drivers/net/bonding/bond_alb.c
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -1299,8 +1299,8 @@ void bond_alb_deinitialize(struct bonding *bond)
 	rlb_deinitialize(bond);
 }
 
-static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
-				    struct slave *tx_slave)
+static bool bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
+			     struct slave *tx_slave)
 {
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
 	struct ethhdr *eth_data = eth_hdr(skb);
@@ -1319,7 +1319,7 @@ static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
 		}
 
 		bond_dev_queue_xmit(bond, skb, tx_slave->dev);
-		goto out;
+		return true;
 	}
 
 	if (tx_slave && bond->params.tlb_dynamic_lb) {
@@ -1330,11 +1330,11 @@ static netdev_tx_t bond_do_alb_xmit(struct sk_buff *skb, struct bonding *bond,
 
 	/* no suitable interface, frame not sent */
 	bond_tx_drop(bond->dev, skb);
-out:
-	return NETDEV_TX_OK;
+
+	return false;
 }
 
-netdev_tx_t bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
+bool bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct ethhdr *eth_data;
@@ -1372,7 +1372,7 @@ netdev_tx_t bond_tlb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 	return bond_do_alb_xmit(skb, bond, tx_slave);
 }
 
-netdev_tx_t bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
+bool bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
 	struct ethhdr *eth_data;
