This repository has been archived by the owner on Apr 18, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
mptcp: Close all subflows when fallen back and probe timer expires
There have been numerous reports of mptcp_sock_def_error_report warnings, like: ------------[ cut here ]------------ Meta already closed i_rcv 1 i_snd 1 send_i 0 flags 0x4004303 WARNING: CPU: 0 PID: 0 at net/mptcp/mptcp_ctrl.c:660 mptcp_sock_def_error_report+0xf6/0x100 Modules linked in: cls_fw cls_u32 sch_htb xt_tcpudp xt_policy drbg ansi_cprng authenc echainiv xfrm6_mode_tunnel xfrm4_mode_tunnel ipip fou ip_tunnel ip6_udp_tunnel udp_tunnel tun xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo dummy ip6table_filter ip6_tables iptable_filter xt_statistic xt_mark xt_connmark xt_conntrack xt_TPROXY nf_tproxy_ipv6 nf_tproxy_ipv4 xt_addrtype iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nfit libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper ppdev snd_pcsp intel_rapl_perf evdev snd_pcm parport_pc parport snd_timer snd soundcore serio_raw button mptcp_fullmesh ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto dm_mod crc32c_intel nvme i2c_piix4 nvme_core ena CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.56-mat-0.95 #5 Hardware name: Amazon EC2 m5.large/, BIOS 1.0 10/16/2017 RIP: 0010:mptcp_sock_def_error_report+0xf6/0x100 Code: 41 0f b6 74 24 24 4c 8b 45 60 48 c7 c7 f0 e1 ca 84 89 f2 89 f1 40 c0 ee 06 c0 ea 07 83 e1 01 83 e6 01 0f b6 d2 e8 ea 92 96 ff <0f> 0b e9 7a ff ff ff 0f 1f 00 0f 1f 44 00 00 41 55 41 89 d5 41 54 RSP: 0018:ffff9a61a5203dc0 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff9a614d3e3d40 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff9a61a52166b0 RBP: ffff9a61583ca300 R08: 0000000000000001 R09: 00000000000001d1 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9a615f863480 R13: 0000000000000008 R14: dead000000000200 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff9a61a5200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0fc758a4e0 CR3: 00000001f820a001 CR4: 00000000007606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> tcp_write_err+0x2b/0x50 tcp_write_timeout+0x10b/0x4a0 tcp_retransmit_timer+0x2ba/0x5e0 mptcp_sub_retransmit_timer+0xe/0xd0 tcp_write_timer_handler+0xe7/0x230 tcp_write_timer+0xbd/0xc0 ? tcp_write_timer_handler+0x230/0x230 call_timer_fn+0x2b/0x130 run_timer_softirq+0x1d3/0x420 __do_softirq+0x10d/0x2c3 irq_exit+0xc2/0xd0 smp_apic_timer_interrupt+0x74/0x130 apic_timer_interrupt+0xf/0x20 </IRQ> Most timers are no more firing when a connection is in fallback mode. Except, the probe-timer! When this one fires and tcp_write_err is being called, the meta-socket will get closed (through a call to tcp_done), but the subflow is kept alive. If the connection has fallen back to regular TCP, that means we will hit the condition in mptcp_sock_def_error_report, which means we are in an inconsistent state. Preventing the probe timer on the meta-socket from firing when in fallback mode would be a big re-architecture, too risky for a patch that needs to get backported to all the stable releases. Thus, let's just kill all the subflows in the probe-timer when we trigger a tcp_write_err. Github-Issue: #352 Github-Issue: #315 Reported-by: https://github.com/cifvts Reported-by: https://github.com/knxhm Fixes: Zero-day Bug Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> (cherry picked from commit 972a16b) Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
- Loading branch information