Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib: changes for making snmp socket non-blocking #5134

Merged
merged 1 commit into from
Oct 15, 2019

Conversation

sudhanshukumar22
Copy link
Contributor

lib: changes for making snmp socket non-blocking
Description: The changes have been done to make the snmp socket
non-blocking before calling snmp_read()
Problem Description/Summary :
vtysh hangs on first try to enter after a reboot with BGP dynamic peers

Expected Behavior :
VTYSH should not hang.
When we debug more into bgpd docker by doing gdb on its threads, we find the below thread of bgpd, which is causing the issue.
Thread 1 (Thread 0x7f1e1ec46d40 (LWP 47)):

0x00007f1e1d762593 in recvfrom () from /lib/x86_64-linux-gnu/libpthread.so.0
0x00007f1e1aadd09b in netsnmp_tcpbase_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aad9617 in netsnmp_transport_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab2c07 in _sess_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab3a29 in snmp_sess_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab3a7b in snmp_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab3acf in snmp_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1b44d7ec in agentx_read (t=0x7fffa75f0080) at lib/agentx.c:63
0x00007f1e1e7d6451 in thread_call (thread=0x7fffa75f0080) at lib/thread.c:1620
0x00007f1e1e770699 in frr_run (master=0x559396ea60f0) at lib/libfrr.c:1011
0x0000559395b4d953 in main (argc=5, argv=0x7fffa75f02b8) at bgpd/bgp_main.c:492

(gdb) bt

0x00007f830c89d210 in __read_nocancel () from /lib/x86_64-linux-gnu/libpthread.so.0
0x000056450e1e8238 in vtysh_client_run (vclient=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:216
0x000056450e1e8c6b in vtysh_client_run_all (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, continue_on_err=0, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:356
0x000056450e1e8ddb in vtysh_client_execute (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable) at vtysh/vtysh.c:393
0x000056450e1e9c82 in vtysh_execute_func (line=0x56450e21add0 enable, pager=0) at vtysh/vtysh.c:598
0x000056450e1e9dee in vtysh_execute_no_pager (line=0x56450e21add0 enable) at vtysh/vtysh.c:619
0x000056450e1f7d48 in vtysh_read_file (confp=0x56451000a9d0, top_cfg=1) at vtysh/vtysh_config.c:494
0x000056450e1f7ef2 in vtysh_read_config (config_default_dir=0x56450e4edc20 <frr_config> /etc/frr/frr.conf, top_cfg=1) at vtysh/vtysh_config.c:522
0x000056450e1e5de4 in vtysh_apply_top_level_config () at vtysh/vtysh_main.c:301
0x000056450e1e7842 in main (argc=2, argv=0x7ffc81e6f598, env=0x7ffc81e6f5b0) at vtysh/vtysh_main.c:692

The fix has been taken from the following link.
https://sourceforge.net/p/net-snmp/patches/1348/

Signed-off-by: Preetham Singh (preetham.singh@broadcom.com)

Copy link

@polychaeta polychaeta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution to FRR!

  • One of your commits does not have a blank line between the summary and body; this will break git log --oneline

If you are a new contributor to FRR, please see our contributing guidelines.

@LabN-CI
Copy link
Collaborator

LabN-CI commented Oct 10, 2019

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/5134 4765a43
Date 10/10/2019
Start 02:20:16
Finish 02:41:52
Run-Time 21:36
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2019-10-10-02:20:16.txt
Log autoscript-2019-10-10-02:21:07.log.bz2
Memory 414 412 360

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9190/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 5134, comparing to Git base SHA d00d0f9

No Changes in Static Analysis warnings compared to base

1 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9190/artifact/shared/static_analysis/index.html

@LabN-CI
Copy link
Collaborator

LabN-CI commented Oct 10, 2019

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/5134 1de6fc9
Date 10/10/2019
Start 06:10:17
Finish 06:31:55
Run-Time 21:38
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2019-10-10-06:10:17.txt
Log autoscript-2019-10-10-06:11:09.log.bz2
Memory 428 417 360

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9192/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 5134, comparing to Git base SHA d09e0a8

No Changes in Static Analysis warnings compared to base

1 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9192/artifact/shared/static_analysis/index.html

@sudhanshukumar22
Copy link
Contributor Author

@donaldsharp @rwestphal , please review the changes. I was facing some issues with github in #5061 and #5099. I have raised a new PR here. I have closed the earlier PRs.

@qlyoung
Copy link
Member

qlyoung commented Oct 10, 2019

@sudhanshukumar22

  • please rebase to eliminate the merge commit
  • Your signed off by line says Signed-off-by: Preetham Singh (preetham.singh@broadcom.com); the correct format uses <> not (), remember git will generate a signature for you, all you have to do is pass -s when you commit

@sudhanshukumar22
Copy link
Contributor Author

@sudhanshukumar22

  • please rebase to eliminate the merge commit
  • Your signed off by line says Signed-off-by: Preetham Singh (preetham.singh@broadcom.com); the correct format uses <> not (), remember git will generate a signature for you, all you have to do is pass -s when you commit

Hi @qlyoung,
I have one commit below. What are the git operations to avoid the below commit.
"Merge remote-tracking branch 'upstream/master' into bgp-snmp-issue". I have tried git pull --rebase. but it is not working.
-bash-4.1$ git status -uno
On branch bgp-snmp-issue
nothing to commit (use -u to show untracked files)
-bash-4.1$ git pull --rebase
There is no tracking information for the current branch.
Please specify which branch you want to rebase against.
See git-pull(1) for details.

git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

git branch --set-upstream-to=<remote>/<branch> bgp-snmp-issue

@sudhanshukumar22 sudhanshukumar22 force-pushed the bgp-snmp-issue branch 2 times, most recently from f049940 to 241aaf9 Compare October 11, 2019 04:53
Description: The changes have been done to make the snmp socket
non-blocking before calling snmp_read()

Problem Description/Summary :
vtysh hangs on first try to enter after a reboot with BGP dynamic peers

Expected Behavior :
VTYSH should not hang.

When we debug more into bgpd docker by doing gdb on its threads, we find the below thread of bgpd, which is causing the issue.
Thread 1 (Thread 0x7f1e1ec46d40 (LWP 47)):

0x00007f1e1d762593 in recvfrom () from /lib/x86_64-linux-gnu/libpthread.so.0
0x00007f1e1aadd09b in netsnmp_tcpbase_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aad9617 in netsnmp_transport_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab2c07 in _sess_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab3a29 in snmp_sess_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab3a7b in snmp_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1aab3acf in snmp_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
0x00007f1e1b44d7ec in agentx_read (t=0x7fffa75f0080) at lib/agentx.c:63
0x00007f1e1e7d6451 in thread_call (thread=0x7fffa75f0080) at lib/thread.c:1620
0x00007f1e1e770699 in frr_run (master=0x559396ea60f0) at lib/libfrr.c:1011
0x0000559395b4d953 in main (argc=5, argv=0x7fffa75f02b8) at bgpd/bgp_main.c:492

(gdb) bt

0x00007f830c89d210 in __read_nocancel () from /lib/x86_64-linux-gnu/libpthread.so.0
0x000056450e1e8238 in vtysh_client_run (vclient=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:216
0x000056450e1e8c6b in vtysh_client_run_all (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, continue_on_err=0, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:356
0x000056450e1e8ddb in vtysh_client_execute (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable) at vtysh/vtysh.c:393
0x000056450e1e9c82 in vtysh_execute_func (line=0x56450e21add0 enable, pager=0) at vtysh/vtysh.c:598
0x000056450e1e9dee in vtysh_execute_no_pager (line=0x56450e21add0 enable) at vtysh/vtysh.c:619
0x000056450e1f7d48 in vtysh_read_file (confp=0x56451000a9d0, top_cfg=1) at vtysh/vtysh_config.c:494
0x000056450e1f7ef2 in vtysh_read_config (config_default_dir=0x56450e4edc20 <frr_config> /etc/frr/frr.conf, top_cfg=1) at vtysh/vtysh_config.c:522
0x000056450e1e5de4 in vtysh_apply_top_level_config () at vtysh/vtysh_main.c:301
0x000056450e1e7842 in main (argc=2, argv=0x7ffc81e6f598, env=0x7ffc81e6f5b0) at vtysh/vtysh_main.c:692

The fix has been taken from the following link.
https://sourceforge.net/p/net-snmp/patches/1348/

Signed-off-by: Preetham Singh <preetham.singh@broadcom.com>
@LabN-CI
Copy link
Collaborator

LabN-CI commented Oct 11, 2019

💚 Basic BGPD CI results: SUCCESS, 0 tests failed

Results table
_ _
Result SUCCESS git merge/5134 3b71726
Date 10/11/2019
Start 00:55:17
Finish 01:16:57
Run-Time 21:40
Total 1815
Pass 1815
Fail 0
Valgrind-Errors 0
Valgrind-Loss 0
Details vncregress-2019-10-11-00:55:17.txt
Log autoscript-2019-10-11-00:56:05.log.bz2
Memory 430 431 360

For details, please contact louberger

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9204/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 5134, comparing to Git base SHA f043157

No Changes in Static Analysis warnings compared to base

1 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9204/artifact/shared/static_analysis/index.html

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9205/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 5134, comparing to Git base SHA f043157

No Changes in Static Analysis warnings compared to base

1 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9205/artifact/shared/static_analysis/index.html

@NetDEF-CI
Copy link
Collaborator

Continuous Integration Result: SUCCESSFUL

Congratulations, this patch passed basic tests

Tested-by: NetDEF / OpenSourceRouting.org CI System

CI System Testrun URL: https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9203/

This is a comment from an automated CI system.
For questions and feedback in regards to this CI system, please feel free to email
Martin Winter - mwinter (at) opensourcerouting.org.


CLANG Static Analyzer Summary

  • Github Pull Request 5134, comparing to Git base SHA f043157

No Changes in Static Analysis warnings compared to base

1 Static Analyzer issues remaining.

See details at
https://ci1.netdef.org/browse/FRR-FRRPULLREQ-9203/artifact/shared/static_analysis/index.html

FD_ZERO(&fds);
FD_SET(THREAD_FD(t), &fds);
snmp_read(&fds);

/* Reset the flag */
if (!nonblock)
fcntl(THREAD_FD(t), F_SETFL, flags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we turning it off again? Is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we turning it off again? Is this needed?
Please note that the defect was because vtysh was hanging for a blocking snmp socket in bgp. What we are doing here is as follows:

  1. when we read from the agent, check the type of the socket. if the socket is blocking, convert it into non blocking socket by setting a non-blocking flag on it. If the socket is already non-blocking, do not do anything.
  2. read the data from the socket.
  3. while returning from the agent, if we had set the non blocking flag on the socket, we reset that flag. This makes sure that when we return from this function, we have the same state on the socket flag as we had when we entered inside the function.
    Hope I was clear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understood the code. I was asking why 3) is needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

netsnmp is library code(can change from version to version). It might have some code that depends on the socket flag being blocking or non-blocking. When I changed the code here, I did not want to change the socket flag.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see. The sockets themselves are created by net-snmp. We get them, make them nonblocking, then pass them to another net-snmp library function. Then set the flag back to whatever it was before. Real gnarly.

Someone with a better understanding of net-snmp should probably take a look at this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -55,13 +55,29 @@ static int agentx_timeout(struct thread *t)
static int agentx_read(struct thread *t)
{
fd_set fds;
int flags;
int nonblock = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to rename the variable into something more easy to read.
For example flags_changed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this variable holds the flags set on the socket prior to us changing it. So I don't really agree.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't agree with that. The variable was initialized with some value before we read the socket state.
Then we do some magic, changing it only in one if branch.
For me it was hard to read.
I'd prefer to have a solution which will change a variable to true only when we changed flag parameter, and then restore the flag, only if the variable was true.

P.S.: Honestly saying we don't need this variable at all.
We can restore the socket flags without any conditions, and set the socket to NON_BLOCK without any conditions.

@qlyoung
Copy link
Member

qlyoung commented Oct 15, 2019

@sudhanshukumar22 thanks!

@qlyoung qlyoung merged commit e5ab18e into FRRouting:master Oct 15, 2019
sudhanshukumar22 added a commit to sudhanshukumar22/sonic-buildimage that referenced this pull request Oct 16, 2019
…ket-issue, origin/bgp-snmp-socket-issue)

Author: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
Date:   Tue Oct 15 02:31:20 2019 -0700

    lib: changes for making snmp socket non-blocking

    Description: The changes have been done to make the snmp socket
    non-blocking before calling snmp_read()
    FRR Pull request: FRRouting/frr#5134

    Problem Description/Summary :
    vtysh hangs on first try to enter after a reboot with BGP dynamic peers

    Expected Behavior :
    VTYSH should not hang.
    When we debug more into bgpd docker by doing gdb on its threads, we find the below thread of bgpd, which is causing the issue.
    Thread 1 (Thread 0x7f1e1ec46d40 (LWP 47)):

    0x00007f1e1d762593 in recvfrom () from /lib/x86_64-linux-gnu/libpthread.so.0
    0x00007f1e1aadd09b in netsnmp_tcpbase_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aad9617 in netsnmp_transport_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab2c07 in _sess_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3a29 in snmp_sess_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3a7b in snmp_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3acf in snmp_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1b44d7ec in agentx_read (t=0x7fffa75f0080) at lib/agentx.c:63
    0x00007f1e1e7d6451 in thread_call (thread=0x7fffa75f0080) at lib/thread.c:1620
    0x00007f1e1e770699 in frr_run (master=0x559396ea60f0) at lib/libfrr.c:1011
    0x0000559395b4d953 in main (argc=5, argv=0x7fffa75f02b8) at bgpd/bgp_main.c:492

    (gdb) bt

    0x00007f830c89d210 in __read_nocancel () from /lib/x86_64-linux-gnu/libpthread.so.0
    0x000056450e1e8238 in vtysh_client_run (vclient=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:216
    0x000056450e1e8c6b in vtysh_client_run_all (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, continue_on_err=0, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:356
    0x000056450e1e8ddb in vtysh_client_execute (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable) at vtysh/vtysh.c:393
    0x000056450e1e9c82 in vtysh_execute_func (line=0x56450e21add0 enable, pager=0) at vtysh/vtysh.c:598
    0x000056450e1e9dee in vtysh_execute_no_pager (line=0x56450e21add0 enable) at vtysh/vtysh.c:619
    0x000056450e1f7d48 in vtysh_read_file (confp=0x56451000a9d0, top_cfg=1) at vtysh/vtysh_config.c:494
    0x000056450e1f7ef2 in vtysh_read_config (config_default_dir=0x56450e4edc20 <frr_config> /etc/frr/frr.conf, top_cfg=1) at vtysh/vtysh_config.c:522
    0x000056450e1e5de4 in vtysh_apply_top_level_config () at vtysh/vtysh_main.c:301
    0x000056450e1e7842 in main (argc=2, argv=0x7ffc81e6f598, env=0x7ffc81e6f5b0) at vtysh/vtysh_main.c:692

    The fix has been taken from the following link.
    https://sourceforge.net/p/net-snmp/patches/1348/
pavel-shirshov pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Oct 16, 2019
…ket-issue, origin/bgp-snmp-socket-issue) (#3604)

Author: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
Date:   Tue Oct 15 02:31:20 2019 -0700

    lib: changes for making snmp socket non-blocking

    Description: The changes have been done to make the snmp socket
    non-blocking before calling snmp_read()
    FRR Pull request: FRRouting/frr#5134

    Problem Description/Summary :
    vtysh hangs on first try to enter after a reboot with BGP dynamic peers

    Expected Behavior :
    VTYSH should not hang.
    When we debug more into bgpd docker by doing gdb on its threads, we find the below thread of bgpd, which is causing the issue.
    Thread 1 (Thread 0x7f1e1ec46d40 (LWP 47)):

    0x00007f1e1d762593 in recvfrom () from /lib/x86_64-linux-gnu/libpthread.so.0
    0x00007f1e1aadd09b in netsnmp_tcpbase_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aad9617 in netsnmp_transport_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab2c07 in _sess_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3a29 in snmp_sess_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3a7b in snmp_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3acf in snmp_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1b44d7ec in agentx_read (t=0x7fffa75f0080) at lib/agentx.c:63
    0x00007f1e1e7d6451 in thread_call (thread=0x7fffa75f0080) at lib/thread.c:1620
    0x00007f1e1e770699 in frr_run (master=0x559396ea60f0) at lib/libfrr.c:1011
    0x0000559395b4d953 in main (argc=5, argv=0x7fffa75f02b8) at bgpd/bgp_main.c:492

    (gdb) bt

    0x00007f830c89d210 in __read_nocancel () from /lib/x86_64-linux-gnu/libpthread.so.0
    0x000056450e1e8238 in vtysh_client_run (vclient=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:216
    0x000056450e1e8c6b in vtysh_client_run_all (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, continue_on_err=0, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:356
    0x000056450e1e8ddb in vtysh_client_execute (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable) at vtysh/vtysh.c:393
    0x000056450e1e9c82 in vtysh_execute_func (line=0x56450e21add0 enable, pager=0) at vtysh/vtysh.c:598
    0x000056450e1e9dee in vtysh_execute_no_pager (line=0x56450e21add0 enable) at vtysh/vtysh.c:619
    0x000056450e1f7d48 in vtysh_read_file (confp=0x56451000a9d0, top_cfg=1) at vtysh/vtysh_config.c:494
    0x000056450e1f7ef2 in vtysh_read_config (config_default_dir=0x56450e4edc20 <frr_config> /etc/frr/frr.conf, top_cfg=1) at vtysh/vtysh_config.c:522
    0x000056450e1e5de4 in vtysh_apply_top_level_config () at vtysh/vtysh_main.c:301
    0x000056450e1e7842 in main (argc=2, argv=0x7ffc81e6f598, env=0x7ffc81e6f5b0) at vtysh/vtysh_main.c:692

    The fix has been taken from the following link.
    https://sourceforge.net/p/net-snmp/patches/1348/
praveen-li pushed a commit to praveen-li/sonic-buildimage that referenced this pull request Feb 9, 2021
…ket-issue, origin/bgp-snmp-socket-issue) (sonic-net#3604)

Author: sudhanshukumar22 <sudhanshu.kumar@broadcom.com>
Date:   Tue Oct 15 02:31:20 2019 -0700

    lib: changes for making snmp socket non-blocking

    Description: The changes have been done to make the snmp socket
    non-blocking before calling snmp_read()
    FRR Pull request: FRRouting/frr#5134

    Problem Description/Summary :
    vtysh hangs on first try to enter after a reboot with BGP dynamic peers

    Expected Behavior :
    VTYSH should not hang.
    When we debug more into bgpd docker by doing gdb on its threads, we find the below thread of bgpd, which is causing the issue.
    Thread 1 (Thread 0x7f1e1ec46d40 (LWP 47)):

    0x00007f1e1d762593 in recvfrom () from /lib/x86_64-linux-gnu/libpthread.so.0
    0x00007f1e1aadd09b in netsnmp_tcpbase_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aad9617 in netsnmp_transport_recv () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab2c07 in _sess_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3a29 in snmp_sess_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3a7b in snmp_read2 () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1aab3acf in snmp_read () from /usr/lib/x86_64-linux-gnu/libnetsnmp.so.30
    0x00007f1e1b44d7ec in agentx_read (t=0x7fffa75f0080) at lib/agentx.c:63
    0x00007f1e1e7d6451 in thread_call (thread=0x7fffa75f0080) at lib/thread.c:1620
    0x00007f1e1e770699 in frr_run (master=0x559396ea60f0) at lib/libfrr.c:1011
    0x0000559395b4d953 in main (argc=5, argv=0x7fffa75f02b8) at bgpd/bgp_main.c:492

    (gdb) bt

    0x00007f830c89d210 in __read_nocancel () from /lib/x86_64-linux-gnu/libpthread.so.0
    0x000056450e1e8238 in vtysh_client_run (vclient=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:216
    0x000056450e1e8c6b in vtysh_client_run_all (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable, continue_on_err=0, callback=0x0, cbarg=0x0) at vtysh/vtysh.c:356
    0x000056450e1e8ddb in vtysh_client_execute (head_client=0x56450e4a8b40 <vtysh_client+24768>, line=0x56450e21add0 enable) at vtysh/vtysh.c:393
    0x000056450e1e9c82 in vtysh_execute_func (line=0x56450e21add0 enable, pager=0) at vtysh/vtysh.c:598
    0x000056450e1e9dee in vtysh_execute_no_pager (line=0x56450e21add0 enable) at vtysh/vtysh.c:619
    0x000056450e1f7d48 in vtysh_read_file (confp=0x56451000a9d0, top_cfg=1) at vtysh/vtysh_config.c:494
    0x000056450e1f7ef2 in vtysh_read_config (config_default_dir=0x56450e4edc20 <frr_config> /etc/frr/frr.conf, top_cfg=1) at vtysh/vtysh_config.c:522
    0x000056450e1e5de4 in vtysh_apply_top_level_config () at vtysh/vtysh_main.c:301
    0x000056450e1e7842 in main (argc=2, argv=0x7ffc81e6f598, env=0x7ffc81e6f5b0) at vtysh/vtysh_main.c:692

    The fix has been taken from the following link.
    https://sourceforge.net/p/net-snmp/patches/1348/

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>

RB=2119458
G=lnos-reviewers
R=pchaudhary,pmao,vapatil,samaity,zxu
A=
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants