Port state remain down in CLI, port is up in HW #6884

Open
wjaco opened this issue Feb 25, 2021 · 4 comments

wjaco commented Feb 25, 2021

This happens on 202012.

The port is up in HW, and syncd sends the oper status update. I can see that orchagent receives it, but it is dropped in the code below.

PortsOrch::doTask returns here when the interface oper status up notification is received:

void PortsOrch::doTask(NotificationConsumer &consumer)
{
    SWSS_LOG_ENTER();

    /* Wait for all ports to be initialized */
    if (!allPortsReady())
    {
        SWSS_LOG_DEBUG("All ports not ready\n");  // Trace added for debugging; doTask returns from here.
        return;
    }

Logs showing the same:

Feb 25 02:02:09.169075 sonic NOTICE swss#orchagent: :- doPortTask: Set port Ethernet0 admin status to up
...
...
Feb 25 02:02:15.604928 sonic DEBUG swss#orchagent: :- processReply: got message: ["port_state_change","[{\"port_id\":\"oid:0x1000000000002\",\"port_state\":\"SAI_PORT_OPER_STATUS_UP\"}]"]
Feb 25 02:02:15.604959 sonic DEBUG swss#orchagent: :< processReply: exit
Feb 25 02:02:15.604959 sonic DEBUG swss#orchagent: :< readData: exit
Feb 25 02:02:15.605044 sonic DEBUG swss#orchagent: :< select: exit
Feb 25 02:02:15.605044 sonic DEBUG swss#orchagent: :> doTask: enter
Feb 25 02:02:15.605070 sonic DEBUG swss#orchagent: :- doTask: All ports not ready        >>>> 
Feb 25 02:02:15.605070 sonic DEBUG swss#orchagent: :< doTask: exit

This is because the PortInitDone notification arrives late (~45 ms after the port state change):

Feb 25 02:02:15.648461 sonic INFO swss#orchagent: :- doPortTask: Get PortInitDone notification from portsyncd.

I have only one port in this system. I noticed that if I bring up one more port, both ports come up.
This looks like a timing/sync issue.

Please let me know which logs are needed to root-cause this further.
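
To make the suspected ordering problem concrete, here is a minimal toy model of it. This is only a sketch, not the real sonic-swss code: `ToyPortsOrch`, `onPortStateChange`, `onPortInitDone`, and `replayLoggedOrder` are hypothetical names, and the model assumes that nothing re-triggers processing of the oper status update after the early return.

```cpp
#include <cassert>

// Toy stand-in for PortsOrch (hypothetical, not the real sonic-swss class).
enum class OperStatus { Down, Up };

struct ToyPortsOrch {
    bool portInitDone = false;                // set when PortInitDone arrives
    OperStatus cliStatus = OperStatus::Down;  // what the CLI would report
    int skipped = 0;                          // updates not applied due to the early return

    bool allPortsReady() const { return portInitDone; }

    // Mirrors the early return quoted above: if ports are not ready yet,
    // the oper status update is not applied.
    void onPortStateChange(OperStatus hwStatus) {
        if (!allPortsReady()) {
            ++skipped;
            return;
        }
        cliStatus = hwStatus;
    }

    void onPortInitDone() { portInitDone = true; }
};

// Replays the order seen in the logs: oper-up first, PortInitDone afterwards.
inline ToyPortsOrch replayLoggedOrder() {
    ToyPortsOrch orch;
    assert(!orch.allPortsReady());           // precondition: init not done yet
    orch.onPortStateChange(OperStatus::Up);  // port_state_change at 02:02:15.604928
    orch.onPortInitDone();                   // PortInitDone at 02:02:15.648461
    return orch;
}
```

In this model the port stays Down after `replayLoggedOrder()` even though the hardware reported Up, which matches what I see in the CLI.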

config:

    "PORT": {
        "Ethernet0": {
            "lanes": "1296,1297,1298,1299",
            "alias": "Ethernet0",
            "index": "0",
            "speed": "100000",
            "admin_status": "up",
            "mtu": "9100"
        },
# name lanes alias index speed
Ethernet0      1296,1297,1298,1299       Ethernet0   0    100000

Thanks and regards,
Wilson

@anshuv-mfst

Need more info, @wjaco:

  • Please provide a full system dump.
  • Please confirm whether any crash was seen.

@anshuv-mfst

@volodymyrsamotiy - FYI

@wjaco
Author

wjaco commented Mar 4, 2021

No crash
root@sonic:/home/cisco# cd /var/core/
root@sonic:/var/core# ls -ltr
total 0
root@sonic:/var/core#

show tech attached.
sonic_dump_sonic_20210304_032015.tar.gz

Ethernet0 is down. Ethernet1 is set to admin down (in config_db). If I set both to admin up, both come up. So it appears to be a timing issue, as I explained above.

root@sonic:/home/cisco# show int stat
  Interface                Lanes    Speed    MTU    FEC       Alias    Vlan    Oper    Admin             Type    Asym PFC
-----------  -------------------  -------  -----  -----  ----------  ------  ------  -------  ---------------  ----------
  Ethernet0  1296,1297,1298,1299     100G   9100    N/A   Ethernet0  routed    down       up  QSFP28 or later         N/A
  Ethernet1  1300,1301,1302,1303     100G   9100    N/A   Ethernet1  routed    down     down  QSFP28 or later         N/A
  Ethernet2  1288,1289,1290,1291     100G   9100    N/A   Ethernet2  routed    down       up              N/A         N/A

@LuiSzee
Contributor

LuiSzee commented Aug 17, 2021

@anshuv-mfst
I am hitting the same issue on a Centec arm64 board, on the 201911/202012 branches.
Is there any mechanism in SAI to check allPortsReady?
Maybe PortsOrch should call refreshPortStatus after allPortsReady.
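
A rough sketch of that idea, as a toy model rather than a patch against sonic-swss: `ToyPortsOrchWithRefresh`, `queryHwStatus`, `onPortStateChange`, and `onPortInitDone` are hypothetical names, and `queryHwStatus` only stands in for whatever call would read the current oper status from SAI.

```cpp
#include <cassert>
#include <functional>

enum class OperStatus { Down, Up };

// Toy model (hypothetical): re-read the oper status once ports become ready.
struct ToyPortsOrchWithRefresh {
    std::function<OperStatus()> queryHwStatus;  // assumed stand-in for a SAI read
    bool portInitDone = false;
    OperStatus cliStatus = OperStatus::Down;

    bool allPortsReady() const { return portInitDone; }

    void onPortStateChange(OperStatus hwStatus) {
        if (!allPortsReady()) return;  // same early return as in the report
        cliStatus = hwStatus;
    }

    // Pull the current state from hardware instead of waiting for a notification.
    void refreshPortStatus() { cliStatus = queryHwStatus(); }

    void onPortInitDone() {
        portInitDone = true;
        refreshPortStatus();  // picks up any update that arrived too early
    }
};

// Same ordering as in the logs: oper-up arrives before PortInitDone.
inline OperStatus replayWithRefresh() {
    ToyPortsOrchWithRefresh orch;
    orch.queryHwStatus = [] { return OperStatus::Up; };  // hardware says Up
    orch.onPortStateChange(OperStatus::Up);              // arrives too early
    assert(orch.cliStatus == OperStatus::Down);          // not applied yet
    orch.onPortInitDone();                               // refresh happens here
    return orch.cliStatus;
}
```

With the refresh in place, the toy model ends up reporting Up even when the notification arrives before PortInitDone. Whether this fits the real orchagent flow would need to be checked by the maintainers.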
