Eeprom programming working and failing on different computers #832

Open
arbite opened this issue Jul 3, 2024 · 6 comments

Comments

@arbite

arbite commented Jul 3, 2024

I have moved from an old, reliable Optiplex 745 desktop to a new Topcon small computer, and I have an application running well on both computers using SOEM. On the new one I do have to run a script on the Topcon router that sets NIC parameters (rx-usecs etc.) to get the application to run reliably.

I went to program the eeprom for some new slave hardware on the new Topcon computer. eepromtool reported that it programmed successfully. However the slave will not work with my application and slaveinfo fails and shows a seemingly corrupted eeprom.

I thought my new hardware was bork. In a fit of desperation I fired up the old computer and programmed the eeprom. It works fine! I was able to program all my boards.

I wondered if the new computer had a bad install of SOEM or something. I copied the eepromtool binary from the old working computer to the new computer. It will not program the eeprom!
This is very strange.

Can anyone offer any advice? I really want to retire this old computer....

Here is some output from eepromtool and slaveinfo:

Not working:
Here is programming the slave using the new computer. Note the total EEPROM write time.

newcomputer:~/SOEM/install/bin$ sudo ./eepromtool enp3s0 1 -w eeprom_file.bin
SOEM (Simple Open EtherCAT Master)
EEPROM tool
ec_init on enp3s0 succeeded.
1 slaves found.
Slave 1
Vendor ID : 00000A43
Product Code : 00001001
Revision Number : 00000000
Serial Number : 00000000
Busy....
Total EEPROM write time :1289ms
End, close socket
End program

power cycle hardware

Here is slave info on the new computer after programming with the new computer. It fails.
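At first glance it looks ok, but the output size is incorrect (2233 bits instead of 672), along with some of the other data: SM2 and FMMU0 report a length of 280 instead of 84, and the slave stays in state 1 instead of reaching state 4. On previous attempts the Name entry was garbled, with only some parts of the text "arbite_device" showing correctly.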

newcomputer:~/SOEM/install/bin$ sudo ./slaveinfo enp3s0
SOEM (Simple Open EtherCAT Master)
Slaveinfo
Starting slaveinfo
ec_init on enp3s0 succeeded.
1 slaves found and configured.
Calculated workcounter 3
Not all slaves reached safe operational state.
Slave 1 State= 1 StatusCode= 0 : No error

Slave:1
Name:arbite_device
Output size: 2233bits
Input size: 672bits
State: 1
Delay: 0[ns]
Has DC: 1
DCParentport:0
Activeports:1.0.0.0
Configured address: 1001
Man: 00000a43 ID: 00001001 Rev: 00000000
SM0 A:1000 L: 128 F:00010026 Type:1
SM1 A:1080 L: 128 F:00010022 Type:2
SM2 A:1100 L: 280 F:00010024 Type:3
SM3 A:1900 L: 84 F:00010020 Type:4
FMMU0 Ls:00000000 Ll: 280 Lsb:0 Leb:7 Ps:1100 Psb:0 Ty:02 Act:01
FMMU1 Ls:00000118 Ll: 84 Lsb:0 Leb:7 Ps:1900 Psb:0 Ty:01 Act:01
FMMUfunc 0:1 1:2 2:0 3:0
MBX length wr: 128 rd: 128 MBX protocols : 04
CoE details: 1f FoE details: 00 EoE details: 00 SoE details: 00
Ebus current: 0[mA]
only LRD/LWR:0
End slaveinfo, close socket
End program

Working:
Here is programming the slave using the old computer. Note the total EEPROM write time.

oldcomputer:~/SOEM/install/bin$ sudo ./eepromtool enp3s0 1 -w eeprom_file.bin
[sudo] password for law:
SOEM (Simple Open EtherCAT Master)
EEPROM tool
ec_init on enp3s0 succeeded.
1 slaves found.
Slave 1
Vendor ID : 00000A43
Product Code : 00001001
Revision Number : 00000000
Serial Number : 00000000
Busy....
Total EEPROM write time :2079ms
End, close socket
End program

power cycle hardware

Here is slave info on the new computer after the slave was programmed on the old computer. It works:

newcomputer:~/SOEM/install/bin$ sudo ./slaveinfo enp3s0
SOEM (Simple Open EtherCAT Master)
Slaveinfo
Starting slaveinfo
ec_init on enp3s0 succeeded.
1 slaves found and configured.
Calculated workcounter 3

Slave:1
Name:arbite_device
Output size: 672bits
Input size: 672bits
State: 4
Delay: 0[ns]
Has DC: 1
DCParentport:0
Activeports:1.0.0.0
Configured address: 1001
Man: 00000a43 ID: 00001001 Rev: 00000000
SM0 A:1000 L: 128 F:00010026 Type:1
SM1 A:1080 L: 128 F:00010022 Type:2
SM2 A:1100 L: 84 F:00010024 Type:3
SM3 A:1900 L: 84 F:00010020 Type:4
FMMU0 Ls:00000000 Ll: 84 Lsb:0 Leb:7 Ps:1100 Psb:0 Ty:02 Act:01
FMMU1 Ls:00000054 Ll: 84 Lsb:0 Leb:7 Ps:1900 Psb:0 Ty:01 Act:01
FMMUfunc 0:1 1:2 2:0 3:0
MBX length wr: 128 rd: 128 MBX protocols : 04
CoE details: 1f FoE details: 00 EoE details: 00 SoE details: 00
Ebus current: 0[mA]
only LRD/LWR:0
End slaveinfo, close socket
End program

@ArthurKetels
Contributor

A wireshark capture of the new computer doing an eeprom programming would help.

@arbite
Author

arbite commented Aug 23, 2024

Hi Arthur,
Apologies for the delay. I've attached a capture of the new computer attempting to program a new board.
The program attempt looks correct, but when I do a slaveinfo the eeprom image is incorrect.
Thanks again

program_attempt_wireshark.zip

@meyerrap

meyerrap commented Oct 14, 2024

I have a related issue.
I successfully flashed the eeprom with the tool, but when I reflashed it I noticed that the eeprom content got corrupted. I checked this by using the eeprom read function of the eeprom tool and comparing the original binary against the read-back content. Interestingly, only certain address blocks were affected; others were completely fine.
Using the SSC tool to write the eeprom succeeds every time.
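The read-back comparison described above can be automated so that the affected address blocks are listed explicitly. The following is a minimal sketch, assuming both the original image and the read-back image have already been loaded into memory buffers of equal length; `diff_range` and `find_diff_ranges` are hypothetical names for illustration and are not part of SOEM or eepromtool.

```c
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of bytes that differ between the two images. */
typedef struct {
    size_t start;   /* byte offset of the first differing byte */
    size_t length;  /* number of consecutive differing bytes */
} diff_range;

/*
 * Compare two EEPROM images byte by byte and record each contiguous
 * block of mismatches. At most max_out ranges are stored in out, but
 * the return value is the total number of mismatching blocks found,
 * so the caller can detect truncation.
 */
size_t find_diff_ranges(const uint8_t *expected, const uint8_t *actual,
                        size_t len, diff_range *out, size_t max_out)
{
    size_t count = 0;
    size_t i = 0;

    while (i < len) {
        if (expected[i] == actual[i]) {
            i++;
            continue;
        }
        /* Start of a mismatching block: extend it as far as it goes. */
        size_t start = i;
        while (i < len && expected[i] != actual[i]) {
            i++;
        }
        if (count < max_out) {
            out[count].start = start;
            out[count].length = i - start;
        }
        count++;
    }
    return count;
}
```

Printing `start` and `length` for each range shows whether the corrupted blocks line up with a fixed size, which would be consistent with a page-related write problem.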

Further, I compared wireshark captures of both the eepromtool and the SSC tool while writing, see:

eeprom_write_wireshark.zip

Interestingly, SOEM does not seem to use the address register 0x504 but since it did work initially, I assume it uses a different mechanism which I'm not familiar with.
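One possible explanation, offered as an assumption and not confirmed anywhere in this thread: the address may be sent together with the command in a single multi-byte write that starts at the EEPROM control register, so no separate datagram addressed to 0x504 appears in the capture. The sketch below only illustrates the offset arithmetic. The register base 0x0502 and the three 16-bit fields are assumptions modelled on a typical ESC register map; `eeprom_cmd` and the helper functions are hypothetical names, and the real layout should be checked against ethercatmain.c and the ESC datasheet.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed base address of the EEPROM control/status register. */
#define ESC_REG_EEPCTL 0x0502u

/* Hypothetical command block written in one datagram at ESC_REG_EEPCTL. */
typedef struct {
    uint16_t comm;  /* command word, lands at the base register */
    uint16_t addr;  /* EEPROM word address */
    uint16_t d2;    /* upper part of the address field, zero here */
} eeprom_cmd;

/* Register address at which the addr field ends up. */
unsigned eeprom_addr_register(void)
{
    return ESC_REG_EEPCTL + (unsigned)offsetof(eeprom_cmd, addr);
}

/* Last register byte covered by a single write of the whole block. */
unsigned eeprom_last_byte_written(void)
{
    return ESC_REG_EEPCTL + (unsigned)sizeof(eeprom_cmd) - 1u;
}
```

Under these assumptions the address field lands at 0x0504 even though the datagram itself is addressed to 0x0502, which would match a capture that never shows 0x504 as a datagram address.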

Another thing to note: just waiting for a few min (!) after writing the eeprom did increase the chance of write success significantly.

Any ideas what is happening here?

Commit: 83c6264

@ArthurKetels
Contributor

This issue has been a problem for some people. Because I was never able to replicate it, it was never solved. The last post has some interesting clues about what the cause could be. Most likely this is a timing issue around the write delay of the eeprom. Some eeproms need 4 to 10 ms worst case for a page erase. The page size can be as short as 4 bytes. The eeprom write routine has a delay built in after a write error, but this is only 5x200us = 1ms. It will retry 3 times maximum, so the total of 3 ms is still shorter than the worst-case condition.

So please test the following patch. If it works I can update the code.

In ethercatmain.c, function ecx_writeeepromAP, change

               if (estat & EC_ESTAT_NACK)
               {
                  nackcnt++;
                  osal_usleep(EC_LOCALDELAY * 5);
               }

to

               if (estat & EC_ESTAT_NACK)
               {
                  nackcnt++;
                  osal_usleep(5000);
               }
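The timing argument behind this patch can be checked with a short calculation. This is a simplified sketch that counts only the delay after each NACK and ignores the other waits in the write routine; the 200 us base delay comes from the "5x200us" figure above, and `nack_wait_budget_us` and `covers_worst_case` are hypothetical helper names, not SOEM functions.

```c
#include <stdint.h>

/* Base delay implied by "5x200us = 1ms" in the explanation above. */
#define LOCALDELAY_US 200u
/* Maximum number of NACKed attempts before the routine gives up. */
#define MAX_NACKS 3u

/* Total time spent sleeping if every attempt is NACKed. */
uint32_t nack_wait_budget_us(uint32_t delay_per_nack_us, uint32_t max_nacks)
{
    return delay_per_nack_us * max_nacks;
}

/* Non-zero when the sleep budget is at least the EEPROM's worst case. */
int covers_worst_case(uint32_t budget_us, uint32_t worst_case_us)
{
    return budget_us >= worst_case_us;
}
```

With the original delay, `nack_wait_budget_us(LOCALDELAY_US * 5, MAX_NACKS)` gives 3000 us, below even the 4 ms lower bound quoted for a page erase. With the patched value, `nack_wait_budget_us(5000, MAX_NACKS)` gives 15000 us, which exceeds the 10 ms upper bound.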

@arbite
Author

arbite commented Nov 18, 2024

Hi Arthur,
I've tested on one board and it seems to be working ok.
I can test on a few more in the coming weeks.

@meyerrap

This solved the issue also in my case, thanks for the hint; looking forward to a release :D


3 participants