
mlx5: got completion with error #5

@jtuni

I'm unable to successfully transfer a file with hdRDMAcp. On the server side I am running the following:
./hdrdmacp -s -n 1 -m 8GB
and getting the following output:


=============================================
Found 1 devices
---------------------------------------------
   device 0 : mlx5_0 : uverbs0 : IB : InfiniBand channel adapter : Num. ports=1 : port num=1 : lid=30
=============================================

Device mlx5_0 opened. num_comp_vectors=60
Port attributes:
           state: 4
         max_mtu: 5
      active_mtu: 5
  port_cap_flags: 575793224
      max_msg_sz: 1073741824
    active_width: 2
    active_speed: 32
      phys_state: 5
      link_layer: 1
Created 1 buffers of 8000MB (8GB total)
Listening for connections on port ... 10470
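
For anyone reading the numeric port attributes above, here is a minimal Python sketch that maps them to readable names. The lookup tables are an assumption based on the enums in `<infiniband/verbs.h>` (only common values are listed), and `describe_port` is a hypothetical helper, not part of hdRDMAcp.

```python
# Lookup tables for the raw ibv_port_attr integers printed by the tool.
# Assumption: values follow the enums in <infiniband/verbs.h>.
PORT_STATE = {1: "DOWN", 2: "INIT", 3: "ARMED", 4: "ACTIVE"}
PHYS_STATE = {2: "Polling", 3: "Disabled", 5: "LinkUp"}
LINK_LAYER = {0: "unspecified", 1: "InfiniBand", 2: "Ethernet"}
MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}
WIDTH_LANES = {1: 1, 2: 4, 4: 8, 8: 12}
# Nominal per-lane signalling rate in Gbps (SDR, DDR, QDR, FDR, EDR, HDR).
SPEED_GBPS = {1: 2.5, 2: 5.0, 4: 10.0, 16: 14.0, 32: 25.0, 64: 50.0}


def describe_port(attrs):
    """Translate raw port attribute integers into readable values."""
    lanes = WIDTH_LANES.get(attrs["active_width"])
    per_lane = SPEED_GBPS.get(attrs["active_speed"])
    return {
        "state": PORT_STATE.get(attrs["state"], "unknown"),
        "phys_state": PHYS_STATE.get(attrs["phys_state"], "unknown"),
        "link_layer": LINK_LAYER.get(attrs["link_layer"], "unknown"),
        "active_mtu_bytes": MTU_BYTES.get(attrs["active_mtu"]),
        "nominal_gbps": lanes * per_lane if lanes and per_lane else None,
    }


# Values copied from the output in this report.
reported = {
    "state": 4,
    "active_mtu": 5,
    "active_width": 2,
    "active_speed": 32,
    "phys_state": 5,
    "link_layer": 1,
}
for name, value in describe_port(reported).items():
    print(f"{name}: {value}")
```

Under those assumptions, the reported values describe an ACTIVE InfiniBand port with the physical link up, a 4096-byte MTU, and a 4x link at 25 Gbps per lane.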

So everything looks good. On the client side, I am trying to send a large file by running:
/hdrdmacp/hdrdmacp /home/tuni/example.file 10.2.1.85:/home/tuni/file_1g.file -n 1 -m 8GB
And I get an error, regardless of the file size and the combinations of buffers and buffer sizes I use:


=============================================
Found 1 devices
---------------------------------------------
   device 0 : mlx5_0 : uverbs0 : IB : InfiniBand channel adapter : Num. ports=1 : port num=1 : lid=21
=============================================

Device mlx5_0 opened. num_comp_vectors=60
Port attributes:
           state: 4
         max_mtu: 5
      active_mtu: 5
  port_cap_flags: 575793224
      max_msg_sz: 1073741824
    active_width: 2
    active_speed: 32
      phys_state: 5
      link_layer: 1
Created 1 buffers of 8000MB (8GB total)
IP address: 10.2.1.85 (10.2.1.85)
Connected to 10.2.1.85:10470
Sending file: /home/tuni/example.file-> (10.2.1.85:)/home/tuni/file_1g.file   (3.15027 GB)
  queued 3150MB (3150/3150 MB -- 100%  - 24.7215 Gbps)   mlx5: pirineusknl1: got completion with error:
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 42006802 0a0002e3 0000eed2

  Transferred 3.15027 GB in 1.02321 sec  (24.6306 Gbps)
  I/O rate reading from file: 1.01938 sec  (24.7231 Gbps)
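
The 64-byte hex dump after "got completion with error" can be picked apart to see which error the adapter reported. The sketch below is a rough decoder, assuming the dump is the raw error CQE printed as big-endian 32-bit words and that the field offsets match `struct mlx5_err_cqe` in rdma-core's mlx5 provider. The offsets, the syndrome table, and the helper name `decode_err_cqe` are assumptions and should be checked against the installed driver headers.

```python
# Assumed syndrome codes (MLX5_CQE_SYNDROME_* in the rdma-core mlx5 provider).
SYNDROMES = {
    0x01: "local length error",
    0x02: "local QP operation error",
    0x04: "local protection error",
    0x05: "work request flushed",
    0x11: "local access error",
    0x12: "remote invalid request",
    0x13: "remote access error",
    0x14: "remote operation error",
    0x15: "transport retry counter exceeded",
    0x16: "RNR retry counter exceeded",
}


def decode_err_cqe(dump_text):
    """Decode the tail of a 64-byte mlx5 error CQE given as hex words."""
    raw = bytes.fromhex("".join(dump_text.split()))
    if len(raw) != 64:
        raise ValueError(f"expected 64 bytes, got {len(raw)}")
    # Assumed layout: vendor syndrome at byte 54, syndrome at byte 55,
    # WQE opcode + QP number in bytes 56..59, WQE counter in bytes 60..61,
    # and the CQE opcode in the high nibble of the last byte.
    opcode_qpn = int.from_bytes(raw[56:60], "big")
    return {
        "vendor_syndrome": raw[54],
        "syndrome": raw[55],
        "syndrome_text": SYNDROMES.get(raw[55], "unknown"),
        "wqe_opcode": opcode_qpn >> 24,
        "qpn": opcode_qpn & 0xFFFFFF,
        "wqe_counter": int.from_bytes(raw[60:62], "big"),
        "cqe_opcode": raw[63] >> 4,
    }


# Dump copied from the client output in this report.
dump = """
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000000 42006802 0a0002e3 0000eed2
"""
for name, value in decode_err_cqe(dump).items():
    print(f"{name}: {value}")
```

If the assumed layout is right, this dump carries syndrome 0x02 (local QP operation error) with vendor syndrome 0x68, on QP number 0x2e3, for the work request with counter 0. That only narrows down where to look; it does not by itself identify the cause.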

Even though the output states the file was transferred at 24.6 Gbps in 1.02 seconds, the file is never actually transferred. Both client and server use the same OFED version and are essentially identical in every respect, so I don't know how to fix this. Any ideas?
