[leo_storage] read repairer does not fix all objects #881

Closed
@windkit

Description

I launched a cluster with 5 storage nodes and put 4 million objects into it with 3 replicas each.

What I did

  1. Stop storage node (bin/leo_storage stop) on S1@192.168.100.37
  2. Remove AVS directory /ssd/avs
  3. Start storage node (bin/leo_storage start) on S1@192.168.100.37
  4. Read through all 4 million objects once

Observation

  1. Read performance is similar to that of a fully operational cluster
  2. Warnings from leo_storage_read_repairer:
[W]     S1@192.168.100.37       2017-10-16 10:03:35.638060 +0900        1508115815      leo_storage_read_repairer:compare/4     167     [{node,'S1@192.168.100.37'},{addr_id,262089314529519640105457477379090544578},{key,<<"test/1562482">>},{clock,1508113160551767},{cause,not_found}]
[W]     S1@192.168.100.37       2017-10-16 10:03:35.654676 +0900        1508115815      leo_storage_read_repairer:compare/4     167     [{node,'S1@192.168.100.37'},{addr_id,97979787013326817027861459251467535726},{key,<<"test/1562489">>},{clock,1508113160570634},{cause,not_found}]
[W]     S1@192.168.100.37       2017-10-16 10:03:35.666188 +0900        1508115815      leo_storage_read_repairer:compare/4     167     [{node,'S1@192.168.100.37'},{addr_id,145216544548550617579951455369514880573},{key,<<"test/1562494">>},{clock,1508113160584383},{cause,not_found}]
[W]     S1@192.168.100.37       2017-10-16 10:03:35.676827 +0900        1508115815      leo_storage_read_repairer:compare/4     167     [{node,'S1@192.168.100.37'},{addr_id,128647732487682273230731598297851900247},{key,<<"test/1562498">>},{clock,1508113160595215},{cause,not_found}]
[W]     S2@192.168.100.38       2017-10-16 10:03:36.325356 +0900        1508115816      leo_storage_read_repairer:compare/4     167     [{node,'S1@192.168.100.37'},{addr_id,72846239249748048471647548013304151801},{key,<<"test/1562438">>},{clock,1508113161235638},{cause,not_found}]
[W]     S2@192.168.100.38       2017-10-16 10:03:36.358660 +0900        1508115816      leo_storage_read_repairer:compare/4     167     [{node,'S1@192.168.100.37'},{addr_id,213401350361107074116716132585815701224},{key,<<"test/1562449">>},{clock,1508113161265654},{cause,not_found}]
[W]     S2@192.168.100.38       2017-10-16 10:03:36.369805 +0900        1508115816      leo_storage_read_repairer:compare/4     167     [{node,'S1@192.168.100.37'},{addr_id,146984899329602051518955140785947694688},{key,<<"test/1562453">>},{clock,1508113161277052},{cause,not_found}]
  3. Some background traffic to S1@192.168.100.37 to fix the objects
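
To see how many keys the read repairer flagged, and from which nodes, the warning lines above can be tallied with a small script. This is only a sketch: the regular expression is written against the exact line layout quoted in this issue, and the function names (parse_warning, summarize) are made up for illustration and are not part of LeoFS.

```python
import re
from collections import Counter

# Pattern derived from the warning lines quoted in this issue; other
# LeoFS versions may log a different layout (assumption).
WARNING_RE = re.compile(
    r"^\[W\]\s+(?P<reporter>\S+)\s+.*?"
    r"leo_storage_read_repairer:compare/4\s+\d+\s+"
    r"\[\{node,'(?P<target>[^']+)'\},.*?"
    r'\{key,<<"(?P<key>[^"]*)">>\},.*?'
    r"\{cause,(?P<cause>\w+)\}\]"
)


def parse_warning(line):
    """Return reporter, target, key and cause for one warning line, or None."""
    match = WARNING_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count warnings per (reporter, target, cause) and collect the flagged keys."""
    counts = Counter()
    keys = set()
    for line in lines:
        record = parse_warning(line)
        if record is None:
            continue  # not a read-repairer warning
        counts[(record["reporter"], record["target"], record["cause"])] += 1
        keys.add(record["key"])
    return counts, keys
```

Comparing the set of flagged keys against the full key list would show which objects never produced a warning, which is where the unrepaired objects are expected to be.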

[Image: grafana screenshot]

Issue

S1@192.168.100.37 is not fully repaired. The objects still missing are those for which S1@192.168.100.37 is the primary node.

leofs-adm whereis test/1562498
-------+------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
 del?  |          node          |             ring address             |    size    |   checksum   |  has children  |  total chunks  |     clock      |             when
-------+------------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+----------------------------
       | S1@192.168.100.37      |                                      |            |              |                |                |                |
       | S2@192.168.100.38      | 60c8a6eb471f0fc585846b32a4363157     |        20K |   41dd6d2dc1 | false          |              0 | 55b9ef537e70f  | 2017-10-16 09:19:21 +0900
       | S5@192.168.100.41      | 60c8a6eb471f0fc585846b32a4363157     |        20K |   41dd6d2dc1 | false          |              0 | 55b9ef537e70f  | 2017-10-16 09:19:21 +0900
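
Checking many keys by eye is tedious, so the whereis output above could be parsed to list the nodes that hold no replica. The sketch below assumes the column layout shown in this issue and treats an empty "ring address" cell as "replica missing on that node"; parse_whereis and missing_replicas are hypothetical helper names, not LeoFS commands.

```python
def parse_whereis(output):
    """Parse the text printed by `leofs-adm whereis <key>` into row dicts.

    Assumes the nine-column layout shown in this issue.
    """
    rows = []
    for line in output.splitlines():
        if "|" not in line:
            continue  # command echo or a +---- separator line
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) < 9 or cells[1] == "node":
            continue  # malformed row or the header row
        rows.append({
            "node": cells[1],
            "ring_address": cells[2],
            "size": cells[3],
            "checksum": cells[4],
            "clock": cells[7],
            "when": cells[8],
        })
    return rows


def missing_replicas(rows):
    """Return the nodes whose row has no ring address (assumed: no replica)."""
    return [row["node"] for row in rows if not row["ring_address"]]
```

Run over the output above, this would report S1@192.168.100.37 as the only node without a copy of test/1562498.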
