@adoroszlai
Contributor

What changes were proposed in this pull request?

Save the timestamp of the last successful data scan for each container (in the .container file). After a datanode restart, resume data scanning with the container that was least recently scanned.

Newly closed containers have no timestamp and are thus scanned first during the next iteration. This will be changed in HDDS-1369, which proposes to scan newly closed containers immediately.
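
The ordering described above can be sketched roughly as follows. This is a minimal illustration, not the code in this patch: `ScanOrderSketch`, `ContainerInfo`, `SCAN_ORDER`, and `scanOrder` are hypothetical names, and breaking ties by container ID is an assumption. Containers without a timestamp sort first, followed by the rest from oldest to newest scan.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ScanOrderSketch {

  /** Hypothetical stand-in for container metadata; not the Ozone class. */
  static final class ContainerInfo {
    final long id;
    final Optional<Instant> lastDataScan;

    ContainerInfo(long id, Instant lastDataScan) {
      this.id = id;
      // A null timestamp means the container has never been scanned.
      this.lastDataScan = Optional.ofNullable(lastDataScan);
    }
  }

  /**
   * Never-scanned containers first (treated as Instant.MIN), then least
   * recently scanned first. Ties are broken by ID (an assumption).
   */
  static final Comparator<ContainerInfo> SCAN_ORDER =
      Comparator
          .comparing((ContainerInfo c) -> c.lastDataScan.orElse(Instant.MIN))
          .thenComparingLong(c -> c.id);

  /** Returns container IDs in the order they would be scanned. */
  static List<Long> scanOrder(List<ContainerInfo> containers) {
    List<ContainerInfo> sorted = new ArrayList<>(containers);
    sorted.sort(SCAN_ORDER);
    List<Long> ids = new ArrayList<>();
    for (ContainerInfo c : sorted) {
      ids.add(c.id);
    }
    return ids;
  }

  public static void main(String[] args) {
    // Timestamps taken from the test log in this description.
    List<ContainerInfo> containers = Arrays.asList(
        new ContainerInfo(1, Instant.parse("2019-10-08T19:38:57.443Z")),
        new ContainerInfo(2, Instant.parse("2019-10-08T19:38:57.402Z")),
        new ContainerInfo(3, Instant.parse("2019-10-08T19:39:02.402Z")),
        new ContainerInfo(4, Instant.parse("2019-10-08T19:39:02.430Z")),
        new ContainerInfo(5, null),
        new ContainerInfo(6, null));
    System.out.println(scanOrder(containers)); // prints [5, 6, 2, 1, 3, 4]
  }
}
```

With the timestamps from the test log, the sketch yields the same order the scanner followed after the restart: 5, 6, 2, 1, 3, 4.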

https://issues.apache.org/jira/browse/HDDS-1228

How was this patch tested?

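Created and closed containers, then restarted the datanode while scanning was in progress. Verified that after the restart, the scanner resumed from the container where it was interrupted (container 5 in the log below), continued with the never-scanned container 6, and then scanned the remaining containers in order of their last scan.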

datanode_1  | STARTUP_MSG: Starting HddsDatanodeService
datanode_1  | 2019-10-08 19:37:07 DEBUG ContainerDataScanner:148 - Scanning container 1, last scanned never
datanode_1  | 2019-10-08 19:37:07 DEBUG ContainerDataScanner:155 - Completed scan of container 1 at 2019-10-08T19:37:07.570Z
datanode_1  | 2019-10-08 19:37:07 INFO  ContainerDataScanner:122 - Completed an iteration of container data scrubber in 0 minutes. Number of iterations (since the data-node restart) : 1, Number of containers scanned in this iteration : 1, Number of unhealthy containers found in this iteration : 0
datanode_1  | 2019-10-08 19:37:17 DEBUG ContainerDataScanner:148 - Scanning container 2, last scanned never
datanode_1  | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:155 - Completed scan of container 2 at 2019-10-08T19:38:57.402Z
datanode_1  | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:148 - Scanning container 1, last scanned at 2019-10-08T19:37:07.570Z
datanode_1  | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:155 - Completed scan of container 1 at 2019-10-08T19:38:57.443Z
datanode_1  | 2019-10-08 19:38:57 INFO  ContainerDataScanner:122 - Completed an iteration of container data scrubber in 1 minutes. Number of iterations (since the data-node restart) : 2, Number of containers scanned in this iteration : 2, Number of unhealthy containers found in this iteration : 0
datanode_1  | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:148 - Scanning container 3, last scanned never
datanode_1  | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:155 - Completed scan of container 3 at 2019-10-08T19:39:02.402Z
datanode_1  | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:148 - Scanning container 4, last scanned never
datanode_1  | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:155 - Completed scan of container 4 at 2019-10-08T19:39:02.430Z
datanode_1  | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:148 - Scanning container 5, last scanned never
datanode_1  | 2019-10-08 19:39:11 ERROR HddsDatanodeService:75 - RECEIVED SIGNAL 15: SIGTERM
datanode_1  | STARTUP_MSG: Starting HddsDatanodeService
datanode_1  | 2019-10-08 19:39:22 DEBUG ContainerDataScanner:148 - Scanning container 5, last scanned never
datanode_1  | 2019-10-08 19:40:18 DEBUG ContainerDataScanner:155 - Completed scan of container 5 at 2019-10-08T19:40:18.268Z
datanode_1  | 2019-10-08 19:40:18 DEBUG ContainerDataScanner:148 - Scanning container 6, last scanned never
datanode_1  | 2019-10-08 19:40:31 DEBUG ContainerDataScanner:155 - Completed scan of container 6 at 2019-10-08T19:40:31.735Z
datanode_1  | 2019-10-08 19:40:31 DEBUG ContainerDataScanner:148 - Scanning container 2, last scanned at 2019-10-08T19:38:57.402Z
datanode_1  | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:155 - Completed scan of container 2 at 2019-10-08T19:42:12.128Z
datanode_1  | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:148 - Scanning container 1, last scanned at 2019-10-08T19:38:57.443Z
datanode_1  | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:155 - Completed scan of container 1 at 2019-10-08T19:42:12.140Z
datanode_1  | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:148 - Scanning container 3, last scanned at 2019-10-08T19:39:02.402Z
datanode_1  | 2019-10-08 19:42:16 DEBUG ContainerDataScanner:155 - Completed scan of container 3 at 2019-10-08T19:42:16.629Z
datanode_1  | 2019-10-08 19:42:16 DEBUG ContainerDataScanner:148 - Scanning container 4, last scanned at 2019-10-08T19:39:02.430Z
datanode_1  | 2019-10-08 19:42:16 DEBUG ContainerDataScanner:155 - Completed scan of container 4 at 2019-10-08T19:42:16.669Z
datanode_1  | 2019-10-08 19:42:16 INFO  ContainerDataScanner:122 - Completed an iteration of container data scrubber in 2 minutes. Number of iterations (since the data-node restart) : 1, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0

Also tested upgrade from Ozone 0.4.0. (Downgrade does not work, see HDDS-2268.)

@adoroszlai
Contributor Author

@arp7 thanks for the comments on the original PR. I've updated this one to use Optional and, while there, switched from Long to Instant to make it more type-safe. Unfortunately snakeyaml doesn't work well with either of those, so I kept a Long member in the background for serialization.
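
A rough sketch of that arrangement, for readers who have not opened the diff: the class and member names below (`ScanTimestampSketch`, `ContainerMeta`, `dataScanTimestamp`, `lastDataScanTime`, `updateDataScanTime`) are hypothetical and may not match the patch, and using epoch milliseconds as the stored unit is an assumption. The idea is that only a plain `Long` property is exposed for serialization, while callers work with `Optional<Instant>`.

```java
import java.time.Instant;
import java.util.Optional;

public class ScanTimestampSketch {

  /** Hypothetical metadata bean; not the class from this patch. */
  static final class ContainerMeta {
    // Plain Long kept for serialization; null means "never scanned".
    // Epoch milliseconds are an assumption made for this sketch.
    private Long dataScanTimestamp;

    // Bean-style accessors, intended for a YAML serializer.
    public Long getDataScanTimestamp() {
      return dataScanTimestamp;
    }

    public void setDataScanTimestamp(Long millis) {
      this.dataScanTimestamp = millis;
    }

    // Type-safe view for the rest of the code. Deliberately not named
    // "get..." so it is not mistaken for a bean property.
    public Optional<Instant> lastDataScanTime() {
      return Optional.ofNullable(dataScanTimestamp).map(Instant::ofEpochMilli);
    }

    public void updateDataScanTime(Instant time) {
      this.dataScanTimestamp = time == null ? null : time.toEpochMilli();
    }
  }

  public static void main(String[] args) {
    ContainerMeta meta = new ContainerMeta();
    System.out.println(meta.lastDataScanTime().isPresent()); // prints false

    meta.updateDataScanTime(Instant.parse("2019-10-08T19:37:07.570Z"));
    System.out.println(meta.getDataScanTimestamp());
    System.out.println(meta.lastDataScanTime().isPresent()); // prints true
  }
}
```

Keeping the conversion in two small methods means the rest of the code never handles a raw `Long` or a `null`, and the serialized form stays a simple scalar.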

Contributor

@arp7 arp7 left a comment

One comment on the comparator, the rest of the patch looks pretty good to me.

@arp7 arp7 merged commit 4d1e811 into apache:master Oct 24, 2019
@adoroszlai adoroszlai deleted the HDDS-1228 branch October 24, 2019 14:41