Commit c36e40e

Revert "HBASE-25663 Make graceful_stop localhostname compare match even if fqdn (#3048)"
This reverts commit f4e1ab7.
1 parent 5457554 commit c36e40e

2 files changed: +53 -21 lines changed

bin/graceful_stop.sh

Lines changed: 42 additions & 10 deletions
@@ -18,7 +18,8 @@
  # * limitations under the License.
  # */

- # Move regions off a server then stop it. Optionally restart and reload.
+ # Move regions off a server then stop it. Optionally restart and reload.
+ # Turn off the balancer before running this script.
  function usage {
  echo "Usage: graceful_stop.sh [--config <conf-dir>] [-e] [--restart [--reload]] [--thrift] \
  [--rest] [-n |--noack] [--maxthreads <number of threads>] [--movetimeout <timeout in seconds>] \
@@ -32,7 +33,7 @@ moving regions"
  echo " maxthreads xx Limit the number of threads used by the region mover. Default value is 1."
  echo " movetimeout xx Timeout for moving regions. If regions are not moved by the timeout value,\
  exit with error. Default value is INT_MAX."
- echo " hostname Hostname to stop; match what HBase uses; pass 'localhost' if local to avoid ssh"
+ echo " hostname Hostname of server we are to stop"
  echo " e|failfast Set -e so exit immediately if any command exits with non-zero status"
  echo " nob|nobalancer Do not manage balancer states. This is only used as optimization in \
  rolling_restart.sh to avoid multiple calls to hbase shell"
@@ -101,6 +102,13 @@ fi
  hostname=$1
  filename="/tmp/$hostname"

+ local=
+ localhostname=`/bin/hostname`
+
+ if [ "$localhostname" == "$hostname" ]; then
+ local=true
+ fi
+
  if [ "$nob" == "true" ]; then
  log "[ $0 ] skipping disabling balancer -nob argument is used"
  HBASE_BALANCER_STATE=false
@@ -111,7 +119,7 @@ else
  fi

  unload_args="--filename $filename --maxthreads $maxthreads $noack --operation unload \
- --timeout $movetimeout --regionserverhost $hostname"
+ --timeout $movetimeout --regionserverhost $hostname"

  if [ "$designatedfile" != "" ]; then
  unload_args="$unload_args --designatedfile $designatedfile"
@@ -131,25 +139,49 @@ hosts="/tmp/$(basename $0).$$.tmp"
  echo $hostname >> $hosts
  if [ "$thrift" != "" ]; then
  log "Stopping thrift server on $hostname"
- "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} stop thrift
+ if [ "$local" == true ]; then
+ "$bin"/hbase-daemon.sh --config ${HBASE_CONF_DIR} stop thrift
+ else
+ "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} stop thrift
+ fi
  fi
  if [ "$rest" != "" ]; then
  log "Stopping rest server on $hostname"
- "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} stop rest
+ if [ "$local" == true ]; then
+ "$bin"/hbase-daemon.sh --config ${HBASE_CONF_DIR} stop rest
+ else
+ "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} stop rest
+ fi
  fi
  log "Stopping regionserver on $hostname"
- "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} stop regionserver
+ if [ "$local" == true ]; then
+ "$bin"/hbase-daemon.sh --config ${HBASE_CONF_DIR} stop regionserver
+ else
+ "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} stop regionserver
+ fi
  if [ "$restart" != "" ]; then
  log "Restarting regionserver on $hostname"
- "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} start regionserver
+ if [ "$local" == true ]; then
+ "$bin"/hbase-daemon.sh --config ${HBASE_CONF_DIR} start regionserver
+ else
+ "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} start regionserver
+ fi
  if [ "$thrift" != "" ]; then
  log "Restarting thrift server on $hostname"
  # -b 0.0.0.0 says listen on all interfaces rather than just default.
- "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} start thrift -b 0.0.0.0
+ if [ "$local" == true ]; then
+ "$bin"/hbase-daemon.sh --config ${HBASE_CONF_DIR} start thrift -b 0.0.0.0
+ else
+ "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} start thrift -b 0.0.0.0
+ fi
  fi
  if [ "$rest" != "" ]; then
  log "Restarting rest server on $hostname"
- "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} start rest
+ if [ "$local" == true ]; then
+ "$bin"/hbase-daemon.sh --config ${HBASE_CONF_DIR} start rest
+ else
+ "$bin"/hbase-daemons.sh --config ${HBASE_CONF_DIR} --hosts ${hosts} start rest
+ fi
  fi
  if [ "$reload" != "" ]; then
  log "Reloading $hostname region(s)"
@@ -169,4 +201,4 @@ else
  fi

  # Cleanup tmp files.
- trap "rm -f /tmp/$(basename $0).*.tmp &> /dev/null" EXIT
+ trap "rm -f "/tmp/$(basename $0).*.tmp" &> /dev/null" EXIT
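
Note on the restored behavior: the diff above reintroduces an exact string comparison between the output of `/bin/hostname` and the hostname argument, and uses the result to choose between `hbase-daemon.sh` (local) and `hbase-daemons.sh` (over the hosts file). The following is only a minimal sketch of that dispatch, not code from the commit. The helper names `is_local` and `daemon_command` are hypothetical, the paths are placeholders, the comparison uses POSIX `=` rather than the script's `==`, and the sketch prints the command it would choose instead of running it.

```shell
#!/usr/bin/env bash
# Placeholder values; the variable names mirror those in graceful_stop.sh.
bin="./bin"
HBASE_CONF_DIR="/etc/hbase/conf"
hosts="/tmp/hosts.tmp"

# is_local TARGET LOCALNAME (hypothetical helper)
# Exact string match only, as in the restored comparison.
is_local() {
  [ "$1" = "$2" ]
}

# daemon_command TARGET LOCALNAME ACTION SERVICE [EXTRA...] (hypothetical helper)
# Prints the command line that would be chosen; it does not execute anything.
daemon_command() {
  local target=$1 localname=$2
  shift 2
  if is_local "$target" "$localname"; then
    echo "$bin/hbase-daemon.sh --config $HBASE_CONF_DIR $*"
  else
    echo "$bin/hbase-daemons.sh --config $HBASE_CONF_DIR --hosts $hosts $*"
  fi
}

# Short name equals the local name: the local daemon script is chosen.
daemon_command rs1 rs1 stop regionserver
# An FQDN does not equal the short local name, so the remote path is chosen.
daemon_command rs1.example.com rs1 stop regionserver
```

In the second call, a fully-qualified name passed on a machine whose `hostname` output is the short form takes the remote branch, which is the case the reverted HBASE-25663 change was titled as addressing.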

src/main/asciidoc/_chapters/ops_mgt.adoc

Lines changed: 11 additions & 11 deletions
@@ -1364,9 +1364,10 @@ Copy the script if you need to make use of it in a version of hbase previous to

  A downside to the above stop of a RegionServer is that regions could be offline for a good period of time.
  Regions are closed in order.
- If many regions on the server, the first region to close may not be back online until all regions close and
- after the master notices the RegionServer's znode gone. A node can be asked to gradually shed its load and
- then shutdown itself using the _graceful_stop.sh_ script. Here is its usage:
+ If many regions on the server, the first region to close may not be back online until all regions close and after the master notices the RegionServer's znode gone.
+ In Apache HBase 0.90.2, we added facility for having a node gradually shed its load and then shutdown itself down.
+ Apache HBase 0.90.2 added the _graceful_stop.sh_ script.
+ Here is its usage:

  ----
  $ ./bin/graceful_stop.sh
@@ -1392,17 +1393,16 @@ To decommission a loaded RegionServer, run the following: +$
  [NOTE]
  ====
  The `HOSTNAME` passed to _graceful_stop.sh_ must match the hostname that hbase is using to identify RegionServers.
- HBase uses fully-qualified domain names usually. Check the list of RegionServers in the master UI for how HBase
- is referring to servers. Whatever HBase is using, this is what you should pass the _graceful_stop.sh_ decommission script.
- If you pass IPs, the script is not yet smart enough to make a hostname (or FQDN) of it and so it will fail when it checks
- if server is currently running; the graceful unloading of regions will not run.
+ Check the list of RegionServers in the master UI for how HBase is referring to servers.
+ It's usually hostname but can also be FQDN.
+ Whatever HBase is using, this is what you should pass the _graceful_stop.sh_ decommission script.
+ If you pass IPs, the script is not yet smart enough to make a hostname (or FQDN) of it and so it will fail when it checks if server is currently running; the graceful unloading of regions will not run.
  ====

  The _graceful_stop.sh_ script will move the regions off the decommissioned RegionServer one at a time to minimize region churn.
- It will verify the region deployed in the new location before it will moves the next region and so on until the decommissioned
- server is carrying zero regions. At this point, the _graceful_stop.sh_ tells the RegionServer `stop`.
- The master will at this point notice the RegionServer gone but all regions will have already been redeployed and because the
- RegionServer went down cleanly, there will be no WAL logs to split.
+ It will verify the region deployed in the new location before it will moves the next region and so on until the decommissioned server is carrying zero regions.
+ At this point, the _graceful_stop.sh_ tells the RegionServer `stop`.
+ The master will at this point notice the RegionServer gone but all regions will have already been redeployed and because the RegionServer went down cleanly, there will be no WAL logs to split.

  [[lb]]
  .Load Balancer
