Description
After running the following deployment script,
curl https://raw.githubusercontent.com/kubeflow/kubeflow/v0.2.2/scripts/deploy.sh | bash
Ambassador failed to start on one node. Logs from the failing pod:
kubectl logs --namespace kubeflow ambassador-849fb9c8c5-kgrkb ambassador
./entrypoint.sh: set: line 65: can't access tty; job control turned off
2018-07-31 05:46:50 kubewatch 0.30.1 INFO: generating config with gencount 1 (4 changes)
2018-07-31 05:46:56 kubewatch 0.30.1 WARNING: Scout: could not post report: HTTPSConnectionPool(host='kubernaut.io', port=443): Max retries exceeded with url: /scout (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7383625940>: Failed to establish a new connection: [Errno -3] Try again',))
2018-07-31 05:46:56 kubewatch 0.30.1 INFO: Scout reports {"latest_version": "0.30.1", "exception": "could not post report: HTTPSConnectionPool(host='kubernaut.io', port=443): Max retries exceeded with url: /scout (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7383625940>: Failed to establish a new connection: [Errno -3] Try again',))", "cached": false, "timestamp": 1533016011.063859}
[2018-07-31 05:46:56.133][10][info][upstream] source/common/upstream/cluster_manager_impl.cc:132] cm init: all clusters initialized
[2018-07-31 05:46:56.133][10][info][config] source/server/configuration_impl.cc:55] loading 1 listener(s)
[2018-07-31 05:46:56.150][10][info][config] source/server/configuration_impl.cc:95] loading tracing configuration
[2018-07-31 05:46:56.150][10][info][config] source/server/configuration_impl.cc:122] loading stats sink configuration
AMBASSADOR: starting diagd
AMBASSADOR: starting Envoy
AMBASSADOR: waiting
PIDS: 11:diagd 12:envoy 13:kubewatch
[2018-07-31 05:46:56.556][14][info][main] source/server/server.cc:184] initializing epoch 0 (hot restart version=9.200.16384.127.options=capacity=16384, num_slots=8209 hash=228984379728933363)
[2018-07-31 05:46:57.574][14][info][config] source/server/configuration_impl.cc:55] loading 1 listener(s)
[2018-07-31 05:46:57.767][14][info][config] source/server/configuration_impl.cc:95] loading tracing configuration
[2018-07-31 05:46:57.767][14][info][config] source/server/configuration_impl.cc:122] loading stats sink configuration
[2018-07-31 05:46:57.769][14][info][main] source/server/server.cc:359] starting main dispatch loop
2018-07-31 05:47:04 diagd 0.30.1 WARNING: Scout: could not post report: HTTPSConnectionPool(host='kubernaut.io', port=443): Max retries exceeded with url: /scout (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0bee6d95f8>: Failed to establish a new connection: [Errno -3] Try again',))
2018-07-31 05:47:04 diagd 0.30.1 INFO: Scout reports {"latest_version": "0.30.1", "exception": "could not post report: HTTPSConnectionPool(host='kubernaut.io', port=443): Max retries exceeded with url: /scout (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0bee6d95f8>: Failed to establish a new connection: [Errno -3] Try again',))", "cached": false, "timestamp": 1533016019.808133}
2018-07-31 05:47:14 kubewatch 0.30.1 INFO: generating config with gencount 2 (4 changes)
2018-07-31 05:47:19 kubewatch 0.30.1 WARNING: Scout: could not post report: HTTPSConnectionPool(host='kubernaut.io', port=443): Max retries exceeded with url: /scout (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f6fbb8468d0>: Failed to establish a new connection: [Errno -3] Try again',))
2018-07-31 05:47:19 kubewatch 0.30.1 INFO: Scout reports {"latest_version": "0.30.1", "exception": "could not post report: HTTPSConnectionPool(host='kubernaut.io', port=443): Max retries exceeded with url: /scout (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f6fbb8468d0>: Failed to establish a new connection: [Errno -3] Try again',))", "cached": false, "timestamp": 1533016034.702365}
[2018-07-31 05:47:19.770][26][info][upstream] source/common/upstream/cluster_manager_impl.cc:132] cm init: all clusters initialized
[2018-07-31 05:47:19.771][26][info][config] source/server/configuration_impl.cc:55] loading 1 listener(s)
[2018-07-31 05:47:19.788][26][info][config] source/server/configuration_impl.cc:95] loading tracing configuration
[2018-07-31 05:47:19.788][26][info][config] source/server/configuration_impl.cc:122] loading stats sink configuration
unable to initialize hot restart: previous envoy process is still initializing
starting hot-restarter with target: /application/start-envoy.sh
forking and execing new child process at epoch 0
forked new child process with PID=14
got SIGHUP
forking and execing new child process at epoch 1
forked new child process with PID=27
got SIGCHLD
PID=27 exited with code=1
Due to abnormal exit, force killing all child processes and exiting
force killing PID=14
exiting due to lack of child processes
AMBASSADOR: envoy exited with status 1
Here's the envoy.json we were trying to run with:
{
"listeners": [
{
"address": "tcp://0.0.0.0:80",
"filters": [
{
"type": "read",
"name": "http_connection_manager",
"config": {"codec_type": "auto",
"stat_prefix": "ingress_http",
"access_log": [
{
"format": "ACCESS [%START_TIME%] \"%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%\" %RESPONSE_CODE% %RESPONSE_FLAGS% %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% \"%REQ(X-FORWARDED-FOR)%\" \"%REQ(USER-AGENT)%\" \"%REQ(X-REQUEST-ID)%\" \"%REQ(:AUTHORITY)%\" \"%UPSTREAM_HOST%\"\n",
"path": "/dev/fd/1"
}
],
"route_config": {
"virtual_hosts": [
{
"name": "backend",
"domains": ["*"],"routes": [
{
"timeout_ms": 3000,"prefix": "/ambassador/v0/check_ready","prefix_rewrite": "/ambassador/v0/check_ready",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_127_0_0_1_8877", "weight": 100.0 }
]
}
}
,
{
"timeout_ms": 3000,"prefix": "/ambassador/v0/check_alive","prefix_rewrite": "/ambassador/v0/check_alive",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_127_0_0_1_8877", "weight": 100.0 }
]
}
}
,
{
"timeout_ms": 3000,"prefix": "/ambassador/v0/","prefix_rewrite": "/ambassador/v0/",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_127_0_0_1_8877", "weight": 100.0 }
]
}
}
,
{
"timeout_ms": 3000,"prefix": "/tfjobs/","prefix_rewrite": "/tfjobs/",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_tf_job_dashboard_default", "weight": 100.0 }
]
}
}
,
{
"timeout_ms": 3000,"prefix": "/k8s/ui/","prefix_rewrite": "/",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_kubernetes_dashboard_kube_system_otls", "weight": 100.0 }
]
}
}
,
{
"timeout_ms": 300000,"prefix": "/user/","prefix_rewrite": "/user/",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_tf_hub_lb_default", "weight": 100.0 }
]
}
}
,
{
"timeout_ms": 300000,"prefix": "/hub/","prefix_rewrite": "/hub/",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_tf_hub_lb_default", "weight": 100.0 }
]
}
}
,
{
"timeout_ms": 3000,"prefix": "/","prefix_rewrite": "/",
"weighted_clusters": {
"clusters": [
{ "name": "cluster_centraldashboard_default", "weight": 100.0 }
]
}
}
]
}
]
},
"filters": [
{
"name": "cors",
"config": {}
},{"type": "decoder",
"name": "router",
"config": {}
}
]
}
}
]
}
],
"admin": {
"address": "tcp://127.0.0.1:8001",
"access_log_path": "/tmp/admin_access_log"
},
"cluster_manager": {
"clusters": [
{
"name": "cluster_127_0_0_1_8877",
"connect_timeout_ms": 3000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://127.0.0.1:8877"
}
]},
{
"name": "cluster_centraldashboard_default",
"connect_timeout_ms": 3000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://centraldashboard.default:80"
}
]},
{
"name": "cluster_kubernetes_dashboard_kube_system_otls",
"connect_timeout_ms": 3000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://kubernetes-dashboard.kube-system:443"
}
],
"ssl_context": {
}},
{
"name": "cluster_tf_hub_lb_default",
"connect_timeout_ms": 3000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://tf-hub-lb.default:80"
}
]},
{
"name": "cluster_tf_job_dashboard_default",
"connect_timeout_ms": 3000,
"type": "strict_dns",
"lb_type": "round_robin",
"hosts": [
{
"url": "tcp://tf-job-dashboard.default:80"
}
]}
]
},
"statsd_udp_ip_address": "127.0.0.1:8125",
"stats_flush_interval_ms": 1000
}
AMBASSADOR: shutting down
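One thing that may be worth checking on the affected node is name resolution. The Scout warnings above end in `[Errno -3] Try again`, which on Linux is what a temporary DNS lookup failure (`EAI_AGAIN`) looks like, and every non-local cluster in the dumped envoy.json is `strict_dns`. This may well be unrelated to the Envoy exit itself, since the log attributes that to "previous envoy process is still initializing". The sketch below is only a diagnostic aid: the function names `cluster_hosts` and `resolves` are made up for illustration, and it assumes the JSON printed in the log has been saved to a file and that the script runs somewhere with the same DNS view as the pod.

```python
import socket
from urllib.parse import urlparse


def cluster_hosts(config):
    """Return (cluster name, hostname, port) for each host in an envoy.json dict.

    Assumes the v1-style layout shown in the log above:
    cluster_manager -> clusters -> hosts -> url ("tcp://host:port").
    """
    found = []
    for cluster in config.get("cluster_manager", {}).get("clusters", []):
        for host in cluster.get("hosts", []):
            parsed = urlparse(host["url"])
            found.append((cluster["name"], parsed.hostname, parsed.port))
    return found


def resolves(hostname, port):
    """Return True if the hostname can be resolved from where this runs."""
    try:
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Covers EAI_AGAIN (temporary failure) as well as unknown names.
        return False


# Hypothetical usage, with the dumped config saved as envoy.json:
#
#   import json
#   with open("envoy.json") as f:
#       for name, host, port in cluster_hosts(json.load(f)):
#           print(name, host, port, "ok" if resolves(host, port) else "DNS FAILED")
```

If the service names resolve from a healthy node but not from the node where Ambassador fails, that would point at cluster DNS on that node rather than at Ambassador.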