Skip to content

Journald support #39

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Feb 1, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1 +1,2 @@
/bin/node-problem-detector
/Dockerfile
6 changes: 5 additions & 1 deletion Dockerfile → Dockerfile.in
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,12 @@
# See the License for the specific language governing permissions and
# limitations under the License.

FROM alpine:3.4
FROM @BASEIMAGE@
MAINTAINER Random Liu <lantaol@google.com>

# Avoid symlink of /etc/localtime.
RUN test -h /etc/localtime && rm -f /etc/localtime && cp /usr/share/zoneinfo/UTC /etc/localtime || true

ADD ./bin/node-problem-detector /node-problem-detector
ADD config /config
ENTRYPOINT ["/node-problem-detector", "--kernel-monitor=/config/kernel-monitor.json"]
8 changes: 4 additions & 4 deletions Godeps/Godeps.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

82 changes: 67 additions & 15 deletions Makefile
Original file line number Diff line number Diff line change
@@ -1,24 +1,73 @@
.PHONY: all build-container build-tar build push-container push-tar push clean vet fmt version
# Copyright 2017 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Build the node-problem-detector image.

.PHONY: all build-container build-tar build push-container push-tar push clean vet fmt version Dockerfile

all: build

VERSION := $(shell git describe --tags --dirty)
# VERSION is the version of the binary.
VERSION:=$(shell git describe --tags --dirty)

TAG ?= $(VERSION)
# TAG is the tag of the container image, default to binary version.
TAG?=$(VERSION)

UPLOAD_PATH ?= gs://kubernetes-release
# PROJ is the image project.
PROJ?=gcr.io/google_containers

# UPLOAD_PATH is the cloud storage path to upload release tar.
UPLOAD_PATH?=gs://kubernetes-release
# Trim the trailing '/' in the path
UPLOAD_PATH := $(shell echo $(UPLOAD_PATH) | sed '$$s/\/*$$//')
UPLOAD_PATH:=$(shell echo $(UPLOAD_PATH) | sed '$$s/\/*$$//')

# PKG is the package name of node problem detector repo.
PKG:=k8s.io/node-problem-detector

# PKG_SOURCES are all the go source code.
PKG_SOURCES:=$(shell find pkg cmd -name '*.go')

PROJ ?= google_containers
# TARBALL is the name of release tar. Include binary version by default.
TARBALL:=node-problem-detector-$(VERSION).tar.gz

PKG := k8s.io/node-problem-detector
# IMAGE is the image name of the node problem detector container image.
IMAGE:=$(PROJ)/node-problem-detector:$(TAG)

PKG_SOURCES := $(shell find pkg cmd -name '*.go')
# ENABLE_JOURNALD enables build journald support or not. Building journald support needs libsystemd-dev
# or libsystemd-journal-dev.
# TODO(random-liu): Build NPD inside container.
ENABLE_JOURNALD?=1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: spaces around the assignment operators missing to keep this in-sync with the other variables here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK. Will fix. Thanks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


TARBALL := node-problem-detector-$(VERSION).tar.gz
# TODO(random-liu): Support different architectures.
BASEIMAGE:=alpine:3.4

IMAGE := gcr.io/$(PROJ)/node-problem-detector:$(TAG)
# Disable cgo by default to make the binary statically linked.
CGO_ENABLED:=0

# NOTE that enable journald will increase the image size.
ifeq ($(ENABLE_JOURNALD), 1)
# Enable journald build tag.
BUILD_TAGS:=-tags journald
# Use fedora because it has newer systemd version (229) and support +LZ4. +LZ4 is needed
# on some os distros such as GCI.
BASEIMAGE:=fedora
# Enable cgo because sdjournal needs cgo to compile. The binary will be dynamically
# linked if CGO_ENABLED is enabled. This is fine because fedora already has necessary
# dynamic library. We can not use `-extldflags "-static"` here, because go-systemd uses
# dlopen, and dlopen will not work properly in a statically linked application.
CGO_ENABLED:=1
endif

vet:
go list ./... | grep -v "./vendor/*" | xargs go vet
Expand All @@ -30,14 +79,17 @@ version:
@echo $(VERSION)

./bin/node-problem-detector: $(PKG_SOURCES)
GOOS=linux go build -o bin/node-problem-detector \
-ldflags '-w -extldflags "-static" -X $(PKG)/pkg/version.version=$(VERSION)' \
cmd/node_problem_detector.go
CGO_ENABLED=$(CGO_ENABLED) GOOS=linux go build -o bin/node-problem-detector \
-ldflags '-w -X $(PKG)/pkg/version.version=$(VERSION)' \
$(BUILD_TAGS) cmd/node_problem_detector.go

Dockerfile: Dockerfile.in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a .gitignore for Dockerfile?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't add Dockerfile to PHONY targets, users will be confused. Consider this scenario:

make
# The Dockerfile file didn't exist and make created it.
# @BASEIMAGE@ is Fedora and systemd-journal is enabled.
make ENABLE_JOURNALD=0
# The Dockerfile file already exists and make skipped the Dockerfile target. 
# However the user believes that @BASEIMAGE@ is set to "alpine:3.4" and
# systemd-journal is disabled.

Copy link
Member Author

@Random-Liu Random-Liu Nov 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jfilak I see. You are right. Thanks for explaining. :)
Will do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@andyxning andyxning Feb 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Random-Liu @jfilak

According to #39(discussion), we should also make ./bin/node-problem-detector target phony or otherwise make will think that ./bin/node-problem-detector is already exists and will not recompile it with journald disabled. WDYT.

sed -e 's|@BASEIMAGE@|$(BASEIMAGE)|g' $< >$@

test: vet fmt
go test -timeout=1m -v -race ./pkg/...
go test -timeout=1m -v -race ./pkg/... $(BUILD_TAGS)

build-container: ./bin/node-problem-detector
build-container: ./bin/node-problem-detector Dockerfile
docker build -t $(IMAGE) .

build-tar: ./bin/node-problem-detector
Expand Down
6 changes: 6 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,11 +97,17 @@ spec:
- name: log
mountPath: /log
readOnly: true
- name: localtime
mountPath: /etc/localtime
readOnly: true
volumes:
- name: log
# Config `log` to your system log directory
hostPath:
path: /var/log/
- name: localtime
hostPath:
path: /etc/localtime
```
* Edit node-problem-detector.yaml to fit your environment: Set `log` volume to your system log diretory. (Used by KernelMonitor)
* Create the DaemonSet with `kubectl create -f node-problem-detector.yaml`
Expand Down
10 changes: 6 additions & 4 deletions cmd/node_problem_detector.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,15 +18,15 @@ package main

import (
"flag"
"fmt"
"net/url"
"os"

"github.com/golang/glog"

"k8s.io/node-problem-detector/pkg/kernelmonitor"
"k8s.io/node-problem-detector/pkg/problemdetector"
"k8s.io/node-problem-detector/pkg/version"

"github.com/golang/glog"
"fmt"
)

// TODO: Move flags to options directory.
Expand Down Expand Up @@ -86,5 +86,7 @@ func main() {

k := kernelmonitor.NewKernelMonitorOrDie(*kernelMonitorConfigPath)
p := problemdetector.NewProblemDetector(k, *apiServerOverride, nodeName)
p.Run()
if err := p.Run(); err != nil {
glog.Fatalf("Problem detector failed with error: %v", err)
}
}
3 changes: 2 additions & 1 deletion config/kernel-monitor.json
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
{
"logPath": "/log/kern.log",
"plugin": "journald",
"logPath": "/var/log/journal",
"lookback": "10m",
"startPattern": "Initializing cgroup subsys cpuset",
"bufferSize": 10,
Expand Down
11 changes: 10 additions & 1 deletion node-problem-detector.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ spec:
- name: node-problem-detector
command:
- /node-problem-detector
- --logtostderr
- --kernel-monitor=/config/kernel-monitor.json
image: gcr.io/google_containers/node-problem-detector:v0.2
imagePullPolicy: Always
Expand All @@ -24,7 +25,12 @@ spec:
fieldPath: spec.nodeName
volumeMounts:
- name: log
mountPath: /log
mountPath: /var/log
readOnly: true
# Make sure node problem detector is in the same timezone
# with the host.
- name: localtime
mountPath: /etc/localtime
readOnly: true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI yesterday, Dan Walsh posted this on Atomic devel list (Atomic is basically Fedora):
https://lists.projectatomic.io/projectatomic-archives/atomic-devel/2016-December/msg00014.html

In short, if the /etc/localtime files is are symbolic link, then you end up with the localtime in the container not updated.

However, I don't think you can do something to avoid this issue.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's wired. When I initially decided to go with this approach, I remembered I've tried the symlink situation, it worked.

I'll try again then.

Copy link
Member Author

@Random-Liu Random-Liu Dec 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read your link and understand what you mean. Thanks for sharing!

IIUC, it means that if /etc/localtime inside the container is a symlink (fedora is just using symlink), say /etc/localtime -> ../usr/share/zoneinfo/UTC, ../usr/share/zoneinfo/UTC will be mounted instead.

This should not cause trouble to golang, but it's worth fixing to avoid confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- name: config
mountPath: /config
Expand All @@ -34,6 +40,9 @@ spec:
# Config `log` to your system log directory
hostPath:
path: /var/log/
- name: localtime
hostPath:
path: /etc/localtime
- name: config
configMap:
name: node-problem-detector-config
31 changes: 18 additions & 13 deletions pkg/kernelmonitor/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,12 +9,8 @@ The rule list is extensible.

## Limitations

* Kernel Monitor only supports file based kernel log now. It doesn't support log tools
like journald. There is an [open issue](https://github.com/kubernetes/node-problem-detector/issues/14)
to add journald support.

* Kernel Monitor has assumption on kernel log format, now it only works on Ubuntu and
Debian. However, it is easy to extend it to [support other log format](#support-other-log-format).
* Kernel Monitor only supports syslog (rsyslog) and journald now, but it is easy
to extend it with [new log watcher](#new-log-watcher)

## Add New NodeConditions

Expand Down Expand Up @@ -43,14 +39,23 @@ with new rule definition:
}
```

## Change Log Path
## Log Watchers

Kernel monitor supports different log management tools with different log
watchers:
* [syslog](https://github.com/kubernetes/node-problem-detector/blob/master/pkg/kernelmonitor/logwatchers/syslog)
* [journald](https://github.com/kubernetes/node-problem-detector/blob/master/pkg/kernelmonitor/logwatchers/journald)

### Change Log Path

Kernel log in different OS distros may locate in different path. The `log`
Kernel log on different OS distros may locate in different path. The `logPath`
field in `config/kernel-monitor.json` is the log path inside the container.
You can always configure it to match your OS distro.
You can always configure `logPath` and volume mount to match your OS distro.
* syslog: `logPath` is the kernel log path, usually `/var/log/kern.log`.
* journald: `logPath` is the journal log directory, usually `/var/log/journal`.

## Support Other Log Format
### New Log Watcher

Kernel monitor uses [`Translator`](https://github.com/kubernetes/node-problem-detector/blob/master/pkg/kernelmonitor/translator/translator.go)
plugin to translate kernel log the internal data structure. It is easy to
implement a new translator for a new log format.
Kernel monitor uses [Log
Watcher](https://github.com/kubernetes/node-problem-detector/blob/master/pkg/kernelmonitor/logwatchers/types/log_watcher.go) to support different log management tools.
It is easy to implement a new log watcher.
49 changes: 49 additions & 0 deletions pkg/kernelmonitor/config.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
/*
Copyright 2016 The Kubernetes Authors All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package kernelmonitor

import (
watchertypes "k8s.io/node-problem-detector/pkg/kernelmonitor/logwatchers/types"
kerntypes "k8s.io/node-problem-detector/pkg/kernelmonitor/types"
"k8s.io/node-problem-detector/pkg/types"
)

// MonitorConfig is the configuration of kernel monitor.
type MonitorConfig struct {
// WatcherConfig is the configuration of kernel log watcher.
watchertypes.WatcherConfig
// BufferSize is the size (in lines) of the log buffer.
BufferSize int `json:"bufferSize"`
// Source is the source name of the kernel monitor
Source string `json:"source"`
// DefaultConditions are the default states of all the conditions kernel monitor should handle.
DefaultConditions []types.Condition `json:"conditions"`
// Rules are the rules kernel monitor will follow to parse the log file.
Rules []kerntypes.Rule `json:"rules"`
// StartPattern is the pattern of the start line
StartPattern string `json:"startPattern, omitempty"`
}

// applyDefaultConfiguration applies default configurations.
func applyDefaultConfiguration(cfg *MonitorConfig) {
if cfg.BufferSize == 0 {
cfg.BufferSize = 10
}
if cfg.WatcherConfig.Lookback == "" {
cfg.WatcherConfig.Lookback = "0"
}
}
Loading