telegraf just stops working #1230
Comments
Oh forgot, this happens with 0.12.1 and 0.13.0 |
can you provide your configuration? |
recently in 0.13 some safeguards were added to prevent lockups when running exec commands, but these have not all been patched up yet, as some of Telegraf's dependencies could still be running commands without timeouts. |
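Not Telegraf's actual code, but a minimal sketch of the kind of safeguard being described here: bounding an exec-style command with a timeout so a stuck child process cannot wedge collection forever. The `runWithTimeout` helper and the `uptime` command are illustrative assumptions.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// runWithTimeout runs an external command but gives up (and kills the child)
// once the timeout expires, so a hung command cannot block collection forever.
func runWithTimeout(name string, args []string, timeout time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// CommandContext kills the process when the context deadline is exceeded.
	out, err := exec.CommandContext(ctx, name, args...).Output()
	if ctx.Err() == context.DeadlineExceeded {
		return nil, fmt.Errorf("command %q timed out after %s", name, timeout)
	}
	return out, err
}

func main() {
	out, err := runWithTimeout("uptime", nil, 5*time.Second)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s", out)
}
```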
This is basically the default config with:
https://gist.github.com/RainerW/94c11ac7f4ce42ea17a12ee7e40257eb |
On the Server with 0.13.0
https://gist.github.com/RainerW/586508bc90bca51e9d8cab0dfe98d97a |
Hint: the Apache status check is not active on all instances. On one server I had the default 0.12.1 config (only nfs and glusterfs excludes) https://gist.github.com/RainerW/d3a5f38c1b69d13f69c75e1b1778ccee with the same effect |
If you could SIGQUIT (kill -3) the hung process and post the stack trace it writes to the logs, that would help. |
Not totally sure how to do that with a service. "kill -3 6413" just quits the process, but the log does not contain a stack trace (at least on a restarted instance I tried, which was not hung; I would like to see a stack trace first before using it on a hung instance): |
That's odd, when I send a SIGQUIT I get a stack trace in the logs... I'm not sure then what's going on with yours. Is the disk full or locked to where it can't even write? (Or maybe a permissions problem?) |
since it's happening with the default config, it's likely related to #1215. It could even be the exact same problem. |
You could get hung child processes while the telegraf process is hung with something like: |
Currently the input interface does not have any methods for killing a running Gather call, so there is nothing we can do but log a "FATAL ERROR" and move on. This will at least give some visibility into the plugin that is acting up.
Open questions:
- should the telegraf process die and exit when this happens? This might be a better idea than leaving around the dead process.
- should the input interface have a Kill() method? I suspect not, since most inputs wouldn't have a way of killing themselves anyways.
closes #1230
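A rough, hypothetical sketch of the "log an error and move on" behaviour that commit message describes: Gather runs in its own goroutine and, if it has not returned by a deadline, the agent reports which plugin is hung instead of blocking. The `Input` interface, `gatherWithWatchdog`, and `slowInput` names are simplified assumptions, not Telegraf's real internals.

```go
package main

import (
	"log"
	"time"
)

// Input is a stand-in for the real plugin interface; only Gather matters here.
type Input interface {
	Gather() error
}

// gatherWithWatchdog runs Gather in its own goroutine. If it does not return
// within the timeout, we cannot kill it; we can only log which plugin hung
// and move on, which is the visibility the commit message mentions.
func gatherWithWatchdog(name string, in Input, timeout time.Duration) {
	done := make(chan error, 1) // buffered so a late Gather can still finish
	go func() { done <- in.Gather() }()

	select {
	case err := <-done:
		if err != nil {
			log.Printf("ERROR in input [%s]: %v", name, err)
		}
	case <-time.After(timeout):
		log.Printf("ERROR: input [%s] took longer than %s and was left behind", name, timeout)
	}
}

// slowInput simulates a plugin whose Gather call hangs past the deadline.
type slowInput struct{}

func (slowInput) Gather() error {
	time.Sleep(2 * time.Second)
	return nil
}

func main() {
	gatherWithWatchdog("slow_demo", slowInput{}, 500*time.Millisecond)
}
```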
been kicking this around in the back of my head for a bit, and this seems like as good a reason as any to get it implemented: #1235 |
lol, got a stack trace after fixing a crashed nfs server/mount ... which simultaneously fixed the remaining, not-yet-restarted servers. So now all telegraf instances are running again. Sadly, the one server which had the problem but did not have that nfs mount had been restarted earlier, so I cannot provide a stack trace of a hung instance. But it seems at least to be a problem while accessing the disk statistics, even though I had been excluding "nfs" (or because of it?) |
Now one telegraf instance stopped which did not have that nfs mount: |
I'm seeing this problem as well. I am running kubernetes 1.2 and have telegraf running as a daemonset (0.12.2-1). Last night 2 of 4 instances just stopped reporting. The logs are empty. Restarting fixes the problem. I kept one hung process hanging around to debug. |
My restarted pod died and threw the following: |
@jvalencia that problem is not related to this, that was fixed in Telegraf 0.13. |
I updated, will see if I get hanging behaviour again. |
Changing the internal behavior around running plugins. Each plugin will now have its own goroutine with its own ticker. This means that a hung plugin will not block any other plugins. When a plugin is hung, we will log an error message every interval, letting users know which plugin is hung. Currently the input interface does not have any methods for killing a running Gather call, so there is nothing we can do but log an "ERROR" and move on. This will give some visibility into the plugin that is acting up. closes #1230
Changing the internal behavior around running plugins. Each plugin will now have its own goroutine with its own ticker. This means that a hung plugin will not block any other plugins. When a plugin is hung, we will log an error message every interval, letting users know which plugin is hung. Currently the input interface does not have any methods for killing a running Gather call, so there is nothing we can do but log an "ERROR" and move on. This will give some visibility into the plugin that is acting up. closes #1230 fixes #479
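The per-plugin ticker model described in those commit messages could look roughly like the following: each input gets its own goroutine and ticker, and whenever the previous Gather call is still in flight at the next tick, an error naming the hung plugin is logged. Again, `Input`, `runInput`, and `hangingInput` are illustrative assumptions, not the actual Telegraf agent code.

```go
package main

import (
	"log"
	"sync/atomic"
	"time"
)

// Input is a stand-in for the real plugin interface.
type Input interface {
	Gather() error
}

// runInput gives one plugin its own ticker and goroutine. A hung Gather call
// only affects this plugin: every interval while it is still running, we log
// an error naming it, while every other plugin keeps collecting normally.
func runInput(name string, in Input, interval time.Duration, stop <-chan struct{}) {
	var inFlight int32 // 1 while a Gather call has not yet returned
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			if !atomic.CompareAndSwapInt32(&inFlight, 0, 1) {
				log.Printf("ERROR: input [%s] did not complete within its interval", name)
				continue
			}
			go func() {
				defer atomic.StoreInt32(&inFlight, 0)
				if err := in.Gather(); err != nil {
					log.Printf("ERROR in input [%s]: %v", name, err)
				}
			}()
		}
	}
}

// hangingInput simulates a plugin whose Gather call never returns.
type hangingInput struct{}

func (hangingInput) Gather() error { select {} }

func main() {
	stop := make(chan struct{})
	go runInput("hang_demo", hangingInput{}, time.Second, stop)
	time.Sleep(3500 * time.Millisecond) // long enough to see the repeated error
	close(stop)
}
```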
I'm experiencing this too, version 1.0.1. EDIT: Disregard, it appears to be a client networking issue. |
I'm testing telegraf on different systems, and it seems to stop working after some time.
I guess there was some kind of network hiccup, because it stops working on multiple servers at around the same time. But some servers just continued to work, so I'm sure neither influx nor grafana had a problem.
The systems in question are mostly Ubuntu 11, 14, or 16, but one of the Ubuntu 16 servers continued to work.
In all cases the logfiles just stopped containing anything, but the process continues to run. My guess is that there is no safeguard around the metric collection, so when it starts hanging for whatever reason, telegraf stops working?
Last log entries are:
That is the point in time where telegraf stopped reporting.
It seems to be still running:
After a service restart everything is working fine again. But this happened before, so I guess it will also happen again.
I know this ticket is very broad, but I have nothing to pin it down to; there should at least be safeguards in place to prevent telegraf from stopping completely.