
configuring graphite/carbon for collecting spark metrics #9

Open · rokroskar opened this issue Dec 11, 2015 · 0 comments

This is not specifically an issue with your grafana-spark dashboard, but I haven't been able to find any information on this anywhere other than your blog post describing this package, so: how do you actually configure carbon?

The problem I'm seeing is that I don't get the full set of metrics for every executor, including metrics that should be present for all executors, like heap space data.

I thought the problem might be data points getting dropped or overwritten when the smallest carbon retention period (specified in storage-schemas.conf) is longer than Spark's sink.graphite.period (in metrics.properties). But setting the Spark metrics period to be longer than the shortest retention period just produces a bunch of null values and doesn't resolve the missing data for a fraction of the executors.
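For concreteness, here is roughly the kind of setup I mean — a minimal sketch where the host and retention values are placeholders, and the Spark reporting period (10s) is matched to carbon's finest retention interval:

```
# metrics.properties (Spark side)
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite.example.com   # placeholder host
*.sink.graphite.port=2003                   # carbon plaintext listener
*.sink.graphite.period=10
*.sink.graphite.unit=seconds

# storage-schemas.conf (carbon side): finest retention matches the sink period
[spark]
pattern = ^spark\.
retentions = 10s:1d,1m:7d
```

Even with the periods aligned like this, I still see the missing-executor behavior described above.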

Here's a screenshot several minutes into an application that is running on 20 executors:

[screenshot: screen shot 2015-12-11 at 14 56 46]

I don't think it's a load issue on the carbon/graphite server: it doesn't appear to be CPU-bound at all, and there are no errors on the Spark side about reporting the metrics to graphite.
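One sanity check I know of (assuming carbon's default self-instrumentation is enabled, and with a placeholder hostname) is to compare how many metrics carbon reports receiving vs. committing to disk:

```
# carbon-cache publishes its own counters by default; if metricsReceived
# and committedPoints diverge, points are being queued or lost server-side
curl 'http://graphite.example.com/render?target=carbon.agents.*.metricsReceived&target=carbon.agents.*.committedPoints&from=-15min&format=json'
```

In my case these counters don't suggest anything is being dropped on the carbon side, which is part of why I'm puzzled.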

I'm curious what your experience with this has been. How do you have the metrics periods configured?
