Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(specs): Add probe as value to startup_error_behavior #16052

Merged
merged 17 commits into from
Dec 11, 2024

Conversation

LandonTClipp
Copy link
Contributor

@LandonTClipp LandonTClipp commented Oct 21, 2024

Summary

This PR submits a spec for a new probe value to startup_error_behavior.

Checklist

  • No AI generated code was used in this PR

Related issues

resolves #16028

@telegraf-tiger telegraf-tiger bot added the docs Issues related to Telegraf documentation and configuration descriptions label Oct 21, 2024
@LandonTClipp LandonTClipp changed the title docs: Add probe as value to startup_error_behavior docs(specs): Add probe as value to startup_error_behavior Oct 23, 2024
@LandonTClipp
Copy link
Contributor Author

@srebhan sorry for the ping, just curious if you have time to give thoughts on this spec?

Copy link
Member

@DStrand1 DStrand1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making this PR! I have one question below, but additionally I think it may make more sense to include this as an edit to the existing spec (possibly dependent on the outcome from my other question)

Additionally, could you format the long lines with line breaks to reduce the line length a bit? Thanks!

docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LandonTClipp sorry for the late response and thanks for the spec! That already reads quite nicely. I think you need to more describe the intended behavior here than implementations. Use must and should for mandatory and optional features respectively. I would also love to see the Configuration part to be added to TSD-006 to keep all option related aspects there...

docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
@srebhan srebhan self-assigned this Oct 29, 2024
@LandonTClipp
Copy link
Contributor Author

@srebhan @DStrand1 I addressed all of your comments, including the line width comment. Thank you for your thoughtful insight 🙏🏻

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @LandonTClipp for working with us on this spec! Some more comments from my side, but I feel we are very close...

docs/specs/tsd-006-startup-error-behavior.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
docs/specs/tsd-008-probe-on-startup.md Outdated Show resolved Hide resolved
LandonTClipp and others added 7 commits October 31, 2024 12:18
Co-authored-by: Sven Rebhan <36194019+srebhan@users.noreply.github.com>
Co-authored-by: Sven Rebhan <36194019+srebhan@users.noreply.github.com>
Co-authored-by: Sven Rebhan <36194019+srebhan@users.noreply.github.com>
Co-authored-by: Sven Rebhan <36194019+srebhan@users.noreply.github.com>
Co-authored-by: Sven Rebhan <36194019+srebhan@users.noreply.github.com>
@LandonTClipp
Copy link
Contributor Author

@srebhan addressed comments and made some typo fixes.

Also related, what do you think about reverting #15916 in favor of this spec?

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @LandonTClipp! The PR looks great! Could you please rename the file to tsd-009-probing.md as we already merged TSD-008 in the meantime. Please also change the reference in TSD-006 accordingly!

Regarding reverting the code in nvidia-smi, yes I think you can revert or change the implementation to reflect this spec as we did not release the mentioned feature yet...

@LandonTClipp
Copy link
Contributor Author

Regarding reverting the code in nvidia-smi, yes I think you can revert or change the implementation to reflect this spec as we did not release the mentioned feature yet...

Excellent, I'll probably just revert for now, in case I cannot implement this spec in time for the release.

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @LandonTClipp!

@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Nov 11, 2024
@srebhan srebhan assigned DStrand1 and unassigned srebhan Nov 11, 2024
@LandonTClipp
Copy link
Contributor Author

@DStrand1 sorry for the ping but I wanted to check what you think of this. I'm motivated to implement it for the plugins I care about :D

@LandonTClipp
Copy link
Contributor Author

@srebhan @DStrand1 pinging again on this

@srebhan
Copy link
Member

srebhan commented Dec 5, 2024

@LandonTClipp could you please fix the linter issues? Seems like you do have a lot of trailing spaces in markdown....

@telegraf-tiger
Copy link
Contributor

telegraf-tiger bot commented Dec 5, 2024

Download PR build artifacts for linux_amd64.tar.gz, darwin_arm64.tar.gz, and windows_amd64.zip.
Downloads for additional architectures and packages are available below.

⚠️ This pull request increases the Telegraf binary size by 2.03 % for linux amd64 (new size: 261.9 MB, nightly size 256.7 MB)

📦 Click here to get additional PR build artifacts

Artifact URLs

DEB RPM TAR GZ ZIP
amd64.deb aarch64.rpm darwin_amd64.tar.gz windows_amd64.zip
arm64.deb armel.rpm darwin_arm64.tar.gz windows_arm64.zip
armel.deb armv6hl.rpm freebsd_amd64.tar.gz windows_i386.zip
armhf.deb i386.rpm freebsd_armv7.tar.gz
i386.deb ppc64le.rpm freebsd_i386.tar.gz
mips.deb riscv64.rpm linux_amd64.tar.gz
mipsel.deb s390x.rpm linux_arm64.tar.gz
ppc64el.deb x86_64.rpm linux_armel.tar.gz
riscv64.deb linux_armhf.tar.gz
s390x.deb linux_i386.tar.gz
linux_mips.tar.gz
linux_mipsel.tar.gz
linux_ppc64le.tar.gz
linux_riscv64.tar.gz
linux_s390x.tar.gz

@LandonTClipp
Copy link
Contributor Author

@srebhan done!

Copy link
Member

@DStrand1 DStrand1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work!

@DStrand1 DStrand1 merged commit 795b8a9 into influxdata:master Dec 11, 2024
27 checks passed
@github-actions github-actions bot added this to the v1.33.1 milestone Dec 11, 2024
justinwwhuang pushed a commit to justinwwhuang/telegraf_fork that referenced this pull request Dec 19, 2024
izekr pushed a commit to izekr/telegraf that referenced this pull request Dec 19, 2024
Comment on lines +86 to +87
If probing is available *and* returns an error Telegraf must *ignore* the
plugin as-if it was not configured at all.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear what happens if the probing is not returning an error. As it reads that this probing only needs to be done when startup returns errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hipska perhaps one more paragraph added here: https://github.com/influxdata/telegraf/pull/16052/files#diff-2d519c82d1022b0befbd7601817fd1b2073ad5e9b607cd9c2205f0529b7d0fffR47 more clearly outlining what Telegraf does on Probe success is what you're looking for? I'm happy to add more clarification.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah indeed.

Also, is probing really only on startup? Meaning if conditions change and recourses come available, that will only take effect when telegraf gets restarted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. Sorry for the delayed response on the update to the spec, I will get to it soon.

This spec is backwards compatible so if you don't want this behavior, you don't need to do anything.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said in other comment, maybe the spec should clarify a use case? When would it be handy to only be able to recover by restarting telegraf completely?

DStrand1 pushed a commit that referenced this pull request Jan 10, 2025
@LandonTClipp LandonTClipp deleted the LandonTClipp/probe_spec branch January 16, 2025 16:58
Comment on lines +14 to +20
When plugins are first instantiated, Telegraf will call the plugin's `Start()`
method (for inputs) or `Connect()` (for outputs) which will initialize its
configuration based off of config options and the running environment. It is
sometimes the case that while the initialization step succeeds, the upstream
service in which the plugin relies on is not actually running, or is not capable
of being communicated with due to incorrect configuration or environmental
problems. In situations like this, Telegraf does not detect that the plugin's
Copy link
Contributor

@Hipska Hipska Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initialisation happens in Init(), connection attempts in Start() (for service plugins) / Connect() (for outputs).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Issues related to Telegraf documentation and configuration descriptions ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

probe_on_startup: Add input config and interface similar to startup_error_behavior
4 participants