-
-
Notifications
You must be signed in to change notification settings - Fork 828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Activity/OpenTelemetry support #1499
Comments
You could subclass any class in MailKit to add whatever functionality you want. I don't want to take a dependency on OpenTelemetry because most of my users won't want it. |
@jstedfast it doesn't require any external dependency. In-order to implement tracing and metrics in dotnet you use constructs from dotnet, namely |
@EugeneKrapivin right, but I'm talking about not wanting to add the OpenTelemetry NuGet dependency. Maybe I'm just misunderstanding the feature request? |
@jstedfast To get the library instrumented (tracing, metrics, logs) you don't have to use any external dependencies as the maintainer of this library. The users of your library, if they chose to export those traces, logs and metrics will use I'm working on a small POC, and actually using MailKit (that's how I wound up here), so I have a handy example: In a class called public SmtpSenderService(ILogger<SmtpSenderService> logger, IMeterFactory meterFactory)
{
_logger = logger;
_meterFactory = meterFactory;
_meter = _meterFactory.Create("SMTPSender"); // remember this part
_createClientHistogram = _meter.CreateHistogram<double>("smtp_create_client", "ms");
_connectClientHistogram = _meter.CreateHistogram<double>("smtp_connect_client", "ms");
_sendMessageHistogram = _meter.CreateHistogram<double>("smtp_send", "ms");
_exceptionsCounter = _meter.CreateCounter<long>("smtp_exception", "exception");
}
private static ActivitySource source = new("SmptSenderService", "1.0.0"); // remember this part
using System.Diagnostics;
using System.Diagnostics.Metrics;
using Microsoft.Extensions.Logging; namespaces, which require no external dependencies. if (_client is null)
{
var startCreate = Stopwatch.GetTimestamp();
_client ??= new SmtpClient();
var startConnect = Stopwatch.GetTimestamp();
using (var _ = source.StartActivity("Smtp.Connect", ActivityKind.Client))
await _client.ConnectAsync(config.ServerUrl, int.Parse(config.ServerPort), false, cts.Token);
using (var _ = source.StartActivity("Smtp.Authenticate", ActivityKind.Client))
await _client.AuthenticateAsync(config.Username, config.Password, cts.Token);
var elapsedConnect = Stopwatch.GetElapsedTime(startConnect);
var elapsedCreate = Stopwatch.GetElapsedTime(startCreate);
_createClientHistogram.Record(elapsedCreate.TotalMilliseconds);
_connectClientHistogram.Record(elapsedConnect.TotalMilliseconds, new KeyValuePair<string, object?>("flow", "create"));
} and using var sendActivity = source.StartActivity("Smtp.Send", ActivityKind.Client);
try
{
var r = await _client.SendAsync(message, cts.Token);
var total = Stopwatch.GetElapsedTime(start);
sendActivity?.AddTag("status", "success");
sendActivity?.AddEvent(new ActivityEvent("email sent"));
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "success"));
}
catch (SmtpCommandException ex) when (ex.ErrorCode == SmtpErrorCode.RecipientNotAccepted)
{
_logger.LogError(ex, "failed sending an email with error {errorCode}", ex.ErrorCode);
var total = Stopwatch.GetElapsedTime(start);
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "failed"));
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
sendActivity?.AddTag("smtp.error_code", ex.ErrorCode);
return new(false, $"failed with {ex.ErrorCode}");
}
// the catch clauses continue, just to make sure I catch all the exceptions possible and log them properly As I've said before, none of this code requires any external dependencies. Having this code in MailKit could really help with observability and monitoring for mailkit users. In a different place in my service (the host part) I have my dependencies on OpenTelemetry: builder.Services
.AddOpenTelemetry()
.WithTracing(tracing =>
{
tracing.AddSource("SmptSenderService");
})
.WithMetrics(metrics =>
{
metrics
.AddMeter("SMTPSender");
}); these lines make sure that when traces and metrics are exported, my metrics and traces are also exported. and because logs are also enriched when exported: Having MailKit instrumentation out of the box would be really helpful in reducing boilerplate code for monitoring the system with traces and quality metrics. |
Okay, how do you propose adding this to MailKit? |
Well I'd suggest:
This is probably a huge ask, and I think it should be opened for discussion to see if it's worth the investment (aka will users of the library use it). As a trend the dotnet community pushes toward adding instrumentation to libraries (especially IO bound libraries) so that distributed tracing (which is priceless in debugging microservices plane) could be done through just as an example, my PoC uses: ASP.net minimal APIs, KafkaFlow, Orleans, Azure Tables (for orleans storage) and MailKit. which is accompanied by a correlated log: {
"body": "failed sending an email with error SenderNotAccepted",
"traceid": "b62dd4170b6c0b0da30a6dce0c0fcdf5",
"spanid": "b37a86f71fc2c09f",
"severity": "Error",
"flags": 1,
"attributes": {
"ParentId": "e5e6b9e3c5dce273",
"SpanId": "b37a86f71fc2c09f",
"TraceId": "b62dd4170b6c0b0da30a6dce0c0fcdf5",
"errorCode": "SenderNotAccepted",
"exception.message": "Invalid sender user681@mailhog.local",
"exception.stacktrace": "MailKit.Net.Smtp.SmtpCommandException: Invalid sender user681@mailhog.local\r\n at MailKit.Net.Smtp.SmtpClient.SendAsync(FormatOptions options, MimeMessage message, MailboxAddress sender, IList`1 recipients, CancellationToken cancellationToken, ITransferProgress progress)\r\n at MailKit.Net.Smtp.SmtpClient.SendAsync(FormatOptions options, MimeMessage message, MailboxAddress sender, IList`1 recipients, CancellationToken cancellationToken, ITransferProgress progress)\r\n at Processor.Grains.SmtpSenderService.SendMessage(SmtpConfig config, EmailRequest emailRequest) in C:\\Users\\nocgod\\source\\repos\\EmailServicePoc\\Processor\\Processor.Grains\\SmtpSenderService.cs:line 138",
"exception.type": "SmtpCommandException",
"{OriginalFormat}": "failed sending an email with error {errorCode}"
},
"resources": {
"service.instance.id": "243d4bd2-79e4-4ed1-a308-4b8d37bc7665",
"service.name": "processor.host",
"telemetry.sdk.language": "dotnet",
"telemetry.sdk.name": "opentelemetry",
"telemetry.sdk.version": "1.7.0"
},
"instrumentation_scope": {
"name": "Processor.Grains.SmtpSenderService"
}
} |
Right, but how should MailKit provide this info to the app?
Again, how should MailKit provide this info? via events? via some new IClientMetrics interface that can be set on any of the clients? I'm not sure what would make the most sense.
Currently, MailKit does not depend on Microsoft.Extensions.Logging or Microsoft.Extensions.DependencyInjection, so this would be awkward. I've been thinking about moving to DependencyInjection as an alternative to MimeKit.ParserOptions.Register() and MimeKit.Cryptography.CryptographyContext.Register() (especially with the request to make MimeKit depend on MimeKitLite rather than being more-or-less API duplicates of each other), but so far I haven't gotten around to that or even figured out exactly how I want to do that... What do those other libraries you are using do? It sounds like they just assume you'll be logging telemetry/metrics/etc and only really leave up the configuration to the developers, but MailKit is in a bit of a different situation because it's not really designed specifically for ASP.NET apps where you can pretty much count on everyone wanting that kind of telemetry logging. |
it should not concern the library. As long as you use builder.Services
.AddOpenTelemetry()
.WithMetrics(metrics =>
{
metrics
.AddRuntimeInstrumentation()
.AddMeter("Microsoft.Orleans")
.AddBuiltInMeters();
})
.WithTracing(tracing =>
{
if (builder.Environment.IsDevelopment())
{
// We want to view all traces in development
tracing.SetSampler(new AlwaysOnSampler());
}
tracing
.AddSource("Azure.*")
.AddSource("Microsoft.Orleans.Runtime")
.AddSource("Microsoft.Orleans.Application")
.AddSource(KafkaFlowInstrumentation.ActivitySourceName)
.AddAspNetCoreInstrumentation()
.AddGrpcClientInstrumentation()
.AddHttpClientInstrumentation(o => {
o.FilterHttpRequestMessage = (_) => Activity.Current?.Parent?.Source?.Name != "Azure.Core.Http";
});
}); Notice that
you could expose a logging adapter and a logging abstraction like KafkaFlow does.
I guess this will require a bigger refactoring that should probably be out-of-scope for the purposes of observability instrumentation.
for tracing and metrics they use standard dotnet APIs:
which are not dependant on ASP.NET specifically, as you can use them in a plain worker service without dependency on the Web SDKs. The logging is a design decision you could make:
|
@EugeneKrapivin can you help define which metrics (and other info) you want/need? I'll put this in the queue for a future v5.0 release but first I need to know what specific metrics developers care about having. (Note: it seems like there is growing demand for a unification of MimeKit and MimeKitLite which is going to require me to work on a v5.0 sooner rather than later to split MimeKit into MimeKit.Core and MimeKit.Cryptography - this seems like the perfect opportunity to add in metrics & tracing data). |
@jstedfast sorry for the late reply.
I'm adding some code from my POC project in which I've added some metrics, traces and logs around some simple usage of the SMTP client. Please do not take this example as best practice =) I'd suggest taking a look at the best practices from msft documentation :) // note that the SmtpClient is not thread safe (we shouldn't be using reentrant here)
[StatelessWorker]
public class SmtpSenderService : Grain, ISmptSenderService, IDisposable
{
private bool _disposedValue;
private SmtpClient _client;
private readonly ILogger<SmtpSenderService> _logger;
private readonly IMeterFactory _meterFactory;
private Meter _meter;
private Histogram<double> _createClientHistogram;
private Histogram<double> _connectClientHistogram;
private Histogram<double> _sendMessageHistogram;
private Counter<long> _exceptionsCounter;
public SmtpSenderService(ILogger<SmtpSenderService> logger, IMeterFactory meterFactory)
{
_logger = logger;
_meterFactory = meterFactory;
_meter = _meterFactory.Create("SMTPSender");
_createClientHistogram = _meter.CreateHistogram<double>("smtp_create_client", "ms");
_connectClientHistogram = _meter.CreateHistogram<double>("smtp_connect_client", "ms");
_sendMessageHistogram = _meter.CreateHistogram<double>("smtp_send", "ms");
_exceptionsCounter = _meter.CreateCounter<long>("smtp_exception", "exception");
}
private static ActivitySource source = new("SmptSenderService", "1.0.0");
public async Task<SendResult> SendMessage(SmtpConfig config, EmailRequest emailRequest)
{
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
try
{
if (_client is null)
{
var startCreate = Stopwatch.GetTimestamp();
_client ??= new SmtpClient();
var startConnect = Stopwatch.GetTimestamp();
using (var _ = source.StartActivity("Smtp.Connect", ActivityKind.Client))
await _client.ConnectAsync(config.ServerUrl, int.Parse(config.ServerPort), false, cts.Token);
using (var _ = source.StartActivity("Smtp.Authenticate", ActivityKind.Client))
await _client.AuthenticateAsync(config.Username, config.Password, cts.Token);
var elapsedConnect = Stopwatch.GetElapsedTime(startConnect);
var elapsedCreate = Stopwatch.GetElapsedTime(startCreate);
_createClientHistogram.Record(elapsedCreate.TotalMilliseconds);
_connectClientHistogram.Record(elapsedConnect.TotalMilliseconds, new KeyValuePair<string, object?>("flow", "create"));
}
var reconnect = false;
try
{
using var _ = source.StartActivity("Smtp.NoOp", ActivityKind.Client);
await _client.NoOpAsync();
}
catch (Exception ex)
{
reconnect = true;
}
if (!_client.IsConnected || reconnect)
{
var startConnect = Stopwatch.GetTimestamp();
using (var _ = source.StartActivity("Smtp.Connect", ActivityKind.Client))
await _client.ConnectAsync(config.ServerUrl, int.Parse(config.ServerPort), false, cts.Token);
using (var _ = source.StartActivity("Smtp.Authenticate", ActivityKind.Client))
await _client.AuthenticateAsync(config.Username, config.Password, cts.Token);
var elapsedConnect = Stopwatch.GetElapsedTime(startConnect);
_connectClientHistogram.Record(elapsedConnect.TotalMilliseconds, new KeyValuePair<string, object?>("flow", "create"));
}
DelayDeactivation(TimeSpan.FromSeconds(30));
}
catch (AuthenticationException ex)
{
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "connect"));
_logger.LogError(ex, "failed authenticating with smtp server");
}
catch (Exception ex) when (ex is OperationCanceledException || ex is TimeoutException)
{
_logger.LogError(ex, "failed sending message {messageId} due to a timeout", emailRequest.MessageId);
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "connect"));
return new(false, "timeout");
}
catch (Exception ex)
{
_logger.LogError(ex, "failed sending message {messageId} due to an exception", emailRequest.MessageId);
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "connect"));
return new(false, "exception");
}
var message = CreateMimeMessage(config, emailRequest);
var start = Stopwatch.GetTimestamp();
using var sendActivity = source.StartActivity("Smtp.Send", ActivityKind.Client);
try
{
var r = await _client.SendAsync(message, cts.Token);
var total = Stopwatch.GetElapsedTime(start);
sendActivity?.AddTag("status", "success");
sendActivity?.AddEvent(new ActivityEvent("email sent"));
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "success"));
}
catch (SmtpCommandException ex) when (ex.ErrorCode == SmtpErrorCode.RecipientNotAccepted)
{
_logger.LogError(ex, "failed sending an email with error {errorCode}", ex.ErrorCode);
var total = Stopwatch.GetElapsedTime(start);
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "failed"));
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
sendActivity?.AddTag("smtp.error_code", ex.ErrorCode);
return new(false, $"failed with {ex.ErrorCode}");
}
catch (SmtpCommandException ex) when (ex.ErrorCode == SmtpErrorCode.SenderNotAccepted)
{
_logger.LogError(ex, "failed sending an email with error {errorCode}", ex.ErrorCode);
var total = Stopwatch.GetElapsedTime(start);
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "failed"));
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
sendActivity?.AddTag("smtp.error_code", ex.ErrorCode);
return new(false, $"failed with {ex.ErrorCode}");
}
catch (SmtpCommandException ex) when (ex.ErrorCode == SmtpErrorCode.MessageNotAccepted)
{
_logger.LogError(ex, "failed sending an email with error {errorCode}", ex.ErrorCode);
var total = Stopwatch.GetElapsedTime(start);
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "failed"));
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
sendActivity?.AddTag("smtp.error_code", ex.ErrorCode);
return new(false, $"failed with {ex.ErrorCode}");
}
catch (SmtpCommandException ex) when (ex.ErrorCode == SmtpErrorCode.UnexpectedStatusCode)
{
_logger.LogError(ex, "failed sending an email with error {errorCode}", ex.ErrorCode);
var total = Stopwatch.GetElapsedTime(start);
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "failed"));
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
sendActivity?.AddTag("smtp.error_code", ex.ErrorCode);
return new(false, $"failed with {ex.ErrorCode}");
}
catch (SmtpProtocolException ex)
{
_logger.LogError(ex, "failed sending message {messageId} due to smtp protocol exception", emailRequest.MessageId);
var total = Stopwatch.GetElapsedTime(start);
_sendMessageHistogram.Record(total.TotalMicroseconds, new KeyValuePair<string, object?>("status", "failed"));
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
return new(false, ex.Message);
}
catch (IOException ex)
{
_logger.LogError(ex, "failed sending message {messageId} due to an IO exception", emailRequest.MessageId);
var total = Stopwatch.GetElapsedTime(start);
_sendMessageHistogram.Record(total.TotalMicroseconds);
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
return new(false, ex.Message);
}
catch (Exception ex) when (ex is OperationCanceledException || ex is TimeoutException)
{
_logger.LogError(ex, "failed sending message {messageId} due to a timeout", emailRequest.MessageId);
_exceptionsCounter.Add(1,
new KeyValuePair<string, object?>("exception", ex.GetType().Name),
new KeyValuePair<string, object?>("flow", "send"));
sendActivity?.AddTag("status", "failed");
sendActivity?.AddTag("exception", ex.GetType().Name);
return new(false, "timeout");
}
finally
{
sendActivity?.Dispose(); // avoid nesting of Disconnect under the Send activity
using var _ = source.StartActivity("Smtp.Disconnect", ActivityKind.Client);
await _client.DisconnectAsync(true, cts.Token);
}
return new(true, "sent");
}
private static MimeMessage CreateMimeMessage(SmtpConfig config, EmailRequest emailRequest)
{
var message = new MimeMessage();
message.From.Add(new MailboxAddress(config.FromName, config.From));
message.To.Add(new MailboxAddress(emailRequest.RecipientName, emailRequest.Recipient));
message.Subject = emailRequest.Subject;
message.Body = new TextPart("plain")
{
Text = emailRequest.Body
};
return message;
}
protected virtual void Dispose(bool disposing)
{
if (!_disposedValue)
{
if (disposing)
{
_logger.LogWarning("disposing of connection");
_client?.Dispose();
}
_disposedValue = true;
}
}
public void Dispose()
{
// Do not change this code. Put cleanup code in 'Dispose(bool disposing)' method
Dispose(disposing: true);
GC.SuppressFinalize(this);
}
} |
@EugeneKrapivin Thanks for taking the time to respond and share your ideas. I'll definitely be trying to read up on Microsoft's guides on best practices. |
Thanks, @jstedfast, for considering adding Telemetry support in Mailkit v5.0. All other libraries that we use to make external calls, such as |
Ah, cool, I should look at how HttpClient does stuff as that might make be a close match with MailKit's client APIs. |
Some docs on dotnet's built-in networking (including HttpClient) metrics: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/built-in-metrics-system-net#metric-httpclientopen_connections Also seems like this Meter stuff is only in net8.0? |
No. We are running .NET 6 and using |
A few things I'm not clear about:
|
I would still like answers to my questions in the previous comment, but these are the metrics that I've come up with so far (mostly focused on SmtpClient at the moment since that's what you guys are focused on): Socket MetricsMetric:
|
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.socket.connect |
Histogram | ms |
The number of milliseconds taken for a socket to connect to a remote host. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
socket.connect.result |
string | The connection result. | succeeded , failed , cancelled |
Always |
network.peer.address |
string | Peer IP address of the socket connection. | 10.5.3.2 |
Always |
server.address |
string | The host name that the socket is connecting to. | smtp.gmail.com |
Always |
server.port |
int | The port that the socket is connecting to. | 465 |
Always |
error.type |
string | The type of exception encountered. | System.Net.Sockets.SocketException |
If an error occurred. |
socket.error |
string | The socket error code. | ConnectionRefused , TimedOut , ... |
If one was received. |
This metric measures the time it takes to connect a socket to a remote host.
SmtpClient Metrics
Metric: mailkit.net.smtp.client.connect.counter
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.connect.counter |
Counter | {connection} |
Number of outbound SMTP connections that are currently active. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 10.5.3.2 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connecting to. | 465 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
This metric tracks the number of successful connections made by the SmtpClient.
Metric: mailkit.net.smtp.client.connect.failures
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.connect.failures |
Counter | {failure} |
The number of times a connection failed to be established to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connecting to. | 465 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | MailKit.Net.Smtp.SmtpCommandException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
string | The socket error code. | ConnectionRefused , TimedOut , ... |
If one was received. |
This metric counts the number of failed connection attempts made by the SmtpClient.
Metric: mailkit.net.smtp.client.session.duration
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.session.duration |
Histogram | s |
The duration of successfully established connections to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 10.5.3.2 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connecting to. | 465 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | MailKit.Net.Smtp.SmtpCommandException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
string | The socket error code. | ConnectionRefused , TimedOut , ... |
If one was received. |
This metric tracks the session duration of each SmtpClient connection and records whether that session ended voluntarily or due to an error.
Metric: mailkit.net.smtp.client.send.messages
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.send.messages |
Counter | {message} |
The number of messages successfully sent to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 10.5.3.2 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connecting to. | 465 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
This metric tracks the number of messages successfully sent by an SmtpClient.
Metric: mailkit.net.smtp.client.send.failures
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.send.failures |
Counter | {failure} |
The number of times that sending a message to an SMTP server failed. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 10.5.3.2 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connecting to. | 465 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | MailKit.Net.Smtp.SmtpCommandException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
string | The socket error code. | ConnectionRefused , TimedOut , ... |
If one was received. |
This metric tracks the number of times an SmtpClient encountered an error while trying to send a message.
Metric: mailkit.net.smtp.client.send.duration
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.send.duration |
Histogram | ms |
The amount of time it takes to send a message to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 10.5.3.2 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connecting to. | 465 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | MailKit.Net.Smtp.SmtpCommandException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
string | The socket error code. | ConnectionRefused , TimedOut , ... |
If one was received. |
This metric tracks the amount of time that it takes to send a message and records any error details if the message was not sent successfully.
First, I'd like to say that I'm not completely happy with the metric names I've come up with, especially the I'd also like to get some feedback on attribute names/values, especially the Type for both I started out with Does it make sense for both of these to be The Socket Metrics are because the way I implemented it was to implement it at the MailKit.Net.SocketUtils-level rather than it the higher-level to avoid duplicating as much code. Also if a connect is aborted due to the SmtpClient.Timeout value or the CancellationToken, we don't actually get accurate information if I was to do it at the SmtpClient-level. Thinking back on my questions from an earlier comment, I'm guessing all of my metrics should be "global" and not "instance", but I'm not 100% positive on that. Then of course, if they are "global", then does it make sense to add attributes to determine which instance a metric was logged from? e.g. perhaps have a Guid (or int?) |
Ok, so reading over https://github.com/open-telemetry/semantic-conventions/blob/main/docs/general/metrics.md#naming-rules-for-counters-and-updowncounters I guess I should probably make the following name changes:
(I think technically I also wonder if maybe |
instead of using
just an idea. PS: a bit of math :) a linearly increasing counter could give you a rate using a derivative operation. a rate meter could give you a count using an integral operation. sry for the math tidbits 🤣 |
Okay, so I'll drop the That still doesn't answer the most important question which is should these instruments (Counters/Histograms) be per-instance or global? |
Just to bookmark this for myself, this is the code for metrics reporting in HttpClient: |
Updated list of metrics for SmtpClient: SmtpClient MetricsMetric:
|
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.connect.count |
Counter | {attempt} |
The number of times a client has attempted to connect to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 142.251.167.109 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connected to. | 25 , 465 , 587 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | System.Net.Sockets.SocketException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
int | The socket error code. | 10054 , 10060 , 10061 , ... |
If one was received. |
This metric tracks the number of times an SmtpClient has attempted to connect to an SMTP server.
Metric: mailkit.net.smtp.client.connect.duration
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.connect.duration |
Histogram | ms |
The amount of time it takes for the client to connect to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 142.251.167.109 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connected to. | 25 , 465 , 587 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | System.Net.Sockets.SocketException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
int | The socket error code. | 10054 , 10060 , 10061 , ... |
If one was received. |
This metric tracks the amount of time it takes an SmtpClient to connect to an SMTP server.
Metric: mailkit.net.smtp.client.connection.duration
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.connection.duration |
Histogram | s |
The duration of successfully established connections to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 142.251.167.109 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connected to. | 25 , 465 , 587 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | MailKit.Net.Smtp.SmtpCommandException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
int | The socket error code. | 10054 , 10060 , 10061 , ... |
If one was received. |
This metric tracks the connection duration of each SmtpClient connection and records any error details if the connection was terminated involuntarily.
Metric: mailkit.net.smtp.client.send.count
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.send.count |
Counter | {message} |
The number of messages sent to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 142.251.167.109 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connected to. | 25 , 465 , 587 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | MailKit.Net.Smtp.SmtpCommandException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
int | The socket error code. | 10054 , 10060 , 10061 , ... |
If one was received. |
This metric tracks the number of messages sent by an SmtpClient and records any error details if the message was not sent successfully.
Metric: mailkit.net.smtp.client.send.duration
Name | Instrument Type | Unit | Description |
---|---|---|---|
mailkit.net.smtp.client.send.duration |
Histogram | ms |
The amount of time it takes to send a message to an SMTP server. |
Attribute | Type | Description | Examples | Presence |
---|---|---|---|---|
network.peer.address |
string | Peer IP address of the client connection. | 142.251.167.109 |
When available |
server.address |
string | The host name that the client is connected to. | smtp.gmail.com |
Always |
server.port |
int | The port that the client is connected to. | 25 , 465 , 587 |
Always |
url.scheme |
string | The URL scheme of the protocol used. | smtp or smtps |
Always |
error.type |
string | The type of exception encountered. | MailKit.Net.Smtp.SmtpCommandException |
If an error occurred. |
smtp.status_code |
int | The SMTP status code returned by the server. | 530 , 550 , 553 , ... |
If one was received. |
socket.error |
int | The socket error code. | 10054 , 10060 , 10061 , ... |
If one was received. |
This metric tracks the amount of time that it takes an SmtpClient to send a message and records any error details if the message was not sent successfully.
I'm sorry missed that question. I think you should share the Meters between instances, and it looks like the documentation says the APIs are thread-safe. I'd share the Meter (or use IMeterFactory which exists in net8 only😞), and create the instruments (counters, meters, histograms) in the constructor of the instance itself. But I'd be really happy to get some community input on this because I'm not sure whether this is the best practice.
source: https://learn.microsoft.com/en-us/dotnet/core/diagnostics/metrics-instrumentation I've seen your table, thats impressive :) You should be very careful with tags with high cardinality: like IP and hostname. |
I’m not entirely sure about the best course of action here. However, I'm leaning towards having more details in the traces rather than in the metrics. Personally, I'd rather avoid mixing external library logs with my app logs as it seems somewhat unconventional to me. Let’s consider adding metrics only when we feel they are essential for monitoring, as suggested by @EugeneKrapivin. If uncertain, let's refrain from adding it for now and revisit the decision when needed. |
I assume we'd like data that will help us monitor and troubleshoot the service faster, preferably before issues on the client side begin to bubble :)
it is an integral of the rate metric, so as long as you report 1 (either count or rate) you could calculate the other simply These are prometheus queries with grafana variables that I copied form a board we have to monitor number exceptions thrown by a service in production:
metrics like these are actually useful for developers who use MailKit, they could report those metrics and build useful alerts based on this metric. If the duration of a send request is increasing overtime its a cause for alarm, seeing quantiles (p99.9) of the duration also helps find issues, especially spikes that get lost in the standard average/mean aggregation people look at.
it'd be really helpful to get outgoing client requests as activities (trace) because it would allow the user to trace a single request and whether or not the email was sent successfully. that's (distributed traces) actually something I'm using daily while researching production issues. var start = Stopwatch.GetTimestamp();
using var sendActivity = source.StartActivity("Smtp.Send", ActivityKind.Client);
try
{
var r = await _client.SendAsync(message, cts.Token);
var total = Stopwatch.GetElapsedTime(start);
sendActivity?.AddTag("status", "success");
sendActivity?.AddEvent(new ActivityEvent("email sent"));
_sendMessageHistogram.Record(total.TotalMilliseconds new KeyValuePair<string, object?>("status", "success"));
} Especially since Traces and Logs are correlated using a RequestId parameter, meaning I could trace a request and find all logs written as a side-effect of the In general, metrics are only a partial view of the whole observability game which must include logs, traces and metrics :). I say you could have only 2-3 metrics, with good tags just to get you up and running with RED metrics.
of course I do not know the internals of the library so if there are other metrics that you think that might be useful for the users of the library, you should add them. the metrics I suggested would allow a user to:
of course all of those I could gather manually from outside the library by simply wrapping it with lots of code... but that's the added value you've mentioned :) Regarding traces, just instrument the most used (in your opinion) client and see if requests or MR for other clients arrive =) if you build it, they will come 🤣 Regarding logs, I strongly disagree with @prashanth-nani about library logs in my product logs, I think that once I use the library in my code, it is my job to monitor, read and understand its behavior. This is why all of our logs are gathered into elasticsearch under a single request ID for each and every request (pricy endeavor for sure). these include all the logs that are written by a running process. If there is something I don't want I just filter it in the log4net/logging/logstash/flume forwarders and drop it before it gets stored. logs from MailKit is as important as your log in the service once you use MailKit in production. @jstedfast I think you should be using the Microsoft logging abstraction, it'd allow the user to use any logger they want (all loggers implement the ILogger abstraction. In addition, those logs could be easily exported to OpenTelemetry collector easily in dotnet. Of course all you have to do is use the standard wow this got fairly long. sorry. |
@EugeneKrapivin, thanks for the detailed explanation! I fully agree with your points. I stand corrected regarding my previous thoughts on library logs. Including them where ever necessary using On instrumentation, I align with your suggestions for the traces and Additionally, Thanks again for your valuable insights! |
just be careful with tags with variadic values on metrics - it will explode the backend storage for sure :) it should be OK as an argument in Activity/Event trace |
@EugeneKrapivin Thanks for your thoughts, as always. I really appreciate the guidance because I'm not well-versed in this stuff. What you said makes sense and brings the bigger picture into better focus for me. |
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
Here's where I'm at with telemetry metrics: https://github.com/jstedfast/MailKit/blob/metrics/Telemetry.md A lot of this is based on HttpClient and reading up on the OpenTelemetry specifications, conventions, etc. I've also got Activities that span Connect/Async, Authenticate/Async and Send/Async. |
@jstedfast would you mind opening a draft merge/pull request so we could productively leave comments? |
Done: #1715 |
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
…nticate Partial fix for issue #1499
Is it possible to wrap the calling server around the Activity? So that, I can get elapsed time per calling.
The text was updated successfully, but these errors were encountered: