
Greppable Logging #200

Closed
jbenet opened this issue Oct 25, 2014 · 14 comments
Labels
help wanted Seeking public contribution on this issue kind/enhancement A net-new feature or improvement to an existing feature

Comments

@jbenet
Member

jbenet commented Oct 25, 2014

Would be nice to have logging be nicely greppable for things like specific queries, node IDs, etc. Already we include the node ID in many log statements, and perhaps we can turn this into prefixes or variables given the execution context. We could do this by using a logger in the context.Context (as @maybebtc suggested a long while back).

@whyrusleeping
Member

When I woke up this morning, I had the idea that I wanted all logs related to a specific DHT RPC to be tagged so I could grep them all out. So +100000

@jbenet jbenet added help wanted Seeking public contribution on this issue kind/enhancement A net-new feature or improvement to an existing feature ready labels Oct 31, 2014
@btc
Contributor

btc commented Oct 31, 2014

This goes a long way toward generating insightful network visualizations. ipfs' log format ought to be machine-readable. That way, it can be loaded into Elasticsearch for use in Kibana. Throw Packetbeat on top and we're in business.

http://packetbeat.com/
https://github.com/packetbeat/packetbeat/

primary aspects

  • define network events
  • include relevant event-specific metadata (using either key-value pairs or json blobs (harder to read))
05-15-2014 - 11:45:42:000 - USER=wp34056 - action=login - page=blog.logentries.com - referrer=www.google.com - status=200
  • UTC timestamp
  • use unique-ish identifiers to tie together events belonging to a larger transaction
  • consider the importance of being able to derive timing information
  • include the version of ipfs

secondary aspects

  • human-readable
  • include information about the package/function that produced it
  • define a strategy to improve SnR (signal-to-noise ratio): focus logging efforts on high-value events
  • define a structured log format
  • choose an implementation (logging context or static import or other)
  • outline a logging strategy for SnR

To aid in the development of https://github.com/jbenet/ipfs/issues/39

@btc
Contributor

btc commented Nov 1, 2014

Here is my proposal. I'd love to get your thoughts. The interface is rather clunky. I think it can be whittled down into something that feels better to write.

One additional constraint that feels important is maintaining consistent metadata formatting across packages. Consistent metadata structure will help with analysis.

@whyrusleeping @jbenet @mappum

var log = Logger("dht")

func example() {

    local := peer.WithID(peer.ID("me"))
    remote := peer.WithID(peer.ID("u"))
    key := util.Key("baz")

    // init
    ctx := ContextWithLogData(context.TODO(),
        LocalPeer(local),
    )

    // Use Case: A message received from network. we want to track the message 
    // as it makes its way through the system:
    ctx = ContextWithLogData(context.TODO(),
        UUID("dht_request"),
    )

    log.Debug(ctx,
        Event("adding provider",
            LocalPeer(local),
            Key(key),
        ),
    )

    /* as json
       {
        "timestamp": "2014-10-27T09:44:55+00:00",
        "log": "dht",
        "level": "debug",
        "metadata": {
            "local_peer": "me",
            "dht_request": "123e4567-e89b-12d3-a456-426655440000",
            "event": {
                "name": "adding provider",
                "local_peer": "me",
                "key": "baz"
            }
        }
       }
    */

    // or as a log line
    // 2014-10-27T09:44:55+00:00 DEBUG DHT local_peer=me dht_request=123e45 event_name="adding provider" event_key=baz

    log.Debug(ctx,
        LocalPeer(local),
        RemotePeer(peer.WithID(peer.ID("bar"))),
        Error(errors.New("something happened. non-critical but odd")),
    )

    log.Debug(ctx,
        Event("connected", LocalPeer(local), RemotePeer(remote)),
    )
}

If you just want to spit some stuff out to the console, you can pass a format string.

log.Debug(context.TODO(), F("peer %s received %s", peer.Key(), block.Key()))

// or something marshallable
log.Debug(context.TODO(), M(peer))
log.Debug(context.TODO(), Marshalled(peer))

You can even generate a stack trace!

log.Debug(context.TODO(), LocalPeer(local), StackTrace)

You see, log.Debug has the following signature:

Open questions:

  • import a static var log logger or get the logger from the context?
  • name of package?
  • whether it's preferable to nest events...
    log.Debug(ctx,
        Event("adding provider",
            LocalPeer(local),
            Key(key),
        ),
    )
    // without nesting 
    log.Debug(ctx,
        Event("adding provider"),
        LocalPeer(local),
        Key(key),
    )

@jbenet
Member Author

jbenet commented Nov 1, 2014

👍 i like this all.

import a static var log logger or get the logger from the context?
name of package?

As discussed yesterday, if we get the name of the package on the log statement, then we can use just one global logger.

whether its preferable to nest events...

Constructing objects / calling functions still feels clunky. The less verbosity in describing each log statement the better. +1 to type safety, but, for example, type safety on the event doesn't do anything because the only relevant piece of information is the key, which is a string.

What's key from this pkg is ease of use in both reading + writing logs.

@whyrusleeping
Member

I like this a lot, my only concern is that all the arguments to the debug calls will be evaluated before the function call is made, which drops our performance some when the logs are off (especially in tighter loops and critical paths)

@jbenet
Member Author

jbenet commented Nov 1, 2014

Ahh yes! That'll marshal everything prematurely. I think this calls for continuing to give pointers. Or wrap in an object that marshals in its String().
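The "wrap in an object that marshals in its String()" idea can be sketched with fmt.Stringer. The names here (`Lazy`, `Debug`, `debugEnabled`) are hypothetical; the point is that the expensive rendering only runs if the logger actually formats the value.

```go
package main

import "fmt"

type lazyRep struct {
	f func() string
}

// String defers the expensive work until a log backend formats the value.
func (l lazyRep) String() string { return l.f() }

// Lazy wraps a thunk so a log argument costs almost nothing when logs are off.
func Lazy(f func() string) fmt.Stringer { return lazyRep{f} }

var debugEnabled = false

// Debug discards its arguments without formatting them when disabled,
// so Lazy-wrapped values are never rendered.
func Debug(args ...interface{}) {
	if !debugEnabled {
		return
	}
	fmt.Println(args...)
}

func main() {
	expensiveCalls := 0
	Debug(Lazy(func() string {
		expensiveCalls++ // only runs if String() is actually called
		return "big marshalled blob"
	}))
	fmt.Println("expensive calls with logs off:", expensiveCalls)
	// prints: expensive calls with logs off: 0
}
```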

@btc
Contributor

btc commented Nov 1, 2014

I like this a lot, my only concern is that all the arguments to the debug calls will be evaluated before the function call is made, which drops our performance some when the logs are off (especially in tighter loops and critical paths)

This is true. Definitely a non-zero cost.

Primary resource hogs are hashing, network latency, and disk latency, yeah? Are there any hotspots that don't involve one of these three?

@btc
Contributor

btc commented Nov 4, 2014

Concern

Because these helpers would be imported from another package, actual calls would look like this.

  log.Debug(ctx,
      foo.Event("adding provider"),
      foo.LocalPeer(local),
      foo.Key(key),
  )

That's rather inconvenient. It's possible to import . "foo" but I don't know if we want to go down that road.

Alternative

@jbenet suggested simplifying the design using marshallable structs for metadata. Perhaps this is the way to go after all.

Still, I have reservations about the "marshallable" approach. We would lose centralized control over the metadata object format. By that I mean we cannot control the field names used when logging metadata, e.g.

  • local_peer
  • peer
  • peer_id
  • remote_peer
  • remote_peer_id
  • partner
  • provider
  • key
  • block
  • hash
  • message_id
  • message

A middle ground?

Analysis requires more work when naming isn't consistent across the board. The design proposed a few days ago separates the act of logging from the eventual output format. I feel this is a really important characteristic to maintain.

Can we balance these concerns and find a middle ground?

@jbenet
Member Author

jbenet commented Nov 4, 2014

That's rather inconvenient. It's possible to import . "foo" but I don't know if we want to go down that road.

I don't like this at all. No more implicit namespace pollution please 👎 (the damn imports without a local varname are bad enough). (I know you also don't like it, just being overly stern on this because /dislike.)


Understood on your concerns. great points.

I've an idea -- take it or leave it:

// we define a Loggable interface that objects can optionally implement
// to control their own logging representation.

type Loggable interface {
  LoggerRep() interface{} // or map[string]interface{}, basically json.
}


// then, a function in our logger package
func loggerRep(obj interface{}) interface{} {

  // right here, we can do a type switch.
  switch o := obj.(type) {

  // our Loggable! yay.
  case Loggable:
    return o.LoggerRep()

  // we can put our special cases here
  // (and later on extract these into another, overridable global catch-all
  // function to make the package not depend on our stuff at all)
  case peer.Peer:
    return extractPeer(o)

  // general marshalling fallback
  case encoding.TextMarshaler:
    txt, _ := o.MarshalText()
    return string(txt)

  // absolute worst case, Stringer.
  case fmt.Stringer:
    return o.String()

  default:
    return obj
  }
}


// then -- only in the last step of actually printing it out, do the Logger
// functions invoke loggerRep
{
  // deep in the bowels of the Logger stuff.
  // once we're sure this is getting printed.
  output := make([]interface{}, len(items))
  for i, item := range items {
    output[i] = loggerRep(item)
  }
}
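A self-contained sketch of how a type might opt into the Loggable idea above. The `Peer` type, its fields, and the `peer_id` field name are assumptions for illustration, not the project's actual types:

```go
package main

import "fmt"

// Loggable lets objects control their own logging representation.
type Loggable interface {
	LoggerRep() interface{}
}

// Peer is a hypothetical stand-in for a real peer type.
type Peer struct {
	ID string
}

// LoggerRep gives the peer centralized control over its log field names,
// keeping metadata keys consistent across packages.
func (p Peer) LoggerRep() interface{} {
	return map[string]interface{}{"peer_id": p.ID}
}

// loggerRep is only invoked once the logger knows the line will be
// printed, so nothing is marshalled when logs are off.
func loggerRep(obj interface{}) interface{} {
	if l, ok := obj.(Loggable); ok {
		return l.LoggerRep()
	}
	return fmt.Sprint(obj) // fallback for everything else
}

func main() {
	fmt.Println(loggerRep(Peer{ID: "QmMe"})) // prints: map[peer_id:QmMe]
	fmt.Println(loggerRep(42))               // prints: 42
}
```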

@jbenet
Member Author

jbenet commented Nov 13, 2014

@maybebtc maybe this can be priority right after we land commands-- @dborzov is making netVis and will want to hook it up to dhtHell soon! 😄

@whyrusleeping
Member

@jbenet, I really like the Loggable interface. That's my vote. It also avoids evaluating things when logs are off.

@btc
Contributor

btc commented Nov 14, 2014

I started on this a few days ago (going the Loggable route). About 30% of the way there. I'll open a PR for the WIP branch.

@btc
Contributor

btc commented Nov 14, 2014

PR: #326

@btc
Contributor

btc commented Nov 24, 2014

Greppable logging functionality was merged into master through PR #326.

On a related note, a discussion around improving the SnR of IPFS logs can be found in issue #292.

@btc btc closed this as completed Nov 24, 2014
@btc btc removed the ready label Nov 24, 2014
@aschmahmann aschmahmann mentioned this issue Aug 23, 2021
62 tasks
@aschmahmann aschmahmann mentioned this issue Dec 1, 2021
80 tasks