bug: metric tracing crashes node #4076

Closed
deepfire opened this issue Jun 22, 2022 · 2 comments
Labels
bug Something isn't working

Comments

deepfire commented Jun 22, 2022

The node has a non-trivial probability of crashing during startup:

cardano-node: ExceptionInLinkedThread "ThreadId 42" The name ""cardano.node.blockContext"" is already taken by a metric.
CallStack (from HasCallStack):
  error, called at ./System/Metrics.hs:214:5 in ekg-core-0.1.1.7-FjoslY1tzknIAl90c73kOZ:System.Metrics

..or something similar -- the metric name varies.

Repro:

  • cardano-node master
  • make default-autostay, which starts six nodes -- and that should trigger it, at least for one of them.
  • If not -- retry until success : -)
deepfire added the bug and tracing labels Jun 22, 2022

deepfire commented

Running a make default-prof cluster sheds further light:

cardano-node: ExceptionInLinkedThread "ThreadId 35" The name ""cardano.node.blockContext"" is already taken by a metric.
CallStack (from HasCallStack):
  error, called at ./System/Metrics.hs:214:5 in ekg-core-0.1.1.7-FjoslY1tzknIAl90c73kOZ:System.Metrics
CallStack (from -prof):
  System.Metrics.registerGauge (System/Metrics.hs:(172,1)-(173,39))
  System.Metrics.createGauge (System/Metrics.hs:(303,1)-(306,16))
  Cardano.Logging.Tracer.EKG.ekgTracer (src/Cardano/Logging/Tracer/EKG.hs:(26,1)-(94,61))
  Cardano.Node.Tracing.API.initTraceDispatcher (src/Cardano/Node/Tracing/API.hs:(61,1)-(120,28))
  Ouroboros.Consensus.Node.Tracers.forgeTracer (src/Ouroboros/Consensus/Node/Tracers.hs:62:5-15)
  Ouroboros.Consensus.BlockchainTime.API.knownSlotWatcher (src/Ouroboros/Consensus/BlockchainTime/API.hs:(64,1)-(77,40))
  Control.Concurrent.Async.async (Control/Concurrent/Async.hs:264:1-35)
  Ouroboros.Consensus.Util.ResourceRegistry.allocateEither (src/Ouroboros/Consensus/Util/ResourceRegistry.hs:(979,1)-(1011,9))
  Ouroboros.Consensus.Util.ResourceRegistry.allocate (src/Ouroboros/Consensus/Util/ResourceRegistry.hs:(968,1)-(969,72))
  Ouroboros.Consensus.Util.ResourceRegistry.forkThread (src/Ouroboros/Consensus/Util/ResourceRegistry.hs:(1149,1)-(1186,35))
  Ouroboros.Consensus.Util.ResourceRegistry.forkLinkedThread (src/Ouroboros/Consensus/Util/ResourceRegistry.hs:(1261,1)-(1268,12))
  Ouroboros.Consensus.Util.STM.forkLinkedWatcher (src/Ouroboros/Consensus/Util/STM.hs:(163,1)-(164,56))
  Ouroboros.Consensus.NodeKernel.initNodeKernel (src/Ouroboros/Consensus/NodeKernel.hs:(146,1)-(178,7))
  Ouroboros.Consensus.Util.ResourceRegistry.withRegistry (src/Ouroboros/Consensus/Util/ResourceRegistry.hs:649:1-54)
  Ouroboros.Consensus.Node.Recovery.runWithCheckedDB (src/Ouroboros/Consensus/Node/Recovery.hs:(97,1)-(134,39))
  Ouroboros.Consensus.Node.DbLock.withLockDB_ (src/Ouroboros/Consensus/Node/DbLock.hs:(57,1)-(80,53))
  Ouroboros.Consensus.Node.DbLock.withLockDB (src/Ouroboros/Consensus/Node/DbLock.hs:(27,1)-(32,19))
  Ouroboros.Consensus.Node.stdWithCheckedDB (src/Ouroboros/Consensus/Node.hs:(504,1)-(517,35))
  Ouroboros.Consensus.Node.stdLowLevelRunNodeArgsIO (src/Ouroboros/Consensus/Node.hs:(752,1)-(851,60))
  Ouroboros.Consensus.Node.runWith (src/Ouroboros/Consensus/Node.hs:(283,1)-(491,35))
  Ouroboros.Consensus.Node.run (src/Ouroboros/Consensus/Node.hs:267:1-73)
  Cardano.Node.Handlers.Shutdown.withShutdownHandling (src/Cardano/Node/Handlers/Shutdown.hs:(114,1)-(130,65))
  Cardano.Node.Run.runNode (src/Cardano/Node/Run.hs:(110,1)-(152,58))
  Main.runRunCommand (app/cardano-node.hs:99:1-40)
  Cardano.Node.Handlers.TopLevel.toplevelExceptionHandler (src/Cardano/Node/Handlers/TopLevel.hs:(67,1)-(114,42))
  Main.main (app/cardano-node.hs:(27,1)-(59,44))

iohk-bors bot added a commit that referenced this issue Jun 28, 2022
4108: Fix for: metric tracing crashes node #4076 r=deepfire a=jutaro

Use an MVar instead of an IOVar for metrics

Co-authored-by: Yupanqui <jnf@arcor.de>
deepfire commented

Fixed in #4108
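
For anyone landing here later: the trace suggests two threads both found the metric name missing from the tracer's cache and both went on to register it, with the second registration hitting the ekg-core error. The sketch below shows one way an MVar closes that window, by holding the lock across both the lookup and the creation. It is an illustration only, not the code from #4108: `Gauge`, `getOrCreate`, `create` and `demo` are hypothetical names, and `create` stands in for `System.Metrics.createGauge`.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, modifyMVar, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Monad (forM)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for an EKG gauge handle.
newtype Gauge = Gauge (IORef Int)

-- Look the metric up and, if it is missing, create and cache it.
-- modifyMVar keeps the MVar taken for the whole lookup-then-create step,
-- so a second thread cannot observe "missing" while the first one is
-- still registering the same name.
getOrCreate :: MVar (Map.Map String Gauge) -> (String -> IO Gauge) -> String -> IO Gauge
getOrCreate registry create name =
  modifyMVar registry $ \known ->
    case Map.lookup name known of
      Just g  -> pure (known, g)
      Nothing -> do
        g <- create name
        pure (Map.insert name g known, g)

-- Eight threads ask for the same metric name at once; the result is the
-- number of times the (hypothetical) registration action actually ran.
demo :: IO Int
demo = do
  registrations <- newIORef (0 :: Int)
  registry <- newMVar Map.empty
  let create _name = do
        atomicModifyIORef' registrations (\n -> (n + 1, ()))
        Gauge <$> newIORef 0
  dones <- forM [1 .. 8 :: Int] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO (getOrCreate registry create "cardano.node.blockContext" >> putMVar done ())
    pure done
  mapM_ takeMVar dones
  readIORef registrations
```

With a plain mutable reference that is read first and written afterwards, the same eight threads could each see an empty cache and each call the registration action, which is the pattern that would trigger "already taken by a metric".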
