Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(jwt): implement JWT flow #9

Merged
merged 5 commits into from
Dec 6, 2022

Conversation

andrewazores
Copy link
Member

@andrewazores andrewazores commented Aug 25, 2022

@andrewazores andrewazores added the feat New feature or request label Aug 25, 2022
@maxcao13
Copy link
Member

How can we test the agent itself?

@andrewazores
Copy link
Member Author

As of this PR it can be a little convoluted - the way I was doing it was mvn install and building a customized -reports image including that, and then running that image as part of the main smoketest.sh setup.

#12 includes a run.sh file that runs the agent as a standalone application so that you can run smoketest.sh in one terminal and the agent's run.sh in another terminal to manually verify that it properly registers.

I will update this PR soon to pull that run.sh file down into here.

@andrewazores
Copy link
Member Author

This can now be manually tested just using cryostat's smoketest.sh as of latest main and the run.sh in this PR afterward.

@github-actions
Copy link

This PR/issue depends on:

  • cryostatio/cryostat#1059
  • cryostatio/cryostat#1061
    By Dependent Issues (🤖). Happy coding!

@maxcao13
Copy link
Member

maxcao13 commented Sep 20, 2022

Running on cryostat main with sh smoketest.sh and this pr sh. run.sh with default config results in lots of the agent waiting for a response from posting to /api/v2.2/discovery. and not getting a response within the 1 sec timeout period.

Sep 20, 2022 11:39:22 AM io.cryostat.agent.Registration lambda$tryRegister$4
INFO: Registration retry period: 5000(ms)
Sep 20, 2022 11:39:28 AM io.cryostat.agent.Registration lambda$tryRegister$4
SEVERE: Registration failure
io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 1000ms has been exceeded while executing POST /api/v2.2/discovery for server localhost:8181

If I run smoketest while the agent is running, I assume due to timing, the agent actually does register itself once but after trying to update, it gets timed out again and also doesn't deregister itself.

Sep 20, 2022 11:39:03 AM io.cryostat.agent.Registration lambda$tryRegister$4
INFO: Registration retry period: 5000(ms)
Sep 20, 2022 11:39:08 AM io.cryostat.agent.Registration lambda$tryRegister$3
INFO: Registered as 37152548-14ed-40e2-b744-e076413a4781
Sep 20, 2022 11:39:08 AM io.cryostat.agent.Registration tryUpdate
INFO: publishing self as service:jmx:rmi:///jndi/rmi://fedora:10203/jmxrmi
Sep 20, 2022 11:39:08 AM io.cryostat.agent.WebServer lambda$start$1
INFO: HTTP Server started on 0.0.0.0:9977
Sep 20, 2022 11:39:09 AM io.cryostat.agent.Registration lambda$tryUpdate$7
SEVERE: Update failure
io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 1000ms has been exceeded while executing POST /api/v2.2/discovery/37152548-14ed-40e2-b744-e076413a4781?token=eyJjdHkiOiJKV1QiLCJlbmMiOiJBMjU2R0NNIiwiYWxnIjoiZGlyIn0..6cOQXBD3srfSytui.zhHCr9uPEKq-xrGUpPXcS5ubQ_7NqOj9_gValp7uD6zV2s1ucQxb0okltJOglwaAABQ0EGgdsrVz6u1vF2oFmxFG0xKSjZDQil7rfRZlEikXcF7yRg_c951uZikOMGracyrKX1ZR9uBrTRBK4Ik_Xo66TcG02nAKxRM908srf8eBa3RNPqihnroLVB2fWjg9QQSIxs8SLldIld_4Uo-NP3gH4iFF8PHB1oPRnKHmrymbwv4-f-m90SOAe9W6YYGkCOUjIlfXovky2_AHd3cPvO--B86ysd9wiyyJHswsSKPx5lo5CG6lorMuRFwuXysmC4YI6ZflddXcCvzrfgSBFe12xb8fv2TE2xGzC_deCuQxrtMZIzqT5KvsGNRkE0ACMte1K-7SKHCXFyfoncFF3KppSHe0_n2ghLTk4CZYxShS-pp1vbIkDtMbE6Lz5gH3vIh2VJgR53STbmFUl8J99WKoYSu1eqUCAGl5Wx-tiLOv53m-17IhwY3H9Fel7RyIB7WMO8hMONVu23cQICAaMB7ULP_X5sYy6yg_WqQusyTz4jUQkfwI-DYPLUBYYJMpTrAojWLi6NxVw993nICaIJIpZULDiB7FYl3Z6hA5dLg2EnHDWT1ciCU77R8DwDqBxVqewONCsi6d4dtBurxgccE.B7d6beO_uX0mzdfJLUsXkQ for server localhost:8181

Sep 20, 2022 11:39:10 AM io.cryostat.agent.Registration lambda$deregister$9
WARNING: Failed to deregister as Cryostat discovery plugin [37152548-14ed-40e2-b744-e076413a4781]
Sep 20, 2022 11:39:10 AM io.cryostat.agent.WebServer stop
INFO: HTTP Server stopping
Sep 20, 2022 11:39:10 AM io.cryostat.agent.WebServer lambda$stop$2
INFO: HTTP Server stopped

@andrewazores
Copy link
Member Author

Huh. I can't reproduce that on my end. How did you build the agent? mvn clean package?

It's weird that the registration succeeds in your second case, but then updates and even deregistration fail afterward. I'm wondering if something else is going on on the Cryostat server side that is causing it to stop responding to requests, or not respond in a timely manner. Could be a re-appearance of https://github.com/cryostatio/cryostat/issues/929 or something similar.

@maxcao13
Copy link
Member

maxcao13 commented Sep 20, 2022

Huh. I can't reproduce that on my end. How did you build the agent? mvn clean package?

It's weird that the registration succeeds in your second case, but then updates and even deregistration fail afterward. I'm wondering if something else is going on on the Cryostat server side that is causing it to stop responding to requests, or not respond in a timely manner. Could be a re-appearance of cryostatio/cryostat#929 or something similar.

Hmm that's weird.

Yeah my exact steps are in 1 terminal

# -agent
$ mvn clean package
$ sh run.sh

other

# in cryostat -main
$ mvn clean package && sh smoketest.sh

Interestingly it seems like my cryostat seems to hang on a discovery endpoint POST for 30 seconds sometimes. I'm not sure why, because I've packaged cryostat...

Sep 20, 2022 6:08:03 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:44292): POST /api/v2.2/discovery 200 1000ms
Sep 20, 2022 6:08:09 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:48000): POST /api/v2.2/discovery 200 1001ms
Sep 20, 2022 6:08:15 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:48014): POST /api/v2.2/discovery 200 1001ms
Sep 20, 2022 6:08:21 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:57866): POST /api/v2.2/discovery 200 1001ms
Sep 20, 2022 6:08:26 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:42912): POST /api/v2.2/discovery 200 30030ms
Sep 20, 2022 6:08:27 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:57870): POST /api/v2.2/discovery 200 1001ms
Sep 20, 2022 6:08:33 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:36610): POST /api/v2.2/discovery 200 1001ms
Sep 20, 2022 6:08:39 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:53428): POST /api/v2.2/discovery 200 1001ms
Sep 20, 2022 6:08:45 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:53430): POST /api/v2.2/discovery 200 1001ms
Sep 20, 2022 6:08:46 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:57864): POST /api/v2.2/discovery 200 30028ms

@andrewazores
Copy link
Member Author

andrewazores commented Sep 20, 2022

Interesting. If that request is hanging then it shouldn't be blocking other things, so that does sound like https://github.com/cryostatio/cryostat/issues/929 again. Can you try https :8181/health/liveness while this is happening and see if it responds to that probe?

Timeouts occurring during registration seem like it's probably related to the plugin callback. Can you try changing the callback in run.sh to different values and see if that changes the behaviour at all? ex. try changing cryostat to localhost, or try making it an invalid URL, etc., just to check if that's really the problem. I wonder if the 30s timeout is a DNS resolution timeout or something like that from Cryostat trying to verify that the plugin is reachable at that callback, and the 1s timeout is the plugin timing out on its registration request because Cryostat is busy trying to resolve DNS.

@maxcao13
Copy link
Member

maxcao13 commented Sep 20, 2022

Interesting. If that request is hanging then it shouldn't be blocking other things, so that does sound like cryostatio/cryostat#929 again. Can you try https :8181/health/liveness while this is happening and see if it responds to that probe?

Timeouts occurring during registration seem like it's probably related to the plugin callback. Can you try changing the callback in run.sh to different values and see if that changes the behaviour at all? ex. try changing cryostat to localhost, or try making it an invalid URL, etc., just to check if that's really the problem. I wonder if the 30s timeout is a DNS resolution timeout or something like that from Cryostat trying to verify that the plugin is reachable at that callback, and the 1s timeout is the plugin timing out on its registration request because Cryostat is busy trying to resolve DNS.

I don't believe the problem is with DNS or anything like that as I've tried changing host values.

I think the problem is that when service:jmx:rmi:///jndi/rmi://fedora:10203/jmxrmi appears, RecordingMetadataManager tries to connect to the service to get the jvmId but never does but also never errors out so requests get "stuck".

Sep 20, 2022 6:39:07 PM io.cryostat.core.log.Logger info
INFO: Connection for service:jmx:rmi:///jndi/rmi://fedora:10203/jmxrmi closed
Sep 20, 2022 6:39:07 PM io.cryostat.core.log.Logger error
SEVERE: Exception thrown
org.openjdk.jmc.rjmx.ConnectionException caused by java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: fedora; nested exception is: 
	java.net.ConnectException: Connection refused]
	at org.openjdk.jmc.rjmx.internal.RJMXConnection.connect(RJMXConnection.java:441)
	at io.cryostat.core.net.JFRConnection.attemptConnect(JFRConnection.java:300)
	at io.cryostat.core.net.JFRConnection.connect(JFRConnection.java:257)
	at io.cryostat.core.net.JFRConnection.getJvmId(JFRConnection.java:168)
	at io.cryostat.recordings.RecordingMetadataManager.lambda$computeJvmId$25(RecordingMetadataManager.java:839)
	at io.cryostat.net.TargetConnectionManager.executeConnectedTask(TargetConnectionManager.java:148)
	at io.cryostat.net.TargetConnectionManager.executeConnectedTask(TargetConnectionManager.java:126)
	at io.cryostat.recordings.RecordingMetadataManager.computeJvmId(RecordingMetadataManager.java:836)
	at io.cryostat.recordings.RecordingMetadataManager.lambda$getJvmId$26(RecordingMetadataManager.java:857)
	at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
	at io.cryostat.recordings.RecordingMetadataManager.getJvmId(RecordingMetadataManager.java:854)
	at io.cryostat.recordings.RecordingMetadataManager.accept(RecordingMetadataManager.java:390)
	at io.cryostat.recordings.RecordingMetadataManager.accept(RecordingMetadataManager.java:88)
	at io.cryostat.discovery.AbstractPlatformClientVerticle.lambda$notifyAsyncTargetDiscovery$0(AbstractPlatformClientVerticle.java:71)

When I tried doing requests during this time like
$ http :8181/health/liveness, I would get a response like

http: error: ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) while doing a GET request to URL: http://localhost:8181/health/liveness```

Then after a few minutes, the requests finally timeout and so I get a bunch of responses at the same time:

Sep 20, 2022 6:41:56 PM io.cryostat.core.log.Logger info
INFO: GET https://cryostat:8181/ status 200: OK
Hibernate: 
    /* insert io.cryostat.discovery.PluginInfo
        */ insert 
        into
            PluginInfo
            (callback, realm, subtree, id) 
        values
            (?, ?, ?, ?)
Sep 20, 2022 6:41:56 PM org.hibernate.engine.jdbc.spi.SqlExceptionHelper logExceptions
WARN: SQL Error: 23505, SQLState: 23505
Sep 20, 2022 6:41:56 PM org.hibernate.engine.jdbc.spi.SqlExceptionHelper logExceptions
ERROR: Unique index or primary key violation: "PUBLIC.UK_LFC8YJ7NOT9MTD4XYCNE72QEC_INDEX_5 ON PUBLIC.PLUGININFO(CALLBACK NULLS FIRST) VALUES ( /* 4 */ 'https://cryostat:8181/' )"; SQL statement:
/* insert io.cryostat.discovery.PluginInfo */ insert into PluginInfo (cal [23505-214]
Sep 20, 2022 6:41:56 PM org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl release
INFO: HHH000010: On release of batch it still contained JDBC statements
Sep 20, 2022 6:41:56 PM io.cryostat.core.log.Logger error
SEVERE: Exception thrown
javax.persistence.RollbackException: Error while committing the transaction
	at org.hibernate.internal.ExceptionConverterImpl.convertCommitException(ExceptionConverterImpl.java:81)
	at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:104)
	at io.cryostat.storage.AbstractDao.save(AbstractDao.java:71)
	at io.cryostat.discovery.PluginInfoDao.save(PluginInfoDao.java:73)
	at io.cryostat.discovery.DiscoveryStorage.register(DiscoveryStorage.java:195)
	at io.cryostat.net.web.http.api.v2.DiscoveryRegistrationHandler.handle(DiscoveryRegistrationHandler.java:162)
	at io.cryostat.net.web.http.api.v2.AbstractV2RequestHandler.handle(AbstractV2RequestHandler.java:123)
	at io.cryostat.net.web.http.api.v2.AbstractV2RequestHandler.handle(AbstractV2RequestHandler.java:75)
	at io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
	at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:159)
	at io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:100)
	at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:157)
	at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute statement
	at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
	at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:181)
	at org.hibernate.internal.ExceptionConverterImpl.convertCommitException(ExceptionConverterImpl.java:65)
	... 16 more
Caused by: org.hibernate.exception.ConstraintViolationException: could not execute statement
	at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:59)
	at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:37)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:99)
	at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:200)
	at org.hibernate.engine.jdbc.batch.internal.NonBatchingBatch.addToBatch(NonBatchingBatch.java:46)
	at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3375)
	at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3908)
	at org.hibernate.action.internal.EntityInsertAction.execute(EntityInsertAction.java:107)
	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:604)
	at org.hibernate.engine.spi.ActionQueue.lambda$executeActions$1(ActionQueue.java:478)
	at java.base/java.util.LinkedHashMap.forEach(LinkedHashMap.java:721)
	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:475)
	at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:344)
	at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:40)
	at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:107)
	at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1407)
	at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:489)
	at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:3290)
	at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:2425)
	at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:449)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.beforeCompletionCallback(JdbcResourceLocalTransactionCoordinatorImpl.java:183)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.access$300(JdbcResourceLocalTransactionCoordinatorImpl.java:40)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.commit(JdbcResourceLocalTransactionCoordinatorImpl.java:281)
	at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:101)
	... 15 more
Caused by: org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Unique index or primary key violation: "PUBLIC.UK_LFC8YJ7NOT9MTD4XYCNE72QEC_INDEX_5 ON PUBLIC.PLUGININFO(CALLBACK NULLS FIRST) VALUES ( /* 4 */ 'https://cryostat:8181/' )"; SQL statement:
/* insert io.cryostat.discovery.PluginInfo */ insert into PluginInfo (cal [23505-214]
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:508)
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:477)
	at org.h2.message.DbException.get(DbException.java:223)
	at org.h2.message.DbException.get(DbException.java:199)
	at org.h2.index.Index.getDuplicateKeyException(Index.java:525)
	at org.h2.mvstore.db.MVSecondaryIndex.checkUnique(MVSecondaryIndex.java:223)
	at org.h2.mvstore.db.MVSecondaryIndex.add(MVSecondaryIndex.java:184)
	at org.h2.mvstore.db.MVTable.addRow(MVTable.java:519)
	at org.h2.command.dml.Insert.insertRows(Insert.java:174)
	at org.h2.command.dml.Insert.update(Insert.java:135)
	at org.h2.command.dml.DataChangeStatement.update(DataChangeStatement.java:74)
	at org.h2.command.CommandContainer.update(CommandContainer.java:169)
	at org.h2.command.Command.executeUpdate(Command.java:252)
	at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:209)
	at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:169)
	at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:197)
	... 35 more

Sep 20, 2022 6:41:56 PM io.cryostat.core.log.Logger warn
WARNING: HTTP 400: Failed to register new provider for "cryostat-agent" @ "https://cryostat:8181/": Error while committing the transaction
io.vertx.ext.web.handler.HttpException: Bad Request
Caused by: io.cryostat.net.web.http.api.v2.ApiException: Failed to register new provider for "cryostat-agent" @ "https://cryostat:8181/": Error while committing the transaction
	at io.cryostat.net.web.http.api.v2.DiscoveryRegistrationHandler.handle(DiscoveryRegistrationHandler.java:164)
	at io.cryostat.net.web.http.api.v2.AbstractV2RequestHandler.handle(AbstractV2RequestHandler.java:123)
	at io.cryostat.net.web.http.api.v2.AbstractV2RequestHandler.handle(AbstractV2RequestHandler.java:75)
	at io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
	at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:159)
	at io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:100)
	at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:157)
	at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.cryostat.discovery.RegistrationException: Failed to register new provider for "cryostat-agent" @ "https://cryostat:8181/": Error while committing the transaction
	at io.cryostat.discovery.DiscoveryStorage.register(DiscoveryStorage.java:208)
	at io.cryostat.net.web.http.api.v2.DiscoveryRegistrationHandler.handle(DiscoveryRegistrationHandler.java:162)
	... 11 more
Caused by: javax.persistence.RollbackException: Error while committing the transaction
	at org.hibernate.internal.ExceptionConverterImpl.convertCommitException(ExceptionConverterImpl.java:81)
	at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:104)
	at io.cryostat.storage.AbstractDao.save(AbstractDao.java:71)
	at io.cryostat.discovery.PluginInfoDao.save(PluginInfoDao.java:73)
	at io.cryostat.discovery.DiscoveryStorage.register(DiscoveryStorage.java:195)
	... 12 more
Caused by: javax.persistence.PersistenceException: org.hibernate.exception.ConstraintViolationException: could not execute statement
	at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
	at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:181)
	at org.hibernate.internal.ExceptionConverterImpl.convertCommitException(ExceptionConverterImpl.java:65)
	... 16 more
Caused by: org.hibernate.exception.ConstraintViolationException: could not execute statement
	at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:59)
	at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:37)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113)
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:99)
	at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:200)
	at org.hibernate.engine.jdbc.batch.internal.NonBatchingBatch.addToBatch(NonBatchingBatch.java:46)
	at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3375)
	at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3908)
	at org.hibernate.action.internal.EntityInsertAction.execute(EntityInsertAction.java:107)
	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:604)
	at org.hibernate.engine.spi.ActionQueue.lambda$executeActions$1(ActionQueue.java:478)
	at java.base/java.util.LinkedHashMap.forEach(LinkedHashMap.java:721)
	at org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:475)
	at org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:344)
	at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:40)
	at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:107)
	at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1407)
	at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:489)
	at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:3290)
	at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:2425)
	at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:449)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.beforeCompletionCallback(JdbcResourceLocalTransactionCoordinatorImpl.java:183)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.access$300(JdbcResourceLocalTransactionCoordinatorImpl.java:40)
	at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.commit(JdbcResourceLocalTransactionCoordinatorImpl.java:281)
	at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:101)
	... 15 more
Caused by: org.h2.jdbc.JdbcSQLIntegrityConstraintViolationException: Unique index or primary key violation: "PUBLIC.UK_LFC8YJ7NOT9MTD4XYCNE72QEC_INDEX_5 ON PUBLIC.PLUGININFO(CALLBACK NULLS FIRST) VALUES ( /* 4 */ 'https://cryostat:8181/' )"; SQL statement:
/* insert io.cryostat.discovery.PluginInfo */ insert into PluginInfo (cal [23505-214]
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:508)
	at org.h2.message.DbException.getJdbcSQLException(DbException.java:477)
	at org.h2.message.DbException.get(DbException.java:223)
	at org.h2.message.DbException.get(DbException.java:199)
	at org.h2.index.Index.getDuplicateKeyException(Index.java:525)
	at org.h2.mvstore.db.MVSecondaryIndex.checkUnique(MVSecondaryIndex.java:223)
	at org.h2.mvstore.db.MVSecondaryIndex.add(MVSecondaryIndex.java:184)
	at org.h2.mvstore.db.MVTable.addRow(MVTable.java:519)
	at org.h2.command.dml.Insert.insertRows(Insert.java:174)
	at org.h2.command.dml.Insert.update(Insert.java:135)
	at org.h2.command.dml.DataChangeStatement.update(DataChangeStatement.java:74)
	at org.h2.command.CommandContainer.update(CommandContainer.java:169)
	at org.h2.command.Command.executeUpdate(Command.java:252)
	at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:209)
	at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:169)
	at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.executeUpdate(ResultSetReturnImpl.java:197)
	... 35 more

This was happening in other places, so I think I "solved" this in the the archives PR #1047 by just adding a timeout to the future which gets the jvmId.

@tthvo
Copy link
Member

tthvo commented Sep 20, 2022

Right, this behaviour happened exactly the same to me too :D

@maxcao13
Copy link
Member

maxcao13 commented Sep 20, 2022

Also interestingly though when I test on the branch I've been working on, the plugin will register and deregister every time.
cryostat

Sep 20, 2022 7:10:26 PM io.cryostat.core.log.Logger info
INFO: (10.0.2.100:52856): DELETE /api/v2.2/discovery/ecb35849-4454-49c5-991b-aef004d1ac1b 200 1000ms
Sep 20, 2022 7:10:26 PM io.cryostat.core.log.Logger error
SEVERE: Exception thrown
java.util.concurrent.TimeoutException
        at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
        at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
        at io.cryostat.recordings.JvmIdHelper.computeJvmId(JvmIdHelper.java:98)
        at io.cryostat.recordings.JvmIdHelper.lambda$getJvmId$2(JvmIdHelper.java:116)
        at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
        at io.cryostat.recordings.JvmIdHelper.getJvmId(JvmIdHelper.java:113)
        at io.cryostat.recordings.RecordingMetadataManager.accept(RecordingMetadataManager.java:401)
        at io.cryostat.recordings.RecordingMetadataManager.accept(RecordingMetadataManager.java:89)
        at io.cryostat.discovery.AbstractPlatformClientVerticle.lambda$notifyAsyncTargetDiscovery$0(AbstractPlatformClientVerticle.java:71)
        at java.base/java.lang.Iterable.forEach(Iterable.java:75)
        at io.cryostat.discovery.AbstractPlatformClientVerticle.notifyAsyncTargetDiscovery(AbstractPlatformClientVerticle.java:71)
        at io.cryostat.discovery.DiscoveryStorage.lambda$deregister$13(DiscoveryStorage.java:251)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1707)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
        at io.cryostat.discovery.DiscoveryStorage.deregister(DiscoveryStorage.java:251)
        at io.cryostat.net.web.http.api.v2.DiscoveryDeregistrationHandler.handleWithValidJwt(DiscoveryDeregistrationHandler.java:119)
        at io.cryostat.net.web.http.api.v2.AbstractDiscoveryJwtConsumingHandler.handle(AbstractDiscoveryJwtConsumingHandler.java:113)
        at io.cryostat.net.web.http.api.v2.AbstractDiscoveryJwtConsumingHandler.handle(AbstractDiscoveryJwtConsumingHandler.java:77)
        at io.vertx.ext.web.impl.BlockingHandlerDecorator.lambda$handle$0(BlockingHandlerDecorator.java:48)
        at io.vertx.core.impl.ContextImpl.lambda$null$0(ContextImpl.java:159)
        at io.vertx.core.impl.AbstractContext.dispatch(AbstractContext.java:100)
        at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:157)
        at io.vertx.core.impl.TaskQueue.run(TaskQueue.java:76)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:833)

Sep 20, 2022 7:10:26 PM io.cryostat.core.log.Logger error
SEVERE: Could not compute jvmId on FOUND target service:jmx:rmi:///jndi/rmi://fedora:10203/jmxrmi, msg: {}

agent

INFO: Registered as a6c984bd-c84b-4ca2-a73f-b7663c1cfa55
Sep 20, 2022 3:07:00 PM io.cryostat.agent.Registration tryUpdate
INFO: publishing self as service:jmx:rmi:///jndi/rmi://fedora:10203/jmxrmi
Sep 20, 2022 3:07:01 PM io.cryostat.agent.Registration lambda$tryUpdate$7
SEVERE: Update failure
io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 1000ms has been exceeded while executing POST /api/v2.2/discovery/a6c984bd-c84b-4ca2-a73f-b7663c1cfa55?token=eyJjdHkiOiJKV1QiLCJlbmMiOiJBMjU2R0NNIiwiYWxnIjoiZGlyIn0..hrFMW7EV2PNLDPVM.Rfd_wAc0fgWb1M09sf0UdB96BaYmCKKpZ2ZWm_kWfeewOAgbCmON799qaMAfsHUtTViMvSsgYacm2G-pP2qgiDSYX-2V8nToJuk3Ul5J7NfpPiQCRXbj4_17nSJOI7UGPrVP5nB1Qpshg2EbenPByvTXWVQH5B2tmAOE-XoYl-CwsiWwGFDWLgCzDursA-eOB4oMMFCeoTLfxjZ2TRZuWYbXVVOYgYTjhz4y__B0z_qCOGHBFnRxjme0VZWSlPGUcmxyRtreVFmTYfUn5w6uth50TMjonoOql6uMzaUfzDJozem0BywLln-1sNFMf8VTHhOibHNs7oW3ZoTCdRQkY1A0aj-k0TGxoFg9ya4UHTFdixFA7XoOPyhPG2REHy88WYtSEYef0a5oavPLot9bniF-I-VAPjpcBy5e9V2vq_79boNBd5pcmW0tEpct-zdL6XYs_aBj5C_RCdaoLYI4KvladmcsEcKj6hx1-WupH8Vb18E2P8YFdBXqhzaDWdUl368zqllC5itAxhvkLnig-LXf_bOSNdIL7flgwJqV4zwRK6A7DrSskXkT7Ep36iBtp5sSmk35tPJSvlas4HoQRg.0D5azplXsCM8wh3y-K2vww for server localhost:8181

Sep 20, 2022 3:07:02 PM io.cryostat.agent.Registration lambda$deregister$9
WARNING: Failed to deregister as Cryostat discovery plugin [a6c984bd-c84b-4ca2-a73f-b7663c1cfa55]
Sep 20, 2022 3:07:07 PM io.cryostat.agent.Registration lambda$tryRegister$3
INFO: Registered as d2a1c51a-aac6-4bf7-a85c-eddd1fa956e2
Sep 20, 2022 3:07:07 PM io.cryostat.agent.Registration tryUpdate
INFO: publishing self as service:jmx:rmi:///jndi/rmi://fedora:10203/jmxrmi
Sep 20, 2022 3:07:08 PM io.cryostat.agent.Registration lambda$tryUpdate$7
SEVERE: Update failure
io.vertx.core.http.impl.NoStackTraceTimeoutException: The timeout period of 1000ms has been exceeded while executing POST /api/v2.2/discovery/d2a1c51a-aac6-4bf7-a85c-eddd1fa956e2?token=eyJjdHkiOiJKV1QiLCJlbmMiOiJBMjU2R0NNIiwiYWxnIjoiZGlyIn0..Y31dOYxKaW4hL4-a.30mDIlGf0K8ULyZuVC8DtHapSlNsiWAQJ5tQNhapaT1gaae4WuooXChwpcg-OH-vxsdAeoe0xhe11-35q4tM7sCWXJLgjCBRxboqCEbbkDc6iYHEJGGQNMfYE4eccuE6ZjIfPhJCoIAAR7kjvdgni-MBIwtiGl3RauW2sFnIYpWGEfPil7tO1h4T0XN70N0f5jVfKOI6SV-7dLwrsV65X1dDJ1KfdXP2Idyk-Ne5k67d1RU-5nQMXT_GuYoVBDuPlskCRI5lWsIy5CItpE7FQN9zLMQQDH_nHokJ68w8XhzPldmTaZFqaBRS5vVRztz-83KbMiUarkxv3Tv5bzoZ2D-vVa_XqTllYN8gCHUJ2OS9w1rii3I34TC-7OKHyrTM8PdZXS6b068Rm2YZnirHY2Ycw9uN8x2zRg84Kw7B4Y3Rwm7MPoM26nqb9IZQM4rJ0Q0mTJRvl7XtbWnZmPlx17nfv27Ly_o6ej7sxBp8S0WVzWZgM_xo_lCIblem0-lf8C0HbN9AIymNrD9q31V0c55zUgyoHkOSMkpkW2gsB4C20g3hejvgeTtc_86mP7uw05WQAk-a7Fm046WsAdETsg.Fyjy2ydfH1ZfS8w7KPEw1w for server localhost:8181

and a jvmId cannot be computed on the FOUND agent plugin jvm every time as well. Not sure what's going on there but I will look at it after? What do you think should be the way to go about this?

@andrewazores
Copy link
Member Author

andrewazores commented Sep 20, 2022

The -agent run.sh is a hack, the callback URL that it publishes is actually a URL pointing back to Cryostat itself within the smoketest.sh pod. Because of that setup, Cryostat cannot actually connect to the agent over JMX either. It's like that because the agent program is running as its own standalone JVM, not packaged within a container, and not running within the Cryostat pod. It's just a Java process running directly on your machine. The Cryostat pod publishes some of its ports externally, so the agent can talk into the pod to register itself, but Cryostat can't reach back out of the pod to make the callback or open JMX connections.

A proper setup would be like https://github.com/cryostatio/cryostat/pull/1048 , where the agent is attached to a JVM that runs within the pod. This way the Cryostat container and the agent container can actually talk to each other, the callback is "real", and JMX connections can be opened to do things like compute JVM IDs or whatever.

@andrewazores
Copy link
Member Author

I'll see if I can prepare a branch of github.com/andrewazores/vertx-fib-demo that includes the -agent so that we can use that for testing the agent in a more realistic way.

It's still good that we can catch these kinds of timeout problems or failures that can occur from well-formed but invalid data input. More good bugs to tackle.

It probably shouldn't take so long for a JMX connection attempt to fail, for example.

@andrewazores
Copy link
Member Author

Getting back on track with this - https://github.com/andrewazores/quarkus-test has been set up for a little while to build with the -agent JAR included and configured. The typical smoketest.sh setup can be modified as needed to update the image tag to match whatever the locally-built version is.

mvn install in this repo should install the -agent JAR artifact to the ~/.m2 repository, so subsequent builds of quarkus-test should pick up and use that artifact (assuming the built -agent version matches the version in quarkus-test pom.xml).

It's still a little convoluted and takes a few manual steps, but it's fairly doable. It could probably be smoothed out by adding the -agent as a Maven submodule to the quarkus-test build so that changes to the -agent can be made directly within the quarkus-test repo and then automatically built and pulled in to new container images. I'll look into a setup like that or similar soon.

@andrewazores
Copy link
Member Author

Somewhat more concrete steps:

  1. cd cryostat-agent
  2. Make any source changes and update version in pom.xml, then mvn install
  3. cd quarkus-test
  4. Update pom.xml to depend on new agent version, then ./mvnw -U package && podman build -t quay.io/andrewazores/quarkus-test:latest -f src/main/docker/Dockerfile.jvm .
  5. cd cryostat
  6. Edit smoketest.sh to run quarkus-test:latest instead of 0.0.10 or whatever version is listed
  7. sh smoketest.sh

Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, and testing works well as well.

There is a bug now however with metadata labels. If you have an Active Recording with labels in an agent equipped discovered target, and the target is pinged to re-register, the associated active metadata files are deleted.

I investigated a bit and I think it's because when the DiscoveryPostHandler calls DiscoveryStorage#update, it calls JvmdIdHelper#resolveId on all target children of the plugin which always puts into the ids LoadingCache. I think when this happens, the RecordingMetadataManager is hooked in by the IdEvent.INVALIDATED event which is emitted everytime an id is "removed" from the cache (however this is a bug, because the jvmId will actually be the same, so it is replacing itself in the cache with itself), and event is emitted, the metadata manager calls #removeLostTargetMetadata which deletes all ActiveRecording metadata associated with the "lost" target.

I believe a simple if check inside the

            future.thenAccept(
                    id -> {
                        if (this.ids.synchronous().get(sr.getServiceUri().toString()).equals(id)) {
                            return;
                        }
                        this.ids.synchronous().put(sr.getServiceUri().toString(), id);
}); 

would probably fix this.

run.sh Outdated Show resolved Hide resolved
@andrewazores
Copy link
Member Author

There is a bug now however with metadata labels...

@maxcao13 your explanation makes sense. Could you file a backend issue for it?

Copy link
Member

@maxcao13 maxcao13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me then, works well!

@andrewazores andrewazores merged commit 6ff6c43 into cryostatio:main Dec 6, 2022
@andrewazores andrewazores deleted the jwt-flow branch December 6, 2022 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat New feature or request
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

[Task] Implement JWT flow [Task] Implement a small example/template microservice (ex. for JDP)
3 participants