Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add integration test that ensures that server process does not start against newer version of accumulo. #4513

Open
wants to merge 5 commits into
base: 2.1
Choose a base branch
from

Conversation

ArbaazKhan1
Copy link
Contributor

At the current moment, checking for errors while each server type starts up fails.

We should be expecting them to throw errors during start up since it should detect that there is a different version of hdfs, but at the moment they are not giving errors.

issue #4268

@dlmarion
Copy link
Contributor

dlmarion commented May 1, 2024

MiniAccumuloClusterImpl.verifyUp (called from start) should throw an IllegalStateException if the expected processes are not running.

@ArbaazKhan1 ArbaazKhan1 marked this pull request as ready for review May 16, 2024 21:23
Comment on lines 71 to 82
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<?> future = executor.submit(() -> {
for (ServerType st : ServerType.values()) {
try {
getCluster().getClusterControl().start(st);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
});
// Check to see if servers fail to start
assertThrows(TimeoutException.class, () -> future.get(20, TimeUnit.SECONDS));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will only check that the first ServerType in ServerType.values() will timeout. We might want to create a future for each ServerType and check that each one times out. That will make this test pretty long but will make sure that none of them start up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked into starting each server individually and it like only couple servers actually time out, the rest seem to start normally. For the check is it worth checking each servers behavior or just look for the ones that timeout?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably want to ensure that all server types don't load with a mismatched data version.
If they are starting normally then that could cause issues.

@DomGarguilo
Copy link
Member

A more descriptive name for this IT might be better. ServerVersionIT is a bit vague. Maybe something that hints at what is being tested here about the servers not starting due to data version.

@ctubbsii ctubbsii modified the milestones: 3.1.0, 2.1.3 Jul 12, 2024
Copy link
Contributor

@ddanielr ddanielr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to cover all server types and make sure that nothing successfully comes up if the data version is mismatched.

@ctubbsii ctubbsii modified the milestones: 2.1.3, 2.1.4 Jul 26, 2024
@ctubbsii
Copy link
Member

ctubbsii commented Aug 7, 2024

Since ITs don't run in GitHub Actions by default, I kicked off a special build at https://github.com/apache/accumulo/actions/runs/10293114308/job/28488804041 to run it.

@ctubbsii
Copy link
Member

ctubbsii commented Aug 7, 2024

Since ITs don't run in GitHub Actions by default, I kicked off a special build at https://github.com/apache/accumulo/actions/runs/10293114308/job/28488804041 to run it.

The IT passes. Nevermind. I'm not sure it worked correctly. Trying again with https://github.com/apache/accumulo/actions/runs/10293516758/job/28489957136

@ctubbsii
Copy link
Member

ctubbsii commented Aug 8, 2024

Okay that newer job definitely passed.

for (ServerType st : ServerType.values()) {

if (!st.getDeclaringClass().isAnnotationPresent(Deprecated.class)) {
Process p = getCluster().exec(Main.class, st.prettyPrint()).getProcess();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be working as intended.
I ran the IT locally and looked at the log files for the Main class.

accumulo.test.DataVersionIT_SharedMiniClusterBase/lib/ext
Invalid argument: Java <main class> 'TServer' was not found.  Please use the wholly qualified package name.

So the test is passing because all processes are failing to start. However, it's not because of the VersionNumber mismatch, it's just due to an invalid class name being used.

Looking at a running TServer process, the process is started with the full class name: org.apache.accumulo.start.Main org.apache.accumulo.tserver.TabletServer


assertTrue(fs.exists(rootPath));

getCluster().stop();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will probably stop the zookeeper nodes as well.
If servers can't talk to ZK they will fail to start.

You probably want to turn ZK back on.

Suggested change
getCluster().stop();
getCluster().stop();
getCluster().getClusterControl().start(ServerType.ZOOKEEPER);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants