Merged
20 changes: 20 additions & 0 deletions pinot-tools/pom.xml
@@ -34,6 +34,8 @@
<properties>
<pinot.root>${basedir}/..</pinot.root>
<aws.version>2.14.28</aws.version>
<scala.version>2.12</scala.version>
Contributor:
Once you start pulling in Scala code, you need to ensure that every dependency that also uses Scala is on the same version, so I think this should go in the top-level pom.xml file. Also note that pinot-kafka and pinot-spark have dependencies on Scala 2.11, which I believe will cause runtime problems if they are on the classpath while the tool is being run with 2.12.
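The usual way to address the reviewer's concern is to define the Scala binary version once in the top-level pom and pin every Scala-suffixed artifact through `dependencyManagement`, so 2.11 and 2.12 artifacts cannot mix on the classpath. A hedged sketch of that pattern (coordinates are illustrative, not taken from Pinot's actual top-level pom):

```xml
<!-- top-level pom.xml: single source of truth for the Scala binary version -->
<properties>
  <scala.version>2.12</scala.version>
  <spark.version>3.2.1</spark.version>
</properties>

<dependencyManagement>
  <dependencies>
    <!-- Every module that references a Scala-suffixed artifact resolves
         against the same suffix, preventing mixed 2.11/2.12 classpaths. -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-launcher_${scala.version}</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```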

Contributor (author):
I have actually excluded the Scala code from the plugin; the plugin only requires the Scala version in its artifact name.

<spark.version>3.2.1</spark.version>
</properties>
<dependencies>
<dependency>
@@ -268,6 +270,24 @@
<artifactId>mockito-core</artifactId>
<scope>test</scope>
</dependency>

      <!--
        This dependency is needed for LaunchSparkDataIngestionJobCommand.
        It contains only a few classes, and its Scala library has been excluded.
        Hence it will not interfere with the spark-core classes present in the
        runtime environment, and the environment's Spark version will be used
        to actually execute the Spark job.
      -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-launcher_${scala.version}</artifactId>
<version>${spark.version}</version>
<exclusions>
Contributor:
I do have some minor concerns about adding dependencies on Spark here. But since it's only pinot-tools, it's not so important.

Contributor (author):
This dependency is pretty basic: it only contains a couple of classes plus one transitive dependency, so including it isn't a major concern. Earlier I had included spark-core, which is a heavyweight dependency and could have caused a lot of issues.
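The thread's key point is that spark-launcher only assembles and forks a `spark-submit` process from the runtime environment rather than running Spark in-process, so the Spark/Scala version compiled against never executes the job. A minimal self-contained sketch of that idea in plain Java (paths and class names are illustrative assumptions, not Pinot's actual wiring):

```java
import java.util.ArrayList;
import java.util.List;

public class SparkSubmitCommandBuilder {
    // Build the argv a launcher would hand to ProcessBuilder. The Spark
    // version that executes the job is whatever lives under sparkHome at
    // runtime, not the version this code was compiled against.
    public static List<String> build(String sparkHome, String mainClass, String appJar) {
        List<String> cmd = new ArrayList<>();
        cmd.add(sparkHome + "/bin/spark-submit");
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add(appJar);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("/opt/spark", "org.example.IngestionJob", "/tmp/job.jar");
        System.out.println(String.join(" ", cmd));
        // new ProcessBuilder(cmd).inheritIO().start() would fork the real job.
    }
}
```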

<exclusion>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
<build>
<plugins>
@@ -40,6 +40,7 @@
import org.apache.pinot.tools.admin.command.ImportDataCommand;
import org.apache.pinot.tools.admin.command.JsonToPinotSchema;
import org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand;
import org.apache.pinot.tools.admin.command.LaunchSparkDataIngestionJobCommand;
import org.apache.pinot.tools.admin.command.MoveReplicaGroup;
import org.apache.pinot.tools.admin.command.OfflineSegmentIntervalCheckerCommand;
import org.apache.pinot.tools.admin.command.OperateClusterConfigCommand;
@@ -94,6 +95,7 @@ public class PinotAdministrator
SUBCOMMAND_MAP.put("OperateClusterConfig", new OperateClusterConfigCommand());
SUBCOMMAND_MAP.put("GenerateData", new GenerateDataCommand());
SUBCOMMAND_MAP.put("LaunchDataIngestionJob", new LaunchDataIngestionJobCommand());
SUBCOMMAND_MAP.put("LaunchSparkDataIngestionJob", new LaunchSparkDataIngestionJobCommand());
SUBCOMMAND_MAP.put("CreateSegment", new CreateSegmentCommand());
SUBCOMMAND_MAP.put("ImportData", new ImportDataCommand());
SUBCOMMAND_MAP.put("StartZookeeper", new StartZookeeperCommand());
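The PinotAdministrator change registers the new command in a name-to-command map, the same pattern used by every other subcommand. A minimal self-contained sketch of that registration pattern (the `Command` interface here is a simplified stand-in, not Pinot's actual one):

```java
import java.util.HashMap;
import java.util.Map;

public class MiniAdmin {
    // Simplified stand-in for Pinot's Command interface (illustrative only).
    interface Command {
        boolean execute();
    }

    static final Map<String, Command> SUBCOMMAND_MAP = new HashMap<>();

    static {
        // Each subcommand is registered once under its CLI name, mirroring
        // how LaunchSparkDataIngestionJob is added in the diff above.
        SUBCOMMAND_MAP.put("LaunchDataIngestionJob", () -> true);
        SUBCOMMAND_MAP.put("LaunchSparkDataIngestionJob", () -> true);
    }

    // Look up a subcommand by name and run it.
    public static boolean run(String name) {
        Command cmd = SUBCOMMAND_MAP.get(name);
        if (cmd == null) {
            throw new IllegalArgumentException("Unknown subcommand: " + name);
        }
        return cmd.execute();
    }

    public static void main(String[] args) {
        System.out.println(run("LaunchSparkDataIngestionJob"));
    }
}
```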