This is the version of Kafka running at LinkedIn.
Kafka was born at LinkedIn. We run thousands of brokers to deliver trillions of messages per day. We run a slightly modified version of Apache Kafka trunk. This branch contains the LinkedIn Kafka release.
This branch is made up of:
- Apache Kafka trunk (upstream) up to a branch point. The -li branch name gives the base version, and the exact commit can be found in the git history
- Cherry-picked commits from upstream after branch point
- Patches that are on their way upstream but we have deployed internally in the meantime
- Patches that are of no interest to upstream
We are making this branch available for people interested. We will be documenting the changes in the near future with some more detailed explanations in the LinkedIn Engineering Blog.
If you are interested in learning more, we invite you to our Streaming Meetup where we discuss streaming technologies like Kafka and Samza.
You are encouraged to check out other Kafka projects from LinkedIn:
We are currently using GitHub Actions as the CI framework, and the testing results can be found here. To publish a release, go to the release page and manually create a new release. Once the release tag is created, a test job is triggered to run the necessary tests. Once the tests pass, the artifacts are published to the Bintray repository hosting LinkedIn projects.
Currently we've configured the CI flow to run only unit tests for 'clients' and 'core' when a pull request is created or updated:
./gradlew :clients:unitTest :core:unitTest
In contrast, all tests for 'clients' and 'core' are run when creating a release, which may take significantly longer than running the unit tests:
./gradlew :clients:test :core:test
The reason for this mixed approach is to get faster feedback from CI during code reviews and still gain the more thorough test coverage when publishing a release.
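The split between the two task sets can be sketched as a small shell helper. This is only an illustration: the function name `ci_gradle_tasks` and the event names `pull_request` and `release` are assumptions made for this example, not taken from the actual workflow definition.

```shell
# Hypothetical helper: print the Gradle tasks that the CI flow described above
# would run for a given event. The event names are illustrative assumptions.
ci_gradle_tasks() {
  case "$1" in
    pull_request) echo ":clients:unitTest :core:unitTest" ;;  # fast feedback during review
    release)      echo ":clients:test :core:test" ;;          # full unit and integration run
    *)            echo "unknown event: $1" >&2; return 1 ;;
  esac
}
```

It could then be used as `./gradlew $(ci_gradle_tasks pull_request)` to reproduce locally what a pull request build is expected to run.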
At this moment we are not accepting external contributions directly. Please contribute to Apache Kafka.
For security issues with this branch please review LinkedIn Security Guidelines. General Kafka issues should be communicated via the Kafka community.
See our web site for details on the project.
You need to have Java installed.
Java 8 should be used for building in order to support both Java 8 and Java 11 at runtime.
Scala 2.12 is used by default, see below for how to use a different Scala version or all of the supported Scala versions.
./gradlew jar
Follow instructions in https://kafka.apache.org/documentation.html#quickstart
./gradlew srcJar
./gradlew aggregatedJavadoc
./gradlew javadoc
./gradlew javadocJar # builds a javadoc jar for each module
./gradlew scaladoc
./gradlew scaladocJar # builds a scaladoc jar for each module
./gradlew docsJar # builds both (if applicable) javadoc and scaladoc jars for each module
./gradlew test # runs both unit and integration tests
./gradlew unitTest
./gradlew integrationTest
./gradlew cleanTest test
./gradlew cleanTest unitTest
./gradlew cleanTest integrationTest
./gradlew clients:test --tests RequestResponseTest
./gradlew core:test --tests kafka.api.ProducerFailureHandlingTest.testCannotSendToInternalTopic
./gradlew clients:test --tests org.apache.kafka.clients.MetadataTest.testMetadataUpdateWaitTime
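The three commands above all use Gradle's `--tests` filter, either with a class name or with a fully qualified `Class.method` name. A minimal sketch of a wrapper that builds such a command is shown here; the function name `run_single_test` and the `GRADLEW` override variable are hypothetical and exist only for this example.

```shell
# Hypothetical wrapper around the commands above.
# $1 = module (for example clients or core)
# $2 = test class, simple or fully qualified
# $3 = optional test method name
run_single_test() {
  filter="$2"
  if [ -n "${3:-}" ]; then
    filter="$filter.$3"   # Class.method form selects a single test method
  fi
  # GRADLEW is an assumed override hook; it defaults to the wrapper in the repo root.
  "${GRADLEW:-./gradlew}" "$1:test" --tests "$filter"
}
```

For example, `run_single_test clients org.apache.kafka.clients.MetadataTest testMetadataUpdateWaitTime` expands to the last command listed above.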
Change the log4j setting in either clients/src/test/resources/log4j.properties or core/src/test/resources/log4j.properties
./gradlew clients:test --tests RequestResponseTest
Generate coverage reports for the whole project:
./gradlew reportCoverage
Generate coverage for a single module, e.g.:
./gradlew clients:reportCoverage
./gradlew clean releaseTarGz
The above command will fail if you haven't set up the signing key. To bypass signing the artifact, you can run:
./gradlew clean releaseTarGz -x signArchives
The release file can be found inside ./core/build/distributions/.
./gradlew clean
Note that if building the jars with a Scala version other than 2.12.x, you need to set the SCALA_VERSION variable or change it in bin/kafka-run-class.sh to run the quick start.
You can pass either the major version (e.g. 2.12) or the full version (e.g. 2.12.7):
./gradlew -PscalaVersion=2.12 jar
./gradlew -PscalaVersion=2.12 test
./gradlew -PscalaVersion=2.12 releaseTarGz
Append All to the task name:
./gradlew testAll
./gradlew jarAll
./gradlew releaseTarGzAll
This is for core, examples and clients
./gradlew core:jar
./gradlew core:test
./gradlew tasks
Note that this is not strictly necessary (IntelliJ IDEA has good built-in support for Gradle projects, for example).
./gradlew eclipse
./gradlew idea
The eclipse task has been configured to use ${project_dir}/build_eclipse as Eclipse's build directory. Eclipse's default build directory (${project_dir}/bin) clashes with Kafka's scripts directory and we don't use Gradle's build directory to avoid known issues with this configuration.
./gradlew -Pversion=<release version> uploadArchivesAll
By default, this command will publish artifacts to a Bintray repository named "kafka" under an account specified by the BINTRAY_USER environment variable. The BINTRAY_KEY environment variable is used for the password for that account.
If you want to override this to use a different maven repository, you should create/update ${GRADLE_USER_HOME}/gradle.properties (typically, ~/.gradle/gradle.properties) and assign the following variables:
mavenUrl=
mavenUsername=
mavenPassword=
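As a rough sketch of filling in these variables, the helper below appends the three overrides to a gradle.properties file. The function name `write_maven_overrides` and its argument order are hypothetical, and the repository URL in the usage note is a placeholder, not a real deployment target.

```shell
# Hypothetical helper: append Maven repository overrides to gradle.properties.
# $1 = directory holding gradle.properties (for example ~/.gradle)
# $2 = repository URL
# $3 = optional username, $4 = optional password
write_maven_overrides() {
  mkdir -p "$1"
  {
    printf 'mavenUrl=%s\n' "$2"
    printf 'mavenUsername=%s\n' "${3:-}"
    printf 'mavenPassword=%s\n' "${4:-}"
  } >> "$1/gradle.properties"   # append so existing settings are preserved
}
```

For instance, `write_maven_overrides "${GRADLE_USER_HOME:-$HOME/.gradle}" "file://$HOME/kafka-local-repo"` would point the upload at a local directory; the credentials are left empty here on the assumption that a file-based repository does not need them.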
Signing is disabled by default. If you need signing, please set the following variables in gradle.properties as well:
signing.keyId=
signing.password=
signing.secretKeyRingFile=
For the Streams archetype project, one cannot use gradle to upload to maven; instead the mvn deploy command needs to be called at the quickstart folder:
cd streams/quickstart
mvn deploy
Please note for this to work you should create/update user maven settings (typically, ${USER_HOME}/.m2/settings.xml) to assign the following variables:
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
https://maven.apache.org/xsd/settings-1.0.0.xsd">
...
<servers>
...
<server>
<id>apache.snapshots.https</id>
<username>${maven_username}</username>
<password>${maven_password}</password>
</server>
<server>
<id>apache.releases.https</id>
<username>${maven_username}</username>
<password>${maven_password}</password>
</server>
...
</servers>
...
./gradlew installAll
./gradlew testJar
./gradlew core:dependencies --configuration runtime
./gradlew dependencyUpdates
There are two code quality analysis tools that we regularly run, spotbugs and checkstyle.
Checkstyle enforces a consistent coding style in Kafka. You can run checkstyle using:
./gradlew checkstyleMain checkstyleTest
The checkstyle warnings will be found in reports/checkstyle/reports/main.html and reports/checkstyle/reports/test.html files in the subproject build directories. They are also printed to the console. The build will fail if Checkstyle fails.
Spotbugs uses static analysis to look for bugs in the code. You can run spotbugs using:
./gradlew spotbugsMain spotbugsTest -x test
The spotbugs warnings will be found in reports/spotbugs/main.html and reports/spotbugs/test.html files in the subproject build directories. Use -PxmlSpotBugsReport=true to generate an XML report instead of an HTML one.
The following options should be set with a -P switch, for example ./gradlew -PmaxParallelForks=1 test.
- commitId: sets the build commit ID as .git/HEAD might not be correct if there are local commits added for build purposes.
- mavenUrl: sets the URL of the maven deployment repository (file://path/to/repo can be used to point to a local repository).
- maxParallelForks: limits the maximum number of processes for each task.
- showStandardStreams: shows standard out and standard error of the test JVM(s) on the console.
- skipSigning: skips signing of artifacts.
- testLoggingEvents: unit test events to be logged, separated by comma. For example ./gradlew -PtestLoggingEvents=started,passed,skipped,failed test.
- xmlSpotBugsReport: enable XML reports for spotBugs. This also disables HTML reports as only one can be enabled at a time.
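Several of the options listed above can be combined in one invocation. The following is a minimal sketch that pairs maxParallelForks with testLoggingEvents, using the values shown in the examples above; the function name `serial_test_run` and the `GRADLEW` override variable are hypothetical.

```shell
# Hypothetical wrapper: run the given Gradle tasks with a single fork and
# per-test event logging, combining two of the -P options described above.
serial_test_run() {
  # GRADLEW is an assumed override hook; it defaults to the wrapper in the repo root.
  "${GRADLEW:-./gradlew}" \
    -PmaxParallelForks=1 \
    -PtestLoggingEvents=started,passed,skipped,failed \
    "$@"
}
```

Calling `serial_test_run clients:unitTest` would then run the clients unit tests one process at a time, which can make interleaved log output easier to follow.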
The gradle dependency debugging documentation mentions using the dependencies or dependencyInsight tasks to debug dependencies for the root project or individual subprojects.
Alternatively, use the allDeps or allDepInsight tasks for recursively iterating through all subprojects:
./gradlew allDeps
./gradlew allDepInsight --configuration runtime --dependency com.fasterxml.jackson.core:jackson-databind
These take the same arguments as the builtin variants.
See tests/README.md.
See vagrant/README.md.
Apache Kafka is interested in building the community; we would welcome any thoughts or patches. You can reach us on the Apache mailing lists.
To contribute follow the instructions here: