Commit

KAFKA-3219: Fix long topic name validation
This fixes an issue with long topic names by considering, during topic
validation, the '-' and the partition id that is appended to the log
folder created for each topic partition.

Author: Vahid Hashemian <vahidhashemian@us.ibm.com>

Reviewers: Gwen Shapira, Grant Henke

Closes apache#898 from vahidhashemian/KAFKA-3219
vahidhashemian authored and gwenshap committed Mar 22, 2016
1 parent ca77d67 commit ad3dfc6
Showing 3 changed files with 5 additions and 2 deletions.
2 changes: 1 addition & 1 deletion core/src/main/scala/kafka/common/Topic.scala
@@ -22,7 +22,7 @@ import kafka.coordinator.GroupCoordinator

 object Topic {
   val legalChars = "[a-zA-Z0-9\\._\\-]"
-  private val maxNameLength = 255
+  private val maxNameLength = 249
   private val rgx = new Regex(legalChars + "+")

   def validate(topic: String) {
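
To make the effect of the new limit concrete, here is a minimal sketch of a length and character-set check built from the three values visible in the hunk above. It is not the body of `Topic.validate`, which the diff does not show: the object name `TopicNameSketch`, the method `check`, and the `Either` return type are all hypothetical, and any other checks the real method performs are left out.

```scala
import scala.util.matching.Regex

// Hypothetical stand-in for kafka.common.Topic; not the real implementation.
object TopicNameSketch {
  val legalChars = "[a-zA-Z0-9\\._\\-]"
  val maxNameLength = 249 // value introduced by this commit
  private val rgx = new Regex(legalChars + "+")

  // Returns Right(topic) when the name passes both checks, Left(reason) otherwise.
  def check(topic: String): Either[String, String] =
    if (topic.isEmpty)
      Left("topic name is empty")
    else if (topic.length > maxNameLength)
      Left(s"topic name is longer than $maxNameLength characters")
    else if (!rgx.pattern.matcher(topic).matches())
      Left("topic name contains a character outside " + legalChars)
    else
      Right(topic)
}
```

With this sketch, a 249-character name of legal characters is accepted and a 250-character one is rejected, which is the boundary the updated test exercises.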
3 changes: 2 additions & 1 deletion core/src/test/scala/unit/kafka/common/TopicTest.scala
@@ -31,6 +31,7 @@ class TopicTest {
     for (i <- 1 to 6)
       longName += longName
     invalidTopicNames += longName
+    invalidTopicNames += longName.drop(6)
     val badChars = Array('/', '\\', ',', '\u0000', ':', "\"", '\'', ';', '*', '?', ' ', '\t', '\r', '\n', '=')
     for (weirdChar <- badChars) {
       invalidTopicNames += "Is" + weirdChar + "illegal"
@@ -47,7 +48,7 @@ class TopicTest {
     }

     val validTopicNames = new ArrayBuffer[String]()
-    validTopicNames += ("valid", "TOPIC", "nAmEs", "ar6", "VaL1d", "_0-9_.")
+    validTopicNames += ("valid", "TOPIC", "nAmEs", "ar6", "VaL1d", "_0-9_.", longName.drop(7))
     for (i <- 0 until validTopicNames.size) {
       try {
         Topic.validate(validTopicNames(i))
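
The two new test values sit on either side of the 249-character limit. The sketch below reproduces the arithmetic under one assumption: the initial value of `longName` lies outside the hunk, so a placeholder 4-character seed is used here (any 4-character seed gives the same lengths). The object name `LongNameLengths` is hypothetical.

```scala
object LongNameLengths {
  // Rebuild longName the way the test loop does: double the string six times.
  // The seed "abcd" is an assumption; the hunk does not show the real one.
  def build(): String = {
    var longName = "abcd"
    for (_ <- 1 to 6)
      longName += longName
    longName
  }

  def main(args: Array[String]): Unit = {
    val longName = build()
    println(longName.length)         // 4 * 2^6 = 256
    println(longName.drop(6).length) // 250: one over the limit, expected invalid
    println(longName.drop(7).length) // 249: exactly at the limit, expected valid
  }
}
```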
2 changes: 2 additions & 0 deletions docs/ops.html
@@ -34,6 +34,8 @@ <h4><a id="basic_ops_add_topic" href="#basic_ops_add_topic">Adding and removing
 <p>
 The partition count controls how many logs the topic will be sharded into. There are several impacts of the partition count. First each partition must fit entirely on a single server. So if you have 20 partitions the full data set (and read and write load) will be handled by no more than 20 servers (no counting replicas). Finally the partition count impacts the maximum parallelism of your consumers. This is discussed in greater detail in the <a href="#intro_consumers">concepts section</a>.
 <p>
+Each sharded partition log is placed into its own folder under the Kafka log directory. The name of such folders consists of the topic name, appended by a dash (-) and the partition id. Since a typical folder name can not be over 255 characters long, there will be a limitation on the length of topic names. We assume the number of partitions will not ever be above 100,000. Therefore, topic names cannot be longer than 249 characters. This leaves just enough room in the folder name for a dash and a potentially 5 digit long partition id.
+<p>
 The configurations added on the command line override the default settings the server has for things like the length of time data should be retained. The complete set of per-topic configurations is documented <a href="#topic-config">here</a>.

 <h4><a id="basic_ops_modify_topic" href="#basic_ops_modify_topic">Modifying topics</a></h4>
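
The budget described in the added documentation paragraph can be written out as a short calculation. This is only an illustration of that paragraph: `LogDirNameBudget` and `logDirName` are hypothetical names, not Kafka APIs, and the 255-character folder limit and 5-digit partition id are taken from the docs text as stated.

```scala
object LogDirNameBudget {
  val maxFolderNameLength = 255 // typical folder-name limit cited in the docs change
  val maxPartitionIdDigits = 5  // fewer than 100,000 partitions, so ids 0 to 99999

  // One character is reserved for the dash between topic name and partition id.
  val maxTopicNameLength = maxFolderNameLength - 1 - maxPartitionIdDigits

  // Folder name as the docs describe it: topic name, a dash, then the partition id.
  def logDirName(topic: String, partition: Int): String = s"$topic-$partition"
}
```

Under these assumptions `maxTopicNameLength` works out to 249, and a 249-character topic with partition id 99999 produces a folder name of exactly 255 characters.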
