add brief description of enhancements
yongkun.wang committed Oct 28, 2013
1 parent a6ba113, commit f26d01c
Showing 1 changed file with 23 additions and 0 deletions.
@@ -0,0 +1,23 @@

This repository holds the Flume source code we use inside our company.
It is forked from Cloudera's 0.9.3-cdh3u0 distribution (Flume-OG).

Two major enhancements:

1) Masterless Ack:
This enhancement aims to increase the reliability and throughput of the whole distributed collection system.
Flume provides an End-to-End delivery mode to guarantee data delivery: an acknowledgement message (ACK) is sent back to the originating node to confirm the successful delivery of a group of messages. However, the ACKs are sent back through the master, which can become a single point of failure or a bottleneck for the whole system.
Therefore, I redesigned the ACK system so that each ACK travels back along the route its events took.

More information is available at https://issues.apache.org/jira/browse/FLUME-640
This enhancement would have been merged into 0.10 had Flume not moved to NG.
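
As a rough illustration of the idea (not the code in this fork), the sketch below models each hop remembering where a batch arrived from, so that the ACK can retrace the event route without going through a master. The names MasterlessAckSketch, Node, and the batch id string are hypothetical, and durable storage at the last hop is simply assumed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MasterlessAckSketch {

    /** One hop on the event route: an agent, an intermediate node, or a collector. */
    static class Node {
        final String name;
        final Node downstream; // next hop toward storage; null for the last hop
        // batchId -> hop the batch arrived from, so the ACK can retrace the route
        final Map<String, Node> arrivedFrom = new HashMap<>();
        // batches this node originated and has not yet seen acknowledged
        final Set<String> pending = new HashSet<>();
        // batches this node originated that have been acknowledged
        final List<String> acked = new ArrayList<>();

        Node(String name, Node downstream) {
            this.name = name;
            this.downstream = downstream;
        }

        /** Originating node: remember the batch, then forward it downstream. */
        void send(String batchId) {
            pending.add(batchId);
            downstream.receive(batchId, this);
        }

        /** Any hop: record the upstream hop, then forward or start the ACK. */
        void receive(String batchId, Node upstream) {
            arrivedFrom.put(batchId, upstream);
            if (downstream != null) {
                downstream.receive(batchId, this);
            } else {
                // Last hop: assume the batch is now durably stored (assumption).
                ack(batchId);
            }
        }

        /** Pass the ACK one hop back; no master is involved. */
        void ack(String batchId) {
            Node upstream = arrivedFrom.remove(batchId);
            if (upstream != null) {
                upstream.ack(batchId);
            } else if (pending.remove(batchId)) {
                // No upstream recorded: this node originated the batch.
                acked.add(batchId);
            }
        }
    }

    public static void main(String[] args) {
        Node collector = new Node("collector", null);
        Node relay = new Node("relay", collector);
        Node agent = new Node("agent", relay);
        agent.send("batch-1");
        System.out.println(agent.acked);
    }
}
```

In this sketch the ACK reaches the agent only after passing through the same relay that carried the batch, which is the property that removes the master from the ACK path.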

2) Append to HDFS with a new file rotation method:
Use HDFS append() and change the file rotation mechanism to create large HDFS files. Larger files can improve the performance of Map/Reduce programs that use them as input, and reduce the number of block mapping entries in the Hadoop NameNode.
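
The rotation rule itself is not described here, so the sketch below assumes a simple size-based one: keep appending to the current file and start a new file only once it has reached a threshold. It uses local files through java.nio as a stand-in for HDFS, and the names AppendRotationSketch, maxBytes, and the events-N.log naming are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class AppendRotationSketch {
    private final Path dir;
    private final long maxBytes; // hypothetical rotation threshold in bytes
    private int index = 0;       // sequence number of the file being appended to

    AppendRotationSketch(Path dir, long maxBytes) {
        this.dir = dir;
        this.maxBytes = maxBytes;
    }

    /** File that the next batch will be appended to, unless it is already full. */
    Path currentFile() {
        return dir.resolve("events-" + index + ".log");
    }

    /** Appends one batch, rolling to a new file only when the current one is full. */
    void write(String batch) throws IOException {
        Path file = currentFile();
        if (Files.exists(file) && Files.size(file) >= maxBytes) {
            index++;
            file = currentFile();
        }
        // Local stand-in for an HDFS append: create the file if missing, else append.
        Files.write(file, batch.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rotation-sketch");
        AppendRotationSketch writer = new AppendRotationSketch(dir, 15);
        for (int i = 0; i < 3; i++) {
            writer.write("0123456789");
        }
        System.out.println("now writing to " + writer.currentFile());
    }
}
```

Because many batches land in the same file before it rolls, the output is a few large files instead of many small ones, which is the effect described above.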

This modified version has been heavily used inside our company, with a single collector receiving more than 300 GB of data per day.

Contact me:
yongkun at gmail.com

https://github.com/yongkun/flume-0.9.3-cdh3u0