MAPREDUCE-2765. DistCp Rewrite. (Mithun Radhakrishnan via mahadev)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1236045 13f79535-47bb-0310-9956-ffa450edef68
Mahadev Konar committed Jan 26, 2012
1 parent fae75c2 commit d069480
Showing 50 changed files with 10,389 additions and 0 deletions.
2 changes: 2 additions & 0 deletions hadoop-mapreduce-project/CHANGES.txt
@@ -195,6 +195,8 @@ Release 0.23.1 - Unreleased
MAPREDUCE-3710. Improved FileInputFormat to return better locality for the
last split. (Siddarth Seth via vinodkv)

MAPREDUCE-2765. DistCp Rewrite. (Mithun Radhakrishnan via mahadev)

OPTIMIZATIONS

MAPREDUCE-3567. Extraneous JobConf objects in AM heap. (Vinod Kumar
18 changes: 18 additions & 0 deletions hadoop-project/pom.xml
@@ -709,11 +709,21 @@
<artifactId>maven-project-info-reports-plugin</artifactId>
<version>2.4</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>2.2</version>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>1.2</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-pdf-plugin</artifactId>
<version>1.1</version>
</plugin>
</plugins>
</pluginManagement>

@@ -773,6 +783,14 @@
</excludes>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-pdf-plugin</artifactId>
<configuration>
<outputDirectory>${project.reporting.outputDirectory}</outputDirectory>
<includeReports>false</includeReports>
</configuration>
</plugin>
</plugins>
</build>

7 changes: 7 additions & 0 deletions hadoop-tools/hadoop-distcp/README
@@ -0,0 +1,7 @@
DistCp (distributed copy) is a tool used for large inter/intra-cluster copying.
It uses Map/Reduce to effect its distribution, error handling and recovery,
and reporting. It expands a list of files and directories into input to map tasks,
each of which will copy a partition of the files specified in the source list.

Version 0.1 (2010/08/02 sriksun)
- Initial Version
185 changes: 185 additions & 0 deletions hadoop-tools/hadoop-distcp/pom.xml
@@ -0,0 +1,185 @@
<?xml version="1.0" encoding="UTF-8"?>
<project>
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-project</artifactId>
<version>0.23.1-SNAPSHOT</version>
<relativePath>../../hadoop-project</relativePath>
</parent>
<groupId>org.apache.hadoop.tools</groupId>
<artifactId>hadoop-distcp</artifactId>
<version>0.23.1-SNAPSHOT</version>
<description>Apache Hadoop Distributed Copy</description>
<name>Apache Hadoop Distributed Copy</name>
<packaging>jar</packaging>

<properties>
<file.encoding>UTF-8</file.encoding>
<downloadSources>true</downloadSources>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-annotations</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-app</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-hs</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<scope>test</scope>
<type>test-jar</type>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<scope>test</scope>
<type>test-jar</type>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<scope>test</scope>
<type>test-jar</type>
</dependency>
</dependencies>

<build>
<resources>
<resource>
<directory>src/main/resources</directory>
<filtering>true</filtering>
</resource>
</resources>
<testResources>
<testResource>
<directory>src/test/resources</directory>
<filtering>true</filtering>
</testResource>
</testResources>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<forkMode>always</forkMode>
<forkedProcessTimeoutInSeconds>600</forkedProcessTimeoutInSeconds>
<argLine>-Xmx1024m</argLine>
<includes>
<include>**/Test*.java</include>
</includes>
<redirectTestOutputToFile>true</redirectTestOutputToFile>
<systemProperties>
<property>
<name>test.build.data</name>
<value>${basedir}/target/test/data</value>
</property>
<property>
<name>hadoop.log.dir</name>
<value>target/test/logs</value>
</property>
<property>
<name>org.apache.commons.logging.Log</name>
<value>org.apache.commons.logging.impl.SimpleLog</value>
</property>
<property>
<name>org.apache.commons.logging.simplelog.defaultlog</name>
<value>warn</value>
</property>
</systemProperties>
</configuration>
</plugin>
<plugin>
<artifactId>maven-dependency-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>copy-dependencies</goal>
</goals>
<configuration>
<outputDirectory>${project.build.directory}/lib</outputDirectory>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-checkstyle-plugin</artifactId>
<configuration>
<enableRulesSummary>true</enableRulesSummary>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifest>
<mainClass>org.apache.hadoop.tools.DistCp</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-source-plugin</artifactId>
<configuration>
<attach>true</attach>
</configuration>
<executions>
<execution>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-pdf-plugin</artifactId>
<executions>
<execution>
<id>pdf</id>
<phase>package</phase>
<goals>
<goal>pdf</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>