Hadoop: how to do CopyMerge in Hadoop 3.0

February 04, 2017, at 1:06 PM

I know that FileUtil in Hadoop 2.7 has a copyMerge function that merges multiple files into a single new file. According to the API docs, copyMerge is no longer supported in version 3.0.

Any ideas on how to merge all files within a directory into a single new file in Hadoop 3.0?

Answer 1

The FileUtil#copyMerge method has been removed in Hadoop 3.0. See these issues for details on the change:

https://issues.apache.org/jira/browse/HADOOP-12967

https://issues.apache.org/jira/browse/HADOOP-11392

You can use the getmerge shell command instead:

Usage: hadoop fs -getmerge [-nl] <src> <localdst>

Takes a source directory and a destination file as input and concatenates files in src into the destination local file. Optionally -nl can be set to enable adding a newline character (LF) at the end of each file. -skip-empty-file can be used to avoid unwanted newline characters in case of empty files.

Examples:

hadoop fs -getmerge -nl /src /opt/output.txt
hadoop fs -getmerge -nl /src/file1.txt /src/file2.txt /output.txt

Exit Code: Returns 0 on success and non-zero on error.

https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#getmerge
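
To make the -nl and -skip-empty-file behaviour described above concrete, here is a minimal sketch of the same concatenation on the local filesystem using only java.nio. It is not Hadoop's implementation and does not touch HDFS. The class name LocalMerge, the merge method, and its addNewline and skipEmpty parameters are hypothetical names chosen for illustration, and sorting the inputs by name is an assumption of this sketch.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: a local-filesystem analogue of "hadoop fs -getmerge". */
final class LocalMerge {

    /**
     * Concatenates every regular file directly under srcDir into dstFile.
     * addNewline mimics -nl; skipEmpty mimics -skip-empty-file.
     * dstFile should live outside srcDir so it is not merged into itself.
     */
    static void merge(Path srcDir, Path dstFile, boolean addNewline, boolean skipEmpty)
            throws IOException {
        // Collect regular files only, sorted by path so the output order is deterministic.
        List<Path> files;
        try (Stream<Path> entries = Files.list(srcDir)) {
            files = entries.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        // newOutputStream creates the destination or truncates an existing one.
        try (OutputStream out = Files.newOutputStream(dstFile)) {
            for (Path file : files) {
                if (skipEmpty && Files.size(file) == 0) {
                    continue; // avoids a stray newline for an empty input
                }
                Files.copy(file, out);
                if (addNewline) {
                    out.write('\n'); // LF after each file, as -nl is described above
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: LocalMerge <srcDir> <dstFile>");
            return;
        }
        merge(Paths.get(args[0]), Paths.get(args[1]), true, true);
    }
}
```

If you need the merge to stay inside HDFS rather than land in a local file, the same loop shape should carry over by swapping in FileSystem.listStatus, FileSystem.open, FileSystem.create and IOUtils.copyBytes from the Hadoop API, which is roughly what the removed copyMerge did. Treat that as a starting point to test against your cluster, not a drop-in replacement.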
