hadoop copy a local file system folder to HDFS

Hadoop, HDFS

Hadoop Problem Overview


I need to copy a folder from the local file system to HDFS. I could not find any example of moving a folder (including all of its subfolders) to HDFS.

$ hadoop fs -copyFromLocal /home/ubuntu/Source-Folder-To-Copy HDFS-URI

Hadoop Solutions


Solution 1 - Hadoop

You could try:

hadoop fs -put /path/in/linux /hdfs/path

or even

hadoop fs -copyFromLocal /path/in/linux /hdfs/path

By default, both put and copyFromLocal upload directories recursively to HDFS.
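As a minimal sketch (the source folder is taken from the question and the HDFS target /user/ubuntu is just an illustrative placeholder, so substitute your own paths):

# copy the whole folder, including subfolders, from the local disk to HDFS
hadoop fs -put /home/ubuntu/Source-Folder-To-Copy /user/ubuntu/

# list the uploaded tree recursively to confirm the subfolders came along
hadoop fs -ls -R /user/ubuntu/Source-Folder-To-Copy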

Solution 2 - Hadoop

In Short

hdfs dfs -put <localsrc> <dest>

In detail with an example:

Checking source and target before placing files into HDFS

[cloudera@quickstart ~]$ ll files/
total 132
-rwxrwxr-x 1 cloudera cloudera  5387 Nov 14 06:33 cloudera-manager
-rwxrwxr-x 1 cloudera cloudera  9964 Nov 14 06:33 cm_api.py
-rw-rw-r-- 1 cloudera cloudera   664 Nov 14 06:33 derby.log
-rw-rw-r-- 1 cloudera cloudera 53655 Nov 14 06:33 enterprise-deployment.json
-rw-rw-r-- 1 cloudera cloudera 50515 Nov 14 06:33 express-deployment.json

[cloudera@quickstart ~]$ hdfs dfs -ls
Found 1 items
drwxr-xr-x   - cloudera cloudera          0 2017-11-14 00:45 .sparkStaging

Copy files to HDFS using the -put or -copyFromLocal command

[cloudera@quickstart ~]$ hdfs dfs -put files/ files

Verify the result in HDFS

[cloudera@quickstart ~]$ hdfs dfs -ls
Found 2 items
drwxr-xr-x   - cloudera cloudera          0 2017-11-14 00:45 .sparkStaging
drwxr-xr-x   - cloudera cloudera          0 2017-11-14 06:34 files

[cloudera@quickstart ~]$ hdfs dfs -ls files
Found 5 items
-rw-r--r--   1 cloudera cloudera       5387 2017-11-14 06:34 files/cloudera-manager
-rw-r--r--   1 cloudera cloudera       9964 2017-11-14 06:34 files/cm_api.py
-rw-r--r--   1 cloudera cloudera        664 2017-11-14 06:34 files/derby.log
-rw-r--r--   1 cloudera cloudera      53655 2017-11-14 06:34 files/enterprise-deployment.json
-rw-r--r--   1 cloudera cloudera      50515 2017-11-14 06:34 files/express-deployment.json

Solution 3 - Hadoop

If you copy a folder from the local file system, it will be copied to HDFS along with all of its subfolders.

To copy a folder from local to HDFS, you can use

hadoop fs -put localpath

or

hadoop fs -copyFromLocal localpath

or

hadoop fs -put localpath hdfspath

or

hadoop fs -copyFromLocal localpath hdfspath

Note:

If you do not specify an HDFS path, the folder will be copied to HDFS with the same name as the local folder.

To copy from HDFS to local:

 hadoop fs -get hdfspath localpath
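For example, a minimal sketch (myfolder is a hypothetical local folder name; when no HDFS destination is given, the copy lands in your HDFS home directory, typically /user/<username>):

# copies the local folder "myfolder" to HDFS under the same name
hadoop fs -put myfolder

# copies the folder back from HDFS to a local path
hadoop fs -get myfolder /tmp/myfolder-from-hdfs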

Solution 4 - Hadoop

You can use :

1. Loading data from local to HDFS

Syntax: $ hadoop fs -copyFromLocal <localsrc> <hdfs-dir>

EX: $ hadoop fs -copyFromLocal localfile1 HDIR

2. Copying data from HDFS to local

Syntax: $ hadoop fs -copyToLocal <hdfs-src> <new-file-name>

EX: $ hadoop fs -copyToLocal hdfs/filename myunx

Solution 5 - Hadoop

To copy a folder or file from local to HDFS, you can use the below command

hadoop fs -put /path/localpath  /path/hdfspath
 

or

hadoop fs -copyFromLocal /path/localpath  /path/hdfspath

Solution 6 - Hadoop

Navigate to your "/install/hadoop/datanode/bin" folder, or whichever path you run your hadoop commands from:

To place the files in HDFS: Format: hadoop fs -put "Local system path"/filename.csv "HDFS destination path"

e.g. ./hadoop fs -put /opt/csv/load.csv /user/load

Here /opt/csv/load.csv is the source file path on my local Linux system.

/user/load is the HDFS destination path, i.e. "hdfs://hacluster/user/load".

To get files from HDFS to the local system: Format: hadoop fs -get "/HDFS source file path" "/local path"

e.g. hadoop fs -get /user/load/a.csv /opt/csv/

After executing the above command, a.csv from HDFS will be downloaded to the /opt/csv folder on the local Linux system.

The uploaded files can also be seen through the HDFS NameNode web UI.
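For example, assuming a Hadoop 3.x cluster where the NameNode web UI listens on its default port 9870 (older 2.x clusters use 50070); the host name below is a placeholder:

# HDFS file browser in the NameNode web UI
#   http://namenode-host:9870/explorer.html#/user/load

# the same directory listing over the WebHDFS REST API
curl "http://namenode-host:9870/webhdfs/v1/user/load?op=LISTSTATUS"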

Solution 7 - Hadoop

Use the following commands:

hadoop fs -copyFromLocal <local-nonhdfs-path> <hdfs-target-path>

hadoop fs -copyToLocal   <hdfs-input-path> <local-nonhdfs-path>

Or you can also use the Spark FileSystem library to get or put HDFS files.

Hope this is helpful.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
|---|---|---|
| Question | Tariq | View Question on Stackoverflow |
| Solution 1 - Hadoop | Ashrith | View Answer on Stackoverflow |
| Solution 2 - Hadoop | mrsrinivas | View Answer on Stackoverflow |
| Solution 3 - Hadoop | Kumar | View Answer on Stackoverflow |
| Solution 4 - Hadoop | Jubrayil Jamadar | View Answer on Stackoverflow |
| Solution 5 - Hadoop | Praveenkumar Sekar | View Answer on Stackoverflow |
| Solution 6 - Hadoop | Prasanna | View Answer on Stackoverflow |
| Solution 7 - Hadoop | user11714872 | View Answer on Stackoverflow |