Hadoop Distributed File System (HDFS) is a key component of the Hadoop ecosystem, designed to store vast amounts of data across multiple nodes, providing high availability and fault tolerance. Understanding and mastering Hadoop involves getting hands-on with HDFS commands to manage and manipulate the filesystem. In this article, we’ll take a deep dive into some of the most essential HDFS commands, exploring what they do, and how to use them effectively.


Understanding Hadoop Distributed File System (HDFS)

HDFS is a distributed file system designed to handle large datasets by splitting files into blocks and distributing those blocks across the nodes of a cluster. Each block is replicated on several nodes (three by default), so the data remains available when a single node fails. HDFS stores file system metadata on a dedicated server, known as the NameNode, while the actual data is stored on other servers called DataNodes.

Starting with HDFS Commands

All HDFS commands are invoked through the bin/hdfs script. Running the hdfs script without any arguments prints a short description of each available command. To manage files in HDFS, we use the following syntax:

hdfs dfs -command 

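Note: ‘dfs’ stands for Distributed File System. You will also see ‘hadoop fs’ used in place of ‘hdfs dfs’; ‘hadoop fs’ is the generic file system shell and behaves the same way when the target file system is HDFS.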

Now let’s explore some essential HDFS commands that you should master.

1. Listing Files/Directories

The `-ls` command lists the files and directories at the given HDFS path. The syntax is similar to the UNIX ls command.

hdfs dfs -ls /path 
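As a minimal sketch, the listing can be made recursive with `-R` and the sizes human-readable with `-h`. The function name `hdfs_ls_tree` and the `HDFS_BIN` variable (an override that makes a dry run with `echo` possible) are illustrative assumptions, not part of Hadoop.

```shell
# Hypothetical helper: recursively list an HDFS path with human-readable sizes.
# HDFS_BIN is an assumed override; it falls back to the real `hdfs` binary.
hdfs_ls_tree() {
    # $1 = HDFS path to list (defaults to the root directory)
    "${HDFS_BIN:-hdfs}" dfs -ls -R -h "${1:-/}"
}
```

Calling `hdfs_ls_tree /path` walks the whole subtree, so on a large directory it can produce a lot of output.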

2. Creating Directories

You can create directories in HDFS using the `-mkdir` command.

hdfs dfs -mkdir /path/to/directory 
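Without extra options, `-mkdir` fails when a parent directory is missing. The sketch below adds `-p`, which creates missing parents and does not complain if the directory already exists. The wrapper name `hdfs_mkdir_p` and the `HDFS_BIN` override are assumptions made for illustration.

```shell
# Hypothetical helper: create one or more directories along with any
# missing parent directories.
# HDFS_BIN is an assumed override; it falls back to the real `hdfs` binary.
hdfs_mkdir_p() {
    # "$@" = one or more HDFS directory paths
    "${HDFS_BIN:-hdfs}" dfs -mkdir -p "$@"
}
```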

3. Deleting Files/Directories

To remove a file in HDFS, use the `-rm` command. To remove a directory together with its contents, add the `-r` (recursive) option.

hdfs dfs -rm /path/to/file 

hdfs dfs -rm -r /path/to/directory 
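Because a recursive delete is easy to get wrong, it can help to guard it. The following is only a sketch: it uses `-test -e` to check that the path exists before calling `-rm -r`. The function name, the refusal to delete an empty path or `/`, and the `HDFS_BIN` override are all assumptions of this example rather than Hadoop behavior.

```shell
# Hypothetical helper: recursively delete an HDFS path only if it exists.
# HDFS_BIN is an assumed override; it falls back to the real `hdfs` binary.
hdfs_rm_tree() {
    # Refuse obviously dangerous targets (a safety policy chosen for this sketch).
    case "$1" in
        ""|"/")
            echo "refusing to delete '$1'" >&2
            return 1
            ;;
    esac
    # `-test -e` exits with status 0 only when the path exists.
    if "${HDFS_BIN:-hdfs}" dfs -test -e "$1"; then
        "${HDFS_BIN:-hdfs}" dfs -rm -r "$1"
    else
        echo "nothing to delete at $1" >&2
    fi
}
```

For a dry run, `HDFS_BIN=echo hdfs_rm_tree /path/to/directory` prints the two commands instead of executing them.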

4. Moving Files/Directories

The `-mv` command allows you to move files or directories from one location to another within HDFS.

hdfs dfs -mv /source/path /destination/path 

5. Copying Files/Directories

To copy files or directories within HDFS, use the `-cp` command.

hdfs dfs -cp /source/path /destination/path 

6. Displaying the Content of a File

You can display the contents of a file in HDFS using the `-cat` command.

hdfs dfs -cat /path/to/file 
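Files in HDFS are often very large, so printing one in full is rarely what you want. One possible approach, sketched here, is to pipe `-cat` into the standard `head` utility. The name `hdfs_preview`, the default of 10 lines, and the `HDFS_BIN` override are assumptions of this example.

```shell
# Hypothetical helper: print only the first N lines of an HDFS file.
# HDFS_BIN is an assumed override; it falls back to the real `hdfs` binary.
hdfs_preview() {
    # $1 = HDFS file path, $2 = number of lines to show (defaults to 10)
    "${HDFS_BIN:-hdfs}" dfs -cat "$1" | head -n "${2:-10}"
}
```

For example, `hdfs_preview /path/to/file 5` shows the first five lines.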

7. Copying Files to HDFS

To copy files from the local filesystem to HDFS, use the `-put` or `-copyFromLocal` command. The two behave alike, except that `-copyFromLocal` only accepts a local file as the source.

hdfs dfs -put localfile /path/in/hdfs 

hdfs dfs -copyFromLocal localfile /path/in/hdfs 
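By default, `-put` fails if the destination already exists. As a rough sketch, the wrapper below checks that the local file is present and then uploads it with `-f`, which overwrites an existing destination. The function name `hdfs_upload` and the `HDFS_BIN` override are hypothetical.

```shell
# Hypothetical helper: upload a local file to HDFS, replacing any existing copy.
# HDFS_BIN is an assumed override; it falls back to the real `hdfs` binary.
hdfs_upload() {
    # $1 = local file, $2 = HDFS destination path
    if [ ! -f "$1" ]; then
        echo "local file not found: $1" >&2
        return 1
    fi
    # -f overwrites the destination if it already exists.
    "${HDFS_BIN:-hdfs}" dfs -put -f "$1" "$2"
}
```

Leave out `-f` if silently replacing existing data would be a problem in your workflow.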

8. Copying Files from HDFS

To copy files from HDFS to the local filesystem, use the `-get` or `-copyToLocal` command. The two behave alike, except that `-copyToLocal` only accepts a local path as the destination.

hdfs dfs -get /path/in/hdfs localfile 

hdfs dfs -copyToLocal /path/in/hdfs localfile 

9. File/Directory Permissions

HDFS commands for file or directory permissions mirror the chmod, chown, and chgrp commands in UNIX.

hdfs dfs -chmod 755 /path/to/file 

hdfs dfs -chown user:group /path/to/file 

hdfs dfs -chgrp group /path/to/file 
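Each of these commands accepts `-R` to apply the change to a whole directory tree. The sketch below combines `-chown -R` and `-chmod -R` in one step; the function name `hdfs_grant`, its argument order, and the `HDFS_BIN` override are assumptions for illustration. Note that changing a file's owner normally requires HDFS superuser privileges.

```shell
# Hypothetical helper: set owner, group, and mode on a whole HDFS subtree.
# HDFS_BIN is an assumed override; it falls back to the real `hdfs` binary.
hdfs_grant() {
    # $1 = owner:group, $2 = octal mode (for example 755), $3 = HDFS path
    # The mode is changed only if the ownership change succeeded.
    "${HDFS_BIN:-hdfs}" dfs -chown -R "$1" "$3" &&
    "${HDFS_BIN:-hdfs}" dfs -chmod -R "$2" "$3"
}
```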

10. Checking Disk Usage

The `-du` command displays the size of each file and directory under the given path. To get a single summarized total, add the `-s` option; the older `-dus` command is deprecated in favor of `-du -s`.

hdfs dfs -du /path/to/directory 

hdfs dfs -du -s /path/to/directory 
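Raw byte counts are hard to read for large directories. As a small sketch, the summary can be combined with `-h` so that sizes are printed in human-readable units. The wrapper name `hdfs_size` and the `HDFS_BIN` override are assumptions of this example.

```shell
# Hypothetical helper: print a summarized, human-readable size per path.
# HDFS_BIN is an assumed override; it falls back to the real `hdfs` binary.
hdfs_size() {
    # "$@" = one or more HDFS paths
    "${HDFS_BIN:-hdfs}" dfs -du -s -h "$@"
}
```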

Conclusion

Mastering HDFS is a crucial part of becoming proficient in Hadoop. It gives you the skills to manage and manipulate large amounts of data in a distributed environment. By understanding and practicing the commands discussed above, you take a big step toward mastering Hadoop and HDFS. Practice is key to proficiency, so don’t hesitate to get hands-on with HDFS commands.


1 Comment

  1. Vidya Vinodini on

    Hi
    This tutorial is very useful for me..I followed this for my hadoop installation…but while running wordcount example – javac -classpath $(HADOOP_CLASSPATH) -d ‘/home/hduser/Desktop/WordCountTutorial/tutorial_classes’ ‘/home/hduser/Desktop/WordCountTutorial/WordCount.java’ i am facing with the error

    HADOOP_CLASSPATH: command not found
    javac: invalid flag: /home/hduser/Desktop/WordCountTutorial/tutorial_classes
    Usage: javac
    use -help for a list of possible options
    so how can i fix the problem??

    Thankyou
