    TecAdmin

    Hadoop Commands to Manage Files on HDFS

    By Rahul | June 26, 2023 | 3 Mins Read

    Hadoop Distributed File System (HDFS) is a key component of the Hadoop ecosystem, designed to store vast amounts of data across multiple nodes, providing high availability and fault tolerance. Understanding and mastering Hadoop involves getting hands-on with HDFS commands to manage and manipulate the filesystem. In this article, we’ll take a deep dive into some of the most essential HDFS commands, exploring what they do, and how to use them effectively.

    Understanding Hadoop Distributed File System (HDFS)

    HDFS is a distributed file system that is designed to handle large datasets by distributing them across numerous nodes in a cluster. It is resilient to node failure, which ensures data reliability. HDFS stores metadata on a dedicated server, known as the NameNode, while actual data is stored on other servers called DataNodes.
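
    To see the NameNode/DataNode split in practice, you can ask the NameNode where the blocks of a file live. The sketch below wraps two standard commands, `hdfs fsck` and `hdfs dfsadmin -report`. The helper names `block_report` and `cluster_report` are made up for this illustration, and any path you pass is a placeholder for a file on your own cluster.

```shell
#!/bin/sh
# block_report (hypothetical helper): print the files, blocks and
# DataNode locations that the NameNode has recorded for a path.
block_report() {
    hdfs fsck "$1" -files -blocks -locations
}

# cluster_report (hypothetical helper): print capacity figures and the
# DataNodes known to the NameNode. On many clusters the full report is
# only available to the HDFS superuser.
cluster_report() {
    hdfs dfsadmin -report
}
```

    For example, `block_report /path/to/file` should list each block of the file together with the DataNodes that hold its replicas.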

    Starting with HDFS Commands

    All HDFS commands are invoked through the bin/hdfs script. Running hdfs without any arguments prints a description of every available subcommand. To manage files in HDFS, we use the following syntax:

    hdfs dfs -command [options] [path ...]
    

    Note: ‘dfs’ stands for Distributed File System. The generic `hadoop fs` command accepts the same options as `hdfs dfs`; when the default filesystem is HDFS, both work the same.
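
    If you are unsure which options a file command accepts, the shell can tell you. A minimal sketch, assuming `hdfs` is on your PATH; `hdfs_help` is a hypothetical helper name, while `-help` itself is a standard option of the file system shell.

```shell
#!/bin/sh
# hdfs_help (hypothetical helper): show help for one file command,
# or for all of them when no command name is given.
hdfs_help() {
    if [ -n "${1:-}" ]; then
        hdfs dfs -help "$1"
    else
        hdfs dfs -help
    fi
}
```

    Calling `hdfs_help ls` prints the documented options of `-ls`. The related `hdfs dfs -usage ls` prints just the one-line synopsis.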

    Now let’s explore some essential HDFS commands that you should master.

    1. Listing Files/Directories

    The `-ls` command lists the files and directories at a given HDFS path. Its syntax is similar to the UNIX ls command.

    hdfs dfs -ls /path 
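
    Two variations come up often: listing a whole tree, and extracting only the path names for use in a script. The sketch below uses the `-R` (recursive) and `-h` (human-readable sizes) options of `-ls`. The helper names `list_tree` and `list_names` are hypothetical, and `list_names` assumes the usual eight-column listing format, so paths that contain spaces would be cut short.

```shell
#!/bin/sh
# list_tree (hypothetical helper): recursive listing with readable sizes.
list_tree() {
    hdfs dfs -ls -R -h "$1"
}

# list_names (hypothetical helper): print only the path of each entry.
# Entry lines have at least 8 whitespace-separated fields; the
# "Found N items" header has fewer and is skipped.
list_names() {
    hdfs dfs -ls "$1" | awk 'NF >= 8 { print $NF }'
}
```

    `list_names /path` then yields one path per line, which is convenient to feed into a `while read` loop.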
    

    2. Creating Directories

    You can create directories in HDFS using the `-mkdir` command.

    hdfs dfs -mkdir /path/to/directory 
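
    Plain `-mkdir` fails when a parent directory is missing. The `-p` option creates missing parents and does not complain if the directory already exists, much like `mkdir -p` on UNIX. A small sketch, where `make_layout` and the `input`, `output` and `tmp` sub-directory names are purely illustrative:

```shell
#!/bin/sh
# make_layout (hypothetical helper): create a small working layout
# under a base directory in one call. -mkdir accepts several paths.
make_layout() {
    base=$1
    hdfs dfs -mkdir -p "$base/input" "$base/output" "$base/tmp"
}
```

    Running `make_layout /path/to/project` is safe to repeat, because `-p` leaves existing directories untouched.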
    

    3. Deleting Files/Directories

    To remove a file in HDFS, use the `-rm` command. To remove a directory and everything under it, add the `-r` (recursive) option.

    hdfs dfs -rm /path/to/file 
    
    hdfs dfs -rm -r /path/to/directory 
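
    Because a recursive delete is hard to undo, scripts often put a small guard in front of it. The sketch below refuses an empty argument or the root path before calling `-rm -r`. The name `safe_rm` is hypothetical and the guard is only an illustration, not a complete safety net.

```shell
#!/bin/sh
# safe_rm (hypothetical helper): recursive delete with a basic guard.
safe_rm() {
    target=${1:-}
    # Refuse targets that are empty or the filesystem root.
    case "$target" in
        ""|"/")
            echo "refusing to remove '$target'" >&2
            return 1
            ;;
    esac
    hdfs dfs -rm -r "$target"
}
```

    When the HDFS trash feature is enabled on the cluster, `-rm` moves data into the user's trash directory instead of deleting it immediately; the `-skipTrash` option bypasses that and should be used with care.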
    

    4. Moving Files/Directories

    The `-mv` command allows you to move files or directories from one location to another within HDFS.

    hdfs dfs -mv /source/path /destination/path 
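
    As with UNIX mv, if the destination is an existing directory the source is moved into it, which is not always what a script intends. The sketch below checks the destination first with `-test -e`, which exits with status 0 when the path exists. `move_if_free` is a hypothetical helper name.

```shell
#!/bin/sh
# move_if_free (hypothetical helper): rename src to dst only when
# nothing exists at dst yet.
move_if_free() {
    src=$1
    dst=$2
    if hdfs dfs -test -e "$dst"; then
        echo "destination already exists: $dst" >&2
        return 1
    fi
    hdfs dfs -mv "$src" "$dst"
}
```

    The check and the move are two separate calls, so another client could still create the destination in between; treat this as a convenience rather than a lock.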
    

    5. Copying Files/Directories

    To copy files or directories within HDFS, use the `-cp` command.

    hdfs dfs -cp /source/path /destination/path 
    

    6. Displaying the Content of a File

    You can display the contents of a file in HDFS using the `-cat` command.

    hdfs dfs -cat /path/to/file 
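
    Files in HDFS are often very large, so printing a whole file to the terminal is rarely useful. A common approach is to pipe `-cat` into `head`. In this sketch, `preview` is a hypothetical helper name and the default of 10 lines is an arbitrary choice.

```shell
#!/bin/sh
# preview (hypothetical helper): show the first N lines of an HDFS file.
# $1 is the HDFS path, $2 the optional line count (defaults to 10).
preview() {
    hdfs dfs -cat "$1" | head -n "${2:-10}"
}
```

    For example, `preview /path/to/file 5` prints only the first five lines. To look at the end of a file instead, the shell also offers `hdfs dfs -tail`.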
    

    7. Copying Files to HDFS

    To copy files from the local filesystem to HDFS, use the `-put` or `-copyFromLocal` command.

    hdfs dfs -put localfile /path/in/hdfs 
    
    hdfs dfs -copyFromLocal localfile /path/in/hdfs 
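
    By default `-put` refuses to overwrite a file that already exists in HDFS; the `-f` option allows the overwrite. `-put` also accepts several local files at once when the destination is a directory. A sketch that combines both, with `upload_all` as a hypothetical helper name:

```shell
#!/bin/sh
# upload_all (hypothetical helper): make sure the destination directory
# exists, then upload every given local file, overwriting old copies.
# Usage: upload_all /hdfs/dest/dir local1 [local2 ...]
upload_all() {
    dest=$1
    shift
    hdfs dfs -mkdir -p "$dest" && hdfs dfs -put -f "$@" "$dest"
}
```

    Whether silently overwriting is acceptable depends on your workflow; drop `-f` if an existing file should cause an error instead.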
    

    8. Copying Files from HDFS

    To copy files from HDFS to the local filesystem, use the `-get` or `-copyToLocal` command.

    hdfs dfs -get /path/in/hdfs localfile 
    
    hdfs dfs -copyToLocal /path/in/hdfs localfile 
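
    After downloading a file you may want to confirm that the local copy matches what is stored in HDFS. One simple way, sketched below, is to compare the `cksum` of the local file with the `cksum` of the bytes streamed by `-cat`. The name `verify_copy` is hypothetical, and since the whole file is read again from the cluster this is best suited to files of modest size.

```shell
#!/bin/sh
# verify_copy (hypothetical helper): succeed (exit status 0) when the
# local file and the HDFS file have the same checksum and length.
# $1 is the local file, $2 the HDFS path.
verify_copy() {
    local_sum=$(cksum < "$1")
    remote_sum=$(hdfs dfs -cat "$2" | cksum)
    [ "$local_sum" = "$remote_sum" ]
}
```

    A typical use would be `verify_copy localfile /path/in/hdfs && echo "copy verified"`.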
    

    9. File/Directory Permissions

    HDFS commands for file or directory permissions mirror the chmod, chown, and chgrp commands in UNIX.

    hdfs dfs -chmod 755 /path/to/file 
    
    hdfs dfs -chown user:group /path/to/file 
    
    hdfs dfs -chgrp group /path/to/file 
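
    The octal mode passed to `-chmod` follows the UNIX convention: one digit each for owner, group and others, where 4 means read, 2 means write and 1 means execute. The following plain shell function, which does not touch HDFS at all, spells a mode out in the rwx notation that `-ls` displays. `mode_to_rwx` is a hypothetical helper written only for this explanation.

```shell
#!/bin/sh
# mode_to_rwx (hypothetical helper): translate an octal mode such as
# 755 into its rwx form, one digit at a time.
mode_to_rwx() {
    rwx=""
    # sed puts a space after every character so the loop sees one digit per word.
    for digit in $(printf '%s' "$1" | sed 's/./& /g'); do
        case "$digit" in
            0) rwx="${rwx}---" ;;
            1) rwx="${rwx}--x" ;;
            2) rwx="${rwx}-w-" ;;
            3) rwx="${rwx}-wx" ;;
            4) rwx="${rwx}r--" ;;
            5) rwx="${rwx}r-x" ;;
            6) rwx="${rwx}rw-" ;;
            7) rwx="${rwx}rwx" ;;
            *) echo "not an octal digit: $digit" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$rwx"
}

mode_to_rwx 755   # prints rwxr-xr-x
```

    So `hdfs dfs -chmod 755` gives the owner full access and everyone else read and execute access. To apply a permission change to a whole directory tree, `-chmod`, `-chown` and `-chgrp` all accept the `-R` option.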
    

    10. Checking Disk Usage

    The `-du` command displays the size of each file and directory under a path. Adding `-s` prints a single summary total instead; the older `-dus` form is deprecated in favor of `-du -s`.

    hdfs dfs -du /path/to/directory 
    
    hdfs dfs -du -s /path/to/directory 
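
    The first column of `-du` output is the size in bytes, which makes it easy to post-process. As a sketch, the function below adds up that column with awk; `total_bytes` is a hypothetical helper name, and the approach assumes the size stays in the first column, as it does in the Hadoop releases we are aware of.

```shell
#!/bin/sh
# total_bytes (hypothetical helper): sum the first column (size in
# bytes) of `hdfs dfs -du` for the entries directly under a path.
total_bytes() {
    # %.0f avoids integer-range limits that some awk builds apply to %d.
    hdfs dfs -du "$1" | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}
```

    For everyday use, `hdfs dfs -du -s -h /path/to/directory` gives the same total in human-readable units without any scripting.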
    

    Conclusion

    Mastering HDFS is a crucial part of becoming proficient in Hadoop. It gives you the skills to effectively manage and manipulate large amounts of data in a distributed environment. By understanding and practicing the commands discussed above, you are taking a big step forward in mastering Hadoop and the HDFS file system. Remember, practice is key to proficiency, so don’t hesitate to get hands-on with HDFS commands.



    © 2023 Tecadmin.net. All Rights Reserved | Terms  | Privacy Policy