
HDFS File System Commands for Big Data Analytics in MCA34 at RIT

A laboratory manual for the MCA34: Big Data Analytics course at RIT's Department of Master of Computer Applications. It covers various HDFS shell commands, including help, usage, ls, mkdir, copyFromLocal, put, df, expunge, mv, rm, tail, test, count, and setrep. These commands are essential for managing files in HDFS and are similar to UNIX file system commands.

Typology: Summaries

2022/2023

Uploaded on 12/26/2022 by 1MS21MC010

MCA34: Big Data Analytics
RAMAIAH INSTITUTE OF TECHNOLOGY
(Autonomous Institute, Affiliated to VTU)
MSR NAGAR, MSRIT POST, BANGALORE 560 054
DEPARTMENT OF MASTER OF COMPUTER APPLICATIONS
MCA34: Big Data Analytics
LABORATORY MANUAL
IIIrd Semester Computer Applications
Batch: 2021-2023



1) help HDFS Shell Command
Syntax: $ hadoop fs -help
The help shell command lists all the available Hadoop commands and how to use them.
Variation: $ hadoop fs -help ls
Using help with a specific command prints the usage information for that command along with its options.

2) usage HDFS Shell Command
$ hadoop fs -usage ls
The usage command gives all the options that can be used with a particular HDFS command.

3) ls HDFS Shell Command
Syntax: $ hadoop fs -ls /
Lists all the available files and subdirectories under the given directory; with /, it returns all the files and subdirectories present under the root directory.

4) mkdir HDFS Shell Command
Creates a new directory in HDFS at the given location.
$ hadoop fs -mkdir /USN/D1
The command above creates a new directory D1 under your USN directory. Create another directory D2 under the USN in the same way.
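Since these HDFS shell commands mirror their Unix counterparts, the mkdir/ls flow above can be sketched on a local filesystem without a cluster. The USN value below is illustrative, not part of the manual:

```shell
# Local-filesystem sketch of the mkdir/ls flow above; the HDFS
# equivalents need a running cluster, so Unix analogs are used here.
USN="$(mktemp -d)/1MS21MC010"    # illustrative user directory
mkdir -p "$USN/D1" "$USN/D2"     # analog of: hadoop fs -mkdir /USN/D1
ls "$USN"                        # analog of: hadoop fs -ls /USN
```

On a real cluster the same two steps are `hadoop fs -mkdir` followed by `hadoop fs -ls` against HDFS paths.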

10) expunge
Empties the trash by deleting all the files and directories in it.
Example: $ hadoop fs -expunge

11) cat
Similar to the cat command in Unix; displays the contents of a file.
Example: $ hadoop fs -cat /USN/D1/Sample1.txt

12) cp
Copies files from one HDFS location to another HDFS location.
Example: $ hadoop fs -cp /USN/D1/sample.txt /USN/D

13) mv
Moves files from one HDFS location to another HDFS location.
Example: $ hadoop fs -mv /USN/D1/Sample1.txt /USN/D

14) rm
Removes the file or directory from the given HDFS location.
Example: $ hadoop fs -rm -r /USN/D
With -r, deletes the directory and its contents from the HDFS location recursively.

15) tail
Shows the last kilobyte of the file on stdout.
Example: $ hadoop fs -tail /USN/D1/sample1.txt
Example: $ hadoop fs -tail -f /USN/D1/sample1.txt
With the -f option, tail keeps running and shows data appended to the file as it grows.
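The move and recursive delete in items 13-14 behave like their Unix counterparts, so they can be sketched locally (paths are illustrative):

```shell
# Sketch of hadoop fs -mv and hadoop fs -rm -r using local analogs
d="$(mktemp -d)"
mkdir -p "$d/D1"
echo "sample" > "$d/D1/Sample1.txt"
mv "$d/D1/Sample1.txt" "$d/"   # analog of: hadoop fs -mv
rm -r "$d/D1"                  # analog of: hadoop fs -rm -r (recursive)
```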

16) copyToLocal
Copies files to the local filesystem. Similar to the hadoop fs -get command, except that the destination must be a local file reference.
Example: $ hadoop fs -copyToLocal /USN/D1/Sample1.txt /home/sk/USN/

17) get
Downloads (copies) files from HDFS to the local filesystem.
Example: $ hadoop fs -get /USN/D1/Sample1.txt /home/sk/USN/

18) touchz
Creates an empty file at the specified location.
Example: $ hadoop fs -touchz /USN/D1/Sample2.txt
Creates a new empty file Sample2.txt at the given HDFS path.

19) test
Used for file test operations. Options:
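A local sketch of the get/copyToLocal and touchz semantics from items 16-18, using Unix analogs since the HDFS commands need a cluster (all paths illustrative):

```shell
# cp stands in for get/copyToLocal; a truncating redirect stands in for touchz
src="$(mktemp -d)"; dst="$(mktemp -d)"
echo "hello" > "$src/Sample1.txt"
cp "$src/Sample1.txt" "$dst/"   # analog of: hadoop fs -get <src> <localdst>
: > "$src/Sample2.txt"          # analog of: hadoop fs -touchz (zero-byte file)
```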

  • -e checks whether the file exists; returns 0 if true.
  • -z checks whether the file is zero length; returns 0 if true.
  • -d checks whether the path is a directory; returns 0 if true.

Example: $ hadoop fs -test -e /USN/D1/sample1.txt

20) count
Counts the number of directories, the number of files, and the bytes under the paths that match the specified file pattern.
Example: $ hadoop fs -count /USN/D
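The exit-code convention of hadoop fs -test is the same one used by the local test(1) command, so it can be sketched without a cluster:

```shell
# hadoop fs -test returns 0 when the check passes; local test(1) does too
f="$(mktemp)"                 # an existing, empty file
test -e "$f"; e=$?            # analog of: hadoop fs -test -e <path>
test -z "$(cat "$f")"; z=$?   # analog of -z: is the file zero length?
echo "exists=$e zerolen=$z"   # prints exists=0 zerolen=0
```

In a script against a real cluster one would branch the same way, e.g. `if hadoop fs -test -e /path; then ...; fi`.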

options:

  • -d lists directories as plain files
  • -h formats file sizes in a human-readable manner rather than as a raw number of bytes
  • -R recursively lists the contents of directories

Syntax: $ hadoop fs -ls [-d] [-h] [-R] <path>
Example:
$ hadoop fs -ls /
$ hadoop fs -lsr /
The command above matches the specified file pattern, and directory entries are of the form:
permissions - userId groupId sizeOfDirectory(in bytes) modificationDate(yyyy-MM-dd HH:mm) directoryName

3. put: This command copies files from the local file system to the HDFS filesystem. It is similar to the copyFromLocal command. It will not work if the file already exists unless the -f flag is given, in which case the destination is overwritten before the copy.
Option:
  • -p preserves the access and modification times, ownership, and mode

Syntax: $ hadoop fs -put [-f] [-p] <localsrc> ... <dst>
Example: $ hadoop fs -put sample.txt /user/data/

4. get: Copies files from the HDFS file system to the local file system; the opposite of the put command.
Syntax: $ hadoop fs -get [-f] [-p] <src> ... <localdst>
Example: $ hadoop fs -get /user/data/sample.txt workspace/

5. cat: Similar to the UNIX cat command; displays the contents of a file on the console.
Example: $ hadoop fs -cat /user/data/sampletext.txt

6. cp: Similar to the UNIX cp command; copies files from one directory to another within the HDFS file system.
Example:
$ hadoop fs -cp /user/data/sample1.txt /user/hadoop
$ hadoop fs -cp /user/data/sample2.txt /user/test/in

7. mv: Similar to the UNIX mv command; moves a file from one directory to another within the HDFS file system.
Example: $ hadoop fs -mv /user/hadoop/sample1.txt /user/text/
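Items 5-6 can be sketched locally with the Unix commands they are named after (the file name and contents below are illustrative):

```shell
# Local analogs of cat/cp from items 5-6
d="$(mktemp -d)"
echo "sample data" > "$d/sample1.txt"
cat "$d/sample1.txt"                  # analog of: hadoop fs -cat
cp "$d/sample1.txt" "$d/sample2.txt"  # analog of: hadoop fs -cp
```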

Syntax: $ hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Example: $ hadoop fs -setrep -R 3 /user/hadoop/

11. touchz: Creates a file of zero bytes in the HDFS filesystem.
Example: $ hadoop fs -touchz URI

12. test: Tests whether an HDFS path exists, whether it is zero length, or whether it is a directory. options:

  • -d checks whether the path is a directory; returns 0 if it is
  • -e checks whether the path exists; returns 0 if it does
  • -f checks whether the path is a file; returns 0 if it is
  • -s checks whether the file size is greater than 0 bytes; returns 0 if it is
  • -z checks whether the file size is zero bytes; returns 0 if it is, 1 otherwise

Syntax: $ hadoop fs -test -[defsz] <path>
Example: $ hadoop fs -test -e /user/test/test.txt
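The 0-or-1 behaviour of the -z flag can be sketched locally, since test(1) follows the same convention (`! -s` being the local stand-in for -z):

```shell
# -z returns 0 for a zero-byte file and 1 otherwise (local sketch)
p="$(mktemp -d)"
: > "$p/empty.txt"
echo "data" > "$p/data.txt"
test ! -s "$p/empty.txt"; echo "z(empty)=$?"   # prints z(empty)=0
test ! -s "$p/data.txt";  echo "z(data)=$?"    # prints z(data)=1
```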

13. expunge: Empties the trash in an HDFS system.
Syntax: $ hadoop fs -expunge
Example:
user@ubuntu1:~$ hadoop fs -expunge
17/10/15 10:15:22 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.

14. appendToFile: Appends the contents of all the given local files to the provided destination file on the HDFS file system. The destination file is created if it does not already exist.
Syntax: $ hadoop fs -appendToFile <localsrc> ... <dst>
Example:
user@ubuntu1:~$ hadoop fs -appendToFile derby.log data.tsv /in/appendfile
user@ubuntu1:~$ hadoop fs -cat /in/appendfile

15. tail: Shows the last 1 KB of the file. option:

  • -f shows appended data as the file grows

Syntax: $ hadoop fs -tail [-f] <file>
Example: user@tri03ws-386:~$ hadoop fs -tail /in/appendfile
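The append-then-tail flow of items 14-15 can be sketched locally: `cat >>` stands in for appendToFile and `tail -c 1024` for the last-kilobyte view (the derby.log/data.tsv names echo the example above but the contents here are made up):

```shell
# Local sketch: append two sources to a destination, then view its tail
d="$(mktemp -d)"
seq 1 500 > "$d/derby.log"
seq 501 1000 > "$d/data.tsv"
cat "$d/derby.log" "$d/data.tsv" >> "$d/appendfile"  # analog of -appendToFile
tail -c 1024 "$d/appendfile" | tail -n 3             # analog of -tail (last 1 KB)
```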

19. du: Shows the amount of space, in bytes, used by the files that match the specified file pattern. Without the -s option, it shows size summaries only one level deep in the directory. Options:

  • -s shows the total (summary) size of everything matching the pattern, rather than the size of each individual file
  • -h formats file sizes in a human-readable manner rather than as a raw number of bytes

Syntax: $ hadoop fs -du [-s] [-h] <path>

20. count: Counts the number of directories, files, and bytes under the paths that match the provided file pattern.
Syntax: $ hadoop fs -count [-q] <paths>
Output: the output columns are:
DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME
and, with -q:
QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME

21. chgrp: Changes the group of a file or a path.
Syntax: $ hadoop fs -chgrp [-R] <group> <path>

22. chmod: Changes the permissions of a file; works like the Linux shell command chmod, with a few exceptions.

Option:

  • -R modifies the files recursively; it is the only option currently supported. The mode is the same as the mode used for the shell chmod command: the recognized letters are 'rwxXt'. An octal mode is specified in 3 or 4 digits; with 4 digits, the first may be 1 or 0 to turn the sticky bit on or off, respectively.

Syntax: $ hadoop fs -chmod [-R] <mode> PATH

23. chown: Changes the owner and group of a file. Similar to the shell's chown command, with a few exceptions. If only the owner or the group is specified, then only that part is modified. Owner and group names may only consist of the characters [-_./@a-zA-Z0-9] and are case sensitive. It is better to avoid using '.' to separate user name and group the way Linux allows: if user names contain dots and you are using a local file system, you might see surprising results, since the shell command chown is used for the local files.
Option:
  • -R modifies the files recursively; it is the only option currently supported

Syntax: $ hadoop fs -chown [-R] [OWNER][:[GROUP]] PATH

Now that we have covered the Hadoop distributed file system (HDFS) commands, we will look at frequently used Hadoop administration commands.

Hadoop admin commands

24. balancer: Runs the cluster-balancing utility.
Syntax: hadoop balancer [-threshold <threshold>]
Example: hadoop balancer -threshold 20
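Looping back to items 21-23: hadoop fs -chmod accepts the same symbolic and octal modes as the local chmod, so the mode syntax can be tried without a cluster (the file is a throwaway temp file):

```shell
# Symbolic and octal modes as accepted by chmod (local sketch of item 22)
f="$(mktemp)"
chmod 644 "$f"   # octal mode: rw-r--r--
chmod u+x "$f"   # symbolic mode using the recognized 'rwxXt' letters
test -x "$f" && echo "owner-executable"
```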
For the secondarynamenode command, the options are:

  • -checkpoint [force]: a checkpoint is performed; with force, it is performed regardless of the EditLog size
  • -geteditsize: the EditLog size is displayed

Syntax: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]
Example: hadoop secondarynamenode -geteditsize

28. tasktracker: Runs a MapReduce TaskTracker node.
Syntax: hadoop tasktracker
Example: hadoop tasktracker

29. jobtracker: Runs the MapReduce JobTracker node, which coordinates the data processing system for Hadoop.
Option:
  • -dumpConfiguration: the configuration used by the JobTracker, along with the queue configuration, is written to standard output in JSON format

Syntax: hadoop jobtracker [-dumpConfiguration]
Example: hadoop jobtracker -dumpConfiguration

30. daemonlog: Gets or sets the log level for each daemon. The change takes effect only when the daemon restarts.
Syntax:
hadoop daemonlog -getlevel <host:port> <name>
hadoop daemonlog -setlevel <host:port> <name> <level>