
HDFS File System Commands for Big Data Analytics in MCA34 at RIT

A laboratory manual for the MCA34: Big Data Analytics course at RIT's Department of Master of Computer Applications. It covers various HDFS shell commands, including help, usage, ls, mkdir, copyFromLocal, put, df, expunge, mv, rm, tail, test, count, and setrep. These commands are essential for managing files in HDFS and are similar to UNIX file system commands.

Typology: Summaries

2022/2023

Uploaded on 12/26/2022 by 1MS21MC010

MCA34: Big Data Analytics
RAMAIAH INSTITUTE OF TECHNOLOGY
(Autonomous Institute, Affiliated to VTU)
MSR NAGAR, MSRIT POST, BANGALORE 560 054
DEPARTMENT OF MASTER OF COMPUTER APPLICATIONS
MCA34: Big Data Analytics
LABORATORY MANUAL
IIIrd Semester Computer Applications
Batch: 2021-2023



1) help HDFS Shell Command
Syntax: $ hadoop fs -help
The help shell command lists all the available Hadoop commands and how to use them.
Variation: $ hadoop fs -help ls
Using help with a specific command prints the usage information for that command along with its options.

2) usage HDFS Shell Command
$ hadoop fs -usage ls
The usage command gives all the options that can be used with a particular HDFS command.

3) ls HDFS Shell Command
Syntax: $ hadoop fs -ls /
Lists all the available files and subdirectories under the given directory; with /, it returns all the files and subdirectories present under the root directory.

4) mkdir HDFS Shell Command
Creates a new directory in HDFS at the given location.
$ hadoop fs -mkdir /USN/D1
The command above creates a new directory D1 under your USN directory. Create another directory D2 under the USN in the same way.
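Since these HDFS shell commands mirror their Unix counterparts, the mkdir/ls flow above can be sketched on a local filesystem without a cluster. The USN value below is illustrative, not part of the manual:

```shell
# Local-filesystem sketch of the mkdir/ls flow above; the HDFS
# equivalents need a running cluster, so Unix analogs are used here.
USN="$(mktemp -d)/1MS21MC010"    # illustrative user directory
mkdir -p "$USN/D1" "$USN/D2"     # analog of: hadoop fs -mkdir /USN/D1
ls "$USN"                        # analog of: hadoop fs -ls /USN
```

On a real cluster the same two steps are `hadoop fs -mkdir` followed by `hadoop fs -ls` against HDFS paths.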

10) expunge
Empties the trash by deleting all the files and directories in it.
Example: $ hadoop fs -expunge

11) cat
Similar to the cat command in Unix; displays the contents of a file.
Example: $ hadoop fs -cat /USN/D1/Sample1.txt

12) cp
Copies files from one HDFS location to another HDFS location.
Example: $ hadoop fs -cp /USN/D1/sample.txt /USN/D

13) mv
Moves files from one HDFS location to another HDFS location.
Example: $ hadoop fs -mv /USN/D1/Sample1.txt /USN/D

14) rm
Removes the file or directory from the given HDFS location.
Example: $ hadoop fs -rm -r /USN/D
With -r, deletes the directory and its contents from the HDFS location recursively.

15) tail
Shows the last kilobyte of the file on stdout.
Example: $ hadoop fs -tail /USN/D1/sample1.txt
Example: $ hadoop fs -tail -f /USN/D1/sample1.txt
With the -f option, tail keeps running and shows data appended to the file as it grows.
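The move and recursive delete in items 13-14 behave like their Unix counterparts, so they can be sketched locally (paths are illustrative):

```shell
# Sketch of hadoop fs -mv and hadoop fs -rm -r using local analogs
d="$(mktemp -d)"
mkdir -p "$d/D1"
echo "sample" > "$d/D1/Sample1.txt"
mv "$d/D1/Sample1.txt" "$d/"   # analog of: hadoop fs -mv
rm -r "$d/D1"                  # analog of: hadoop fs -rm -r (recursive)
```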

16) copyToLocal
Copies files to the local filesystem. Similar to the hadoop fs -get command, except that the destination must be a local file reference.
Example: $ hadoop fs -copyToLocal /USN/D1/Sample1.txt /home/sk/USN/

17) get
Downloads (copies) files from HDFS to the local filesystem.
Example: $ hadoop fs -get /USN/D1/Sample1.txt /home/sk/USN/

18) touchz
Creates an empty file at the specified location.
Example: $ hadoop fs -touchz /USN/D1/Sample2.txt
Creates a new empty file Sample2.txt at the given HDFS path.

19) test
Used for file test operations. Options:
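A local sketch of the get/copyToLocal and touchz semantics from items 16-18, using Unix analogs since the HDFS commands need a cluster (all paths illustrative):

```shell
# cp stands in for get/copyToLocal; a truncating redirect stands in for touchz
src="$(mktemp -d)"; dst="$(mktemp -d)"
echo "hello" > "$src/Sample1.txt"
cp "$src/Sample1.txt" "$dst/"   # analog of: hadoop fs -get <src> <localdst>
: > "$src/Sample2.txt"          # analog of: hadoop fs -touchz (zero-byte file)
```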

  • -e checks whether the file exists; returns 0 if true.
  • -z checks whether the file is zero length; returns 0 if true.
  • -d checks whether the path is a directory; returns 0 if true.

Example: $ hadoop fs -test -e /USN/D1/sample1.txt

20) count
Counts the number of directories, the number of files, and the bytes under the paths that match the specified file pattern.
Example: $ hadoop fs -count /USN/D
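The exit-code convention of hadoop fs -test is the same one used by the local test(1) command, so it can be sketched without a cluster:

```shell
# hadoop fs -test returns 0 when the check passes; local test(1) does too
f="$(mktemp)"                 # an existing, empty file
test -e "$f"; e=$?            # analog of: hadoop fs -test -e <path>
test -z "$(cat "$f")"; z=$?   # analog of -z: is the file zero length?
echo "exists=$e zerolen=$z"   # prints exists=0 zerolen=0
```

In a script against a real cluster one would branch the same way, e.g. `if hadoop fs -test -e /path; then ...; fi`.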

options:

  • -d lists directories as plain files
  • -h formats file sizes in a human-readable manner rather than as a raw number of bytes
  • -R recursively lists the contents of directories

Syntax: $ hadoop fs -ls [-d] [-h] [-R] <path>
Example:
$ hadoop fs -ls /
$ hadoop fs -lsr /
The command above matches the specified file pattern, and directory entries are of the form:
permissions - userId groupId sizeOfDirectory(in bytes) modificationDate(yyyy-MM-dd HH:mm) directoryName

3. put: This command copies files from the local file system to the HDFS filesystem. It is similar to the copyFromLocal command. It will not work if the file already exists unless the -f flag is given, in which case the destination is overwritten before the copy.
Option:
  • -p preserves the access and modification times, ownership, and mode

Syntax: $ hadoop fs -put [-f] [-p] <localsrc> ... <dst>
Example: $ hadoop fs -put sample.txt /user/data/

4. get: Copies files from the HDFS file system to the local file system; the opposite of the put command.
Syntax: $ hadoop fs -get [-f] [-p] <src> ... <localdst>
Example: $ hadoop fs -get /user/data/sample.txt workspace/

5. cat: Similar to the UNIX cat command; displays the contents of a file on the console.
Example: $ hadoop fs -cat /user/data/sampletext.txt

6. cp: Similar to the UNIX cp command; copies files from one directory to another within the HDFS file system.
Example:
$ hadoop fs -cp /user/data/sample1.txt /user/hadoop
$ hadoop fs -cp /user/data/sample2.txt /user/test/in

7. mv: Similar to the UNIX mv command; moves a file from one directory to another within the HDFS file system.
Example: $ hadoop fs -mv /user/hadoop/sample1.txt /user/text/
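Items 5-6 can be sketched locally with the Unix commands they are named after (the file name and contents below are illustrative):

```shell
# Local analogs of cat/cp from items 5-6
d="$(mktemp -d)"
echo "sample data" > "$d/sample1.txt"
cat "$d/sample1.txt"                  # analog of: hadoop fs -cat
cp "$d/sample1.txt" "$d/sample2.txt"  # analog of: hadoop fs -cp
```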

Syntax: $ hadoop fs -setrep [-R] [-w] <numReplicas> <path>
Example: $ hadoop fs -setrep -R 3 /user/hadoop/

11. touchz: Creates a file of zero bytes in the HDFS filesystem.
Example: $ hadoop fs -touchz URI

12. test: Tests whether an HDFS path exists, whether it is zero length, or whether it is a directory. options:

  • -d checks whether the path is a directory; returns 0 if it is
  • -e checks whether the path exists; returns 0 if it does
  • -f checks whether the path is a file; returns 0 if it is
  • -s checks whether the file size is greater than 0 bytes; returns 0 if it is
  • -z checks whether the file size is zero bytes; returns 0 if it is, 1 otherwise

Syntax: $ hadoop fs -test -[defsz] <path>
Example: $ hadoop fs -test -e /user/test/test.txt
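The 0-or-1 behaviour of the -z flag can be sketched locally, since test(1) follows the same convention (`! -s` being the local stand-in for -z):

```shell
# -z returns 0 for a zero-byte file and 1 otherwise (local sketch)
p="$(mktemp -d)"
: > "$p/empty.txt"
echo "data" > "$p/data.txt"
test ! -s "$p/empty.txt"; echo "z(empty)=$?"   # prints z(empty)=0
test ! -s "$p/data.txt";  echo "z(data)=$?"    # prints z(data)=1
```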

13. expunge: Empties the trash in an HDFS system.
Syntax: $ hadoop fs -expunge
Example:
user@ubuntu1:~$ hadoop fs -expunge
17/10/15 10:15:22 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.

14. appendToFile: Appends the contents of all the given local files to the provided destination file on the HDFS file system. The destination file is created if it does not already exist.
Syntax: $ hadoop fs -appendToFile <localsrc> ... <dst>
Example:
user@ubuntu1:~$ hadoop fs -appendToFile derby.log data.tsv /in/appendfile
user@ubuntu1:~$ hadoop fs -cat /in/appendfile

15. tail: Shows the last 1 KB of the file. option:

  • -f shows appended data as the file grows

Syntax: $ hadoop fs -tail [-f] <file>
Example: user@tri03ws-386:~$ hadoop fs -tail /in/appendfile
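The append-then-tail flow of items 14-15 can be sketched locally: `cat >>` stands in for appendToFile and `tail -c 1024` for the last-kilobyte view (the derby.log/data.tsv names echo the example above but the contents here are made up):

```shell
# Local sketch: append two sources to a destination, then view its tail
d="$(mktemp -d)"
seq 1 500 > "$d/derby.log"
seq 501 1000 > "$d/data.tsv"
cat "$d/derby.log" "$d/data.tsv" >> "$d/appendfile"  # analog of -appendToFile
tail -c 1024 "$d/appendfile" | tail -n 3             # analog of -tail (last 1 KB)
```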

19. du: Shows the amount of space, in bytes, used by the files that match the specified file pattern. Without the -s option, it shows size summaries only one level deep in the directory. Options:

  • -s shows the total (summary) size of everything matching the pattern, rather than the size of each individual file
  • -h formats file sizes in a human-readable manner rather than as a raw number of bytes

Syntax: $ hadoop fs -du [-s] [-h] <path>

20. count: Counts the number of directories, files, and bytes under the paths that match the provided file pattern.
Syntax: $ hadoop fs -count [-q] <paths>
Output: the output columns are:
DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME
and, with -q:
QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA DIR_COUNT FILE_COUNT CONTENT_SIZE FILE_NAME

21. chgrp: Changes the group of a file or a path.
Syntax: $ hadoop fs -chgrp [-R] <group> <path>

22. chmod: Changes the permissions of a file; works like the Linux shell command chmod, with a few exceptions.

Option:

  • -R modifies the files recursively; it is the only option currently supported. The mode is the same as the mode used for the shell chmod command: the recognized letters are 'rwxXt'. An octal mode is specified in 3 or 4 digits; with 4 digits, the first may be 1 or 0 to turn the sticky bit on or off, respectively.

Syntax: $ hadoop fs -chmod [-R] <mode> PATH

23. chown: Changes the owner and group of a file. Similar to the shell's chown command, with a few exceptions. If only the owner or the group is specified, then only that part is modified. Owner and group names may only consist of the characters [-_./@a-zA-Z0-9] and are case sensitive. It is better to avoid using '.' to separate user name and group the way Linux allows: if user names contain dots and you are using a local file system, you might see surprising results, since the shell command chown is used for the local files.
Option:
  • -R modifies the files recursively; it is the only option currently supported

Syntax: $ hadoop fs -chown [-R] [OWNER][:[GROUP]] PATH

Now that we have covered the Hadoop distributed file system (HDFS) commands, we will look at frequently used Hadoop administration commands.

Hadoop admin commands

24. balancer: Runs the cluster-balancing utility.
Syntax: hadoop balancer [-threshold <threshold>]
Example: hadoop balancer -threshold 20
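Looping back to items 21-23: hadoop fs -chmod accepts the same symbolic and octal modes as the local chmod, so the mode syntax can be tried without a cluster (the file is a throwaway temp file):

```shell
# Symbolic and octal modes as accepted by chmod (local sketch of item 22)
f="$(mktemp)"
chmod 644 "$f"   # octal mode: rw-r--r--
chmod u+x "$f"   # symbolic mode using the recognized 'rwxXt' letters
test -x "$f" && echo "owner-executable"
```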
For the secondarynamenode command, the options are:

  • -checkpoint [force]: a checkpoint is performed; with force, it is performed regardless of the EditLog size
  • -geteditsize: the EditLog size is displayed

Syntax: hadoop secondarynamenode [-checkpoint [force]] | [-geteditsize]
Example: hadoop secondarynamenode -geteditsize

28. tasktracker: Runs a MapReduce TaskTracker node.
Syntax: hadoop tasktracker
Example: hadoop tasktracker

29. jobtracker: Runs the MapReduce JobTracker node, which coordinates the data processing system for Hadoop.
Option:
  • -dumpConfiguration: the configuration used by the JobTracker, along with the queue configuration, is written to standard output in JSON format

Syntax: hadoop jobtracker [-dumpConfiguration]
Example: hadoop jobtracker -dumpConfiguration

30. daemonlog: Gets or sets the log level for each daemon. The change takes effect only when the daemon restarts.
Syntax:
hadoop daemonlog -getlevel <host:port> <name>
hadoop daemonlog -setlevel <host:port> <name> <level>