Chapter 2

Files and File Processing

Questions and Problems

1.What is a file in LINUX/UNIX?

A file is a sequence of bytes. This means that everything in LINUX/UNIX is a file, including all hardware devices such as keyboard, mouse, disk drive, tape drive, and network interface card.

2.Does LINUX/UNIX support any file types? If so, name them. Does LINUX/UNIX support file extensions?

Yes, LINUX/UNIX supports five file types. They are: ordinary file, directory, device/special file (block special and character special), symbolic link, and named pipe. BSD-compliant LINUX/UNIX systems support an additional file type, called socket. You can determine the type of a file by using the ls -l file_name command and observing the first character of the output line. The following characters are used to represent the file types.

-    ordinary file

d    directory

b    block special file (a block device)

c    character special file (a character device)

l    symbolic link

p    named pipe (FIFO)

s    socket

No, LINUX/UNIX itself does not attach any meaning to file extensions, but some LINUX/UNIX applications support or require them. For example, C compilers expect the .c extension, C++ compilers expect an extension such as .cpp, and Java compilers require the .java extension.
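The type characters can be checked on files you create yourself. The following is a minimal sketch, assuming that mktemp and mkfifo are available. The scratch directory and the file names in it are made up for illustration.

```shell
# Create one file of several types in a scratch directory and report
# the type character that ls -l prints for each of them.
scratch=$(mktemp -d)

touch "$scratch/plain"          # ordinary file
mkdir "$scratch/dir"            # directory
ln -s plain "$scratch/link"     # symbolic link to the ordinary file
mkfifo "$scratch/pipe"          # named pipe (FIFO)

# type_char prints the first character of the ls -ld output line.
type_char() {
    ls -ld "$1" | cut -c1
}

for f in plain dir link pipe; do
    printf '%s  %s\n' "$(type_char "$scratch/$f")" "$f"
done

# Keep the results before the scratch directory is removed.
plain_type=$(type_char "$scratch/plain")
dir_type=$(type_char "$scratch/dir")
link_type=$(type_char "$scratch/link")
pipe_type=$(type_char "$scratch/pipe")

rm -r "$scratch"
```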

3.What are special files in LINUX/UNIX? What are character special and block special files? Run the ls /dev | wc -w command to find the number of special files your system has.

Every input/output hardware device in a LINUX/UNIX based system is represented by a file, called a special file. If the device performs input or output (or both) one character at a time, the corresponding special file is known as a character special file. If the device performs input/output in terms of multiple (blocks of) bytes at a time, the corresponding special file is known as a block special file. As stated in problem 2, you can determine the type of a file by using the ls -l command and observing the first character in an output line.

The execution of the ls /dev | wc -w command on our system displayed 2,360. This means that our system has 2,360 special files.
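Note that ls /dev | wc -w counts every name in /dev, including any subdirectories and symbolic links. A rough sketch of a count limited to the two kinds of special file, using the -type c and -type b tests of the find command, might look like this. The variable names are ours.

```shell
# Count character special and block special files under /dev.
# Error messages from unreadable subdirectories are discarded.
char_count=$(find /dev -type c 2>/dev/null | wc -l)
block_count=$(find /dev -type b 2>/dev/null | wc -l)

echo "character special files: $((char_count))"
echo "block special files:     $((block_count))"
```

The counts differ from system to system, so no sample output is shown.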

4.Draw the hierarchical file structure, similar to the one given in Figure 2.1, for your LINUX/UNIX machine. Show files and directories at the first two levels. Also show where your home directory is, along with files and directories under your home directory.

Our answer is given in Figure 1.2.

5.Give the command that you can use to list the absolute pathname of your home directory.

Each of the following three commands lists the absolute pathname of your home directory.

echo ~

echo $HOME (or echo $home for the TC shell)

cd ; pwd (or just the pwd command right after you log on)

6.What shell are you using? What is the search path for your shell? How did you obtain your answer? What command(s) did you use?

The following session shows the answers to the questions.

$ echo $SHELL

/bin/bash

$ echo $PATH

/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home3/faculty/msarwar/bin

$

7.Write down the line in the /etc/passwd file on your system that contains information about your login. What are your login shell, user ID, home directory, and group ID? Does your system contain the encrypted password in the /etc/passwd or /etc/shadow file?

The line is shown below, followed by the fields that answer each part of the question.

msarwar:x:608:200:Syed Mansoor Sarwar:/home/faculty/msarwar:/bin/bash

Login shell: /bin/bash (Bourne Again shell)

User ID: 608

Home directory: /home/faculty/msarwar

Group ID: 200
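The fields can also be pulled out of the line mechanically. The following is a minimal sketch that splits the sample line on colons. The variable names are ours, and on a live system the line would come from /etc/passwd instead of a string.

```shell
# Split a passwd-format line into its seven colon-separated fields.
line='msarwar:x:608:200:Syed Mansoor Sarwar:/home/faculty/msarwar:/bin/bash'

# IFS=: makes read split on colons for this one command only.
IFS=: read -r login passwd uid gid gecos home shell <<EOF
$line
EOF

echo "Login shell:    $shell"
echo "User ID:        $uid"
echo "Home directory: $home"
echo "Group ID:       $gid"
```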

Our system contains the encrypted passwords in the /etc/shadow file. If your system is on a network with an NIS database, you can display your line in the NIS yppasswd database by using the ypcat passwd command as shown below. Note that the second field in the output is your encrypted password.

$ ypcat passwd | grep msarwar

msarwar:7PH6Ep.yDf4Wk:608:200:Syed Mansoor Sarwar:/home/faculty/msarwar:/bin/bash

$

8.Create a directory called memos in your home directory. Go into this directory and create a file memo.james by using one of the editors we discussed in Appendix A. Give three pathnames for this file.

The command for creating the memos directory is mkdir ~/memos. If your system is not BSD compliant, use the mkdir $HOME/memos command. Then use the cd ~/memos command to make memos your current directory. Use the editor of your choice to create the memo.james file in this directory.

Pathnames for this file are given below. The first is the absolute pathname. The next two use ~ and $HOME, which the shell expands to your home directory, and the last is the pathname relative to the memos directory.

/home/faculty/msarwar/memos/memo.james

~/memos/memo.james

$HOME/memos/memo.james (or $home/memos/memo.james)

memo.james or ./memo.james (if memos is your current directory)

9.Give a command line for creating a subdirectory personal under the memos directory that you created in problem 8.

mkdir ~/memos/personal

10.Make a copy of the file memo.james and put it in your home directory. Name the copied file temp.memo. Give two commands for accomplishing this task.

cp ~/memos/memo.james ~/temp.memo

cp $home/memos/memo.james $home/temp.memo

cp $HOME/memos/memo.james $HOME/temp.memo

cp memos/memo.james temp.memo (if you are in your home directory)

11.Give the command for deleting the memos directory. How do you know that the directory has been deleted?

rmdir ~/memos (if the memos directory is empty)

rm -r ~/memos

I will make my home directory my current directory and run the ls -l command to make sure that there is no directory called memos in my current (home) directory. You can also run the ls -l ~ command to perform the same task.

12.Why does a shell process terminate when you press <Ctrl-D> at the beginning of a new line?

The purpose of a shell process is to read commands from its standard input (the keyboard by default) and try to execute them. <Ctrl-D> is the end-of-file character in LINUX/UNIX. When you press <Ctrl-D> at the beginning of a new line, the shell sees end-of-file on its standard input. Because the file it reads commands from has ended, the shell terminates.
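The same behavior can be observed without a keyboard. In the following sketch, a child shell reads its commands from a pipe. It runs each command line and exits when the pipe reaches end-of-file, just as an interactive shell does on <Ctrl-D>. The two echo commands are arbitrary examples.

```shell
# Feed two command lines to a child shell through a pipe.
# The child exits on its own once its standard input is exhausted.
output=$(printf 'echo first\necho second\n' | sh)
status=$?

echo "$output"
echo "child shell exit status: $status"
```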

13.Give a command line to display types of all the files in your ~/linux directory that start with the word chapter, are followed by a digit 1, 2, 6, 8, or 9, and end with .eps or .prn.

ls -l ~/linux/chapter[12689].{eps,prn}
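Inside a bracket expression every character, including a comma, is matched literally, so commas should not be used as separators there. The following is a minimal sketch of the difference, using made-up file names in a scratch directory.

```shell
# Compare a bracket expression written with commas to one without.
LC_ALL=C; export LC_ALL         # fixed locale so results are predictable
scratch=$(mktemp -d)
touch "$scratch/chapter1.eps" "$scratch/chapter2.eps" \
      "$scratch/chapter3.eps" "$scratch/chapter,.eps"

# [1,2] matches 1, 2, or a literal comma, so chapter,.eps is included.
with_commas=$(cd "$scratch" && echo chapter[1,2].eps)

# [12] matches only 1 or 2.
without_commas=$(cd "$scratch" && echo chapter[12].eps)

echo "with commas:    $with_commas"
echo "without commas: $without_commas"

rm -r "$scratch"
```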

14.Give a command line to display the types of all the files in the personal directory in your home directory that do not start with letters a, k, G, or Q and the third letter in their name is not a digit and not a letter (uppercase or lowercase).

The following command performs the given task under Bash. Replace the ! character with ^ if the command is to be executed under the TC shell.

ls -ld ~/personal/[!akGQ]?[!0-9A-Za-z]*
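To see which names such a pattern selects, it can be tried on a handful of test files. The sketch below uses file names we made up so that each one exercises a different part of the pattern.

```shell
# Try the pattern [!akGQ]?[!0-9A-Za-z]* on a few made-up file names.
LC_ALL=C; export LC_ALL         # fixed locale so the ranges are ASCII
scratch=$(mktemp -d)
touch "$scratch/be_x"   # matches: starts with b, third character is _
touch "$scratch/xy-z"   # matches: starts with x, third character is -
touch "$scratch/apple"  # rejected: starts with a
touch "$scratch/Gx_1"   # rejected: starts with G
touch "$scratch/bc1d"   # rejected: third character is a digit
touch "$scratch/bcde"   # rejected: third character is a letter

matches=$(cd "$scratch" && echo [!akGQ]?[!0-9A-Za-z]*)
echo "$matches"

rm -r "$scratch"
```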

15.Give a command line for viewing the sizes (in lines and bytes) of all the files in your present working directory?

wc * or wc ./*

16.List 10 operations that you can perform on LINUX files.

  1. Displaying files
  2. Copying files
  3. Moving files
  4. Comparing files
  5. Removing files
  6. Printing files
  7. Determining file type
  8. Determining file size
  9. Finding differences between files
  10. Displaying repeated or unique lines in a file

17.Give a command line for viewing the sizes (in lines and bytes) of all the files in your present working directory?

Same as 15. Problem repeated by mistake.

18.View the /usr/include/bits/socket.h file with the more (or less) command. What are the values of symbolic constants SOCK_STREAM and SOCK_DGRAM? What are the values of PF_INET and AF_INET?

The values of the given symbolic constants are

SOCK_STREAM = 1

SOCK_DGRAM = 2

PF_INET = AF_INET = 2

PF_INET6 = AF_INET6 = 10

19.What does the tail -10r ../letter.John command do?

The command displays in reverse order the last 10 lines of the ../letter.John file.

20.Give a command for viewing the size of your home directory? Give a command for displaying the sizes of all the files in your home directory.

  • Execute the ls -ld ~, ls -ld $HOME, or ls -ld $home command and look at the file size field to determine the size of your home directory.
  • Execute the wc ~/*, wc $HOME/*, or wc $home/* command to display the sizes of all the files in your home directory.

21.Give a command line for displaying all the lines in the Students file, starting with line number 25.

tail -n +25 Students (older systems also accept the obsolete form tail +25 Students)
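The effect of a starting line number can be checked on generated input. The following sketch, in which the 30 numbered lines stand in for the Students file, uses the -n +25 form of the option.

```shell
# Generate 30 numbered lines, then keep everything from line 25 onward.
numbers=$(awk 'BEGIN { for (i = 1; i <= 30; i++) print i }')
from_25=$(printf '%s\n' "$numbers" | tail -n +25)

echo "$from_25"     # prints the six lines 25 through 30
```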

22.Give a command line for copying all the files and directories under a directory courses in your home directory. Assume that you are in your home directory. Give another command to accomplish the above task, assuming that you are not in your home directory.

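  • cp -r courses/* . (This copies everything under courses into the current directory, which is your home directory.)
  • cp -r ~/courses/* ~ (If your system is not BSD compliant, replace ~ with $HOME.)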

23.What do the following commands do?

a) rm -f ~/personal/memo*.doc

Force the removal of all the files in the ~/personal directory that have the .doc extension and whose names start with the string “memo”.

b) rm -f ~/linuxbook/final/ch??.prn

Force the removal of all the files in the ~/linuxbook/final directory that have the .prn extension and whose names before the extension are four characters long, starting with the string “ch”.

c) rm -f ~/linuxbook/final/*.o

Force the removal of all the files in the ~/linuxbook/final directory that have the .o extension.

d) rm -f ~/courses/ece446/lab[1-6].[cC]

Force the removal of all the files in the ~/courses/ece446 directory that have the .c or .C extension (i.e., C or C++ source files) and whose names before the extension are four characters long, starting with the string “lab” and ending with a digit from 1 through 6.

24.Give a command line for moving files lab1, lab2, and lab3 from ~/courses/ece345 directory to a newlabs.ece345 directory in your home directory. If a file already exists in the destination directory, the command should prompt the user for confirmation.

mv -i ~/courses/ece345/lab[123] ~/newlabs.ece345

25.Give a command to display those lines in the ~/personal/Phones file that are not repeated.

uniq -u ~/personal/Phones
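Keep in mind that uniq compares adjacent lines only, so repeated lines that are separated by other lines are not detected unless the input is sorted first. The following sketch illustrates the difference with made-up phone entries.

```shell
# Made-up sample data: the alice line is repeated, but not on adjacent lines.
phones='alice 555-0101
bob 555-0102
alice 555-0101
carol 555-0103'

# Without sorting, no two adjacent lines are equal, so all four lines print.
unsorted_result=$(printf '%s\n' "$phones" | uniq -u)

# After sorting, the two alice lines are adjacent and both are dropped.
sorted_result=$(printf '%s\n' "$phones" | sort | uniq -u)

echo "uniq -u alone:"
echo "$unsorted_result"
echo "sort | uniq -u:"
echo "$sorted_result"
```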

26.You have a file in your home directory called tryit&. Rename this file. What command did you use?

Since & is a shell metacharacter, we need to escape it in the mv command with the \ character.

mv tryit\& tryit_new
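Quoting the name works as well as escaping the single character. The following is a small sketch, run in a scratch directory so that it does not touch your home directory. The directory is an assumption of the example.

```shell
# Rename a file whose name contains the & metacharacter.
scratch=$(mktemp -d)
touch "$scratch/tryit&"

# Backslash escape, as in the answer above.
mv "$scratch"/tryit\& "$scratch/tryit_new"

# Rename it back, then repeat the rename with single quotes instead.
mv "$scratch/tryit_new" "$scratch/tryit&"
mv "$scratch"/'tryit&' "$scratch/tryit_new"

listing=$(ls "$scratch")
echo "$listing"

rm -r "$scratch"
```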

27.Give a command line for displaying attributes of all the files starting with a string prog, followed by zero or more characters and ending with a string .c in the courses/ece345 directory in your home directory.

ls -l ~/courses/ece345/prog*.c

28.Refer to problem 27. Give a command line if file names have two English letters between prog and .c. Can you give another command line to accomplish the same task?

ls -l ~/courses/ece345/prog[[:alpha:]][[:alpha:]].c

ls -l ~/courses/ece345/prog[a-zA-Z][a-zA-Z].c
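Care is needed when writing letter ranges. A range such as [A-z] or [aA-zZ] is not equivalent to [a-zA-Z], because in the ASCII ordering several punctuation characters, including the underscore, lie between Z and a. The following sketch shows the difference under the C locale, with made-up file names.

```shell
# Compare a range that spans A through z with two separate letter ranges.
LC_ALL=C; export LC_ALL         # fixed locale so the ranges are ASCII
scratch=$(mktemp -d)
touch "$scratch/progab.c" "$scratch/prog_b.c" "$scratch/prog12.c"

# A-z covers the underscore, so prog_b.c is matched as well.
loose=$(cd "$scratch" && echo prog[aA-zZ][aA-zZ].c)

# a-zA-Z covers letters only.
strict=$(cd "$scratch" && echo prog[a-zA-Z][a-zA-Z].c)

echo "loose:  $loose"
echo "strict: $strict"

rm -r "$scratch"
```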

29.Give a command line for displaying files called got|cha and M*A*S*H screenful at a time.

Escape the special meaning of the | and * characters by using the \ character.

more got\|cha M\*A\*S\*H
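An unescaped * in a file name is still treated as a wildcard, so the name can match more files than intended. The following sketch, which uses a second made-up file name MxAxSxH in a scratch directory, shows the difference.

```shell
# Compare the unescaped and escaped forms of the name M*A*S*H.
LC_ALL=C; export LC_ALL
scratch=$(mktemp -d)
touch "$scratch/M*A*S*H" "$scratch/MxAxSxH"

# Unescaped: each * is a wildcard, so both files are matched.
unescaped=$(cd "$scratch" && echo M*A*S*H)

# Escaped: each * is literal, so only the file named M*A*S*H is matched.
escaped=$(cd "$scratch" && echo M\*A\*S\*H)

echo "unescaped: $unescaped"
echo "escaped:   $escaped"

rm -r "$scratch"
```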

30.Give a command line for displaying the sizes of files that have the .jpg extension and names ending with a digit.

wc *[0-9].jpg

31.What is file compression? What do the terms compressed files and decompressed files mean? What commands are available for performing the compression and decompression in LINUX? Which are the preferred commands? Why?

Reduction in the size of a file is known as file compression. Compression has both space and time advantages. A compressed file takes less disk space, less time to transmit, and less time to copy.

A compressed file contains the contents of a file after they have been compressed and a decompressed file contains the contents of the original file (after a compressed file has been decompressed).

There are several LINUX commands that can be used for compressing and decompressing files, namely compress, uncompress, gzip, and gunzip. The preferred commands are gzip and gunzip because they usually achieve better compression than compress and uncompress. The gzexe command can be used to compress (and uncompress) executable files.
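A compress and decompress round trip can be sketched as follows, assuming that gzip and gunzip are installed. The sample file is generated by the script and is highly repetitive, so its compression ratio is not representative of real files.

```shell
# Compress a generated text file with gzip, then restore it with gunzip.
scratch=$(mktemp -d)
awk 'BEGIN { for (i = 0; i < 2000; i++) print "the same line of text repeated" }' \
    > "$scratch/sample.txt"
cp "$scratch/sample.txt" "$scratch/original.txt"

before=$(wc -c < "$scratch/sample.txt")
gzip "$scratch/sample.txt"              # replaces sample.txt with sample.txt.gz
after=$(wc -c < "$scratch/sample.txt.gz")
gunzip "$scratch/sample.txt.gz"         # restores sample.txt

# Confirm that the decompressed file is identical to the original.
if cmp -s "$scratch/sample.txt" "$scratch/original.txt"; then
    roundtrip=ok
else
    roundtrip=differs
fi

echo "original size:   $((before)) bytes"
echo "compressed size: $((after)) bytes"
echo "round trip:      $roundtrip"

rm -r "$scratch"
```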

32.Take three large files in your directory structure: a text file, a PostScript file, and a picture file—and compress them by using the compress command. Which file was compressed the most? What was the percentage reduction in file size? Compress the same file by using the gzip command. Which resulted in better compression, compress or gzip? Uncompress the files with the uncompress and gunzip commands. Show your work.

Do it yourself.

33.Use the find command to display the names of all the header files in the /usr/include directory that are larger than 1000 bytes. Write down the command that you used to perform this task.

The following session shows a sample run of the command for performing the given task. Note that the command output includes directories.

$ find /usr/include -size +1000c -print | more

/usr/include

/usr/include/pwdb

/usr/include/pwdb/_pwdb_macros.h

/usr/include/pwdb/pwdb_map.h

/usr/include/pwdb/pwdb_public.h

/usr/include/pwdb/pwdb_radius.h

/usr/include/pwdb/pwdb_shadow.h

/usr/include/pwdb/pwdb_unix.h

/usr/include/pwdb/radius.h

/usr/include/asm

$

The following session shows the command that displays the total number of files in the /usr/include directory that are larger than 1000 bytes.

$ find /usr/include -size +1000c -print | wc -l

2849

$
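To restrict the output to header files and leave out directories, the -type f and -name tests can be added to the command. The following sketch tries this on a scratch directory with made-up files. On a real system the starting point would be /usr/include.

```shell
# Build a small directory tree with files of known sizes.
scratch=$(mktemp -d)
mkdir "$scratch/sub"
dd if=/dev/zero of="$scratch/sub/big.h" bs=2000 count=1 2>/dev/null   # 2000-byte header
dd if=/dev/zero of="$scratch/big.txt" bs=2000 count=1 2>/dev/null     # large, not a header
printf 'tiny\n' > "$scratch/small.h"                                  # header, too small

# Regular files only, names ending in .h, larger than 1000 bytes.
found=$(cd "$scratch" && find . -type f -name '*.h' -size +1000c -print)
echo "$found"

rm -r "$scratch"
```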

34.Use the find command to remove all the files in your home directory named core and those having the .bak extension. What command did you use?

The following session shows the command for performing the given task.

$ find ~ \( -name core -o -name '*.bak' \) -print -exec rm {} \;

[ output of the command ]

$
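Because this command deletes files, it is worth trying on a scratch directory first. The following sketch does that with made-up files. The -type f test is our addition, so that a directory that happens to be named core is skipped.

```shell
# Build a scratch tree containing two files to remove and one to keep.
scratch=$(mktemp -d)
mkdir "$scratch/proj"
touch "$scratch/core" "$scratch/proj/notes.bak" "$scratch/proj/notes.txt"

# Print and remove regular files named core or ending in .bak,
# and count how many were removed.
removed=$(find "$scratch" -type f \( -name core -o -name '*.bak' \) \
              -print -exec rm {} \; | wc -l)

remaining=$(cd "$scratch" && find . -type f -print)

echo "files removed: $((removed))"
echo "files left:    $remaining"

rm -r "$scratch"
```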

35.Use the whereis command to locate the manual pages for the tar, strcmp, and socket commands/calls. What are the absolute pathnames for the files that contain these manual pages.

The following session shows the command for performing the given task. The absolute pathnames for the manual pages are highlighted.

$ whereis tar strcmp socket

tar: /bin/tar /usr/include/tar.h /usr/share/man/man1/tar.1.gz

strcmp: /usr/share/man/man3/strcmp.3.gz

socket: /usr/share/man/man2/socket.2.gz /usr/share/man/man7/socket.7.gz /usr/share/man/mann/socket.n.gz

$

36.Use the grep command to search the /usr/include/bits/socket.h file and display the lines that contain the string SOCK_. What command line did you use?

The following session shows the command for performing the given task, along with the command output.

$ grep SOCK_ /usr/include/bits/socket.h

SOCK_STREAM = 1, /* Sequenced, reliable, connection-based

#define SOCK_STREAM SOCK_STREAM

SOCK_DGRAM = 2, /* Connectionless, unreliable datagrams

#define SOCK_DGRAM SOCK_DGRAM

SOCK_RAW = 3, /* Raw protocol interface. */

#define SOCK_RAW SOCK_RAW

SOCK_RDM = 4, /* Reliably-delivered messages. */

#define SOCK_RDM SOCK_RDM

SOCK_SEQPACKET = 5, /* Sequenced, reliable, connection-based,

#define SOCK_SEQPACKET SOCK_SEQPACKET

SOCK_PACKET = 10 /* Linux specific way of getting packets

#define SOCK_PACKET SOCK_PACKET

$