LinuxBasics.org

The community that helps people to run Linux

3.3. Manipulating files (continued)

3.3.3. Finding files

3.3.3.1. Using shell features

In the example on moving files we already saw how the shell can manipulate multiple files at once. In that example, the shell works out what the user means from the pattern between the square brackets, [ and ]. The shell can match ranges of digits and of upper- or lower-case letters in this way. It also matches any number of characters (including none) with an asterisk, and exactly one character with a question mark.

All sorts of substitutions can be used simultaneously; the shell is very logical about it. The Bash shell, for instance, has no problem with expressions like ls dirname/*/*/*[2-3].
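
The three kinds of pattern can be tried out safely in a scratch directory. The following is a minimal sketch; the directory comes from mktemp and the file names (report1.txt and so on) are made up for the illustration.

```shell
# Build a throwaway directory with hypothetical file names to match against.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch report1.txt report2.txt report3.txt report10.txt notes.txt

# '*' stands for any run of characters, including none.
star=$(echo report*.txt)

# '?' stands for exactly one character, so report10.txt is left out.
question=$(echo report?.txt)

# '[2-3]' stands for one character taken from the range 2 to 3.
range=$(echo report[2-3].txt)

echo "star:     $star"
echo "question: $question"
echo "range:    $range"
```

Using echo instead of ls shows exactly which names the shell substitutes before the command itself ever runs.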

In other shells, the asterisk is commonly used to save typing: people would enter cd dir* instead of cd directory. In Bash, however, this is not necessary because the GNU shell has a feature called file name completion. It means that you can type the first few characters of a command (anywhere) or a file (in the current directory) and, if no confusion is possible, the shell will work out what you mean. For example, in a directory containing many files, you can check whether any files begin with the letter A just by typing ls A and pressing the Tab key twice, rather than pressing Enter. If only one file starts with A, its name is filled in immediately as the argument to ls (or to any other shell command, for that matter).

3.3.3.2. Which

A very simple way of looking up executable files is the which command, which looks in the directories listed in the user’s search path. which will not find ordinary files or files located outside the user’s search path. The which command is useful when troubleshooting “command not found” problems. In the example below, user tina can’t use the acroread program, while her colleague has no trouble whatsoever on the same system. The problem is similar to the PATH problem in the previous part: tina’s colleague tells her that he can see the required program in /opt/acroread/bin, but this directory is not in her path:

tina:~> which acroread
/usr/bin/which: no acroread in (/bin:/usr/bin:/usr/bin/X11)

The problem can be solved by giving the full path to the command to run, or by re-exporting the content of the PATH variable:

tina:~> export PATH=$PATH:/opt/acroread/bin

tina:~> echo $PATH
/bin:/usr/bin:/usr/bin/X11:/opt/acroread/bin

Using the which command with the -a switch will find all instances of the command:

tina:~> which -a make
/usr/bin/make
/usr/bin/X11/make
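
The effect of extending PATH can be reproduced without installing anything. The sketch below is only an illustration: hello-demo is a made-up script standing in for acroread, and the lookup uses the shell builtin command -v in place of which, because which is not installed on every system.

```shell
# Create a throwaway directory holding one executable script.
# The name hello-demo is hypothetical; it stands in for acroread.
tooldir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from demo"\n' > "$tooldir/hello-demo"
chmod +x "$tooldir/hello-demo"

# Before the directory is on the search path, the lookup fails.
if command -v hello-demo > /dev/null 2>&1; then
    before="found"
else
    before="not found"
fi

# Append the directory to PATH, as tina does with /opt/acroread/bin.
PATH=$PATH:$tooldir
export PATH

# Now the shell can resolve the bare command name.
after=$(command -v hello-demo)
output=$(hello-demo)

echo "before: $before"
echo "after:  $after"
echo "output: $output"
```

The change lasts only for the current shell session; a new terminal starts again with the old search path.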

3.3.3.3. Find and locate

These are the real tools, used when searching paths other than those listed in the search path. The UNIX find tool is very powerful, but it uses a somewhat complex syntax. GNU find, however, deals better with the difficult syntax. This command not only allows you to search by file name; it also accepts file size, date of last change and other file properties as search criteria. The most common use is for finding file names:

find path -name searchstring

This can be interpreted as “Look in all files and subdirectories contained in a given path, and print the names of the files containing the search string in their name” (not in their content).

peter:~> find /boot -name menu.lst
/boot/grub/menu.lst

Another application of find is for searching files of a certain size, as in the example below, where user peter wants to find all files in the current directory or one of its subdirectories, that are bigger than 5 MB:

peter:~> find . -size +5M
./psychotic_chaos.mp3
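
Several criteria can be combined in a single find expression. Here is a small sketch on a scratch directory tree; the file names and sizes are invented for the example, and the k size suffix assumes GNU find or a compatible implementation.

```shell
# Throwaway tree with hypothetical files of different sizes.
top=$(mktemp -d)
mkdir -p "$top/music" "$top/docs"
# dd writes 2048 blocks of 1 KiB, which gives a 2 MiB file.
dd if=/dev/zero of="$top/music/big.bin" bs=1024 count=2048 2> /dev/null
echo "small" > "$top/docs/small.txt"
echo "also small" > "$top/docs/readme.txt"

cd "$top" || exit 1

# Regular files larger than 1024 KiB anywhere below the current directory.
large=$(find . -type f -size +1024k)

# Regular .txt files below docs that were modified less than a day ago.
recent=$(find ./docs -type f -name "*.txt" -mtime -1 | sort)

echo "large files:"
echo "$large"
echo "recently changed text files:"
echo "$recent"
```

All tests in one expression must match for a file to be printed, so adding a criterion narrows the result.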

If you dig in the man pages, you will see that find can also perform operations on the files it finds. A common example is removing files. It is best to run the command first without the -exec option to check that the correct files are selected; after that, the command can be rerun with -exec to delete them. Below, we search for files ending in .tmp and remove them:

peter:~>  find . -name "*.tmp" -exec rm {} \;

peter:~>
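
The test-first advice can be sketched as a three-step routine. This is an illustration on a scratch directory with invented file names, not a recipe to paste into a directory that matters. It ends the -exec clause with + instead of \; so that find hands many file names to a single rm rather than starting one rm per file.

```shell
# Scratch directory with a few hypothetical temporary files.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p cache
touch keep.txt a.tmp cache/b.tmp "cache/with space.tmp"

# Step 1: dry run. Print what would be selected, delete nothing.
selected=$(find . -type f -name "*.tmp" | sort)
echo "would remove:"
echo "$selected"

# Step 2: the same expression, now with the action attached.
# '{} +' passes as many file names as possible to one rm invocation.
find . -type f -name "*.tmp" -exec rm {} +

# Step 3: check what is left.
remaining=$(find . -type f | sort)
echo "remaining:"
echo "$remaining"
```

Because find passes each name to rm as a separate argument, the file with a space in its name is handled correctly.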

Later on (in 1999 according to the man pages, after 20 years of find), locate was developed. This program is easier to use but more restricted than find, since its output is based on a file index database that is updated only once a day. On the other hand, a search in the locate database uses fewer resources than find and therefore returns results almost instantly.

Most Linux distributions use slocate these days. This security-enhanced locate is the modern version of locate, and it prevents users from getting output they have no right to read. The files in root’s home directory are an example: they are not normally accessible to the public. A user who wants to find someone who knows about the C shell may issue the command locate .cshrc to display all users who have a customized configuration file for the C shell. Supposing the users root and jenny are running the C shell, only the file /home/jenny/.cshrc will be displayed, and not the one in root’s home directory. On most systems, locate is a symbolic link to the slocate program:

billy:~> ls -l /usr/bin/locate
lrwxrwxrwx 1 root slocate  7 Oct 28 14:18 /usr/bin/locate -> slocate*

User tina could have used locate (instead of her previous attempt using which) to find the application she wanted:

tina:~> locate acroread
/usr/share/icons/hicolor/16x16/apps/acroread.png
/usr/share/icons/hicolor/32x32/apps/acroread.png
/usr/share/icons/locolor/16x16/apps/acroread.png
/usr/share/icons/locolor/32x32/apps/acroread.png
/usr/local/bin/acroread
/usr/local/Acrobat4/Reader/intellinux/bin/acroread
/usr/local/Acrobat4/bin/acroread

Directories whose path doesn’t contain the name bin are unlikely to hold the program, since by convention executable files are kept in bin directories. That leaves three possibilities. The file in /usr/local/bin is the one tina would have wanted: it is a link to the shell script that starts the actual program:

tina:~> file /usr/local/bin/acroread
/usr/local/bin/acroread: symbolic link to ../Acrobat4/bin/acroread

tina:~> file /usr/local/Acrobat4/bin/acroread
/usr/local/Acrobat4/bin/acroread: Bourne shell script text executable

tina:~> file /usr/local/Acrobat4/Reader/intellinux/bin/acroread
/usr/local/Acrobat4/Reader/intellinux/bin/acroread: ELF 32-bit LSB 
executable, Intel 80386, version 1, dynamically linked (uses 
shared libs), not stripped

To keep the path as short as possible, so that the system doesn’t have to search too long every time a user wants to execute a command, we add /usr/local/bin to the path and not the other directories. Those contain only the binary files of one specific program, while /usr/local/bin contains other useful programs as well. To allow the binaries in the other /usr/local directories to run, symbolic links to them are placed in /usr/local/bin.

Again, a description of the full features of find and locate can be found in the Info pages.

3.3.3.4. The grep command

3.3.3.4.1. General line filtering

A simple but powerful program, grep is used for filtering input lines and printing those that match a given pattern. There are literally thousands of applications for the grep program. In the example below, jerry uses grep to look up how he used find earlier:

jerry:~> grep find .bash_history
find . -name userinfo
man find
find ../ -name common.cfg

All UNIXes with just a little bit of decency have an online dictionary. So does Linux. The dictionary is a list of known words in a file named words, located in /usr/share/dict. To quickly check the correct spelling of a word, no graphical application is needed:

william:~> grep pinguin /usr/share/dict/words

william:~> grep penguin /usr/share/dict/words
penguin
penguins

Who is the owner of that home directory next to mine? Hey, there’s his telephone number!

lisa:~> grep gdbruyne /etc/passwd
gdbruyne:x:981:981:Guy Debruyne, tel 203234:/home/gdbruyne:/bin/bash

And what was the E-mail address of Arno again?

serge:~/mail> grep -i arno *
sent-mail: To: <Arno.Hintjens@celeb.com>
sent-mail: On Mon, 24 Dec 2001, Arno.Hintjens@celeb.com wrote:

find and locate are often used in combination with grep to define some serious queries. For more information, see Chapter 5 on I/O redirection.
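
A few frequently used grep options, and one way of pairing grep with find, can be sketched as follows. The mailbox content and the two text files are made up for the illustration; only the options themselves (-i, -c, -n, -l) are standard.

```shell
# A small hypothetical mailbox file to search in.
mailbox=$(mktemp)
cat > "$mailbox" <<'EOF'
To: <Arno.Hintjens@celeb.com>
Subject: meeting notes
From: arno@example.org
Cc: lisa@example.org
EOF

# -i ignores case, so both "Arno" and "arno" match.
matches=$(grep -i arno "$mailbox")

# -c counts the matching lines instead of printing them.
count=$(grep -ic arno "$mailbox")

# -n prefixes each matching line with its line number.
numbered=$(grep -n Subject "$mailbox")

# find selects the files by name; grep -l then lists the files
# whose content contains the search string.
dir=$(mktemp -d)
echo "penguin" > "$dir/a.txt"
echo "pinguin" > "$dir/b.txt"
hits=$(find "$dir" -name "*.txt" -exec grep -l penguin {} +)

echo "$matches"
echo "count: $count"
echo "$numbered"
echo "files containing penguin: $hits"
```

Here find narrows the search by file name and grep by file content, which is the division of labour that makes the combination useful.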

3.3.3.4.2. Special characters

Characters that have a special meaning to the shell have to be escaped. The escape character in Bash is backslash, as in most shells; this takes away the special meaning of the following character. The shell knows about several special characters, among the most common “/”, “.”, “?” and “*”. A full list can be found in the Info pages and documentation for your shell.

For instance, if you want to display the file named * rather than all the files in a directory, you have to use:

less \*

The same goes for filenames containing a space:

cat This\ File
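
Both cases can be tried in a scratch directory. The sketch below creates a file literally named * and a file with a space in its name (the contents are invented), and compares backslash escaping with quoting, which has the same effect for these characters.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Create a file literally named '*' and one with a space in its name.
echo "star file" > \*
echo "spaced file" > This\ File
echo "ordinary" > other.txt

# Unescaped, '*' expands to every name in the directory.
expanded=$(echo *)

# Escaped or quoted, '*' is just a one-character file name.
escaped=$(cat \*)
quoted=$(cat '*')

# A backslash or double quotes keep the space inside a single argument.
with_backslash=$(cat This\ File)
with_quotes=$(cat "This File")

echo "expanded:  $expanded"
echo "escaped:   $escaped"
echo "quoted:    $quoted"
echo "backslash: $with_backslash"
echo "quotes:    $with_quotes"
```

Quoting is often easier to read than a series of backslashes when a name contains several special characters.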

3.3.4. More ways to view file content

3.3.4.1. General

Apart from cat, which really doesn’t do much more than sending files to the standard output, there are other tools to view file content.

The easiest way, of course, would be to use graphical tools instead of command line tools. In the introduction we already saw a glimpse of an office application, OpenOffice. Other examples are the GIMP, the GNU Image Manipulation Program (start it with gimp from the command line); xpdf to view Portable Document Format (PDF) files; GhostView (gv) for viewing PostScript files; Mozilla/Firefox, lynx and links (two text mode browsers), Konqueror, Opera and many others for web content; XMMS, CDplay and others for multimedia file content; AbiWord, Gnumeric, KOffice etc. for all kinds of office applications, and so on. There are thousands of Linux applications; to list them all would take days.

Instead we keep concentrating on shell- or text-mode applications, which form the basics for all other applications. These commands work best in a text environment on files containing text. When in doubt, check first using the file command.

So let’s see what text tools we have that are useful to look inside files.

3.3.4.2. less is more

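Undoubtedly you will hear someone say this phrase sooner or later when working in a UNIX environment. A little bit of the UNIX history of pagers explains this: more is the older pager, and less was written later as a more capable replacement that can, for instance, scroll backwards.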

More information is located in the Info pages.

You already know about pagers by now, because they are used for viewing the man pages.

3.3.4.3. Head and tail

These two commands display the n first/last lines of a file respectively. To see the last ten commands entered:

tony:~> tail -10 .bash_history 
locate configure | grep bin
man bash
cd
xawtv &
grep usable /usr/share/dict/words 
grep advisable /usr/share/dict/words 
info quota
man quota
echo $PATH
frm

head works similarly, showing the first lines of a file. The tail command has a handy feature to continuously show the last lines of a file as the file is changing. This -f option is often used by system administrators to check on log files. For example, you can run “tail -f /var/log/syslog” in a terminal to watch what happens as you plug in a USB device. More information is located in the system documentation files.
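
The line-count options are easy to check on a generated file. A minimal sketch, using a made-up log of 20 numbered lines:

```shell
log=$(mktemp)

# Fill a hypothetical log file with 20 numbered lines.
i=1
while [ "$i" -le 20 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$log"

# First three and last two lines.
first=$(head -n 3 "$log")
last=$(tail -n 2 "$log")

# With a plus sign, tail starts at the given line number instead.
from18=$(tail -n +18 "$log")

# Chaining both gives a slice from the middle: lines 5 and 6.
middle=$(head -n 6 "$log" | tail -n 2)

# tail -f "$log" would keep following the file as it grows; it is left
# out here because it does not return until interrupted with Ctrl+C.

echo "$first"
echo "$last"
echo "$from18"
echo "$middle"
```

The -n form works for both commands and is the spelling that current documentation uses in place of the older tail -10 style.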

3.3.5. Linking files

3.3.5.1. Link types

Now that we know more about files and their representation in the file system, understanding links (or shortcuts) is a piece of cake. A link is nothing more than a way of matching two or more file names to the same set of file data. There are two ways to achieve this: hard links and soft (symbolic) links.

The two link types behave similarly, but are not the same, as illustrated in the scheme below:

Figure 3-2. Hard and soft link mechanism

Here’s another good reference: Q & A: The difference between hard and soft links

3.3.5.1.1 Soft link Details

Note that removing the target file for a symbolic link makes the link useless.

3.3.5.1.2 Hard link Details

Each regular file is in principle a hard link. Hard links cannot span partitions, since they refer to inodes, and inode numbers are unique only within a given partition. The number of hard links that exist for a file is displayed by the ls -l command. The rm command actually removes the hard link, not the file itself. Thus, if one hard link to a file is deleted, the others continue to work. Only when the hard link count drops to zero will the inode itself be freed.

3.3.5.1.3 Example: Hard links and Soft links

Soft link

stw@laptop:~/LBook$ ln -s file slink
stw@laptop:~/LBook$ ls -l
lrwxrwxrwx 1 stw stw 4 2006-11-08 20:08 slink -> file
stw@laptop:~/LBook$ ln file hlink
ln: accessing `file': No such file or directory

Hard link

stw@laptop:~/LBook$ echo > file "Hello Linuxbasics.org"
stw@laptop:~/LBook$ ls -l
-rw-r--r-- 1 stw stw 22 2006-11-08 20:12 file
lrwxrwxrwx 1 stw stw  4 2006-11-08 20:08 slink -> file
stw@laptop:~/LBook$ ln file hlink
stw@laptop:~/LBook$ ls -l
-rw-r--r-- 2 stw stw 22 2006-11-08 20:13 file
-rw-r--r-- 2 stw stw 22 2006-11-08 20:13 hlink
lrwxrwxrwx 1 stw stw  4 2006-11-08 20:08 slink -> file

Usage

stw@laptop:~/LBook$ cat file
Hello Linuxbasics.org
stw@laptop:~/LBook$ cat hlink
Hello Linuxbasics.org
stw@laptop:~/LBook$ cat slink
Hello Linuxbasics.org

Breaking the soft link

stw@laptop:~/LBook$ rm file
stw@laptop:~/LBook$ ls -l
-rw-r--r-- 1 stw stw 22 2006-11-08 20:13 hlink
lrwxrwxrwx 1 stw stw  4 2006-11-08 20:08 slink -> file
stw@laptop:~/LBook$ cat slink
cat: slink: No such file or directory
stw@laptop:~/LBook$ cat hlink
Hello Linuxbasics.org
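
The inode numbers make the difference between the two link types visible. The sketch below repeats the session above in a scratch directory and reads the inode number from ls -i and the link count from the second column of ls -l; the variable names are chosen for the example only.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

echo "Hello Linuxbasics.org" > file
ln file hlink        # a second name for the same inode
ln -s file slink     # a separate small file that stores the path "file"

# ls -i prints the inode number in front of each name.
inode_file=$(ls -i file | awk '{print $1}')
inode_hard=$(ls -i hlink | awk '{print $1}')
inode_soft=$(ls -i slink | awk '{print $1}')

# The second column of ls -l is the hard link count.
links=$(ls -l file | awk '{print $2}')

echo "file: $inode_file  hlink: $inode_hard  slink: $inode_soft"
echo "hard link count: $links"

# Remove the original name: the data stays reachable through hlink,
# while slink now points at a name that no longer exists.
rm file
via_hard=$(cat hlink)
if [ -e slink ]; then
    soft_state="ok"
else
    soft_state="dangling"
fi

echo "via hlink: $via_hard"
echo "slink is $soft_state"
```

file and hlink report the same inode number because they are two names for one inode, whereas slink has an inode of its own.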

3.3.5.1.4 User-space Links

It may be argued that there is a third kind of link, the user-space link, which is similar to a shortcut in MS Windows. These are files containing meta-data which can only be interpreted by the graphical file manager. To the kernel and the shell these are just normal files. They may end in a .desktop or .lnk suffix; an example can be found in ~/.gnome-desktop:

[dupont@boulot .gnome-desktop]$ cat La\ Maison\ Dupont
[Desktop Entry]
Encoding=Legacy-Mixed
Name=La Maison Dupont
Type=X-nautilus-home
X-Nautilus-Icon=temp-home
URL=file:///home/dupont

This example is from a KDE desktop:

[lena@venus Desktop]$ cat camera
[Desktop Entry]
Dev=/dev/sda1
FSType=auto
Icon=memory
MountPoint=/mnt/camera
Type=FSDevice
X-KDE-Dynamic-Device=true

Creating this kind of link is easy enough using the features of your graphical environment. Should you need help, your system documentation should be your first resort.

In the next section, we will study the creation of UNIX-style symbolic links using the command line.

3.3.5.2. Creating symbolic links

Symbolic links are particularly interesting for beginning users: they are easy to spot in a listing and you don’t need to worry about partitions.

The command to make links is ln. In order to create symlinks, you need to use the -s option:

ln -s targetfile linkname

In the example below, user freddy creates a link in a subdirectory of his home directory to a directory on another part of the system:

freddy:~/music> ln -s /opt/mp3/Queen/ Queen

freddy:~/music> ls -l
lrwxrwxrwx  1 freddy  freddy  15 Jan 22 11:07 Queen -> /opt/mp3/Queen/

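Symbolic links are always very small files: a symbolic link stores only the path of its target. A hard link, by contrast, is another name for the same data, so ls reports the same size for it as for the original file.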

Symbolic links are used widely. They often save disk space by standing in for a copy of a file, for instance to satisfy the installation requirements of a new program that expects the file to be in another location. They are also used to fix scripts that suddenly have to run in a new environment, and they can generally save a lot of work. A system administrator may decide to move the users’ home directories to a new location, disk2 for instance. If everything, such as the entries in the /etc/passwd file, has to keep working as before with a minimum of effort, the administrator creates a symlink from /home to the new location /disk2/home.
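
The home directory move can be rehearsed on a small scale. In this sketch the directories home and disk2 live under a scratch directory created by mktemp, so the layout is hypothetical and nothing on the real system is touched; readlink is assumed to be available, as it is on common Linux systems.

```shell
root=$(mktemp -d)

# Hypothetical layout: old location "home", new location "disk2/home".
mkdir -p "$root/home/freddy" "$root/disk2"
echo "my notes" > "$root/home/freddy/notes.txt"

# Move the tree to the new disk, then leave a symlink at the old path.
mv "$root/home" "$root/disk2/home"
ln -s "$root/disk2/home" "$root/home"

# readlink prints the path stored inside the link.
target=$(readlink "$root/home")

# The old path keeps working because the link is followed transparently.
content=$(cat "$root/home/freddy/notes.txt")

echo "home -> $target"
echo "content: $content"
```

Programs and configuration files that still refer to the old path need no changes, which is the point of the approach.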


Copyright (c) by the authors.
This section of the wiki is licensed under the terms of the GNU Free Documentation License.
See the LBook-licensing page for details.


Linux® is a registered trademark of Linus Torvalds.


 
  course/book/sect_03_03_03.txt · Last modified: 2008/07/20 19:08
