Read the Manual: A Guide for the Linux Command Line
Like most people interested in becoming more proficient with computers, I spend a lot of time looking at manuals. But when it comes to the Linux command line, I often find myself searching the web rather than consulting the official documentation. Searches frequently yield answers to my questions, but they do little to improve my proficiency.
In this post, we'll explore the various sources of official documentation, then use that documentation to solve an actual problem. Debian 13 was used for the examples, but most of the tools presented here work the same way no matter what distribution you're using. Some of the output has been truncated for brevity.
This manual gives complete descriptions of all the publicly available features of UNIX. It provides neither a general overview (see "The UNIX Time-sharing System" for that) nor details of the implementation of the system (which remain to be disclosed).
Within the area it surveys, this manual attempts to be as complete and timely as possible. A conscious decision was made to describe each program in exactly the state it was in at the time its manual section was prepared. In particular, the desire to describe something as it should be, not as it is, was resisted. Inevitably, this means that many sections will soon be out of date. (The rate of change of the system is so great that a dismayingly large number of early sections had to be modified while the rest were being written. The unbounded effort required to stay up-to-date is best indicated by the fact that several of the programs described were written specifically to aid in preparation of this manual!)
man
Linux distributions typically come with a manual. To view a command's manual page, pass its name as an argument to man.
user@computer:~$ man ls
NAME
ls - list directory contents
SYNOPSIS
ls [OPTION]... [FILE]...
DESCRIPTION
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort is speci‐
fied.
Navigating the manual pages isn't intuitive, but there's built-in help that can be opened by pressing the h key. Pressing q returns to the manual page from the help page, or to the shell from a manual page.
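When I already know roughly what I'm looking for, searching the page text is often quicker than scrolling. Inside the pager (less by default on Debian, which is an assumption about your setup), typing / followed by a pattern searches forward and n jumps to the next match. The same idea works non-interactively, because man writes plain text when its output is piped. The helper below is only a sketch, and man_grep is a name I made up for it, not a standard tool.

```shell
# man_grep PAGE PATTERN
# Print the lines of a manual page that match PATTERN (case-insensitive),
# prefixed with their line numbers in the piped output.
# -e lets PATTERN begin with a dash, as option names do.
man_grep() {
    man "$1" 2>/dev/null | grep -n -i -e "$2"
}

# Example: where does the find page mention the -size test?
if command -v man >/dev/null 2>&1; then
    man_grep find '-size' || echo "no match (is the find manual page installed?)"
fi
```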
Many copies of the manual pages for UNIX-like operating systems exist online, but be careful when referring to them. For instance, the OpenBSD ls manual page appears on the first page of search results for "man ls", but that version is completely separate from the GNU coreutils version that ships with most Linux distributions.
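One way to check which implementation is actually installed is to ask the program itself. The sketch below assumes that GNU tools mention "GNU" on the first line of their --version output, which holds for coreutils but is a heuristic rather than a guarantee. The function name is_gnu_tool is hypothetical.

```shell
# is_gnu_tool NAME
# Succeed if NAME identifies itself as a GNU program.
# Assumption: GNU programs print "GNU" on the first line of --version.
is_gnu_tool() {
    "$1" --version 2>/dev/null | head -n 1 | grep -q 'GNU'
}

if is_gnu_tool ls; then
    echo "ls is the GNU version, so GNU documentation applies"
else
    echo "ls is not GNU (or has no --version option); check which implementation is installed"
fi
```

If the check fails, the locally installed manual page is still the one that matches the installed program, which is another reason to prefer it over a copy found online.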
info
Speaking of GNU, many of the GNU tools, which form the basis of the Linux command line, are documented using the Texinfo typesetting syntax. This documentation, which is often more detailed than the corresponding manual pages, can be viewed with the info command.
user@computer:~$ info ls
10.1 ‘ls’: List directory contents
==================================
The ‘ls’ program lists information about files (of any type, including
directories). Options and file arguments can be intermixed arbitrarily,
as usual. Later options override earlier options that are incompatible.
For non-option command-line arguments that are directories, by
default ‘ls’ lists the contents of directories, not recursively, and
omitting files with names beginning with ‘.’. For other non-option
arguments, by default ‘ls’ lists just the file name. If no non-option
argument is specified, ‘ls’ operates on the current directory, acting as
if it had been invoked with a single argument of ‘.’.
By default, ‘ls’ lists each directory's contents alphabetically,
according to the locale settings in effect.(1) If standard output is a
terminal, the output is in columns (sorted vertically) and control
characters are output as question marks; otherwise, the output is listed
one per line and control characters are output as-is.
Because ‘ls’ is such a fundamental program, it has accumulated many
options over the years. They are described in the subsections below;
within each section, options are listed alphabetically (ignoring case).
The division of options into the subsections is not absolute, since some
options affect more than one aspect of ‘ls’'s operation.
Unlike man, info may not be installed by default. If info ls outputs an error like -bash: info: command not found, you'll need to install the GNU Info documentation browser. On Debian-based distributions, run sudo apt install info (or apt install info as root).
Like man, the info browser has built-in help. H opens the help, which includes basic navigation, and h starts a tutorial. q quits.
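The browser can also be bypassed. Passing a node name such as '(coreutils) ls invocation' opens that node directly, and --output=- writes the node to standard output instead of starting the browser, which makes it easy to pipe into other tools. The sketch below assumes the coreutils Info manual is installed, and info_dump is a hypothetical helper name.

```shell
# info_dump NODE
# Write an Info node to standard output instead of opening the browser.
# "--output=-" tells info to send the node text to stdout.
info_dump() {
    info --output=- "$1" 2>/dev/null
}

# Interactive equivalent: info '(coreutils) ls invocation'
if command -v info >/dev/null 2>&1; then
    # Show the first lines of the ls node from the coreutils manual.
    info_dump '(coreutils) ls invocation' | head -n 15
fi
```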
help
Commands built into the Bash shell come with their own documentation.
Builtin commands are contained within the shell itself. When the name of a builtin command is used as the first word of a simple command (see Simple Commands), the shell executes the command directly, without invoking another program. Builtin commands are necessary to implement functionality impossible or inconvenient to obtain with separate utilities.
Use the type command to determine whether or not a command is built into the shell.
user@computer:~$ type ls
ls is aliased to `ls --color=auto'
user@computer:~$ type echo
echo is a shell builtin
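The answer from type also tells you where to look for documentation: builtins and keywords are documented by the shell itself (through the help builtin in Bash), while external programs have manual pages. Here is a minimal sketch of that decision. The function name doc_tool_for is hypothetical, and the matched phrases are the English messages that bash and dash print, so treat them as assumptions on other shells or locales.

```shell
# doc_tool_for NAME
# Suggest where NAME is documented, based on what the shell says it is.
doc_tool_for() {
    case "$(type "$1" 2>/dev/null)" in
        ""|*"not found"*)
            echo "$1: not found" ;;
        *"shell builtin"*|*"shell keyword"*)
            echo "help $1" ;;
        *"aliased to"*|*"is an alias"*)
            echo "$1 is an alias; check the command it expands to" ;;
        *"function"*)
            echo "$1 is a shell function defined in this session" ;;
        *)
            echo "man $1" ;;
    esac
}

for name in echo cd if find; do
    doc_tool_for "$name"
done
```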
The help command prints the documentation for builtin commands.
user@computer:~$ help echo
echo: echo [-neE] [arg ...]
Write arguments to the standard output.
Display the ARGs, separated by a single space character and followed by a
newline, on the standard output.
Options:
-n do not append a newline
-e enable interpretation of the following backslash escapes
-E explicitly suppress interpretation of backslash escapes
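help has a few options of its own for when the full text is more than I need. The sketch below uses -d for a one-line description and -s for the usage synopsis only; help -m, not run here, prints the same information in a layout resembling a manual page. Because help exists only in Bash, the example checks for Bash before running.

```shell
# help is a Bash builtin, so only run these when the shell is Bash.
if [ -n "${BASH_VERSION:-}" ]; then
    help -d echo    # short description of echo
    help -s echo    # usage synopsis only
else
    echo "help is a Bash builtin; run this example in bash"
fi
```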
whatis, apropos
Each manual page has a name and description. The whatis command displays that information.
user@computer:~$ whatis ls
ls (1) - list directory contents
apropos searches the manual page names and descriptions.
user@computer:~$ apropos directory
basename (1) - strip directory and suffix from filenames
bindtextdomain (3) - set directory containing message catalogs
chroot (8) - run command or interactive shell with special root directory
dbus-cleanup-sockets (1) - clean up leftover sockets in a directory
depmod.d (5) - Configuration directory for depmod
dir (1) - list directory contents
find (1) - search for files in a directory hierarchy
grub-macbless (8) - bless a mac file/directory
grub-mknetdir (1) - prepare a GRUB netboot directory.
helpztags (1) - generate the help tags file for directory
ls (1) - list directory contents
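The number in parentheses is the manual section: section 1 holds user commands, while section 8 holds system administration commands. Broad keywords return a lot of noise, so it can help to restrict the section and require several keywords at once. The sketch below assumes the man-db versions of these tools (the Debian default), where -s selects sections and -a requires every keyword to match. The function name commands_about is hypothetical. For reference, man -k and man -f are equivalent to apropos and whatis.

```shell
# commands_about KEYWORD...
# List section 1 (user command) pages whose name or description
# matches every keyword. -s and -a are man-db options.
commands_about() {
    apropos -s 1 -a "$@"
}

if command -v apropos >/dev/null 2>&1; then
    commands_about list directory ||
        echo "nothing found (the manual page index may be missing)"
fi
```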
Putting It All Together
Let's see how to use these resources to solve an actual problem. First, I want an overview of disk usage. Then, I want to find all files over a certain size, so they can potentially be removed to free up space. A search for "usage" with apropos helps me determine that the df command is the one I want for the first task.
user@computer:~$ apropos usage
df (1) - report file system space usage
The manual page describes some helpful options, like -h, which prints sizes in human-readable form.
user@computer:~$ man df
NAME
df - report file system space usage
SYNOPSIS
df [OPTION]... [FILE]...
DESCRIPTION
This manual page documents the GNU version of df. df displays the
amount of space available on the file system containing each file name
argument. If no file name is given, the space available on all cur‐
rently mounted file systems is shown. Space is shown in 1K blocks by
default, unless the environment variable POSIXLY_CORRECT is set, in
which case 512-byte blocks are used.
If an argument is the absolute file name of a device node containing a
mounted file system, df shows the space available on that file system
rather than on the file system containing the device node. This version
of df cannot show the space available on unmounted file systems, because
on most kinds of systems doing so requires non-portable intimate knowl‐
edge of file system structures.
OPTIONS
-h, --human-readable
print sizes in powers of 1024 (e.g., 1023M)
-t, --type=TYPE
limit listing to file systems of type TYPE
df is part of the GNU coreutils. At the bottom of the manual page, there are instructions to access the full documentation.
SEE ALSO
Full documentation <https://www.gnu.org/software/coreutils/df>
or available locally via: info '(coreutils) df invocation'
For this task, however, the manual page is sufficient. I'm only concerned with ext4 file systems and I'd like the sizes to be shown in a human-readable format, so I'll use the -h and -t options described in the manual page.
user@computer:~$ df -h -t ext4
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 30G 22G 6.8G 76% /
df tells me that the file system mounted on the root directory is 76% full.
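On a machine with many file systems, reading the Use% column by eye gets tedious. As a rough sketch, the function below reads df output and prints only the file systems at or above a given percentage. It uses df -P, the POSIX output format, so each file system stays on one line. The function name filesystems_over and the 75% threshold are my own choices for illustration, and the parsing assumes device names and mount points contain no spaces.

```shell
# filesystems_over LIMIT
# Read "df -P" output on stdin and print the mount point and usage of
# every file system whose usage is at or above LIMIT percent.
# Assumes device names and mount points contain no spaces.
filesystems_over() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the header line
            use = $5
            sub(/%/, "", use)         # "76%" -> "76"
            if (use + 0 >= limit) print $6, $5
        }'
}

df -P | filesystems_over 75
```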
Now let's try to figure out what's taking up all that space. I already know I can use the find command to find files, but I'm not sure how to specify a size.
user@computer:~$ man find
NAME
find - search for files in a directory hierarchy
SYNOPSIS
find [-H] [-L] [-P] [-D debugopts] [-Olevel] [starting-point...] [ex‐
pression]
The find manual page is quite a bit longer than the one for df, but here's the relevant part about the expression.
EXPRESSION
The part of the command line after the list of starting points is the
expression. This is a kind of query specification describing how we
match files and what we do with the files that were matched. An expres‐
sion is composed of a sequence of things:
Tests Tests return a true or false value, usually on the basis of some
property of a file we are considering. The -empty test for exam‐
ple is true only when the current file is empty.
TESTS
A numeric argument n can be specified to tests (like -amin, -mtime,
-gid, -inum, -links, -size, -uid and -used) as
+n for greater than n,
-n for less than n,
n for exactly n.
Supported tests:
-size n[cwbkMG]
File uses less than, more than or exactly n units of space,
rounding up. The following suffixes can be used:
`b' for 512-byte blocks (this is the default if no suffix is
used)
`c' for bytes
`w' for two-byte words
`k' for kibibytes (KiB, units of 1024 bytes)
`M' for mebibytes (MiB, units of 1024 * 1024 = 1048576 bytes)
`G' for gibibytes (GiB, units of 1024 * 1024 * 1024 =
1073741824 bytes)
The size is simply the st_size member of the struct stat popu‐
lated by the lstat (or stat) system call, rounded up as shown
above. In other words, it's consistent with the result you get
for ls -l. Bear in mind that the `%k' and `%b' format specifiers
of -printf handle sparse files differently. The `b' suffix al‐
ways denotes 512-byte blocks and never 1024-byte blocks, which is
different to the behaviour of -ls.
The + and - prefixes signify greater than and less than, as
usual; i.e., an exact size of n units does not match. Bear in
mind that the size is rounded up to the next unit. Therefore
-size -1M is not equivalent to -size -1048576c. The former only
matches empty files, the latter matches files from 0 to 1,048,575
bytes.
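The rounding rule in that last paragraph is easy to misread, so here is a small experiment that reproduces it in a scratch directory. It assumes GNU find, which is what the manual page above describes. The file names are arbitrary.

```shell
# Build a scratch directory with an empty file and a one-byte file.
dir=$(mktemp -d)
: > "$dir/empty"
printf 'x' > "$dir/one-byte"

# Sizes are rounded up to whole MiB first: 0 bytes -> 0, 1 byte -> 1.
# "Less than 1" therefore matches only the empty file.
echo "-size -1M:"
find "$dir" -type f -size -1M

# Counted in bytes, both files are smaller than 1048576.
echo "-size -1048576c:"
find "$dir" -type f -size -1048576c | sort

rm -r "$dir"
```

The first find prints only the empty file, while the second prints both, which matches the manual's warning that the two expressions are not equivalent.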
Using the -size test, let's search for files larger than 1 GiB. Many files can only be accessed by the root user, so I'll run the find command using sudo.
user@computer:~$ sudo find / -size +1G
/var/log/large.log
Unsurprisingly, since I created it using the fallocate command, there's a large log file taking up most of the space.
user@computer:~$ ls -lh /var/log/large.log
-rw-r--r-- 1 root root 20G Jan 30 14:43 /var/log/large.log
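When a search turns up more than one file, it helps to have the sizes next to the paths and the biggest files first. The sketch below does that with find's -printf action and sort. The function name largest_files is hypothetical, -printf and -xdev assume GNU find, and errors from unreadable directories are discarded rather than avoided with sudo.

```shell
# largest_files DIR MIN_SIZE
# List regular files under DIR larger than MIN_SIZE (a find -size value
# such as 1G), biggest first, as "bytes<TAB>path".
# -xdev keeps find on DIR's file system; -printf is a GNU extension.
largest_files() {
    find "$1" -xdev -type f -size "+$2" -printf '%s\t%p\n' 2>/dev/null | sort -rn
}

# Example: the five largest files over 1 MiB under /var/log.
largest_files /var/log 1M | head -n 5
```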
Next time I'm stuck, I can't say I won't search the web for the answer. But I at least intend to make an effort to go to the official documentation to continue improving my own knowledge and proficiency as a system administrator.