Read the Manual: A Guide for the Linux Command Line

Like most people interested in becoming more proficient with computers, I spend a lot of time looking at manuals. But when it comes to the Linux command line, I often find myself searching the web rather than consulting the official documentation. Searches frequently yield answers to my questions, but they do little to improve my proficiency.

In this post, we'll explore the various sources of official documentation, then we'll see how to use that documentation to solve an actual problem. Debian 13 was used for the examples, but most of the tools presented here work the same way no matter what distribution you're using. Some of the output has been truncated for brevity.

This manual gives complete descriptions of all the publicly available features of UNIX. It provides neither a general overview (see "The UNIX Time-sharing System" for that) nor details of the implementation of the system (which remain to be disclosed).

Within the area it surveys, this manual attempts to be as complete and timely as possible. A conscious decision was made to describe each program in exactly the state it was in at the time its manual section was prepared. In particular, the desire to describe something as it should be, not as it is, was resisted. Inevitably, this means that many sections will soon be out of date. (The rate of change of the system is so great that a dismayingly large number of early sections had to be modified while the rest were being written. The unbounded effort required to stay up-to-date is best indicated by the fact that several of the programs described were written specifically to aid in preparation of this manual!)

man

Linux distributions typically come with a manual. To view a command's manual page, pass its name as an argument to man.

user@computer:~$ man ls

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]... [FILE]...

DESCRIPTION
       List  information  about  the  FILEs (the current directory by default).
       Sort entries alphabetically if none of -cftuvSUX nor  --sort  is  speci‐
       fied.

Navigating the manual pages isn't intuitive, but there's built-in help, which can be accessed by pressing the h key. Pressing q returns to the manual page from the help page, or to the shell from a manual page.

Many copies of the manual pages for UNIX-like operating systems exist online, but be careful when referring to these versions. For instance, the OpenBSD ls manual page appears on the first page of search results for "man ls", but it documents a completely separate implementation from the GNU coreutils version that ships with most Linux distributions.
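One way to check which implementation is actually installed is to ask the command itself. The sketch below assumes the tool accepts a --version option, as the GNU coreutils commands do; is_gnu_tool is a hypothetical helper name of my own, not a standard command.

```shell
#!/usr/bin/env bash
# Report whether a command identifies itself as a GNU tool.
# is_gnu_tool is a hypothetical helper; it assumes the tool accepts --version
# and mentions "GNU" on the first line of that output.
is_gnu_tool() {
    "$1" --version 2>/dev/null | head -n 1 | grep -q 'GNU'
}

if is_gnu_tool ls; then
    echo "ls: GNU version"
else
    echo "ls: not GNU (or --version is unsupported)"
fi
```

If the check fails, an online page for the GNU version is probably not describing the command on your system, and the locally installed manual page is the safer reference.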

info

Speaking of GNU, many of the GNU tools, which form the basis of the Linux command line, are documented using the Texinfo typesetting syntax. This documentation, which is often more detailed than the corresponding manual pages, can be viewed with the info command.

user@computer:~$ info ls

10.1 ‘ls’: List directory contents
==================================

The ‘ls’ program lists information about files (of any type, including
directories).  Options and file arguments can be intermixed arbitrarily,
as usual.  Later options override earlier options that are incompatible.

   For non-option command-line arguments that are directories, by
default ‘ls’ lists the contents of directories, not recursively, and
omitting files with names beginning with ‘.’.  For other non-option
arguments, by default ‘ls’ lists just the file name.  If no non-option
argument is specified, ‘ls’ operates on the current directory, acting as
if it had been invoked with a single argument of ‘.’.

   By default, ‘ls’ lists each directory's contents alphabetically,
according to the locale settings in effect.(1)  If standard output is a
terminal, the output is in columns (sorted vertically) and control
characters are output as question marks; otherwise, the output is listed
one per line and control characters are output as-is.

   Because ‘ls’ is such a fundamental program, it has accumulated many
options over the years.  They are described in the subsections below;
within each section, options are listed alphabetically (ignoring case).
The division of options into the subsections is not absolute, since some
options affect more than one aspect of ‘ls’'s operation.

Unlike man, info may not be installed by default. If info ls outputs an error like -bash: info: command not found, you'll need to install the GNU Info documentation browser. On Debian-based distributions, use apt install info.

Like man, the info browser has built-in help. H opens the help, which includes basic navigation, and h starts a tutorial. q quits.

help

Commands built into the Bash shell come with their own documentation.

Builtin commands are contained within the shell itself. When the name of a builtin command is used as the first word of a simple command (see Simple Commands), the shell executes the command directly, without invoking another program. Builtin commands are necessary to implement functionality impossible or inconvenient to obtain with separate utilities.

Use the type command to determine whether or not a command is built into the shell.

user@computer:~$ type ls
ls is aliased to `ls --color=auto'
user@computer:~$ type echo
echo is a shell builtin
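Bash's type builtin also accepts -t, which prints a single word (alias, keyword, function, builtin or file) and is easier to use in a script. As a rough sketch, a small function can turn that word into a suggestion for where to look; doc_hint is a hypothetical name I made up for this example.

```shell
#!/usr/bin/env bash
# Suggest where to look for documentation, based on what `type -t` reports.
# doc_hint is a hypothetical helper name, not a standard command.
doc_hint() {
    case "$(type -t "$1")" in
        builtin|keyword) echo "help $1" ;;
        file)            echo "man $1" ;;
        alias|function)  echo "type $1" ;;   # inspect the definition first
        *)               echo "$1: not found" ;;
    esac
}

doc_hint echo   # prints "help echo"
doc_hint cd     # prints "help cd"
```

For an alias such as the ls example above, the suggestion is to run type first, since the alias text shows which underlying command to look up.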

The help command prints the documentation for builtin commands.

user@computer:~$ help echo

echo: echo [-neE] [arg ...]
    Write arguments to the standard output.
    
    Display the ARGs, separated by a single space character and followed by a
    newline, on the standard output.
    
    Options:
      -n        do not append a newline
      -e        enable interpretation of the following backslash escapes
      -E        explicitly suppress interpretation of backslash escapes
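The three options in the help text are easy to try out. This short sketch assumes Bash's builtin echo with its default settings; other shells' echo commands may treat these options differently.

```shell
#!/usr/bin/env bash
# Bash's builtin echo: compare the three options from the help text.
echo -n 'no newline after this'; echo ' <- same line'
echo -e 'tab\tseparated'     # \t becomes a tab character
echo -E 'tab\tseparated'     # \t is printed literally
```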

whatis, apropos

Each manual page has a name and description. The whatis command displays that information.

user@computer:~$ whatis ls
ls (1)               - list directory contents

apropos searches the manual page names and descriptions.

user@computer:~$ apropos directory
basename (1)         - strip directory and suffix from filenames
bindtextdomain (3)   - set directory containing message catalogs
chroot (8)           - run command or interactive shell with special root directory
dbus-cleanup-sockets (1) - clean up leftover sockets in a directory
depmod.d (5)         - Configuration directory for depmod
dir (1)              - list directory contents
find (1)             - search for files in a directory hierarchy
grub-macbless (8)    - bless a mac file/directory
grub-mknetdir (1)    - prepare a GRUB netboot directory.
helpztags (1)        - generate the help tags file for directory
ls (1)               - list directory contents
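The number in parentheses is the manual section: 1 holds user commands, 3 library functions, 5 file formats and 8 system administration commands. When a search returns too much, filtering on that column narrows it down. Here's a minimal sketch; in_section is a hypothetical helper, and the sample lines are copied from the output above.

```shell
#!/usr/bin/env bash
# Keep only entries from one manual section.
# in_section is a hypothetical helper; it reads apropos-style lines on stdin
# and compares the second field, e.g. "(1)".
in_section() {
    awk -v want="($1)" '$2 == want'
}

# Sample lines copied from the apropos output above.
in_section 1 <<'EOF'
chroot (8)           - run command or interactive shell with special root directory
dir (1)              - list directory contents
depmod.d (5)         - Configuration directory for depmod
ls (1)               - list directory contents
EOF
```

On a real system the same filter would be used as apropos directory | in_section 1. The man-db version of apropos also has a -s option for selecting sections, so check your local apropos manual page before reaching for awk.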

Putting It All Together

Let's see how to use these resources to solve an actual problem. First, I want an overview of disk usage. Then, I want to find all files over a certain size, so they can potentially be removed to free up space. A search for "usage" with apropos helps me determine that the df command is the one I want for the first task.

user@computer:~$ apropos usage
df (1)               - report file system space usage

The manual page describes some helpful options, like -h, which prints sizes in human-readable form.

user@computer:~$ man df

NAME
       df - report file system space usage

SYNOPSIS
       df [OPTION]... [FILE]...

DESCRIPTION
       This  manual  page  documents  the  GNU  version of df.  df displays the
       amount of space available on the file system containing each  file  name
       argument.   If  no  file  name is given, the space available on all cur‐
       rently mounted file systems is shown.  Space is shown in  1K  blocks  by
       default,  unless  the  environment  variable  POSIXLY_CORRECT is set, in
       which case 512-byte blocks are used.

       If an argument is the absolute file name of a device node  containing  a
       mounted  file  system,  df shows the space available on that file system
       rather than on the file system containing the device node.  This version
       of df cannot show the space available on unmounted file systems, because
       on most kinds of systems doing so requires non-portable intimate  knowl‐
       edge of file system structures.

OPTIONS
       -h, --human-readable
              print sizes in powers of 1024 (e.g., 1023M)

       -t, --type=TYPE
              limit listing to file systems of type TYPE

df is part of the GNU coreutils. At the bottom of the manual page, there are instructions to access the full documentation.

SEE ALSO
       Full documentation <https://www.gnu.org/software/coreutils/df>
       or available locally via: info '(coreutils) df invocation'

For this task, however, the manual page is sufficient. I'm only concerned with ext4 file systems and I'd like the sizes to be shown in a human-readable format, so I'll use the -h and -t options described in the manual page.

user@computer:~$ df -h -t ext4
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        30G   22G  6.8G  76% /

df tells me that the file system mounted on the root directory is 76% full.
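If I wanted to check this regularly, the Use% column could be compared against a limit. The following is only a sketch: over_threshold is a hypothetical helper, it assumes the six-column layout shown above, and it assumes mount points contain no spaces.

```shell
#!/usr/bin/env bash
# Print mount points whose Use% exceeds a limit.
# over_threshold is a hypothetical helper; it reads df output on stdin and
# assumes mount points contain no spaces.
over_threshold() {
    awk -v limit="$1" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 > limit) print $6, $5 }'
}

# Sample input copied from the df output above.
over_threshold 70 <<'EOF'
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        30G   22G  6.8G  76% /
EOF
# prints: / 76%
```

On a live system the input would come from df itself, for example df -h -t ext4 | over_threshold 70.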

Now let's try to figure out what's taking up all that space. I already know I can use the find command to find files, but I'm not sure how to specify a size.

user@computer:~$ man find

NAME
       find - search for files in a directory hierarchy

SYNOPSIS
       find  [-H]  [-L]  [-P] [-D debugopts] [-Olevel] [starting-point...] [ex‐
       pression]

The find manual page is quite a bit longer than the one for df, but here's the relevant part about the expression.

EXPRESSION
       The  part  of  the command line after the list of starting points is the
       expression.  This is a kind of query  specification  describing  how  we
       match files and what we do with the files that were matched.  An expres‐
       sion is composed of a sequence of things:

       Tests  Tests  return a true or false value, usually on the basis of some
              property of a file we are considering.  The -empty test for exam‐
              ple is true only when the current file is empty.
   TESTS
       A numeric argument n can be specified  to  tests  (like  -amin,  -mtime,
       -gid, -inum, -links, -size, -uid and -used) as

       +n     for greater than n,

       -n     for less than n,

       n      for exactly n.

       Supported tests:
       -size n[cwbkMG]
              File uses less than, more than  or  exactly  n  units  of  space,
              rounding up.  The following suffixes can be used:

              `b'    for  512-byte  blocks (this is the default if no suffix is
                     used)

              `c'    for bytes

              `w'    for two-byte words

              `k'    for kibibytes (KiB, units of 1024 bytes)

              `M'    for mebibytes (MiB, units of 1024 * 1024 = 1048576 bytes)

              `G'    for gibibytes  (GiB,  units  of  1024  *  1024  *  1024  =
                     1073741824 bytes)

              The  size  is  simply the st_size member of the struct stat popu‐
              lated by the lstat (or stat) system call,  rounded  up  as  shown
              above.   In  other words, it's consistent with the result you get
              for ls -l.  Bear in mind that the `%k' and `%b' format specifiers
              of -printf handle sparse files differently.  The `b'  suffix  al‐
              ways denotes 512-byte blocks and never 1024-byte blocks, which is
              different to the behaviour of -ls.

              The  +  and  -  prefixes  signify  greater than and less than, as
              usual; i.e., an exact size of n units does not  match.   Bear  in
              mind  that  the  size  is rounded up to the next unit.  Therefore
              -size -1M is not equivalent to -size -1048576c.  The former  only
              matches empty files, the latter matches files from 0 to 1,048,575
              bytes.
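The rounding rule in that last paragraph is easier to believe after seeing it in action. This sketch assumes GNU find (it relies on -printf) and uses kibibytes instead of mebibytes to keep the test files small; list_matches is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Demonstrate how find rounds sizes up before comparing (GNU find assumed).
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
: > "$dir/empty"                         # 0 bytes
printf 'x' > "$dir/one"                  # 1 byte
head -c 1024 /dev/zero > "$dir/kib"      # exactly 1 KiB
head -c 1025 /dev/zero > "$dir/big"      # 1 KiB + 1 byte

# list_matches is a hypothetical helper: print matching file names, sorted.
list_matches() {
    find "$dir" -type f -size "$1" -printf '%f\n' | sort
}

echo "-size -1k:    $(list_matches -1k | xargs)"      # empty
echo "-size -1024c: $(list_matches -1024c | xargs)"   # empty one
echo "-size 1k:     $(list_matches 1k | xargs)"       # kib one
echo "-size +1k:    $(list_matches +1k | xargs)"      # big
```

A one-byte file is rounded up to one whole unit, so it counts as 1k rather than less than 1k. Only the empty file matches -size -1k, while -size -1024c matches both the empty and the one-byte file.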

Using the -size test, let's search for files larger than 1 GiB. Many files can only be accessed by the root user, so I'll run the find command using sudo.

user@computer:~$ sudo find / -size +1G
/var/log/large.log

Unsurprisingly, since I created it using the fallocate command, there's a large log file taking up most of the space.

user@computer:~$ ls -lh /var/log/large.log 
-rw-r--r-- 1 root root 20G Jan 30 14:43 /var/log/large.log
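When a search like this returns more than one file, it helps to sort the results by size. Here's a rough sketch, again assuming GNU find; largest_files is a hypothetical helper, and the scratch directory only exists to demonstrate it.

```shell
#!/usr/bin/env bash
# List files above a size threshold, largest first (GNU find assumed).
# largest_files is a hypothetical helper name.
largest_files() {
    # $1 = starting directory, $2 = a -size argument such as +1G
    # -xdev keeps find on the starting file system; errors such as
    # "Permission denied" are discarded.
    find "$1" -xdev -type f -size "$2" -printf '%s\t%p\n' 2>/dev/null | sort -rn
}

# Small demonstration in a scratch directory.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
head -c 5000 /dev/zero > "$demo/b"
head -c 3000 /dev/zero > "$demo/a"
head -c 10   /dev/zero > "$demo/c"
largest_files "$demo" +1k   # lists b (5000 bytes), then a (3000 bytes)
```

Each output line is the size in bytes, a tab, and the path. Shell functions aren't visible to sudo, so to scan the whole root file system I'd run the find pipeline itself under sudo rather than the function.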

Next time I'm stuck, I can't say I won't search the web for the answer. But I at least intend to make an effort to go to the official documentation to continue improving my own knowledge and proficiency as a system administrator.