What is file globbing in Linux?

File globbing is a feature provided by the UNIX/Linux shell that lets a single pattern represent multiple file names by using special characters called wildcards. A wildcard is a symbol that stands in for zero or more characters. We can therefore use wildcards to generate exactly the combination of file names we require.

What are regular expressions?

A regular expression is a sequence of symbols, including letters, numbers and special characters such as $, ^, * and ., that many programming languages and tools understand and use to match string patterns. Regular expressions are somewhat complex, but with examples you can understand them quickly and easily. Please head to our posts on regular expressions for details.

Types of available file globbing:

The bash shell provides three characters to use as wildcards:

  • Asterisk (*) to represent 0 or more characters
  • Question mark (?) to represent exactly one character
  • Square brackets ([]) to represent exactly one character, which must be one of the characters enclosed within the square brackets.
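
The three wildcards listed above can be compared side by side. The following is a minimal sketch rather than part of the original examples: it assumes bash, and the scratch directory and the disk* file names are made up purely for illustration.

```shell
# Fix the sort order of expanded names so the result is predictable.
export LC_ALL=C

# Work in a throwaway directory so the patterns only see the files created here.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch disk disk1 disk2 disk10 diska

# echo simply prints whatever the shell expanded each pattern into.
star_matches=$(echo disk*)        # zero or more characters after "disk"
question_matches=$(echo disk?)    # exactly one character after "disk"
bracket_matches=$(echo disk[12])  # exactly one character, either 1 or 2

echo "disk*    -> $star_matches"
echo "disk?    -> $question_matches"
echo "disk[12] -> $bracket_matches"
```

Only disk* picks up the bare name disk and the longer name disk10, because it is the only pattern that accepts zero characters or more than one character after the prefix.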

In this article, we’ll go through multiple examples of each of the above-mentioned file globbing options to understand their usage. For globbing to behave as you expect, you should also be aware of the globbing-related options of the shopt command.
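
As a rough illustration of the shopt options mentioned above, the sketch below toggles two of them, dotglob and nullglob, in a scratch directory. It assumes bash (shopt is a bash builtin), and the file names are hypothetical.

```shell
# Assumes bash: shopt is not available in a plain POSIX sh.
export LC_ALL=C
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch visible.txt .hidden.txt

# Default behaviour: * skips names that start with a dot,
# and a pattern that matches nothing is left as literal text.
default_star=$(echo *)
default_nomatch=$(echo *.log)

# dotglob: let * match hidden (dot) files as well.
shopt -s dotglob
dotglob_star=$(echo *)
shopt -u dotglob

# nullglob: a pattern that matches nothing expands to nothing at all.
shopt -s nullglob
nullglob_nomatch=$(echo *.log)
shopt -u nullglob

echo "default *      -> $default_star"
echo "default *.log  -> $default_nomatch"
echo "dotglob *      -> $dotglob_star"
echo "nullglob *.log -> '$nullglob_nomatch'"
```

Running shopt with no arguments lists every option and its current state, which is a quick way to check how globbing is configured in a given session.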

Example 1: Using * to expand all disk device file names that begin with /dev/sda.

[[email protected] ~]# ls -l /dev/sda*
brw-rw----. 1 root disk 8, 0 Nov 2 18:01 /dev/sda
brw-rw----. 1 root disk 8, 1 Nov 2 18:02 /dev/sda1
brw-rw----. 1 root disk 8, 2 Nov 2 18:01 /dev/sda2

 

Example 2: Using ? to expand disk device file names that begin with /dev/sda, excluding the file name /dev/sda itself.

[[email protected] ~]# ls -l /dev/sda?
brw-rw----. 1 root disk 8, 1 Nov 2 18:02 /dev/sda1
brw-rw----. 1 root disk 8, 2 Nov 2 18:01 /dev/sda2

 

Notice that we got 3 results when using * but only 2 results when using ?.
This is because * matches 0 or more characters, so it also matched /dev/sda itself, where no (0) character follows /dev/sda. The ? requires exactly one character, so /dev/sda was excluded.

Example 3: Expand all file names in the current working directory.

[[email protected] tmp]# ls *
sahil-test.txt sahil.tmp yum_save_tx-2017-10-28-16-40htmJvm.yumtx
abc:
def ghi
ssh-BzluMR4122:
agent.4122

 

The shell expands * to the name of every non-hidden file and directory in the current working directory. ls then prints the files and, for each directory, the files and subdirectories it contains.
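
To see that the expansion is done by the shell and not by ls, the names can be printed with echo and compared with what ls does with them. This is a small sketch under assumed names (abc, def, ghi and notes.txt are created in a scratch directory for the purpose).

```shell
export LC_ALL=C
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir abc
touch abc/def abc/ghi notes.txt

# The shell replaces * with the matching names before ls even starts,
# so "ls *" here is the same as typing: ls abc notes.txt
expanded=$(echo *)

# ls lists the plain files, then the contents of each directory it was handed.
with_contents=$(ls *)

# -d tells ls to list directories themselves rather than what is inside them.
names_only=$(ls -d *)

echo "$expanded"
echo "$with_contents"
echo "$names_only"
```

Using ls -d * is a handy way to list only the names that matched, without descending into the matched directories.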

Example 4: Using square brackets to expand file names based on a set of characters listed within the square brackets.

[[email protected] ~]# ls -l /dev/sd[ab]*
brw-rw----. 1 root disk 8, 0 Nov 2 18:01 /dev/sda
brw-rw----. 1 root disk 8, 1 Nov 2 18:02 /dev/sda1
brw-rw----. 1 root disk 8, 2 Nov 2 18:01 /dev/sda2
brw-rw----. 1 root disk 8, 16 Nov 2 18:01 /dev/sdb

In the above example, we’ve used [ab] in square brackets along with the asterisk symbol. This expands to all file names that begin with /dev/sda or /dev/sdb followed by any number of characters, including none.

As in our second example, if we use ? instead of *, we’ll only get the file names that have exactly one character in place of the question mark, as shown in the next example.

Example 5: Using ? with [] instead of *

[[email protected] ~]# ls -l /dev/sd[ab]?
brw-rw----. 1 root disk 8, 1 Nov 2 18:02 /dev/sda1
brw-rw----. 1 root disk 8, 2 Nov 2 18:01 /dev/sda2
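
Examples 4 and 5 can be reproduced safely without touching /dev. The sketch below uses ordinary empty files that are merely named like disk devices (an assumption made for illustration) and also shows a range such as [a-c] or [0-9] inside the brackets.

```shell
export LC_ALL=C
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
# Plain files named like disk devices; nothing here touches /dev.
touch sda sda1 sda2 sdb sdc sdc1

set_star=$(echo sd[ab]*)          # a or b, then anything (including nothing)
set_question=$(echo sd[ab]?)      # a or b, then exactly one more character
range_digit=$(echo sd[a-c][0-9])  # a, b or c, then exactly one digit

echo "sd[ab]*      -> $set_star"
echo "sd[ab]?      -> $set_question"
echo "sd[a-c][0-9] -> $range_digit"
```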

Example 6: Using * to perform a recursive copy operation.

[[email protected] tmp]# cp -Rv /tmp/abc/* /tmp/lmn/
`/tmp/abc/def' -> `/tmp/lmn/def'
`/tmp/abc/def/nop' -> `/tmp/lmn/def/nop'
`/tmp/abc/ghi' -> `/tmp/lmn/ghi'

The above command copies all files and subdirectories within the /tmp/abc directory to the /tmp/lmn directory. Note that by default * does not match hidden files, that is, names beginning with a dot.
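
One thing to keep in mind with a copy like this is that * leaves out dot files. The following sketch, which uses temporary directories and a hypothetical .hidden file, contrasts the wildcard form with copying "directory/." instead.

```shell
src=$(mktemp -d)
dest_star=$(mktemp -d)
dest_dot=$(mktemp -d)
mkdir "$src/def"
touch "$src/def/nop" "$src/ghi" "$src/.hidden"

# The * is expanded by the shell and skips names starting with a dot.
cp -R "$src"/* "$dest_star"/

# Copying "directory/." copies the directory's contents, dot files included.
cp -R "$src"/. "$dest_dot"/

# ls -A shows hidden entries too (but not . and ..).
echo "copied with *:"
ls -A "$dest_star"
echo "copied with /. :"
ls -A "$dest_dot"
```

The quotes protect the variable, while the * is kept outside the quotes so that the shell still expands it.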

Example 7: Using * to perform a recursive delete operation.

[[email protected] tmp]# rm -rfv /tmp/lmn/*
removed `/tmp/lmn/def/nop'
removed directory: `/tmp/lmn/def'
removed `/tmp/lmn/ghi'
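
Because a recursive delete cannot be undone, it may be worth printing what a pattern expands to before handing it to rm. This is one possible habit rather than a required step, sketched here against a temporary directory with made-up contents.

```shell
target=$(mktemp -d)
mkdir "$target/def"
touch "$target/def/nop" "$target/ghi"

# Dry run: print what the pattern expands to without deleting anything.
preview=$(printf '%s\n' "$target"/*)
echo "$preview"

# Only after checking the list, run the real delete.
# The -- guards against matched names that happen to begin with a dash.
rm -rf -- "$target"/*

# The directory itself survives; only the matched entries are gone.
remaining=$(ls -A "$target")
echo "left behind: '$remaining'"
```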

Example 8: Using ? to match the exact number of characters.

In this example, we’ll see how we can use the question mark to match exactly 2 characters.

[[email protected] tmp]# df -hTP /test*
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/test-lv06 ext4 93M 1.6M 87M 2% /test1
/dev/mapper/vg_cent68-lv_root ext4 13G 2.3G 9.9G 19% /
/dev/mapper/test-lv07 ext4 93M 1.6M 87M 2% /test24

I get 3 results when expanding /test with the * but I’m interested only in /test24.
To get only that in the result we’ll use 2 question marks as shown below:

[[email protected] tmp]# df -hTP /test??
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/test-lv07 ext4 93M 1.6M 87M 2% /test24

Example 9: Using [] instead of ? to match the exact number of characters.

In this example, we’ll be using a range of numbers enclosed in square brackets to expand the required result instead of using question marks.

[[email protected] tmp]# df -hTP /test[0-9][0-4]
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/test-lv07 ext4 93M 1.6M 87M 2% /test24

To demonstrate the accuracy, if I set the second range to 0-3 in the square brackets, the pattern no longer matches /test24 and we no longer get the desired output.

[[email protected] tmp]# df -hTP /test[0-9][0-3]
df: `/test[0-9][0-3]': No such file or directory
df: no file systems processed
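
The error above shows the unexpanded pattern because, by default, bash passes a pattern that matches nothing to the command as literal text. The sketch below reproduces that with two scratch directories named test1 and test24, and uses the bash builtin compgen -G as one way to check a pattern before using it; it assumes bash.

```shell
# Assumes bash: compgen is a bash builtin.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir test1 test24

# With nothing to match, the pattern is passed on unchanged as a literal string.
unmatched=$(echo test[0-9][0-3])
echo "$unmatched"

# compgen -G prints the matches and returns non-zero when there are none,
# which lets a script test a pattern before running a command on it.
if compgen -G 'test[0-9][0-4]' > /dev/null; then found_0_4=yes; else found_0_4=no; fi
if compgen -G 'test[0-9][0-3]' > /dev/null; then found_0_3=yes; else found_0_3=no; fi

echo "test[0-9][0-4] matched: $found_0_4"
echo "test[0-9][0-3] matched: $found_0_3"
```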

Example 10: Deciding when to use the appropriate wildcard.

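Wildcards can be combined in a single pattern, as we did with [ab]* in Example 4.
Combining ? with [] keeps the match tightly constrained, whereas adding * usually makes the pattern much broader than intended, so combine it with other wildcards carefully.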

In the below output, we’ve used ? and [] together in a single expression.

[[email protected] tmp]# df -hTP /?0[0-9]
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/test-lv01 ext4 93M 1.6M 87M 2% /u01
/dev/mapper/test-lv02 ext4 93M 1.6M 87M 2% /u02
/dev/mapper/test-lv03 ext4 93M 1.6M 87M 2% /u03
/dev/mapper/test-lv04 ext4 93M 1.6M 87M 2% /u04
/dev/mapper/test-lv05 ext4 93M 1.6M 87M 2% /u05

But if we use * instead of the 0 to match any characters, we get the below output:

[[email protected] tmp]# df -hTP /?*[0-9]
Filesystem Type Size Used Avail Use% Mounted on
/dev/mapper/vg_cent68-lv_root ext4 13G 2.3G 9.9G 19% /
/dev/mapper/test-lv06 ext4 93M 1.6M 87M 2% /test1
/dev/mapper/vg_cent68-lv_root ext4 13G 2.3G 9.9G 19% /
/dev/mapper/test-lv07 ext4 93M 1.6M 87M 2% /test24
/dev/mapper/test-lv01 ext4 93M 1.6M 87M 2% /u01
/dev/mapper/test-lv02 ext4 93M 1.6M 87M 2% /u02
/dev/mapper/test-lv03 ext4 93M 1.6M 87M 2% /u03
/dev/mapper/test-lv04 ext4 93M 1.6M 87M 2% /u04
/dev/mapper/test-lv05 ext4 93M 1.6M 87M 2% /u05

Adding the * removed the constraint on what comes between the first character and the final digit. Since * matches zero or more of any character, the pattern now matches any name of two or more characters that ends in a digit.
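
The difference between the two patterns is easier to see against a small, known set of names. The sketch below recreates a few of the directory names from the output above inside a scratch directory and adds a hypothetical home directory that neither pattern should match.

```shell
export LC_ALL=C
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir test1 test24 u01 u02 home

narrow=$(echo ?0[0-9])  # any one character, a literal 0, then one digit
broad=$(echo ?*[0-9])   # one character, then anything, then one digit

echo "?0[0-9] -> $narrow"
echo "?*[0-9] -> $broad"
```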

A caveat:

Beginners sometimes confuse wildcards with regular expressions when using grep, but they are not the same.
Wildcards are a feature provided by the shell to expand file names, whereas regular expressions are a text filtering mechanism intended for use with utilities like grep, sed and awk.

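If you use * with grep it appears to work, but its meaning is different: grep treats * as a quantifier that matches zero or more occurrences of the preceding character. The pattern '/u0*' therefore means /u followed by zero or more 0 characters, not /u0 followed by anything.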

[[email protected] tmp]# df -hTP | grep '/u0*'
/dev/mapper/test-lv01 ext4 93M 1.6M 87M 2% /u01
/dev/mapper/test-lv02 ext4 93M 1.6M 87M 2% /u02
/dev/mapper/test-lv03 ext4 93M 1.6M 87M 2% /u03
/dev/mapper/test-lv04 ext4 93M 1.6M 87M 2% /u04
/dev/mapper/test-lv05 ext4 93M 1.6M 87M 2% /u05

Using square brackets also works because in regular expression terminology [] represents a character class, which matches exactly one of the enclosed characters just as it does in a wildcard pattern.

[[email protected] tmp]# df -hTP | grep '/u0[0-9]'
/dev/mapper/test-lv01 ext4 93M 1.6M 87M 2% /u01
/dev/mapper/test-lv02 ext4 93M 1.6M 87M 2% /u02
/dev/mapper/test-lv03 ext4 93M 1.6M 87M 2% /u03
/dev/mapper/test-lv04 ext4 93M 1.6M 87M 2% /u04
/dev/mapper/test-lv05 ext4 93M 1.6M 87M 2% /u05

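But if you use ? with grep, it won’t work as a wildcard. In grep's default (basic) regular expression syntax ? is an ordinary character, so the pattern below searches for the literal text /u0?? and finds nothing.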

[[email protected] tmp]# df -hTP | grep '/u0??'
[[email protected] tmp]#
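
For comparison, the regular expression counterparts of the two wildcards are . (any single character) for ? and .* (any run of characters) for *. The sketch below runs grep against a few made-up lines shaped like the df output rather than against real df output.

```shell
# Hypothetical sample lines standing in for df output.
sample='/dev/mapper/test-lv01 /u01
/dev/mapper/test-lv07 /test24
/dev/mapper/vg-root /'

# Glob ? (one character) corresponds to . in a regular expression.
one_char=$(printf '%s\n' "$sample" | grep -c ' /u0.$' || true)

# Glob * (any run of characters) corresponds to .* in a regular expression.
any_run=$(printf '%s\n' "$sample" | grep -c ' /test.*$' || true)

# A bare ? is an ordinary character in grep's default syntax, so this
# searches for the literal text "/u0??" and counts no matching lines.
literal_q=$(printf '%s\n' "$sample" | grep -c '/u0??' || true)

echo "lines matching ' /u0.\$'    : $one_char"
echo "lines matching ' /test.*\$' : $any_run"
echo "lines matching '/u0??'     : $literal_q"
```

The || true only keeps the script going when grep finds nothing, since grep returns a non-zero exit status in that case.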

Conclusion:

In this article, we demonstrated multiple examples to illustrate how we may use wildcards on the Linux command line.
We also clarified with the help of examples how wildcards should not be confused with regular expression syntax.