History of Bash globbing

bash was initially designed in the late 80s as a partial clone of ksh with some interactive features from csh/tcsh.

The origins of its globbing are to be found in the earlier shells it builds upon.

ksh is itself an extension of the Bourne shell. The Bourne shell (first released in 1979 in Unix V7) was a clean implementation from scratch, but it did not depart completely from the Thompson shell (the shell of V1 to V6) and incorporated features from the Mashey shell.

In particular, command arguments were still separated by blanks, | was now the new pipe operator but ^ was still supported as an alternative (which also explains why you write [!a-z] and not [^a-z]), $1 was still the first argument to a script and backslash was still the escape character. As a result, many of the regexp operators (^\|$) have a special meaning of their own in the shell.

The Thompson shell relied on an external utility for globbing. When sh found an unquoted *, [ or ? in the command, it would run the command through glob.

rm *.txt

would end up running glob as:

["glob", "rm", "*.txt"]

and glob would end up running rm with the list of files matching that pattern.

grep a.\*b *.txt

would run glob as:

["glob", "grep", "a.\252b", "*.txt"]

The * above has been quoted by setting the 8th bit on that character, preventing glob from treating it as a wildcard. glob would then remove that bit before calling grep.
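
The arithmetic behind that \252 can be checked from any POSIX-style shell. This is only a rough illustration of the bit manipulation described above, not the actual Thompson shell or glob code; the variable names star, quoted and restored are made up for the example.

```shell
# ASCII code of '*': a leading ' in the argument makes printf output the
# numeric value of the character that follows it.
star=$(printf '%d' "'*")

# Mark the character as quoted by setting the 8th bit (0200 octal).
quoted=$(( star | 0200 ))

# Clear the bit again, as glob did before running the real command.
restored=$(( quoted & 0177 ))

printf 'plain: %o  quoted: %o  restored: %o\n' "$star" "$quoted" "$restored"
```

The octal value printed for the quoted character is 252, which is the \252 shown in the argument list above.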

To do the equivalent of rm *.txt with regexps (through a hypothetical regexp utility playing the role of glob), that would have been:

regexp rm '\.txt$'

Or:

regexp rm '^[^.].*\.txt$'

to exclude dot-files.

Regexps are not a very good fit for matching filenames, and are complicated for a beginner: their operators have to be escaped because they double as shell special characters, and ., which is common in filenames, is itself a regexp operator. In most cases, all you need is wildcards that can stand for either one (?) or any number (*) of characters.
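
To make the comparison concrete, here is a small sketch that selects the same files once with a wildcard and once with the regexp from above. It assumes a scratch directory created with mktemp, and the file names are invented for the example; ls piped to grep stands in for the hypothetical regexp utility.

```shell
# Scratch directory with a few sample files, one of them hidden.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/notes.txt" "$tmp/.hidden.txt" "$tmp/b.log"

# Wildcard: expanded by the shell itself, dot-files skipped by default.
glob_matches=$(cd "$tmp" && printf '%s\n' *.txt)

# Regexp: every operator has to be quoted away from the shell, and the
# exclusion of dot-files has to be spelled out with [^.].
regex_matches=$(ls -a "$tmp" | grep '^[^.].*\.txt$')

printf 'glob:\n%s\nregexp:\n%s\n' "$glob_matches" "$regex_matches"
rm -rf "$tmp"
```

Both variables should end up holding a.txt and notes.txt, but only the regexp version needed quoting and an explicit rule for hidden files.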

Since then, different shells have added different globbing operators. Nowadays, the ksh and zsh globs (and to some extent those of bash -O extglob, which implements a subset of ksh globs) are functionally equivalent to regexps, with a syntax that is less cumbersome to use with filenames and the current shell syntax. For instance, in zsh (with the extendedglob option), you can do:

echo a#.txt

if you want (unlikely) to match filenames that consist of a sequence of a characters followed by .txt. That is easier than echo (^a*\.txt$) (here using parentheses as a way to isolate the regexp operators from the shell operators, which is one way shells could have dealt with it).

echo (foo|bar|<1-20>).(#i)mpg

matches mpg files (extension matched case-insensitively) whose basename is foo, bar or a decimal number from 1 to 20.

ksh93 can now also incorporate regexps (basic, extended, perl-like or "augmented") in its globs (though that is quite buggy) and even provides a way to convert between glob and regexp (printf %R, printf %P):

echo ~(Ei:.*\.txt)

to match (non-hidden) txt files with extended regular expressions, case-insensitively.


Regular languages were introduced by Kleene in 1956. The seminal paper didn't have the full modern notation for regular expressions, but it did introduce the “Kleene star”: A* meaning “any number of repetitions of A”. In the next decade, some more or less standard notations emerged, in particular . for an arbitrary character and ? to mean that the previous character is optional.

Bash's globbing notation stems from the glob command introduced all the way back in Unix v1 in 1971. At the time, globbing was performed by a separate program; it was later moved into the shell. The early glob command had ? to mean “any one character” and * to mean “any sequence of characters”. I don't know why these characters were chosen; ? is pretty intuitive, and * may have been inspired by the one in regular expressions.

Globbing wasn't intended to be as general as regular expressions, and regular expressions were not very widespread at the time, so there was no call to unify the concepts. From the start, there were syntactic incompatibilities, with ?, . and * meaning different things in file name patterns and in regular expressions.

Modern shells such as bash expand on glob patterns, but this was a gradual evolution that maintained backward compatibility. Ksh88 (the 1988 version of the Korn shell) introduced an extended syntax for shell patterns, which could not be the same syntax as usual regular expressions but was strongly inspired by it: *(PATTERN) to mean any number of repetitions of PATTERN, @(PATTERN1|PATTERN2) to mean “PATTERN1 or PATTERN2”, etc.

Modern versions of bash (since 2.02) support ksh88's extended patterns, if you issue shopt -s extglob first.
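
As a minimal sketch of those two operators in bash, the following matches plain strings rather than files, so it can be tried without creating anything on disk. The function name classify and the sample names are made up for the illustration.

```shell
shopt -s extglob   # enable the ksh-style extended patterns

# classify is a hypothetical helper, used only for this example.
classify() {
  if [[ $1 == *(a).txt ]]; then
    echo "a-run"         # any number of a (including none), then .txt
  elif [[ $1 == @(foo|bar).mpg ]]; then
    echo "foo-or-bar"    # exactly foo or bar, then .mpg
  else
    echo "no match"
  fi
}

for name in aaa.txt ab.txt foo.mpg baz.mpg; do
  printf '%s: %s\n' "$name" "$(classify "$name")"
done
```

Here *(a) plays the role of the regexp a*, and @(foo|bar) the role of (foo|bar), without any of the characters needing to be escaped from the shell.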