Python Glob Module

The Python glob module is used for filename matching. It finds files whose names match a given pattern, and its most important use is searching directories with wildcard characters. The glob module is part of the Python Standard Library. Its patterns look similar to regular expressions but follow the simpler Unix shell wildcard rules, which makes glob a higher-level interface.
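
To make the wildcard rules concrete, here is a small sketch that applies them to a made-up list of names using the standard library's fnmatch module, which glob relies on for matching. The file names and the matching() helper are only for illustration, and no files on disk are touched.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, used only for illustration.
names = ["index.html", "about.html", "notes.txt",
         "data1.csv", "data2.csv", "data10.csv"]


def matching(pattern):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(matching("*.html"))        # * matches any run of characters
print(matching("data?.csv"))     # ? matches exactly one character
print(matching("data[12].csv"))  # [12] matches one character from the set
```

Note that data10.csv is not matched by data?.csv, because ? stands for exactly one character.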

What does the Python Glob Module do?

The glob module can search for files whose names follow a specific pattern, which comes in handy when you are reading from several files that have similar names. Once the files are found, you can concatenate them into a single dataframe for further analysis.
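
As a rough sketch of that workflow, the code below collects every file matching a pattern and merges their rows into one list. It uses the standard csv module rather than a dataframe library to stay dependency-free; with pandas, the same loop would typically feed pandas.concat. The sales_*.csv file names, the column names and the read_all_rows() helper are assumptions made for this example.

```python
import csv
import glob
import os
import tempfile


def read_all_rows(pattern):
    """Read every CSV file matching pattern and return all rows as one list."""
    rows = []
    # sorted() gives a stable order; glob returns matches in arbitrary order.
    for filename in sorted(glob.glob(pattern)):
        with open(filename, newline="") as handle:
            rows.extend(csv.DictReader(handle))
    return rows


# Build two small, hypothetical files in a temporary directory for the demo.
with tempfile.TemporaryDirectory() as folder:
    for month, total in [("jan", "10"), ("feb", "20")]:
        target = os.path.join(folder, f"sales_{month}.csv")
        with open(target, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["month", "total"])
            writer.writerow([month, total])
    # glob.escape() protects the folder name in case it holds wildcard characters.
    combined = read_all_rows(os.path.join(glob.escape(folder), "sales_*.csv"))

print(combined)
```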

Search for a Specific File with Python Glob Module

If we want to look for all HTML files in a directory, we can use regular expressions (the re module) or we can use glob. The regular expression version takes more code to write, while glob does it in a single line. In the following code I am looking for all HTML files in my current directory.

Code Using Only Regex to Find HTML Files



import re
import os

currentdir = os.getcwd()
files = os.listdir(currentdir)
# Escape the dot so only names ending in ".html" match (not e.g. "nothtml")
pattern = r"^.*\.html$"
prog = re.compile(pattern)
htmlfiles = []
for file in files:
    # match() returns None when the name does not fit the pattern
    if prog.match(file):
        htmlfiles.append(file)

for htmlfile in htmlfiles:
    print(htmlfile)

Code Using Glob Module to Find HTML Files


import glob
# path of the current directory
path = '.'
curfiles = glob.glob(path + '/*.html')
for file in curfiles:
    print(file)
    

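Both snippets find the same files. The glob version prints each name with the search path in front, because the pattern includes it.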


How to recursively search all directories with Glob

You can search all subdirectories with a single call to the glob module. It only takes one more parameter to the glob method: recursive, which takes a Boolean value, either True or False. Setting recursive=True makes the ** wildcard match any number of nested directories, so the pattern must contain ** for the search to descend. Below is a sample that searches the current directory and all its subdirectories for HTML files.


import glob

# path to search file
path = '.'
# ** matches the starting directory and every directory below it
for file in glob.glob(path + "/**/*.html", recursive=True):
    print(file)
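
To see the effect of ** in isolation, the following sketch builds a small temporary tree and runs the same search with and without it. The directory layout and file names are made up for the demonstration.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one HTML file at the top, one in a subdirectory.
    os.makedirs(os.path.join(root, "docs"))
    for relative in ["index.html", os.path.join("docs", "guide.html")]:
        open(os.path.join(root, relative), "w").close()

    base = glob.escape(root)
    # Without ** the search stays in the top directory, even with recursive=True.
    flat = glob.glob(os.path.join(base, "*.html"), recursive=True)
    # With ** the search also descends into subdirectories.
    deep = glob.glob(os.path.join(base, "**", "*.html"), recursive=True)

    flat_names = sorted(os.path.relpath(p, root) for p in flat)
    deep_names = sorted(os.path.relpath(p, root) for p in deep)

print(flat_names)
print(deep_names)
```

Only the second call finds guide.html inside the docs subdirectory.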


Print all file names in a Drive with Python

To print the names of all the files present in a drive, we can use the Python glob module. Below is code that uses glob with wildcards to print the name of every file on a drive.


import glob

print('Inside D: drive')
# ** with recursive=True matches every file and directory under the D: drive
files = glob.glob("D:\\**", recursive=True)
for item in files:
    print(item)
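
On a large drive, building the full list can take a while and use a lot of memory. One option is glob.iglob(), which returns an iterator that yields the same paths without storing them all at once. The sketch below takes only the first few matches; the first_matches() helper and the temporary files are assumptions for the demonstration.

```python
import glob
import itertools
import os
import tempfile


def first_matches(pattern, limit):
    """Return at most limit matches without building the full result list."""
    # iglob yields paths one at a time instead of collecting them all first.
    return list(itertools.islice(glob.iglob(pattern, recursive=True), limit))


with tempfile.TemporaryDirectory() as root:
    # Create five hypothetical files to search through.
    for number in range(5):
        open(os.path.join(root, f"file{number}.txt"), "w").close()
    sample = first_matches(os.path.join(glob.escape(root), "**", "*.txt"), 3)
    sample_count = len(sample)

print(sample_count)
```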
    

glob.escape() method in Python Glob Module 

This method lets a pattern include glob's special characters as literal text. The characters glob treats as special are *, ? and [; glob.escape() wraps each of them in brackets so that it matches only itself. Characters such as _, # and $ have no special meaning to glob, so escape() returns them unchanged and calling it on them is harmless. glob() and escape() can be used together to search for filenames that contain special characters. The best way to explain this method is with an example.



import glob

files = glob.glob("D:\\**\\*.jpg", recursive=True)
# All jpg files
print(files)

# JPEG files with particular characters in their name
# _, $ and # are returned unchanged by escape(); [ is escaped to [[]
char_seq = "_$#["
for char in char_seq:
    pattern = "*" + glob.escape(char) + "*" + ".jpg"
    # This pattern has no directory part, so it searches the current directory
    for file in glob.glob(pattern):
        print(file)
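
A case where escaping changes the result is a directory whose own name contains brackets. The sketch below, which uses a made-up directory name, shows that the unescaped path finds nothing while the escaped one finds the file.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical directory whose name contains [ and ], which glob treats as special.
    album = os.path.join(root, "trip[2021]")
    os.makedirs(album)
    open(os.path.join(album, "beach.jpg"), "w").close()

    # Unescaped, [2021] is read as "one character out of 2, 0, 2, 1".
    unescaped = glob.glob(os.path.join(album, "*.jpg"))
    # Escaped, the brackets are matched literally.
    escaped = glob.glob(os.path.join(glob.escape(album), "*.jpg"))
    escaped_names = [os.path.basename(p) for p in escaped]

print(glob.escape("trip[2021]"))
print(unescaped)
print(escaped_names)
```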




Summary and Conclusion

We have learned how to traverse a whole drive and match certain files in a directory. There is a lot more we can do with the Python glob module. If you are interested in other Python tutorials, please visit my YouTube channel, Code with Ali.
