The pathlib package#
The pathlib module in Python provides an object-oriented interface for working with filesystem paths. It offers a simple way to perform common path manipulations. With pathlib, you can work with paths on different operating systems without changing your code. It’s part of the standard library, so no additional installation is required.
Warning
The paths in the following examples point to locations on the author’s machine, so the code will not run unchanged on your computer. Replace them with paths that exist on your machine.
path objects#
Path objects are the core concept in pathlib. They represent filesystem paths and provide methods and properties to access the file system. Let’s start by importing Path from pathlib:
from pathlib import Path
Here, we create a path object pointing to the Desktop folder for a user named Tyson:
desktop_path = Path('/Users/tyson/Desktop')
desktop_path
PosixPath('/Users/tyson/Desktop')
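The PosixPath in the output above reflects the operating system the code ran on. As a minimal sketch, the following shows that Path picks a concrete class for the current platform. The name some_folder is hypothetical and does not need to exist.

```python
import os
from pathlib import Path, PosixPath, WindowsPath

# 'some_folder' is a hypothetical name; the path does not need to exist
p = Path("some_folder")

# Path() returns a WindowsPath on Windows and a PosixPath elsewhere
expected_class = WindowsPath if os.name == "nt" else PosixPath
matches_platform = isinstance(p, expected_class)

print(type(p).__name__)
print(matches_platform)
```

On macOS or Linux the first line printed is PosixPath, which matches the output shown above.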
absolute and relative paths#
An absolute path is a complete path from the root of the filesystem to the desired directory or file, whereas a relative path starts from the current directory.
Here we define an absolute path to the Desktop:
absolute_path = Path('/Users/tyson/Desktop')
absolute_path
PosixPath('/Users/tyson/Desktop')
This is a relative path, indicating the current directory:
relative_path = Path('./')
relative_path
PosixPath('.')
We can check if a path is absolute:
relative_path.is_absolute()
False
To get the full absolute path from a relative path:
# this will create the full absolute path
relative_path_full = relative_path.resolve()
To find out the current working directory:
# this tells me where I am
current_dir = Path.cwd()
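To tie these pieces together, here is a small sketch of how a relative path relates to the current working directory. The path notes/todo.txt is a hypothetical example and does not need to exist. Note that resolve() goes one step further than the anchoring shown here: it also follows symlinks and collapses ".." components.

```python
from pathlib import Path

# 'notes/todo.txt' is a hypothetical relative path; it does not need to exist
relative = Path("notes") / "todo.txt"

# Anchoring the relative path at the current working directory makes it absolute
anchored = Path.cwd() / relative

print(relative.is_absolute())
print(anchored.is_absolute())

# absolute() performs the same anchoring without resolving symlinks
print(relative.absolute() == anchored)
```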
constructing paths#
Path objects can be joined using the / operator. This makes it easy to build up paths without worrying about the underlying operating system’s path separator:
desktop_path = Path('/Users/tyson/Desktop')
project_data_path = desktop_path / 'my_project' / 'data.csv'
project_data_path
PosixPath('/Users/tyson/Desktop/my_project/data.csv')
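To illustrate the point about separators, the sketch below uses the pure path classes PurePosixPath and PureWindowsPath, which only manipulate the path text and therefore run on any operating system. The Windows location C:/Users/tyson/Desktop is a hypothetical counterpart to the Desktop folder used above.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths never touch the file system, so both flavours work on any OS
posix_path = PurePosixPath("/Users/tyson/Desktop") / "my_project" / "data.csv"

# 'C:/Users/tyson/Desktop' is a hypothetical Windows location
windows_path = PureWindowsPath("C:/Users/tyson/Desktop") / "my_project" / "data.csv"

print(posix_path)    # /Users/tyson/Desktop/my_project/data.csv
print(windows_path)  # C:\Users\tyson\Desktop\my_project\data.csv
```

The joining code is identical in both cases; only the separator in the printed result differs.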
creating and deleting folders and files#
pathlib also makes it easy to create and delete directories and files.
To create a folder:
# create a folder
project_path = Path('/Users/tyson/Desktop/my_project')
project_path.mkdir()
To create a file:
# create a file
project_data_path = Path('/Users/tyson/Desktop/my_project/data.csv')
project_data_path.touch()
To delete a file:
# delete a file
project_data_path.unlink()
And to delete a folder (it must be empty):
# delete a folder
project_path.rmdir()
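The calls above raise an error if the folder already exists, if a parent folder is missing, or if the file to delete is not there. A more forgiving variant might look like the sketch below. It works inside a temporary directory so it can run on any machine, the folder name raw is hypothetical, and the missing_ok argument assumes Python 3.8 or later.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # 'my_project/raw' is a hypothetical nested folder inside the temp directory
    project_path = Path(tmp) / "my_project" / "raw"

    # parents=True creates missing parent folders; exist_ok=True tolerates reruns
    project_path.mkdir(parents=True, exist_ok=True)
    project_path.mkdir(parents=True, exist_ok=True)  # no error the second time

    data_path = project_path / "data.csv"
    data_path.touch()
    created = data_path.is_file()

    data_path.unlink()
    data_path.unlink(missing_ok=True)  # no error although the file is already gone

    # rmdir() only succeeds because the folder is empty again
    project_path.rmdir()
    removed = not project_path.exists()

print(created, removed)  # True True
```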
checking path properties#
You can also check various properties of paths. For example, you can check if a path exists, is a file, or is a directory.
To check if a path exists:
project_path = absolute_path / 'my_project'
project_path.exists()
False
After creating the directory, check again if it exists:
project_path.mkdir()
project_path.exists()
True
To check if a path is a directory:
# only true if exists & is a directory
project_path.is_dir()
True
Or a file:
project_path.is_file()
False
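These three checks are often combined. As one possible sketch, the hypothetical helper describe below classifies a path, and a temporary directory is used so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def describe(path: Path) -> str:
    """Classify a path as 'missing', 'directory', 'file', or 'other'."""
    if not path.exists():
        return "missing"
    if path.is_dir():
        return "directory"
    if path.is_file():
        return "file"
    return "other"


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "my_project"
    data_file = folder / "data.csv"

    before = describe(folder)        # nothing has been created yet
    folder.mkdir()
    data_file.touch()
    after_folder = describe(folder)
    after_file = describe(data_file)

print(before, after_folder, after_file)  # missing directory file
```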
To get the user’s home directory:
# get the user's home directory; home() is a class method, so no path object is needed
Path.home()
PosixPath('/Users/tyson')
To extract different parts of the path:
parent_dir = project_data_path.parent
file_name = project_data_path.name
file_name_stem = project_data_path.stem
file_suffix = project_data_path.suffix
print(parent_dir)
print(file_name)
print(file_name_stem)
print(file_suffix)
/Users/tyson/Desktop/my_project
data.csv
data
.csv
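File names with more than one extension behave slightly differently, which the following sketch illustrates. The file archive.tar.gz is a hypothetical example, and PurePosixPath is used so the output is the same on every operating system.

```python
from pathlib import PurePosixPath

# 'archive.tar.gz' is a hypothetical file name with two extensions
archive = PurePosixPath("/Users/tyson/Desktop/my_project/archive.tar.gz")

print(archive.name)         # archive.tar.gz
print(archive.stem)         # archive.tar
print(archive.suffix)       # .gz
print(archive.suffixes)     # ['.tar', '.gz']
print(archive.parent.name)  # my_project
```

Only the last extension counts as the suffix, so the stem still contains ".tar".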
finding files and folders#
pathlib provides methods to find files and folders. For example, to find all .csv files in a directory:
# set things up
desktop_path = Path('/Users/tyson/Desktop')
project_path = desktop_path / 'my_project'
project_data_path = Path('/Users/tyson/Desktop/my_project/data.csv')
project_data_path.touch()
# find all of the csv files in the "project_path" folder
list_of_csv_a = list(project_path.glob('*.csv'))
list_of_csv_a
[PosixPath('/Users/tyson/Desktop/my_project/data.csv')]
To recursively find all .csv files in a directory and its subdirectories:
# recursively find all of the csv files in the "project_path" folder and subfolders
list_of_csv_b = list(project_path.rglob('*.csv'))
list_of_csv_b
[PosixPath('/Users/tyson/Desktop/my_project/data.csv')]
To clean up, delete the file and folder we created:
# clean up; delete the file and folder
project_data_path.unlink()
project_path.rmdir()
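The glob and rglob searches above return the same result because my_project contains no subfolders. The sketch below shows where they differ. It builds a small tree in a temporary directory, and the names archive, old_data.csv, and notes.txt are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    project_path = Path(tmp) / "my_project"

    # build a small tree: one csv at the top level, one in a subfolder
    (project_path / "archive").mkdir(parents=True)
    (project_path / "data.csv").touch()
    (project_path / "archive" / "old_data.csv").touch()
    (project_path / "notes.txt").touch()

    # glob() looks only in project_path; rglob() also descends into subfolders
    top_level = sorted(p.name for p in project_path.glob("*.csv"))
    recursive = sorted(p.name for p in project_path.rglob("*.csv"))

print(top_level)  # ['data.csv']
print(recursive)  # ['data.csv', 'old_data.csv']
```

The results are sorted because neither method guarantees the order of the paths it yields.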