Python os Module

Author: jerry_c43c | Source: published 2019-05-13 09:54 | Read 0 times

In this tutorial, you will learn how to work with Python's os module.

Table of Contents

Introduction

Basic Functions

List Files / Folders in Current Working Directory

Change Working Directory

Create Single and Nested Directory Structure

Remove Single and Nested Directory Structure Recursively

Example with Data Processing

Conclusion

Introduction

Python is one of the most frequently used languages in recent times for tasks such as data processing, data analysis, and website building. Many of these tasks depend on the operating system. Python exposes OS-dependent functionality through the os module, which abstracts the platform and provides Python functions to navigate, create, delete, and modify files and folders. In this tutorial you will learn how to import the module, use its basic functions, and apply it in a sample Python project that merges data.

Some Basic Functions

Let's explore the module with some example code.

Import the library:

import os

Let's get the list of methods that we can use with this module.

print(dir(os)) 

Output:

['DirEntry', 'F_OK', 'MutableMapping', 'O_APPEND', 'O_BINARY', 'O_CREAT', 'O_EXCL', 'O_NOINHERIT', 'O_RANDOM', 'O_RDONLY', 'O_RDWR', 'O_SEQUENTIAL', 'O_SHORT_LIVED', 'O_TEMPORARY', 'O_TEXT', 'O_TRUNC', 'O_WRONLY', 'P_DETACH', 'P_NOWAIT', 'P_NOWAITO', 'P_OVERLAY', 'P_WAIT', 'PathLike', 'R_OK', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'TMP_MAX', 'W_OK', 'X_OK', '_Environ', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_execvpe', '_exists', '_exit', '_fspath', '_get_exports_list', '_putenv', '_unsetenv', '_wrap_close', 'abc', 'abort', 'access', 'altsep', 'chdir', 'chmod', 'close', 'closerange', 'cpu_count', 'curdir', 'defpath', 'device_encoding', 'devnull', 'dup', 'dup2', 'environ', 'errno', 'error', 'execl', 'execle', 'execlp', 'execlpe', 'execv', 'execve', 'execvp', 'execvpe', 'extsep', 'fdopen', 'fsdecode', 'fsencode', 'fspath', 'fstat', 'fsync', 'ftruncate', 'get_exec_path', 'get_handle_inheritable', 'get_inheritable', 'get_terminal_size', 'getcwd', 'getcwdb', 'getenv', 'getlogin', 'getpid', 'getppid', 'isatty', 'kill', 'linesep', 'link', 'listdir', 'lseek', 'lstat', 'makedirs', 'mkdir', 'name', 'open', 'pardir', 'path', 'pathsep', 'pipe', 'popen', 'putenv', 'read', 'readlink', 'remove', 'removedirs', 'rename', 'renames', 'replace', 'rmdir', 'scandir', 'sep', 'set_handle_inheritable', 'set_inheritable', 'spawnl', 'spawnle', 'spawnv', 'spawnve', 'st', 'startfile', 'stat', 'stat_float_times', 'stat_result', 'statvfs_result', 'strerror', 'supports_bytes_environ', 'supports_dir_fd', 'supports_effective_ids', 'supports_fd', 'supports_follow_symlinks', 'symlink', 'sys', 'system', 'terminal_size', 'times', 'times_result', 'truncate', 'umask', 'uname_result', 'unlink', 'urandom', 'utime', 'waitpid', 'walk', 'write']

Now, using the getcwd method, we can retrieve the path of the current working directory.

print(os.getcwd()) 

Output:

C:\Users\hpandya\OneDrive\work\StackAbuse\os_python\os_python\Project 
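The path returned by getcwd can be taken apart and put back together with the helpers in os.path. The following is a minimal sketch rather than part of the original project; the variable names are only illustrative.

```python
import os

# Absolute path of the directory the script is running in
cwd = os.getcwd()

# Split the path into its parent directory and its last component
parent, folder_name = os.path.split(cwd)
print(folder_name)
print(parent)

# os.path.join rebuilds the path with the separator of the current platform
rejoined = os.path.join(parent, folder_name)
```

Building paths with os.path.join avoids hard-coding / or \ and keeps the code portable between operating systems.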

List Folders and Files

Let's list the folders/files in the current directory using listdir:

print(os.listdir()) 

Output:

['Data', 'Population_Data', 'README.md', 'tutorial.py', 'tutorial_v2.py']

As you can see, I have two folders: Data and Population_Data. I also have three files: the Markdown file README.md and two Python files, tutorial.py and tutorial_v2.py.
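listdir returns plain names and does not say which entries are folders. A small sketch of how the two could be told apart with os.path.isdir is shown here; split_entries is a hypothetical helper, and the temporary directory only stands in for a real project folder.

```python
import os
import tempfile

def split_entries(path="."):
    """Return (folders, files) for the entries directly inside path."""
    folders, files = [], []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        if os.path.isdir(full):
            folders.append(name)
        else:
            files.append(name)
    return folders, files

# Build a throwaway directory so the example does not depend on your project layout
with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, "Data"))
    with open(os.path.join(demo_dir, "README.md"), "w") as fh:
        fh.write("demo\n")
    demo_folders, demo_files = split_entries(demo_dir)
    print(demo_folders, demo_files)
```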

In order to get the entire tree structure of my project folder, let's write a function and then use os.walk() to iterate over all the files in each folder of the current directory.

# function to list files in each folder of the current working directory
def list_files(startpath):
    for root, dirs, files in os.walk(startpath):
        # skip the .git folder by pruning it from the walk in place
        dirs[:] = [d for d in dirs if d != '.git']
        level = root.replace(startpath, '').count(os.sep)
        indent = ' ' * 4 * level
        print('{}{}/'.format(indent, os.path.basename(root)))
        subindent = ' ' * 4 * (level + 1)
        for f in files:
            print('{}{}'.format(subindent, f))

Call this function using the current working directory path, which we saw how to do earlier:

startpath = os.getcwd() 

list_files(startpath) 

Output:

Project/ 

    README.md

    tutorial.py

    tutorial_v2.py

    Data/

        uscitiesv1.4.csv

    Population_Data/

        Alabama/

            Alabama_population.csv

        Alaska/

            Alaska_population.csv

        Arizona/

            Arizona_population.csv

        Arkansas/

            Arkansas_population.csv

        California/

            California_population.csv

        Colorado/

            Colorado_population.csv

        Connecticut/

            Connecticut_population.csv

        Delaware/

            Delaware_population.csv

        ...

Note: The output has been truncated for brevity.

As seen from the output, folder names end with a / and the files within each folder are indented four spaces to the right. The Data folder has one CSV file named uscitiesv1.4.csv, which contains population data for each city in the United States. The Population_Data folder has one folder per state, each containing a separate CSV file with that state's population data, extracted from uscitiesv1.4.csv.
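The same os.walk loop can collect paths instead of printing them. Here is a rough sketch that gathers every file with a given extension; find_by_extension is a hypothetical helper, and the folder layout is recreated in a temporary directory purely for illustration.

```python
import os
import tempfile

def find_by_extension(startpath, extension):
    """Return sorted paths, relative to startpath, of files ending with extension."""
    matches = []
    for root, dirs, files in os.walk(startpath):
        for name in files:
            if name.lower().endswith(extension):
                full = os.path.join(root, name)
                matches.append(os.path.relpath(full, startpath))
    return sorted(matches)

with tempfile.TemporaryDirectory() as demo_dir:
    # Recreate a tiny piece of the project layout
    os.makedirs(os.path.join(demo_dir, "Population_Data", "Alabama"))
    sample_files = (
        "README.md",
        os.path.join("Population_Data", "Alabama", "Alabama_population.csv"),
    )
    for rel in sample_files:
        with open(os.path.join(demo_dir, rel), "w") as fh:
            fh.write("")
    csv_paths = find_by_extension(demo_dir, ".csv")
    print(csv_paths)
```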

Change Working Directory

Let's change the working directory to the folder that holds the data for the state of New York.

os.chdir('Population_Data/New York')

Now let's run the list_files method again, but in this directory.

list_files(os.getcwd()) 

Output:

New York/
    New York_population.csv

As you can see, we have entered the New York folder under the Population_Data folder.
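Because chdir changes the working directory for the whole process, it is easy to forget to switch back. One possible pattern, sketched here with a hypothetical working_directory helper, is a context manager that restores the previous directory automatically.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def working_directory(path):
    """Temporarily switch the working directory, restoring the old one afterwards."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)

before = os.getcwd()
with tempfile.TemporaryDirectory() as demo_dir:
    with working_directory(demo_dir):
        inside = os.getcwd()   # we are inside the temporary folder here
after = os.getcwd()            # back where we started
print(before == after)
```

The finally clause makes sure the original directory is restored even if the code inside the with block raises an exception.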

Create Single and Nested Directory Structure

Now, let's create a new directory called testdir in this directory.

os.mkdir('testdir')

list_files(os.getcwd()) 

Output:

New York/
    New York_population.csv
    testdir/

As you can see, it creates the new directory in the current working directory.

Let's create a nested directory with 2 levels.

os.mkdir('level1dir/level2dir')

Output:

Traceback (most recent call last):
  File "<ipython-input-12-ac5055572301>", line 1, in <module>
    os.mkdir('level1dir/level2dir')
FileNotFoundError: [WinError 3] The system cannot find the path specified: 'level1dir/level2dir'

We get an error. To be specific, we get a FileNotFoundError. You might wonder why we get a FileNotFoundError when we are trying to create a directory.

The reason: os.mkdir() creates only the last directory in the path, so it looks for an existing directory called level1dir in which to create level2dir. Since level1dir does not exist in the first place, it raises a FileNotFoundError.
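This behavior can be reproduced safely in a temporary directory. The snippet is only a sketch with illustrative variable names: it catches the exception and confirms that nothing was created.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    nested = os.path.join(demo_dir, "level1dir", "level2dir")
    try:
        os.mkdir(nested)       # the parent level1dir does not exist yet
        mkdir_failed = False
    except FileNotFoundError:
        mkdir_failed = True
    # mkdir did not create the missing parent either
    parent_exists = os.path.isdir(os.path.join(demo_dir, "level1dir"))

print(mkdir_failed, parent_exists)
```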

For purposes like this, the makedirs() function is used instead, which creates any missing intermediate directories along the path.

os.makedirs('level1dir/level2dir')

Check the current directory tree:

list_files(os.getcwd()) 

Output:

New York/
    New York_population.csv
    level1dir/
        level2dir/
    testdir/

As we can see, we now have two subdirectories under the New York folder: testdir and level1dir. level1dir has a directory underneath it called level2dir.
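One detail worth knowing is that calling makedirs on a path that already exists raises a FileExistsError unless exist_ok=True is passed. A short sketch in a temporary directory (the variable names are illustrative only):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    nested = os.path.join(demo_dir, "level1dir", "level2dir")
    os.makedirs(nested)                 # creates level1dir and level2dir
    try:
        os.makedirs(nested)             # second call: the directory already exists
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True
    os.makedirs(nested, exist_ok=True)  # no error when the directory is already there
    nested_exists = os.path.isdir(nested)

print(second_call_failed, nested_exists)
```

Passing exist_ok=True is convenient in scripts that may run more than once against the same output folder.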

Remove Single and Multiple Directories Recursively

The os module also has methods to modify or remove directories, which I'll show here.

Now, let's remove the testdir directory we just created using rmdir:

os.rmdir('testdir')

Check the current directory tree to verify that the directory no longer exists:

list_files(os.getcwd()) 

Output:

New York/
    New York_population.csv
    level1dir/
        level2dir/

As you can see, testdir has been deleted.

Let's try and delete the nested directory structure of level1dir and level2dir.

os.rmdir('level1dir')

Output:

OSError                                   Traceback (most recent call last)
in <module>()
----> 1 os.rmdir('level1dir')

OSError: [WinError 145] The directory is not empty: 'level1dir'

As seen, this raises an OSError, and rightly so. It says the level1dir directory is not empty, which is correct because it has level2dir underneath it.

With the rmdir method it is not possible to remove a non-empty directory, just like the Unix rmdir command.

Just as we used makedirs() for creation, let's try removedirs(), which removes the directory at the end of the path and then removes each parent directory named in the path, as long as it is empty.

os.removedirs('level1dir/level2dir')

Let's see the directory tree structure again:

list_files(os.getcwd()) 

Output:

New York/
    New York_population.csv

This brings us to the previous state of the directory.
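Note that removedirs only ever deletes empty directories. If a folder still contains files, one common option is shutil.rmtree from the standard library, which is not used in this tutorial's project and is shown here only as a hedged sketch. The folder and file names are made up, and everything happens inside a temporary directory.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    state_dir = os.path.join(demo_dir, "state")
    os.makedirs(os.path.join(state_dir, "level1dir", "level2dir"))
    with open(os.path.join(state_dir, "population.csv"), "w") as fh:
        fh.write("city,population\n")

    # removedirs deletes level2dir, then level1dir, and stops at state_dir
    # because state_dir still contains population.csv
    os.removedirs(os.path.join(state_dir, "level1dir", "level2dir"))
    level1_gone = not os.path.exists(os.path.join(state_dir, "level1dir"))
    state_kept = os.path.isdir(state_dir)

    # shutil.rmtree deletes a directory together with everything inside it
    shutil.rmtree(state_dir)
    state_gone = not os.path.exists(state_dir)

print(level1_gone, state_kept, state_gone)
```

Since rmtree deletes files permanently, it is worth double-checking the path before calling it.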

Example with Data Processing

So far we have explored how to view, create, and remove a nested directory structure. Now let's see an example of how the os module helps in data processing.

For that let's go one level up in the directory structure.

os.chdir('../')

With that, let's again view the directory tree structure.

list_files(os.getcwd()) 

Output:

Population_Data/ 

    Alabama/

        Alabama_population.csv

    Alaska/

        Alaska_population.csv

    Arizona/

        Arizona_population.csv

    Arkansas/

        Arkansas_population.csv

    California/

        California_population.csv

    Colorado/

        Colorado_population.csv

    Connecticut/

        Connecticut_population.csv

    Delaware/

        Delaware_population.csv

...

Note: The output has been truncated for brevity.

Let's merge the data from all of the states by iterating over each state's directory, reading its CSV file, and combining the results.

import os
import pandas as pd

# create a list to hold the data from each state
list_states = []

# iteratively loop over all the folders and add their data to the list
for root, dirs, files in os.walk(os.getcwd()):
    if files:
        list_states.append(pd.read_csv(os.path.join(root, files[0]), index_col=None))

# merge the dataframes into a single dataframe using the Pandas library
# (Population_Data itself holds no files, so every list entry is a state)
merge_data = pd.concat(list_states, sort=False)

Thanks in part to the os module we were able to create merge_data, which is a dataframe containing population data from every state.
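For readers who want to try the idea without installing pandas, here is a rough standard-library sketch of the same walk-and-merge pattern. It is not the code used above: merge_csv_rows is a hypothetical helper, and the sample cities, numbers, and column names are invented for the demonstration and may not match the real files.

```python
import csv
import os
import tempfile

def merge_csv_rows(startpath):
    """Collect rows from every CSV under startpath, tagging each with its folder name."""
    rows = []
    for root, dirs, files in os.walk(startpath):
        dirs.sort()  # visit folders in a predictable order
        for name in sorted(files):
            if not name.endswith(".csv"):
                continue
            with open(os.path.join(root, name), newline="") as fh:
                for row in csv.DictReader(fh):
                    row["state"] = os.path.basename(root)
                    rows.append(row)
    return rows

with tempfile.TemporaryDirectory() as demo_dir:
    # Hypothetical sample data; the real files and their columns may differ
    samples = {"Alabama": ("Birmingham", "100"), "Alaska": ("Anchorage", "200")}
    for state, (city, population) in samples.items():
        os.mkdir(os.path.join(demo_dir, state))
        path = os.path.join(demo_dir, state, state + "_population.csv")
        with open(path, "w", newline="") as fh:
            writer = csv.writer(fh)
            writer.writerow(["city", "population"])
            writer.writerow([city, population])
    merged_rows = merge_csv_rows(demo_dir)

print(len(merged_rows))
```

Tagging each row with the folder it came from keeps the state identifiable after the merge, which a plain concatenation would lose if the files do not carry a state column.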

Conclusion

In this article, we briefly explored different capabilities of Python's built-in os module. We also saw a brief example of how the module can be used in the world of Data Science and Analytics. It is important to understand that os has a lot more to offer, and that much more complex logic can be built on top of it, depending on the developer's needs.

Link to this article: https://www.haomeiwen.com/subject/fagkaqtx.html