Building a File Tree: Printing the Current Directory's File Tree with Descriptions in Python

Author: 一车小面包人 | Published 2023-09-06 15:29

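Background: print the file structure under the current directory as a tree, and append a description after each file.

When I first got this request, I thought the bash command tree would be enough, like this: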

cd ~/04.workflow/08.scRNA_yanyt/03.reports_out/src  # go to the directory whose structure should be shown
tree results/  # display the file structure
tree results > test.txt  # save the file structure to a file

tree.png

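Later, 小胖 opened a file-listing page that someone else had built and told me that every file also needs a description after it, so some code was probably required. Here is my implementation in Python: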

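import os
import re

import pandas as pd


def dfs_showdir(path, depth, annoText):
    """Print the tree under `path`, appending each entry's description from `annoText`."""
    if depth == 0:
        print("|--" + path)
    for item in os.listdir(path):
        # Skip version-control, IDE, and cache directories (exact name match).
        if item in ['.git', '.idea', '__pycache__']:
            continue
        # Regex again: strip the digits from the name, because 小胖's folders
        # hold many near-identical files such as fplot1.png, fplot2.png, ...
        pattern_item = re.sub("[0-9]", "", item)
        key = pattern_item.split(".")[0]
        # Look up the description; fall back to a placeholder when the key is missing.
        matches = annoText[annoText["files"] == key]["description"].tolist()
        description = matches[0] if matches else "***nodescription"
        # Print the entry, indented according to its depth.
        print((" " * (depth + 2)) + "|--" + item + " " * 4 + description)
        new_item = os.path.join(path, item)
        # Recurse into subdirectories, passing the description table along.
        if os.path.isdir(new_item):
            dfs_showdir(new_item, depth + 1, annoText)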


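if __name__ == '__main__':
    # Build the description table for the first directory.
    annoText_1 = pd.DataFrame()
    annoText_1["files"] = ["Feature_ber", "hist", "pearplot", "Variable-ex", "cycleplot"]
    annoText_1["description"] = ["Gene count vs. sequencing depth correlation file", "Sequencing depth distribution file", "Quality control information file", "Highly variable gene visualization file", "Cell cycle file"]
    # Description table for the second directory.
    annoText_2 = pd.DataFrame()
    annoText_2["files"] = ["umap", "group", "person", "elbowplot", "jackstrawplot", "type_heatmap", "cell_type", "fplot", "vln", "celltype_singleR", "Rplots", "clusterplot", "allmarker"]
    annoText_2["description"] = ["UMAP clustering visualization file", "UMAP clustering visualization file", "UMAP clustering visualization file", "PCA principal component visualization file", "PCA dimensionality reduction file", "Cell similarity file", "Cell type annotation visualization file", "Cluster differential gene expression visualization (UMAP) file", "Cluster differential gene expression visualization (violin plot) file", "SingleR cell type annotation vs. cluster lookup table file", "***", "***", "Differentially expressed gene file"]
    # Description table for the third directory.
    annoText_3 = pd.DataFrame()
    annoText_3["files"] = ["sig_dge_all"]
    annoText_3["description"] = ["***"]
    my_path = "./03.reports_out/src/results/"
    print("The pipeline results are in {}, which contains {}.".format(my_path, os.listdir(my_path)))
    print("root:[" + my_path + "]")
    # Print each result directory with its own description table.
    for name, table in [("01.Data_filter", annoText_1),
                        ("02.cell_cluster", annoText_2),
                        ("03.DEG_enrichment", annoText_3)]:
        dfs_showdir(my_path + name, 0, table)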
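
The description lookup in dfs_showdir can also be expressed with a plain dictionary. What follows is only a minimal sketch, not the script used in this post: `description_key`, `describe`, and the sample `descriptions` mapping are hypothetical names introduced for illustration, and the fallback text `***nodescription` is borrowed from the HTML version later in the post.

```python
import re


def description_key(filename):
    """Remove all digits, then keep the part before the first dot."""
    return re.sub(r"[0-9]", "", filename).split(".")[0]


def describe(filename, descriptions, default="***nodescription"):
    """Return the description for `filename`, or `default` if the key is unknown."""
    return descriptions.get(description_key(filename), default)


# Hypothetical sample data: two keys taken from the tables in this post.
descriptions = {
    "fplot": "Cluster differential gene expression visualization (UMAP) file",
    "hist": "Sequencing depth distribution file",
}

for name in ["fplot1.png", "fplot2.png", "hist.png", "unknown.txt"]:
    print(name + " " * 4 + describe(name, descriptions))
```

Note that under this rule a name such as 01.Data_filter yields an empty key, because the digits are removed before the name is split at the first dot.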

Here is the result:

tree.png

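Then the requirements grew: besides describing each file, the output also needs different colors. I first tried coloring the output of Python's built-in print with escape codes, used like this: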

print("\33[31m"+"this is a test"+"\33[0m")  # 31 is red
print("\33[33m"+"this is a test"+"\33[0m")  # 33 is yellow
print("\33[34m"+"this is a test"+"\33[0m")  # 34 is blue
print("\33[32m"+"this is a test"+"\33[0m")  # 32 is green
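
The four calls above differ only in the color number, so they can be wrapped in a small helper. This is a sketch under my own naming: `ANSI_COLORS` and `colorize` are hypothetical and not part of Python or any library. It writes the escape character as `\033`, which is the same character as the `\33` used above.

```python
# Foreground color codes taken from the examples above.
ANSI_COLORS = {"red": 31, "green": 32, "yellow": 33, "blue": 34}


def colorize(text, color):
    """Wrap `text` in the escape sequence for `color`, then reset to the default."""
    code = ANSI_COLORS[color]
    return "\033[{}m{}\033[0m".format(code, text)


for color in ANSI_COLORS:
    print(colorize("this is a test", color))
```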

test.png
But this only works for output printed to the terminal; it cannot be saved to a file. Running python test.py directly in the Linux bash shell does change the color:

bash_python.png
But if the output is saved to a file with python test.py > test.txt, the colors no longer show:

txt_python.png
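At this point the only option seemed to be method two: install a dedicated module with pip install colorama

from colorama import Fore, Back, Style
print(Fore.RED + "this is a test" + Style.RESET_ALL)  # Style.RESET_ALL clears the settings and restores the default color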

colorama.png
The available parameters are:

参数.png
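Then it finally clicked: this has nothing to do with Python. Whether text shows up in color is controlled by the Linux terminal. For example, if I type this directly into the terminal:

echo -e "\e[34mThe pipeline results are in ./03.reports_out/src/results/, which contains ['04.Trajectory', '03.DEG_enrichment', '01.Data_filter', 'sce.rds', '02.cell_cluster'].\e[0m"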

linux.png
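the text color changes right away. Python merely inserts into the string the specific character sequences, such as \e[34m, that the Linux terminal interprets as colors. So why does the terminal stop recognizing them once the output is written to a txt file? It depends on the command used to view the file. With less test.txt, the color codes in the text are not interpreted: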

less.png
whereas with cat test.txt they are recognized again:

cat.png
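So the question is: how can less be made to recognize the colors too? Is there some way to trick the Linux terminal into thinking that less is writing to standard output the way cat does? I could not figure it out. Here is a link to someone else's answer, which I did not dare to try:
怎样把Linux命令行带颜色的输出保存到文件？ - 知乎 (zhihu.com)
I felt pretty helpless...
Refusing to believe that this requirement could not be met, I used Python to build an HTML page to display the tree: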
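
One way to check the explanation above is to confirm that the escape sequences really are stored in the saved file, and that only the viewer decides whether to render them. The sketch below is an illustration under my own assumptions: it writes to a temporary file instead of test.txt, and `ANSI_SGR` and `strip_ansi` are hypothetical names. It also shows how to strip the sequences when a plain-text copy is wanted. As far as I know, less -R passes such color sequences through to the terminal, which would be the simpler route for viewing.

```python
import os
import re
import tempfile

RED = "\033[31m"
RESET = "\033[0m"

# Matches color sequences of the form ESC [ numbers m, e.g. ESC[31m or ESC[0m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text):
    """Remove color escape sequences, leaving only the visible text."""
    return ANSI_SGR.sub("", text)


colored = RED + "this is a test" + RESET

# Write the colored string to a file and read it back unchanged.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(colored + "\n")
    with open(path, encoding="utf-8") as f:
        saved = f.read()

has_escape = "\x1b[31m" in saved  # the color code survived the round trip
plain = strip_ansi(saved)

print(repr(saved))
print(plain, end="")
```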

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import re

import dominate
import pandas as pd
from dominate.tags import div, li, link, meta, script, span, ul


def dfs_showdir(path, depth, annoText):
    """Add an <li> for `path` with a nested <ul> listing its direct children."""
    # Only depth 0 is rendered; subdirectories are listed but not expanded.
    if depth == 0:
        with li():
            span(path, cls="folder", style="color:blue")
            with ul():
                for item in os.listdir(path):
                    if item in ['.git', '.idea', '__pycache__']:
                        continue
                    # Strip digits so that numbered files share one description key.
                    key = re.sub("[0-9]", "", item).split(".")[0]
                    prefix = (" " * (depth + 2)) + "|--" + item + " " * 4
                    if key in annoText["files"].tolist():
                        description = annoText[annoText["files"] == key]["description"].tolist()[0]
                        with li():
                            span(prefix + description, cls="file", style="color:red")
                    else:
                        # No description found: show a placeholder in yellow.
                        with li():
                            span(prefix + "***nodescription", cls="file", style="color:yellow")
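def main():
    # Build the description table for the first directory.
    annoText_1 = pd.DataFrame()
    annoText_1["files"] = ["Feature_ber", "hist", "pearplot", "Variable-ex", "cycleplot"]
    annoText_1["description"] = ["Gene count vs. sequencing depth correlation file", "Sequencing depth distribution file", "Quality control information file", "Highly variable gene visualization file", "Cell cycle file"]
    # Description table for the second directory.
    annoText_2 = pd.DataFrame()
    annoText_2["files"] = ["umap", "group", "person", "elbowplot", "jackstrawplot", "type_heatmap", "cell_type",
                           "fplot", "vln", "celltype_singleR", "Rplots", "clusterplot", "allmarker"]
    annoText_2["description"] = ["UMAP clustering visualization file", "UMAP clustering visualization file", "UMAP clustering visualization file", "PCA principal component visualization file", "PCA dimensionality reduction file", "Cell similarity file",
                                 "Cell type annotation visualization file", "Subpopulation-specific highly expressed gene plot (UMAP) file", "Subpopulation-specific highly expressed gene plot (violin plot) file", "SingleR cell type annotation vs. cluster lookup table file", "***", "***", "Differentially expressed gene file"]
    # Description table for the third directory.
    annoText_3 = pd.DataFrame()
    annoText_3["files"] = ["sig_dge_all"]
    annoText_3["description"] = ["***"]
    # Description table for the fourth directory.
    annoText_4 = pd.DataFrame()
    annoText_4["files"] = ["test"]
    annoText_4["description"] = ["***"]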

    html_root = dominate.document(lang="en", doctype="<!DOCTYPE html>", title="this is a test")
    with html_root.head:
        # Declare the encoding so that non-ASCII file names display correctly.
        meta(charset="utf-8")
        meta(name="viewport",content="width=device-width, initial-scale=1.0")
        link(rel="stylesheet",href="css/jquery.treeview.css")
        script(src="js/jquery.min.js")
        script(src="js/jquery.treeview.js",type="text/javascript")
        script(type="text/javascript",src="js/myjs1.js")
    my_path="./src/results/"
    with html_root.body:
        with div(id="main"):
            with ul(id="treeview",cls="filetree"):
                # Render each result directory with its own description table.
                for i in ["01.Data_filter", "02.cell_cluster", "03.DEG_enrichment", "04.Trajectory"]:
                    the_path = my_path + i
                    if i == "01.Data_filter":
                        dfs_showdir(the_path, 0, annoText_1)
                    if i == "02.cell_cluster":
                        dfs_showdir(the_path, 0, annoText_2)
                    if i == "03.DEG_enrichment":
                        dfs_showdir(the_path, 0, annoText_3)
                    if i == "04.Trajectory":
                        dfs_showdir(the_path, 0, annoText_4)
    # Write the page as UTF-8 to match the declared charset.
    with open('E:/***/01.资料/05.html_test/05.files_test/src/test.html','w',encoding='utf-8') as f:
        f.write(html_root.render())
if __name__ =='__main__':
    main()
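
The function above renders only the first level of each directory. If nested folders also need to appear in the page, the same idea can be written recursively. The sketch below is not the dominate-based script from this post: it swaps in plain string building from the standard library so it runs without extra packages, and `tree_to_html` plus the demo directory layout are hypothetical. It reuses the "folder" and "file" class names from the script above.

```python
import html
import os
import tempfile


def tree_to_html(path, describe, skip=(".git", ".idea", "__pycache__")):
    """Return one <li> for `path`; directories get a nested <ul> of their children."""
    name = os.path.basename(os.path.normpath(path))
    if os.path.isdir(path):
        children = "".join(
            tree_to_html(os.path.join(path, item), describe, skip)
            for item in sorted(os.listdir(path))
            if item not in skip
        )
        return '<li><span class="folder">{}</span><ul>{}</ul></li>'.format(
            html.escape(name), children)
    # Files get their name followed by the description.
    label = "{}    {}".format(name, describe(name))
    return '<li><span class="file">{}</span></li>'.format(html.escape(label))


# Demo on a throwaway directory: results/sub/fplot2.png plus a skipped cache folder.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "results", "sub")
    os.makedirs(nested)
    os.makedirs(os.path.join(root, "results", "__pycache__"))
    open(os.path.join(nested, "fplot2.png"), "w").close()
    markup = "<ul>" + tree_to_html(
        os.path.join(root, "results"), lambda name: "demo description") + "</ul>"

print(markup)
```

The `describe` argument is any function that maps a file name to its description, so a lookup against the tables built in main() could be passed in instead of the demo lambda.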

Open the generated page and take a look:

html.png

And look, clicking the icon even collapses the tree:

html.png

Summary: file names should follow a convention, typically letters_digits.extension. Building good habits matters.
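
With such a convention, the description key can be read directly from the name instead of stripping every digit. A minimal sketch, assuming the pattern means one run of letters, an underscore, one run of digits, and an extension; `NAME_PATTERN` and `parse_name` are hypothetical names of my own.

```python
import re

# letters, underscore, digits, dot, extension (e.g. fplot_1.png)
NAME_PATTERN = re.compile(r"^(?P<stem>[A-Za-z]+)_(?P<index>[0-9]+)\.(?P<ext>[A-Za-z0-9]+)$")


def parse_name(filename):
    """Return (stem, index, extension), or None if the name breaks the convention."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("stem"), int(match.group("index")), match.group("ext")


for name in ["fplot_1.png", "vln_12.pdf", "fplot1.png"]:
    print(name, parse_name(name))
```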
