Locating Files by Keyword Search in Python, Part 02

Author: 橙子丨Sunty | Published: 2019-02-16 23:03

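    The previous article showed how to use terminal commands to find the files that contain a keyword, along with the paragraphs in which it appears. A terminal, however, is not what most users are used to, so this article uses the Django framework to quickly build a website that provides the same feature.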



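    Django is an open-source web application framework written in Python. It is built on the MVC pattern, but because the framework itself handles the controller part that receives user input, Django focuses on the Model, the Template, and the View, which is known as the MTV pattern. Their responsibilities are as follows: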

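    | Layer | Responsibility |
    | --- | --- |
    | Model (data access layer) | Handles everything related to data: how to access it, how to validate it, what behaviors it has, and how pieces of data relate to each other. |
    | Template (presentation layer) | Handles presentation decisions: how content is displayed in a page or another type of document. |
    | View (business logic layer) | Holds the logic that accesses the model and selects the appropriate template. It is the bridge between model and template. |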

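    To learn more about Django, see the Django documentation; it is not covered in detail here. Because this project does not use a database, no Model is needed. The work is mainly in the View, the Template, and the URL routing configuration (urls).
    1. Write the view code
    Here I created a project named myword and an app named findwords. The directory structure is as follows:

    [Figure: project directory structure]

    Following the same approach as the previous article, the logic for reading file contents and searching the files goes into views.py: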
    #-*- coding: UTF-8 -*-
    from django.shortcuts import render
    from django.template.loader import get_template
    from django.http import HttpResponse, JsonResponse
    from django.utils import timezone
    from django.views.decorators.csrf import csrf_exempt
    from docx import Document  # provided by the python-docx package
    import os,sys,datetime
    import logging
    import json
    
    logging = logging.getLogger('reso_logger')
    # Create your views here.
    
    # Work around datetime and date values not being JSON-serializable
    class DateEncoder(json.JSONEncoder):  
        def default(self, obj):  
            if isinstance(obj, datetime.datetime):  
                return obj.strftime('%Y-%m-%d %H:%M:%S')
            elif isinstance(obj, datetime.date):
                return obj.strftime("%Y-%m-%d")  
            else:  
                return json.JSONEncoder.default(self, obj) 
    
    def index(request):
        return HttpResponse("Hello, world. You're at the findwords index.")
    
    def search_string(filename, string):
        # Open the document
        document = Document(filename)
        # document = Document(r'C:\Users\Cheng\Desktop\kword\words\wind.docx')
        print(filename)
        # Read the text of every paragraph
        l = [paragraph.text for paragraph in document.paragraphs]
        # l = [paragraph.text.encode('gb2312') for paragraph in document.paragraphs]
        # Collect the matching paragraphs; the text could also be processed in other ways here
        fileword = []
        for i in l:
            i = i.strip()
            # print i
            if string in i:
                # Last modification time of the file, formatted for display
                changetime = datetime.datetime.fromtimestamp(os.path.getmtime(filename)).strftime('%Y-%m-%d %H:%M:%S')
                logging.info(changetime)
                # print filename, i
                fword = filename + ">>>>>" + changetime + ">>>>>" + i
                fileword.append(fword)
        # logging.info(fileword)
        return fileword
    
    # Walk every file under the directory and return a list of full paths (directory + file name)
    def get_process_files(root_dir):
        """process all files in directory"""
        cur_dir = os.path.abspath(root_dir)
        file_list = []
        for file in os.listdir(cur_dir):
            # Python 2 on a Chinese-locale Windows system: names are GBK byte strings.
            # On Python 3, os.listdir() already returns str and this decode must be dropped.
            u_file = file.decode('gbk')
            file_list.append(u_file)
        logging.info(file_list)
        process_list = []
        for file in file_list:
            fullfile = os.path.join(cur_dir, file)
            if os.path.isfile(fullfile):
                process_list.append(fullfile)
            elif os.path.isdir(fullfile):
                # Recurse into subdirectories
                process_list.extend(get_process_files(fullfile))
        # print process_list
        return process_list
    
    def count_files(root_dir,string):
        process_list=get_process_files(root_dir)
        logging.info(process_list)
        f_result = []
        for files in process_list:
            f_result.append(search_string(files, string))
        return f_result
    
    def findwords(request):
        return render(request, 'findwords/search.html')
    
    def searchwords(request):
        return render(request, 'findwords/findwords.html', locals())
    
    @csrf_exempt
    def searesult(request):
        if request.method == 'POST':
            try:
                word = request.POST.get('keyword')
                logging.info(word)
                root_dir = "..\\words"  # directory to search
                string = word  # keyword to search for
                f_result = count_files(root_dir, string)
                logging.info(f_result)
                f_result_list = []
                for result in f_result:
                    for res in result:
                        # maxsplit=2 keeps a paragraph that itself contains '>>>>>' in one piece
                        result_list = res.split('>>>>>', 2)
                        result_dict = {"filename": result_list[0], "checktime": result_list[1], "contents": result_list[2]}
                        # logging.info(result_dict)
                        f_result_list.append(result_dict)
                logging.info(f_result_list)
                json_data = json.dumps(f_result_list, cls=DateEncoder, ensure_ascii=False)
                return HttpResponse(json_data, content_type="application/json")
            except Exception:
                # Log the failure instead of silently swallowing it, then fall back to the page
                logging.exception('keyword search failed')
                return render(request, 'findwords/findwords.html', locals())
        else:
            return render(request, 'findwords/findwords.html', locals())
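
The view above joins each hit into a `>>>>>`-separated string and splits it again before serializing. As a hedged alternative sketch (not the author's code, and written for Python 3), the matches could be collected as dictionaries that hold a real `datetime`, which is the case `DateEncoder` is designed for. The helper name `find_matches` and the sample data are hypothetical; the paragraph list stands in for what python-docx would return.

```python
import datetime
import json


class DateEncoder(json.JSONEncoder):
    """Serialize datetime/date values as formatted strings."""

    def default(self, obj):
        # datetime must be checked first because it is a subclass of date
        if isinstance(obj, datetime.datetime):
            return obj.strftime('%Y-%m-%d %H:%M:%S')
        if isinstance(obj, datetime.date):
            return obj.strftime('%Y-%m-%d')
        return json.JSONEncoder.default(self, obj)


def find_matches(filename, modified, paragraphs, keyword):
    """Hypothetical helper: return one dict per paragraph containing keyword."""
    matches = []
    for text in paragraphs:
        text = text.strip()
        if keyword in text:
            matches.append({
                'filename': filename,
                'checktime': modified,  # kept as a datetime until serialization
                'contents': text,
            })
    return matches


# Made-up sample data standing in for the paragraphs of a .docx file
sample = ['  alpha beta  ', 'gamma', 'beta >>>>> delta']
hits = find_matches('demo.docx', datetime.datetime(2019, 2, 16, 23, 3), sample, 'beta')
payload = json.dumps(hits, cls=DateEncoder, ensure_ascii=False)
print(payload)
```

Because no separator is involved, a paragraph that happens to contain `>>>>>` passes through unchanged.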
    

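    Here root_dir="..\\words" tells the view to read the .docx files in the words folder under the project root; we can create a new file there for testing. Note that a relative path like this is resolved against the working directory of the running process.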
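
As a sketch of the directory scan, the standard library's `os.walk` could replace the manual recursion and restrict the results to `.docx` files. The function name `list_docx_files` is hypothetical, the code assumes Python 3, and skipping the `~$` lock files that Word creates while a document is open is an added assumption, not something the original code does.

```python
import os
import tempfile


def list_docx_files(root_dir):
    """Hypothetical helper: return sorted full paths of .docx files under root_dir."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(os.path.abspath(root_dir)):
        for name in filenames:
            # Skip Word's temporary lock files such as "~$report.docx"
            if name.lower().endswith('.docx') and not name.startswith('~$'):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


# Small demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'sub'))
    for rel in ('a.docx', 'notes.txt', os.path.join('sub', 'b.DOCX'), '~$a.docx'):
        open(os.path.join(tmp, rel), 'w').close()
    names = [os.path.relpath(p, tmp) for p in list_docx_files(tmp)]
    print(names)
```

Filtering by extension also keeps non-Word files away from `Document()`, which would otherwise raise an error on them.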
    2. Write the routes
    Write the main urls.py in the project root:

    from django.conf.urls import url,include
    from django.contrib import admin
    
    urlpatterns = [
        url(r'^admin/', admin.site.urls),
        url(r'^find/', include('findwords.urls')),
    ]
    

    Write the app's urls.py:

    from django.conf.urls import url
    from . import views
    
    urlpatterns = [
        url(r'^$', views.index, name='index'),
        url(r'^findwords$', views.findwords, name='findwords'),
        url(r'^searchwords$', views.searchwords, name='searchwords'),
        url(r'^searesult$', views.searesult, name='searesult'),
    ]
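
To illustrate how the two urls.py files combine, here is a simplified, standard-library-only sketch of the matching (it is not Django's actual resolver): the project-level pattern consumes the `find/` prefix, and the remainder is matched against the app-level patterns. The `resolve` function is hypothetical; the patterns and view names are copied from the routes above.

```python
import re

# Patterns copied from the project and app urls.py; view names stand in for the functions
PROJECT_PREFIX = r'^find/'
APP_PATTERNS = [
    (r'^$', 'index'),
    (r'^findwords$', 'findwords'),
    (r'^searchwords$', 'searchwords'),
    (r'^searesult$', 'searesult'),
]


def resolve(path):
    """Hypothetical, simplified resolver: return the view name or None."""
    # Patterns are matched against the path without its leading slash
    if path.startswith('/'):
        path = path[1:]
    prefix = re.search(PROJECT_PREFIX, path)
    if prefix is None:
        return None
    rest = path[prefix.end():]  # what is left after the include() prefix
    for pattern, view_name in APP_PATTERNS:
        if re.search(pattern, rest):
            return view_name
    return None


for url in ('/find/', '/find/findwords', '/find/searesult', '/find/result'):
    print(url, '->', resolve(url))
```

Under these patterns `/find/result` matches nothing, so client-side requests for search results need to target `/find/searesult`.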
    

    3. Write the HTML files
    Write the search page search.html:

    <!doctype html> {%load staticfiles%}
    <html lang="zh">
    
    <head>
        <meta charset="UTF-8">
        <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
        <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
        <title>Find World</title>
        <link rel="stylesheet" type="text/css" href="{% static 'css/default.css'%}">
        <link rel="stylesheet" type="text/css" href="{% static 'css/search-form.css'%}">
        <script type="text/javascript" src="{%static 'js/jquery-3.2.1.min.js'%}"></script>
    </head>
    <style type="text/css">
    .comments {
        width: 95%;
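        /* adapt to the width of the parent layout */
        height: 400px;
        overflow: auto;
        word-break: break-all;
        /* Fixes line breaking in IE: in IE8, when the width is 100% and the textarea content
    exceeds one line, double-clicking the text collapses it into a single line, so IE's proprietary "word-break" or "word-wrap" property is needed to control wrapping. */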
    }
    </style>
    
    <body>
        <form onsubmit="submitFn(this, event);">
            <div class="search-wrapper">
                <div class="input-holder">
                    <input type="text" class="search-input" placeholder="Search words !" />
                    <button class="search-icon" onclick="searchToggle(this, event);"><span></span></button>
                </div>
                <span class="close" onclick="searchToggle(this, event);"></span>
                <div class="result-container">
                    <span></span>
                </div>
                <div id='s_result' class="comments">
                </div>
            </div>
        </form>
        <div style="text-align:center;margin:50px 0; font:normal 14px/24px 'MicroSoft YaHei';">
        </div>
    </body>
    <script type="text/javascript">
    function searchToggle(obj, evt) {
        var container = $(obj).closest('.search-wrapper');
    
        if (!container.hasClass('active')) {
            container.addClass('active');
            evt.preventDefault();
        } else if (container.hasClass('active') && $(obj).closest('.input-holder').length == 0) {
            container.removeClass('active');
            // clear input
            container.find('.search-input').val('');
            // clear and hide result container when we press close
            container.find('.result-container').fadeOut(100, function() {
                $(this).empty();
            });
        }
    }
    
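    function submitFn(obj, evt) {
        var word = $(obj).find('.search-input').val().trim();
        // The searesult view is routed at /find/searesult and reads the POST field "keyword"
        var url = '/find/searesult';
        console.log(word);
        $.post(
            url, {
                keyword: word
            },
            function(data) {
                // data is the JSON array returned by the view; render one line per match
                var box = $('#s_result').empty();
                $.each(data, function(index, item) {
                    $('<p>').text(item.filename + ' | ' + item.checktime + ' | ' + item.contents).appendTo(box);
                });
            },
            'json'
        );
        evt.preventDefault();
    }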
    </script>
    
    </html>
    

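    Here I used the Layui framework to keep the interface clean and attractive. For testing, two paragraphs are displayed by default. The result looks like this:

    [Figure: screenshot of the search page]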



    Write the search results page findwords.html:

    <!DOCTYPE html> {% load staticfiles %}
    <html>
    
    <head>
        <meta charset="utf-8">
        <title>FindWorld</title>
        <meta name="renderer" content="webkit">
        <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
        <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
        <meta name="apple-mobile-web-app-status-bar-style" content="black">
        <meta name="apple-mobile-web-app-capable" content="yes">
        <meta name="format-detection" content="telephone=no">
        <link rel="stylesheet" type="text/css" href="{% static 'layui/css/layui.css' %}" media="all" />
        <link rel="stylesheet" href="{% static 'css/public.css' %}" media="all" />
    </head>
    
    <body class="childrenBody">
        <form class="layui-form">
            <br>
            <blockquote class="layui-elem-quote quoteBox">
                <div class="layui-row">
                    <div class="layui-col-md3">
                        <div class="grid-demo grid-demo-bg1">&nbsp;</div>
                    </div>
                    <div class="layui-col-md6">
                        <form class="layui-form">
                            <div class="layui-row grid-demo grid-demo-bg1">
                                <div class="layui-col-md9">
                                    <div class="layui-block">
                                        <input type="text" name="keyword" id="keyword" lay-verify="title" autocomplete="off" placeholder="Enter a keyword to search for" class="layui-input searchVal">
                                    </div>
                                </div>
                                <div class="layui-col-md3">
                                    <!-- <a class="layui-btn search_btn" type="button" data-type="reload" lay-submit="" id="search_btn" lay-filter="search_btn">Search</a> -->
                                    <button class="layui-btn" lay-submit lay-filter="search_btn" id="search_btn">Search</button>
                                    <a class="layui-btn layui-btn-normal dir_btn" type="button" >Directory</a>
                                </div>
                            </div>
                        </form>
                    </div>
                    <div class="layui-col-md3">
                        <div class="grid-demo">&nbsp;</div>
                    </div>
                </div>
            </blockquote>
            <hr>
            <table id="userList" lay-filter="userList"></table>
            <!-- Actions -->
            <script type="text/html" id="userListBar">
                <a class="layui-btn layui-btn-xs" lay-event="look" type="button" >Details</a>
                <a class="layui-btn layui-btn-xs layui-btn-danger" lay-event="del" type="button" >Delete</a>
            </script>
        </form>
        <div id="lookResult" style="padding: 40px; line-height: 15px; font-weight: 300; display:none; ">
            <form class="layui-form">
                <div class="layui-form-item layui-row layui-col-xs12">
                    <label class="layui-form-label">File name:</label>
                    <div class="layui-input-block">
                        <input type="text" class="layui-input" id="logsort" placeholder="Enter operation type" readonly />
                    </div>
                </div>
                <div class="layui-form-item layui-row layui-col-xs12">
                    <label class="layui-form-label">Modified:</label>
                    <div class="layui-input-block">
                        <input type="text" class="layui-input" id="lookedate" readonly />
                    </div>
                </div>
                <div class="layui-form-item layui-row layui-col-xs12">
                    <label class="layui-form-label">Result:</label>
                    <div class="layui-input-block">
                        <textarea placeholder="Enter log content" style="min-height: 170px;" class="layui-textarea userDesc" name="pcloudcontent" id="pcloudcontent" readonly></textarea>
                    </div>
                </div>
            </form>
        </div>
        <script type="text/javascript" src="{% static 'layui/layui.js' %}"></script>
        <script type="text/javascript" src="{% static 'js/userList.js' %}"></script>
        <script type="text/javascript" src="{% static 'js/jquery-3.2.1.min.js' %}"></script>
        <script type="text/javascript">
        // Fall back to an empty array when the view passes no json_data to the template
        var json_data = {{ json_data|default:"[]"|safe }};
        console.log(json_data);
        </script>
    </body>
    
    </html>
    

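    4. Test the result
    I created a test file named "倚天屠龙记.docx" in the words folder. Searching for the two keywords '张无忌' and '张三丰' gives the following results:

    [Figure: screenshot of the search results]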



    The search is almost instantaneous, and clicking the details button shows the whole paragraph.

    After extensive testing, the keyword search works across twenty to thirty files; further optimization is still needed.
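
One possible direction for that optimization, offered only as a sketch: cache each file's paragraphs keyed by its modification time, so that repeated searches do not re-parse unchanged files. The class name `ParagraphCache` and the `load` callback are hypothetical. In the real project `load` would wrap python-docx, while the demonstration below uses a plain-text reader so that it runs with the Python 3 standard library alone.

```python
import os
import tempfile


class ParagraphCache:
    """Hypothetical cache: re-read a file only when its modification time changes."""

    def __init__(self, load):
        self._load = load      # callable: path -> list of paragraph strings
        self._entries = {}     # path -> (mtime, paragraphs)
        self.loads = 0         # number of real file reads, for illustration

    def paragraphs(self, path):
        mtime = os.path.getmtime(path)
        cached = self._entries.get(path)
        if cached is None or cached[0] != mtime:
            self.loads += 1
            cached = (mtime, self._load(path))
            self._entries[path] = cached
        return cached[1]

    def search(self, path, keyword):
        return [p for p in self.paragraphs(path) if keyword in p]


def load_text(path):
    """Stand-in loader for the demo: one paragraph per non-empty line."""
    with open(path, encoding='utf-8') as handle:
        return [line.strip() for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('first paragraph\nsecond keyword paragraph\n')
    cache = ParagraphCache(load_text)
    first = cache.search(path, 'keyword')
    second = cache.search(path, 'paragraph')
    print(first, second, cache.loads)
```

The second search reuses the cached paragraphs, so the file is read only once.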

    If you enjoyed this article, please follow and like it. I will keep sharing Python study notes.

    Article link: https://www.haomeiwen.com/subject/axtweqtx.html