Optimizing Oracle JDBC Metadata Retrieval

Author: hemny | Published 2018-11-09 15:51

1. Background

Business requirements called for a Java program that automatically collects the metadata of an Oracle database (tables, columns, indexes, constraints, and so on).
In some customer environments, collection was extremely slow: as few as 6 tables per minute even with 4 threads working in parallel, which works out to roughly 30-40 seconds of single-thread time per table. At that rate, a customer database with 200,000 tables would need about 200,000 tables × 32 s per table per thread ÷ 4 threads = 1,600,000 s ≈ 444.4 hours ≈ 18.5 days to scan completely.
Spending 18 days just to collect the metadata of a 200,000-table database would hold up other business and work.

2. Cause Analysis

Because the customer could not accept such slow processing, they helped us investigate and captured an Oracle AWR report covering a metadata-scan window so we could analyze and fix the problem.

A complete AWR report contains most of the processing information of an Oracle database.



Looking at the SQL statistics — SQL Statistics



Looking at SQL execution time — SQL ordered by Elapsed Time



The most expensive SQL statement shows up here: it was executed 450 times within one hour, averaging about 29 seconds per execution.
The full statement:


Formatted for readability, the statement is:

    SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
             DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
             DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
             DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, 
             DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, 
             DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, 
             DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN 
    FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C 
    WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/'  AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
    ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
    

SQL performance characteristics:

• An outer join between two dictionary views
• A correlated subquery
• LIKE (fuzzy) matching
• Heavy use of Oracle's DECODE function
• A sort on three columns

Where the SQL comes from
The statement clearly fetches the column information of one table, but it is not SQL we wrote in our code.
Going back to the program, here is the code that retrieves Oracle column information:

    ...
        /**
         * Fetch extended column information.
         * 
         * @param conn
         *            database connection
         * @param catalog
         *            catalog (database) name
         * @param schema
         *            schema (user) name
         * @param tabname
         *            table name
         * @return map of extended column information
         */
        protected Map<String, Map> getColumnExt(Connection conn, String catalog, String schema, String tabname) {
            Map<String, Map> m_ext = new HashMap<String, Map>();
            String sql = "select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used"
                    + " from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=?";
            String colname, coltype, virtualCol, charUsed;
            long charLength, dataLength;
            boolean isPrecNull, isScaleNull;
            try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
                pstmt.setFetchSize(FETCH_SIZE);
                pstmt.setString(1, schema);
                pstmt.setString(2, tabname);
                try (ResultSet rs = pstmt.executeQuery()) {
                    while (rs.next()) {
                        ...
                    }
                }
            } catch (SQLException ex) {
                ...
                throw new RuntimeException(ex);
            }
            return m_ext;
        }
    
        // Fetch all columns of a table together with their definition information
        protected List<Document> getColumns(Connection conn, DatabaseMetaData metadata, String catalog, String schemaname, String tabname, String tabid, String dbcharset) {
            ...
            ResultSet rs = null;
            try {
                ...
                Map<String, Map> m_ext = this.getColumnExt(conn, catalog, schemaname, tabname);
    
                ...
                rs = metadata.getColumns(catalog, schema_tmp, table_tmp, null);
                rs.setFetchSize(1);
                ...
                while (rs.next()) {
                    ...
                }
            } catch (SQLException e) {
                ...
                throw new RuntimeException(e);
            } finally {
            if (rs != null) { // close the result set
                    try {
                        rs.close();
                    } catch (SQLException e) {
                        logger.error("Resultset close failed with " + e.getMessage(), e);
                    }
                }
            }
            return cols;
        }
    ...
    

The most time-consuming SQL seen in the AWR report is issued by the call below; the statement itself is built inside the Oracle JDBC driver.

    rs = metadata.getColumns(catalog, schema_tmp, table_tmp, null);
    

So besides that driver-internal SQL, fetching one table's information also executes the following query:

    select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=?
    

Summary
Fetching the column information of a single table takes two database queries, and one of them is extremely inefficient, averaging close to 30 seconds.

3. Optimization Process

Sub-goals:

• Eliminate LIKE matching: an Oracle LIKE predicate generally means a full scan plus pattern matching, which is very expensive
• Eliminate the subquery: it is inefficient
• Eliminate the DECODE calls: fewer function calls means less work for the database
• Reduce sorting inside the Oracle database
• Minimize joins between views
• Reduce the number of queries

Final goal:

• Cut the time and resource cost of fetching column information and raise overall throughput

However, this SQL is generated inside the Oracle JDBC driver's implementation of the JDBC metadata API, so to optimize it we first have to give up the standard JDBC metadata interface.

The two queries involved:

    -- Query 1
    select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=? ;
    
    -- Query 2 (the driver-internal SQL shown earlier)
    SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
             DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
             DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
             DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, 
             DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, 
             DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, 
             DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN 
    FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C 
    WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/'  AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
    ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
    

Tables and views involved:

• all_tab_cols
• ALL_TAB_COLUMNS
• ALL_COL_COMMENTS
• ALL_TYPES (via the subquery)
Working through the goals one by one:
1. Eliminate LIKE matching
Based on the actual requirements, the complex SQL can be adjusted to drop the LIKE matching:
    
    ...
    WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/'  AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
    ...
    becomes
    ...
    WHERE T.OWNER = :B3 AND T.TABLE_NAME = :B2 AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
    ...
    
2. Eliminate the subquery and the DECODE calls
The complex SQL contains the following subquery together with many nested DECODE calls.
    
    DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
             DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
             DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
             DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE
    

Analysis:
The subquery reads ALL_TYPES, Oracle's dictionary view of user-defined types, and several layers of DECODE then map the DATA_TYPE / TYPECODE values to the corresponding JDBC DATA_TYPE codes.

Optimization:

• Replace the DECODE logic with an equivalent Java mapping function (a sketch follows below).
• Cache the contents of ALL_TYPES so the query never joins against it, removing that extra database overhead.
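A minimal sketch of what such a Java-side mapping might look like (assumed code, not the author's actual implementation; the numeric values mirror the constants in the driver SQL above, and the ALL_TYPES lookup becomes a map loaded once per scan and passed in):

    import java.sql.Types;
    import java.util.Map;

    /** Sketch: map Oracle DATA_TYPE strings to JDBC type codes in Java instead of DECODE. */
    final class OracleTypeMapper {

        static int toJdbcType(String dataType, Map<String, String> allTypesCache) {
            // User-defined types: resolved from the cached ALL_TYPES rows (TYPE_NAME -> TYPECODE)
            // instead of the correlated subquery against ALL_TYPES.
            String typecode = allTypesCache.get(dataType);
            if ("OBJECT".equals(typecode))     return Types.STRUCT;   // 2002
            if ("COLLECTION".equals(typecode)) return Types.ARRAY;    // 2003

            // TIMESTAMP and INTERVAL variants, mirroring the SUBSTR/DECODE branches in the driver SQL.
            if (dataType.startsWith("TIMESTAMP")) {
                if (dataType.endsWith("WITH LOCAL TIME ZONE")) return -102; // OracleTypes.TIMESTAMPLTZ
                if (dataType.endsWith("WITH TIME ZONE"))       return -101; // OracleTypes.TIMESTAMPTZ
                return Types.TIMESTAMP;                                     // 93
            }
            if (dataType.startsWith("INTERVAL DAY"))  return -104;          // OracleTypes.INTERVALDS
            if (dataType.startsWith("INTERVAL YEAR")) return -103;          // OracleTypes.INTERVALYM

            switch (dataType) {
                case "CHAR":          return Types.CHAR;            // 1
                case "NCHAR":         return Types.NCHAR;           // -15
                case "VARCHAR2":      return Types.VARCHAR;         // 12
                case "NVARCHAR2":     return Types.NVARCHAR;        // -9
                case "NUMBER":        return Types.NUMERIC;         // 2
                case "FLOAT":         return Types.FLOAT;           // 6
                case "BINARY_FLOAT":  return 100;                   // OracleTypes.BINARY_FLOAT
                case "BINARY_DOUBLE": return 101;                   // OracleTypes.BINARY_DOUBLE
                case "DATE":          return Types.TIMESTAMP;       // 93, as in the driver SQL
                case "RAW":           return Types.VARBINARY;       // -3
                case "LONG":          return Types.LONGVARCHAR;     // -1
                case "LONG RAW":      return Types.LONGVARBINARY;   // -4
                case "BLOB":          return Types.BLOB;            // 2004
                case "CLOB":          return Types.CLOB;            // 2005
                case "NCLOB":         return Types.NCLOB;           // 2011
                case "BFILE":         return -13;                   // OracleTypes.BFILE
                case "ROWID":
                case "UROWID":        return Types.ROWID;           // -8
                case "XMLTYPE":       return Types.SQLXML;          // 2009
                default:              return Types.OTHER;           // 1111
            }
        }
    }

The allTypesCache can be filled once per scan with a single SELECT TYPE_NAME, TYPECODE FROM ALL_TYPES, which removes the per-row correlated subquery entirely.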
3. Reduce sorting in the Oracle database

Since we fetch one table at a time,

    ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
    
    becomes
    
     ORDER BY ORDINAL_POSITION;
    

The database sort could even be removed entirely in favor of sorting in Java (we have not done this yet); a possible sketch follows.
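If the ORDER BY were dropped, the sort is a one-liner in the JVM. A hypothetical sketch, assuming each fetched row has been copied into a Map that keeps its COLUMN_ID value:

    import java.util.Comparator;
    import java.util.List;
    import java.util.Map;

    // Hypothetical sketch: sort the fetched column rows client-side instead of in Oracle.
    class ClientSideSort {
        static void sortByColumnId(List<Map<String, Object>> rows) {
            rows.sort(Comparator.comparingInt(
                    (Map<String, Object> r) -> ((Number) r.get("COLUMN_ID")).intValue()));
        }
    }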

4. Reduce the number of queries

Merge the two column-related queries above into a single statement so the database is queried only once per table.

5. Reduce outer joins between views

The queries above involve the following tables and views:

    • all_tab_cols
    • ALL_TAB_COLUMNS
    • ALL_COL_COMMENTS
• ALL_TYPES (subquery; already replaced by a cache above)

To cut the cost of the outer joins as far as possible, we looked more closely at the three views all_tab_cols, ALL_TAB_COLUMNS and ALL_COL_COMMENTS. It turns out that ALL_TAB_COLUMNS is itself a view over all_tab_cols:

    CREATE VIEW
        ALL_TAB_COLUMNS
        (
            OWNER,
            TABLE_NAME,
            COLUMN_NAME,
            DATA_TYPE,
            DATA_TYPE_MOD,
            DATA_TYPE_OWNER,
            DATA_LENGTH,
            DATA_PRECISION,
            DATA_SCALE,
            NULLABLE,
            COLUMN_ID,
            DEFAULT_LENGTH,
            DATA_DEFAULT,
            NUM_DISTINCT,
            LOW_VALUE,
            HIGH_VALUE,
            DENSITY,
            NUM_NULLS,
            NUM_BUCKETS,
            LAST_ANALYZED,
            SAMPLE_SIZE,
            CHARACTER_SET_NAME,
            CHAR_COL_DECL_LENGTH,
            GLOBAL_STATS,
            USER_STATS,
            AVG_COL_LEN,
            CHAR_LENGTH,
            CHAR_USED,
            V80_FMT_IMAGE,
            DATA_UPGRADED,
            HISTOGRAM,
            DEFAULT_ON_NULL,
            IDENTITY_COLUMN,
            EVALUATION_EDITION,
            UNUSABLE_BEFORE,
            UNUSABLE_BEGINNING
        ) AS
    SELECT
        OWNER,
        TABLE_NAME,
        COLUMN_NAME,
        DATA_TYPE,
        DATA_TYPE_MOD,
        DATA_TYPE_OWNER,
        DATA_LENGTH,
        DATA_PRECISION,
        DATA_SCALE,
        NULLABLE,
        COLUMN_ID,
        DEFAULT_LENGTH,
        DATA_DEFAULT,
        NUM_DISTINCT,
        LOW_VALUE,
        HIGH_VALUE,
        DENSITY,
        NUM_NULLS,
        NUM_BUCKETS,
        LAST_ANALYZED,
        SAMPLE_SIZE,
        CHARACTER_SET_NAME,
        CHAR_COL_DECL_LENGTH,
        GLOBAL_STATS,
        USER_STATS,
        AVG_COL_LEN,
        CHAR_LENGTH,
        CHAR_USED,
        V80_FMT_IMAGE,
        DATA_UPGRADED,
        HISTOGRAM,
        DEFAULT_ON_NULL,
        IDENTITY_COLUMN,
        EVALUATION_EDITION,
        UNUSABLE_BEFORE,
        UNUSABLE_BEGINNING
    FROM
        ALL_TAB_COLS
    WHERE
        USER_GENERATED = 'YES';
    

So all the information the queries need is already available in the all_tab_cols and ALL_COL_COMMENTS views. Dropping the join against ALL_TAB_COLUMNS and adjusting the SQL to our needs gives:

     SELECT T.DATA_DEFAULT, T.COLUMN_ID,  T.COLUMN_NAME,  T.DATA_TYPE, T.DATA_TYPE_OWNER, T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, T.NULLABLE, T.CHAR_LENGTH, T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN 
     FROM ALL_TAB_COLS T LEFT JOIN ALL_COL_COMMENTS C ON T.OWNER = C.OWNER  AND T.TABLE_NAME = C.TABLE_NAME  AND T.COLUMN_NAME = C.COLUMN_NAME 
     WHERE  T.USER_GENERATED = 'YES' AND  T.OWNER=? AND T.TABLE_NAME=? 
     ORDER BY T.COLUMN_ID ASC 
    

This single statement covers all the information we need.
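On the Java side, the calls to DatabaseMetaData.getColumns() and getColumnExt() can then be replaced by running this combined statement directly, one round trip per table. A minimal sketch under assumed names (ColumnInfo and FETCH_SIZE are illustrative; the real program stores each column in a Document object):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    // Illustrative value object for one column of a table.
    class ColumnInfo {
        String name, dataType, dataTypeOwner, nullable, charUsed, comments, dataDefault, virtualColumn;
        int columnId;
        long dataLength, charLength;
        Integer precision, scale;   // may be NULL in the dictionary
    }

    class ColumnReader {
        private static final int FETCH_SIZE = 500;   // assumed batch size

        /** One query per table, no JDBC metadata API. Uses the 10g/11g predicate
         *  HIDDEN_COLUMN = 'NO'; on 12c use USER_GENERATED = 'YES' instead. */
        List<ColumnInfo> getColumns(Connection conn, String schema, String tabname) throws SQLException {
            String sql =
                  "SELECT T.DATA_DEFAULT, T.COLUMN_ID, T.COLUMN_NAME, T.DATA_TYPE, T.DATA_TYPE_OWNER, "
                + "       T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, T.NULLABLE, T.CHAR_LENGTH, "
                + "       T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN "
                + "FROM ALL_TAB_COLS T "
                + "LEFT JOIN ALL_COL_COMMENTS C ON T.OWNER = C.OWNER "
                + "     AND T.TABLE_NAME = C.TABLE_NAME AND T.COLUMN_NAME = C.COLUMN_NAME "
                + "WHERE T.HIDDEN_COLUMN = 'NO' AND T.OWNER = ? AND T.TABLE_NAME = ? "
                + "ORDER BY T.COLUMN_ID";
            List<ColumnInfo> cols = new ArrayList<>();
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setFetchSize(FETCH_SIZE);
                ps.setString(1, schema);
                ps.setString(2, tabname);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        ColumnInfo c = new ColumnInfo();
                        c.dataDefault   = rs.getString("DATA_DEFAULT"); // LONG column: read it first
                        c.columnId      = rs.getInt("COLUMN_ID");
                        c.name          = rs.getString("COLUMN_NAME");
                        c.dataType      = rs.getString("DATA_TYPE");
                        c.dataTypeOwner = rs.getString("DATA_TYPE_OWNER");
                        c.dataLength    = rs.getLong("DATA_LENGTH");
                        int p = rs.getInt("DATA_PRECISION");
                        c.precision     = rs.wasNull() ? null : p;
                        int s = rs.getInt("DATA_SCALE");
                        c.scale         = rs.wasNull() ? null : s;
                        c.nullable      = rs.getString("NULLABLE");
                        c.charLength    = rs.getLong("CHAR_LENGTH");
                        c.charUsed      = rs.getString("CHAR_USED");
                        c.comments      = rs.getString("COMMENTS");
                        c.virtualColumn = rs.getString("VIRTUAL_COLUMN");
                        cols.add(c);
                    }
                }
            }
            return cols;
        }
    }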

We asked the customer to run a before/after comparison of the SQL:
schema: ABM
table: ABM_IMP_BILL_LINES

    ------- Before optimization --------
    SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
             DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
             DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
             DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, 
             DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, 
             DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, 
             DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN 
    FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C 
    WHERE T.OWNER LIKE 'ABM' ESCAPE '/' AND T.TABLE_NAME LIKE 'ABM/_IMP/_BILL/_LINES' ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
    ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;
    
    Elapsed: 00:00:26.14
    
    ------- After optimization --------
    --oracle 10/11g
    SELECT T.DATA_DEFAULT, T.COLUMN_ID,  T.COLUMN_NAME,  T.DATA_TYPE, T.DATA_TYPE_OWNER, T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, T.NULLABLE, T.CHAR_LENGTH, T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN 
     FROM ALL_TAB_COLS T 
     LEFT JOIN ALL_COL_COMMENTS C ON T.OWNER = C.OWNER  AND T.TABLE_NAME = C.TABLE_NAME  AND T.COLUMN_NAME = C.COLUMN_NAME 
     WHERE  T.HIDDEN_COLUMN = 'NO' AND  T.OWNER='ABM' AND T.TABLE_NAME='ABM_IMP_BILL_LINES' 
     ORDER BY T.COLUMN_ID ASC ;
     
    Elapsed: 00:00:00.16
    
    

This completes the optimization of the most expensive SQL: fetching one table's column information drops from an average of about 29 s (26.14 s in the test above) to roughly 0.16 s.

4. Other Optimizations

The above was only the most expensive SQL; when the AWR report was captured, the other collection steps had already been disabled. Two of those steps are also relatively expensive:

• table data size
• table row count

During optimization and internal testing these two stood out, so to improve performance while keeping the collected information complete, the process was changed as follows (a sketch follows below):

• When metadata collection for a schema starts, query the row counts and data sizes of every table in that schema in one pass and cache the results.
• Whenever a later collection step needs a table's row count or size, it reads it straight from the cache.
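A minimal sketch of that prefetch-and-cache step (assumed code, not the author's implementation; it also assumes that optimizer statistics in ALL_TABLES and segment sizes from DBA_SEGMENTS are acceptable sources, whereas the actual program may obtain row counts and sizes differently):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.Map;

    /** Sketch: load row counts and sizes for a whole schema in one query and cache them. */
    class SchemaStatsCache {
        static final class TableStats {
            final long numRows;
            final long bytes;
            TableStats(long numRows, long bytes) { this.numRows = numRows; this.bytes = bytes; }
        }

        private final Map<String, TableStats> cache = new HashMap<>();

        /** Called once when collection of a schema starts. */
        void load(Connection conn, String schema) throws SQLException {
            String sql =
                  "SELECT T.TABLE_NAME, NVL(T.NUM_ROWS, 0) NUM_ROWS, NVL(S.BYTES, 0) BYTES "
                + "FROM ALL_TABLES T "
                + "LEFT JOIN (SELECT OWNER, SEGMENT_NAME, SUM(BYTES) BYTES "
                + "            FROM DBA_SEGMENTS GROUP BY OWNER, SEGMENT_NAME) S "
                + "  ON S.OWNER = T.OWNER AND S.SEGMENT_NAME = T.TABLE_NAME "
                + "WHERE T.OWNER = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setFetchSize(1000);           // stream the whole schema in large batches
                ps.setString(1, schema);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        cache.put(rs.getString("TABLE_NAME"),
                                  new TableStats(rs.getLong("NUM_ROWS"), rs.getLong("BYTES")));
                    }
                }
            }
        }

        /** Later collection steps read from the cache instead of querying per table. */
        TableStats get(String tableName) { return cache.get(tableName); }
    }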

5. Performance Improvement

In-house test (Oracle 12c, 200,000 tables in the database, scanned with 4 threads):
Before optimization: metadata for 6-10 tables scanned per second.
After optimization: an average of 54 tables per second, so the 200,000 tables finish in roughly one hour.

Customer environment (HR database, 30,000+ tables):
Elapsed time before optimization: 3 h 49 min.
Elapsed time after optimization: 4 min 35 s.

6. Further Optimization

Although the customer was satisfied with the optimization of their environment, the data showed that the scan could be pushed further.

So the remaining expensive scan queries were given the same prefetch-and-cache treatment as the table row counts and data sizes (a sketch follows after this list):

• check constraints
• primary key constraints
• foreign key constraints
• unique constraints
• others (miscellaneous table information, temporary-table flag, ...)
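For illustration, an assumed sketch of the same pattern applied to constraints (not the author's code; only ALL_CONSTRAINTS is read here, while the real scan also covers the other items listed above):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    /** Sketch: load all constraints of a schema once, then serve per-table lookups from memory. */
    class ConstraintCache {
        // table name -> "CONSTRAINT_NAME|TYPE" entries (P = primary key, R = foreign key, U = unique, C = check)
        private final Map<String, List<String>> byTable = new HashMap<>();

        void load(Connection conn, String schema) throws SQLException {
            String sql = "SELECT TABLE_NAME, CONSTRAINT_NAME, CONSTRAINT_TYPE "
                       + "FROM ALL_CONSTRAINTS WHERE OWNER = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setFetchSize(1000);
                ps.setString(1, schema);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        byTable.computeIfAbsent(rs.getString("TABLE_NAME"), k -> new ArrayList<>())
                               .add(rs.getString("CONSTRAINT_NAME") + "|" + rs.getString("CONSTRAINT_TYPE"));
                    }
                }
            }
        }

        List<String> constraintsOf(String tableName) {
            return byTable.getOrDefault(tableName, Collections.emptyList());
        }
    }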

In-house test (Oracle 12c, 200,000 tables, 4 scan threads):
throughput rose from 54 to 65 tables scanned per second.

Customer environment (Oracle 10.2, 200,000 tables in the database):
the scan time is expected to drop to about 0.2 s per table (from the original 32 s per table).

Appendix: generating the AWR report

1. Run the snapshot-creation procedure twice, with some time in between:

    SQL> exec dbms_workload_repository.create_snapshot();
    

2. Run the AWR report script:

     @/app/oracle/product/12c/db_1/rdbms/admin/awrrpt.sql
    

Enter the report type, the start and end snapshot IDs, and the report file name, and the corresponding report file is generated.

For details, see: https://blog.csdn.net/defonds/article/details/52958303
