美文网首页
Oracle jdbc获取元数据过程优化

Oracle jdbc获取元数据过程优化

作者: hemny | 来源:发表于2018-11-09 15:51 被阅读40次

一、背景

因业务需要,需要使用java程序自动获取oracle数据库的元数据信息(包括表、字段、索引、约束。。。)。
在个别客户环境中,获取元数据信息的效率低下,获取速度低至每分钟6个表(还是4线程同时在工作,换算为单线程,平均每个数据表的元数据需要接近40s),在如此慢的效率下,如果客户单个数据库有20万个表,全部获取完成需要消耗时间:200 000(张表) * 单个线程32秒每张表 / 4线程 = 1600000 秒 = 444.44 小时 = 18.52 天。
仅仅获取全库20W的元数据就要消耗18天,如此效率会影响到其他业务或者工作的开展。

二、原因分析

因客户无法接受这么慢的处理效率,配合我们分析慢的原因,采集了一份元数据扫描处理过程中的Oracle AWR 报告,供我们分析原因和解决问题。

一份完整的AWR报告,包含oracle数据库中大部分处理信息


image.png

分析SQL统计 —— SQL Statistics


image.png

分析SQL执行耗时—— SQL ordered by Elapsed Time


image.png

这里可以看到最耗时的一个SQL,一个小时内执行了450次,执行一次平均需要29秒。
查看详细的SQL:


image.png
SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/' AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION

格式化之后

SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
         DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
         DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
         DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, 
         DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, 
         DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, 
         DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN 
FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C 
WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/'  AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;

SQL性能分析:

  • 两个视图之间外关联
  • 子查询
  • 模糊查询
  • 大量调用 oracle的DECODE函数
  • 3个字段的排序

SQL出处
分析SQL查询结果,是为了查询一个表的字段信息,但是并非我们代码中写的SQL。
回归程序,找到获取oracle表字段信息的代码:

...
    /**
     * 获取字段拓展信息
     * 
     * @param catalog
     *            数据库名
     * @param schema
     *            用户名
     * @param table
     *            表名
     * @return 字段拓展信息map对象
     */
    protected Map<String, Map> getColumnExt(Connection conn, String catalog, String schema, String tabname) {
        Map<String, Map> m_ext = new HashMap<String, Map>();
        String sql = "select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used"
                + " from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=?";
        String colname, coltype, virtualCol, charUsed;
        long charLength, dataLength;
        boolean isPrecNull, isScaleNull;
        try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
            pstmt.setFetchSize(FETCH_SIZE);
            pstmt.setString(1, schema);
            pstmt.setString(2, tabname);
            try (ResultSet rs = pstmt.executeQuery()) {
                while (rs.next()) {
                    ...
                }
            }
        } catch (SQLException ex) {
            ...
            throw new RuntimeException(ex);
        }
        return m_ext;
    }

    // 获取表中的所有字段以及定义信息
    protected List<Document> getColumns(Connection conn, DatabaseMetaData metadata, String catalog, String schemaname, String tabname, String tabid, String dbcharset) {
        ...
        ResultSet rs = null;
        try {
            ...
            Map<String, Map> m_ext = this.getColumnExt(conn, catalog, schemaname, tabname);

            ...
            rs = metadata.getColumns(catalog, schema_tmp, table_tmp, null);
            rs.setFetchSize(1);
            ...
            while (rs.next()) {
                ...
            }
        } catch (SQLException e) {
            ...
            throw new RuntimeException(e);
        } finally {
            if (rs != null) { // 关闭结果集
                try {
                    rs.close();
                } catch (SQLException e) {
                    logger.error("Resultset close failed with " + e.getMessage(), e);
                }
            }
        }
        return cols;
    }
...

刚才那个执行耗时最多的SQL,是以下命令发出的,jdbc驱动内部的SQL。

rs = metadata.getColumns(catalog, schema_tmp, table_tmp, null);

原来获取一个表的信息执行了这个SQL外,还要执行以下SQL。

select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=?

小结
获取一个表的字段信息,需要查询两次数据库,并且有一个查询的效率非常低,平均耗时接近30秒。

三、优化过程

分解目标:

  • 取消模糊查询:一般情况下,oracle的模糊查询需要全表扫描+模糊匹配,耗时非常大
  • 取消子查询:子查询效率低
  • 取消 DECODE 函数 : 尽量减少函数调用,减轻数据库处理消耗
  • 减少oracle数据库数据排序
  • 尽量减少 表间关联查询
  • 减少 查询次数

最终目标:

  • 减少字段信息获取的时间及资源开销,提升处理效率

但是这个SQL是oracle jdbc驱动内部实现jdbc元数据接口的函数,为了优化SQL提升性能,首先就要放弃使用jdbc标准的元数据接口。

以下两个查询:

#查询1
select column_name colname,data_type coltype,data_precision prec,data_scale scale,virtual_column virtualcolumn,data_length,char_length,char_used from all_tab_cols where HIDDEN_COLUMN='NO' and owner=? and table_name=? ;

#查询2
SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
         DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
         DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
         DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, 
         DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, 
         DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, 
         DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN 
FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C 
WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/'  AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;

涉及的表或试图:

  • all_tab_cols
  • ALL_TAB_COLUMNS
  • ALL_COL_COMMENTS
  • ALL_TYPES (子查询)
目标逐个实现:
1. 取消模糊查询
根据业务需要,对复杂SQL调整,取消模糊查询
...
WHERE T.OWNER LIKE :B3 ESCAPE '/' AND T.TABLE_NAME LIKE :B2 ESCAPE '/'  AND T.COLUMN_NAME LIKE :B1 ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
...
改为
...
WHERE T.OWNER = :B3 AND T.TABLE_NAME = :B2 AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
...
2. 取消子查询,同时取消 DECODE 函数
在复杂SQL中,存在以下子查询,并使用很多decode函数。
DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
         DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
         DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
         DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE

分析:
子查询涉及表ALL_TYPES ,这是oracle的类型表,然后使用多层decode函数对DATA_TYPE、TYPECODE的值进行分支判断,返回jdbc对应的DATA_TYPE。

优化方案:

  • 对应DECODE函数,采用java函数处理代替oracle decode函数。
  • 对应ALL_TYPES表采用缓存,避免对其关联查询找出额外的数据库资源开销。
3. 减少oracle数据库数据排序

目前是每次获取一个表

ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;

改为

 ORDER BY ORDINAL_POSITION;

还可以直接取消数据库排序,采用java函数排序。(目前没有做到这点)。

4. 减少查询次数

将上面两个涉及到字段的查询,合成一个查询,避免多次查询数据库。

4. 减少 表外关联查询

以上查询涉及表或视图

  • all_tab_cols
  • ALL_TAB_COLUMNS
  • ALL_COL_COMMENTS
  • ALL_TYPES (子查询,前面已经对其缓存)

为了尽可能减少外关联查询造成的开销,对all_tab_cols 、ALL_TAB_COLUMNS、ALL_COL_COMMENTS 这3个视图进一步了解。其中ALL_TAB_COLUMNS是all_tab_cols 的视图。

CREATE VIEW
    ALL_TAB_COLUMNS
    (
        OWNER,
        TABLE_NAME,
        COLUMN_NAME,
        DATA_TYPE,
        DATA_TYPE_MOD,
        DATA_TYPE_OWNER,
        DATA_LENGTH,
        DATA_PRECISION,
        DATA_SCALE,
        NULLABLE,
        COLUMN_ID,
        DEFAULT_LENGTH,
        DATA_DEFAULT,
        NUM_DISTINCT,
        LOW_VALUE,
        HIGH_VALUE,
        DENSITY,
        NUM_NULLS,
        NUM_BUCKETS,
        LAST_ANALYZED,
        SAMPLE_SIZE,
        CHARACTER_SET_NAME,
        CHAR_COL_DECL_LENGTH,
        GLOBAL_STATS,
        USER_STATS,
        AVG_COL_LEN,
        CHAR_LENGTH,
        CHAR_USED,
        V80_FMT_IMAGE,
        DATA_UPGRADED,
        HISTOGRAM,
        DEFAULT_ON_NULL,
        IDENTITY_COLUMN,
        EVALUATION_EDITION,
        UNUSABLE_BEFORE,
        UNUSABLE_BEGINNING
    ) AS
SELECT
    OWNER,
    TABLE_NAME,
    COLUMN_NAME,
    DATA_TYPE,
    DATA_TYPE_MOD,
    DATA_TYPE_OWNER,
    DATA_LENGTH,
    DATA_PRECISION,
    DATA_SCALE,
    NULLABLE,
    COLUMN_ID,
    DEFAULT_LENGTH,
    DATA_DEFAULT,
    NUM_DISTINCT,
    LOW_VALUE,
    HIGH_VALUE,
    DENSITY,
    NUM_NULLS,
    NUM_BUCKETS,
    LAST_ANALYZED,
    SAMPLE_SIZE,
    CHARACTER_SET_NAME,
    CHAR_COL_DECL_LENGTH,
    GLOBAL_STATS,
    USER_STATS,
    AVG_COL_LEN,
    CHAR_LENGTH,
    CHAR_USED,
    V80_FMT_IMAGE,
    DATA_UPGRADED,
    HISTOGRAM,
    DEFAULT_ON_NULL,
    IDENTITY_COLUMN,
    EVALUATION_EDITION,
    UNUSABLE_BEFORE,
    UNUSABLE_BEGINNING
FROM
    ALL_TAB_COLS
WHERE
    USER_GENERATED = 'YES';

所以,所有查询需要的信息都在all_tab_cols、ALL_COL_COMMENTS 两个视图中,取消对ALL_TAB_COLUMNS的关联,根据业务需要调整SQL:

 SELECT T.DATA_DEFAULT, T.COLUMN_ID,  T.COLUMN_NAME,  T.DATA_TYPE, T.DATA_TYPE_OWNER, T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, T.NULLABLE, T.CHAR_LENGTH, T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN 
 FROM ALL_TAB_COLS T LEFT JOIN ALL_COL_COMMENTS C ON T.OWNER = C.OWNER  AND T.TABLE_NAME = C.TABLE_NAME  AND T.COLUMN_NAME = C.COLUMN_NAME 
 WHERE  T.USER_GENERATED = 'YES' AND  T.OWNER=? AND T.TABLE_NAME=? 
 ORDER BY T.COLUMN_ID ASC 

以上SQL涵盖我们需要的信息。

让客户做SQL优化对比测试:
schema:ABM
table:ABM_IMP_BILL_LINES

-------优化前--------
SELECT NULL AS TABLE_CAT, T.OWNER AS TABLE_SCHEM, T.TABLE_NAME AS TABLE_NAME, T.COLUMN_NAME AS COLUMN_NAME, DECODE( (SELECT A.TYPECODE FROM ALL_TYPES A WHERE A.TYPE_NAME = T.DATA_TYPE), 'OBJECT', 2002, 'COLLECTION', 2003, 
         DECODE(SUBSTR(T.DATA_TYPE, 1, 9), 'TIMESTAMP', DECODE(SUBSTR(T.DATA_TYPE, 10, 1), '(', DECODE(SUBSTR(T.DATA_TYPE, 19, 5), 'LOCAL', -102, 'TIME ', -101, 93), 
         DECODE(SUBSTR(T.DATA_TYPE, 16, 5), 'LOCAL', -102, 'TIME ', -101, 93)), 'INTERVAL ', DECODE(SUBSTR(T.DATA_TYPE, 10, 3), 'DAY', -104, 'YEA', -103), 
         DECODE(T.DATA_TYPE, 'BINARY_DOUBLE', 101, 'BINARY_FLOAT', 100, 'BFILE', -13, 'BLOB', 2004, 'CHAR', 1, 'CLOB', 2005, 'COLLECTION', 2003, 'DATE', 93, 'FLOAT', 6, 'LONG', -1, 'LONG RAW', -4, 'NCHAR', -15, 'NCLOB', 2011, 'NUMBER', 2, 'NVARCHAR', -9, 'NVARCHAR2', -9, 'OBJECT', 2002, 'OPAQUE/XMLTYPE', 2009, 'RAW', -3, 'REF', 2006, 'ROWID', -8, 'SQLXML', 2009, 'UROWID', -8, 'VARCHAR2', 12, 'VARRAY', 2003, 'XMLTYPE', 2009, 1111))) AS DATA_TYPE, T.DATA_TYPE AS TYPE_NAME, 
         DECODE (T.DATA_PRECISION, NULL, DECODE(T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_SCALE, NULL, 0 , 38), DECODE (T.DATA_TYPE, 'CHAR', T.CHAR_LENGTH, 'VARCHAR', T.CHAR_LENGTH, 'VARCHAR2', T.CHAR_LENGTH, 'NVARCHAR2', T.CHAR_LENGTH, 'NCHAR', T.CHAR_LENGTH, 'NUMBER', 0, T.DATA_LENGTH) ), T.DATA_PRECISION) AS COLUMN_SIZE, 0 AS BUFFER_LENGTH, 
         DECODE (T.DATA_TYPE, 'NUMBER', DECODE(T.DATA_PRECISION , NULL, 
         DECODE(T.DATA_SCALE, NULL, -127 , T.DATA_SCALE), T.DATA_SCALE), T.DATA_SCALE) AS DECIMAL_DIGITS, 10 AS NUM_PREC_RADIX, DECODE (T.NULLABLE, 'N', 0, 1) AS NULLABLE, C.COMMENTS AS REMARKS, T.DATA_DEFAULT AS COLUMN_DEF, 0 AS SQL_DATA_TYPE, 0 AS SQL_DATETIME_SUB, T.DATA_LENGTH AS CHAR_OCTET_LENGTH, T.COLUMN_ID AS ORDINAL_POSITION, DECODE (T.NULLABLE, 'N', 'NO', 'YES') AS IS_NULLABLE, NULL AS SCOPE_CATALOG, NULL AS SCOPE_SCHEMA, NULL AS SCOPE_TABLE, NULL AS SOURCE_DATA_TYPE, 'NO' AS IS_AUTOINCREMENT, NULL AS IS_GENERATEDCOLUMN 
FROM ALL_TAB_COLUMNS T, ALL_COL_COMMENTS C 
WHERE T.OWNER LIKE 'ABM' ESCAPE '/' AND T.TABLE_NAME LIKE 'ABM/_IMP/_BILL/_LINES' ESCAPE '/' AND T.OWNER = C.OWNER (+) AND T.TABLE_NAME = C.TABLE_NAME (+) AND T.COLUMN_NAME = C.COLUMN_NAME (+) 
ORDER BY TABLE_SCHEM, TABLE_NAME, ORDINAL_POSITION;

Elapsed: 00:00:26.14

-------优化后--------
--oracle 10/11g
SELECT T.DATA_DEFAULT, T.COLUMN_ID,  T.COLUMN_NAME,  T.DATA_TYPE, T.DATA_TYPE_OWNER, T.DATA_LENGTH, T.DATA_PRECISION, T.DATA_SCALE, T.NULLABLE, T.CHAR_LENGTH, T.CHAR_USED, C.COMMENTS, T.VIRTUAL_COLUMN 
 FROM ALL_TAB_COLS T 
 LEFT JOIN ALL_COL_COMMENTS C ON T.OWNER = C.OWNER  AND T.TABLE_NAME = C.TABLE_NAME  AND T.COLUMN_NAME = C.COLUMN_NAME 
 WHERE  T.HIDDEN_COLUMN = 'NO' AND  T.OWNER='ABM' AND T.TABLE_NAME='ABM_IMP_BILL_LINES' 
 ORDER BY T.COLUMN_ID ASC ;
 
Elapsed: 00:00:00.16

这里就完成了耗时最多的SQL优化,每个表获取字段信息由原来的耗时 29s 优化为 0.15s

四、其他优化

以上只是最耗时的SQL,因采集AWR报告时,已经屏蔽了其它信息采集,其中有两个相对比较耗时的信息采集:

  • 表数据量
  • 表记录数
    在优化和内部测试过程中,发现这两个相对比较耗时,为了提升性能同时保留采集信息完整性。对处理过程进行优化:
  • 在开始采集一个schem的元数据时,一次性把整个schema的表记录数和数据量查出来并缓存。
  • 后面采集过程中,需要 表记录数和数据量 。就直接在缓存中取。

五、性能提升

对于公司内部测试(oracle 12c 整库20W张表,并使用4线程扫描):
优化之前每秒扫描6~10个表的元数据
优化后平均每秒扫描54个表的元数据,20万张表在一个小时左右完成

客户环境(HR库,3W+张表):
优化前耗时(3小时49分)


image.png

优化后耗时(4分35秒)


image.png

五、进一步优化

虽然对客户环境的数据库优化,已经让客户满意,但是根据数据分析,客户环境扫描优化还能进一步优化。

所以,对耗时较多的扫描查询,也采用与表记录数、表数据量相同的优化方案:

  • check约束
  • 主键约束
  • 外键约束
  • 唯一约束
  • 其他(表其他信息、临时表标志。。。)

对于公司内部测试(oracle 12c 整库20W张表,并使用4线程扫描):
每秒扫描54张表提升到 每秒扫描65张表。

客户环境中(客户oracle 10.2 整个库20W表):
预计可以将扫描时间降到0.2秒/每表(原来每个表32秒)。

附录:生成AWR的过程:

1、执行两次生成快照的过程,中间需要间隔一定时间

SQL> exec dbms_workload_repository.create_snapshot();

2、执行生成AWR报告的脚本

 @/app/oracle/product/12c/db_1/rdbms/admin/awrrpt.sql

输入报告的类型、报告开始快照id及结束快照id、快照文件名,就会生成相应的报告文件。

具体参考:https://blog.csdn.net/defonds/article/details/52958303

相关文章

网友评论

      本文标题:Oracle jdbc获取元数据过程优化

      本文链接:https://www.haomeiwen.com/subject/demnxqtx.html