美文网首页
C#中通过SQLLoader导入csv文件到Oracle以及第一

C#中通过SQLLoader导入csv文件到Oracle以及第一

作者: 百里有声 | 来源:发表于2022-09-19 18:42 被阅读0次

    实现的功能为:

    1:从csv文件中的第一行提取列名生成数据库表
    2:将原始的csv文件转换成sqlloder能识别的csv(增加列),根据时间列生成一个整形主键列
    3:生成ctl控制文件
    4:启动Sqlloder并导入

    这里做记录主要是导入时出现了莫名其妙的问题

    第一行第一列总是会报错误
    第一列为数字列,则SQLLoader报 ORA-01722
    第一列调整成时间列并将数字列放到最后,则SQLLoader报时间列转换错误。
    参考各种文章不能解决,参考的解决方案包括
    1:在ctl文件中指定字段类型
    https://blog.csdn.net/qq_25798961/article/details/88534104

    CHAR                         字符型
    INTEGER EXTERNAL    整型
    DECIMAL EXTERNAL    浮点型
    DATE 'YYYYMMDD'      日期型
    

    数字列增加 选项 INTEGER EXTERNAL 没有效果
    2:数据列做转换
    https://community.oracle.com/tech/developers/discussion/3684710/ora-01722-invalid-number-in-sql-loader

    Options(errors=10000,skip=3)
    LOAD DATA
    INFILE '/home/appltest/Sheet1.csv'
    BADFILE 'sample.bad'
    DISCARDFILE 'sample.dsc'
    REPLACE
    INTO TABLE sampl_table
    fields terminated by ','
    optionally enclosed by '"'  
    trailing nullcols
    (
    customer_name  CHAR NULLIF customer_name = BLANKS "RTRIM(:customer_name)",
    customer_no  CHAR NULLIF customer_no = BLANKS "RTRIM(:customer_no)",
    item_no  CHAR NULLIF item_no = BLANKS "RTRIM(:item_no )",
    field1      FILLER,
    field2      FILLER,
    valid_amount_month1    "to_number(:valid_amount_month1 , '999,999,999.99')"  ,
    valid_units_month1      "to_number(substr(:valid_units_month1,instr(:valid_units_month1,'$')+1),'999,999,999.99')",
    field3      FILLER,
    field4      FILLER,
    valid_budget_units_month1    "to_number(:valid_budget_units_month1, '999,999,999.99')"                                      ,
    valid_budget_amount_month1  "to_number(trim(substr(:valid_budget_amount_month1,instr(:valid_budget_amount_month1,'$')+1)),'999,999,999.99')"    ,
    valid_rate    CHAR NULLIF valid_rate = BLANKS "RTRIM(:valid_rate)"
    )
    

    3:ctl文件增加 "str '\r\n'" 项没有效果
    3:ctl文件第一列加上"replace(replace(:CHANNELKEY, CHR(13), ''), CHR(10), '')" 配置项,没有效果

    最终通过以下方案解决

    1:设置生成的csv中包含一行头信息
    2:ctl中包含一个OPTIONS(skip=1)配置,跳过头信息行的导入
    究其根本仍不知道是 文件编码 还是 windows下换行 等等原因导致的。

    核心代码如下:

      public void Process()
            {
                ClearFiles();
                ReBuildDataFile(); //解析-1:将输入文件转换成Sqlloder可识别的文件,拼接时间,生成主键
                CreateDataTable(); //解析-2:生成表结构
                BuildCtlFile();    //解析-3:生成Sqlloder控制文件
                StartSqlloader();  //解析-4:启动Sqlloder,执行导入
            }
    
    
            //生成sqloader可识别的控制文件
            private string BuildCtlFile()
            {
                string strError = string.Empty;
                string strColumns = AttachKeyColumn();      
                strColumns = strColumns.Replace("DATETIMES", "DATETIMES \"to_timestamp(:DATETIMES,'yyyy-mm-dd hh24:mi:ss.ff')\"");
                StringBuilder sb = new StringBuilder();
                sb.AppendLine("OPTIONS(skip=1)");
                sb.AppendLine("load data");
                sb.AppendLine(string.Format("infile '{0}' \"str '\\r\\n'\"", GetCsvFileFullName()));
                sb.AppendLine(string.Format("append into table {0}", GetTableName()));
                sb.AppendLine(string.Format("fields terminated by '{0}'", this.splitChar.ToString()));
                sb.AppendLine("optionally enclosed by '\"'");
                sb.AppendLine("trailing nullcols");
                sb.AppendLine(string.Format("({0})", strColumns));
                GenerateFile(GetCtlFileFullName(), sb.ToString());
                return strError;
            }
    
            private string StartSqlloader()
            {
                string strError = string.Empty;
                string output = string.Empty;
                bool processOk = true;
                using (System.Diagnostics.Process p = new System.Diagnostics.Process())
                {
                    p.StartInfo.FileName = this.SqlLoaderConfig;
                    p.StartInfo.Arguments = string.Format(" {0} control={1} log={2}", this.OracleDBConfig, GetCtlFileFullName(), GetCtlFileFullName().Replace("ctl", "log"));
                    p.StartInfo.CreateNoWindow = true; //不再新窗口启动
                    p.StartInfo.UseShellExecute = false; //不使用操作系统的shell启动
                    p.StartInfo.RedirectStandardOutput = true; //输出重定向
                    p.StartInfo.RedirectStandardError = true;
                    p.StartInfo.RedirectStandardInput = true;
                    //p.Exited += P_Exited;
                    p.Start();
                    output = p.StandardOutput.ReadToEnd();
                    //ExitCode 11g 12c和9i不同
                    if (p.ExitCode == 1) //Ex_Fail
                    {
                        processOk = false;
                    }
                    if (p.ExitCode == 2) //Ex_Warn
                    {
                        processOk = false;
                    }
                    if (p.ExitCode == 4) //Ex_FTL
                    {
                        processOk = false;
                    }
                }
                if (!processOk)
                {
                    throw new IOException(output);
                }
    
                return strError;
            }
    

    原始CSV文件为:

    DATETIMES,CD_000_0,CD_000_1,CD_000_2
    2022-09-19 16:11:17.720,100,200,300
    2022-09-19 16:11:17.721,101,201,301
    2022-09-19 16:11:17.722,102,202,302
    2022-09-19 16:11:17.723,103,203,303
    2022-09-19 16:11:17.724,104,204,304
    2022-09-19 16:11:17.725,105,205,305
    2022-09-19 16:11:17.726,106,206,306
    2022-09-19 16:11:17.727,107,207,307
    2022-09-19 16:11:17.728,108,208,308
    2022-09-19 16:11:17.729,109,209,309
    2022-09-19 16:11:17.730,110,210,310
    2022-09-19 16:11:17.731,111,211,311
    2022-09-19 16:11:17.732,112,212,312
    2022-09-19 16:11:17.733,113,213,313
    2022-09-19 16:11:17.734,114,214,314
    2022-09-19 16:11:17.735,115,215,315
    2022-09-19 16:11:17.736,116,216,316
    2022-09-19 16:11:17.737,117,217,317
    2022-09-19 16:11:17.738,118,218,318
    2022-09-19 16:11:17.739,119,219,319
    2022-09-19 16:11:17.740,120,220,320
    

    转换成的CSV文件为:

    CHANNELKEY INTEGER EXTERNAL,DATETIMES,CD_000_0,CD_000_1,CD_000_2
    4295477720,2022-09-19 17:11:17.720,100,200,300
    4295477721,2022-09-19 17:11:17.721,101,201,301
    4295477722,2022-09-19 17:11:17.722,102,202,302
    4295477723,2022-09-19 17:11:17.723,103,203,303
    4295477724,2022-09-19 17:11:17.724,104,204,304
    4295477725,2022-09-19 17:11:17.725,105,205,305
    4295477726,2022-09-19 17:11:17.726,106,206,306
    4295477727,2022-09-19 17:11:17.727,107,207,307
    4295477728,2022-09-19 17:11:17.728,108,208,308
    4295477729,2022-09-19 17:11:17.729,109,209,309
    4295477730,2022-09-19 17:11:17.730,110,210,310
    4295477731,2022-09-19 17:11:17.731,111,211,311
    4295477732,2022-09-19 17:11:17.732,112,212,312
    4295477733,2022-09-19 17:11:17.733,113,213,313
    4295477734,2022-09-19 17:11:17.734,114,214,314
    4295477735,2022-09-19 17:11:17.735,115,215,315
    4295477736,2022-09-19 17:11:17.736,116,216,316
    4295477737,2022-09-19 17:11:17.737,117,217,317
    4295477738,2022-09-19 17:11:17.738,118,218,318
    4295477739,2022-09-19 17:11:17.739,119,219,319
    4295477740,2022-09-19 17:11:17.740,120,220,320
    
    

    生成的ctl文件为:

    OPTIONS(skip=1)
    load data
    infile 'D:\******\TemplateData\1cf37b73-6727-4618-b636-73a9292ef33d.csv' "str '\r\n'"
    append into table FZ_41657_01
    fields terminated by ','
    optionally enclosed by '"'
    trailing nullcols
    (CHANNELKEY INTEGER EXTERNAL,DATETIMES "to_timestamp(:DATETIMES,'yyyy-mm-dd hh24:mi:ss.ff')",CD_000_0,CD_000_1,CD_000_2)
    

    附ctl文件说明:

    https://blog.csdn.net/demonson/article/details/79712207

    OPTIONS (skip=1,rows=128) -- sqlldr 命令显示的 选项可以写到这里边来,skip=1 用来跳过数据中的第一行  
    LOAD DATA  
    INFILE "users_data.csv" --指定外部数据文件,可以写多 个 INFILE "another_data_file.csv" 指定多个数据文件  
    --这里还可以使 用 BADFILE、DISCARDFILE 来指定坏数据和丢弃数据的文件,  
    truncate --操作类型,用 truncate table 来清除表中原有 记录  
    INTO TABLE users -- 要插入记录的表  
    Fields terminated by "," -- 数据中每行记录用 "," 分隔  
    Optionally enclosed by '"' -- 数据中每个字段用 '"' 框起,比如字段中有 "," 分隔符时  
    trailing nullcols --表的字段没有对应的值时允 许为空  
    (  
      virtual_column FILLER, --这是一个虚拟字段,用来跳 过由 PL/SQL Developer 生成的第一列序号  
      user_id number, --字段可以指定类型,否则认 为是 CHARACTER 类型, log 文件中有显示  
      user_name,  
      login_times,  
      last_login DATE "YYYY-MM-DD HH24:MI:SS" -- 指定接受日期的格式,相当用 to_date() 函数转换  
    )  
    

    主键列 CHANNELKEY,生成规则为:

     private string FixKeyValue(DateTime time)
            {
                TimeSpan toNow = time.Subtract((new DateTime(time.Year, time.Month - 1, 1)));
                //return (toNow.Ticks / 10000).ToString().Substring(3);
                return (toNow.Ticks / 10000).ToString();
            }
    

    相关文章

      网友评论

          本文标题:C#中通过SQLLoader导入csv文件到Oracle以及第一

          本文链接:https://www.haomeiwen.com/subject/wisvortx.html