hive mapjoin MapJoinMemoryExhaus

Author: 旺财旺财 | Published: 2019-02-28 11:40

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"aid":252511110,"property":"{\"aid\":252511110,\"alvl\":0,\"avn\":0,\"avdn\":0,\"avpn\":0,\"avcn\":0,\"avsn\":0,\"avti\":0,\"avtp\":0}","dt":"20190226"}
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
        at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.processRow(SparkMapRecordHandler.java:141)
Caused by: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionException: 2019-02-28 04:08:29 Processing rows:        200000  Hashtable size: 199999  Memory usage:  7528720336      percentage:    0.60
        at org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler.checkMemoryStatus(MapJoinMemoryExhaustionHandler.java:99)
        at org.apache.hadoop.hive.ql.exec.HashTableSinkOperator.processOp(HashTableSinkOperator.java:249)
        at org.apache.hadoop.hive.ql.exec.SparkHashTableSinkOperator.processOp(SparkHashTableSinkOperator.java:79)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
        at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
        at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:97)
        at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
        at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
        ... 17 more

19/02/28 04:08:36 INFO executor.CoarseGrainedExecutorBackend: Driver commanded a shutdown

Cause:

MapJoinMemoryExhaustionHandler.java contains the following code:

public void checkMemoryStatus(long tableContainerSize, long numRows)
    throws MapJoinMemoryExhaustionException {
  // Heap currently in use, as reported by the JVM's MemoryMXBean.
  long usedMemory = memoryMXBean.getHeapMemoryUsage().getUsed();
  // Fraction of the maximum heap that is in use.
  double percentage = (double) usedMemory / (double) maxHeapSize;
  String msg = Utilities.now() + "\tProcessing rows:\t" + numRows + "\tHashtable size:\t"
      + tableContainerSize + "\tMemory usage:\t" + usedMemory + "\tpercentage:\t"
      + percentageNumberFormat.format(percentage);
  console.printInfo(msg);
  // Abort the hash table build once usage exceeds the configured limit.
  if (percentage > maxMemoryUsage) {
    throw new MapJoinMemoryExhaustionException(msg);
  }
}
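To make the check concrete, here is a minimal, self-contained sketch of the same comparison outside Hive. The class name MemoryCheckSketch, the helper exceeds, and the 0.55 limit used in main are illustrative assumptions, not Hive's code; the sketch also assumes the maximum heap is read from getHeapMemoryUsage().getMax(). Since the exception is thrown only when percentage > maxMemoryUsage, the limit configured for the failing job must have been lower than the ratio reported in the trace (about 0.60).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class MemoryCheckSketch {

    // Same comparison as checkMemoryStatus: used / max against a limit.
    static boolean exceeds(long usedBytes, long maxBytes, double limit) {
        double ratio = (double) usedBytes / (double) maxBytes;
        return ratio > limit;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long used = heap.getUsed();
        long max = heap.getMax(); // -1 when the JVM reports no defined maximum
        if (max <= 0) {
            System.out.println("Max heap is undefined; skipping the check");
            return;
        }
        double limit = 0.55; // hypothetical limit, chosen only for illustration
        System.out.printf("used=%d max=%d ratio=%.2f exceeds=%b%n",
                used, max, (double) used / max, exceeds(used, max, limit));
    }
}
```

Note that the check looks at the whole JVM heap, not only at the hash table, so other objects held in the same executor also count toward the ratio.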

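Final solution:

Since Hive 0.11, hive.auto.convert.join defaults to true, which means Hive automatically converts a join into a map join whenever the conditions are met. For how the map join parameters are evaluated, see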

https://yq.aliyun.com/articles/64306
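As a rough illustration of "converts when the conditions are met", the sketch below models only the simplest condition: the small table's size compared with a size threshold. AutoMapJoinSketch and wouldConvert are hypothetical names, and the 25,000,000-byte value is assumed to match the default of hive.mapjoin.smalltable.filesize. Hive's real planner checks more than this, so treat it as a mental model rather than the actual decision logic.

```java
public class AutoMapJoinSketch {

    // Assumed default of hive.mapjoin.smalltable.filesize, in bytes.
    static final long ASSUMED_SMALL_TABLE_LIMIT = 25_000_000L;

    // Simplified rule: convert only if auto conversion is on and the
    // small table is below the size threshold.
    static boolean wouldConvert(boolean autoConvertJoin, long smallTableBytes, long limitBytes) {
        return autoConvertJoin && smallTableBytes < limitBytes;
    }

    public static void main(String[] args) {
        // Small table under the threshold, auto conversion on.
        System.out.println(wouldConvert(true, 10_000_000L, ASSUMED_SMALL_TABLE_LIMIT));
        // Table far above the threshold.
        System.out.println(wouldConvert(true, 1_000_000_000L, ASSUMED_SMALL_TABLE_LIMIT));
        // Auto conversion switched off (hive.auto.convert.join=false).
        System.out.println(wouldConvert(false, 10_000_000L, ASSUMED_SMALL_TABLE_LIMIT));
    }
}
```

In this model, setting hive.auto.convert.join to false turns the conversion off entirely, so no in-memory hash table is built and the join runs as a regular shuffle join.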

References:

HIVE-15221: https://issues.apache.org/jira/browse/HIVE-15221

https://yq.aliyun.com/articles/64306

https://stackoverflow.com/questions/22977790/hive-query-execution-error-return-code-3-from-mapredlocaltask

https://yq.aliyun.com/articles/476771
