1. Flink on K8s gets a permission error when accessing an external HDFS
org.apache.hadoop.security.AccessControlException: Permission denied: user=flink,
access=WRITE, inode="/data/flink/checkpoints":hdfs:hdfs:drwxr-xr-x
Solution:
hadoop fs -chmod -R 777 /user/flink
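A less permissive alternative sketch: instead of opening the directory to everyone, create the checkpoint directory and hand ownership to the flink user. The path and owner below are taken from the error message above and are assumptions; adjust them to whatever your own error reports.
# run as the HDFS superuser; path and owner copied from the error above (adjust as needed)
hadoop fs -mkdir -p /data/flink/checkpoints
hadoop fs -chown -R flink:flink /data/flink/checkpoints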
2. Kafka consumer fails with NoClassDefFoundError: org.xerial.snappy.Snappy
java.lang.NoClassDefFoundError: Could not initialize class org.xerial.snappy.Snappy
at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:435)
at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:167)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readLong(DataInputStream.java:416)
at org.apache.kafka.common.record.MemoryRecords$RecordsIterator.getNextEntryFromStream(MemoryRecords.java:333)
at org.apache.kafka.common.record.MemoryRecords$RecordsIterator.<init>(MemoryRecords.java:255)
at org.apache.kafka.common.record.MemoryRecords$RecordsIterator.makeNext(MemoryRecords.java:307)
at org.apache.kafka.common.record.MemoryRecords$RecordsIterator.makeNext(MemoryRecords.java:221)
at org.apache.kafka.common.utils.AbstractIterator.maybeComputeNext(AbstractIterator.java:79)
at org.apache.kafka.common.utils.AbstractIterator.hasNext(AbstractIterator.java:45)
at org.apache.kafka.clients.consumer.internals.Fetcher.parseFetchedData(Fetcher.java:545)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:354)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1000)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread.getRecordsFromKafka(KafkaConsumerThread.java:539)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaConsumerThread.run(KafkaConsumerThread.java:267)
Solution:
Add the following two lines to the Dockerfile. The root cause is the OpenJDK/Alpine base image: it lacks the glibc compatibility layer that snappy-java's native library needs.
RUN ln -s /lib /lib64 # added
RUN apk add --no-cache bash tini libc6-compat linux-pam krb5 krb5-libs # added
Reference: https://www.cnblogs.com/hellxz/p/11936994.html
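For context, a minimal Dockerfile sketch with the two lines in place. The base image tag is a placeholder assumption; keep whichever Alpine-based Flink image you already build from.
# placeholder base image tag; use your own Alpine-based Flink image
FROM flink:1.10.0-scala_2.12-alpine
# /lib64 symlink plus the glibc compatibility layer so snappy-java's native library loads on musl
RUN ln -s /lib /lib64
RUN apk add --no-cache bash tini libc6-compat linux-pam krb5 krb5-libs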
3. Checkpoint write fails because the _metadata file already exists on HDFS
/user/flink/checkpoint/PushflinkJobtest/00000000000000000000000000000000/chk-14/_metadata for client 172.16.5.174 already exists at
org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.startFile(FSDirWriteFileOp.java:381) at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2438) at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2347) at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:774) at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:462) at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523) at
org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991) at
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:872) at
org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:818) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:422) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at
org.apache.hadoop.ipc.Server$Handler.run(Server.java:2678)
Solution:
The root cause is that standalone per-job mode does not set a JobId, so every newly started job gets the same default JobId (the all-zero id visible in the path above) and writes its checkpoints to the same location, where files from the previous run already exist.
The fix is to set a unique job id at startup, as sketched below.
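A minimal sketch, assuming the cluster is started through the standalone-job entrypoint and that your Flink version's standalone-job.sh accepts a --job-id argument; the job class name below is a placeholder.
# generate a fresh 32-hex-character job id for every deployment (placeholder class name)
./bin/standalone-job.sh start-foreground \
    --job-classname com.example.PushFlinkJob \
    --job-id $(openssl rand -hex 16)
In a Kubernetes deployment the same arguments are typically passed as container args to the job manager's entrypoint.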
4. Flink job logs have a timezone problem: timestamps are off by 8 hours
Solution:
The machine's local timezone needs to be set inside the image.
Add the following lines to the Dockerfile and rebuild the image.
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo 'Asia/Shanghai' >/etc/timezone
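Note: on a plain Alpine base image /usr/share/zoneinfo only exists after installing the tzdata package, so the symlink above may also need this extra line (an assumption about your base image):
RUN apk add --no-cache tzdata   # provides /usr/share/zoneinfo on Alpine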
5. How to avoid checkpoint files being written as root?
FROM alpine
RUN adduser -D myuser
USER myuser
Add this to the Docker image; files are then written as the myuser user, so root privileges are not needed.
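If the image is built from the Flink Dockerfile used in the earlier issues (an assumption), the same two lines can simply be appended there; the HDFS checkpoint directory must then be writable by this user, e.g. via the chown shown in issue 1.
# appended to the Alpine-based Flink Dockerfile from the earlier issues (sketch)
RUN adduser -D myuser   # create an unprivileged user
USER myuser             # subsequent processes, including checkpoint writes, run as myuser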