java.io.UTFDataFormatException
19/09/04 09:47:24 ERROR Inbox: Ignoring error
java.io.UTFDataFormatException: malformed input around byte 873
at java.io.DataInputStream.readUTF(DataInputStream.java:656)
at java.io.DataInputStream.readUTF(DataInputStream.java:564)
at org.apache.spark.scheduler.TaskDescription$$anonfun$deserializeStringLongMap$1.apply$mcVI$sp(TaskDescription.scala:110)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
...
Caused by: java.io.InvalidClassException: org.apache.spark.rdd.RDD; local class incompatible: stream classdesc serialVersionUID = 4416556597546473068, local class serialVersionUID = -3328732449542231715
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
...
The exception above is most likely caused by a version mismatch between the Spark client and the Spark cluster: the task description serialized by one version fails to deserialize on the other.
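A minimal sketch for confirming the mismatch from the client side: print the Spark version compiled into the client and compare it with the version shown on the cluster's web UI (checked manually). Assumes a plain Java client using SparkSession.

import org.apache.spark.sql.SparkSession;

public class VersionCheck {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("version-check")
                .getOrCreate();
        // Version of the Spark libraries on the client classpath; this must
        // match the version the cluster runs, or serialized tasks will fail
        // to deserialize as above.
        System.out.println("client Spark version: " + spark.version());
        spark.stop();
    }
}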
java.lang.NoSuchMethodError
java.lang.NoSuchMethodError: com.esotericsoftware.kryo.serializers.FieldSerializer.setIgnoreSyntheticFields(Z)V
at com.twitter.chill.KryoBase.newDefaultSerializer(KryoBase.scala:55)
at com.esotericsoftware.kryo.Kryo.getDefaultSerializer(Kryo.java:377)
at com.esotericsoftware.kryo.Kryo.register(Kryo.java:406)
...
The kryo jars on the classpath are in conflict. Check whether the project path contains more than one kryo-*.jar and delete the extras (the sketch after the dependency line below shows one way to find out which jar wins).
Spark 2.4.5 ships kryo-shaded-4.0.2.jar; pulling in the following dependency as well triggers the exception above:
compile group: 'de.javakaffee', name: 'kryo-serializers', version: '0.45'
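A quick diagnostic sketch: print which jar the JVM actually resolved the Kryo classes from, to spot a second kryo jar on the classpath.

import com.esotericsoftware.kryo.serializers.FieldSerializer;

public class KryoJarCheck {
    public static void main(String[] args) {
        // The jar FieldSerializer was loaded from; if this is not the
        // kryo-shaded jar bundled with Spark, a conflicting kryo dependency
        // has slipped onto the classpath.
        System.out.println(FieldSerializer.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation());
    }
}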
java.lang.UnsupportedOperationException
java.lang.UnsupportedOperationException: Cannot have circular references in bean class, but got the circular reference of class class com.google.protobuf.Descriptors$Descriptor
at org.apache.spark.sql.catalyst.JavaTypeInference$.org$apache$spark$sql$catalyst$JavaTypeInference$$inferDataType(JavaTypeInference.scala:126)
at org.apache.spark.sql.catalyst.JavaTypeInference$$anonfun$1.apply(JavaTypeInference.scala:136)
...
Spark's bean encoder cannot handle circular references in a bean class, and the classes protobuf 3 generates are self-referential (via their descriptors), so switch Encoders.bean() to Encoders.kryo():
// original: fails with the circular-reference error
dataset.map(new MapFunction<>(), Encoders.bean(SomeProtobufObject.class))
// change to
dataset.map(new MapFunction<>(), Encoders.kryo(SomeProtobufObject.class))
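A slightly fuller sketch of the working variant, assuming a hypothetical Dataset<Row> input and a protobuf-generated SomeProtobufObject with the usual newBuilder()/build() pair (the setName field is invented for illustration). Note that Encoders.kryo() stores each object as a single binary column, so the resulting Dataset no longer exposes a per-field schema.

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;

// Hypothetical transformation: build one protobuf object per input row.
Dataset<SomeProtobufObject> protos = dataset.map(
        (MapFunction<Row, SomeProtobufObject>) row ->
                SomeProtobufObject.newBuilder()
                        .setName(row.getString(0)) // hypothetical field
                        .build(),
        // kryo serializes the whole object as opaque bytes, so Spark never
        // walks the (circular) protobuf descriptor graph.
        Encoders.kryo(SomeProtobufObject.class));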
java.lang.IncompatibleClassChangeError
java.lang.IncompatibleClassChangeError: class com.google.protobuf.Descriptors$OneofDescriptor has interface com.google.protobuf.Descriptors$GenericDescriptor as super class
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
Using protobuf-generated classes in Spark cluster mode can produce the exception above. The cause is that the project also pulls in a lower protobuf version, while GenericDescriptor was made private after protobuf 3.8.0; the fix is to use protobuf 3.8.0 or an earlier version.
Reference: https://stackoverflow.com/questions/58786130/issue-deserializing-events-in-protobuf-events-in-apache-flink
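Since the error surfaces on the executors in cluster mode, it can help to resolve the protobuf jar from inside a task rather than on the driver. A hedged sketch, assuming an existing SparkSession named spark:

import java.util.List;

// Resolve, inside a task running on an executor, which jar provides
// protobuf's Descriptors class on that executor's classpath.
List<String> jars = spark.range(1).javaRDD()
        .map(i -> com.google.protobuf.Descriptors.class
                .getProtectionDomain()
                .getCodeSource()
                .getLocation()
                .toString())
        .collect();
jars.forEach(System.out::println);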