Calling a method defined inside a class from Spark
https://blog.csdn.net/wangxiao7474/article/details/81742417
Calling a method of a class from PySpark fails with this error:
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transformation. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
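Cause:
Spark does not allow SparkContext to be accessed inside an action or transformation. If the function you pass to an action or transformation references self, Spark serializes the whole object and ships it to the worker nodes. That object still holds the SparkContext, so even if you never access it explicitly, it is referenced from inside the closure, and the error is raised.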
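Solution:
Define the method being called as a static method with @staticmethod. A static method takes no self, so the function shipped to the workers no longer drags the object (and its SparkContext) along with it.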
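
A minimal sketch of the fix, assuming a class that stores the SparkContext as an attribute. The class name WordLengths, its methods, and the demo() helper are hypothetical names made up for illustration; they are not from the original post. The function passed to map is a static method, so it does not reference self.

```python
class WordLengths:
    """Hypothetical driver-side helper that keeps a SparkContext."""

    def __init__(self, sc):
        # The SparkContext must stay on the driver.
        self.sc = sc

    @staticmethod
    def length(word):
        # No self here, so nothing from the instance is captured
        # when Spark serializes this function for the workers.
        return len(word)

    def run(self, words):
        # Pass the static method through the class. Passing a bound
        # instance method or a lambda that uses self would pull the
        # whole object, including self.sc, into the closure.
        return self.sc.parallelize(words).map(WordLengths.length).collect()


def demo():
    # Assumes pyspark is installed; runs in local mode on one core.
    from pyspark import SparkContext

    sc = SparkContext("local[1]", "staticmethod-demo")
    try:
        return WordLengths(sc).run(["spark", "rdd"])
    finally:
        sc.stop()
```

Calling demo() on a machine with PySpark installed runs the job in local mode. If length were an ordinary instance method and run passed self.length to map, Spark would try to serialize self and would likely hit the error shown above.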