Counting Male and Female Students with Spark Core

Author: 喵星人ZC | Published: 2019-05-04 22:27

Raw data
ID  Gender  Height

1   M   178
2   M   168
3   F   160
4   F   156
5   M   195
6   F   172
7   M   180
8   M   190
9   M   175
10  F   150
11  F   170
12  F   155
13  F   157
14  M   160
15  F   159
16  M   182
17  M   165

Extract the gender (split each line on the tab character and take the element at index 1)

scala> val lines = sc.textFile("file:///home/hadoop/soul/data/m_f_info.txt")
lines: org.apache.spark.rdd.RDD[String] = file:///home/hadoop/soul/data/m_f_info.txt MapPartitionsRDD[15] at textFile at <console>:24

scala> val splits = lines.map(x => x.split("\t")(1))
splits: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[16] at map at <console>:25
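
The split above assumes every line is strictly tab-separated and has at least two fields; on a blank line or a space-aligned line, `x.split("\t")(1)` would throw an ArrayIndexOutOfBoundsException. A minimal sketch of a more tolerant parser follows. The object `GenderField` and the helper `genderOf` are hypothetical names introduced here for illustration, and splitting on any run of whitespace is an assumption about the file, not something the original session does.

```scala
// Hypothetical helper, not part of the original spark-shell session.
object GenderField {
  // Returns the gender column of one record, or None when the line
  // does not contain the three expected fields (ID, gender, height).
  def genderOf(line: String): Option[String] = {
    // Assumption: fields are separated by tabs or by runs of spaces.
    val fields = line.trim.split("\\s+")
    if (fields.length >= 3) Some(fields(1)) else None
  }

  def main(args: Array[String]): Unit = {
    // A tab-separated line, a space-aligned line, a blank line, a malformed line.
    val sample = Seq("1\tM\t178", "3   F   160", "", "broken")
    sample.foreach(l => println(s"[$l] -> ${genderOf(l)}"))
  }
}
```

In the shell, the same helper could be applied as `lines.flatMap(line => genderOf(line).toList)`, which silently drops lines that do not parse instead of failing the job.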

Build one RDD containing the values equal to M and another containing the values equal to F

scala> val mRDD = splits.filter(x => (x == "M"))
mRDD: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[17] at filter at <console>:25


scala> val fRDD = splits.filter(_ == "F")
fRDD: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[18] at filter at <console>:25

Use the count operator to get the totals
1. Number of male students

scala> mRDD.count
res15: Long = 9

2. Number of female students

scala> fRDD.count
res16: Long = 8
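
The two filters followed by two counts scan the data twice. As an alternative sketch (not the method used in the session above), `countByValue` on the RDD of genders returns both totals from a single action. The object name `GenderCount`, the `local[*]` master and the in-memory sample are assumptions made so the example runs on its own; the sample is the gender column of the 17 records listed under the raw data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical standalone program; in spark-shell, `sc` already exists.
object GenderCount {
  // countByValue is an action: it returns a local Map to the driver,
  // so it is only appropriate when the number of distinct values is small.
  def genderCounts(genders: RDD[String]): scala.collection.Map[String, Long] =
    genders.countByValue()

  def main(args: Array[String]): Unit = {
    // Assumption: run locally with all available cores.
    val conf = new SparkConf().setMaster("local[*]").setAppName("GenderCount")
    val sc = new SparkContext(conf)
    try {
      // Gender column of the sample records, in ID order (1 to 17).
      val genders = sc.parallelize(Seq(
        "M", "M", "F", "F", "M", "F", "M", "M", "M",
        "F", "F", "F", "F", "M", "F", "M", "M"))
      val counts = genderCounts(genders)
      println(s"M = ${counts.getOrElse("M", 0L)}, F = ${counts.getOrElse("F", 0L)}")
    } finally {
      sc.stop()
    }
  }
}
```

In the shell session above, the equivalent call would be `splits.countByValue()`, which should agree with the two counts already shown (9 and 8).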
