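Simplified source build
Many people have complained that building StreamingPro is too much trouble, to the point that even reading the source is hard. So I spent a day on a fairly large refactoring, and StreamingPro now depends only on the ServiceFramework project. The build steps are: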
git clone https://github.com/allwefantasy/ServiceFramework.git
cd ServiceFramework
mvn install -Pscala-2.11 -Pjetty-9 -Pweb-include-jetty-9
mvn install -Pscala-2.10 -Pjetty-9 -Pweb-include-jetty-9
// If you need to switch the Scala version, remember to run the following command before building
./dev/change-version-to-2.10.sh
Then you can build StreamingPro:
git clone https://github.com/allwefantasy/streamingpro.git
// for spark 1.6.*
mvn -DskipTests clean package -pl streamingpro-spark -am -Ponline -Pscala-2.10 -Pcarbondata -Phive-thrift-server -Pspark-1.6.1 -Pshade
// for spark 2.*
mvn -DskipTests clean package -pl streamingpro-spark-2.0 -am -Ponline -Pscala-2.11 -Phive-thrift-server -Pspark-2.1.0 -Pshade
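StreamingPro built on Spark 2.1.1 supports both Spark Streaming and Structured Streaming.
For Structured Streaming support, see the article:
"StreamingPro supports Structured Streaming again"
The Spark Streaming configuration has exactly the same shape as the Structured Streaming one.
Here is a concrete configuration file: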
{
"scalamaptojson": {
"desc": "test",
"strategy": "spark",
"algorithm": [],
"ref": [
],
"compositor": [
{
"name": "stream.sources",
"params": [
{
"format": "socket",
"outputTable": "test",
"port": "9999",
"host": "localhost",
"path": "-"
},
{
"format": "com.databricks.spark.csv",
"outputTable": "sample",
"header": "true",
"path": "/Users/allwefantasy/streamingpro/sample.csv"
}
]
},
{
"name": "stream.sql",
"params": [
{
"sql": "select city from test left join sample on test.content == sample.name",
"outputTableName": "test3"
}
]
},
{
"name": "stream.outputs",
"params": [
{
"mode": "Overwrite",
"format": "console",
"inputTableName": "test3",
"path": "-"
}
]
}
],
"configParams": {
}
}
}
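The three compositors in this job are connected only by table names: the sources register test and sample, the SQL step writes test3, and the output step reads test3. As a rough sketch of that wiring (not part of StreamingPro; the function name and the trimmed EXAMPLE string are made up for illustration), the following Python walks a job file in order and reports any inputTableName that no earlier step declared. It looks only at the outputTable, outputTableName and inputTableName keys seen above and does not parse table names inside the sql text.

```python
import json

# Keys under which a params entry declares the table it produces, and the key
# under which it names the table it consumes (taken from the config above).
PRODUCES = ("outputTable", "outputTableName")
CONSUMES = "inputTableName"

# Trimmed copy of the job above; only the table-related keys are kept.
EXAMPLE = """
{"scalamaptojson": {"compositor": [
  {"name": "stream.sources", "params": [
    {"format": "socket", "outputTable": "test"},
    {"format": "com.databricks.spark.csv", "outputTable": "sample"}]},
  {"name": "stream.sql", "params": [
    {"sql": "select city from test left join sample on test.content == sample.name",
     "outputTableName": "test3"}]},
  {"name": "stream.outputs", "params": [
    {"format": "console", "inputTableName": "test3"}]}]}}
"""


def check_table_wiring(config_text):
    """Return (job, compositor, table) for every input no earlier step defines."""
    problems = []
    for job_name, job in json.loads(config_text).items():
        defined = set()
        for compositor in job.get("compositor", []):
            for params in compositor.get("params", []):
                wanted = params.get(CONSUMES)
                if wanted is not None and wanted not in defined:
                    problems.append((job_name, compositor["name"], wanted))
                # Register tables after the check so a step cannot satisfy itself.
                for key in PRODUCES:
                    if key in params:
                        defined.add(params[key])
    return problems


if __name__ == "__main__":
    # An empty list means every inputTableName was declared by an earlier step.
    print(check_table_wiring(EXAMPLE))
```

A check like this can catch a mistyped table name before the job is submitted, which is cheaper than finding out from a failed Spark run.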
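In the JSON configuration, the only change from the Structured Streaming version is that the ss prefix is replaced by stream (for example stream.sources). Launch it as follows: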
SHome=/Users/allwefantasy/streamingpro
./bin/spark-submit --class streaming.core.StreamingApp \
--master local[2] \
--name test \
$SHome/streamingpro-spark-2.0-0.4.15-SNAPSHOT.jar \
-streaming.name test \
-streaming.platform spark_streaming \
-streaming.job.file.path file://$SHome/spark-streaming.json
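The job reads from a socket source on localhost:9999, so something has to be listening there before any rows show up on the console. The sketch below is one way to feed it test lines from Python. It assumes the socket source connects to that host and port as a TCP client and reads newline-terminated text, as Spark's built-in socket sources do; the helper names and the sample lines are hypothetical.

```python
import socket


def open_listener(host="localhost", port=9999):
    """Bind a TCP listening socket; port 0 lets the OS pick a free port."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(1)
    return server


def feed_one_client(server, lines):
    """Wait for one client, send each line newline-terminated, then hang up."""
    conn, _addr = server.accept()
    with conn:
        for line in lines:
            conn.sendall((line + "\n").encode("utf-8"))


# Usage (blocks until the streaming job connects). The values are made up and
# only produce join matches if sample.csv has the same entries in its name column:
#
#   with open_listener() as server:
#       feed_one_client(server, ["alice", "bob"])
```

Start the listener first and then submit the job. Each line sent is what the SQL step sees as test.content.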
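Reader comments
mvn install -Pscala-2.10 -Pjetty-9 -Pweb-include-jetty-9. In other words, ServiceFramework needs to be installed for both scala-2.10 and scala-2.11; after that there should be no problems. Also, the link you gave points to a very early version. The Spark Streaming configuration has since been greatly simplified; see this article for details.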
Error:(14, 28) java: package net.csdn.http.server does not exist
Error:(26, 1) java: package org.eclipse.jetty.server does not exist
Error:(27, 40) java: package org.eclipse.jetty.server.handler does not exist
Error:(45, 19) java: cannot find symbol
symbol: class Server
location: class net.csdn.modules.http.HttpServer
......
For the SF (ServiceFramework) dependency, I compiled it as follows:
mvn install -Pscala-2.11 -Pjetty-9 -Pweb-include-jetty-9
1) Error injecting constructor, java.lang.NoSuchMethodError: org.eclipse.jetty.server.Server.<init>(Lorg/eclipse/jetty/util/thread/ThreadPool;)V
at net.csdn.modules.http.HttpServer.<init>(HttpServer.java:74)
at net.csdn.modules.http.HttpModule.configure(HttpModule.java:15)
while locating net.csdn.modules.http.HttpServer
1 error
at com.google.inject.internal.Errors.throwCreationExceptionIfErrorsExist(Errors.java:435)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:183)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:109)
at com.google.inject.Guice.createInjector(Guice.java:95)
at net.csdn.bootstrap.loader.impl.ControllerLoader.load(ControllerLoader.java:46)
at net.csdn.bootstrap.Bootstrap.configureSystem(Bootstrap.java:103)
at net.csdn.bootstrap.Bootstrap.main(Bootstrap.java:41)
Is this a dependency conflict?