Kylo Official Tutorial: EMR Install Guide
Upload required Jars to the S3 EMR bucket you created above
http://central.maven.org/maven2/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar
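The upload step can be sketched as a small shell helper. This is only an illustration, assuming wget and the AWS CLI are installed on the node: the function name upload_jar and the DRY_RUN switch are made up for this example, and the bucket name is a placeholder. Note that the AWS CLI addresses the bucket as s3://, while the Hadoop properties further down use the s3a:// scheme.

```shell
# Sketch: download one jar and copy it to the S3 bucket used by EMR.
# upload_jar and DRY_RUN are hypothetical names used only in this example.
JAR_URL="http://central.maven.org/maven2/commons-httpclient/commons-httpclient/3.1/commons-httpclient-3.1.jar"

upload_jar() {
  local url="$1" bucket="$2"
  local jar
  jar="$(basename "$url")"   # file name part of the URL
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run: print the commands instead of executing them
    echo "wget -q $url -O /tmp/$jar"
    echo "aws s3 cp /tmp/$jar s3://$bucket/$jar"
  else
    wget -q "$url" -O "/tmp/$jar" && aws s3 cp "/tmp/$jar" "s3://$bucket/$jar"
  fi
}

# Example (not run here): upload_jar "$JAR_URL" "<S3_BUCKET>"
```

Running it once with DRY_RUN=1 shows the exact commands before anything is downloaded or uploaded.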
echo "spark.hadoop.yarn.timeline-service.enabled false" >> /etc/spark/conf/spark-defaults.conf
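Because >> appends every time it runs, repeating the command above leaves duplicate lines in spark-defaults.conf. A minimal sketch of an append that is safe to repeat, assuming a standard grep; the helper name append_once is hypothetical:

```shell
# Sketch: append a setting to a file only if that exact line is not already present.
append_once() {
  local line="$1" file="$2"
  # -x: match the whole line, -F: fixed string (no regex), -q: no output
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Example (not run here; needs write access to the file):
# append_once "spark.hadoop.yarn.timeline-service.enabled false" /etc/spark/conf/spark-defaults.conf
```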
Update application.properties to prepare for the template
Add the following properties to the kylo-services application.properties file
vim /opt/kylo/kylo-services/conf/application.properties
config.s3ingest.s3.protocol=s3a
config.s3ingest.hiveBucket=<S3_BUCKET>
config.s3ingest.es.jar_url=s3a://<S3_BUCKET>/elasticsearch-hadoop-5.5.0.jar
config.s3ingest.apache-commons.jar_url=s3a://<S3_BUCKET>/commons-httpclient-3.1.jar
config.s3ingest.es.nodes=<KYLO_NODE_IP_ADDRESS>
nifi.executesparkjob.sparkhome=/usr/lib/spark
nifi.executesparkjob.sparkmaster=yarn-cluster
config.spark.validateAndSplitRecords.extraJars=/usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core-2.1.1-amzn-0.jar,/usr/lib/spark/jars/datanucleus-api-jdo-3.2.6.jar,/usr/lib/spark/jars/datanucleus-core-3.2.10.jar,/usr/lib/spark/jars/datanucleus-rdbms-3.2.9.jar
nifi.executesparkjob.extra_jars=/usr/lib/spark/jars/datanucleus-api-jdo-3.2.6.jar,/usr/lib/spark/jars/datanucleus-core-3.2.10.jar,/usr/lib/spark/jars/datanucleus-rdbms-3.2.9.jar
nifi.executesparkjob.extra_files=$nifi{table_field_policy_json_file},/etc/spark/conf/hive-site.xml
config.spark.version=2
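The <S3_BUCKET> and <KYLO_NODE_IP_ADDRESS> placeholders in the properties above have to be replaced with real values before Kylo is restarted. One way to do that without editing by hand is sketched below; the function name fill_placeholders is hypothetical, and the values passed in are assumed not to contain the characters |, & or \, which are special to sed.

```shell
# Sketch: replace the <S3_BUCKET> and <KYLO_NODE_IP_ADDRESS> placeholders in a properties file.
fill_placeholders() {
  local file="$1" bucket="$2" node_ip="$3"
  local tmp
  tmp="$(mktemp)" || return 1
  # Write to a temp file first, then copy back so the original keeps its owner and mode
  sed -e "s|<S3_BUCKET>|$bucket|g" \
      -e "s|<KYLO_NODE_IP_ADDRESS>|$node_ip|g" "$file" > "$tmp" && cat "$tmp" > "$file"
  local rc=$?
  rm -f "$tmp"
  return $rc
}

# Example (not run here; bucket name and IP are made-up values):
# fill_placeholders /opt/kylo/kylo-services/conf/application.properties my-emr-bucket 10.0.0.5
```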
NiFi
P:Archive Originals
Additional Classpath Resources: /usr/lib/hadoop-lzo/lib/hadoop-lzo.jar
P:Upload to HDFS
Additional Classpath Resources: /usr/lib/hadoop-lzo/lib/hadoop-lzo.jar
P:Validate And Split Records
SparkMaster: yarn
Spark YARN Deploy Mode: cluster
P:Profile Data
SparkMaster: yarn
Spark YARN Deploy Mode: cluster
Extra Jars: /usr/lib/spark/jars/datanucleus-api-jdo-3.2.6.jar,/usr/lib/spark/jars/datanucleus-core-3.2.10.jar,/usr/lib/spark/jars/datanucleus-rdbms-3.2.9.jar,/usr/lib/hadoop-lzo/lib/hadoop-lzo.jar
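The jar versions in these lists depend on the EMR release, so it may be worth checking that each path exists on the node before saving the processor settings. A minimal sketch, with the hypothetical helper name missing_jars, that prints every entry of a comma-separated list that is not a regular file:

```shell
# Sketch: print each entry of a comma-separated jar list that does not exist on this node.
missing_jars() {
  local list="$1" jar
  local old_ifs="$IFS"
  IFS=','                       # split the list on commas
  for jar in $list; do
    [ -f "$jar" ] || echo "$jar"
  done
  IFS="$old_ifs"                # restore the previous field separator
}

# Example (not run here):
# missing_jars "/usr/lib/spark/jars/datanucleus-core-3.2.10.jar,/usr/lib/hadoop-lzo/lib/hadoop-lzo.jar"
```

Empty output means every listed jar was found.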