I. Importing Data into ES
1.1 Importing from a CSV data source
Start the Hadoop daemons:
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ ./start-dfs.sh
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ ./start-yarn.sh
// The history server mainly provides lookup of MapReduce job history
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ ./mr-jobhistory-daemon.sh start historyserver
Processes reported by jps:
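After the three start scripts above, jps would normally list NameNode, DataNode and SecondaryNameNode (from start-dfs.sh), ResourceManager and NodeManager (from start-yarn.sh), and JobHistoryServer. The following is a minimal shell sketch for checking that, assuming a single-node setup; the function name missing_daemons is made up for illustration and is not part of Hadoop.

```shell
# Print every expected Hadoop daemon that is absent from jps output read on stdin.
# Assumption: single-node setup, so all six daemons run on this machine.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer; do
        # jps prints "<pid> <MainClass>" per line, so match the whole line
        if ! printf '%s\n' "$jps_output" | grep -Eq "^[0-9]+ ${daemon}\$"; then
            echo "$daemon"
        fi
    done
}
```

Running `jps | missing_daemons` prints nothing when all six daemons are up, and otherwise prints one missing daemon per line.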

Put the es-hadoop jar and the CSV file on HDFS:
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ hdfs dfs -mkdir /lib
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ hdfs dfs -put /home/yay/software/elasticsearch-hadoop-2.1.1/dist/elasticsearch-hadoop-2.1.1.jar /lib/elasticsearch-hadoop-2.1.1.jar
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ hdfs dfs -mkdir /ch07
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ hdfs dfs -put /home/yay/下载/Elasticsearch-for-Hadoop_code/eshadoop-master/Chapter7/data/crimes_dataset.csv /ch07/crime_dataset.csv
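Before moving on to Pig, it can help to confirm that both uploads landed where the later commands expect them. The sketch below relies on `hdfs dfs -test -e`, which returns 0 when a path exists; the helper name check_hdfs_paths is hypothetical.

```shell
# Report which of the given HDFS paths exist; returns non-zero if any is missing.
check_hdfs_paths() {
    rc=0
    for path in "$@"; do
        # hdfs dfs -test -e exits with status 0 when the path exists
        if hdfs dfs -test -e "$path"; then
            echo "ok $path"
        else
            echo "missing $path"
            rc=1
        fi
    done
    return "$rc"
}
```

For the two files uploaded above, the call would be `check_hdfs_paths /lib/elasticsearch-hadoop-2.1.1.jar /ch07/crime_dataset.csv`.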
Commands run inside Pig:
yay@yay-ThinkPad-T470-W10DG:~/software/pig-0.17.0/bin$ ./pig
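// 1. First register the ES-Hadoop jar in Pig. The log captured further down reports "Elasticsearch Hadoop v7.8.0", so that run used the 7.8.0 jar rather than 2.1.1.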
//grunt> REGISTER hdfs://localhost:9000/lib/elasticsearch-hadoop-7.8.0.jar;
grunt> REGISTER hdfs://localhost:9000/lib/elasticsearch-hadoop-2.1.1.jar;
// 2. Load the CSV data file
grunt> SOURCE = load '/ch07/crime_dataset.csv' using PigStorage(',') as (id:chararray, caseNumber:chararray,date:datetime,block:chararray,iucr:chararray,primaryType:chararray,description:chararray,location:chararray,arrest:boolean,domestic:boolean, lat:double,lon:double);
// 3. Generate a data structure that matches the ES index
grunt> TARGET = foreach SOURCE generate id, caseNumber,date,block,iucr,primaryType,description,location,arrest,domestic, TOTUPLE(lon, lat) AS geoLocation;
// 4. Store the data into the ES index (this actually runs a MapReduce job)
grunt> STORE TARGET INTO 'es_pig/crimes' USING org.elasticsearch.hadoop.pig.EsStorage('es.http.timeout = 5m','es.index.auto.create = true', 'es.mapping.names=arrest:isArrest, domestic:isDomestic','es.mapping.id=id');
Output printed by the last command:
2020-07-24 10:04:57,323 [main] WARN org.elasticsearch.hadoop.mr.EsOutputFormat - Speculative execution enabled for reducer - consider disabling it to prevent data corruption
2020-07-24 10:04:57,339 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2020-07-24 10:04:57,375 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2020-07-24 10:04:57,408 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, ConstantCalculator, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NestedLimitOptimizer, PartitionFilterOptimizer, PredicatePushdownOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
2020-07-24 10:04:57,474 [main] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2020-07-24 10:04:57,570 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2020-07-24 10:04:57,605 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2020-07-24 10:04:57,605 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2020-07-24 10:04:57,644 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - session.id is deprecated. Instead, use dfs.metrics.session-id
2020-07-24 10:04:57,645 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics with processName=JobTracker, sessionId=
2020-07-24 10:04:57,709 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2020-07-24 10:04:57,716 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2020-07-24 10:04:57,716 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2020-07-24 10:04:57,719 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
2020-07-24 10:04:57,725 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - This job cannot be converted run in-process
2020-07-24 10:04:57,738 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.submit.replication is deprecated. Instead, use mapreduce.client.submit.file.replication
2020-07-24 10:04:57,971 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/tmp/pig1493399292987591061tmp/elasticsearch-hadoop-7.8.0.jar to DistributedCache through /tmp/temp-1797060858/tmp-1378164927/elasticsearch-hadoop-7.8.0.jar
2020-07-24 10:04:58,081 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/yay/software/pig-0.17.0/pig-0.17.0-core-h2.jar to DistributedCache through /tmp/temp-1797060858/tmp1140381131/pig-0.17.0-core-h2.jar
2020-07-24 10:04:58,170 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/yay/software/pig-0.17.0/lib/automaton-1.11-8.jar to DistributedCache through /tmp/temp-1797060858/tmp-1965548869/automaton-1.11-8.jar
2020-07-24 10:04:58,249 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/yay/software/pig-0.17.0/lib/antlr-runtime-3.4.jar to DistributedCache through /tmp/temp-1797060858/tmp2116659353/antlr-runtime-3.4.jar
2020-07-24 10:04:58,339 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Added jar file:/home/yay/software/pig-0.17.0/lib/joda-time-2.9.3.jar to DistributedCache through /tmp/temp-1797060858/tmp1679799996/joda-time-2.9.3.jar
2020-07-24 10:04:58,408 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2020-07-24 10:04:58,427 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2020-07-24 10:04:58,427 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2020-07-24 10:04:58,427 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2020-07-24 10:04:58,470 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2020-07-24 10:04:58,471 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.tracker.http.address is deprecated. Instead, use mapreduce.jobtracker.http.address
2020-07-24 10:04:58,476 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2020-07-24 10:04:58,491 [JobControl] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
2020-07-24 10:04:58,493 [JobControl] WARN org.elasticsearch.hadoop.mr.EsOutputFormat - Speculative execution enabled for reducer - consider disabling it to prevent data corruption
2020-07-24 10:04:58,670 [JobControl] WARN org.apache.hadoop.mapreduce.JobSubmitter - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2020-07-24 10:04:58,707 [JobControl] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2020-07-24 10:04:58,714 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2020-07-24 10:04:58,714 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2020-07-24 10:04:58,739 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2020-07-24 10:04:58,799 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2020-07-24 10:04:58,897 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local1469778342_0001
2020-07-24 10:04:59,236 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /home/yay/hadoop_tmp/mapred/local/1595556299054/elasticsearch-hadoop-7.8.0.jar <- /home/yay/software/pig-0.17.0/bin/elasticsearch-hadoop-7.8.0.jar
2020-07-24 10:04:59,277 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp-1797060858/tmp-1378164927/elasticsearch-hadoop-7.8.0.jar as file:/home/yay/hadoop_tmp/mapred/local/1595556299054/elasticsearch-hadoop-7.8.0.jar
2020-07-24 10:04:59,300 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /home/yay/hadoop_tmp/mapred/local/1595556299055/pig-0.17.0-core-h2.jar <- /home/yay/software/pig-0.17.0/bin/pig-0.17.0-core-h2.jar
2020-07-24 10:04:59,309 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp-1797060858/tmp1140381131/pig-0.17.0-core-h2.jar as file:/home/yay/hadoop_tmp/mapred/local/1595556299055/pig-0.17.0-core-h2.jar
2020-07-24 10:04:59,309 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /home/yay/hadoop_tmp/mapred/local/1595556299056/automaton-1.11-8.jar <- /home/yay/software/pig-0.17.0/bin/automaton-1.11-8.jar
2020-07-24 10:04:59,315 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp-1797060858/tmp-1965548869/automaton-1.11-8.jar as file:/home/yay/hadoop_tmp/mapred/local/1595556299056/automaton-1.11-8.jar
2020-07-24 10:04:59,316 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /home/yay/hadoop_tmp/mapred/local/1595556299057/antlr-runtime-3.4.jar <- /home/yay/software/pig-0.17.0/bin/antlr-runtime-3.4.jar
2020-07-24 10:04:59,323 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp-1797060858/tmp2116659353/antlr-runtime-3.4.jar as file:/home/yay/hadoop_tmp/mapred/local/1595556299057/antlr-runtime-3.4.jar
2020-07-24 10:04:59,323 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Creating symlink: /home/yay/hadoop_tmp/mapred/local/1595556299058/joda-time-2.9.3.jar <- /home/yay/software/pig-0.17.0/bin/joda-time-2.9.3.jar
2020-07-24 10:04:59,328 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - Localized hdfs://localhost:9000/tmp/temp-1797060858/tmp1679799996/joda-time-2.9.3.jar as file:/home/yay/hadoop_tmp/mapred/local/1595556299058/joda-time-2.9.3.jar
2020-07-24 10:04:59,374 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/home/yay/hadoop_tmp/mapred/local/1595556299054/elasticsearch-hadoop-7.8.0.jar
2020-07-24 10:04:59,374 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/home/yay/hadoop_tmp/mapred/local/1595556299055/pig-0.17.0-core-h2.jar
2020-07-24 10:04:59,374 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/home/yay/hadoop_tmp/mapred/local/1595556299056/automaton-1.11-8.jar
2020-07-24 10:04:59,374 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/home/yay/hadoop_tmp/mapred/local/1595556299057/antlr-runtime-3.4.jar
2020-07-24 10:04:59,374 [JobControl] INFO org.apache.hadoop.mapred.LocalDistributedCacheManager - file:/home/yay/hadoop_tmp/mapred/local/1595556299058/joda-time-2.9.3.jar
2020-07-24 10:04:59,377 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2020-07-24 10:04:59,378 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1469778342_0001
2020-07-24 10:04:59,378 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases SOURCE,TARGET
2020-07-24 10:04:59,378 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: SOURCE[1,9],TARGET[-1,-1] C: R:
2020-07-24 10:04:59,383 [Thread-22] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2020-07-24 10:04:59,383 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2020-07-24 10:04:59,384 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local1469778342_0001]
2020-07-24 10:04:59,408 [Thread-22] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.job.reduce.markreset.buffer.percent is deprecated. Instead, use mapreduce.reduce.markreset.buffer.percent
2020-07-24 10:04:59,409 [Thread-22] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2020-07-24 10:04:59,456 [Thread-22] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2020-07-24 10:04:59,457 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local1469778342_0001_m_000000_0
2020-07-24 10:04:59,523 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2020-07-24 10:04:59,528 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1
Total Length = 131156
Input split[0]:
Length = 131156
ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit
Locations:
-----------------------
2020-07-24 10:04:59,542 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2020-07-24 10:04:59,545 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed hdfs://localhost:9000/ch07/crime_dataset.csv:0+131156
2020-07-24 10:04:59,578 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2020-07-24 10:04:59,579 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2020-07-24 10:04:59,596 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: SOURCE[1,9],TARGET[-1,-1] C: R:
2020-07-24 10:04:59,730 [LocalJobRunner Map Task Executor #0] INFO org.elasticsearch.hadoop.util.Version - Elasticsearch Hadoop v7.8.0 [5a921b0e52]
2020-07-24 10:04:59,752 [LocalJobRunner Map Task Executor #0] WARN org.elasticsearch.hadoop.rest.Resource - Detected type name in resource [es_pig/crimes]. Type names are deprecated and will be removed in a later release.
2020-07-24 10:04:59,752 [LocalJobRunner Map Task Executor #0] INFO org.elasticsearch.hadoop.mr.EsOutputFormat - Writing to [es_pig/crimes]
2020-07-24 10:04:59,778 [LocalJobRunner Map Task Executor #0] WARN org.elasticsearch.hadoop.rest.Resource - Detected type name in resource [es_pig/crimes]. Type names are deprecated and will be removed in a later release.
2020-07-24 10:05:02,038 [LocalJobRunner Map Task Executor #0] WARN org.elasticsearch.hadoop.rest.Resource - Detected type name in resource [es_pig/crimes]. Type names are deprecated and will be removed in a later release.
2020-07-24 10:05:02,074 [LocalJobRunner Map Task Executor #0] WARN org.elasticsearch.hadoop.rest.Resource - Detected type name in resource [es_pig/crimes]. Type names are deprecated and will be removed in a later release.
2020-07-24 10:05:02,252 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner -
2020-07-24 10:05:02,562 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task:attempt_local1469778342_0001_m_000000_0 is done. And is in the process of committing
2020-07-24 10:05:02,571 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - map
2020-07-24 10:05:02,571 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local1469778342_0001_m_000000_0' done.
2020-07-24 10:05:02,571 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local1469778342_0001_m_000000_0
2020-07-24 10:05:02,571 [Thread-22] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2020-07-24 10:05:02,886 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2020-07-24 10:05:02,886 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local1469778342_0001]
2020-07-24 10:05:04,395 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2020-07-24 10:05:04,408 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2020-07-24 10:05:04,409 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2020-07-24 10:05:04,409 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2020-07-24 10:05:04,410 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2020-07-24 10:05:04,461 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2020-07-24 10:05:04,463 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.6.0 0.17.0 yay 2020-07-24 10:04:57 2020-07-24 10:05:04 UNKNOWN
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTime AvgMapTime MedianMapTime MaxReduceTime MinReduceTime AvgReduceTime MedianReducetime Alias Feature Outputs
job_local1469778342_0001 1 0 n/a n/a n/a n/a 0 0 0 0 SOURCE,TARGET MAP_ONLY es_pig/crimes,
Input(s):
Successfully read 999 records (7914968 bytes) from: "/ch07/crime_dataset.csv"
Output(s):
Successfully stored 999 records (6769797 bytes) in: "es_pig/crimes"
Counters:
Total records written : 999
Total bytes written : 6769797
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_local1469778342_0001
2020-07-24 10:05:04,464 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2020-07-24 10:05:04,465 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2020-07-24 10:05:04,466 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2020-07-24 10:05:04,471 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1009 time(s).
2020-07-24 10:05:04,472 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
grunt>
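The statistics block in the log reports 999 records read and 999 stored. One way to cross-check this against Elasticsearch is to compare those numbers with the index's `_count` response. The following is a small sketch only: both function names are hypothetical, and it assumes the grunt output was saved to a file and that Elasticsearch listens on localhost:9200 (the es-hadoop default, since the STORE call sets no es.nodes).

```shell
# Pull the "read" and "stored" record counts out of Pig's statistics block (log on stdin).
pig_record_counts() {
    awk '
        /^Successfully read [0-9]+ records/   { read_n = $3 }
        /^Successfully stored [0-9]+ records/ { stored_n = $3 }
        END { printf "read=%s stored=%s\n", read_n, stored_n }
    '
}

# Extract the "count" value from an Elasticsearch _count response (JSON on stdin).
count_from_response() {
    sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

With the log saved to a hypothetical file pig.log, `pig_record_counts < pig.log` prints the two counts, and `curl -s 'http://localhost:9200/es_pig/_count' | count_from_response` prints the number of documents in the index. Because the job sets es.mapping.id=id, rows that share an id overwrite one another, so the index count can be lower than the stored count if ids repeat.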

1.2 Importing from a JSON data source
Upload the JSON file to HDFS:
yay@yay-ThinkPad-T470-W10DG:~/software/hadoop-2.6.0/sbin$ hdfs dfs -put /home/yay/下载/Elasticsearch-for-Hadoop_code/eshadoop-master/Chapter7/data/crimes.json /ch07/crimes.json