Reading Small Files in MapReduce with the Distributed Cache

Author: Skye_kh | Published 2016-12-06 08:53 | 55 reads

title: Reading Small Files in MapReduce with the Distributed Cache
date: 2016/11/26 22:48:13
tags: MapReduce
categories: Big Data

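The Distributed Cache is widely used in MapReduce jobs: it can greatly speed up access to files that are read frequently. Files added to the Distributed Cache are copied to the nodes that run the job's tasks and appear in the working directory of each Mapper and Reducer.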

**Add the following call to the job**

Here remoteReGamePath is a string holding the HDFS path of the file.
job.addCacheFile(new Path(remoteReGamePath).toUri());
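
job.addCacheFile takes a plain URI, and Hadoop treats an optional `#fragment` on that URI as the name of the link it creates in the task's working directory. The sketch below shows, with `java.net.URI` alone, how the local name could be worked out on the task side. The host, path, and alias are made up for illustration, and `localName` is a hypothetical helper, not a Hadoop API.

```java
import java.net.URI;

final class CacheLinkName {
    /**
     * Hypothetical helper: the name under which a cached file is expected to
     * appear in the task's working directory. Assumes a hierarchical URI
     * (one that has a path), such as an HDFS path.
     */
    static String localName(URI cacheUri) {
        // An explicit "#alias" fragment takes precedence over the file name
        String fragment = cacheUri.getFragment();
        if (fragment != null && !fragment.isEmpty()) {
            return fragment;
        }
        // Otherwise fall back to the last segment of the path
        String path = cacheUri.getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        URI plain = URI.create("hdfs://namenode:8020/data/recommend_game.csv");
        URI aliased = URI.create("hdfs://namenode:8020/data/recommend_game.csv#games");
        System.out.println(localName(plain));   // recommend_game.csv
        System.out.println(localName(aliased)); // games
    }
}
```

Without a fragment the local name is simply the file name, which is the case the rest of this post relies on. If an alias is added, the task has to open the file under the alias instead.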

The following example reads this file in the Mapper and stores the first field of each line in a set.

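// Requires: java.io.BufferedReader, java.io.FileReader, java.io.IOException,
// java.net.URI, java.util.HashSet, java.util.Set, org.apache.hadoop.fs.Path

// First field of every line in the recommended-games file
private Set<String> recommendGame = new HashSet<String>();

/**
 * Reads the recommended-games file from the Distributed Cache.
 *
 * @param uri URI of the cached file, as returned by context.getCacheFiles()
 */
private void readReGame(URI uri) throws IOException {
    // The cached file is available in the task's working directory under its
    // own file name, so it can be opened as an ordinary local file.
    String patternsFileName = new Path(uri.getPath()).getName();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader reader = new BufferedReader(new FileReader(patternsFileName))) {
        String line;
        while ((line = reader.readLine()) != null) {
            // Keep only the first comma-separated field; the limit of 2
            // guarantees at least one element even for a line such as ","
            recommendGame.add(line.split(",", 2)[0]);
        }
    }
}

@Override
protected void setup(Mapper<LongWritable, Text, Text, Text>.Context context)
        throws IOException, InterruptedException {
    super.setup(context);
    // Get the URIs of the files registered with job.addCacheFile
    URI[] uri = context.getCacheFiles();
    if (uri == null || uri.length == 0) {
        throw new IOException("No file found in the Distributed Cache");
    }
    // Let read errors fail the task instead of leaving the set silently empty
    readReGame(uri[0]);
}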
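
The loading logic in setup can be tried outside Hadoop, which helps when checking how the file is parsed. What follows is a minimal sketch using only the JDK: the class name `FirstFieldLoader`, the sample rows, and the skipping of blank lines are assumptions added for illustration, not part of the original job.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

final class FirstFieldLoader {
    /** Collects the first comma-separated field of every non-blank line. */
    static Set<String> load(BufferedReader reader) {
        Set<String> keys = new HashSet<>();
        reader.lines()
              .filter(line -> !line.trim().isEmpty())
              // The limit of 2 keeps split from returning an empty array
              .forEach(line -> keys.add(line.split(",", 2)[0].trim()));
        return keys;
    }

    public static void main(String[] args) {
        // Made-up sample content standing in for the cached file
        String sample = "1001,Chess\n1002,Go\n\n1001,Chess again\n";
        Set<String> recommendGame = load(new BufferedReader(new StringReader(sample)));
        // Mimics the membership check a map() call might make per record
        System.out.println(recommendGame.contains("1001")); // true
        System.out.println(recommendGame.contains("9999")); // false
        System.out.println(recommendGame.size());           // 2
    }
}
```

Because the set is filled once in setup, each map() call only pays for a hash lookup, which is why this pattern suits small lookup files.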

Article link: https://www.haomeiwen.com/subject/nlriuttx.html
