AWS

Author: klory | Published 2017-09-20 06:29

    IaaS (Infrastructure as a Service; e.g. AWS, Azure, Google Cloud): virtualized infrastructure services. This model is aimed at large deployments involving many different kinds of servers.

    Unlike DigitalOcean and Linode (VPS, i.e. virtual private server), which are better suited to hosting WordPress or other small single-server websites.

    Services

    • CDN (CloudFront)
      Content delivery network; serves the website from the location closest to the user.
    • Glacier
      Stores data that is accessed infrequently (cold/archival storage)
    • Storage (S3)
      Stores data that is used frequently
    • Virtual Server (EC2)
    • Lambda
      Pure compute without worrying about servers.
    • Database

    Benefits

    • Scalable (just spend more money)
    • Total Cost of Ownership is low: running your own hardware means hiring people to deal with different servers and facilities such as power and cooling.
    • Highly reliable for price point
    • Centralized Billing and Management

    Problems

    • vendor lock-in
    • learning curve
    • costs add up

    Pricing

    • compute
    • storage
    • bandwidth
    • interaction

    Normal File system

    • Linux default disk block size = 4 KB; if a file is smaller than a block, the rest of the block is wasted
    • GFS <-> HDFS (HDFS is the open-source counterpart of Google's GFS)
    • MapReduce <-> Hadoop (Hadoop implements Google's MapReduce model)
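
To make the wasted-space point concrete, here is a minimal shell sketch of the arithmetic (the 1000-byte file size is just an illustrative value, not from the notes):

```shell
#!/bin/sh
# Internal fragmentation: a file always occupies whole disk blocks,
# so the unused tail of its last block is wasted.
BLOCK=4096   # Linux default block size (4 KB)
SIZE=1000    # illustrative file size in bytes

# blocks allocated = ceil(SIZE / BLOCK), via integer arithmetic
blocks=$(( (SIZE + BLOCK - 1) / BLOCK ))
wasted=$(( blocks * BLOCK - SIZE ))

echo "blocks allocated: $blocks"               # 1
echo "bytes wasted in the last block: $wasted" # 3096
```

A 1000-byte file still occupies a full 4 KB block, so roughly three quarters of that block is lost; with millions of small files this overhead adds up.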

    HDFS

    • A file system specially designed for storing big data with a streaming access pattern (write once, read many times)
    • default block size = 64 MB; if a file is smaller than a block, the rest of the block is NOT wasted

    Hadoop

    daemons

    • master daemons: name node, secondary name node, job tracker
    • slave daemons: data node, task tracker

    example - theory

    • we (the client) have 200 MB of data, so we need 4 blocks (3 full 64 MB blocks plus one 8 MB block)
    • we need 1 name node (nn) and several data nodes (dn), e.g. 8 data nodes
    • the nn creates the metadata and starts the daemons
    • the nn passes the metadata back to the client; the client then distributes the blocks to the data nodes and makes replications based on the info from the name node
    • each data node sends heartbeats back to the nn to signal that it is alive
    • the client sends the code to the data nodes
    • the job tracker tells the task trackers to do their jobs
    • after the map tasks are finished, the job tracker assigns a reducer
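
The arithmetic in the walkthrough above can be sketched in shell; note that the replication factor of 3 is HDFS's default and an assumption here, since the notes do not state one:

```shell
#!/bin/sh
# A 200 MB file split into 64 MB HDFS blocks:
# 3 full blocks plus one 8 MB remainder block = 4 blocks.
DATA_MB=200
BLOCK_MB=64
REPLICATION=3   # HDFS default replication factor (assumption)

blocks=$(( (DATA_MB + BLOCK_MB - 1) / BLOCK_MB ))   # ceiling division
copies=$(( blocks * REPLICATION ))

echo "blocks needed: $blocks"              # 4
echo "block copies cluster-wide: $copies"  # 12, spread across the data nodes
```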

    example - real world

    • split the data (documents) into input splits and pass them to Record Readers,
      which then send records to the mappers (the default for text jobs is to split a document into lines and send the lines to the mappers)
    • then shuffle the data so that pairs with the same key end up together; the default shuffle (sort) in Hadoop orders keys alphabetically
    • then reduce (each reducer reduces one key)
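
The map, shuffle, reduce flow above can be mimicked with plain coreutils; this is only an analogy, not Hadoop itself: `tr` plays the mapper (emit one word per line), `sort` plays the alphabetical shuffle, and `uniq -c` plays the reducer that counts each key.

```shell
#!/bin/sh
# Word count as a pipeline, mirroring map -> shuffle -> reduce.
printf 'the quick fox\nthe lazy dog\n' |
  tr -s ' ' '\n' |   # map: emit one (word) record per line
  sort |             # shuffle: sorting brings identical keys together
  uniq -c            # reduce: count the records for each key ("the" -> 2)
```

Because `sort` groups identical keys onto adjacent lines, `uniq -c` only ever looks at one key at a time, just as a Hadoop reducer receives all values for a single key together.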

    HDFS instructions

    • step 1, basic commands:
      hdfs dfs -ls /, hdfs dfs -mkdir, hdfs dfs -put, hdfs dfs -get
    • step 2, move a file to HDFS:
      hdfs dfs -put input.txt /user/class/
    • step 3, compile:
      javac -cp $HADOOP_core.jar *.java
    • step 4, package the classes into a jar:
      jar cvf test.jar *.class
    • step 5, run the job:
      hadoop jar wordcount.jar ...WordCount

    Setup

    Set up your AWS account by following the steps below:

    1. Go to AWS (https://aws.amazon.com/) and create an account. You need to enter your credit card info.
    2. You can find your AWS account number in your AWS profile. Use that account number to apply for AWS Educate credits at https://aws.amazon.com/education/awseducate/apply/ It will take a few hours before you receive an email confirming your credits are active.

    If you have not received your AWS Educate credits and are not using free-tier services, your credit card will be charged for usage, and you will be responsible for any costs incurred.


    Original post: https://www.haomeiwen.com/subject/gdrnsxtx.html