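Chapter 1 Computer Abstractions and Technology
Anyone who has written a paper will know the word "abstraction". Sure enough, this chapter is survey material: technology enthusiasts will probably find it dry, but for someone managing in the industry every sentence is worth its weight in gold.
1.1 Introduction
The section opens with how widely computing has spread, mainly in these areas: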
- automobiles
- cell phones
- Human Genome Project
- World Wide Web
- search engines
It then summarizes the traditional classes of computing applications and their characteristics, in three groups:
- PCs
- Servers
- Embedded computers
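Finally it welcomes us to the post-PC era and describes what sets it apart:
- The defining feature is that the PMD replaces the PC
- PMD: Personal Mobile Device
So what can readers learn from this book? The answers:
- How programs written in a high-level language (C, Java) are translated into a language the hardware understands, and how the hardware executes them
- What the interface between software and hardware is, and how software makes the hardware perform a given function
- What determines the performance of a program, and how to improve it
- What techniques hardware designers can use to improve performance
- What techniques hardware designers can use to improve energy efficiency
- Why the industry moved to parallelism, and where that move leads
- The eight great ideas in modern computer architecture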
1.2 Eight Great Ideas in Computer Architecture
Over the past 60 years, computer architecture has produced eight great ideas:
- Design for Moore's Law
- Use Abstraction to Simplify Design
- Make the Common Case Fast
- Performance via Parallelism
- Performance via Pipelining
- Performance via Prediction
- Hierarchy of Memories
- Dependability via Redundancy
1.3 Below Your Program
Application software sits on top of system software, which in turn sits on top of the hardware.
There are two kinds of system software:
- operating system
- compiler
The compiler translates programs written in a high-level language into instructions that the machine can execute.
From a High-Level Language to the Language of Hardware
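The language a machine understands is made up of instructions. An instruction is usually written in binary, e.g. 1001010100101110, which here tells the machine to add two numbers.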
1.4 Under the Covers
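Using the iPad 2 as an example, a brief tour of the hardware from the LCD to the CPU.
1.5 Technologies for Building Processors and Memory
A short introduction to semiconductor manufacturing.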
The cost of an integrated circuit can be expressed in three simple equations:
Cost per die = Cost per wafer / (Dies per wafer × Yield)
Dies per wafer ≈ Wafer area / Die area
Yield = 1 / (1 + (Defects per area × Die area / 2))^2
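To see how the three equations fit together, here is a minimal Python sketch. The function names and the sample numbers (wafer cost, wafer area, die area, defect density) are made up for illustration and are not taken from the book.

```python
def dies_per_wafer(wafer_area, die_area):
    """Approximate die count; ignores dies lost at the round wafer edge."""
    return wafer_area / die_area


def die_yield(defects_per_area, die_area):
    """Yield = 1 / (1 + defects_per_area * die_area / 2)^2."""
    return 1.0 / (1.0 + defects_per_area * die_area / 2.0) ** 2


def cost_per_die(wafer_cost, wafer_area, die_area, defects_per_area):
    """Cost per die = wafer cost / (dies per wafer * yield)."""
    good_dies = dies_per_wafer(wafer_area, die_area) * die_yield(defects_per_area, die_area)
    return wafer_cost / good_dies


if __name__ == "__main__":
    # Hypothetical inputs: 5000 per wafer, 700 cm^2 wafer, 0.5 defects per cm^2.
    for die_area in (1.0, 2.0):
        cost = cost_per_die(5000.0, 700.0, die_area, 0.5)
        print(f"die area {die_area} cm^2 -> cost per die {cost:.2f}")
```

Because die area appears both in the die count and in the yield, doubling the die area more than doubles the cost per die under these equations.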
1.6 Performance
- Defining Performance
- Measuring Performance
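  - Time is the measure of computer performance, and time can be defined in several ways:
    - wall clock time
    - response time
    - elapsed time
    - CPU execution time / CPU time
      - user CPU time: the time spent purely in the program itself
      - system CPU time: the time spent in the operating system on behalf of the program
      - the two are hard to separate precisely
    - We usually like to judge a program by its elapsed time, but elapsed time and CPU time are not the same thing
    - We will use the term system performance to refer to elapsed time on an unloaded system and CPU performance to refer to user CPU time
    - For example, some servers depend heavily on I/O, so their performance has to be assessed across hardware and software together, while some applications care only about throughput, or response time, or a combination of the two. To improve performance you therefore need to know which performance metric matters, so that you can find the bottleneck.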
- CPU Performance and its Factors
  - Performance improves by reducing the number of clock cycles a program needs, or by raising the clock rate
- Instruction Performance
  - Clock cycles = Instruction count × Average clock cycles per instruction
  - Average clock cycles per instruction: CPI for short
  - Different instructions take different numbers of clock cycles; CPI is the average over all instructions executed
  - CPI provides one way of comparing two different implementations of the identical instruction set architecture, since the number of instructions executed for a program will, of course, be the same.
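- The Classic CPU Performance Equation
  - CPU time = Instruction count × CPI × Clock cycle time (or Instruction count × CPI / Clock rate)
  - Note that IPC = 1/CPI, where IPC is instructions per clock cycle
  - Note that many CPUs now vary their clock rate: the Intel Core i7, for example, can raise its frequency by about 10% until the chip gets too hot and then drops back. This is called Turbo mode.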
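The relations in this list can be tried out with a few lines of Python. This is only a sketch: the instruction mix and the 2 GHz clock below are invented numbers, and the function names are my own.

```python
def average_cpi(mix):
    """Average CPI for a mix given as (instruction_count, cycles_per_instruction) pairs."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cycles for count, cycles in mix)
    return total_cycles / total_instructions


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count * CPI / clock rate."""
    return instruction_count * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical program: three instruction classes costing 1, 2 and 3 cycles each.
    mix = [(2_000_000_000, 1), (1_000_000_000, 2), (1_000_000_000, 3)]
    instructions = sum(count for count, _ in mix)
    cpi = average_cpi(mix)
    print("CPI:", cpi)
    print("IPC:", 1 / cpi)
    print("CPU time (s) at 2 GHz:", cpu_time(instructions, cpi, 2e9))
```

With the instruction count fixed, the only ways to cut CPU time in this model are a lower CPI or a higher clock rate, which is the point of the equation.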
1.7 The Power Wall
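The section starts with a chart of clock rate and power across eight generations of Intel CPUs. Note that with the Pentium 4 Prescott (2004) the clock rate reached 3.6 GHz and power reached 103 W; after that both clock rate and power came down.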
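Next comes the dynamic energy and power of a single transistor (in joules and in watts):
Energy ∝ 1/2 × Capacitive load × Voltage^2
Power ∝ 1/2 × Capacitive load × Voltage^2 × Frequency switched
Frequency switched is a function of the clock rate.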
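With regard to the figure, how could clock rates grow by a factor of 1000 while power increased by only a factor of 30?
Energy, and thus power, can be reduced by lowering the voltage, which every CPU generation has done: voltage dropped by about 15% per generation.
In 20 years, voltages have gone from 5 V to 1 V, which is why the increase in power is only 30 times. (Power scales with Voltage^2, so the voltage drop alone contributes a factor of (1/5)^2 = 0.04, and 0.04 × 1000 = 40, the same order of magnitude as the observed 30.)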
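The scaling argument can be checked numerically. The sketch below applies the proportionality Power ∝ Capacitive load × Voltage^2 × Frequency to ratios between an old and a new chip, and also estimates how many 15% voltage steps it takes to get from 5 V to 1 V. Holding the capacitive load constant is my simplifying assumption, and the function names are hypothetical.

```python
import math


def dynamic_power_ratio(capacitance_ratio, voltage_ratio, frequency_ratio):
    """New power / old power, from Power ∝ C * V^2 * f (the 1/2 cancels in a ratio)."""
    return capacitance_ratio * voltage_ratio ** 2 * frequency_ratio


def generations_to_reach(voltage_start, voltage_end, reduction_per_generation):
    """Number of generations n such that start * (1 - reduction)^n == end."""
    return math.log(voltage_end / voltage_start) / math.log(1.0 - reduction_per_generation)


if __name__ == "__main__":
    # Assumption: capacitive load unchanged (ratio 1.0); voltage 5 V -> 1 V; clock 1000x.
    print("power ratio:", dynamic_power_ratio(1.0, 1.0 / 5.0, 1000.0))
    print("generations at 15% per step:", generations_to_reach(5.0, 1.0, 0.15))
```

Under these assumptions the drop from 5 V to 1 V corresponds to roughly ten generations of 15% reductions.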
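The problem today is that lowering the voltage further makes the transistors too leaky (about 40% of the power consumed by server chips is due to leakage).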
To try to address the power problem, designers have already attached large devices to increase cooling, and they turn off parts of the chip that are not used in a given clock cycle.
Although it requires expensive cooling equipment (for around 300 W of power), this approach is still used in PCs and servers; PMDs do not need it.
Power is a challenge for integrated circuits for two reasons:
- Power must be brought in and distributed around the chip; a modern chip may need hundreds of pins just for power and ground.
- Power is dissipated as heat and must be removed, which calls for more expensive cooling.
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors
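To get around the power problem, the industry turned to multiprocessors, which means programmers now have to consider rewriting existing programs to take advantage of multiple processors.
So far few programmers seem to have made that transition; change is expected in the future.
For parallel programming, the challenges include scheduling, load balancing, time for synchronization, and overhead for communication between the parties.
The section then lists what each later chapter contributes to the parallel revolution.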
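Load balancing is the easiest of these challenges to illustrate with a toy model. In the sketch below, the parallel run finishes only when the busiest worker finishes, so a poor assignment of tasks wastes most of the second processor. The model, the two assignment strategies, and the task durations are my own illustration rather than anything from the book, and the model ignores synchronization and communication costs entirely.

```python
import heapq


def finish_time_round_robin(task_times, workers):
    """Hand tasks to workers in turn; return the busiest worker's total time."""
    loads = [0.0] * workers
    for index, duration in enumerate(task_times):
        loads[index % workers] += duration
    return max(loads)


def finish_time_greedy(task_times, workers):
    """Give each task, longest first, to the currently least-loaded worker."""
    loads = [0.0] * workers
    heapq.heapify(loads)
    for duration in sorted(task_times, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + duration)
    return max(loads)


if __name__ == "__main__":
    tasks = [8, 1, 1, 1, 8, 1, 1, 1]  # hypothetical task durations
    serial = sum(tasks)
    for name, finish in (("round robin", finish_time_round_robin(tasks, 2)),
                         ("greedy", finish_time_greedy(tasks, 2))):
        print(f"{name}: finish time {finish}, speedup {serial / finish:.2f}")
```

With these durations, round robin puts both long tasks on the same worker, while the greedy assignment splits them and reaches the ideal speedup of 2 on two workers.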
1.9 Real Stuff: Benchmarking the Intel Core i7
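Every chapter ends with a real example that reviews its content; Chapter 1 closes by showing how benchmark programs are used to evaluate different CPUs.
Most of today's benchmarks come from SPEC (System Performance Evaluation Cooperative).
It is worth a look if you are interested; the details are not covered here.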
1.10 Fallacies and Pitfalls
Every chapter also ends with a section on fallacies and pitfalls.
1.11 Concluding Remarks
A summary of the whole chapter.
1.12 Historical Perspective and Further Reading
Online reading.