1 TensorFlow architecture
Client
Distributed Master
Worker Services (one for each task)
Kernel Implementations
Client
The client creates a session, which sends the graph definition to the distributed master as a tf.GraphDef protocol buffer. When the client evaluates a node or nodes in the graph, the evaluation triggers a call to the distributed master to initiate computation.
In short, the client hands the graph definition to the master.
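To make this concrete, here is a minimal TF 1.x-style sketch of the client side: building a small graph and opening a session against a master. The grpc:// address is a placeholder, not a real cluster.

```python
import tensorflow as tf

# Build a small graph in the client's default graph.
a = tf.constant(2.0, name="a")
b = tf.constant(3.0, name="b")
c = tf.add(a, b, name="c")

# Opening a session against a master sends the graph definition
# (a tf.GraphDef) to the distributed master.
# "grpc://localhost:2222" is a placeholder address.
with tf.Session("grpc://localhost:2222") as sess:
    # Evaluating c triggers a call to the master to run the needed subgraph.
    print(sess.run(c))
```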
Distributed Master
Prunes the graph to extract the subgraph needed to evaluate the nodes requested by the client.
Partitions the graph into pieces, one for each participating device.
Caches these graph pieces so they may be reused in subsequent steps.
Where graph edges are cut by the partition, the distributed master inserts send and receive nodes to pass information between the distributed tasks (Figure 6)
The distributed master then ships the graph pieces to the distributed tasks.
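As a rough illustration of how placement drives partitioning, the sketch below pins ops to two hypothetical worker tasks; the edge that crosses the task boundary is where the master would insert a Send/Recv pair. The host addresses are placeholders and assume worker servers are already running there.

```python
import tensorflow as tf

with tf.device("/job:worker/task:0/cpu:0"):
    x = tf.constant([[1.0, 2.0]])
    w = tf.constant([[3.0], [4.0]])
    y = tf.matmul(x, w)

with tf.device("/job:worker/task:1/cpu:0"):
    # The edge y -> z crosses the partition boundary, so the distributed
    # master inserts a Send node on task:0 and a matching Recv node on task:1.
    z = y * 2.0

# "grpc://host0:2222" is a placeholder master address.
with tf.Session("grpc://host0:2222") as sess:
    print(sess.run(z))
```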
Worker Service
Handles requests from the master.
Schedules the execution of kernels for the operations in its local graph pieces.
Communicates directly with other tasks.
The worker service dispatches kernels to local devices and runs kernels in parallel when possible, for example by using multiple CPU cores or GPU streams.
Different Send and Recv implementations are defined for different device types:
Transfers between local CPU and GPU devices use the cudaMemcpyAsync() API to overlap computation and data transfer.
Transfers between two local GPUs use peer-to-peer DMA, to avoid an expensive copy via the host CPU.
There is also preliminary support for NVIDIA's NCCL library for multi-GPU communication (see tf.contrib.nccl).
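For the NCCL path, a small sketch (assuming a TF 1.x build with tf.contrib and at least two visible GPUs) might look like this; all_sum reduces one tensor per GPU into matching per-GPU outputs.

```python
import tensorflow as tf
from tensorflow.contrib import nccl

# One tensor per GPU; assumes at least two GPUs are visible.
towers = []
for i in range(2):
    with tf.device("/gpu:%d" % i):
        towers.append(tf.random_normal([4]))

# all_sum returns one tensor per input device, each holding the element-wise
# sum, computed with NCCL's direct GPU-to-GPU communication.
summed = nccl.all_sum(towers)

with tf.Session() as sess:
    print(sess.run(summed))
```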
Kernel Implementations
The runtime contains over 200 standard operations including mathematical, array manipulation, control flow, and state management operations.
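A few of those standard operations from each category, in a minimal TF 1.x-style sketch:

```python
import tensorflow as tf

# Mathematical and array-manipulation ops.
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
y = tf.matmul(x, tf.transpose(x))
flat = tf.reshape(y, [-1])

# Control-flow op: choose a branch when the graph runs.
total = tf.reduce_sum(flat)
clipped = tf.cond(total > 10.0, lambda: total / 2.0, lambda: total)

# State-management ops: a variable plus an in-place update.
counter = tf.Variable(0, name="counter")
increment = tf.assign_add(counter, 1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run([clipped, increment]))
```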