Environment Variables
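MXNet has several settings that can be changed through environment variables.
Typically, you do not need to change these settings; they are listed here for reference in case they are ever needed.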
- MXNET_GPU_WORKER_NTHREADS (default=2)
- The maximum number of threads that do computation work on each GPU.
- MXNET_GPU_COPY_NTHREADS (default=1)
- The maximum number of threads that do memory copy work on each GPU.
- MXNET_CPU_WORKER_NTHREADS (default=1)
- The maximum number of threads that do CPU computation work.
- MXNET_CPU_PRIORITY_NTHREADS (default=4)
- The number of threads used for prioritized CPU work.
- MXNET_EXEC_ENABLE_INPLACE (default=true)
- Whether in-place optimization is enabled in symbolic execution.
- MXNET_EXEC_MATCH_RANGE (default=10)
- The rough matching scale used by the memory allocator in symbolic execution.
- Set this to 0 to disable memory sharing between graph nodes (for debugging purposes).
- MXNET_EXEC_NUM_TEMP (default=1)
- Maximum number of temporary workspaces that can be allocated to each device.
- Setting this to a small number can save GPU memory.
- It will also likely decrease the level of parallelism, which is usually OK.
- MXNET_ENGINE_TYPE (default=ThreadedEnginePerDevice)
- The type of underlying execution engine of MXNet.
- List of choices
- NaiveEngine: a very simple engine that uses the master thread to do computation.
- ThreadedEngine: a threaded engine that uses a global thread pool to schedule jobs.
- ThreadedEnginePerDevice: a threaded engine that allocates threads per GPU.
- MXNET_KVSTORE_REDUCTION_NTHREADS (default=4)
- Number of threads used for summing big arrays.
- MXNET_KVSTORE_BIGARRAY_BOUND (default=1e6)
- The minimum size of a "big array".
- When the array size is bigger than this threshold, MXNET_KVSTORE_REDUCTION_NTHREADS threads will be used for reduction.
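As a minimal sketch of how the variables above might be set from Python: the helper below (`apply_mxnet_env`, a hypothetical name that is not part of MXNet) writes string values into `os.environ`. It assumes the variables need to be in place before MXNet is loaded, so the `import mxnet` line is left commented out and the snippet runs without MXNet installed. The values shown are illustrative, not recommendations.

```python
import os


def apply_mxnet_env(settings):
    """Copy settings into os.environ as strings and return what was applied.

    Hypothetical helper for illustration only; it is not an MXNet API.
    """
    applied = {}
    for name, value in settings.items():
        # Guard against typos such as a missing MXNET_ prefix.
        if not name.startswith("MXNET_"):
            raise ValueError(f"not an MXNet variable: {name}")
        # Environment variables are always strings.
        applied[name] = str(value)
    os.environ.update(applied)
    return applied


applied = apply_mxnet_env({
    "MXNET_ENGINE_TYPE": "NaiveEngine",     # one of the engine choices listed above
    "MXNET_KVSTORE_REDUCTION_NTHREADS": 8,  # illustrative value only
})

# Assumption: the variables must be set before MXNet is loaded.
# import mxnet as mx
```

Setting the same variables in the shell before launching the program is an equivalent alternative.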
Settings for Minimum Memory Usage
- Make sure min(MXNET_EXEC_NUM_TEMP, MXNET_GPU_WORKER_NTHREADS) = 1.
- The default setting satisfies this.
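The minimum-memory condition can be checked with a few lines of Python. The sketch below is illustrative only: `satisfies_min_memory` is a hypothetical helper, and the fallback values are the defaults listed in this document rather than values read from MXNet itself.

```python
import os

# Defaults as listed in this document; used when a variable is unset.
DEFAULTS = {
    "MXNET_EXEC_NUM_TEMP": 1,
    "MXNET_GPU_WORKER_NTHREADS": 2,
}


def satisfies_min_memory(env=None):
    """Return True if min(MXNET_EXEC_NUM_TEMP, MXNET_GPU_WORKER_NTHREADS) == 1."""
    env = os.environ if env is None else env
    num_temp = int(env.get("MXNET_EXEC_NUM_TEMP", DEFAULTS["MXNET_EXEC_NUM_TEMP"]))
    workers = int(env.get("MXNET_GPU_WORKER_NTHREADS", DEFAULTS["MXNET_GPU_WORKER_NTHREADS"]))
    return min(num_temp, workers) == 1


# With nothing set, the documented defaults give min(1, 2) = 1.
print(satisfies_min_memory({}))  # True

# Raising both values breaks the condition: min(2, 4) = 2.
print(satisfies_min_memory({"MXNET_EXEC_NUM_TEMP": "2",
                            "MXNET_GPU_WORKER_NTHREADS": "4"}))  # False
```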
Settings for More GPU Parallelism
- Set MXNET_GPU_WORKER_NTHREADS to a larger number (e.g. 2).
- You may want to set MXNET_EXEC_NUM_TEMP to a small number to reduce memory usage.
- This may not speed things up, especially for image applications, because the GPU is usually fully utilized even with serialized jobs.
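One way to try such settings without changing the current shell is to launch the job in a child process with a modified environment. The sketch below is an illustration under assumptions: `run_with_env` is a hypothetical helper, and the child command is a stand-in that only prints the variables it receives. A real job would be a program that loads MXNet.

```python
import os
import subprocess
import sys


def run_with_env(command, overrides):
    """Run `command` in a child process whose environment includes `overrides`."""
    env = dict(os.environ)  # start from the current environment
    env.update({name: str(value) for name, value in overrides.items()})
    return subprocess.run(command, env=env, capture_output=True, text=True, check=True)


# Stand-in child process: prints the two variables it sees.
child = [
    sys.executable,
    "-c",
    "import os; print(os.environ['MXNET_GPU_WORKER_NTHREADS'], "
    "os.environ['MXNET_EXEC_NUM_TEMP'])",
]

result = run_with_env(child, {
    "MXNET_GPU_WORKER_NTHREADS": 2,  # example value from the list above
    "MXNET_EXEC_NUM_TEMP": 1,        # kept small to limit memory usage
})
print(result.stdout.strip())  # 2 1
```

Because the overrides apply only to the child process, it is straightforward to time the same job under several settings and see whether the extra parallelism helps at all.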