UnifiedMemoryPerf-Unified and ot

Author: fantasy5328 | Published 2020-03-14 22:19

    https://github.com/NVIDIA/cuda-samples/tree/master/Samples/UnifiedMemoryPerf
    Unified and other CUDA Memories Performance
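    This sample compares the performance of a matrix-multiplication kernel that uses Unified Memory, with and without hints, against other kinds of memory: zero-copy buffers, and pageable and page-locked (pinned) host memory with synchronous and asynchronous transfers, all on a single GPU. The eight configurations are labeled: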

    UMhint UMhntAs UMeasy 0Copy MemCopy CpAsync CpHpglk CpPglAs

    "UMhint",  // managed memory with hints
    "UMhntAs", // managed memory with hints, async
    "UMeasy",  // managed memory with no hints
    "0Copy",   // zero copy
    "MemCopy", // host pageable and device memory
    "CpAsync", // host pageable and device memory, async
    "CpHpglk", // host page-locked and device memory
    "CpPglAs"  // host page-locked and device memory, async
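
To make the labels above concrete, here is a minimal sketch of how three of these strategies differ at the allocation and transfer level. It is not the code of the NVIDIA sample: the `scale` kernel is a hypothetical stand-in for the matrix multiplication, and `runManaged`, `runZeroCopy`, and `runCopy` are names invented for illustration. It assumes a 64-bit platform with unified virtual addressing and the `int` device forms of `cudaMemAdvise` and `cudaMemPrefetchAsync` found in the CUDA 8 to 12 toolkits; newer toolkits may need a different signature. Hints are skipped on devices that do not report `cudaDevAttrConcurrentManagedAccess`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CHECK(call)                                                  \
  do {                                                               \
    cudaError_t err_ = (call);                                       \
    if (err_ != cudaSuccess) {                                       \
      fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,             \
              cudaGetErrorString(err_));                             \
      exit(EXIT_FAILURE);                                            \
    }                                                                \
  } while (0)

// Hypothetical stand-in for the sample's matrix-multiply kernel.
__global__ void scale(float *data, int n, float factor) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] *= factor;
}

static void launch(float *d, int n, cudaStream_t s) {
  scale<<<(n + 255) / 256, 256, 0, s>>>(d, n, 2.0f);
  CHECK(cudaGetLastError());
}

static void fill(float *h, int n, float v) {
  for (int i = 0; i < n; ++i) h[i] = v;
}

static bool allEqual(const float *h, int n, float v) {
  for (int i = 0; i < n; ++i)
    if (h[i] != v) return false;
  return true;
}

// "UMhint" / "UMeasy": one managed pointer, optionally advised and prefetched.
bool runManaged(int n, bool useHints) {
  int dev = 0, concurrent = 0;
  CHECK(cudaGetDevice(&dev));
  CHECK(cudaDeviceGetAttribute(&concurrent,
                               cudaDevAttrConcurrentManagedAccess, dev));
  size_t bytes = n * sizeof(float);
  float *m = nullptr;
  CHECK(cudaMallocManaged((void **)&m, bytes));
  fill(m, n, 1.0f);
  bool hint = useHints && concurrent;  // hints need concurrent managed access
  if (hint) {
    CHECK(cudaMemAdvise(m, bytes, cudaMemAdviseSetPreferredLocation, dev));
    CHECK(cudaMemPrefetchAsync(m, bytes, dev, 0));  // migrate before the kernel
  }
  launch(m, n, 0);
  if (hint) {
    CHECK(cudaMemPrefetchAsync(m, bytes, cudaCpuDeviceId, 0));  // bring back
  }
  CHECK(cudaDeviceSynchronize());
  bool ok = allEqual(m, n, 2.0f);
  CHECK(cudaFree(m));
  return ok;
}

// "0Copy": the kernel reads and writes mapped, page-locked host memory.
bool runZeroCopy(int n) {
  size_t bytes = n * sizeof(float);
  float *h = nullptr, *d = nullptr;
  CHECK(cudaHostAlloc((void **)&h, bytes, cudaHostAllocMapped));
  fill(h, n, 1.0f);
  CHECK(cudaHostGetDevicePointer((void **)&d, h, 0));
  launch(d, n, 0);
  CHECK(cudaDeviceSynchronize());
  bool ok = allEqual(h, n, 2.0f);
  CHECK(cudaFreeHost(h));
  return ok;
}

// "CpAsync" / "CpPglAs": explicit async copies from pageable or pinned memory.
bool runCopy(int n, bool pinned) {
  size_t bytes = n * sizeof(float);
  float *h = nullptr, *d = nullptr;
  if (pinned) {
    CHECK(cudaMallocHost((void **)&h, bytes));
  } else {
    h = (float *)malloc(bytes);
    if (h == nullptr) exit(EXIT_FAILURE);
  }
  fill(h, n, 1.0f);
  cudaStream_t s;
  CHECK(cudaStreamCreate(&s));
  CHECK(cudaMalloc((void **)&d, bytes));
  CHECK(cudaMemcpyAsync(d, h, bytes, cudaMemcpyHostToDevice, s));
  launch(d, n, s);
  CHECK(cudaMemcpyAsync(h, d, bytes, cudaMemcpyDeviceToHost, s));
  CHECK(cudaStreamSynchronize(s));
  bool ok = allEqual(h, n, 2.0f);
  CHECK(cudaFree(d));
  CHECK(cudaStreamDestroy(s));
  if (pinned) { CHECK(cudaFreeHost(h)); } else { free(h); }
  return ok;
}
```

Each function returns `true` when every element was doubled by the kernel, so they can be called from a small `main` and built with `nvcc`. Wrapping the calls in `cudaEventRecord` / `cudaEventElapsedTime` would be one way to time them, though the real sample's measurement method may differ.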

    Test results:

    • (Dell Precision 5520) Device 0: "Quadro M1200" (Maxwell cc5.0)


    • Jetson Xavier: compute capability 7.2 (Volta)

    • (Mechrevo S1) MX150 (Pascal)
