Boosting AI Inference Throughput with OpenVINO's AsyncInferQueue Class

Author: LabVIEW_Python | Published 2023-01-05 13:55

    This article introduces OpenVINO's asynchronous inference queue class, AsyncInferQueue, which launches multiple (>2) inference requests (infer requests) to further improve the throughput of an AI inference program without any additional hardware investment.

    The OpenVINO Runtime uses inference requests (infer requests) to abstract running a compiled model (Compiled_Model) on a specified compute device. From a programming perspective, an inference request is a class that encapsulates the properties and methods for running inference either synchronously or asynchronously.

    The OpenVINO Runtime provides the AsyncInferQueue class to abstract and manage a pool of asynchronous inference requests. Its commonly used methods and properties include:

    • __init__(self, compiled_model, jobs = 0): creates an AsyncInferQueue object; jobs is the number of inference requests in the pool (0 lets OpenVINO pick an optimal number)
    • set_callback(callback): sets a unified callback function for all inference requests in the pool
    • start_async(inputs, userdata = None): starts asynchronous inference on the next idle request in the pool
    • wait_all(): waits for all inference requests to finish
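    Taken together, these four calls form a simple submit-and-collect pattern. As a minimal sketch of that pattern in pure Python (no OpenVINO required), a ThreadPoolExecutor can play the role of the request pool and a shared callback can collect results; fake_infer below is a hypothetical stand-in for running a compiled model:

```python
from concurrent.futures import ThreadPoolExecutor

def fake_infer(x):
    # Hypothetical stand-in for running a compiled model on one input
    return x * x

results = {}

def callback(future, userdata):
    # Unified callback for every job, like set_callback(): store each
    # result keyed by the userdata passed at submission time
    results[userdata] = future.result()

# Pool of 4 workers, analogous to AsyncInferQueue(compiled_model, jobs=4)
with ThreadPoolExecutor(max_workers=4) as pool:
    for i in range(8):
        # Analogous to start_async(inputs, userdata=i)
        fut = pool.submit(fake_infer, i)
        fut.add_done_callback(lambda f, i=i: callback(f, i))
# Leaving the with-block joins all workers, analogous to wait_all()
print(results)  # e.g. {0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49}
```

    The key point the analogue shares with AsyncInferQueue is that the caller never picks a specific worker: it just submits work, and the pool dispatches it to whichever slot is free.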

    An example program for asynchronous inference of a YOLOv5 model based on the AsyncInferQueue class: yolov5_async_infer_queue.py

    ...
    def preprocess(frame):
        # Preprocess the frame
        letterbox_im, _, _ = letterbox(frame, auto=False)  # preprocess frame by letterbox
        im = letterbox_im.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
        im = np.float32(im) / 255.0    # 0 - 255 to 0.0 - 1.0
        blob = im[None]  # expand for batch dim
        return blob, letterbox_im.shape[:-1], frame.shape[:-1]
    def postprocess(ireq: InferRequest, user_data: tuple):
        result = ireq.results[ireq.model_outputs[0]]
        dets = non_max_suppression(torch.tensor(result))[0].numpy()
        bboxes, scores, class_ids = dets[:, :4], dets[:, 4], dets[:, 5]
        # Rescale the coordinates from the letterbox shape back to the original frame shape
        bboxes = scale_coords(user_data[1], bboxes, user_data[2]).astype(int)
        print(user_data[0], "\t" + f"{ireq.latency:.3f}" + "\t", class_ids)
        return
    # Step1: Initialize OpenVINO Runtime Core
    core = Core()
    # Step2: Build the compiled model
    device = ['GPU.0', 'GPU.1', 'CPU', 'AUTO', 'AUTO:GPU,-CPU'][0]
    cfgs = {}
    cfgs['PERFORMANCE_HINT'] = ['THROUGHPUT', 'LATENCY', 'CUMULATIVE_THROUGHPUT'][0]
    net = core.compile_model("yolov5s.xml", device, cfgs)
    output_node = net.outputs[0]
    b, c, input_h, input_w = net.inputs[0].shape
    # Step3:  Initialize InferQueue
    ireqs = AsyncInferQueue(net)
    print('Number of infer requests in InferQueue:', len(ireqs))
    # Step3.1: Set unified callback on all InferRequests from queue's pool
    ireqs.set_callback(postprocess)
    # Step4:  Read the images
    image_folder = "./data/images/"
    image_files = os.listdir(image_folder)
    print(image_files)
    frames = []
    for image_file in image_files:
        frame = cv2.imread(os.path.join(image_folder, image_file))
        frames.append(frame)
    # Step4.1: Warm up — issue one request per slot in the queue
    for id in range(len(ireqs)):
        # Preprocess the frame
        start = perf_counter()
        blob, letterbox_shape, frame_shape = preprocess(frames[id % len(frames)])
        end = perf_counter()
        print(f"Preprocess {id}: {(end - start):.4f}.")
        # Run asynchronous inference using the next available InferRequest from the pool
        ireqs.start_async({0: blob}, (id, letterbox_shape, frame_shape))
    ireqs.wait_all()
    # Step5: Benchmark the async inference
    start = perf_counter()
    in_fly = set()
    latencies = []
    niter = 16
    for i in range(niter):
        # Preprocess the frame
        blob, letterbox_shape, frame_shape = preprocess(frames[i % len(frames)])
        # Block until an InferRequest in the pool becomes idle
        idle_id = ireqs.get_idle_request_id()
        if idle_id in in_fly:
            latencies.append(ireqs[idle_id].latency)
        else:
            in_fly.add(idle_id)
        # Run asynchronous inference using the next available InferRequest from the pool
        ireqs.start_async({0: blob}, (i, letterbox_shape, frame_shape))
    ireqs.wait_all()
    # Collect the latencies of the requests still in flight, then report throughput
    duration = perf_counter() - start
    for idle_id in in_fly:
        latencies.append(ireqs[idle_id].latency)
    print(f"Throughput: {niter / duration:.2f} FPS, mean latency: {sum(latencies) / len(latencies):.2f} ms")
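    As a side check, the tensor-layout steps inside preprocess() above (BGR to RGB, HWC to CHW, scaling to [0, 1], adding a batch dimension) can be verified in isolation with plain NumPy; the 4×4 frame below is a dummy stand-in for a camera image:

```python
import numpy as np

frame = np.random.randint(0, 256, (4, 4, 3), dtype=np.uint8)  # dummy HWC BGR image

im = frame.transpose((2, 0, 1))[::-1]  # HWC -> CHW, then reverse channel axis: BGR -> RGB
im = np.float32(im) / 255.0            # 0 - 255 to 0.0 - 1.0
blob = im[None]                        # add batch dimension: CHW -> NCHW

print(blob.shape)  # (1, 3, 4, 4)
# After the channel flip, RGB channel 0 must equal BGR channel 2 (red)
assert np.allclose(blob[0, 0], frame[:, :, 2] / 255.0)
```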
    

    Results on a Serpent Canyon NUC:

    YOLOv5s: 158 FPS @ Intel Arc A770M
    Conclusion: using the OpenVINO™ Runtime AsyncInferQueue class can greatly improve the throughput of an AI inference program.


    Original link: https://www.haomeiwen.com/subject/yacjcdtx.html