TensorFlow 2.0: Using keras_cv and YOLOv8


Author: FredricZhu | Published 2024-04-20 12:10

This example extends the previous one, but it first requires downloading a video from YouTube.
For the rest of the environment setup, see the previous article: https://www.jianshu.com/p/3ac3f54636f8
Download yt-dlp, the tool used to fetch the YouTube video file. yt-dlp download link:
https://gitlab.com/zhuge20100104/cpp_practice/-/blob/master/simple_learn/deep_learning/15_tensorflow_object_detection_api_in_video/yt-dlp?ref_type=heads

Write a download.sh script to simplify the download step. The script is as follows:

#!/bin/bash

if [[ "$#" -lt 1 ]]; then 
    echo "Usage: ./download.sh {YouTube video link}"
    exit 1
fi

./yt-dlp "${1}"  --proxy http://10.224.0.110:3128 --yes-playlist -f best

Use download.sh to download the cat video needed for detection:

chmod a+x ./download.sh
./download.sh https://www.youtube.com/watch?v=IzluNxh-8_o

When the download finishes, rename the video to cats.mp4, matching the input_video = 'cats' filename used in the code below.

Note that all of the steps above need a network that can reach YouTube; don't ask why.

Also, download.sh must run in a Linux environment; running it directly inside the TensorFlow Docker container works well.

The code to detect objects in the video is below.
The full notebook is available here:
https://gitlab.com/zhuge20100104/cpp_practice/-/blob/master/simple_learn/deep_learning/15_tensorflow_object_detection_api_in_video/15.%20Tensorflow%20Objection%20API%20in%20Video.ipynb?ref_type=heads

# imports
import os
os.environ['KERAS_BACKEND'] = 'jax'  # must be set before keras is first imported

import tensorflow as tf

from tensorflow import data as tf_data
import tensorflow_datasets as tfds
import keras
import keras_cv
from keras_cv import bounding_box
from keras_cv import visualization
import numpy as np
import tqdm

# 3. env setup
%matplotlib inline

# For more details, see: https://keras.io/guides/keras_cv/object_detection_keras_cv/

# Let's get started by constructing a YOLOV8Detector pretrained on the pascalvoc dataset.
pretrained_model = keras_cv.models.YOLOV8Detector.from_preset(
    "yolo_v8_m_pascalvoc", bounding_box_format="xywh"
)
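Here bounding_box_format="xywh" means each box is [x, y, width, height] with (x, y) at the top-left corner (keras_cv uses "center_xywh" for the center-anchored variant). A minimal, keras_cv-independent sketch of how such a box maps to corner coordinates:

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] (top-left anchored) to
    corner coordinates [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

# A 200x120 box whose top-left corner sits at (100, 50):
print(xywh_to_xyxy([100, 50, 200, 120]))  # → [100, 50, 300, 170]
```

For real tensors of boxes, keras_cv.bounding_box.convert_format does these conversions; the helper above only illustrates the geometry.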

# Resize the image to the model-compatible input size
inference_resizing = keras_cv.layers.Resizing(
    640, 640, pad_to_aspect_ratio=True, bounding_box_format='xywh'
)
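Because pad_to_aspect_ratio=True, each frame is scaled to fit inside 640x640 without distortion and the leftover area is padded. A pure-Python sketch of that arithmetic (letterbox_size is my illustrative helper, not a keras_cv API):

```python
def letterbox_size(h, w, target=640):
    """Scaled size that fits (h, w) inside a target x target square
    while preserving aspect ratio; the remainder becomes padding."""
    scale = min(target / h, target / w)
    new_h, new_w = round(h * scale), round(w * scale)
    return (new_h, new_w), (target - new_h, target - new_w)

# A 1280x720 video frame scales to 640x360 and gets 280 rows of padding:
print(letterbox_size(720, 1280))  # → ((360, 640), (280, 0))
```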

# keras_cv.visualization.plot_bounding_box_gallery() supports a class_mapping parameter to
# highlight what class each box was assigned to. Let's assemble a class mapping now.

class_ids = [
    "Aeroplane",
    "Bicycle",
    "Bird",
    "Boat",
    "Bottle",
    "Bus",
    "Car",
    "Cat",
    "Chair",
    "Cow",
    "Dining Table",
    "Dog",
    "Horse",
    "Motorbike",
    "Person",
    "Potted Plant",
    "Sheep",
    "Sofa",
    "Train",
    "Tvmonitor",
    "Total",
]

class_mapping = dict(zip(range(len(class_ids)), class_ids))
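predict() returns integer class indices into this mapping; in keras_cv's padded bounding-box output, unused slots are filled with -1. A small standalone sketch of decoding indices back to names (demo_class_ids is a truncated copy and decode_classes is my helper, not a library function):

```python
demo_class_ids = ["Aeroplane", "Bicycle", "Bird"]  # first three entries only
demo_mapping = dict(zip(range(len(demo_class_ids)), demo_class_ids))

def decode_classes(pred_classes, mapping):
    # -1 marks padded (empty) slots in the fixed-size prediction output
    return [mapping[int(c)] for c in pred_classes if c != -1]

print(decode_classes([2, 0, -1, -1], demo_mapping))  # → ['Bird', 'Aeroplane']
```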

import imageio
from datetime import datetime

input_video = 'cats'

video_reader = imageio.get_reader('{}.mp4'.format(input_video))
video_writer = imageio.get_writer('{}_annotated.mp4'.format(input_video), fps=10)

t0 = datetime.now()
n_frames = 0
for frame in video_reader:
    if n_frames > 10000:
        break
    n_frames += 1
    # print(frame.shape)
    # This can be used as our inference preprocessing pipeline:
    image_batch = inference_resizing([frame])
    y_pred = pretrained_model.predict(image_batch)
    # The resulting annotated image can be written straight to the output video
    image_with_boxes = visualization.draw_bounding_boxes(
        image_batch,
        bounding_boxes=y_pred,
        color=(0, 255, 0),
        bounding_box_format="xywh",
        class_mapping=class_mapping,
    )
    
    image_with_boxes = image_with_boxes.reshape(640, 640, 3)
    video_writer.append_data(image_with_boxes)
fps = n_frames/(datetime.now() - t0).total_seconds()
print('Frames processed: {}, speed: {} fps'.format(n_frames, fps))
# Clean up 
video_writer.close()
video_reader.close()
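One detail worth noting: the writer was created with fps=10, so the output clip plays back at 10 frames per second regardless of the source video's rate. The resulting clip duration is just frame count over playback rate:

```python
def output_duration_seconds(n_frames, writer_fps=10):
    """Duration of the written clip: frames divided by playback rate."""
    return n_frames / writer_fps

# 300 processed frames written at fps=10 yield a 30-second clip:
print(output_duration_seconds(300))  # → 30.0
```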

The annotated output video looks like this:

(figure: a sample frame from cats_annotated.mp4 with labeled bounding boxes)


Original post: https://www.haomeiwen.com/subject/uohpxjtx.html