The Chinese University of Hong Kong Multimedia Laboratory | An Open-Source Video Object Detection and Tracking Platform

Posted by: CV研究院 (CV Research Institute)  Date: 2021-01-11  Source: Engineer


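Since last year (2020), most people who talk about object detection have probably heard of the MMDetection framework. Today's framework is again a contribution from the Chinese University of Hong Kong lab. We first go over MMDetection and then introduce the unified video perception platform MMTracking in detail.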

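Since its release, MMDetection V1.0 has been well received by many users, who offered many valuable suggestions, and many developers contributed code. MMDetection V2.0 was released on May 6, 2020.

After each component of the model was refactored and optimized, the speed and accuracy of MMDetection improved across the board, reaching the best level among existing detection frameworks. With a finer-grained modular design, MMDetection became much easier to extend to new tasks and turned into a base platform for detection-related projects. The documentation and tutorials were also improved to give users a better experience.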

MMDetection implements networks and frameworks such as RPN, Fast R-CNN, Faster R-CNN, and Mask R-CNN. Here is a brief comparison with Detectron:

Slightly higher performance
Slightly faster training
Slightly lower GPU memory usage

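More importantly, a PyTorch-based codebase is a generation ahead of a Caffe2-based one in ease of use. In the time it takes to install Detectron successfully, you could probably install mmdetection a dozen times.

Of course, Detectron also has clear advantages: it was the first comprehensive detection codebase, it carries the FAIR name, and its released models are fairly complete. The researchers are working to expand the model zoo as well, but the gap in manpower and computing power is still large, so this will take time.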

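Let's go through the three points above. First, performance. The ResNet structure in the official PyTorch model zoo differs slightly from the ResNet used by Detectron (in mmdetection this can be specified through the backbone's style parameter), which leads to different convergence speeds, so experiments were run with both structures. Generally, under the 1x lr schedule the Detectron structure scores higher, while under the 2x schedule the PyTorch structure scores higher.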
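The style parameter mentioned above can be illustrated with a small config fragment. This is only a sketch in the dict style that MMDetection configs use: the helper `resnet50_backbone` is hypothetical, and every field value other than `style` is an illustrative assumption rather than a recommended setting.

```python
# Sketch of an MMDetection-style ResNet-50 backbone config fragment.
# `resnet50_backbone` is a hypothetical helper, not part of MMDetection.
def resnet50_backbone(style):
    """Return a backbone config dict for the given ResNet variant.

    style="pytorch": the stride-2 convolution is the 3x3 conv of each block.
    style="caffe":   the stride-2 convolution is the first 1x1 conv,
                     which matches the ResNet variant used by Detectron.
    """
    if style not in ("pytorch", "caffe"):
        raise ValueError(f"unknown style: {style!r}")
    return dict(
        type="ResNet",
        depth=50,
        num_stages=4,               # assumed value for illustration
        out_indices=(0, 1, 2, 3),   # assumed value for illustration
        style=style,
    )


pytorch_backbone = resnet50_backbone("pytorch")
caffe_backbone = resnet50_backbone("caffe")
print(pytorch_backbone["style"], caffe_backbone["style"])
```

Only the `style` entry differs between the two dicts, which is what makes it practical to run the same experiment with both structures.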

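In terms of speed, the gap is large for Mask R-CNN and small for the other models. With the same settings, Detectron needs 0.89 s per iteration while mmdetection needs only 0.69 s. Fast R-CNN is the exception: it is slightly slower than in Detectron. In addition, running Detectron on the researchers' own servers is about 20% slower than the officially reported speed; the guess is that FB's Big Basin servers perform better than the researchers' machines.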
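To put the two Mask R-CNN timings in perspective, the short calculation below converts them into a speedup factor and a per-iteration time saving. It uses only the 0.89 s and 0.69 s figures quoted above; the variable names are illustrative.

```python
# Per-iteration times for Mask R-CNN quoted in the article (seconds).
detectron_s_per_iter = 0.89
mmdet_s_per_iter = 0.69

# How many times faster one iteration is in mmdetection.
speedup = detectron_s_per_iter / mmdet_s_per_iter

# Fraction of the Detectron iteration time that is saved.
time_saved = 1 - mmdet_s_per_iter / detectron_s_per_iter

print(f"speedup: {speedup:.2f}x")                      # speedup: 1.29x
print(f"time saved per iteration: {time_saved:.1%}")   # time saved per iteration: 22.5%
```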

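The advantage in GPU memory is more obvious: usage is about 30% lower. This is partly due to the underlying framework, though, and is not entirely the result of codebase optimization. One result that surprised the researchers is that the current version of the codebase can fit 4 images on each 12 GB card when running Mask R-CNN with ResNet-50, which is much less memory than it needed when the researchers used it in competition.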

MMTracking

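MMDetection is a deep learning object detection toolbox implemented in PyTorch and open-sourced by SenseTime (winner of the 2018 COCO object detection challenge) and the Chinese University of Hong Kong.

In the new year, 2021, OpenMMLab from the Multimedia Laboratory (MMLab) of the Chinese University of Hong Kong has developed and contributed another platform tool, releasing the unified video perception platform MMTracking. The framework is written in PyTorch, supports single object tracking, multiple object tracking, and video object detection, and is already open source. Let's look at it in detail.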

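Main features:

The first unified video perception platform

MMTracking is the first open-source toolbox that unifies versatile video perception tasks, including video object detection, single object tracking, and multiple object tracking.

Modular design

MMTracking decomposes the video perception framework into different components, so a customized method can easily be built by combining different modules.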

Simple, Fast and Strong

Simple: MMTracking interoperates with other OpenMMLab projects. It is built on MMDetection, so a detector can be selected just by modifying the config file.
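As a rough illustration of selecting a detector through the config alone, the sketch below swaps the detector entry of a nested config dict. It is not an actual MMTracking config: the `Tracker` type, the `with_detector` helper, and the field layout are assumptions made for this example, and only the detector names `FasterRCNN` and `RetinaNet` come from MMDetection.

```python
import copy

# Hypothetical tracker config written in the OpenMMLab dict style.
# The "Tracker" type and the overall layout are assumptions for illustration.
base_config = dict(
    model=dict(
        type="Tracker",
        detector=dict(
            type="FasterRCNN",
            backbone=dict(type="ResNet", depth=50),
        ),
    )
)


def with_detector(config, detector_cfg):
    """Return a copy of `config` whose detector entry is replaced."""
    new_config = copy.deepcopy(config)  # leave the base config untouched
    new_config["model"]["detector"] = detector_cfg
    return new_config


retinanet_config = with_detector(
    base_config,
    dict(type="RetinaNet", backbone=dict(type="ResNet", depth=50)),
)
print(base_config["model"]["detector"]["type"])
print(retinanet_config["model"]["detector"]["type"])
```

The tracking logic never appears in the snippet: only the config changes, which is the point the "Simple" feature makes.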

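Fast: All operations run on the GPU. Training and inference are faster than in other implementations.

Strong: It reproduces state-of-the-art models, and some of them even outperform the official implementations.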

How to use:

1. Create a conda virtual environment and activate it.

conda create -n open-mmlab python=3.7 -y

conda activate open-mmlab

2. Install PyTorch and torchvision following the official instructions, e.g.,

conda install pytorch torchvision -c pytorch

Note: Make sure that your compilation CUDA version and runtime CUDA version match. You can check the supported CUDA version for precompiled packages on the PyTorch website.

E.g. 1: If you have CUDA 10.1 installed under /usr/local/cuda and would like to install PyTorch 1.5, you need to install the prebuilt PyTorch with CUDA 10.1.

conda install pytorch cudatoolkit=10.1 torchvision -c pytorch

E.g. 2: If you have CUDA 9.2 installed under /usr/local/cuda and would like to install PyTorch 1.3.1, you need to install the prebuilt PyTorch with CUDA 9.2.

conda install pytorch=1.3.1 cudatoolkit=9.2 torchvision=0.4.2 -c pytorch

If you build PyTorch from source instead of installing the prebuilt package, you can use more CUDA versions such as 9.0.
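The version-matching note above can be checked with a few lines of Python. This is a minimal sketch: `same_cuda_version` is a hypothetical helper that compares only the major and minor version numbers, and the PyTorch query at the end runs only if PyTorch is already installed.

```python
def same_cuda_version(compile_version, runtime_version):
    """Return True if two CUDA version strings share major.minor.

    Hypothetical helper for illustration; a patch component such as
    the "243" in "10.1.243" is ignored.
    """
    def major_minor(version):
        # Pad with "0" so that "10" is treated as "10.0".
        parts = (version.strip().split(".") + ["0"])[:2]
        return tuple(int(part) for part in parts)

    return major_minor(compile_version) == major_minor(runtime_version)


print(same_cuda_version("10.1", "10.1.243"))  # True
print(same_cuda_version("10.1", "9.2"))       # False

try:
    import torch

    # CUDA version the installed PyTorch binary was built with
    # (None for CPU-only builds).
    print("PyTorch built with CUDA:", torch.version.cuda)
    print("CUDA usable at runtime:", torch.cuda.is_available())
except ImportError:
    print("PyTorch is not installed in this environment")
```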

3. Install mmcv-full. We recommend installing the pre-built package as below.

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html

See here for the different versions of MMCV compatible with different PyTorch and CUDA versions. Optionally, you can compile mmcv from source with the following commands:

git clone https://github.com/open-mmlab/mmcv.git

cd mmcv

MMCV_WITH_OPS=1 pip install -e . # package mmcv-full will be installed after this step

cd ..

Or directly run

pip install mmcv-full

4. Install MMDetection

pip install mmdet

Optionally, you can also build MMDetection from source in case you want to modify the code:

git clone https://github.com/open-mmlab/mmdetection.git

cd mmdetection

pip install -r requirements/build.txt

pip install -v -e . # or "python setup.py develop"

5. Clone the MMTracking repository.

git clone https://github.com/open-mmlab/mmtracking.git

cd mmtracking

6. Install build requirements and then install MMTracking.

pip install -r requirements/build.txt

pip install -v -e . # or "python setup.py develop"
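After the six steps, a quick import check can confirm that the packages are visible in the active environment. The sketch below only tests whether each module can be located; it does not import them or verify version compatibility. The helper name `find_missing` is hypothetical, and the module list assumes the import names `mmcv`, `mmdet`, and `mmtrack`.

```python
import importlib.util


def find_missing(module_names):
    """Return the names in `module_names` that cannot be located."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Import names assumed for the packages installed in steps 2 to 6.
required = ["torch", "torchvision", "mmcv", "mmdet", "mmtrack"]
missing = find_missing(required)

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```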

Testing with the platform:

This section will show how to test existing models on supported datasets. The following testing environments are supported:

single GPU
single node multiple GPU
multiple nodes

During testing, different tasks share the same API and we only support samples_per_gpu = 1.

You can use the following commands for testing:

# single-gpu testing

python tools/test.py ${CONFIG_FILE} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

# multi-gpu testing

./tools/dist_test.sh ${CONFIG_FILE} ${GPU_NUM} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
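When the two commands above are launched from a script, it can help to assemble the argument list in code. The sketch below does only that: `build_test_command` is a hypothetical helper, the file names in the usage lines are placeholders, and passing several metrics to `--eval` as separate space-separated values is an assumption.

```python
import shlex


def build_test_command(config_file, checkpoint=None, out=None,
                       eval_metrics=None, gpu_num=None):
    """Build the argument list for single-GPU or multi-GPU testing.

    With gpu_num=None the single-GPU form (tools/test.py) is used,
    otherwise the multi-GPU launcher (tools/dist_test.sh).
    """
    if gpu_num is None:
        cmd = ["python", "tools/test.py", config_file]
    else:
        cmd = ["./tools/dist_test.sh", config_file, str(gpu_num)]
    if checkpoint:
        cmd += ["--checkpoint", checkpoint]
    if out:
        cmd += ["--out", out]
    if eval_metrics:
        # Assumption: metrics are passed as separate values after --eval.
        cmd += ["--eval", *eval_metrics]
    return cmd


# Placeholder file names, not real paths from the repository.
single = build_test_command("example_config.py",
                            checkpoint="example.pth",
                            eval_metrics=["bbox"])
multi = build_test_command("example_config.py", gpu_num=8,
                           eval_metrics=["bbox", "track"])

print(" ".join(shlex.quote(arg) for arg in single))
print(" ".join(shlex.quote(arg) for arg in multi))
```

The resulting lists can be handed to `subprocess.run` from the repository root; keeping them as lists avoids shell-quoting problems with unusual paths.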

Optional arguments:

CHECKPOINT_FILE: Filename of the checkpoint. For some MOT methods you do not need to pass it; specify the checkpoints in the config instead.

RESULT_FILE: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.

EVAL_METRICS: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., bbox is available for ImageNet VID, track is available for LaSOT, bbox and track are both suitable for MOT17.

--cfg-options: If specified, the given key-value pairs will be merged into the config file.

--eval-options: If specified, the given key-value pairs will be passed as kwargs to the dataset.evaluate() function. It is only used for evaluation.

--format-only: If specified, the results will be formatted to the official format.
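To make the idea of merging key-value pairs into a config more concrete, here is a minimal pure-Python sketch. It is not the code the test script runs: `merge_options` is a hypothetical helper, and the dotted-key convention (`data.workers_per_gpu`) and the field names in the sample config are assumptions for illustration.

```python
import copy


def merge_options(config, options):
    """Return a copy of `config` with dotted-key overrides applied.

    A key such as "data.workers_per_gpu" addresses
    config["data"]["workers_per_gpu"]; missing levels are created.
    """
    merged = copy.deepcopy(config)  # never modify the loaded config
    for dotted_key, value in options.items():
        *parents, leaf = dotted_key.split(".")
        node = merged
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return merged


# Sample config; samples_per_gpu stays at 1, the only value
# supported during testing.
config = {"data": {"samples_per_gpu": 1, "workers_per_gpu": 2}}
merged = merge_options(config, {"data.workers_per_gpu": 0})

print(config["data"])
print(merged["data"])
```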