Table Of Contents
The regionPlugin
is specifically used to generate encoded bounding box predictions, encoded bounding box objectness, and probabilities of bounding box being candidate objects for the YOLOv2 object detection model in TensorRT.
The regionPlugin
takes one input and generates one output.
The input has a shape of [N, C, H, W]
, where:
N
is the batch sizeC
is the number of channels in the input tensor. For example, C = num * (coords + 1 + classes)
.H
is the height of feature mapW
is the width of feature mapThe information order of the channels are:
[t_x, t_y, t_w, t_h, t_o, d_1, d_2, ..., d_classes] for bbox_1
[t_x, t_y, t_w, t_h, t_o, d_1, d_2, ..., d_classes] for bbox_2
, and so on[t_x, t_y, t_w, t_h, t_o, d_1, d_2, ..., d_classes]
for bbox_num
, totalling num * (coords + 1 + classes)
channels.Specifically:
t_x, t_y, t_w, t_h
are the predicted offsets of bounding boxes before sigmoid activationt_o
is the predicted objectness of the bounding box before sigmoid activation (see YOLOv2 paper)d_1, d_2, ..., d_classes
are the digits for each candidate class before the Softmax activation.The output has the same shape as the input, in other words, [N, C, H, W]
. The information order of the channels are:
[sigmoid(t_x), sigmoid(t_y), t_w, t_h, sigmoid(t_o), p_1, p_2, ..., p_classes]
for bbox_1
[sigmoid(t_x), sigmoid(t_y), t_w, t_h, sigmoid(t_o), p_1, p_2, ..., p_classes]
for bbox_2
, and so on[sigmoid(t_x), sigmoid(t_y), t_w, t_h, sigmoid(t_o), p_1, p_2, ..., p_classes]
for bbox_num
, totalling num * (coords + 1 + classes)
channelsSpecifically:
sigmoid(t_x), sigmoid(t_y)
, and sigmoid(t_o)
are sigmoid activated t_x, t_y
, and t_o
from the inputp_1, p_2, ..., p_classes
are the probability for each candidate class after the Softmax activation.Note: t_w
and t_h
from the input remain unchanged.
The regionPlugin
has a plugin creator class RegionPluginCreator
and plugin class Region
.
The following parameters were used to create the Region
instance.
Type | Parameter | Description |
---|---|---|
int | num | The number of predicted bounding box for each grid cell. |
int | coords | The number of coordinates for the bounding box. This value has to be 4 . Other values for coords are not supported currently. |
int | classes | The number of candidate classes to be predicted. |
smTree | softmaxTree | When performing yolo9000, softmaxTree is helping to perform Softmax on confidence scores, for example, to get the precise candidate classes through the word-tree structured candidate classes definition. softmaxTree is not required for non-hierarchical classification. The definition of softmaxTree can be found in NvInferPlugin.h and here. |
bool | hasSoftmaxTree | If softmaxTree is not nullptr , it is true ; else it is false . |
int | C | The number of channels in the input tensor. C = num * (coords + 1 + classes) has to be satisfied. |
int | H | The height of the input tensor (feature map). |
int | W | The width of the input tensor (feature map). |
The following resources provide a deeper understanding of the regionPlugin
plugin:
Networks
For terms and conditions for use, reproduction, and distribution, see the TensorRT Software License Agreement documentation.
May 2019 This is the first release of this README.md
file.
There are no known issues in this plugin.