This sample, sampleUffFasterRCNN, serves as a demo of how to use a TensorFlow-based Faster R-CNN model. It uses the `Proposal` and `CropAndResize` TensorRT plugins to implement the proposal layer and ROIPooling layer as custom layers, since TensorRT has no native support for them.
The UFF Faster R-CNN network performs the task of object detection and localization in a single forward pass of the network. The Faster R-CNN network was trained on the ResNet-10 backbone (feature extractor) to detect 4 classes of objects: `Automobile`, `Roadsign`, `Bicycle` and `Person`, along with the `background` class (nothing).
This sample makes use of TensorRT plugins to run the UFF Faster R-CNN network. To use these plugins, the TensorFlow graph needs to be preprocessed, and we use the GraphSurgeon utility to do this.
The main components of this network are the Image Preprocessor, FeatureExtractor, Region Proposal Network (RPN), Proposal, ROIPooling (CropAndResize), Classifier and Postprocessor.
Image Preprocessor The image preprocessor step of the graph is responsible for resizing the image. The image is resized to a 3x272x480 (CHW) tensor. This step also performs per-channel mean value subtraction on the images. After preprocessing, the input image's channel order is BGR instead of RGB.
FeatureExtractor The FeatureExtractor portion of the graph runs the ResNet-10 network on the preprocessed image. The generated feature maps are used by the RPN layer and the Proposal layer to generate the Regions of Interest (ROIs) that may contain objects. As a second branch, the feature maps are also used by the ROIPooling (or, more precisely, CropAndResize) layer to crop out patches from the feature maps at the ROIs output from the Proposal layer.
In this network, the feature maps come from an intermediate layer's output in the ResNet-10 backbone. The intermediate layer has a cumulative stride of 16.
Region Proposal Network (RPN) The RPN takes the feature maps from the stride-16 backbone and appends a small Convolutional Neural Network (CNN) head after it to detect whether a specific region of the image contains an object. It also outputs rough coordinates of the candidate objects.
Proposal The Proposal layer takes the output of the RPN and refines the candidate boxes from the RPN. The refinement includes taking the top boxes with the highest confidence and applying NMS (non-maximum suppression) to them, and finally taking the top boxes again according to their confidence after the NMS operation.
This operation is implemented in the `Proposal` plugin as a TensorRT plugin.
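To make the refinement concrete, the sketch below shows top-K selection followed by greedy NMS. The box representation, thresholds, and function names are illustrative assumptions; they do not mirror the plugin's actual implementation.

```cpp
#include <algorithm>
#include <vector>

struct Box { float x1, y1, x2, y2, score; };

// Intersection-over-union between two boxes.
static float iou(const Box& a, const Box& b)
{
    const float ix1 = std::max(a.x1, b.x1), iy1 = std::max(a.y1, b.y1);
    const float ix2 = std::min(a.x2, b.x2), iy2 = std::min(a.y2, b.y2);
    const float iw = std::max(0.f, ix2 - ix1), ih = std::max(0.f, iy2 - iy1);
    const float inter = iw * ih;
    const float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
    const float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
    return inter / (areaA + areaB - inter);
}

// Sketch of the Proposal refinement: keep the preNmsTopN highest-scoring
// boxes, run greedy NMS, then keep at most postNmsTopN survivors.
std::vector<Box> refineProposals(std::vector<Box> boxes, int preNmsTopN,
                                 int postNmsTopN, float iouThreshold)
{
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.score > b.score; });
    if (static_cast<int>(boxes.size()) > preNmsTopN)
        boxes.resize(preNmsTopN);

    std::vector<Box> kept;
    for (const Box& candidate : boxes)
    {
        bool suppressed = false;
        for (const Box& k : kept)
        {
            if (iou(candidate, k) > iouThreshold) { suppressed = true; break; }
        }
        if (!suppressed)
            kept.push_back(candidate);
        if (static_cast<int>(kept.size()) == postNmsTopN)
            break;
    }
    return kept;
}
```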
CropAndResize The CropAndResize layer is the TensorFlow counterpart of the original ROIPooling layer in the Caffe implementation. The CropAndResize layer resizes the ROIs from the Proposal layer to a common target size, and its output is fed to a classifier to distinguish which class each ROI belongs to. The difference between the CropAndResize operation and the ROIPooling operation is that the former uses bilinear interpolation while the latter uses pooling.
This operation is implemented in the `CropAndResize` plugin as a TensorRT plugin.
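For intuition, cropping one ROI from a single feature-map channel with bilinear interpolation might look like the sketch below. The normalized-coordinate convention, function name, and the assumption that the output size is at least 2x2 (for example, 7x7) are all for illustration and are not taken from the plugin's code.

```cpp
#include <cmath>
#include <vector>

// Sketch: crop one ROI (normalized coordinates in [0, 1]) out of a single
// feature-map channel and resize it to outH x outW (both >= 2) with
// bilinear interpolation.
std::vector<float> cropAndResizeChannel(const std::vector<float>& featureMap,
                                        int fmH, int fmW,
                                        float y1, float x1, float y2, float x2,
                                        int outH, int outW)
{
    std::vector<float> out(outH * outW);
    for (int oy = 0; oy < outH; ++oy)
    {
        // Map the output row back to a (fractional) feature-map row.
        const float inY = (y1 + (y2 - y1) * oy / (outH - 1)) * (fmH - 1);
        const int yLo = static_cast<int>(std::floor(inY));
        const int yHi = std::min(yLo + 1, fmH - 1);
        const float wy = inY - yLo;
        for (int ox = 0; ox < outW; ++ox)
        {
            const float inX = (x1 + (x2 - x1) * ox / (outW - 1)) * (fmW - 1);
            const int xLo = static_cast<int>(std::floor(inX));
            const int xHi = std::min(xLo + 1, fmW - 1);
            const float wx = inX - xLo;
            // Bilinear blend of the four neighbouring feature-map values.
            const float top = featureMap[yLo * fmW + xLo] * (1 - wx)
                            + featureMap[yLo * fmW + xHi] * wx;
            const float bottom = featureMap[yHi * fmW + xLo] * (1 - wx)
                               + featureMap[yHi * fmW + xHi] * wx;
            out[oy * outW + ox] = top * (1 - wy) + bottom * wy;
        }
    }
    return out;
}
```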
Classifier The classifier is a small network that takes the output of the CropAndResize layer as input and distinguishes which class each ROI belongs to. Apart from that, it also outputs delta coordinates to refine the coordinates output from the RPN layer.
Postprocessor The Postprocessor applies the delta values from the classifier output to the coordinates from the RPN output and then performs NMS to get the final detection results.
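A common way to apply such deltas is the standard Faster R-CNN box-regression decode sketched below. The exact parameterization used by this sample is not documented in this README, so treat the formulas and names here as an illustrative assumption.

```cpp
#include <cmath>

struct RoiBox { float x1, y1, x2, y2; };

// Sketch of the usual Faster R-CNN decode: (dx, dy) shift the box center in
// units of its width/height, (dw, dh) scale the size exponentially.
RoiBox applyDeltas(const RoiBox& roi, float dx, float dy, float dw, float dh)
{
    const float w = roi.x2 - roi.x1;
    const float h = roi.y2 - roi.y1;
    const float cx = roi.x1 + 0.5f * w;
    const float cy = roi.y1 + 0.5f * h;

    const float newCx = cx + dx * w;
    const float newCy = cy + dy * h;
    const float newW = w * std::exp(dw);
    const float newH = h * std::exp(dh);

    return {newCx - 0.5f * newW, newCy - 0.5f * newH,
            newCx + 0.5f * newW, newCy + 0.5f * newH};
}
```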
Specifically, this sample performs the following steps:
The TensorFlow FasterRCNN graph has some operations that are currently not supported in TensorRT. Using a preprocessor on the graph, we can combine multiple operations in the graph into a single custom operation which can be implemented as a plugin layer in TensorRT. Currently, the preprocessor provides the ability to stitch all nodes within a namespace into one custom node.
To use the preprocessor, the `convert-to-uff` utility should be called with a `-p` flag and a config file. The config script should also include attributes for all custom plugins which will be embedded in the generated `.uff` file. The current sample script for UFF Faster R-CNN is located in `config.py` in this sample.
The generated network has an input node called `input_1`, and the output nodes' names are `dense_class/Softmax`, `dense_regress/BiasAdd` and `proposal`. These nodes are registered by the UFF Parser in the sample.
The input to the UFF Faster R-CNN network in this sample is 3 channel 480x272 images. In the sample, we subtract the per-channel mean values for the input images.
Since TensorRT does not depend on any computer vision libraries, the images are represented in binary R, G, and B values for each pixel. The format is Portable PixMap (PPM), which is a netpbm color image format. In this format, the R, G, and B values for each pixel are each represented by an integer byte (0-255), and they are stored together, pixel by pixel. The channel order of the input image is actually BGR instead of RGB due to the implementation.
There is a simple PPM reading function called `readPPMFile`.
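A minimal binary PPM (P6) reader might look like the following sketch. It is not the sample's `readPPMFile` implementation; it assumes a well-formed header with a maximum value of 255 and no comment lines.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Sketch of a minimal binary PPM (P6) reader. Assumes a well-formed header
// ("P6", width, height, max value 255) with no comment lines.
bool readPpm(const std::string& filename, std::vector<uint8_t>& pixels,
             int& width, int& height)
{
    std::ifstream file(filename, std::ios::binary);
    if (!file)
        return false;

    std::string magic;
    int maxValue = 0;
    file >> magic >> width >> height >> maxValue;
    if (magic != "P6" || maxValue != 255)
        return false;

    file.get(); // consume the single whitespace byte after the header

    pixels.resize(static_cast<size_t>(width) * height * 3); // interleaved RGB
    file.read(reinterpret_cast<char*>(pixels.data()),
              static_cast<std::streamsize>(pixels.size()));
    return static_cast<bool>(file);
}
```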
Details about how to create TensorRT plugins can be found in Extending TensorRT With Custom Layers.
The `config.py` defined for the `convert-to-uff` command should have the custom layers mapped to the plugin names in TensorRT by modifying the op field. The names of the plugin parameters should also exactly match those expected by the TensorRT plugins.
If the `config.py` is defined as above, the NvUffParser will be able to parse the network and call the appropriate plugins with the correct parameters.
Details about some of the plugin layers implemented for UFF Faster R-CNN in TensorRT are given below.
CropAndResize plugin The `CropAndResize` plugin crops out patches from the feature maps according to the ROI coordinates from the Proposal layer and resizes them to a common target size, for example, 7x7. The output tensor is used as the input of the classifier that follows the `CropAndResize` plugin.
Proposal plugin The `Proposal` plugin refines the candidate boxes from the RPN. The refinement includes selecting the top boxes according to their confidence, doing NMS, and finally selecting the top boxes with the highest confidence after NMS.
After the builder is created (see Building An Engine In C++) and the engine is serialized (see Serializing A Model In C++), we can perform inference. Steps for deserialization and running inference are outlined in Performing Inference In C++. The outputs of the UFF FasterRCNN network are human interpretable. The results are visualized by drawing the bounding boxes on the images.
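Deserializing the engine and running inference in C++ generally follows the pattern sketched below. Buffer allocation, the logger implementation, and error handling are omitted, and the exact `deserializeCudaEngine` signature and cleanup style vary between TensorRT versions, so treat this as an outline under those assumptions rather than the sample's code.

```cpp
#include "NvInfer.h"
#include <vector>

// Sketch: deserialize a serialized engine and run one synchronous inference.
// `engineData` holds the serialized engine; `buffers` holds device pointers
// for the input and output bindings (allocation not shown).
void runInference(nvinfer1::ILogger& logger,
                  const std::vector<char>& engineData,
                  void** buffers, int batchSize)
{
    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);

    // Newer TensorRT releases drop the third (plugin factory) argument.
    nvinfer1::ICudaEngine* engine = runtime->deserializeCudaEngine(
        engineData.data(), engineData.size(), nullptr);

    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    // Synchronous execution with implicit batch size, as used by UFF-based
    // samples of this era.
    context->execute(batchSize, buffers);

    // Cleanup style also differs by version (destroy() vs. delete).
    context->destroy();
    engine->destroy();
    runtime->destroy();
}
```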
In this sample, the following layers are used. For more information about these layers, see the TensorRT Developer Guide: Layers documentation.
Activation layer The Activation layer implements element-wise activation functions. Specifically, this sample uses the Activation layer with the type `kRELU`.
Convolution layer The Convolution layer computes a 2D (channel, height, and width) convolution, with or without bias.
FullyConnected layer The FullyConnected layer implements a matrix-vector product, with or without bias.
Padding layer The IPaddingLayer implements spatial zero-padding of tensors along the two innermost dimensions.
Plugin layer Plugin layers are user-defined and provide the ability to extend the functionalities of TensorRT. See Extending TensorRT With Custom Layers for more details.
Pooling layer The Pooling layer implements pooling within a channel. Supported pooling types are `maximum`, `average` and `maximum-average blend`.
Scale layer The Scale layer implements a per-tensor, per-channel, or per-element affine transformation and/or exponentiation by constant values.
SoftMax layer The SoftMax layer applies the SoftMax function on the input tensor along an input dimension specified by the user.
We provide a bash script to download the model as well as other data required for this sample: `./download_model.sh`.
The model is downloaded and unzipped in the directory `uff_faster_rcnn` and the `pb` model is `uff_faster_rcnn/faster_rcnn.pb`.
Along with the `pb` model there are some PPM images and a `list.txt` in the directory. These PPM images are the test images used in this sample. The `list.txt` is used in INT8 mode to list the image names used in the INT8 calibration step in TensorRT.
Copy the `pb` model (`faster_rcnn.pb`) from the downloaded directory in the previous step to the working directory (for example, `/usr/src/tensorrt/data/faster-rcnn-uff`).
Patch the UFF converter.
Apply a patch to the UFF converter to fix an issue with the Softmax layer in the UFF package. Let `UFF_ROOT` denote the root directory of the Python UFF package, for example, `/usr/lib/python2.7/dist-packages/uff`. Then, apply the patch with the following command: `patch UFF_ROOT/converters/tensorflow/converter_functions.py < fix_softmax.patch`
The patch file `fix_softmax.patch` was generated against UFF package version 0.6.3 in TensorRT 5.1 GA. Ensure your UFF package version is also 0.6.3 before applying the patch. For TensorRT 6.0, you can skip this step since the issue should already be fixed.
Run the following command for the conversion.
```
convert-to-uff -p config.py -O dense_class/Softmax -O dense_regress/BiasAdd -O proposal faster_rcnn.pb
```
This saves the converted `.uff` file in the same directory as the input with the name `faster_rcnn.uff`.
The `config.py` script specifies the preprocessing operations necessary for the UFF Faster R-CNN TensorFlow graph. The plugin nodes and plugin parameters used in the `config.py` script should match the registered plugins in TensorRT.
Populate the `list.txt` file with a list of all the calibration images (basename, without suffix) when running in INT8 mode. Copy the `list.txt` to the same directory that contains the `pb` model.
Build the sample. The binary `sample_uff_faster_rcnn` will be created in the `build/cmake/out` directory.
Run the sample to perform object detection and localization.
To run the sample in FP32 mode:
```
./sample_uff_faster_rcnn --datadir /data/uff_faster_rcnn -W 480 -H 272 -I 2016_1111_185016_003_00001_night_000441.ppm
```
To run the sample in INT8 mode:
```
./sample_uff_faster_rcnn --datadir /data/uff_faster_rcnn -i -W 480 -H 272 -I 2016_1111_185016_003_00001_night_000441.ppm
```
This output shows that the sample ran successfully; `PASSED`.
To see the full list of available options and their descriptions, use the `-h` or `--help` command line option.
The following resources provide a deeper understanding about sampleUffFasterRCNN.
For terms and conditions for use, reproduction, and distribution, see the TensorRT Software License Agreement documentation.
July 2019 This is the first release of the `README.md` file and the sample.
There are no known issues in this sample.