The BufferManager class handles host and device buffer allocation and deallocation.
Public Member Functions
BufferManager (std::shared_ptr< nvinfer1::ICudaEngine > engine, const int batchSize=0, const nvinfer1::IExecutionContext *context=nullptr)
Create a BufferManager for handling buffer interactions with engine.
std::vector< void * > & | getDeviceBindings ()
Returns a vector of device buffers that you can use directly as bindings for the execute and enqueue methods of IExecutionContext.
const std::vector< void * > & | getDeviceBindings () const
Returns a vector of device buffers.
void * | getDeviceBuffer (const std::string &tensorName) const
Returns the device buffer corresponding to tensorName.
void * | getHostBuffer (const std::string &tensorName) const
Returns the host buffer corresponding to tensorName.
size_t | size (const std::string &tensorName) const
Returns the size of the host and device buffers that correspond to tensorName.
void | dumpBuffer (std::ostream &os, const std::string &tensorName)
Dump host buffer with specified tensorName to ostream.
template<typename T >
void | print (std::ostream &os, void *buf, size_t bufSize, size_t rowCount)
Templated print function that dumps buffers of arbitrary type to std::ostream.
void | copyInputToDevice ()
Copy the contents of input host buffers to input device buffers synchronously.
void | copyOutputToHost ()
Copy the contents of output device buffers to output host buffers synchronously.
void | copyInputToDeviceAsync (const cudaStream_t &stream=0)
Copy the contents of input host buffers to input device buffers asynchronously.
void | copyOutputToHostAsync (const cudaStream_t &stream=0)
Copy the contents of output device buffers to output host buffers asynchronously.
~BufferManager ()=default
Static Public Attributes
static const size_t | kINVALID_SIZE_VALUE = ~size_t(0)
Private Member Functions
void * | getBuffer (const bool isHost, const std::string &tensorName) const
void | memcpyBuffers (const bool copyInput, const bool deviceToHost, const bool async, const cudaStream_t &stream=0)
Private Attributes
std::shared_ptr< nvinfer1::ICudaEngine > | mEngine
The pointer to the engine.
int | mBatchSize
The batch size for legacy networks, 0 otherwise.
std::vector< std::unique_ptr< ManagedBuffer > > | mManagedBuffers
The vector of pointers to managed buffers.
std::vector< void * > | mDeviceBindings
The vector of device buffers needed for engine execution.
This RAII class handles host and device buffer allocation and deallocation, memcpy between host and device buffers to aid with inference, and debugging dumps to validate inference. It is meant to simplify buffer management and any interaction between the buffers and the engine.
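The RAII behavior described above can be illustrated without TensorRT or CUDA. The following is a minimal sketch, not the BufferManager implementation: `FreeDeleter`, `RawBuffer`, `allocateBytes`, `HostDevicePair`, and `TinyBufferManager` are hypothetical names, and ordinary heap memory with `std::memcpy` stands in for device memory and `cudaMemcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Deleter so that std::unique_ptr can own memory obtained from std::calloc.
struct FreeDeleter
{
    void operator()(void* p) const noexcept { std::free(p); }
};
using RawBuffer = std::unique_ptr<void, FreeDeleter>;

// Assumption: zero-initialised heap memory stands in for both host and device
// memory; a CUDA build would allocate the device side with cudaMalloc.
inline RawBuffer allocateBytes(std::size_t bytes)
{
    void* p = std::calloc(bytes == 0 ? 1 : bytes, 1);
    if (p == nullptr)
    {
        throw std::bad_alloc();
    }
    return RawBuffer(p);
}

// Hypothetical pairing of one host buffer with one "device" buffer.
struct HostDevicePair
{
    HostDevicePair(std::string tensorName, std::size_t byteCount)
        : name(std::move(tensorName)), bytes(byteCount)
        , host(allocateBytes(byteCount)), device(allocateBytes(byteCount))
    {
    }
    std::string name;
    std::size_t bytes;
    RawBuffer host;
    RawBuffer device;
};

// Owns every pair; all memory is released when the manager is destroyed.
class TinyBufferManager
{
public:
    void add(const std::string& tensorName, std::size_t bytes)
    {
        mPairs.emplace_back(tensorName, bytes);
        mDeviceBindings.push_back(mPairs.back().device.get());
    }

    void* hostBuffer(std::size_t i) const
    {
        assert(i < mPairs.size());
        return mPairs[i].host.get();
    }

    const std::vector<void*>& deviceBindings() const { return mDeviceBindings; }

    // std::memcpy stands in for a host-to-device cudaMemcpy.
    void copyToDevice(std::size_t i)
    {
        assert(i < mPairs.size());
        std::memcpy(mPairs[i].device.get(), mPairs[i].host.get(), mPairs[i].bytes);
    }

    // std::memcpy stands in for a device-to-host cudaMemcpy.
    void copyToHost(std::size_t i)
    {
        assert(i < mPairs.size());
        std::memcpy(mPairs[i].host.get(), mPairs[i].device.get(), mPairs[i].bytes);
    }

private:
    std::vector<HostDevicePair> mPairs;
    std::vector<void*> mDeviceBindings;
};
```

Because each buffer is held by a `std::unique_ptr`, no explicit cleanup call is needed: when a `TinyBufferManager` leaves scope, every host and stand-in device allocation is freed, which is the property the class description relies on.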
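BufferManager (std::shared_ptr< nvinfer1::ICudaEngine > engine, const int batchSize=0, const nvinfer1::IExecutionContext *context=nullptr) | inline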
Create a BufferManager for handling buffer interactions with engine.
~BufferManager () | default
std::vector< void * > & | getDeviceBindings () | inline
Returns a vector of device buffers that you can use directly as bindings for the execute and enqueue methods of IExecutionContext.
const std::vector< void * > & | getDeviceBindings () const | inline
Returns a vector of device buffers (const overload).
void * | getDeviceBuffer (const std::string &tensorName) const | inline
Returns the device buffer corresponding to tensorName.
Returns nullptr if no such tensor can be found.
void * | getHostBuffer (const std::string &tensorName) const | inline
Returns the host buffer corresponding to tensorName.
Returns nullptr if no such tensor can be found.
size_t | size (const std::string &tensorName) const | inline
Returns the size of the host and device buffers that correspond to tensorName.
Returns kINVALID_SIZE_VALUE if no such tensor can be found.
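The not-found conventions of the three lookups above (nullptr for buffers, kINVALID_SIZE_VALUE for sizes) can be sketched with a plain name-indexed table. This is an illustration under assumptions, not the TensorRT code: `NamedBuffers` and `kInvalidSize` are hypothetical names, and the table stores only host pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <unordered_map>

// Same bit pattern as the documented sentinel: all bits set, i.e. the largest size_t.
constexpr std::size_t kInvalidSize = ~std::size_t(0);
static_assert(kInvalidSize == std::numeric_limits<std::size_t>::max(),
    "an all-ones size_t is its maximum value");

// Hypothetical name-indexed table of host buffers.
class NamedBuffers
{
public:
    void add(const std::string& tensorName, void* hostPtr, std::size_t bytes)
    {
        assert(hostPtr != nullptr);
        mEntries[tensorName] = Entry{hostPtr, bytes};
    }

    // Returns nullptr when the name is unknown.
    void* hostBuffer(const std::string& tensorName) const
    {
        const auto it = mEntries.find(tensorName);
        return it == mEntries.end() ? nullptr : it->second.ptr;
    }

    // Returns kInvalidSize when the name is unknown.
    std::size_t size(const std::string& tensorName) const
    {
        const auto it = mEntries.find(tensorName);
        return it == mEntries.end() ? kInvalidSize : it->second.bytes;
    }

private:
    struct Entry
    {
        void* ptr;
        std::size_t bytes;
    };
    std::unordered_map<std::string, Entry> mEntries;
};
```

Callers should check both sentinels before using a result. A size of `~size_t(0)` is never a valid allocation size, so it cannot be confused with a real buffer, whereas 0 could be.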
void | dumpBuffer (std::ostream &os, const std::string &tensorName) | inline
Dumps the host buffer with the specified tensorName to the given ostream.
Prints an error message to the ostream if no such tensor can be found.
template<typename T >
void | print (std::ostream &os, void *buf, size_t bufSize, size_t rowCount) | inline
Templated print function that dumps buffers of arbitrary type to std::ostream.
The rowCount parameter controls how many elements are printed on each line; a rowCount of 1 prints one element per line.
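The effect of rowCount can be seen in a small standalone function. This is a sketch of one way to produce that layout, not the library's implementation: `printRows` is a hypothetical name, it takes a `const void*` rather than the documented `void*`, and it assumes elements on a line are separated by a single space with no trailing newline.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Interpret `buf` as an array of T and write `rowCount` elements per line.
// bufSize is in bytes, so the element count is bufSize / sizeof(T).
template <typename T>
void printRows(std::ostream& os, const void* buf, std::size_t bufSize, std::size_t rowCount)
{
    assert(rowCount != 0);
    assert(bufSize % sizeof(T) == 0);
    const T* typed = static_cast<const T*>(buf);
    const std::size_t count = bufSize / sizeof(T);
    for (std::size_t i = 0; i < count; ++i)
    {
        os << typed[i];
        const bool endOfRow = (i % rowCount == rowCount - 1);
        const bool last = (i + 1 == count);
        if (!last)
        {
            // Newline after a full row, a space between elements otherwise.
            os << (endOfRow ? '\n' : ' ');
        }
    }
}

// Convenience wrapper that returns the formatted text as a string.
template <typename T>
std::string rowsToString(const void* buf, std::size_t bufSize, std::size_t rowCount)
{
    std::ostringstream os;
    printRows<T>(os, buf, bufSize, rowCount);
    return os.str();
}
```

With six `int` values 1 through 6 and a rowCount of 3, this sketch writes two lines, `1 2 3` and `4 5 6`; with a rowCount of 1 it writes each value on its own line. Note that 8-bit element types would be streamed as characters unless they are widened first.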
void | copyInputToDevice () | inline
Copy the contents of input host buffers to input device buffers synchronously.
void | copyOutputToHost () | inline
Copy the contents of output device buffers to output host buffers synchronously.
void | copyInputToDeviceAsync (const cudaStream_t &stream=0) | inline
Copy the contents of input host buffers to input device buffers asynchronously.
void | copyOutputToHostAsync (const cudaStream_t &stream=0) | inline
Copy the contents of output device buffers to output host buffers asynchronously.
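The asynchronous variants only enqueue work on the stream, so the host output buffers are not safe to read until the stream has been synchronized. The toy model below illustrates that ordering without CUDA. Everything in it is a hypothetical stand-in: `ToyStream` defers all queued work until `synchronize()` is called (a real stream may start earlier), `ToyBuffers` mimics only the two async copy methods, and `enqueueInference` simply doubles each input.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy stand-in for a CUDA stream: work is queued in order and runs on synchronize().
class ToyStream
{
public:
    void enqueue(std::function<void()> work) { mWork.push(std::move(work)); }

    void synchronize()
    {
        while (!mWork.empty())
        {
            mWork.front()();
            mWork.pop();
        }
    }

private:
    std::queue<std::function<void()>> mWork;
};

// Toy buffers: vectors stand in for host and device memory.
struct ToyBuffers
{
    explicit ToyBuffers(std::size_t n)
        : hostIn(n), deviceIn(n), deviceOut(n), hostOut(n)
    {
    }

    void copyInputToDeviceAsync(ToyStream& stream)
    {
        stream.enqueue([this] { deviceIn = hostIn; });
    }

    void copyOutputToHostAsync(ToyStream& stream)
    {
        stream.enqueue([this] { hostOut = deviceOut; });
    }

    std::vector<float> hostIn, deviceIn, deviceOut, hostOut;
};

// Stand-in for enqueuing inference: doubles every input element.
inline void enqueueInference(ToyBuffers& buffers, ToyStream& stream)
{
    assert(buffers.deviceIn.size() == buffers.deviceOut.size());
    stream.enqueue([&buffers] {
        for (std::size_t i = 0; i < buffers.deviceIn.size(); ++i)
        {
            buffers.deviceOut[i] = 2.0f * buffers.deviceIn[i];
        }
    });
}

// Full round trip: enqueue copy-in, inference, copy-out, then wait for the stream.
inline std::vector<float> runOnce(const std::vector<float>& input)
{
    ToyBuffers buffers(input.size());
    ToyStream stream;
    buffers.hostIn = input;
    buffers.copyInputToDeviceAsync(stream);
    enqueueInference(buffers, stream);
    buffers.copyOutputToHostAsync(stream);
    stream.synchronize(); // only after this point is hostOut valid
    return buffers.hostOut;
}
```

In real code the corresponding wait is a call such as `cudaStreamSynchronize(stream)` after `copyOutputToHostAsync(stream)`, placed before any read of the host output buffers.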
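void * | getBuffer (const bool isHost, const std::string &tensorName) const | inline, private
void | memcpyBuffers (const bool copyInput, const bool deviceToHost, const bool async, const cudaStream_t &stream=0) | inline, private
const size_t | kINVALID_SIZE_VALUE = ~size_t(0) | static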
std::shared_ptr< nvinfer1::ICudaEngine > | mEngine | private
The pointer to the engine.
int | mBatchSize | private
The batch size for legacy networks, 0 otherwise.
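One plausible reading of this convention is that a nonzero batch size multiplies the per-sample element count, while 0 means the tensor dimensions already include the batch dimension. The sketch below encodes that reading only; `bufferBytes` is a hypothetical helper, and the source does not confirm that BufferManager sizes its buffers exactly this way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: batchSize > 0 means dims describe a single sample (legacy network);
// batchSize == 0 means dims already contain the batch dimension.
inline std::size_t bufferBytes(const std::vector<int>& dims, int batchSize, std::size_t elemSize)
{
    std::size_t count = batchSize > 0 ? static_cast<std::size_t>(batchSize) : 1;
    for (int d : dims)
    {
        // Dynamic (negative) dimensions are not handled by this sketch.
        assert(d >= 0);
        count *= static_cast<std::size_t>(d);
    }
    return count * elemSize;
}
```

Under this reading, a 3x224x224 float tensor with batch size 8 and an 8x3x224x224 float tensor with batch size 0 need the same number of bytes.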
std::vector< std::unique_ptr< ManagedBuffer > > | mManagedBuffers | private
The vector of pointers to managed buffers.
std::vector< void * > | mDeviceBindings | private
The vector of device buffers needed for engine execution.