TensorRT  7.2.1.6
NVIDIA TensorRT
nvinfer1::IExecutionContext Class Reference (abstract)

Context for executing inference using an engine, with functionally unsafe features.

Public Member Functions

virtual bool execute (int32_t batchSize, void **bindings) noexcept=0
 Synchronously execute inference on a batch.

virtual bool enqueue (int32_t batchSize, void **bindings, cudaStream_t stream, cudaEvent_t *inputConsumed) noexcept=0
 Asynchronously execute inference on a batch.

virtual void setDebugSync (bool sync) noexcept=0
 Set the debug sync flag.

virtual bool getDebugSync () const noexcept=0
 Get the debug sync flag.

virtual void setProfiler (IProfiler *) noexcept=0
 Set the profiler.

virtual IProfiler * getProfiler () const noexcept=0
 Get the profiler.

virtual const ICudaEngine & getEngine () const noexcept=0
 Get the associated engine.

virtual void destroy () noexcept=0
 Destroy this object.

virtual void setName (const char *name) noexcept=0
 Set the name of the execution context.

virtual const char * getName () const noexcept=0
 Return the name of the execution context.

virtual void setDeviceMemory (void *memory) noexcept=0
 Set the device memory for use by this execution context.

virtual Dims getStrides (int32_t bindingIndex) const noexcept=0
 Return the strides of the buffer for the given binding.

__attribute__((deprecated)) virtual bool setOptimizationProfile (int32_t profileIndex) noexcept=0
 Select an optimization profile for the current context.

virtual int32_t getOptimizationProfile () const noexcept=0
 Get the index of the currently selected optimization profile.

virtual bool setBindingDimensions (int32_t bindingIndex, Dims dimensions) noexcept=0
 Set the dynamic dimensions of a binding.

virtual Dims getBindingDimensions (int32_t bindingIndex) const noexcept=0
 Get the dynamic dimensions of a binding.

virtual bool setInputShapeBinding (int32_t bindingIndex, const int32_t *data) noexcept=0
 Set values of input tensor required by shape calculations.

virtual bool getShapeBinding (int32_t bindingIndex, int32_t *data) const noexcept=0
 Get values of an input tensor required for shape calculations or an output tensor produced by shape calculations.

virtual bool allInputDimensionsSpecified () const noexcept=0
 Whether all dynamic dimensions of input tensors have been specified.

virtual bool allInputShapesSpecified () const noexcept=0
 Whether all input shape bindings have been specified.

virtual void setErrorRecorder (IErrorRecorder *recorder) noexcept=0
 Set the ErrorRecorder for this interface.

virtual IErrorRecorder * getErrorRecorder () const noexcept=0
 Get the ErrorRecorder assigned to this interface.

virtual bool executeV2 (void **bindings) noexcept=0
 Synchronously execute inference on a network.

virtual bool enqueueV2 (void **bindings, cudaStream_t stream, cudaEvent_t *inputConsumed) noexcept=0
 Asynchronously execute inference.

virtual bool setOptimizationProfileAsync (int32_t profileIndex, cudaStream_t stream) noexcept=0
 Select an optimization profile for the current context with async semantics.
 

Protected Member Functions

virtual ~IExecutionContext () noexcept
 

Detailed Description

Context for executing inference using an engine, with functionally unsafe features.

Multiple execution contexts may exist for one ICudaEngine instance, allowing the same engine to be used for the execution of multiple batches simultaneously. If the engine supports dynamic shapes, each execution context in concurrent use must use a separate optimization profile.

Warning
Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.

Constructor & Destructor Documentation

◆ ~IExecutionContext()

virtual nvinfer1::IExecutionContext::~IExecutionContext ( )
inline protected virtual noexcept

Member Function Documentation

◆ execute()

virtual bool nvinfer1::IExecutionContext::execute ( int32_t  batchSize,
void **  bindings 
)
pure virtual noexcept

Synchronously execute inference on a batch.

This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine::getBindingIndex().

Parameters
batchSize  The batch size. This is at most the value supplied when the engine was built.
bindings  An array of pointers to input and output buffers for the network.
Returns
True if execution succeeded.
See also
ICudaEngine::getBindingIndex() ICudaEngine::getMaxBatchSize()
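
As a rough illustration of how the bindings array might be assembled from tensor names, the sketch below fills one slot per binding. It is written as a template so that it compiles without the TensorRT headers; the Engine type is only assumed to provide getNbBindings() and getBindingIndex(), as ICudaEngine does. The names makeBindings and buffersByName are hypothetical and introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: builds the array of buffer pointers that execute() expects.
// Engine is any type exposing getNbBindings() and getBindingIndex(name), which is
// the subset of ICudaEngine this sketch assumes.
template <typename Engine>
std::vector<void*> makeBindings(const Engine& engine,
                                const std::map<std::string, void*>& buffersByName)
{
    assert(engine.getNbBindings() >= 0);
    std::vector<void*> bindings(static_cast<std::size_t>(engine.getNbBindings()), nullptr);
    for (const auto& entry : buffersByName)
    {
        const int32_t index = engine.getBindingIndex(entry.first.c_str());
        if (index < 0 || index >= static_cast<int32_t>(bindings.size()))
        {
            return {}; // Unknown tensor name: signal failure with an empty array.
        }
        bindings[static_cast<std::size_t>(index)] = entry.second;
    }
    return bindings;
}
```

With a real engine and context, the result would be passed as context->execute(batchSize, bindings.data()) once every slot holds a valid device pointer.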

◆ enqueue()

virtual bool nvinfer1::IExecutionContext::enqueue ( int32_t  batchSize,
void **  bindings,
cudaStream_t  stream,
cudaEvent_t *  inputConsumed 
)
pure virtual noexcept

Asynchronously execute inference on a batch.

This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine::getBindingIndex().

Parameters
batchSize  The batch size. This is at most the value supplied when the engine was built.
bindings  An array of pointers to input and output buffers for the network.
stream  A CUDA stream on which the inference kernels will be enqueued.
inputConsumed  An optional event which will be signaled when the input buffers can be refilled with new data.
Returns
True if the kernels were enqueued successfully.
See also
ICudaEngine::getBindingIndex() ICudaEngine::getMaxBatchSize()

◆ setDebugSync()

virtual void nvinfer1::IExecutionContext::setDebugSync ( bool  sync)
pure virtual noexcept

Set the debug sync flag.

If this flag is set to true, the engine will log the successful execution for each kernel during execute(). It has no effect when using enqueue().

See also
getDebugSync()

◆ getDebugSync()

virtual bool nvinfer1::IExecutionContext::getDebugSync ( ) const
pure virtual noexcept

Get the debug sync flag.

See also
setDebugSync()

◆ setProfiler()

virtual void nvinfer1::IExecutionContext::setProfiler ( IProfiler * )
pure virtual noexcept

Set the profiler.

See also
IProfiler getProfiler()

◆ getProfiler()

virtual IProfiler* nvinfer1::IExecutionContext::getProfiler ( ) const
pure virtual noexcept

Get the profiler.

See also
IProfiler setProfiler()

◆ getEngine()

virtual const ICudaEngine& nvinfer1::IExecutionContext::getEngine ( ) const
pure virtual noexcept

Get the associated engine.

See also
ICudaEngine

◆ destroy()

virtual void nvinfer1::IExecutionContext::destroy ( )
pure virtual noexcept

Destroy this object.

◆ setName()

virtual void nvinfer1::IExecutionContext::setName ( const char *  name)
pure virtual noexcept

Set the name of the execution context.

This method copies the name string.

See also
getName()

◆ getName()

virtual const char* nvinfer1::IExecutionContext::getName ( ) const
pure virtual noexcept

Return the name of the execution context.

See also
setName()

◆ setDeviceMemory()

virtual void nvinfer1::IExecutionContext::setDeviceMemory ( void *  memory)
pure virtual noexcept

Set the device memory for use by this execution context.

The memory must be aligned to the CUDA memory alignment property (queried with cudaGetDeviceProperties()), and its size must be at least that returned by ICudaEngine::getDeviceMemorySize(). Setting memory to nullptr is acceptable if getDeviceMemorySize() returns 0. If enqueue() is used to run the network, the memory is in use from the invocation of enqueue() until network execution is complete. If execute() is used, it is in use until execute() returns. Releasing the memory or using it for other purposes during this time results in undefined behavior.

See also
ICudaEngine::getDeviceMemorySize() ICudaEngine::createExecutionContextWithoutDeviceMemory()

◆ getStrides()

virtual Dims nvinfer1::IExecutionContext::getStrides ( int32_t  bindingIndex) const
pure virtual noexcept

Return the strides of the buffer for the given binding.

The strides are in units of elements, not components or bytes. For example, for TensorFormat::kHWC8, a stride of one spans 8 scalars.

Note that strides can be different for different execution contexts with dynamic shapes.

If the bindingIndex is invalid or there are dynamic dimensions that have not been set yet, returns Dims with Dims::nbDims = -1.

Parameters
bindingIndexThe binding index.
See also
getBindingComponentsPerElement
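
Because strides are reported in elements, converting one to a byte offset takes two further factors. The following is a minimal sketch of that arithmetic, not part of the TensorRT API: strideInBytes is a hypothetical helper, and its componentsPerElement and bytesPerComponent arguments are assumed to be supplied by the caller (for the kHWC8 example above, 8 components per element).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a stride expressed in elements into bytes.
// componentsPerElement is the number of scalars packed into one element
// (8 in the kHWC8 example); bytesPerComponent is the size of one scalar.
inline int64_t strideInBytes(int64_t strideInElements, int32_t componentsPerElement,
                             int32_t bytesPerComponent)
{
    assert(componentsPerElement > 0 && bytesPerComponent > 0);
    return strideInElements * componentsPerElement * bytesPerComponent;
}
```

For instance, a stride of one element with 8 components of 2 bytes each corresponds to 16 bytes.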

◆ setOptimizationProfile()

__attribute__((deprecated)) virtual bool nvinfer1::IExecutionContext::setOptimizationProfile ( int32_t  profileIndex)
pure virtual noexcept

Select an optimization profile for the current context.

Parameters
profileIndex  Index of the profile. It must lie between 0 and getEngine().getNbOptimizationProfiles() - 1.

The selected profile will be used in subsequent calls to execute() or enqueue().

When an optimization profile is switched via this API, TensorRT may enqueue GPU memory copy operations required to set up the new profile during the subsequent enqueue() operations. To avoid these calls during enqueue(), use setOptimizationProfileAsync() instead.

If the associated CUDA engine has dynamic inputs, this method must be called at least once with a unique profileIndex before calling execute or enqueue (i.e. the profile index may not be in use by another execution context that has not been destroyed yet). For the first execution context that is created for an engine, setOptimizationProfile(0) is called implicitly.

If the associated CUDA engine does not have inputs with dynamic shapes, this method need not be called, in which case the default profile index of 0 will be used (this is particularly the case for all safe engines).

setOptimizationProfile() must be called before calling setBindingDimensions() and setInputShapeBinding() for all dynamic input tensors or input shape tensors, which in turn must be called before either execute() or enqueue().

Returns
true if the call succeeded, else false (e.g. input out of range)
Deprecated:
This API is superseded by setOptimizationProfileAsync and will be removed in TensorRT 9.0.
See also
ICudaEngine::getNbOptimizationProfiles() IExecutionContext::setOptimizationProfileAsync()

◆ getOptimizationProfile()

virtual int32_t nvinfer1::IExecutionContext::getOptimizationProfile ( ) const
pure virtual noexcept

Get the index of the currently selected optimization profile.

If the profile index has not been set yet (implicitly to 0 for the first execution context to be created, or explicitly for all subsequent contexts), an invalid value of -1 will be returned and all calls to enqueue() or execute() will fail until a valid profile index has been set.

◆ setBindingDimensions()

virtual bool nvinfer1::IExecutionContext::setBindingDimensions ( int32_t  bindingIndex,
Dims  dimensions 
)
pure virtual noexcept

Set the dynamic dimensions of a binding.

Requires the engine to be built without an implicit batch dimension. The binding must be an input tensor, and all dimensions must be compatible with the network definition (i.e. only the wildcard dimension -1 can be replaced with a new dimension > 0). Furthermore, the dimensions must be in the valid range for the currently selected optimization profile, and the corresponding engine must not be safety-certified.

This method will fail unless a valid optimization profile is defined for the current execution context (getOptimizationProfile() must not be -1).

For all dynamic non-output bindings (which have at least one wildcard dimension of -1), this method needs to be called before either enqueue() or execute() may be called. This can be checked using the method allInputDimensionsSpecified().

Returns
false if an error occurs (e.g. index out of range), else true
See also
ICudaEngine::getBindingIndex
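
The compatibility rules above can be checked on the host before calling setBindingDimensions(). The sketch below is one possible pre-check, under stated assumptions: SketchDims is a stand-in that mirrors only the nbDims and d fields of Dims so that the example compiles without the TensorRT headers, dimsAcceptable is a hypothetical helper, and the profile minimum and maximum are assumed to have been obtained separately (for example from ICudaEngine::getProfileDimensions).

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for nvinfer1::Dims (assumption: only nbDims and d are mirrored).
struct SketchDims
{
    int32_t nbDims;
    int32_t d[8];
};

// Hypothetical pre-check: returns true if `requested` only replaces wildcard
// (-1) dimensions of `network` with values > 0 and stays within the profile range.
inline bool dimsAcceptable(const SketchDims& network, const SketchDims& requested,
                           const SketchDims& profileMin, const SketchDims& profileMax)
{
    if (network.nbDims < 0 || network.nbDims > 8)
    {
        return false;
    }
    if (requested.nbDims != network.nbDims || profileMin.nbDims != network.nbDims
        || profileMax.nbDims != network.nbDims)
    {
        return false;
    }
    for (int32_t i = 0; i < network.nbDims; ++i)
    {
        const int32_t value = requested.d[i];
        // Only the wildcard may change; fixed dimensions must match exactly.
        const bool matchesNetwork = (network.d[i] == -1) ? (value > 0) : (value == network.d[i]);
        const bool inProfileRange = value >= profileMin.d[i] && value <= profileMax.d[i];
        if (!matchesNetwork || !inProfileRange)
        {
            return false;
        }
    }
    return true;
}
```

A check like this only mirrors the documented rules; the return value of setBindingDimensions() remains the authoritative result.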

◆ getBindingDimensions()

virtual Dims nvinfer1::IExecutionContext::getBindingDimensions ( int32_t  bindingIndex) const
pure virtual noexcept

Get the dynamic dimensions of a binding.

If the engine was built with an implicit batch dimension, this method behaves the same as ICudaEngine::getBindingDimensions.

If setBindingDimensions() has been called on this binding (or if there are no dynamic dimensions), all dimensions will be positive. Otherwise, it is necessary to call setBindingDimensions() before enqueue() or execute() may be called.

If the bindingIndex is out of range, an invalid Dims with nbDims == -1 is returned. The same invalid Dims will be returned if the engine was not built with an implicit batch dimension and if the execution context is not currently associated with a valid optimization profile (i.e. if getOptimizationProfile() returns -1).

If ICudaEngine::bindingIsInput(bindingIndex) is false, then both allInputDimensionsSpecified() and allInputShapesSpecified() must be true before calling this method.

Returns
Currently selected binding dimensions

For backwards compatibility with earlier versions of TensorRT, a bindingIndex that does not belong to the current profile is corrected as described for ICudaEngine::getProfileDimensions.

See also
ICudaEngine::getProfileDimensions
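
The three outcomes described above (invalid Dims, dimensions still containing a wildcard, fully specified dimensions) can be told apart with a small helper. This is a sketch only: SketchDims is a stand-in mirroring the nbDims and d fields of Dims, and DimsState and classifyDims are hypothetical names that do not exist in TensorRT.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for nvinfer1::Dims (assumption: only nbDims and d are mirrored).
struct SketchDims
{
    int32_t nbDims;
    int32_t d[8];
};

enum class DimsState
{
    kInvalid,    // nbDims == -1: bad binding index or no valid profile selected
    kUnresolved, // a wildcard (-1) remains: setBindingDimensions() is still required
    kReady       // every dimension has a concrete value
};

// Hypothetical helper: classifies the result of a getBindingDimensions() call.
inline DimsState classifyDims(const SketchDims& dims)
{
    if (dims.nbDims < 0 || dims.nbDims > 8)
    {
        return DimsState::kInvalid;
    }
    for (int32_t i = 0; i < dims.nbDims; ++i)
    {
        if (dims.d[i] < 0)
        {
            return DimsState::kUnresolved;
        }
    }
    return DimsState::kReady;
}
```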

◆ setInputShapeBinding()

virtual bool nvinfer1::IExecutionContext::setInputShapeBinding ( int32_t  bindingIndex,
const int32_t *  data 
)
pure virtual noexcept

Set values of input tensor required by shape calculations.

Parameters
bindingIndex  Index of an input tensor for which ICudaEngine::isShapeBinding(bindingIndex) and ICudaEngine::bindingIsInput(bindingIndex) are both true.
data  Pointer to values of the input tensor. The number of values should be the product of the dimensions returned by getBindingDimensions(bindingIndex).

If ICudaEngine::isShapeBinding(bindingIndex) and ICudaEngine::bindingIsInput(bindingIndex) are both true, this method must be called before enqueue() or execute() may be called. This method will fail unless a valid optimization profile is defined for the current execution context (getOptimizationProfile() must not be -1).
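
The required length of the data array is the product of the binding dimensions. A minimal sketch of that computation follows; SketchDims is a stand-in mirroring the nbDims and d fields of Dims, and valueCount is a hypothetical helper, not a TensorRT function.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for nvinfer1::Dims (assumption: only nbDims and d are mirrored).
struct SketchDims
{
    int32_t nbDims;
    int32_t d[8];
};

// Hypothetical helper: number of int32_t values a shape binding with these
// dimensions holds. Returns -1 for invalid Dims or unresolved wildcards.
inline int64_t valueCount(const SketchDims& dims)
{
    if (dims.nbDims < 0 || dims.nbDims > 8)
    {
        return -1;
    }
    int64_t count = 1; // The empty product: a zero-dimensional tensor holds one value.
    for (int32_t i = 0; i < dims.nbDims; ++i)
    {
        if (dims.d[i] < 0)
        {
            return -1;
        }
        count *= dims.d[i];
    }
    return count;
}
```

A caller would size its host buffer to valueCount(...) entries before passing it as data.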

◆ getShapeBinding()

virtual bool nvinfer1::IExecutionContext::getShapeBinding ( int32_t  bindingIndex,
int32_t *  data 
) const
pure virtual noexcept

Get values of an input tensor required for shape calculations or an output tensor produced by shape calculations.

Parameters
bindingIndex  Index of an input or output tensor for which ICudaEngine::isShapeBinding(bindingIndex) is true.
data  Pointer to where values will be written. The number of values written is the product of the dimensions returned by getBindingDimensions(bindingIndex).

If ICudaEngine::bindingIsInput(bindingIndex) is false, then both allInputDimensionsSpecified() and allInputShapesSpecified() must be true before calling this method. The method will also fail if no valid optimization profile has been set for the current execution context, i.e. if getOptimizationProfile() returns -1.

See also
isShapeBinding(bindingIndex)

◆ allInputDimensionsSpecified()

virtual bool nvinfer1::IExecutionContext::allInputDimensionsSpecified ( ) const
pure virtual noexcept

Whether all dynamic dimensions of input tensors have been specified.

Returns
True if all dynamic dimensions of input tensors have been specified by calling setBindingDimensions().

Trivially true if network has no dynamically shaped input tensors.

See also
setBindingDimensions(bindingIndex,dimensions)

◆ allInputShapesSpecified()

virtual bool nvinfer1::IExecutionContext::allInputShapesSpecified ( ) const
pure virtual noexcept

Whether all input shape bindings have been specified.

Returns
True if all input shape bindings have been specified by setInputShapeBinding().

Trivially true if network has no input shape bindings.

See also
isShapeBinding(bindingIndex)

◆ setErrorRecorder()

virtual void nvinfer1::IExecutionContext::setErrorRecorder ( IErrorRecorder *  recorder)
pure virtual noexcept

Set the ErrorRecorder for this interface.

Assigns the ErrorRecorder to this interface. The ErrorRecorder will track all errors during execution. This function will call incRefCount of the registered ErrorRecorder at least once. Setting recorder to nullptr unregisters the recorder with the interface, resulting in a call to decRefCount if a recorder has been registered.

Parameters
recorder  The error recorder to register with this interface.
See also
getErrorRecorder

◆ getErrorRecorder()

virtual IErrorRecorder* nvinfer1::IExecutionContext::getErrorRecorder ( ) const
pure virtual noexcept

Get the ErrorRecorder assigned to this interface.

Retrieves the assigned error recorder object for the given class. A default error recorder does not exist, so a nullptr will be returned if setErrorRecorder has not been called.

Returns
A pointer to the IErrorRecorder object that has been registered.
See also
setErrorRecorder

◆ executeV2()

virtual bool nvinfer1::IExecutionContext::executeV2 ( void **  bindings)
pure virtual noexcept

Synchronously execute inference on a network.

This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine::getBindingIndex(). This method only works for execution contexts built with full dimension networks.

Parameters
bindings  An array of pointers to input and output buffers for the network.
Returns
True if execution succeeded.
See also
ICudaEngine::getBindingIndex() ICudaEngine::getMaxBatchSize()

◆ enqueueV2()

virtual bool nvinfer1::IExecutionContext::enqueueV2 ( void **  bindings,
cudaStream_t  stream,
cudaEvent_t *  inputConsumed 
)
pure virtual noexcept

Asynchronously execute inference.

This method requires an array of input and output buffers. The mapping from tensor names to indices can be queried using ICudaEngine::getBindingIndex(). This method only works for execution contexts built with full dimension networks.

Parameters
bindings  An array of pointers to input and output buffers for the network.
stream  A CUDA stream on which the inference kernels will be enqueued.
inputConsumed  An optional event which will be signaled when the input buffers can be refilled with new data.
Returns
True if the kernels were enqueued successfully.
See also
ICudaEngine::getBindingIndex() ICudaEngine::getMaxBatchSize()
Note
Calling enqueueV2() with a stream in CUDA graph capture mode has a known issue. If dynamic shapes are used, the first enqueueV2() call after a setInputShapeBinding() call will cause failure in stream capture due to resource allocation. Please call enqueueV2() once before capturing the graph.

◆ setOptimizationProfileAsync()

virtual bool nvinfer1::IExecutionContext::setOptimizationProfileAsync ( int32_t  profileIndex,
cudaStream_t  stream 
)
pure virtual noexcept

Select an optimization profile for the current context with async semantics.

Parameters
profileIndex  Index of the profile. It must lie between 0 and getEngine().getNbOptimizationProfiles() - 1.
stream  A CUDA stream on which the cudaMemcpyAsync calls may be enqueued.

When an optimization profile is switched via this API, TensorRT may require that data is copied via cudaMemcpyAsync. It is the application’s responsibility to guarantee that synchronization between the profile sync stream and the enqueue stream occurs.

The selected profile will be used in subsequent calls to execute() or enqueue(). If the associated CUDA engine has inputs with dynamic shapes, the optimization profile must be set with a unique profileIndex before calling execute or enqueue. For the first execution context that is created for an engine, setOptimizationProfile(0) is called implicitly.

If the associated CUDA engine does not have inputs with dynamic shapes, this method need not be called, in which case the default profile index of 0 will be used.

setOptimizationProfileAsync() must be called before calling setBindingDimensions() and setInputShapeBinding() for all dynamic input tensors or input shape tensors, which in turn must be called before either execute() or enqueue().

Warning
Not synchronizing the stream used at enqueue with the stream used to set optimization profile asynchronously using this API will result in undefined behavior.
Returns
true if the call succeeded, else false (e.g. input out of range)
See also
ICudaEngine::getNbOptimizationProfiles() IExecutionContext::setOptimizationProfile()
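
The call order described for this method (select the profile, then set dynamic dimensions, then confirm that all inputs are specified) might be wrapped as follows. This is a sketch rather than a recommended implementation: selectProfileAndShape is a hypothetical helper, and it is templated on the context, dimensions and stream types purely so that it compiles without the TensorRT and CUDA headers. With TensorRT 7.2 those types would be IExecutionContext, Dims and cudaStream_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: performs the documented setup steps in order and
// reports whether the context is ready for execution.
template <typename Context, typename DimsT, typename Stream>
bool selectProfileAndShape(Context& context, int32_t profileIndex, Stream stream,
                           int32_t bindingIndex, const DimsT& dims)
{
    assert(profileIndex >= 0);
    // 1. Select the profile first; the shape setters fail without a valid profile.
    if (!context.setOptimizationProfileAsync(profileIndex, stream))
    {
        return false;
    }
    // 2. Set the dynamic dimensions of the input binding.
    if (!context.setBindingDimensions(bindingIndex, dims))
    {
        return false;
    }
    // 3. Execution may only proceed once every dynamic input is specified.
    return context.allInputDimensionsSpecified() && context.allInputShapesSpecified();
}
```

Passing the same stream to the later enqueueV2() call is one way to meet the synchronization requirement in the warning above, since work submitted to a single CUDA stream executes in order.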
