Context for executing inference using an engine, with functionally unsafe features.
virtual bool | execute (int32_t batchSize, void **bindings) noexcept=0 | Synchronously execute inference on a batch.
virtual bool | enqueue (int32_t batchSize, void **bindings, cudaStream_t stream, cudaEvent_t *inputConsumed) noexcept=0 | Asynchronously execute inference on a batch.
virtual void | setDebugSync (bool sync) noexcept=0 | Set the debug sync flag.
virtual bool | getDebugSync () const noexcept=0 | Get the debug sync flag.
virtual void | setProfiler (IProfiler *) noexcept=0 | Set the profiler.
virtual IProfiler * | getProfiler () const noexcept=0 | Get the profiler.
virtual const ICudaEngine & | getEngine () const noexcept=0 | Get the associated engine.
virtual void | destroy () noexcept=0 | Destroy this object.
virtual void | setName (const char *name) noexcept=0 | Set the name of the execution context.
virtual const char * | getName () const noexcept=0 | Return the name of the execution context.
virtual void | setDeviceMemory (void *memory) noexcept=0 | Set the device memory for use by this execution context.
virtual Dims | getStrides (int32_t bindingIndex) const noexcept=0 | Return the strides of the buffer for the given binding.
__attribute__ ((deprecated)) virtual bool | setOptimizationProfile (int32_t profileIndex) noexcept=0 | Select an optimization profile for the current context.
virtual int32_t | getOptimizationProfile () const noexcept=0 | Get the index of the currently selected optimization profile.
virtual bool | setBindingDimensions (int32_t bindingIndex, Dims dimensions) noexcept=0 | Set the dynamic dimensions of a binding.
virtual Dims | getBindingDimensions (int32_t bindingIndex) const noexcept=0 | Get the dynamic dimensions of a binding.
virtual bool | setInputShapeBinding (int32_t bindingIndex, const int32_t *data) noexcept=0 | Set values of an input tensor required by shape calculations.
virtual bool | getShapeBinding (int32_t bindingIndex, int32_t *data) const noexcept=0 | Get values of an input tensor required for shape calculations or an output tensor produced by shape calculations.
virtual bool | allInputDimensionsSpecified () const noexcept=0 | Whether all dynamic dimensions of input tensors have been specified.
virtual bool | allInputShapesSpecified () const noexcept=0 | Whether all input shape bindings have been specified.
virtual void | setErrorRecorder (IErrorRecorder *recorder) noexcept=0 | Set the ErrorRecorder for this interface.
virtual IErrorRecorder * | getErrorRecorder () const noexcept=0 | Get the ErrorRecorder assigned to this interface.
virtual bool | executeV2 (void **bindings) noexcept=0 | Synchronously execute inference on a network.
virtual bool | enqueueV2 (void **bindings, cudaStream_t stream, cudaEvent_t *inputConsumed) noexcept=0 | Asynchronously execute inference.
virtual bool | setOptimizationProfileAsync (int32_t profileIndex, cudaStream_t stream) noexcept=0 | Select an optimization profile for the current context with async semantics.
Multiple execution contexts may exist for one ICudaEngine instance, allowing the same engine to be used for the execution of multiple batches simultaneously. If the engine supports dynamic shapes, each execution context in concurrent use must use a separate optimization profile.
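The one-profile-per-concurrent-context rule can be tracked on the application side. The following is a minimal sketch, not part of TensorRT: `ProfileAllocator`, `acquire` and `release` are hypothetical names, and the profile count is assumed to be the value obtained from getEngine().getNbOptimizationProfiles().

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical application-side helper (not a TensorRT class): hands out
// optimization profile indices so that no two live contexts share one.
class ProfileAllocator
{
public:
    // nbProfiles is assumed to be the number of optimization profiles
    // reported by the engine; non-positive values yield an empty allocator.
    explicit ProfileAllocator(int32_t nbProfiles)
        : mInUse(static_cast<std::size_t>(nbProfiles > 0 ? nbProfiles : 0), false)
    {
    }

    // Returns the lowest free profile index, or -1 if every profile is taken.
    int32_t acquire()
    {
        for (std::size_t i = 0; i < mInUse.size(); ++i)
        {
            if (!mInUse[i])
            {
                mInUse[i] = true;
                return static_cast<int32_t>(i);
            }
        }
        return -1;
    }

    // Marks a profile as free again, e.g. once its context has been destroyed.
    // Returns false if the index is out of range or was not in use.
    bool release(int32_t index)
    {
        if (index < 0 || static_cast<std::size_t>(index) >= mInUse.size())
        {
            return false;
        }
        const std::size_t i = static_cast<std::size_t>(index);
        if (!mInUse[i])
        {
            return false;
        }
        mInUse[i] = false;
        return true;
    }

private:
    std::vector<bool> mInUse; // mInUse[i] is true while profile i is assigned
};
```

An application could call `acquire()` when creating a context and pass the result as the profile index, then call `release()` after that context is destroyed.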
- Warning
- Do not inherit from this class, as doing so will break forward-compatibility of the API and ABI.
virtual Dims nvinfer1::IExecutionContext::getStrides |
( |
int32_t |
bindingIndex | ) |
const |
|
pure virtual noexcept |
Return the strides of the buffer for the given binding.
The strides are in units of elements, not components or bytes. For example, for TensorFormat::kHWC8, a stride of one spans 8 scalars.
Note that strides can be different for different execution contexts with dynamic shapes.
If the bindingIndex is invalid or there are dynamic dimensions that have not been set yet, returns Dims with Dims::nbDims = -1.
- Parameters
-
bindingIndex | The binding index. |
- See also
- getBindingComponentsPerElement
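To illustrate the difference between elements and scalars, the sketch below computes strides for a densely packed row-major buffer and converts a stride in elements into a count of scalars. It is an illustration only: `packedStrides` and `scalarsSpanned` are hypothetical helpers, the packed layout is an assumption that the strides returned by getStrides need not satisfy, and the value 8 used in the note below simply mirrors the TensorFormat::kHWC8 example above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Strides, in elements, of a densely packed row-major buffer whose shape is
// listed outermost dimension first.
// Assumption: no padding between rows; real strides may differ.
std::vector<int64_t> packedStrides(const std::vector<int64_t>& shape)
{
    std::vector<int64_t> strides(shape.size(), 1);
    // Walk from the innermost dimension outwards, accumulating extents.
    for (std::size_t i = shape.size(); i-- > 1;)
    {
        strides[i - 1] = strides[i] * shape[i];
    }
    return strides;
}

// Number of scalars covered by a stride that is expressed in elements, given
// how many scalar components make up one element.
int64_t scalarsSpanned(int64_t strideInElements, int64_t componentsPerElement)
{
    return strideInElements * componentsPerElement;
}
```

For a shape of {2, 3, 4} the packed strides are {12, 4, 1}, and `scalarsSpanned(1, 8)` gives 8, matching the statement that a stride of one spans 8 scalars.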
__attribute__ ((deprecated)) virtual bool nvinfer1::IExecutionContext::setOptimizationProfile |
( |
int32_t |
profileIndex | ) |
|
|
pure virtual noexcept |
Select an optimization profile for the current context.
- Parameters
-
profileIndex | Index of the profile. It must lie between 0 and getEngine().getNbOptimizationProfiles() - 1, inclusive. |
The selected profile will be used in subsequent calls to execute() or enqueue().
When an optimization profile is switched via this API, TensorRT may enqueue GPU memory copy operations required to set up the new profile during the subsequent enqueue() operations. To avoid these calls during enqueue(), use setOptimizationProfileAsync() instead.
If the associated CUDA engine has dynamic inputs, this method must be called at least once before calling execute() or enqueue(), with a profileIndex that is not in use by any other execution context that has not yet been destroyed. For the first execution context that is created for an engine, setOptimizationProfile(0) is called implicitly.
If the associated CUDA engine does not have inputs with dynamic shapes, this method need not be called, in which case the default profile index of 0 will be used (this applies in particular to all safe engines).
setOptimizationProfile() must be called before calling setBindingDimensions() and setInputShapeBinding() for all dynamic input tensors or input shape tensors, which in turn must be called before either execute() or enqueue().
- Returns
- true if the call succeeded, else false (e.g. input out of range)
- Deprecated:
- This API is superseded by setOptimizationProfileAsync and will be removed in TensorRT 9.0.
- See also
- ICudaEngine::getNbOptimizationProfiles(), IExecutionContext::setOptimizationProfileAsync()
virtual bool nvinfer1::IExecutionContext::setBindingDimensions |
( |
int32_t |
bindingIndex, |
|
|
Dims |
dimensions |
|
) |
| |
|
pure virtual noexcept |
Set the dynamic dimensions of a binding.
Requires the engine to be built without an implicit batch dimension. The binding must be an input tensor, and all dimensions must be compatible with the network definition (i.e. only the wildcard dimension -1 can be replaced with a new dimension > 0). Furthermore, the dimensions must be in the valid range for the currently selected optimization profile, and the corresponding engine must not be safety-certified.
This method will fail unless a valid optimization profile is defined for the current execution context (getOptimizationProfile() must not be -1).
For all dynamic non-output bindings (which have at least one wildcard dimension of -1), this method needs to be called before either enqueue() or execute() may be called. This can be checked using the method allInputDimensionsSpecified().
- Returns
- false if an error occurs (e.g. index out of range), else true
- See also
- ICudaEngine::getBindingIndex
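The constraints on the dimensions argument can be expressed as a stand-alone check. This is a sketch of the rules stated above rather than the validation TensorRT performs: `dimsAcceptable` is a hypothetical helper, plain vectors stand in for Dims, and the profile range is assumed to be inclusive at both ends.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-alone check (not a TensorRT call). All vectors list
// dimensions outermost first.
//   network:   dimensions from the network definition, -1 marks a wildcard
//   requested: dimensions the caller intends to set
//   profMin / profMax: range of the currently selected optimization profile
bool dimsAcceptable(const std::vector<int32_t>& network, const std::vector<int32_t>& requested,
    const std::vector<int32_t>& profMin, const std::vector<int32_t>& profMax)
{
    const std::size_t n = network.size();
    if (requested.size() != n || profMin.size() != n || profMax.size() != n)
    {
        return false; // rank mismatch
    }
    for (std::size_t i = 0; i < n; ++i)
    {
        if (network[i] == -1)
        {
            // A wildcard may only be replaced by a dimension > 0.
            if (requested[i] <= 0)
            {
                return false;
            }
        }
        else if (requested[i] != network[i])
        {
            // Fixed dimensions must be left unchanged.
            return false;
        }
        // Assumed: the profile range is inclusive at both ends.
        if (requested[i] < profMin[i] || requested[i] > profMax[i])
        {
            return false;
        }
    }
    return true;
}
```

For a network input of {-1, 3, 224, 224} and a profile range of {1, 3, 224, 224} to {8, 3, 224, 224}, a request of {4, 3, 224, 224} passes, while a batch of 16 or a changed channel count is rejected.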
virtual Dims nvinfer1::IExecutionContext::getBindingDimensions |
( |
int32_t |
bindingIndex | ) |
const |
|
pure virtual noexcept |
Get the dynamic dimensions of a binding.
virtual bool nvinfer1::IExecutionContext::setOptimizationProfileAsync |
( |
int32_t |
profileIndex, |
|
|
cudaStream_t |
stream |
|
) |
| |
|
pure virtual noexcept |
Select an optimization profile for the current context with async semantics.
- Parameters
-
profileIndex | Index of the profile. It must lie between 0 and getEngine().getNbOptimizationProfiles() - 1, inclusive. |
stream | A CUDA stream on which the cudaMemcpyAsync calls may be enqueued. |
When an optimization profile is switched via this API, TensorRT may require that data is copied via cudaMemcpyAsync. It is the application’s responsibility to guarantee that synchronization between the profile sync stream and the enqueue stream occurs.
The selected profile will be used in subsequent calls to execute() or enqueue(). If the associated CUDA engine has inputs with dynamic shapes, the optimization profile must be set with a unique profileIndex before calling execute or enqueue. For the first execution context that is created for an engine, setOptimizationProfile(0) is called implicitly.
If the associated CUDA engine does not have inputs with dynamic shapes, this method need not be called, in which case the default profile index of 0 will be used.
setOptimizationProfileAsync() must be called before calling setBindingDimensions() and setInputShapeBinding() for all dynamic input tensors or input shape tensors, which in turn must be called before either execute() or enqueue().
- Warning
- Failing to synchronize the stream used at enqueue with the stream used to set the optimization profile through this API results in undefined behavior.
- Returns
- true if the call succeeded, else false (e.g. input out of range)
- See also
- ICudaEngine::getNbOptimizationProfiles(), IExecutionContext::setOptimizationProfile()
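The call order described above (select the profile, set dynamic dimensions, synchronize the streams, enqueue) might be wrapped as follows. This is a sketch under stated assumptions, not code from TensorRT: `switchProfileAndEnqueue` is a hypothetical helper, written as a template so that it compiles without the TensorRT or CUDA headers. `Context` is assumed to expose the member functions listed on this page, `Stream` stands in for cudaStream_t, `DimsT` for Dims, and `syncStreams` is a caller-supplied callable assumed to make the enqueue stream wait for work queued on the profile stream.

```cpp
// Hypothetical helper illustrating the documented call order for a single
// dynamic input binding.
template <typename Context, typename Stream, typename DimsT, typename SyncFn>
bool switchProfileAndEnqueue(Context& ctx, int profileIndex, Stream profileStream, Stream enqueueStream,
    int bindingIndex, const DimsT& dims, void** bindings, SyncFn syncStreams)
{
    // 1. Select the profile first; any copies it needs go to profileStream.
    if (!ctx.setOptimizationProfileAsync(profileIndex, profileStream))
    {
        return false;
    }
    // 2. Only then set the dynamic input dimensions.
    if (!ctx.setBindingDimensions(bindingIndex, dims))
    {
        return false;
    }
    if (!ctx.allInputDimensionsSpecified())
    {
        return false;
    }
    // 3. Order the two streams before enqueueing, as the warning requires.
    syncStreams(profileStream, enqueueStream);
    // 4. Enqueue inference; nullptr requests no inputConsumed event.
    return ctx.enqueueV2(bindings, enqueueStream, nullptr);
}
```

With CUDA, one possible `syncStreams` records an event on the profile stream with cudaEventRecord and makes the enqueue stream wait on it with cudaStreamWaitEvent; calling cudaStreamSynchronize on the profile stream is a simpler but blocking alternative. Engines with several dynamic inputs or input shape tensors would need one setBindingDimensions() or setInputShapeBinding() call per input before step 3.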