NVIDIA TensorRT 7.2.1.6
polygraphy.backend.trt_legacy.TrtLegacyRunner Class Reference
Inherits polygraphy.backend.base.runner.BaseRunner.

Classes

class  HostDeviceMem
 

Public Member Functions

def __init__ (self, network_loader=None, max_workspace_size=None, max_batch_size=None, fp16=None, tf32=None, load_engine=None, save_engine=None, layerwise=False, plugins=[], name=None)
 
def activate_impl (self)
 
def get_input_metadata (self)
 
def deactivate_impl (self)
 
def infer_impl (self, feed_dict)
 
def last_inference_time (self)
 
def __enter__ (self)
 
def __exit__ (self, exc_type, exc_value, traceback)
 
def activate (self)
 
def infer_impl (self)
 
def infer (self, feed_dict)
 
def deactivate (self)
 

Public Attributes

 network_loader
 
 max_workspace_size
 
 fp16
 
 tf32
 
 load_engine
 
 engine_path
 
 layerwise
 
 max_batch_size
 
 engine
 
 context
 
 stream
 
 inference_time
 
 name
 
 is_active
 

Static Public Attributes

 RUNNER_COUNTS = defaultdict(int)
 

Detailed Description

A runner that can perform inference on a single TensorRT engine.
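
A minimal usage sketch, assuming a serialized engine at the hypothetical path ``model.engine`` and an input binding named ``input``; the base class's context manager handles activation and deactivation:

    from collections import OrderedDict

    import numpy as np

    from polygraphy.backend.trt_legacy import TrtLegacyRunner

    # Hypothetical engine path and input name, for illustration only.
    runner = TrtLegacyRunner(load_engine="model.engine", name="trt-legacy")

    with runner:  # __enter__ calls activate(); __exit__ calls deactivate()
        feed_dict = OrderedDict([("input", np.zeros((1, 3, 224, 224), dtype=np.float32))])
        outputs = runner.infer(feed_dict)
        for name, arr in outputs.items():
            print(name, arr.shape, arr.dtype)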

Constructor & Destructor Documentation

◆ __init__()

def polygraphy.backend.trt_legacy.TrtLegacyRunner.__init__(self, network_loader=None, max_workspace_size=None, max_batch_size=None, fp16=None, tf32=None, load_engine=None, save_engine=None, layerwise=False, plugins=[], name=None)
Creates a runner that manages a single TensorRT engine.

Args:
    network_loader (BaseModelLoader):
        A loader that returns a TRT builder, network, parser, and input shapes.
    max_workspace_size (int): The maximum workspace size.
    max_batch_size (int): The maximum batch size.
    fp16 (bool): Whether to run in fp16 mode.
    tf32 (bool): Whether to allow TF32 precision.
    load_engine (str): A path from which to load a serialized engine.
    save_engine (str): A path at which to save the engine.
    layerwise (bool): Whether to retrieve the outputs of every layer in the network.
    plugins (List[str]): Paths to plugin libraries to load.
    name (str):
        The human-readable name prefix to use for this runner.
        A runner count and timestamp will be appended to this prefix.
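
For illustration, the two typical ways to construct a runner might look like this; ``my_network_loader`` is a hypothetical ``BaseModelLoader`` and all paths are placeholders:

    from polygraphy.backend.trt_legacy import TrtLegacyRunner

    # Deserialize a previously saved engine (hypothetical path).
    runner = TrtLegacyRunner(load_engine="model.engine")

    # Or build from a network loader, then save the resulting engine.
    my_network_loader = ...  # hypothetical BaseModelLoader returning a
                             # (builder, network, parser) plus input shapes
    runner = TrtLegacyRunner(
        network_loader=my_network_loader,
        max_workspace_size=1 << 30,  # 1 GiB of builder workspace
        max_batch_size=8,
        fp16=True,
        save_engine="model.engine",
    )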

Member Function Documentation

◆ activate_impl()

def polygraphy.backend.trt_legacy.TrtLegacyRunner.activate_impl(self)
Vars:
    engine (trt.ICudaEngine):
        The engine tracked by this runner. The TrtLegacyRunner OWNS the engine it
        manages, and is therefore responsible for its destruction. Do not free the
        engine outside of the runner, or it will result in a double free.
    context (trt.IExecutionContext): The context used for inference.
    input_buffers (Dict[str, TrtLegacyRunner.HostDeviceMem]):
        A mapping of binding names to HostDeviceMem objects for input buffers.
    output_buffers (Dict[str, TrtLegacyRunner.HostDeviceMem]):
        A mapping of binding names to HostDeviceMem objects for output buffers.
    bindings (List[int]): A list of device pointers for engine bindings.
    stream (cuda.Stream): The CUDA stream that this runner will use for inference.

Reimplemented from polygraphy.backend.base.runner.BaseRunner.
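
To make the buffer bookkeeping concrete, here is a minimal sketch of a pinned host buffer paired with a device allocation, in the spirit of ``HostDeviceMem``; this uses pycuda and the real class's layout may differ:

    import numpy as np
    import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
    import pycuda.driver as cuda

    class HostDeviceMem:
        """Sketch of a paired host/device buffer (layout assumed)."""
        def __init__(self, shape, dtype):
            self.host = cuda.pagelocked_empty(shape, dtype)  # pinned host memory
            self.device = cuda.mem_alloc(self.host.nbytes)   # device allocation

    stream = cuda.Stream()
    buf = HostDeviceMem((1, 3, 224, 224), np.float32)
    buf.host[:] = 0.0
    cuda.memcpy_htod_async(buf.device, buf.host, stream)  # host -> device copy
    stream.synchronize()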

◆ get_input_metadata()

def polygraphy.backend.trt_legacy.TrtLegacyRunner.get_input_metadata(self)
Returns information about the inputs of the model.
Shapes here may include dynamic dimensions, represented by ``None``.
Must be called only after activate() and before deactivate().

Returns:
    TensorMetadata: Input names, shapes, and data types.

Reimplemented from polygraphy.backend.base.runner.BaseRunner.
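
A sketch of building a ``feed_dict`` from this metadata, assuming each entry exposes ``shape`` and ``dtype`` attributes and that the runner is already active; dynamic (``None``) dimensions are arbitrarily replaced with 1 for illustration:

    import numpy as np

    feed_dict = {}
    for name, meta in runner.get_input_metadata().items():
        # Replace dynamic dimensions (None) with a concrete size for this run.
        shape = tuple(dim if dim is not None else 1 for dim in meta.shape)
        feed_dict[name] = np.random.random_sample(shape).astype(meta.dtype)

    outputs = runner.infer(feed_dict)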

◆ deactivate_impl()

def polygraphy.backend.trt_legacy.TrtLegacyRunner.deactivate_impl(self)
Implementation for runner deactivation. Derived classes should override this function
rather than ``deactivate()``.

Reimplemented from polygraphy.backend.base.runner.BaseRunner.

◆ infer_impl() [1/2]

def polygraphy.backend.trt_legacy.TrtLegacyRunner.infer_impl(self, feed_dict)

◆ last_inference_time()

def polygraphy.backend.base.runner.BaseRunner.last_inference_time(self)
inherited
Returns the total inference time required during the last call to ``infer()``.

Returns:
    float: The time in seconds, or None if runtime was not measured by the runner.
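
For example, one might average the reported runtime over several inferences (``runner`` and ``feed_dict`` as in the earlier sketches; runners that do not measure time return ``None``):

    times = []
    for _ in range(10):
        runner.infer(feed_dict)
        elapsed = runner.last_inference_time()
        if elapsed is not None:
            times.append(elapsed)

    if times:
        print(f"Average inference time: {sum(times) / len(times):.4f} s")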

◆ __enter__()

def polygraphy.backend.base.runner.BaseRunner.__enter__(self)
inherited

◆ __exit__()

def polygraphy.backend.base.runner.BaseRunner.__exit__(self, exc_type, exc_value, traceback)
inherited

◆ activate()

def polygraphy.backend.base.runner.BaseRunner.activate(self)
inherited
Activate the runner for inference. This may involve allocating GPU buffers, for example.

◆ infer_impl() [2/2]

def polygraphy.backend.base.runner.BaseRunner.infer_impl(self)
inherited
Implementation for runner inference. Derived classes should override this function
rather than ``infer()``.

◆ infer()

def polygraphy.backend.base.runner.BaseRunner.infer(self, feed_dict)
inherited
Runs inference using the provided feed_dict.

Args:
    feed_dict (OrderedDict[str, numpy.ndarray]): A mapping of input tensor names to corresponding input NumPy arrays.

Returns:
    OrderedDict[str, numpy.ndarray]:
    A mapping of output tensor names to their corresponding NumPy arrays.
    IMPORTANT: Runners may reuse these output buffers. Thus, if you need to save
    outputs from multiple inferences, you should make a copy with ``copy.copy(outputs)``.
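
To illustrate the buffer-reuse caveat above, a sketch that copies each result before the next call; ``runner`` is assumed active and ``feed_dicts`` is a hypothetical list of inputs:

    import copy

    all_outputs = []
    for feed_dict in feed_dicts:
        outputs = runner.infer(feed_dict)
        # The runner may overwrite its output buffers on the next infer() call,
        # so copy before accumulating, as noted above.
        all_outputs.append(copy.copy(outputs))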

◆ deactivate()

def polygraphy.backend.base.runner.BaseRunner.deactivate(self)
inherited
Deactivate the runner.
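
When the context-manager form is inconvenient, activation and deactivation can be paired manually; a sketch, with ``runner`` and ``feed_dict`` as in the earlier examples:

    runner.activate()  # e.g. allocates GPU buffers
    try:
        outputs = runner.infer(feed_dict)
    finally:
        runner.deactivate()  # release resources even if inference fails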

Member Data Documentation

◆ network_loader

polygraphy.backend.trt_legacy.TrtLegacyRunner.network_loader

◆ max_workspace_size

polygraphy.backend.trt_legacy.TrtLegacyRunner.max_workspace_size

◆ fp16

polygraphy.backend.trt_legacy.TrtLegacyRunner.fp16

◆ tf32

polygraphy.backend.trt_legacy.TrtLegacyRunner.tf32

◆ load_engine

polygraphy.backend.trt_legacy.TrtLegacyRunner.load_engine

◆ engine_path

polygraphy.backend.trt_legacy.TrtLegacyRunner.engine_path

◆ layerwise

polygraphy.backend.trt_legacy.TrtLegacyRunner.layerwise

◆ max_batch_size

polygraphy.backend.trt_legacy.TrtLegacyRunner.max_batch_size

◆ engine

polygraphy.backend.trt_legacy.TrtLegacyRunner.engine

◆ context

polygraphy.backend.trt_legacy.TrtLegacyRunner.context

◆ stream

polygraphy.backend.trt_legacy.TrtLegacyRunner.stream

◆ inference_time

polygraphy.backend.trt_legacy.TrtLegacyRunner.inference_time

◆ RUNNER_COUNTS

polygraphy.backend.base.runner.BaseRunner.RUNNER_COUNTS = defaultdict(int)
static, inherited

◆ name

polygraphy.backend.base.runner.BaseRunner.name
inherited

◆ is_active

polygraphy.backend.base.runner.BaseRunner.is_active
inherited
