TensorRT  7.2.1.6
NVIDIA TensorRT
bert::FusedMultiHeadAttentionXMMAKernelV2 Class Reference

Public Types

using KernelMeta = FusedMultiHeadAttentionKernelMetaInfoV2
 
using KernelParam = Fused_multihead_attention_params_v2
 

Public Member Functions

 FusedMultiHeadAttentionXMMAKernelV2 (const FusedMultiHeadAttentionKernelMetaInfoV2 *pMetaStart, unsigned int nMetaCount, Data_type type, unsigned int sm)
 
uint64_t hashID (unsigned int s, bool interleaved, bool unroll) const
 
virtual uint64_t hashID (const KernelMeta &kernelMeta) const
 
virtual void run (Fused_multihead_attention_params_v2 &params, cudaStream_t ss) const
 
uint64_t hashID (unsigned int s, unsigned int d) const
 
void loadXMMAKernels ()
 
bool isValid (int s) const
 

Protected Attributes

nvinfer1::CUDADriverWrapper mDriver
 
Data_type mDataType
 
const FusedMultiHeadAttentionKernelMetaInfoV2 * mKernelMeta
 
unsigned int mKernelMetaCount
 
unsigned int mSM
 
std::unordered_map< const unsigned char *, CUmodule > mModules
 
std::unordered_map< uint64_t, FusedMultiHeadAttentionKernelInfo > mFunctions
 
std::set< int > mValidSequences
 

Member Typedef Documentation

◆ KernelMeta

◆ KernelParam

Constructor & Destructor Documentation

◆ FusedMultiHeadAttentionXMMAKernelV2()

bert::FusedMultiHeadAttentionXMMAKernelV2::FusedMultiHeadAttentionXMMAKernelV2 ( const FusedMultiHeadAttentionKernelMetaInfoV2 * pMetaStart,
unsigned int  nMetaCount,
Data_type  type,
unsigned int  sm 
)
inline

Member Function Documentation

◆ hashID() [1/3]

uint64_t bert::FusedMultiHeadAttentionXMMAKernelV2::hashID ( unsigned int  s,
bool  interleaved,
bool  unroll 
) const
inline

◆ hashID() [2/3]

virtual uint64_t bert::FusedMultiHeadAttentionXMMAKernelV2::hashID ( const KernelMeta & kernelMeta ) const
inline virtual

◆ run()

virtual void bert::FusedMultiHeadAttentionXMMAKernelV2::run ( Fused_multihead_attention_params_v2 & params,
cudaStream_t  ss 
) const
inline virtual

Reimplemented from bert::TFusedMultiHeadAttentionXMMAKernel< FusedMultiHeadAttentionKernelMetaInfoV2, Fused_multihead_attention_params_v2 >.


◆ hashID() [3/3]

◆ loadXMMAKernels()

◆ isValid()

Member Data Documentation

◆ mDriver

◆ mDataType

◆ mKernelMeta

◆ mKernelMetaCount

◆ mSM

◆ mModules

◆ mFunctions

std::unordered_map<uint64_t, FusedMultiHeadAttentionKernelInfo> bert::TFusedMultiHeadAttentionXMMAKernel< FusedMultiHeadAttentionKernelMetaInfoV2 , Fused_multihead_attention_params_v2 >::mFunctions
protected inherited
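
The mFunctions map is keyed by uint64_t, the same type the hashID() overloads return, which suggests that kernels are looked up by a packed integer key. The following is a minimal sketch of that pattern, not the TensorRT implementation. The names KernelInfoSketch, packKey and findKernel are hypothetical, and the bit layout (sequence length in the upper 32 bits, two flag bits at the bottom) is an assumption chosen for illustration; this page does not document the actual layout.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for FusedMultiHeadAttentionKernelInfo; the fields of
// the real struct are not shown on this page.
struct KernelInfoSketch
{
    unsigned int metaIndex;
};

// Assumed key layout (illustration only): sequence length s in the upper
// 32 bits, "interleaved" in bit 1, "unroll" in bit 0.
inline uint64_t packKey(unsigned int s, bool interleaved, bool unroll)
{
    return (static_cast<uint64_t>(s) << 32) | (interleaved ? 2ull : 0ull) | (unroll ? 1ull : 0ull);
}

// Look up a kernel by its packed key. Returns nullptr when no kernel was
// registered for that combination of parameters.
inline const KernelInfoSketch* findKernel(
    const std::unordered_map<uint64_t, KernelInfoSketch>& functions,
    unsigned int s, bool interleaved, bool unroll)
{
    const auto it = functions.find(packKey(s, interleaved, unroll));
    return it == functions.end() ? nullptr : &it->second;
}
```

With this layout, distinct (s, interleaved, unroll) combinations map to distinct keys, so a single unordered_map lookup can select the kernel variant to launch.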

◆ mValidSequences


The documentation for this class was generated from the following file: