TensorRT 7.2.1.6
NVIDIA TensorRT
helpers.tokenization Namespace Reference

Classes

class  BasicTokenizer
 
class  BertTokenizer
 
class  FullTokenizer
 
class  WordpieceTokenizer
 

Functions

def validate_case_matches_checkpoint (do_lower_case, init_checkpoint)
 
def convert_to_unicode (text)
 
def printable_text (text)
 
def load_vocab (vocab_file)
 
def convert_by_vocab (vocab, items)
 
def convert_tokens_to_ids (vocab, tokens)
 
def convert_ids_to_tokens (inv_vocab, ids)
 
def whitespace_tokenize (text)
 
def _is_whitespace (char)
 
def _is_control (char)
 
def _is_punctuation (char)
 

Function Documentation

◆ validate_case_matches_checkpoint()

def helpers.tokenization.validate_case_matches_checkpoint(do_lower_case, init_checkpoint)
Checks whether the casing config is consistent with the checkpoint name.
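
A minimal usage sketch; the checkpoint path below is illustrative, and the import assumes the demo's helpers package is on sys.path:

from helpers import tokenization

# Raises ValueError if the flag disagrees with the checkpoint's naming
# convention, e.g. do_lower_case=False with an "uncased_..." checkpoint.
tokenization.validate_case_matches_checkpoint(
    do_lower_case=True,
    init_checkpoint="uncased_L-12_H-768_A-12/bert_model.ckpt")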

◆ convert_to_unicode()

def helpers.tokenization.convert_to_unicode(text)
Converts `text` to Unicode (if it's not already), assuming utf-8 input.
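
A quick sketch of the expected behavior on Python 3 (import path assumed as above):

from helpers import tokenization

# bytes are decoded as UTF-8; str passes through unchanged.
assert tokenization.convert_to_unicode(b"caf\xc3\xa9") == "café"
assert tokenization.convert_to_unicode("café") == "café"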

◆ printable_text()

def helpers.tokenization.printable_text(text)
Returns text encoded in a way suitable for print or `tf.logging`.
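
A small sketch of typical use when logging mixed str/bytes values (import path assumed as above):

from helpers import tokenization

# On Python 3, both str and UTF-8 bytes come back as printable str.
print(tokenization.printable_text("token: café"))
print(tokenization.printable_text(b"token: caf\xc3\xa9"))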

◆ load_vocab()

def helpers.tokenization.load_vocab(vocab_file)
Loads a vocabulary file into a dictionary.
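
A minimal sketch; the throwaway vocab file below is illustrative only (real BERT models ship a vocab.txt with roughly 30k lines), and the import path is assumed:

import tempfile
from helpers import tokenization

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("[PAD]\n[UNK]\n[CLS]\n[SEP]\nhello\nworld\n")
    vocab_path = f.name

vocab = tokenization.load_vocab(vocab_path)  # token -> index, in file order
assert vocab["hello"] == 4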

◆ convert_by_vocab()

def helpers.tokenization.convert_by_vocab(vocab, items)
Converts a sequence of [tokens|ids] using the vocab.
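
A sketch with an illustrative toy vocab; the same function serves both directions depending on which mapping is passed:

from helpers import tokenization

vocab = {"[CLS]": 0, "hello": 1, "world": 2, "[SEP]": 3}  # toy mapping
inv_vocab = {v: k for k, v in vocab.items()}

assert tokenization.convert_by_vocab(vocab, ["hello", "world"]) == [1, 2]
assert tokenization.convert_by_vocab(inv_vocab, [1, 2]) == ["hello", "world"]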

◆ convert_tokens_to_ids()

def helpers.tokenization.convert_tokens_to_ids(vocab, tokens)
Converts a sequence of tokens to ids using the vocab.

◆ convert_ids_to_tokens()

def helpers.tokenization.convert_ids_to_tokens(inv_vocab, ids)
Converts a sequence of ids to tokens using the inverse vocab.
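
A round-trip sketch using the two wrappers above (toy vocab, assumed import path):

from helpers import tokenization

vocab = {"hello": 0, "world": 1}               # toy token -> id mapping
inv_vocab = {v: k for k, v in vocab.items()}   # id -> token

ids = tokenization.convert_tokens_to_ids(vocab, ["hello", "world"])
assert tokenization.convert_ids_to_tokens(inv_vocab, ids) == ["hello", "world"]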

◆ whitespace_tokenize()

def helpers.tokenization.whitespace_tokenize(text)
Runs basic whitespace cleaning and splitting on a piece of text.
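
A one-line sketch of the expected behavior (assumed import path):

from helpers import tokenization

# Leading/trailing whitespace is stripped, then the text is split on runs of whitespace.
assert tokenization.whitespace_tokenize("  hello   world \n") == ["hello", "world"]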

◆ _is_whitespace()

def helpers.tokenization._is_whitespace(char)
private
Checks whether `char` is a whitespace character.

◆ _is_control()

def helpers.tokenization._is_control(char)
private
Checks whether `char` is a control character.

◆ _is_punctuation()

def helpers.tokenization._is_punctuation(char)
private
Checks whether `char` is a punctuation character.
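
The bodies of these private helpers are not shown here; the sketch below implements predicates with the behavior the docstrings describe, using unicodedata. The names (without the leading underscore) and exact category handling are illustrative and may differ from the module's own code:

import unicodedata

def is_whitespace(char):
    # Space, tab, newline and carriage return count as whitespace, as do
    # Unicode space separators (category Zs).
    if char in (" ", "\t", "\n", "\r"):
        return True
    return unicodedata.category(char) == "Zs"

def is_control(char):
    # \t, \n and \r are treated as whitespace above, not as control chars.
    if char in ("\t", "\n", "\r"):
        return False
    return unicodedata.category(char).startswith("C")

def is_punctuation(char):
    # Non-letter, non-digit ASCII such as "$" or "^" is treated as punctuation
    # even though Unicode does not classify it that way, plus everything in
    # the Unicode punctuation categories (P*).
    cp = ord(char)
    if 33 <= cp <= 47 or 58 <= cp <= 64 or 91 <= cp <= 96 or 123 <= cp <= 126:
        return True
    return unicodedata.category(char).startswith("P")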