pytext.torchscript.tensorizer package

Submodules

pytext.torchscript.tensorizer.bert module

class pytext.torchscript.tensorizer.bert.ScriptBERTTensorizer(tokenizer: torch.jit._script.ScriptModule, vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int)[source]

Bases: pytext.torchscript.tensorizer.bert.ScriptBERTTensorizerBase

class pytext.torchscript.tensorizer.bert.ScriptBERTTensorizerBase(tokenizer: torch.jit._script.ScriptModule, vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int)[source]

Bases: pytext.torchscript.tensorizer.tensorizer.ScriptTensorizer

pytext.torchscript.tensorizer.normalizer module

class pytext.torchscript.tensorizer.normalizer.VectorNormalizer(dim: int, do_normalization: bool = True)[source]

Bases: torch.nn.modules.module.Module

Performs in-place normalization over all features of a dense feature vector by doing (x - mean)/stddev for each x in the feature vector.

This module is TorchScript-compatible so that the normalize function can be called at training time in the tensorizer, as well as at inference time from your TorchScript forward function. To use this in your tensorizer, update_meta_data must be called once per row in your initialize function, and calculate_feature_stats must be called once after the final row has been processed. See the usage in FloatListTensorizer for an example.

Setting do_normalization=False will make the normalize function an identity function.

calculate_feature_stats()[source]
forward()[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

normalize(vec: List[List[float]])[source]
update_meta_data(vec)[source]
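
A minimal usage sketch of the workflow described above, assuming update_meta_data takes a single row's feature list and normalize mutates the batch in place; the sample rows are hypothetical:

    from typing import List

    from pytext.torchscript.tensorizer import VectorNormalizer

    # Hypothetical batch of 3-dimensional dense feature rows.
    rows: List[List[float]] = [
        [1.0, 10.0, 100.0],
        [2.0, 20.0, 200.0],
        [3.0, 30.0, 300.0],
    ]

    normalizer = VectorNormalizer(dim=3)

    # Accumulate per-feature statistics once per row, as a tensorizer's
    # initialize function would.
    for row in rows:
        normalizer.update_meta_data(row)

    # Finalize mean and stddev after the final row has been processed.
    normalizer.calculate_feature_stats()

    # In-place normalization: each x becomes (x - mean) / stddev.
    normalizer.normalize(rows)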

pytext.torchscript.tensorizer.roberta module

class pytext.torchscript.tensorizer.roberta.ScriptRoBERTaTensorizer(tokenizer: torch.jit._script.ScriptModule, vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int)[source]

Bases: pytext.torchscript.tensorizer.bert.ScriptBERTTensorizerBase

class pytext.torchscript.tensorizer.roberta.ScriptRoBERTaTensorizerWithIndices(tokenizer: torch.jit._script.ScriptModule, vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int)[source]

Bases: pytext.torchscript.tensorizer.bert.ScriptBERTTensorizerBase

pytext.torchscript.tensorizer.tensorizer module

class pytext.torchscript.tensorizer.tensorizer.ScriptFloat1DListTensorizer[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of Float1DListTensorizer in pytext/data/tensorizers.py

torchscriptify()[source]

class pytext.torchscript.tensorizer.tensorizer.ScriptFloatListSeqTensorizer(pad_token)[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of FloatListSeqTensorizer in pytext/data/tensorizers.py

torchscriptify()[source]

class pytext.torchscript.tensorizer.tensorizer.ScriptInteger1DListTensorizer[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of Integer1DListTensorizer in pytext/data/tensorizers.py

torchscriptify()[source]

class pytext.torchscript.tensorizer.tensorizer.ScriptTensorizer[source]

Bases: torch.jit._script.ScriptModule

set_padding_control(dimension: str, padding_control: Optional[List[int]])[source]

This function will be called to set a padding style. None: no padding. List: the first element must be 0, and the sequence length is rounded up to the smallest list element larger than the input length.
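
For illustration, a sketch of the rounding rule stated above; pad_to_length is a hypothetical helper, not part of this API:

    from typing import List, Optional

    def pad_to_length(input_len: int, padding_control: Optional[List[int]]) -> int:
        # None means no padding: keep the input length as-is.
        if padding_control is None:
            return input_len
        # First element is 0; round up to the smallest control value
        # larger than the input length.
        for candidate in padding_control:
            if candidate > input_len:
                return candidate
        # No control value is large enough, so leave the length unchanged.
        return input_len

    pad_to_length(50, [0, 32, 64, 128])  # -> 64
    pad_to_length(50, None)              # -> 50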

class pytext.torchscript.tensorizer.tensorizer.VocabLookup(vocab: pytext.torchscript.vocab.ScriptVocabulary)[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of lookup_tokens() in pytext/data/tensorizers.py

pytext.torchscript.tensorizer.xlm module

class pytext.torchscript.tensorizer.xlm.ScriptXLMTensorizer(tokenizer: torch.jit._script.ScriptModule, token_vocab: pytext.torchscript.vocab.ScriptVocabulary, language_vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int, default_language: str)[source]

Bases: pytext.torchscript.tensorizer.tensorizer.ScriptTensorizer

Module contents

class pytext.torchscript.tensorizer.ScriptBERTTensorizer(tokenizer: torch.jit._script.ScriptModule, vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int)[source]

Bases: pytext.torchscript.tensorizer.bert.ScriptBERTTensorizerBase

class pytext.torchscript.tensorizer.ScriptFloat1DListTensorizer[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of Float1DListTensorizer in pytext/data/tensorizers.py

torchscriptify()[source]

class pytext.torchscript.tensorizer.ScriptFloatListSeqTensorizer(pad_token)[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of FloatListSeqTensorizer in pytext/data/tensorizers.py

torchscriptify()[source]

class pytext.torchscript.tensorizer.ScriptInteger1DListTensorizer[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of Integer1DListTensorizer in pytext/data/tensorizers.py

torchscriptify()[source]

class pytext.torchscript.tensorizer.ScriptRoBERTaTensorizer(tokenizer: torch.jit._script.ScriptModule, vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int)[source]

Bases: pytext.torchscript.tensorizer.bert.ScriptBERTTensorizerBase

class pytext.torchscript.tensorizer.ScriptRoBERTaTensorizerWithIndices(tokenizer: torch.jit._script.ScriptModule, vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int)[source]

Bases: pytext.torchscript.tensorizer.bert.ScriptBERTTensorizerBase

class pytext.torchscript.tensorizer.ScriptXLMTensorizer(tokenizer: torch.jit._script.ScriptModule, token_vocab: pytext.torchscript.vocab.ScriptVocabulary, language_vocab: pytext.torchscript.vocab.ScriptVocabulary, max_seq_len: int, default_language: str)[source]

Bases: pytext.torchscript.tensorizer.tensorizer.ScriptTensorizer

class pytext.torchscript.tensorizer.VectorNormalizer(dim: int, do_normalization: bool = True)[source]

Bases: torch.nn.modules.module.Module

Performs in-place normalization over all features of a dense feature vector by doing (x - mean)/stddev for each x in the feature vector.

This module is TorchScript-compatible so that the normalize function can be called at training time in the tensorizer, as well as at inference time from your TorchScript forward function. To use this in your tensorizer, update_meta_data must be called once per row in your initialize function, and calculate_feature_stats must be called once after the final row has been processed. See the usage in FloatListTensorizer for an example.

Setting do_normalization=False will make the normalize function an identity function.

calculate_feature_stats()[source]
forward()[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

normalize(vec: List[List[float]])[source]
update_meta_data(vec)[source]

class pytext.torchscript.tensorizer.ScriptTensorizer[source]

Bases: torch.jit._script.ScriptModule

set_padding_control(dimension: str, padding_control: Optional[List[int]])[source]

This function will be called to set a padding style. None: no padding. List: the first element must be 0, and the sequence length is rounded up to the smallest list element larger than the input length.

class pytext.torchscript.tensorizer.VocabLookup(vocab: pytext.torchscript.vocab.ScriptVocabulary)[source]

Bases: torch.jit._script.ScriptModule

TorchScript implementation of lookup_tokens() in pytext/data/tensorizers.py