pytext.models package¶
Subpackages¶
- pytext.models.decoders package
- Submodules
- pytext.models.decoders.decoder_base module
- pytext.models.decoders.intent_slot_model_decoder module
- pytext.models.decoders.mlp_decoder module
- pytext.models.decoders.mlp_decoder_query_response module
- pytext.models.decoders.mlp_decoder_two_tower module
- pytext.models.decoders.multilabel_decoder module
- Module contents
- pytext.models.embeddings package
- Submodules
- pytext.models.embeddings.char_embedding module
- pytext.models.embeddings.contextual_token_embedding module
- pytext.models.embeddings.dict_embedding module
- pytext.models.embeddings.embedding_base module
- pytext.models.embeddings.embedding_list module
- pytext.models.embeddings.mlp_embedding module
- pytext.models.embeddings.scriptable_embedding_list module
- pytext.models.embeddings.word_embedding module
- pytext.models.embeddings.word_seq_embedding module
- Module contents
- pytext.models.ensembles package
- pytext.models.language_models package
- pytext.models.output_layers package
- Submodules
- pytext.models.output_layers.distance_output_layer module
- pytext.models.output_layers.doc_classification_output_layer module
- pytext.models.output_layers.doc_regression_output_layer module
- pytext.models.output_layers.intent_slot_output_layer module
- pytext.models.output_layers.lm_output_layer module
- pytext.models.output_layers.multi_label_classification_layer module
- pytext.models.output_layers.output_layer_base module
- pytext.models.output_layers.pairwise_ranking_output_layer module
- pytext.models.output_layers.squad_output_layer module
- pytext.models.output_layers.utils module
- pytext.models.output_layers.word_tagging_output_layer module
- Module contents
- pytext.models.qna package
- pytext.models.representations package
- Subpackages
- pytext.models.representations.transformer package
- Submodules
- pytext.models.representations.transformer.multihead_attention module
- pytext.models.representations.transformer.multihead_linear_attention module
- pytext.models.representations.transformer.positional_embedding module
- pytext.models.representations.transformer.representation module
- pytext.models.representations.transformer.residual_mlp module
- pytext.models.representations.transformer.sentence_encoder module
- pytext.models.representations.transformer.transformer module
- Module contents
- Submodules
- pytext.models.representations.attention module
- pytext.models.representations.augmented_lstm module
- pytext.models.representations.bilstm module
- pytext.models.representations.bilstm_doc_attention module
- pytext.models.representations.bilstm_doc_slot_attention module
- pytext.models.representations.bilstm_slot_attn module
- pytext.models.representations.biseqcnn module
- pytext.models.representations.contextual_intent_slot_rep module
- pytext.models.representations.deepcnn module
- pytext.models.representations.docnn module
- pytext.models.representations.huggingface_bert_sentence_encoder module
- pytext.models.representations.huggingface_electra_sentence_encoder module
- pytext.models.representations.jointcnn_rep module
- pytext.models.representations.ordered_neuron_lstm module
- pytext.models.representations.pair_rep module
- pytext.models.representations.pass_through module
- pytext.models.representations.pooling module
- pytext.models.representations.pure_doc_attention module
- pytext.models.representations.representation_base module
- pytext.models.representations.seq_rep module
- pytext.models.representations.slot_attention module
- pytext.models.representations.sparse_transformer_sentence_encoder module
- pytext.models.representations.stacked_bidirectional_rnn module
- pytext.models.representations.traced_transformer_encoder module
- pytext.models.representations.transformer_sentence_encoder module
- pytext.models.representations.transformer_sentence_encoder_base module
- Module contents
- pytext.models.semantic_parsers package
- Subpackages
- pytext.models.seq_models package
- Submodules
- pytext.models.seq_models.attention module
- pytext.models.seq_models.base module
- pytext.models.seq_models.contextual_intent_slot module
- pytext.models.seq_models.conv_decoder module
- pytext.models.seq_models.conv_encoder module
- pytext.models.seq_models.conv_model module
- pytext.models.seq_models.light_conv module
- pytext.models.seq_models.mask_generator module
- pytext.models.seq_models.nar_length module
- pytext.models.seq_models.nar_modules module
- pytext.models.seq_models.nar_output_layer module
- pytext.models.seq_models.positional module
- pytext.models.seq_models.projection_layers module
- pytext.models.seq_models.rnn_decoder module
- pytext.models.seq_models.rnn_encoder module
- pytext.models.seq_models.rnn_encoder_decoder module
- pytext.models.seq_models.seq2seq_model module
- pytext.models.seq_models.seq2seq_output_layer module
- pytext.models.seq_models.seqnn module
- pytext.models.seq_models.utils module
- Module contents
Submodules¶
pytext.models.bert_classification_models module¶
class pytext.models.bert_classification_models.BertPairwiseModel(encoder1, encoder2, decoder, output_layer, encode_relations, shared_encoder)[source]¶
Bases: pytext.models.bert_classification_models._EncoderPairwiseModel
BERT pairwise classification model.
The model takes two sets of tokens (left and right) and calculates their representations separately using a shared BERT encoder. The final prediction is the cosine similarity of the two embeddings, or, if encode_relations is specified, the concatenation of the embeddings, their absolute difference, and their elementwise product.
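A minimal sketch of the two scoring modes described above (names and shapes are illustrative; the real model wires these through its decoder and output layer):

import torch
import torch.nn.functional as F

def pair_score(left_emb: torch.Tensor, right_emb: torch.Tensor, encode_relations: bool) -> torch.Tensor:
    # left_emb / right_emb: batch_size x rep_dim sentence embeddings from the shared encoder.
    if encode_relations:
        # Concatenation of the embeddings, their absolute difference, and their
        # elementwise product, to be consumed by a downstream decoder.
        return torch.cat([left_emb, right_emb, torch.abs(left_emb - right_emb), left_emb * right_emb], dim=-1)
    # Otherwise the prediction is the cosine similarity of the embeddings.
    return F.cosine_similarity(left_emb, right_emb, dim=-1)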
pytext.models.bert_regression_model module¶
class pytext.models.bert_regression_model.BertPairwiseRegressionModel(encoder1, encoder2, decoder, output_layer, encode_relations, shared_encoder)[source]¶
Bases: pytext.models.bert_classification_models.BertPairwiseModel
Two-tower model for regression. Encodes two texts separately and uses the cosine similarity between the sentence embeddings to predict the regression label.
class pytext.models.bert_regression_model.NewBertRegressionModel(encoder, decoder, output_layer)[source]¶
Bases: pytext.models.bert_classification_models.NewBertModel
BERT single sentence (or concatenated sentences) regression.
pytext.models.crf module¶
class pytext.models.crf.CRF(num_tags: int, ignore_index: int, default_label_pad_index: int)[source]¶
Bases: torch.nn.modules.module.Module
Compute the log-likelihood of the input assuming a conditional random field model.
Parameters: num_tags – The number of tags.
decode(emissions: torch.Tensor, seq_lens: torch.Tensor) → torch.Tensor[source]¶
Given a set of emission probabilities, return the predicted tags.
Parameters: - emissions – Emission probabilities with expected shape of batch_size * seq_len * num_labels
- seq_lens – Length of each input.
export_to_caffe2(workspace, init_net, predict_net, logits_output_name)[source]¶
Exports the CRF layer to Caffe2 by manually adding the necessary operators to the init_net and predict_net.
Parameters: - init_net – caffe2 init net created by the current graph
- predict_net – caffe2 net created by the current graph
- workspace – caffe2 current workspace
- output_names – current output names of the caffe2 net
- py_model – original pytorch model object
Returns: The updated predictions blob name
Return type: string
forward(emissions: torch.Tensor, tags: torch.Tensor, reduce: bool = True) → torch.Tensor[source]¶
Compute the log-likelihood of the input.
Parameters: - emissions – Emission values for different tags for each input. The expected shape is batch_size * seq_len * num_labels. Padding is should be on the right side of the input.
- tags – Actual tags for each token in the input. Expected shape is batch_size * seq_len
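A usage sketch against the signatures documented above (the tag count and shapes are illustrative):

import torch
from pytext.models.crf import CRF

crf = CRF(num_tags=9, ignore_index=-1, default_label_pad_index=0)
emissions = torch.randn(2, 5, 9)        # batch_size x seq_len x num_labels
tags = torch.randint(0, 9, (2, 5))      # batch_size x seq_len, padded on the right
seq_lens = torch.tensor([5, 3])

loss = -crf(emissions, tags)            # negative log-likelihood as a training loss
predicted = crf.decode(emissions, seq_lens)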
pytext.models.disjoint_multitask_model module¶
class pytext.models.disjoint_multitask_model.DisjointMultitaskModel(models, loss_weights)[source]¶
Bases: pytext.models.model.Model
Wrapper model to train multiple PyText models that share parameters. Designed to be used for multi-tasking when the tasks have disjoint datasets.
Modules which have the same shared_module_key and type share parameters. Only the first such module needs to be configured in full in each case.
Parameters: models (type) – Dictionary of models of sub-tasks.
current_model¶
Current model to route the input batch to.
Type: type
contextualize(context)[source]¶
Add additional context into the model. context can be anything that helps maintain/update state. For example, it is used by DisjointMultitaskModel to change the task that should be trained with a given iterator.
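A hypothetical routing sketch; the context key shown here is illustrative, not the library's actual contract:

model = DisjointMultitaskModel(models={"intent": intent_model, "ner": ner_model}, loss_weights={"intent": 1.0, "ner": 1.0})
model.contextualize({"task_name": "ner"})   # route subsequent batches to the "ner" sub-model
outputs = model(*ner_batch)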
current_model¶
forward(*inputs) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
pytext.models.distributed_model module¶
class pytext.models.distributed_model.DistributedModel(*args, **kwargs)[source]¶
Bases: torch.nn.parallel.distributed.DistributedDataParallel
Wrapper model class to train models in a distributed data parallel manner. To use this class to train your model in a distributed manner:
distributed_model = DistributedModel(
    module=model,
    device_ids=[device_id0, device_id1],
    output_device=device_id0,
    broadcast_buffers=False,
)
where model is the instance of the actual model class you want to train in a distributed manner.
load_state_dict(*args, **kwargs)[source]¶
Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module's state_dict() function.
Parameters: - state_dict (dict) – a dict containing parameters and persistent buffers.
- strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function. Default: True
Returns: - missing_keys is a list of str containing the missing keys
- unexpected_keys is a list of str containing the unexpected keys
Return type: NamedTuple with missing_keys and unexpected_keys fields
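Example (illustrative; assumes saved_state is a state dict from a compatible checkpoint):
>>> result = distributed_model.load_state_dict(saved_state, strict=False)
>>> result.missing_keys, result.unexpected_keys
([], [])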
state_dict(*args, **kwargs)[source]¶
Returns a dictionary containing a whole state of the module.
Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names.
Returns: a dictionary containing a whole state of the module
Return type: dict
Example:
>>> module.state_dict().keys()
['bias', 'weight']
pytext.models.doc_model module¶
class pytext.models.doc_model.ByteTokensDocumentModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase)[source]¶
Bases: pytext.models.doc_model.DocModel
DocModel that receives both word IDs and byte IDs as inputs (concatenating word and byte-token embeddings to represent input tokens).
class pytext.models.doc_model.DocModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase)[source]¶
Bases: pytext.models.model.Model
DocModel that’s compatible with the new Model abstraction, which is responsible for describing which inputs it expects and arranging its input tensors.
classmethod create_decoder(config: pytext.models.doc_model.DocModel.Config, representation_dim: int, num_labels: int)[source]¶
classmethod create_embedding(config: pytext.models.doc_model.DocModel.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]¶
classmethod create_output_layer(config: pytext.models.doc_model.DocModel.Config, labels: pytext.data.tensorizers.VocabConfig)[source]¶
class pytext.models.doc_model.DocRegressionModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase)[source]¶
Bases: pytext.models.doc_model.DocModel
Model that’s compatible with the new Model abstraction, and is configured for regression tasks (specifically for labels, predictions, and loss).
class pytext.models.doc_model.PersonalizedDocModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, user_embedding: Optional[pytext.models.embeddings.embedding_base.EmbeddingBase] = None)[source]¶
Bases: pytext.models.doc_model.DocModel
DocModel that includes a user embedding, which learns user features to produce personalized predictions. In this class, the user embedding is fed directly to the decoder (i.e., it does not go through the encoders).
forward(*inputs) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
pytext.models.joint_model module¶
class pytext.models.joint_model.IntentSlotModel(default_doc_loss_weight, default_word_loss_weight, *args, **kwargs)[source]¶
Bases: pytext.models.model.Model
A joint intent-slot model. This is framed as a model that performs a document classification task and a word tagging task, where the embedding and text representation layers are shared between both tasks.
The supported representation layers are based on bidirectional LSTM or CNN.
It can be instantiated just like any other Model. It uses the new data handling design involving tensorizers; that is the difference between this and JointModel.
pytext.models.masked_lm module¶
class pytext.models.masked_lm.MaskedLanguageModel(encoder: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase, decoder: pytext.models.decoders.mlp_decoder.MLPDecoder, output_layer: pytext.models.output_layers.lm_output_layer.LMOutputLayer, token_tensorizer: pytext.data.bert_tensorizer.BERTTensorizerBase, vocab: pytext.data.utils.Vocabulary, mask_prob: float = 0.15, mask_bos: bool = False, masking_strategy: pytext.models.masking_utils.MaskingStrategy = <MaskingStrategy.RANDOM: 'random'>, stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]¶
Bases: pytext.models.model.BaseModel
Masked language model for BERT style pre-training.
SUPPORT_FP16_OPTIMIZER = True¶
forward(*inputs) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
pytext.models.masking_utils module¶
class pytext.models.masking_utils.MaskingStrategy[source]¶
Bases: enum.Enum
An enumeration.
FREQUENCY = 'frequency_based'¶
RANDOM = 'random'¶
pytext.models.masking_utils.frequency_based_masking(tokens: torch.Tensor, token_sampling_weights: numpy.ndarray, mask_prob: float) → torch.Tensor[source]¶
Function to mask tokens based on frequency.
Inputs:
- tokens: Tensor with token ids of shape (batch_size x seq_len)
- token_sampling_weights: numpy array of shape (batch_size x seq_len), with each element representing the sampling weight associated with the corresponding token in tokens
- mask_prob: probability of masking a particular token
Outputs:
- mask: Tensor with the same shape as the input tokens (batch_size x seq_len), with masked tokens represented by a 1 and everything else as 0
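A minimal sketch of this contract, assuming masked positions are drawn without replacement in proportion to the sampling weights and that roughly mask_prob * seq_len tokens are masked per sequence (the library internals may differ):

import numpy as np
import torch

def frequency_masking_sketch(tokens: torch.Tensor, token_sampling_weights: np.ndarray, mask_prob: float) -> torch.Tensor:
    batch_size, seq_len = tokens.shape
    num_to_mask = max(1, int(mask_prob * seq_len))
    mask = torch.zeros_like(tokens)
    for i in range(batch_size):
        probs = token_sampling_weights[i] / token_sampling_weights[i].sum()
        picked = np.random.choice(seq_len, size=num_to_mask, replace=False, p=probs)
        mask[i, picked] = 1   # 1 = masked, 0 = kept
    return mask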
pytext.models.masking_utils.random_masking(tokens: torch.Tensor, mask_prob: float) → torch.Tensor[source]¶
Function to mask tokens randomly.
Inputs:
- tokens: Tensor with token ids of shape (batch_size x seq_len)
- mask_prob: probability of masking a particular token
Outputs:
- mask: Tensor with the same shape as the input tokens (batch_size x seq_len), with masked tokens represented by a 1 and everything else as 0
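A minimal sketch matching this contract, where each position is masked independently with probability mask_prob:

import torch

def random_masking_sketch(tokens: torch.Tensor, mask_prob: float) -> torch.Tensor:
    return (torch.rand(tokens.shape) < mask_prob).to(torch.long)   # 1 = masked, 0 = kept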
pytext.models.model module¶
class pytext.models.model.BaseModel(stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]¶
Bases: torch.nn.modules.module.Module, pytext.config.component.Component
Base model class which inherits from nn.Module. Also has a stage flag to indicate it’s in train, eval, or test stage. This is because the built-in train/eval flag in PyTorch can’t distinguish eval and test, which is required to support some use cases.
SUPPORT_FP16_OPTIMIZER = False¶
arrange_caffe2_model_inputs(tensor_dict)[source]¶
Generate inputs for the exported Caffe2 model; the default behavior is to flatten the input tuples.
contextualize(context)[source]¶
Add additional context into the model. context can be anything that helps maintain/update state. For example, it is used by DisjointMultitaskModel to change the task that should be trained with a given iterator.
eval(stage=<Stage.TEST: 'Test'>)[source]¶
Override to explicitly maintain the stage (train, eval, test).
class pytext.models.model.Model(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase)[source]¶
Bases: pytext.models.model.BaseModel
Generic single-task model class that expects four components:
- Embedding
- Representation
- Decoder
- Output Layer
Forward pass: embedding -> representation -> decoder -> output_layer
These four components have specific responsibilities as described below.
Embedding layer should implement the way to represent each token in the input text. It can be as simple as just token/word embedding or can be composed of multiple ways to represent a token, e.g., word embedding, character embedding, etc.
Representation layer should implement the way to encode the entire input text such that the output vector(s) can be used by the decoder to produce logits. There is no restriction on the number of inputs it should encode, nor on the number of ways to encode input.
Decoder layer should implement the way to consume the output of the model's representation and produce logits that can be used by the output layer to compute loss or generate predictions (and prediction scores/confidence).
Output layer should implement the way loss computation is done as well as the logic to generate predictions from the logits.
Let us discuss the joint intent-slot model as a case to go over these layers. The model predicts intent of input utterance and the slots in the utterance. (Refer to Train Intent-Slot model on ATIS Dataset for details about intent-slot model.)
EmbeddingList layer is tasked with representing tokens. To do so we can use a learnable word embedding table in conjunction with a learnable character embedding table that is distilled to a token-level representation using CNN and pooling. Note: this class is meant to be reused by all models. It acts as a container of all the different ways of representing a token/word.
BiLSTMDocSlotAttention is tasked with encoding the embedded input string for intent classification and slot filling. In order to do that it has a shared bidirectional LSTM layer followed by separate attention layers for document-level attention and word-level attention. Finally it produces two vectors per utterance.
IntentSlotModelDecoder accepts the two input vectors from BiLSTMDocSlotAttention and produces logits for intent classification and slot filling. Conditioned on a flag, it can also use the probabilities from intent classification for slot filling.
IntentSlotOutputLayer implements the logic behind computing loss and prediction, as well as how to export this layer to Caffe2. This is used by the model exporter as a post-processing Caffe2 operator.
Parameters: - embedding (EmbeddingBase) – Description of parameter embedding.
- representation (RepresentationBase) – Description of parameter representation.
- decoder (DecoderBase) – Description of parameter decoder.
- output_layer (OutputLayerBase) – Description of parameter output_layer.
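A schematic of the data flow described above (a sketch, not the actual implementation; the real forward also threads extra inputs through and hands logits to the output layer for loss/prediction):

def schematic_forward(model, *inputs):
    token_emb = model.embedding(*inputs)        # tokens -> token embeddings
    encoded = model.representation(token_emb)   # embeddings -> text encoding(s)
    logits = model.decoder(encoded)             # encoding(s) -> logits
    return logits                               # model.output_layer computes loss/predictions from these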
embedding¶
representation¶
decoder¶
output_layer¶
classmethod compose_embedding(sub_emb_module_dict: Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase], metadata) → pytext.models.embeddings.embedding_list.EmbeddingList[source]¶
The default implementation is to compose an instance of EmbeddingList with all the sub-embedding modules. You should override this class method if you want to implement a specific way to embed tokens/words.
Parameters: sub_emb_module_dict (Dict[str, EmbeddingBase]) – Named dictionary of embedding modules, each of which implements a way to embed/encode a token.
Returns: An instance of EmbeddingList.
Return type: EmbeddingList
classmethod create_embedding(feat_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata)[source]¶
classmethod create_sub_embs(emb_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata) → Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase][source]¶
Creates the embedding modules defined in the emb_config.
Parameters: - emb_config (FeatureConfig) – Object containing all the sub-embedding configurations.
- metadata (CommonMetadata) – Object containing features and label metadata.
Returns: Named dictionary of embedding modules.
Return type: Dict[str, EmbeddingBase]
forward(*inputs) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
class pytext.models.model.ModelInputBase(**kwargs)[source]¶
Bases: pytext.config.pytext_config.ConfigBase
Base class for model inputs.
pytext.models.module module¶
class pytext.models.module.Module(config=None)[source]¶
Bases: torch.nn.modules.module.Module, pytext.config.component.Component
Generic module class that serves as base class for all PyText modules.
Parameters: config (type) – Module's config object. The specific contents of this object depend on the module. Defaults to None.
pytext.models.module.create_module(module_config, *args, create_fn=<function _create_module_from_registry>, **kwargs)[source]¶
Create a module object given the module's config object. This depends on the global shared module registry, so your module must be available in the registry: it must be imported somewhere in the code path during module creation (ideally in your model class) for it to be visible to the registry.
Parameters: - module_config (type) – Module config object.
- create_fn (type) – The function to use for creating the module. Use this parameter if your module creation requires custom code and pass your function here. Defaults to _create_module_from_registry().
Returns: The created module object.
Return type: type
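A hypothetical usage sketch; decoder_config and the positional dimensions are illustrative placeholders whose meaning depends on the concrete module's from_config:

from pytext.models.module import create_module

decoder = create_module(decoder_config, 256, 10)   # e.g. input dim 256, output dim 10 for an MLP-style decoder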
pytext.models.pair_classification_model module¶
class pytext.models.pair_classification_model.BasePairwiseModel(decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase, encode_relations: bool)[source]¶
Bases: pytext.models.model.BaseModel
A base classification model that scores a pair of texts.
Subclasses need to implement from_config, forward, and save_modules.
forward(input1: Tuple[torch.Tensor, ...], input2: Tuple[torch.Tensor, ...])[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
class pytext.models.pair_classification_model.PairwiseModel(embeddings: torch.nn.modules.container.ModuleList, representations: torch.nn.modules.container.ModuleList, decoder: pytext.models.decoders.mlp_decoder.MLPDecoder, output_layer: pytext.models.output_layers.doc_classification_output_layer.ClassificationOutputLayer, encode_relations: bool, shared_representations: bool)[source]¶
Bases: pytext.models.pair_classification_model.BasePairwiseModel
A classification model that scores a pair of texts, for example, a model for natural language inference.
The model shares embedding space (so it doesn’t support pairs of texts where left and right are in different languages). It uses bidirectional LSTM or CNN to represent the two documents, and concatenates them along with their absolute difference and elementwise product. This concatenated pair representation is passed to a multi-layer perceptron to decode to label/target space.
See https://arxiv.org/pdf/1705.02364.pdf for more details.
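The concatenated pair representation described above, as in InferSent (shapes illustrative):

import torch

u = torch.randn(4, 128)   # left document representation, batch_size x rep_dim
v = torch.randn(4, 128)   # right document representation
pair_rep = torch.cat([u, v, torch.abs(u - v), u * v], dim=-1)   # batch_size x (4 * rep_dim)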
It can be instantiated just like any other Model.
EMBEDDINGS = ['embedding']¶
INPUTS_PAIR = [['tokens1'], ['tokens2']]¶
forward(input1: Tuple[torch.Tensor, ...], input2: Tuple[torch.Tensor, ...]) → torch.Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
pytext.models.query_document_pairwise_ranking_model module¶
class pytext.models.query_document_pairwise_ranking_model.QueryDocPairwiseRankingModel(embeddings: torch.nn.modules.container.ModuleList, representations: torch.nn.modules.container.ModuleList, decoder: pytext.models.decoders.mlp_decoder.MLPDecoder, output_layer: pytext.models.output_layers.doc_classification_output_layer.ClassificationOutputLayer, encode_relations: bool, shared_representations: bool)[source]¶
Bases: pytext.models.pair_classification_model.PairwiseModel
Pairwise ranking model. This model takes in a query and two responses (pos_response and neg_response) and passes representations of the query and the two responses to a decoder. pos_response should be ranked higher than neg_response; this is ensured by training with a ranking hinge loss function.
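A sketch of the ranking hinge loss described above (the margin and scores are illustrative):

import torch

margin = 1.0
pos_score = torch.randn(8)   # decoder scores for (query, pos_response)
neg_score = torch.randn(8)   # decoder scores for (query, neg_response)
loss = torch.clamp(margin - (pos_score - neg_score), min=0).mean()   # zero once pos outranks neg by the margin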
forward(pos_response: Tuple[torch.Tensor, torch.Tensor], neg_response: Tuple[torch.Tensor, torch.Tensor], query: Tuple[torch.Tensor, torch.Tensor]) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
pytext.models.r3f_models module¶
class pytext.models.r3f_models.R3FConfigOptions(**kwargs)[source]¶
Bases: pytext.config.pytext_config.ConfigBase
Configuration options for models using R3F
eps = 1e-05¶
noise_type = 'uniform'¶
r3f_default_lambda = 0.5¶
r3f_lambda_by_loss = {}¶
class pytext.models.r3f_models.R3FNoiseContextManager(context)[source]¶
Bases: contextlib.AbstractContextManager
Context manager that adds a forward hook to the embedding module to insert noise into the model and detach the embedding when doing this pass.
class pytext.models.r3f_models.R3FNoiseType[source]¶
Bases: enum.Enum
An enumeration.
NORMAL = 'normal'¶
UNIFORM = 'uniform'¶
class pytext.models.r3f_models.R3FPyTextMixin(config: pytext.models.r3f_models.R3FConfigOptions)[source]¶
Bases: object
Mixin class for applying the R3F method. To apply R3F with any model, inherit from this class and implement the abstract functions.
For more details: https://arxiv.org/abs/2008.03156
get_embedding_module(*args, **kwargs)[source]¶
Given the core model outputs, this returns the embedding module that is used for the R3F loss; in particular, noise will be injected into this module.
get_r3f_loss_terms(model_outputs, noise_model_outputs, sample_size: int) → torch.Tensor[source]¶
Computes the auxiliary loss for R3F; in particular, computes a symmetric KL divergence between the result from the input embedding and the noised input embedding.
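A sketch of such a symmetric KL term between the logits from the clean and the noised embeddings (scaling by sample_size and the r3f lambda weights is omitted here):

import torch.nn.functional as F

def symmetric_kl(logits, noised_logits):
    p = F.log_softmax(logits, dim=-1)
    q = F.log_softmax(noised_logits, dim=-1)
    # KL(p || q) + KL(q || p), both computed from log-probabilities
    return (F.kl_div(q, p, reduction="sum", log_target=True)
            + F.kl_div(p, q, reduction="sum", log_target=True))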
get_r3f_model_output(model_output)[source]¶
Extracts the output from the model.forward() call that is used for the R3F loss term.
pytext.models.roberta module¶
class pytext.models.roberta.RoBERTa(encoder, decoder, output_layer, stage=<Stage.TRAIN: 'Training'>)[source]¶
Bases: pytext.models.bert_classification_models.NewBertModel
graph_mode_quantize(inputs, data_loader, calibration_num_batches=64, qconfig_dict=None, force_quantize=False)[source]¶
Quantize the model during export with graph mode quantization.
class pytext.models.roberta.RoBERTaEncoder(config: pytext.models.roberta.RoBERTaEncoder.Config, output_encoded_layers: bool, **kwarg)[source]¶
Bases: pytext.models.roberta.RoBERTaEncoderBase
A PyTorch RoBERTa implementation
forward(input_tuple: Tuple[torch.Tensor, ...], *args) → Tuple[torch.Tensor, ...][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
class pytext.models.roberta.RoBERTaEncoderBase(config: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase.Config, output_encoded_layers=False, *args, **kwargs)[source]¶
Bases: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase
class pytext.models.roberta.RoBERTaEncoderJit(config: pytext.models.roberta.RoBERTaEncoderJit.Config, output_encoded_layers: bool, **kwarg)[source]¶
Bases: pytext.models.roberta.RoBERTaEncoderBase
A TorchScript RoBERTa implementation
class pytext.models.roberta.RoBERTaR3F(encoder, decoder, output_layer, r3f_options, stage=<Stage.TRAIN: 'Training'>)[source]¶
Bases: pytext.models.roberta.RoBERTa, pytext.models.r3f_models.R3FPyTextMixin
forward(*args, use_r3f: bool = False, **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
get_embedding_module(*args, **kwargs)[source]¶
Given the core model outputs, this returns the embedding module that is used for the R3F loss; in particular, noise will be injected into this module.
class pytext.models.roberta.RoBERTaRegression(encoder, decoder, output_layer)[source]¶
Bases: pytext.models.bert_regression_model.NewBertRegressionModel
class pytext.models.roberta.RoBERTaWordTaggingModel(encoder, decoder, output_layer, stage=<Stage.TRAIN: 'Training'>)[source]¶
Bases: pytext.models.model.BaseModel
Single Sentence Token-level Classification Model using XLM.
forward(encoder_inputs: Tuple[torch.Tensor, ...], *args) → torch.Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
class pytext.models.roberta.SELFIE(encoder, decoder, output_layer, stage=<Stage.TRAIN: 'Training'>)[source]¶
Bases: pytext.models.roberta.RoBERTa
forward(encoder_inputs: Tuple[torch.Tensor, ...], *args) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
pytext.models.two_tower_classification_model module¶
class pytext.models.two_tower_classification_model.TwoTowerClassificationModel(right_encoder, left_encoder, decoder, output_layer, stage=<Stage.TRAIN: 'Training'>)[source]¶
Bases: pytext.models.model.BaseModel
SUPPORT_FP16_OPTIMIZER = True¶
forward(right_encoder_inputs: Tuple[torch.Tensor, ...], left_encoder_inputs: Tuple[torch.Tensor, ...], *args) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
classmethod from_config(config: pytext.models.two_tower_classification_model.TwoTowerClassificationModel.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]¶
graph_mode_quantize(inputs, data_loader, calibration_num_batches=64)[source]¶
Quantize the model during export with graph mode quantization for the linformer encoder.
pytext.models.utils module¶
pytext.models.word_model module¶
class pytext.models.word_model.WordTaggingLiteModel(*args, **kwargs)[source]¶
Bases: pytext.models.word_model.WordTaggingModel
Also a word tagging model, but it uses bytes as inputs. By using bytes instead of words, the model does not need to store a word embedding table mapping words in the vocab to their embedding vector representations; instead it computes them on the fly using CharacterEmbedding. This produces an exported/serialized model that requires much less storage space as well as less memory during run/inference time.
class pytext.models.word_model.WordTaggingModel(*args, **kwargs)[source]¶
Bases: pytext.models.model.Model
Word tagging model. It can be used for any task that requires predicting the tag for a word/token. For example, the following tasks can be modeled as word tagging tasks (this is not an exhaustive list):
1. Part-of-speech tagging.
2. Named entity recognition.
3. Slot filling for task-oriented dialog.
It can be instantiated just like any other Model.
Module contents¶
class pytext.models.Model(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase)[source]¶
Bases: pytext.models.model.BaseModel
Generic single-task model class that expects four components:
- Embedding
- Representation
- Decoder
- Output Layer
Forward pass: embedding -> representation -> decoder -> output_layer
These four components have specific responsibilities as described below.
Embedding layer should implement the way to represent each token in the input text. It can be as simple as just token/word embedding or can be composed of multiple ways to represent a token, e.g., word embedding, character embedding, etc.
Representation layer should implement the way to encode the entire input text such that the output vector(s) can be used by the decoder to produce logits. There is no restriction on the number of inputs it should encode, nor on the number of ways to encode input.
Decoder layer should implement the way to consume the output of the model's representation and produce logits that can be used by the output layer to compute loss or generate predictions (and prediction scores/confidence).
Output layer should implement the way loss computation is done as well as the logic to generate predictions from the logits.
Let us discuss the joint intent-slot model as a case to go over these layers. The model predicts intent of input utterance and the slots in the utterance. (Refer to Train Intent-Slot model on ATIS Dataset for details about intent-slot model.)
EmbeddingList layer is tasked with representing tokens. To do so we can use a learnable word embedding table in conjunction with a learnable character embedding table that is distilled to a token-level representation using CNN and pooling. Note: this class is meant to be reused by all models. It acts as a container of all the different ways of representing a token/word.
BiLSTMDocSlotAttention is tasked with encoding the embedded input string for intent classification and slot filling. In order to do that it has a shared bidirectional LSTM layer followed by separate attention layers for document-level attention and word-level attention. Finally it produces two vectors per utterance.
IntentSlotModelDecoder accepts the two input vectors from BiLSTMDocSlotAttention and produces logits for intent classification and slot filling. Conditioned on a flag, it can also use the probabilities from intent classification for slot filling.
IntentSlotOutputLayer implements the logic behind computing loss and prediction, as well as how to export this layer to Caffe2. This is used by the model exporter as a post-processing Caffe2 operator.
Parameters: - embedding (EmbeddingBase) – Description of parameter embedding.
- representation (RepresentationBase) – Description of parameter representation.
- decoder (DecoderBase) – Description of parameter decoder.
- output_layer (OutputLayerBase) – Description of parameter output_layer.
embedding¶
representation¶
decoder¶
output_layer¶
classmethod compose_embedding(sub_emb_module_dict: Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase], metadata) → pytext.models.embeddings.embedding_list.EmbeddingList[source]¶
The default implementation is to compose an instance of EmbeddingList with all the sub-embedding modules. You should override this class method if you want to implement a specific way to embed tokens/words.
Parameters: sub_emb_module_dict (Dict[str, EmbeddingBase]) – Named dictionary of embedding modules, each of which implements a way to embed/encode a token.
Returns: An instance of EmbeddingList.
Return type: EmbeddingList
classmethod create_embedding(feat_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata)[source]¶
classmethod create_sub_embs(emb_config: pytext.config.field_config.FeatureConfig, metadata: pytext.data.data_handler.CommonMetadata) → Dict[str, pytext.models.embeddings.embedding_base.EmbeddingBase][source]¶
Creates the embedding modules defined in the emb_config.
Parameters: - emb_config (FeatureConfig) – Object containing all the sub-embedding configurations.
- metadata (CommonMetadata) – Object containing features and label metadata.
Returns: Named dictionary of embedding modules.
Return type: Dict[str, EmbeddingBase]
forward(*inputs) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
class pytext.models.BaseModel(stage: pytext.common.constants.Stage = <Stage.TRAIN: 'Training'>)[source]¶
Bases: torch.nn.modules.module.Module, pytext.config.component.Component
Base model class which inherits from nn.Module. Also has a stage flag to indicate it’s in train, eval, or test stage. This is because the built-in train/eval flag in PyTorch can’t distinguish eval and test, which is required to support some use cases.
SUPPORT_FP16_OPTIMIZER = False¶
arrange_caffe2_model_inputs(tensor_dict)[source]¶
Generate inputs for the exported Caffe2 model; the default behavior is to flatten the input tuples.
contextualize(context)[source]¶
Add additional context into the model. context can be anything that helps maintain/update state. For example, it is used by DisjointMultitaskModel to change the task that should be trained with a given iterator.
eval(stage=<Stage.TEST: 'Test'>)[source]¶
Override to explicitly maintain the stage (train, eval, test).
class pytext.models.TwoTowerClassificationModel(right_encoder, left_encoder, decoder, output_layer, stage=<Stage.TRAIN: 'Training'>)[source]¶
Bases: pytext.models.model.BaseModel
SUPPORT_FP16_OPTIMIZER = True¶
forward(right_encoder_inputs: Tuple[torch.Tensor, ...], left_encoder_inputs: Tuple[torch.Tensor, ...], *args) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
classmethod from_config(config: pytext.models.two_tower_classification_model.TwoTowerClassificationModel.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]¶
graph_mode_quantize(inputs, data_loader, calibration_num_batches=64)[source]¶
Quantize the model during export with graph mode quantization for the linformer encoder.