pytext.models.seq_models package

Submodules

pytext.models.seq_models.attention module

class pytext.models.seq_models.attention.DecoupledMultiheadAttention(embed_dim: int, context_dim: int, num_heads: int, dropout: float, unseen_mask=False, src_length_mask=True)[source]

Bases: torch.nn.modules.module.Module

Multiheaded Scaled Dot Product Attention. This module has exactly the same signature as the one used in pytorch_translate, with the added benefit of supporting TorchScript.

forward(decoder_state: torch.Tensor, source_hids: torch.Tensor, src_len_mask: Optional[torch.Tensor], squeeze: bool = True) → Tuple[torch.Tensor, torch.Tensor][source]

Computes MultiheadAttention with respect to either a vector or a tensor

Inputs:
decoder_state: (bsz x decoder_hidden_state_dim) or (bsz x T x decoder_hidden_state_dim)
source_hids: srclen x bsz x context_dim
src_lengths: bsz x 1, actual sequence lengths
squeeze: whether or not to squeeze on the time dimension. Even if decoder_state is 2-dimensional, an explicit time-step dimension will be unsqueezed internally.
Outputs:
[batch_size, max_src_len] if decoder_state.dim() == 2 & squeeze
or
[batch_size, 1, max_src_len] if decoder_state.dim() == 2 & !squeeze
or
[batch_size, T, max_src_len] if decoder_state.dim() == 3 & !squeeze
or
[batch_size, T, max_src_len] if decoder_state.dim() == 3 & squeeze & T != 1
or
[batch_size, max_src_len] if decoder_state.dim() == 3 & squeeze & T == 1
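
For illustration, a minimal usage sketch (shapes follow the constructor and forward signatures above; the names given to the returned pair are assumptions, so verify the tuple's contents against the source):

    import torch
    from pytext.models.seq_models.attention import DecoupledMultiheadAttention

    bsz, srclen, ctx_dim, dec_dim = 4, 12, 256, 256
    attn = DecoupledMultiheadAttention(
        embed_dim=dec_dim, context_dim=ctx_dim, num_heads=8, dropout=0.1
    )
    decoder_state = torch.rand(bsz, dec_dim)        # a single decoder step
    source_hids = torch.rand(srclen, bsz, ctx_dim)  # encoder hidden states
    # decoder_state.dim() == 2 and squeeze defaults to True,
    # so the attention scores come back as [bsz, srclen]
    output, scores = attn(decoder_state, source_hids, None)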

class pytext.models.seq_models.attention.DotAttention(decoder_hidden_state_dim, context_dim, force_projection=False, src_length_masking=True)[source]

Bases: torch.nn.modules.module.Module

forward(decoder_state, source_hids, src_lengths)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.seq_models.attention.MultiheadAttention(embed_dim, num_heads, dropout, kdim=None, vdim=None, bias=True)[source]

Bases: pytext.models.seq_models.base.PyTextIncrementalDecoderComponent

Refer to “Attention Is All You Need” for more details.

This is a simplified implementation of multihead attention optimized for export with TorchScript. Using nn.Linear() instead of F.linear() makes it possible to quantize the linear layers.

The query is the output of the last decoder step; keys and values are obtained from the encoder. Attention weights are computed from the dot product of query and key, and multiplying the attention weights by the values gives the output.
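
As a sketch of this query/key/value flow (the time-first T x B x C layout is an assumption carried over from the fairseq convention, not stated above):

    import torch
    from pytext.models.seq_models.attention import MultiheadAttention

    src_len, tgt_len, bsz, embed_dim = 12, 1, 4, 256
    mha = MultiheadAttention(embed_dim=embed_dim, num_heads=8, dropout=0.1)
    query = torch.rand(tgt_len, bsz, embed_dim)  # output of the last decoder step
    key = torch.rand(src_len, bsz, embed_dim)    # from the encoder
    value = key
    out, weights = mha(query, key, value, key_padding_mask=None, need_weights=True)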

forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, key_padding_mask: Optional[torch.Tensor], need_weights: bool, incremental_state: Optional[Dict[str, torch.Tensor]] = None) → Tuple[torch.Tensor, Optional[torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config, embed_dim, num_heads)[source]
reorder_incremental_state(incremental_state: Dict[str, torch.Tensor], new_order: torch.Tensor)[source]

Reorder buffered internal state (for incremental generation).

pytext.models.seq_models.attention.create_src_lengths_mask(batch_size: int, src_lengths)[source]

Generate boolean mask to prevent attention beyond the end of source

Inputs:
batch_size: int
src_lengths: [batch_size] of sentence lengths
Outputs:
[batch_size, max_src_len]
pytext.models.seq_models.attention.masked_softmax(scores, src_lengths, src_length_masking: bool = True)[source]

Apply source length masking, then softmax. Input and output both have shape bsz x src_len.
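
A small example of how these two helpers fit together (a sketch; the exact dtype of the mask is not specified above):

    import torch
    from pytext.models.seq_models.attention import (
        create_src_lengths_mask,
        masked_softmax,
    )

    src_lengths = torch.tensor([5, 3, 1])
    mask = create_src_lengths_mask(batch_size=3, src_lengths=src_lengths)
    # mask: [3, 5]; positions beyond each sentence length are masked out

    scores = torch.rand(3, 5)  # bsz x src_len attention scores
    probs = masked_softmax(scores, src_lengths)
    # each row of probs sums to 1 over its first src_lengths[i] positions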

pytext.models.seq_models.base module

class pytext.models.seq_models.base.PlaceholderAttentionIdentity[source]

Bases: torch.nn.modules.module.Module

forward(query, key, value, need_weights: bool = None, key_padding_mask: Optional[torch.Tensor] = None, incremental_state: Optional[Dict[str, torch.Tensor]] = None) → Tuple[torch.Tensor, Optional[torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

reorder_incremental_state(incremental_state: Dict[str, torch.Tensor], new_order: torch.Tensor)[source]
class pytext.models.seq_models.base.PlaceholderIdentity[source]

Bases: torch.nn.modules.module.Module

class Config(**kwargs)[source]

Bases: pytext.config.module_config.Module.Config

forward(x, incremental_state: Optional[Dict[str, torch.Tensor]] = None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.seq_models.base.PyTextIncrementalDecoderComponent[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule

get_incremental_state(incremental_state: Dict[str, torch.Tensor], key: str) → Optional[torch.Tensor][source]

Helper for getting incremental state for an nn.Module.

reorder_incremental_state(incremental_state: Dict[str, torch.Tensor], new_order: torch.Tensor)[source]
set_incremental_state(incremental_state: Dict[str, torch.Tensor], key: str, value)[source]

Helper for setting incremental state for an nn.Module.
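
A hypothetical subclass illustrating the two helpers (a sketch; in practice the state keys are namespaced per instance via prepare_full_key, and components are built through from_config):

    import torch
    from typing import Dict, Optional
    from pytext.models.seq_models.base import PyTextIncrementalDecoderComponent

    class CachingComponent(PyTextIncrementalDecoderComponent):
        def forward(self, x: torch.Tensor, incremental_state: Dict[str, torch.Tensor]):
            prev: Optional[torch.Tensor] = self.get_incremental_state(
                incremental_state, "prev_output"
            )
            if prev is not None:
                x = x + prev  # reuse the value cached on the previous step
            self.set_incremental_state(incremental_state, "prev_output", x)
            return x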

class pytext.models.seq_models.base.PyTextSeq2SeqModule[source]

Bases: pytext.models.module.Module

assign_id()[source]
instance_id = None

pytext.models.seq_models.contextual_intent_slot module

class pytext.models.seq_models.contextual_intent_slot.ContextualIntentSlotModel(default_doc_loss_weight, default_word_loss_weight, *args, **kwargs)[source]

Bases: pytext.models.joint_model.IntentSlotModel

Joint model for intent classification and slot tagging, with contextual information (a sequence of utterances) and the dictionary feature of the last utterance as inputs.

Training data should include:
  • doc_label (string): intent classification label of either the sequence of utterances or just the last sentence
  • word_label (string): slot tagging label of the last utterance, in the format start_idx:end_idx:slot_label; multiple slots are separated by a comma
  • text (list of string): sequence of utterances for training
  • dict_feat (dict): a dict of features that contains the feature of each word in the last utterance

Following is an example of raw columns from training data:

doc_label: reply-where
word_label: 10:20:restaurant_name
text: ["dinner at 6?", "wanna try Tomi Sushi?"]
dict_feat: {"tokenFeatList": [{"tokenIdx": 2, "features": {"poi:eatery": 0.66}},
            {"tokenIdx": 3, "features": {"poi:eatery": 0.66}}]}
arrange_model_inputs(tensor_dict)[source]
classmethod create_embedding(config, tensorizers)[source]
get_export_input_names(tensorizers)[source]
vocab_to_export(tensorizers)[source]

pytext.models.seq_models.conv_decoder module

class pytext.models.seq_models.conv_decoder.ConvDecoderConfig(**kwargs)[source]

Bases: pytext.config.pytext_config.ConfigBase

combine_pos_embed = 'concat'
decoder_embed_dim = 128
decoder_input_dim = 128
decoder_learned_pos = False
decoder_normalize_before = False
decoder_output_dim = 128
dropout = 0.1
max_target_positions = 128
no_token_positional_embeddings = False
positional_embedding_type = 'learned'
class pytext.models.seq_models.conv_decoder.LightConvDecoder(target_dict, embed_tokens, layers, decoder_config)[source]

Bases: pytext.models.seq_models.conv_decoder.LightConvDecoderBase

forward(prev_output_tokens: torch.Tensor, encoder_out: Dict[str, torch.Tensor], incremental_state: Optional[Dict[str, torch.Tensor]] = None, timestep: Optional[int] = None) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_probs(decoder_out: Tuple[torch.Tensor, Dict[str, torch.Tensor]]) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]
class pytext.models.seq_models.conv_decoder.LightConvDecoderBase(target_dict, embed_tokens, layers, decoder_config)[source]

Bases: pytext.models.seq_models.base.PyTextIncrementalDecoderComponent

forward_unprojected(prev_output_tokens: torch.Tensor, encoder_out: Dict[str, torch.Tensor], incremental_state: Optional[Dict[str, torch.Tensor]] = None, timestep: Optional[int] = None) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]
classmethod from_config(config, tgt_dict, tgt_embedding)[source]
get_probs(decoder_out: Tuple[torch.Tensor, Dict[str, torch.Tensor]]) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]
max_positions()[source]

Maximum output length supported by the decoder.

pos_embed(x, src_tokens)[source]
reorder_incremental_state(incremental_state: Dict[str, torch.Tensor], new_order: torch.Tensor)[source]
class pytext.models.seq_models.conv_decoder.LightConvDecoderLayer(attention_dropout, decoder_attention_heads, self_attention_heads, decoder_conv_dim, decoder_conv_type, attention_type, self_attention_type, decoder_embed_dim, decoder_ffn_embed_dim, decoder_glu, decoder_normalize_before, dropout, input_dropout, relu_dropout, need_attention, convolution_type, conv=None, self_attention=None, attention=None)[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule

extra_repr()[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

forward(x, encoder_out: torch.Tensor, encoder_padding_mask: Optional[torch.Tensor], decoder_padding_mask: Optional[torch.Tensor], incremental_state: Optional[Dict[str, torch.Tensor]])[source]
Parameters:
  • x (Tensor) – input to the layer of shape (seq_len, batch, embed_dim)
  • encoder_padding_mask (ByteTensor) – binary ByteTensor of shape (batch, src_len) where padding elements are indicated by 1.
Returns:

encoded output of shape (batch, src_len, embed_dim)

classmethod from_config(config, kernel_size)[source]
maybe_layer_norm(before: bool = False, after: bool = False)[source]

This is a utility function that helps control the layer norm behavior before and after specific components using a single config variable. If self.normalize_before is set to True, the output is True only when before is True.
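
The gating reduces to a small boolean check; a hypothetical reconstruction for clarity (not the library's exact code):

    # Sketch: with normalize_before=True the norm runs before the sublayer,
    # otherwise after; exactly one of `before`/`after` is expected to be set.
    def maybe_layer_norm(self, before: bool = False, after: bool = False) -> bool:
        assert before ^ after
        return after ^ self.normalize_before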

reorder_incremental_state(incremental_state: Dict[str, torch.Tensor], new_order: torch.Tensor)[source]
class pytext.models.seq_models.conv_decoder.LightConvDecoupledDecoder(target_dict, embed_tokens, layers, decoder_config, ontology_generation_only, decoupled_attention_heads, model_output_logprob)[source]

Bases: pytext.models.seq_models.conv_decoder.LightConvDecoderBase

forward(prev_output_tokens: torch.Tensor, encoder_out: Dict[str, torch.Tensor], incremental_state: Optional[Dict[str, torch.Tensor]] = None, timestep: Optional[int] = None) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config, tgt_dict, tgt_embedding)[source]

pytext.models.seq_models.conv_encoder module

class pytext.models.seq_models.conv_encoder.ConvEncoderConfig(**kwargs)[source]

Bases: pytext.config.pytext_config.ConfigBase

combine_pos_embed = 'concat'
dropout = 0.1
embedding_dim = 128
encoder_embed_dim = 128
encoder_learned_pos = False
encoder_normalize_before = False
max_source_positions = 1024
max_target_positions = 100
no_token_positional_embeddings = False
positional_embedding_type = 'learned'
class pytext.models.seq_models.conv_encoder.LightConvEncoder(dictionary, embed_tokens, layers, encoder_config)[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule, pytext.models.seq_models.nar_modules.NAREncoderUtility

extra_repr()[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

forward(src_tokens: torch.Tensor, src_embeddings: torch.Tensor, src_lengths: torch.Tensor) → Dict[str, torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config, src_dict, src_embedding)[source]
max_positions()[source]

Maximum input length supported by the encoder.

pos_embed(x, src_tokens)[source]
reorder_encoder_out(encoder_out: Dict[str, torch.Tensor], new_order: torch.Tensor)[source]
tile_encoder_out(tile_size: int, encoder_out: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor][source]
class pytext.models.seq_models.conv_encoder.LightConvEncoderLayer(dropout, encoder_conv_dim, encoder_conv_type, self_attention_type, encoder_embed_dim, encoder_ffn_embed_dim, self_attention_heads, encoder_glu, encoder_normalize_before, input_dropout, relu_dropout, convolution_type, conv=None, self_attention=None)[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule

extra_repr()[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

forward(x, encoder_padding_mask: Optional[torch.Tensor] = None)[source]
Parameters:
  • x (Tensor) – input to the layer of shape (seq_len, batch, embed_dim)
  • encoder_padding_mask (ByteTensor) – binary ByteTensor of shape (batch, src_len) where padding elements are indicated by 1.
Returns:

encoded output of shape (batch, src_len, embed_dim)

classmethod from_config(config, kernel_size)[source]
maybe_layer_norm(before: bool = False, after: bool = False)[source]

pytext.models.seq_models.conv_model module

class pytext.models.seq_models.conv_model.CNNModel(encoder, decoder, source_embedding)[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule

forward(src_tokens: torch.Tensor, additional_features: List[List[torch.Tensor]], src_lengths, prev_output_tokens, src_subword_begin_indices: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.seq_models.conv_model.CNNModel.Config, src_dict, source_embedding, tgt_dict, target_embedding, dict_embedding=None)[source]
get_embedding_module()[source]
get_normalized_probs(net_output, log_probs, sample=None)[source]
max_decoder_positions()[source]
classmethod validate_config(config)[source]
class pytext.models.seq_models.conv_model.DecoupledCNNModel(encoder, decoder, source_embedding)[source]

Bases: pytext.models.seq_models.conv_model.CNNModel

pytext.models.seq_models.light_conv module

class pytext.models.seq_models.light_conv.LightweightConv(input_size, kernel_size, convolution_type: str, num_heads, weight_softmax, bias)[source]

Bases: pytext.models.seq_models.base.PyTextIncrementalDecoderComponent

extra_repr()[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

forward(x, incremental_state: Optional[Dict[str, torch.Tensor]] = None)[source]

Assumes an input x of shape T x B x C and produces an output of the same shape.

Parameters:
  • x (Tensor) – input of shape T x B x C, i.e. (timesteps, batch_size, input_size)
  • incremental_state – a dict to keep the state
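
A minimal shape check, assuming the module can be constructed directly with the arguments from the class signature (the convolution_type string here is an assumed example value; check the config for the real options):

    import torch
    from pytext.models.seq_models.light_conv import LightweightConv

    T, B, C = 10, 2, 64
    conv = LightweightConv(
        input_size=C,
        kernel_size=3,
        convolution_type="causal",  # assumed value
        num_heads=4,
        weight_softmax=True,
        bias=True,
    )
    x = torch.rand(T, B, C)  # (timesteps, batch_size, input_size)
    y = conv(x)              # output keeps the T x B x C shape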

classmethod from_config(config, input_size, kernel_size, convolution_type)[source]
reorder_incremental_state(incremental_state: Dict[str, torch.Tensor], new_order: torch.Tensor)[source]
reset_parameters()[source]

pytext.models.seq_models.mask_generator module

class pytext.models.seq_models.mask_generator.BeamRankingAlgorithm[source]

Bases: enum.Enum

An enumeration.

AVERAGE_TOKEN_LPROB = 'AVERAGE_TOKEN_LPROB'
LENGTH_CONDITIONED_AVERAGE_TOKEN_LPROB = 'LENGTH_CONDITIONED_AVERAGE_TOKEN_LPROB'
LENGTH_CONDITIONED_AVERAGE_TOKEN_LPROB_MULTIPLIED = 'LENGTH_CONDITIONED_AVERAGE_TOKEN_LPROB_MULTIPLIED'
LENGTH_CONDITIONED_RANK = 'LENGTH_CONDITIONED_RANK'
LENGTH_CONDITIONED_RANK_MUL = 'LENGTH_CONDITIONED_RANK_MUL'
LEN_ONLY = 'LEN_ONLY'
TOKEN_LPROB = 'TOKEN_LPROB'
class pytext.models.seq_models.mask_generator.EmbedQuantizeType[source]

Bases: enum.Enum

An enumeration.

BIT_4 = '4bit'
BIT_8 = '8bit'
NONE = 'None'
class pytext.models.seq_models.mask_generator.MaskedSequenceGenerator(config, model, length_prediction_model, trg_vocab, beam_size, use_gold_length, beam_ranking_algorithm, quantize, embed_quantize)[source]

Bases: pytext.models.module.Module

forward(src_tokens: torch.Tensor, dict_feats: Optional[Tuple[torch.Tensor, torch.Tensor, torch.Tensor]], contextual_embed: Optional[torch.Tensor], char_feats: Optional[torch.Tensor], src_lengths: torch.Tensor, src_subword_begin_indices: Optional[torch.Tensor] = None, target_lengths: Optional[torch.Tensor] = None, beam_size: Optional[int] = None, src_index_tokens: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config, model, length_prediction, trg_vocab, quantize=False, embed_quantize=False)[source]
generate_hypo(tensors: Dict[str, torch.Tensor]) → Tuple[Tuple[torch.Tensor, torch.Tensor], torch.Tensor][source]

Generates hypotheses using beam search, also returning their scores

Inputs:
  • tensors: dictionary containing needed tensors for generation
Outputs:
  • (hypos, lens): tuple of Tensors
    • hypos: Tensor of shape (batch_size, beam_size, MAX) containing the generated tokens. MAX refers to the longest sequence in the batch.
    • lens: Tensor of shape (batch_size, beam_size) containing generated sequence lengths
  • _hypo_scores: Tensor of shape (batch_size, beam_size) containing the scores for each generated sequence
generate_non_autoregressive(encoder_out: Dict[str, torch.Tensor], tgt_tokens)[source]
get_clip_length(src_lengths: torch.Tensor)[source]
get_encoder_out(src_tokens: torch.Tensor, dict_feats: Optional[Tuple[torch.Tensor, torch.Tensor, torch.Tensor]], contextual_embed: Optional[torch.Tensor], char_feats: Optional[torch.Tensor], src_subword_begin_indices: Optional[torch.Tensor], src_lengths: torch.Tensor, src_index_tokens: Optional[torch.Tensor] = None) → Dict[str, torch.Tensor][source]
pytext.models.seq_models.mask_generator.avg_token_lprob(token_lprob: torch.Tensor, length_lprob: torch.Tensor, target_lengths: torch.Tensor) → torch.Tensor[source]
pytext.models.seq_models.mask_generator.get_beam_ranking_function(ranking_algorithm: pytext.models.seq_models.mask_generator.BeamRankingAlgorithm)[source]
pytext.models.seq_models.mask_generator.length(token_lprob: torch.Tensor, length_lprob: torch.Tensor, target_lengths: torch.Tensor) → torch.Tensor[source]
pytext.models.seq_models.mask_generator.length_conditioned_avg_lprob_rank(token_lprob: torch.Tensor, length_lprob: torch.Tensor, target_lengths: torch.Tensor) → torch.Tensor[source]
pytext.models.seq_models.mask_generator.length_conditioned_avg_lprob_rank_mul(token_lprob: torch.Tensor, length_lprob: torch.Tensor, target_lengths: torch.Tensor) → torch.Tensor[source]
pytext.models.seq_models.mask_generator.length_conditioned_rank(token_lprob: torch.Tensor, length_lprob: torch.Tensor, target_lengths: torch.Tensor) → torch.Tensor[source]
pytext.models.seq_models.mask_generator.length_conditioned_rank_mul(token_lprob: torch.Tensor, length_lprob: torch.Tensor, target_lengths: torch.Tensor) → torch.Tensor[source]
pytext.models.seq_models.mask_generator.prepare_masked_target_for_lengths(beam: torch.Tensor, mask_idx: int, pad_idx: int, length_beam_size: int = 1) → Tuple[torch.Tensor, torch.Tensor][source]
pytext.models.seq_models.mask_generator.token_prob(token_lprob: torch.Tensor, length_lprob: torch.Tensor, target_lengths: torch.Tensor) → torch.Tensor[source]

pytext.models.seq_models.nar_length module

class pytext.models.seq_models.nar_length.ConvLengthPredictionModule(embed_dim: int, conv_dim: int, max_target_positions: int, length_dropout: float, glu: bool, activation, pooling_type, conv_layers)[source]

Bases: pytext.models.module.Module

forward(x: torch.Tensor, encoder_padding_mask: Optional[torch.Tensor] = None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.seq_models.nar_length.ConvLengthPredictionModule.Config, embed_dim: int)[source]
class pytext.models.seq_models.nar_length.MaskedLengthPredictionModule(embed_dim: int, length_hidden_dim: int, max_target_positions: int, length_dropout: float)[source]

Bases: pytext.models.module.Module

forward(x: torch.Tensor, encoder_padding_mask: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.seq_models.nar_length.MaskedLengthPredictionModule.Config, embed_dim: int)[source]
pytext.models.seq_models.nar_length.mean(rep: torch.Tensor, padding_mask: Optional[torch.Tensor])[source]
pytext.models.seq_models.nar_length.pool(pooling_type: str, words: torch.Tensor, encoder_padding_mask: Optional[torch.Tensor])[source]

pytext.models.seq_models.nar_modules module

class pytext.models.seq_models.nar_modules.NAREncoderUtility[source]

Bases: object

prepare_for_nar_inference(length_beam_size: int, encoder_out: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor][source]

During masked NAR inference, multiple lengths are predicted for each item in the batch, so tiling has to be done in such a way that all new rows related to one item are placed together. The rest of the NAR generation code relies on this assumption. E.g., [row1, row2, row3] should be tiled as [row1, row1, row1, row2, row2, row2, row3, row3, row3], NOT [row1, row2, row3, row1, row2, row3, row1, row2, row3].
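
The required ordering corresponds to repeating each row consecutively (torch.repeat_interleave) rather than repeating the whole batch (torch.Tensor.repeat), as this sketch shows:

    import torch

    # Rows belonging to one batch item must stay adjacent after tiling.
    rows = torch.tensor([[1], [2], [3]])  # stand-ins for row1, row2, row3
    length_beam_size = 3
    tiled = rows.repeat_interleave(length_beam_size, dim=0)
    # tiled.squeeze(1) -> tensor([1, 1, 1, 2, 2, 2, 3, 3, 3])
    # rows.repeat(3, 1) would instead give the unwanted 1, 2, 3, 1, 2, 3, ...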

pytext.models.seq_models.nar_output_layer module

class pytext.models.seq_models.nar_output_layer.NARSeq2SeqOutputLayer(target_names: Optional[List[str]] = None, loss_fn: Optional[pytext.loss.loss.Loss] = None, *args, **kwargs)[source]

Bases: pytext.models.output_layers.output_layer_base.OutputLayerBase

Non-autoregressive seq2seq output layer.

classmethod from_config(config: pytext.models.seq_models.nar_output_layer.NARSeq2SeqOutputLayer.Config, vocab: pytext.data.utils.Vocabulary)[source]
get_loss(model_outputs: Tuple[torch.Tensor, Dict[str, torch.Tensor]], targets: Tuple[Tuple[torch.Tensor, torch.Tensor], torch.Tensor], context: Dict[str, Any] = None, reduce=True) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]

label_logits: B x T x V_1
label_targets: B x T
length_logits: B x V_2
length_targets: B

pytext.models.seq_models.positional module

class pytext.models.seq_models.positional.LearnedPositionalEmbedding(num_embeddings: int, embedding_dim: int, padding_idx: int)[source]

Bases: torch.nn.modules.sparse.Embedding

This module learns positional embeddings up to a fixed maximum size. Padding ids are ignored by either offsetting based on padding_idx or by setting padding_idx to None and ensuring that the appropriate position ids are passed to the forward function.

forward(input: torch.Tensor, incremental_state: Optional[Dict[str, Dict[str, Optional[torch.Tensor]]]] = None, positions: Optional[torch.Tensor] = None)[source]

Input is expected to be of size [bsz x seqlen].

class pytext.models.seq_models.positional.PostionalEmbedCombine[source]

Bases: enum.Enum

An enumeration.

CONCAT = 'concat'
SUM = 'sum'
class pytext.models.seq_models.positional.PostionalEmbedType[source]

Bases: enum.Enum

An enumeration.

HYBRID = 'hybrid'
LEARNED = 'learned'
SINUSOIDAL = 'sinusoidal'
class pytext.models.seq_models.positional.SinusoidalPositionalEmbedding(embedding_dim, padding_idx, init_size=124, learned_embed=False)[source]

Bases: torch.nn.modules.module.Module

This module produces sinusoidal positional embeddings of any length.

Padding symbols are ignored.

forward(input, incremental_state: Optional[Dict[str, torch.Tensor]] = None, timestep: Optional[int] = None)[source]

Input is expected to be of size [bsz x seqlen].

max_positions()[source]

Maximum number of supported positions.

pytext.models.seq_models.positional.build_positional_embedding(positional_embedding_type: pytext.models.seq_models.positional.PostionalEmbedType, combine_pos_embed: pytext.models.seq_models.positional.PostionalEmbedCombine, max_target_positions: int, input_embed_dim: int, embed_dim: int, padding_idx: int, no_token_positional_embeddings: bool)[source]
pytext.models.seq_models.positional.get_sinusoidal_embedding(num_embeddings: int, embedding_dim: int, padding_idx: int)[source]

Build sinusoidal embeddings.

This matches the implementation in tensor2tensor, but differs slightly from the description in Section 3.5 of “Attention Is All You Need”.
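
A self-contained sketch of that tensor2tensor-style construction (sin for the first half of the dimensions and cos for the second half, concatenated rather than interleaved; a reconstruction, not the library's exact code):

    import math
    import torch

    def sinusoidal_embedding_sketch(num_embeddings: int, embedding_dim: int) -> torch.Tensor:
        half = embedding_dim // 2
        # geometric progression of inverse wavelengths from 1 down to 1/10000
        freqs = torch.exp(
            torch.arange(half, dtype=torch.float) * -(math.log(10000.0) / (half - 1))
        )
        positions = torch.arange(num_embeddings, dtype=torch.float).unsqueeze(1)
        emb = torch.cat([torch.sin(positions * freqs), torch.cos(positions * freqs)], dim=1)
        if embedding_dim % 2 == 1:  # zero-pad the odd last dimension
            emb = torch.cat([emb, torch.zeros(num_embeddings, 1)], dim=1)
        return emb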

pytext.models.seq_models.projection_layers module

class pytext.models.seq_models.projection_layers.DecoderWithLinearOutputProjection(src_dict, dst_dict, out_embed_dim=512, *args, **kwargs)[source]

Bases: torch.nn.modules.module.Module

Simple linear projection from the hidden vector to vocab.

forward(encoder_out: Dict[str, torch.Tensor], decoder_out: Tuple[torch.Tensor, Dict[str, torch.Tensor]], incremental_state: Optional[Dict[str, torch.Tensor]] = None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_probs(decoder_out: Tuple[torch.Tensor, Dict[str, torch.Tensor]]) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]
reset_parameters()[source]
class pytext.models.seq_models.projection_layers.DecoupledDecoderHead(src_dict, dst_dict, out_embed_dim=512, encoder_hidden_dim=None, pointer_attention_heads=1, fixed_generation_vocab=None, attention_dropout=0.2, model_output_logprob=True)[source]

Bases: torch.nn.modules.module.Module

fixed_generation_vocab_expanded = typing_extensions.Final[torch.Tensor]
forward(encoder_out: Dict[str, torch.Tensor], decoder_out: Tuple[torch.Tensor, Dict[str, torch.Tensor]], incremental_state: Optional[Dict[str, torch.Tensor]] = None) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]

B: batch size
T_src: length of source sequence
T_trg: length of target sequence
C: hidden dimension
V_ont: size of ontology vocabulary
V_trg: size of full target vocabulary

get_pointer_src_tokens(encoder_out: Dict[str, torch.Tensor]) → torch.Tensor[source]
get_probs(decoder_out: Tuple[torch.Tensor, Dict[str, torch.Tensor]]) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor][source]
verify_encoder_out(encoder_out: Dict[str, torch.Tensor])[source]

pytext.models.seq_models.rnn_decoder module

class pytext.models.seq_models.rnn_decoder.DecoderWithLinearOutputProjection(out_vocab_size, out_embed_dim=512)[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule

Common super class for decoder networks with output projection layers.

forward(input_tokens, encoder_out: Dict[str, torch.Tensor], incremental_state: Optional[Dict[str, torch.Tensor]] = None, timestep: int = 0) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_unprojected(input_tokens, encoder_out, incremental_state=None)[source]

Forward pass through the decoder without output projection.

reset_parameters()[source]
class pytext.models.seq_models.rnn_decoder.RNNDecoder(out_vocab_size, embed_tokens, encoder_hidden_dim, embed_dim, hidden_dim, out_embed_dim, cell_type, num_layers, dropout_in, dropout_out, attention_type, attention_heads, first_layer_attention, averaging_encoder)[source]

Bases: pytext.models.seq_models.rnn_decoder.RNNDecoderBase, pytext.models.seq_models.rnn_decoder.DecoderWithLinearOutputProjection

class pytext.models.seq_models.rnn_decoder.RNNDecoderBase(embed_tokens, encoder_hidden_dim, embed_dim, hidden_dim, out_embed_dim, cell_type, num_layers, dropout_in, dropout_out, attention_type, attention_heads, first_layer_attention, averaging_encoder)[source]

Bases: pytext.models.seq_models.base.PyTextIncrementalDecoderComponent

RNN decoder with multihead attention. Attention is calculated using encoder output and the output of the decoder’s first RNN layer. Attention is applied after the first RNN layer and concatenated to the input of subsequent layers.

forward_unprojected(input_tokens, encoder_out: Dict[str, torch.Tensor], incremental_state: Optional[Dict[str, torch.Tensor]] = None) → Tuple[torch.Tensor, Dict[str, torch.Tensor]][source]
classmethod from_config(config, out_vocab_size, target_embedding)[source]
get_normalized_probs(net_output, log_probs, sample)[source]

Get normalized probabilities (or log probs) from a net’s output.

max_positions()[source]

Maximum output length supported by the decoder.

reorder_incremental_state(incremental_state: Dict[str, torch.Tensor], new_order)[source]

Reorder buffered internal state (for incremental generation).

pytext.models.seq_models.rnn_encoder module

class pytext.models.seq_models.rnn_encoder.BiLSTM(num_layers, bidirectional, embed_dim, hidden_dim, dropout)[source]

Bases: torch.nn.modules.module.Module

Wrapper for nn.LSTM

Differences include:
  • weight initialization
  • the bidirectional option makes the first layer bidirectional only (and in that case the hidden dim is divided by 2)

static LSTM(input_size, hidden_size, **kwargs)[source]
forward(embeddings: torch.Tensor, lengths: torch.Tensor, enforce_sorted: bool = True)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.seq_models.rnn_encoder.LSTMSequenceEncoder(embed_dim, hidden_dim, num_layers, dropout_in, dropout_out, bidirectional)[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule

RNN encoder using nn.LSTM for cuDNN support / ONNX exportability.

forward(src_tokens: torch.Tensor, embeddings: torch.Tensor, src_lengths) → Dict[str, torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config)[source]
max_positions()[source]

Maximum input length supported by the encoder.

tile_encoder_out(beam_size: int, encoder_out: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor][source]

pytext.models.seq_models.rnn_encoder_decoder module

class pytext.models.seq_models.rnn_encoder_decoder.RNNModel(encoder, decoder, source_embeddings)[source]

Bases: pytext.models.seq_models.base.PyTextSeq2SeqModule

forward(src_tokens: torch.Tensor, additional_features: List[List[torch.Tensor]], src_lengths, prev_output_tokens, incremental_state: Optional[Dict[str, torch.Tensor]] = None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.seq_models.rnn_encoder_decoder.RNNModel.Config, source_vocab, source_embedding, target_vocab, target_embedding)[source]
get_normalized_probs(net_output, log_probs, sample=None)[source]
max_decoder_positions()[source]

pytext.models.seq_models.seq2seq_model module

class pytext.models.seq_models.seq2seq_model.Seq2SeqModel(model: pytext.models.seq_models.rnn_encoder_decoder.RNNModel, output_layer: pytext.models.seq_models.seq2seq_output_layer.Seq2SeqOutputLayer, src_vocab: pytext.data.utils.Vocabulary, trg_vocab: pytext.data.utils.Vocabulary, dictfeat_vocab: pytext.data.utils.Vocabulary, generator_config=None)[source]

Bases: pytext.models.model.Model

Sequence to sequence model using an encoder-decoder architecture.

arrange_model_inputs(tensor_dict) → Tuple[torch.Tensor, Optional[Tuple[torch.Tensor, torch.Tensor, torch.Tensor]], torch.Tensor, torch.Tensor][source]
arrange_targets(tensor_dict)[source]
forward(src_tokens: torch.Tensor, dict_feats: Optional[Tuple[torch.Tensor, torch.Tensor, torch.Tensor]], contextual_token_embedding: Optional[torch.Tensor], src_lengths: torch.Tensor, trg_tokens: torch.Tensor)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.seq_models.seq2seq_model.Seq2SeqModel.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]
get_pred(model_outputs, context=None)[source]
max_decoder_positions()[source]
torchscriptify()[source]

pytext.models.seq_models.seq2seq_output_layer module

class pytext.models.seq_models.seq2seq_output_layer.Seq2SeqOutputLayer(target_names: Optional[List[str]] = None, loss_fn: Optional[pytext.loss.loss.Loss] = None, *args, **kwargs)[source]

Bases: pytext.models.output_layers.output_layer_base.OutputLayerBase

classmethod from_config(config: pytext.models.seq_models.seq2seq_output_layer.Seq2SeqOutputLayer.Config, vocab: pytext.data.utils.Vocabulary)[source]
get_loss(model_outputs: Tuple[torch.Tensor, Dict[str, torch.Tensor]], targets: Tuple[torch.Tensor, torch.Tensor], context: Dict[str, Any] = None, reduce=True) → torch.Tensor[source]

Compute and return the loss given logits and targets.

Parameters:
  • logit (torch.Tensor) – Logits returned by the Model.
  • target (torch.Tensor) – True label/target to compute loss against.
  • context (Optional[Dict[str, Any]]) – Context is a dictionary of items that’s passed as additional metadata by the DataHandler. Defaults to None.
  • reduce (bool) – Whether to reduce loss over the batch. Defaults to True.
Returns:

Model loss.

Return type:

torch.Tensor

pytext.models.seq_models.seqnn module

class pytext.models.seq_models.seqnn.SeqNNModel(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase)[source]

Bases: pytext.models.doc_model.DocModel

Classification model with a sequence of utterances as input. It uses a DocNN model (CNN or LSTM) to generate a vector representation for each sequence, and then uses an LSTM or BLSTM to capture the dynamics and produce labels for each sequence.

arrange_model_inputs(tensor_dict)[source]
class pytext.models.seq_models.seqnn.SeqNNModel_Deprecated(embedding: pytext.models.embeddings.embedding_base.EmbeddingBase, representation: pytext.models.representations.representation_base.RepresentationBase, decoder: pytext.models.decoders.decoder_base.DecoderBase, output_layer: pytext.models.output_layers.output_layer_base.OutputLayerBase)[source]

Bases: pytext.models.model.Model

Classification model with a sequence of utterances as input. It uses a DocNN model (CNN or LSTM) to generate a vector representation for each sequence, and then uses an LSTM or BLSTM to capture the dynamics and produce labels for each sequence.

DEPRECATED: Use SeqNNModel

pytext.models.seq_models.utils module

pytext.models.seq_models.utils.Linear(in_features, out_features, bias=True)[source]
pytext.models.seq_models.utils.extract_ontology_vocab(target_dictionary)[source]
pytext.models.seq_models.utils.make_positions(input, padding_idx: int)[source]

Replace non-padding symbols with their position numbers.

Position numbers begin at padding_idx+1. Padding symbols are ignored.
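
The described behavior can be reproduced in a few lines (a sketch under the stated semantics, not the library's code):

    import torch

    def make_positions_sketch(tokens: torch.Tensor, padding_idx: int) -> torch.Tensor:
        # Non-padding positions count up from padding_idx + 1;
        # padding positions stay at padding_idx.
        mask = tokens.ne(padding_idx).int()
        return (torch.cumsum(mask, dim=1) * mask).long() + padding_idx

    tokens = torch.tensor([[5, 7, 9, 0, 0]])  # 0 is the padding index here
    print(make_positions_sketch(tokens, padding_idx=0))
    # tensor([[1, 2, 3, 0, 0]])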

pytext.models.seq_models.utils.prepare_full_key(instance_id: str, key: str, secondary_key: Optional[str] = None)[source]
pytext.models.seq_models.utils.unfold1d(x, kernel_size: int, padding_l: int, pad_value: float = 0)[source]

unfold T x B x C to T x B x C x K
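
A shape-level example (padding_l = kernel_size - 1 here assumes a causal, left-padded window; the exact window alignment should be verified against the source):

    import torch
    from pytext.models.seq_models.utils import unfold1d

    T, B, C, K = 8, 2, 16, 3
    x = torch.rand(T, B, C)
    out = unfold1d(x, kernel_size=K, padding_l=K - 1)
    # out: T x B x C x K -- each time step t sees a window of K frames,
    # with out-of-range frames filled by pad_value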

pytext.models.seq_models.utils.verify_encoder_out(encoder_out: Dict[str, torch.Tensor], keys: List[str])[source]

Module contents