pytext.models.representations package

Submodules

pytext.models.representations.attention module

class pytext.models.representations.attention.DotProductSelfAttention(input_dim)[source]

Bases: pytext.models.module.Module

Given a vector w and token vectors {t1, t2, …, t_n}, computes self-attention weights over the tokens: a_j = softmax(w . t_j)

forward(tokens, tokens_mask)[source]
Input:
x: batch_size * seq_len * input_dim
x_mask: batch_size * seq_len (1 for padding, 0 for true)
Output:
alpha: batch_size * seq_len
classmethod from_config(config: pytext.models.representations.attention.DotProductSelfAttention.Config)[source]
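
A minimal sketch of the weight computation described above, written directly in torch; it illustrates the formula with made-up shapes, not the module's internal code:

    import torch
    import torch.nn.functional as F

    batch_size, seq_len, input_dim = 2, 5, 16
    tokens = torch.randn(batch_size, seq_len, input_dim)   # t_1 ... t_n
    w = torch.randn(input_dim)                              # stands in for the module's learned vector

    scores = tokens @ w                                     # batch_size x seq_len, one w . t_j per token
    alpha = F.softmax(scores, dim=1)                        # attention weight per token (padding would be masked out)
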
class pytext.models.representations.attention.MultiplicativeAttention(p_hidden_dim, q_hidden_dim, normalize)[source]

Bases: pytext.models.module.Module

Given a sequence P and a vector q, computes attention weights for each element in P by matching q with each element of P using multiplicative attention: a_i = softmax(p_i . W . q)

forward(p_seq: torch.Tensor, q: torch.Tensor, p_mask: torch.Tensor)[source]
Input:
p_seq: batch_size * p_seq_len * p_hidden_dim
q: batch_size * q_hidden_dim
p_mask: batch_size * p_seq_len (1 for padding, 0 for true)
Output:
attn_scores: batch_size * p_seq_len
classmethod from_config(config: pytext.models.representations.attention.MultiplicativeAttention.Config)[source]
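
An illustrative re-derivation of these weights in plain torch; the projection W stands in for the module's learned parameters, and masking with p_mask is omitted for brevity:

    import torch
    import torch.nn.functional as F

    batch_size, p_seq_len, p_hidden_dim, q_hidden_dim = 2, 5, 8, 6
    p_seq = torch.randn(batch_size, p_seq_len, p_hidden_dim)
    q = torch.randn(batch_size, q_hidden_dim)
    W = torch.randn(p_hidden_dim, q_hidden_dim)

    scores = (p_seq @ W).bmm(q.unsqueeze(2)).squeeze(2)     # batch_size x p_seq_len, p_i . W . q
    attn_scores = F.softmax(scores, dim=1)
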
class pytext.models.representations.attention.SequenceAlignedAttention(proj_dim)[source]

Bases: pytext.models.module.Module

Given sequences P and Q, computes attention weights for each element in P by matching Q with each element of P: a_i_j = softmax(p_i . q_j), where the softmax normalizes over the elements q_j.

forward(p: torch.Tensor, q: torch.Tensor, q_mask: torch.Tensor)[source]
Input:
p: batch_size * p_seq_len * dim
q: batch_size * q_seq_len * dim
q_mask: batch_size * q_seq_len (1 for padding, 0 for true)
Output:
matched_seq: batch_size * doc_seq_len * dim
classmethod from_config(config: pytext.models.representations.attention.SequenceAlignedAttention.Config)[source]
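
An illustrative sketch of the pairwise attention weights described above, including the masking of padded positions in Q; this mirrors the formula, not the module's actual code:

    import torch
    import torch.nn.functional as F

    batch_size, p_seq_len, q_seq_len, dim = 2, 4, 6, 8
    p = torch.randn(batch_size, p_seq_len, dim)
    q = torch.randn(batch_size, q_seq_len, dim)
    q_mask = torch.zeros(batch_size, q_seq_len, dtype=torch.bool)   # True (1) for padding, False (0) for true tokens

    scores = p.bmm(q.transpose(1, 2))                               # batch_size x p_seq_len x q_seq_len, p_i . q_j
    scores = scores.masked_fill(q_mask.unsqueeze(1), float("-inf"))
    alpha = F.softmax(scores, dim=-1)                               # normalizes over q_j for each p_i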

pytext.models.representations.augmented_lstm module

class pytext.models.representations.augmented_lstm.AugmentedLSTM(config: pytext.models.representations.augmented_lstm.AugmentedLSTM.Config, embed_dim: int, padding_value: float = 0.0)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

AugmentedLSTM implements a generic AugmentedLSTM representation layer. AugmentedLSTM is an LSTM that optionally appends a highway network to the output layer. The dropout setting controls the amount of variational dropout applied.

Parameters:
  • config (Config) – Configuration object of type AugmentedLSTM.Config.
  • embed_dim (int) – The number of expected features in the input.
  • padding_value (float) – Value for the padded elements. Defaults to 0.0.
padding_value

Value for the padded elements.

Type:float
forward_layers

A module list of unidirectional AugmentedLSTM layers moving forward in time.

Type:nn.ModuleList
backward_layers

A module list of unidirectional AugmentedLSTM layers moving backward in time.

Type:nn.ModuleList
representation_dim

The calculated dimension of the output features of AugmentedLSTM.

Type:int
forward(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Given an input batch of sequential data such as word embeddings, produces an AugmentedLSTM representation of the sequential input and new state tensors.

Parameters:
  • embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
  • seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
  • states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers x num_directions * nhid). Defaults to None.
Returns:

AugmentedLSTM representation of the input and the state of the LSTM at t = seq_len. Shape of representation is (bsize x seq_len x representation_dim). Shape of each state is (bsize x num_layers * num_directions x nhid).

Return type:

Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]

class pytext.models.representations.augmented_lstm.AugmentedLSTMCell(embed_dim: int, lstm_dim: int, use_highway: bool, use_bias: bool = True)[source]

Bases: torch.nn.modules.module.Module

AugmentedLSTMCell implements an AugmentedLSTM cell.

Parameters:
  • embed_dim (int) – The number of expected features in the input.
  • lstm_dim (int) – Number of features in the hidden state of the LSTM. Defaults to 32.
  • use_highway (bool) – If True we append a highway network to the outputs of the LSTM.
  • use_bias (bool) – If True we use a bias in our LSTM calculations, otherwise we don't.

input_linearity

Fused weight matrix which computes a linear function over the input.

Type:nn.Module
state_linearity

Fused weight matrix which computes a linear function over the states.

Type:nn.Module
forward(x: torch.Tensor, states=typing.Tuple[torch.Tensor, torch.Tensor], variational_dropout_mask: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor][source]

Warning: DO NOT USE THIS LAYER DIRECTLY, INSTEAD USE the AugmentedLSTM class

Parameters:
  • x (torch.Tensor) – Input tensor of shape (bsize x input_dim).
  • states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x nhid). Defaults to None.
Returns:

Returned states. Shape of each state is (bsize x nhid).

Return type:

Tuple[torch.Tensor, torch.Tensor]

reset_parameters()[source]
class pytext.models.representations.augmented_lstm.AugmentedLSTMUnidirectional(embed_dim: int, lstm_dim: int, go_forward: bool = True, recurrent_dropout_probability: float = 0.0, use_highway: bool = True, use_input_projection_bias: bool = True)[source]

Bases: torch.nn.modules.module.Module

AugmentedLSTMUnidirectional implements a one-layer, single-direction AugmentedLSTM layer. AugmentedLSTM is an LSTM that optionally appends a highway network to the output layer. The recurrent dropout setting controls the amount of variational dropout applied.

Parameters:
  • embed_dim (int) – The number of expected features in the input.
  • lstm_dim (int) – Number of features in the hidden state of the LSTM. Defaults to 32.
  • go_forward (bool) – Whether to compute features left to right (forward) or right to left (backward).
  • recurrent_dropout_probability (float) – Variational dropout probability to use. Defaults to 0.0.
  • use_highway (bool) – If True we append a highway network to the outputs of the LSTM.
  • use_input_projection_bias (bool) – If True we use a bias in our LSTM calculations, otherwise we don’t.
cell

AugmentedLSTMCell that is applied at every timestep.

Type:AugmentedLSTMCell
forward(inputs: torch.nn.utils.rnn.PackedSequence, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.nn.utils.rnn.PackedSequence, Tuple[torch.Tensor, torch.Tensor]][source]

Warning: DO NOT USE THIS LAYER DIRECTLY, INSTEAD USE the AugmentedLSTM class

Given an input batch of sequential data such as word embeddings, produces a single layer unidirectional AugmentedLSTM representation of the sequential input and new state tensors.

Parameters:
  • inputs (PackedSequence) – Input tensor of shape (bsize x seq_len x input_dim).
  • states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (1 x bsize x num_directions * nhid). Defaults to None.
Returns:

AugmentedLSTM representation of the input and the state of the LSTM at t = seq_len. Shape of representation is (bsize x seq_len x representation_dim). Shape of each state is (1 x bsize x nhid).

Return type:

Tuple[PackedSequence, Tuple[torch.Tensor, torch.Tensor]]

get_dropout_mask(dropout_probability: float, tensor_for_masking: torch.Tensor) → torch.Tensor[source]
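
A minimal sketch of what a variational (per-sequence) dropout mask looks like; the exact scaling used by get_dropout_mask is assumed here, and the helper name is illustrative:

    import torch

    def example_dropout_mask(dropout_probability: float, tensor_for_masking: torch.Tensor) -> torch.Tensor:
        # Sample one binary mask and rescale so the expected value is unchanged (inverted dropout).
        keep = (torch.rand_like(tensor_for_masking) > dropout_probability).float()
        return keep / (1.0 - dropout_probability)

    hidden = torch.randn(4, 32)                   # (bsize x nhid)
    mask = example_dropout_mask(0.3, hidden)
    # The same mask is reused on the hidden state at every timestep of the sequence,
    # which is what makes the dropout "variational".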

pytext.models.representations.bilstm module

class pytext.models.representations.bilstm.BiLSTM(config: pytext.models.representations.bilstm.BiLSTM.Config, embed_dim: int, padding_value: float = 0.0)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

BiLSTM implements a multi-layer bidirectional LSTM representation layer preceded by a dropout layer.

Parameters:
  • config (Config) – Configuration object of type BiLSTM.Config.
  • embed_dim (int) – The number of expected features in the input.
  • padding_value (float) – Value for the padded elements. Defaults to 0.0.
padding_value

Value for the padded elements.

Type:float
dropout

Dropout layer preceding the LSTM.

Type:nn.Dropout
lstm

LSTM layer that operates on the inputs.

Type:nn.LSTM
representation_dim

The calculated dimension of the output features of BiLSTM.

Type:int
forward(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation of the sequential input and new state tensors.

Parameters:
  • embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
  • seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
  • states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns:

Bidirectional LSTM representation of the input and the state of the LSTM at t = seq_len. Shape of representation is (bsize x seq_len x representation_dim). Shape of each state is (bsize x num_layers * num_directions x nhid).

Return type:

Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]
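
A hedged usage sketch; it assumes the default BiLSTM.Config values are valid and that embed_dim matches the last dimension of the embedded input:

    import torch
    from pytext.models.representations.bilstm import BiLSTM

    embed_dim = 100
    bilstm = BiLSTM(BiLSTM.Config(), embed_dim=embed_dim)

    embedded_tokens = torch.randn(3, 8, embed_dim)   # bsize x seq_len x input_dim
    seq_lengths = torch.tensor([8, 6, 5])            # sorted in decreasing order
    rep, (h_t, c_t) = bilstm(embedded_tokens, seq_lengths)
    # rep: bsize x seq_len x bilstm.representation_dim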

pytext.models.representations.bilstm_doc_attention module

class pytext.models.representations.bilstm_doc_attention.BiLSTMDocAttention(config: pytext.models.representations.bilstm_doc_attention.BiLSTMDocAttention.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

BiLSTMDocAttention implements a multi-layer bidirectional LSTM based representation for documents with or without pooling. The pooling can be max pooling, mean pooling or self attention.

Parameters:
  • config (Config) – Configuration object of type BiLSTMDocAttention.Config.
  • embed_dim (int) – The number of expected features in the input.
dropout

Dropout layer preceding the LSTM.

Type:nn.Dropout
lstm

Module that implements the LSTM.

Type:nn.Module
attention

Module that implements the attention or pooling.

Type:nn.Module
dense

Module that implements the non-linear projection over attended representation.

Type:nn.Module
representation_dim

The calculated dimension of the output features of the BiLSTMDocAttention representation.

Type:int
forward(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, *args, states: Tuple[torch.Tensor, torch.Tensor] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation with or without pooling of the sequential input and new state tensors.

Parameters:
  • embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
  • seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
  • states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns:

Bidirectional LSTM representation of input and the state of the LSTM at t = seq_len.

Return type:

Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]

pytext.models.representations.bilstm_doc_slot_attention module

class pytext.models.representations.bilstm_doc_slot_attention.BiLSTMDocSlotAttention(config: pytext.models.representations.bilstm_doc_slot_attention.BiLSTMDocSlotAttention.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

BiLSTMDocSlotAttention implements a multi-layer bidirectional LSTM based representation with support for various attention mechanisms.

In default mode, when no attention configuration is provided, it behaves like a multi-layer LSTM encoder and returns the output features from the last layer of the LSTM for each t. When the document_attention configuration is provided, it produces a fixed-sized document representation. When the slot_attention configuration is provided, it attends over the output of each LSTM cell to produce a fixed-sized word representation.

Parameters:
  • config (Config) – Configuration object of type BiLSTMDocSlotAttention.Config.
  • embed_dim (int) – The number of expected features in the input.
dropout

Dropout layer preceding the LSTM.

Type:nn.Dropout
relu

An instance of the ReLU layer.

Type:nn.ReLU
lstm

Module that implements the LSTM.

Type:nn.Module
use_doc_attention

If True, indicates using document attention.

Type:bool
doc_attention

Module that implements document attention.

Type:nn.Module
self.projection_d

A sequence of dense layers for projection over document representation.

Type:nn.Sequential
use_word_attention

If True, indicates using word attention.

Type:bool
word_attention

Module that implements word attention.

Type:nn.Module
self.projection_w

A sequence of dense layers for projection over word representation.

Type:nn.Sequential
representation_dim

The calculated dimension of the output features of the BiLSTMDocSlotAttention representation.

Type:int
forward(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, *args, states: torch.Tensor = None) → Tuple[torch.Tensor, torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation with the appropriate attention applied.

Parameters:
  • embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
  • seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
  • states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns:

Tensors containing the document and the word representation of the input.

Return type:

Tuple[torch.Tensor, torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]

pytext.models.representations.bilstm_slot_attn module

class pytext.models.representations.bilstm_slot_attn.BiLSTMSlotAttention(config: pytext.models.representations.bilstm_slot_attn.BiLSTMSlotAttention.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

BiLSTMSlotAttention implements a multi-layer bidirectional LSTM based representation with attention over slots.

Parameters:
  • config (Config) – Configuration object of type BiLSTMSlotAttention.Config.
  • embed_dim (int) – The number of expected features in the input.
dropout

Dropout layer preceding the LSTM.

Type:nn.Dropout
lstm

Module that implements the LSTM.

Type:nn.Module
attention

Module that implements the attention.

Type:nn.Module
dense

Module that implements the non-linear projection over attended representation.

Type:nn.Module
representation_dim

The calculated dimension of the output features of the SlotAttention representation.

Type:int
forward(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, *args, states: torch.Tensor = None, **kwargs) → torch.Tensor[source]

Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation with or without Slot attention.

Parameters:
  • embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
  • seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
  • states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns:

Bidirectional LSTM representation of input with or without slot attention.

Return type:

torch.Tensor

pytext.models.representations.biseqcnn module

class pytext.models.representations.biseqcnn.BSeqCNNRepresentation(config: pytext.models.representations.biseqcnn.BSeqCNNRepresentation.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

This class is an implementation of the paper https://arxiv.org/pdf/1606.07783. It is a bidirectional CNN model that captures context like RNNs do.

The module expects that input mini-batch is already padded.

TODO: Current implementation has a single layer conv-maxpool operation.

forward(inputs: torch.Tensor, *args) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.biseqcnn.ContextualWordConvolution(in_channels: int, out_channels: int, kernel_sizes: List[int])[source]

Bases: torch.nn.modules.module.Module

forward(words: torch.Tensor)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.contextual_intent_slot_rep module

class pytext.models.representations.contextual_intent_slot_rep.ContextualIntentSlotRepresentation(config: pytext.models.representations.contextual_intent_slot_rep.ContextualIntentSlotRepresentation.Config, embed_dim: Tuple[int, ...])[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

Representation for a contextual intent slot model

The inputs are two embeddings: a word-level embedding (which contains dictionary features) and a sequence (context) level embedding. The following diagram shows how the representation combines the two embeddings; seq_representation is concatenated with word_embeddings.

+-----------+
| word_embed|--------------------------->+   +--------------------+
+-----------+                            |   | doc_representation |
+-----------+   +-------------------+    |-->+--------------------+
| seq_embed |-->| seq_representation|--->+   | word_representation|
+-----------+   +-------------------+        +--------------------+
                                              joint_representation
forward(word_seq_embed: Tuple[torch.Tensor, torch.Tensor], word_lengths: torch.Tensor, seq_lengths: torch.Tensor, *args) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
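
An illustrative sketch of the combination step shown in the diagram above (shapes are made up; this mirrors the description, not the module's code):

    import torch

    batch_size, seq_len, word_dim, seq_rep_dim = 2, 6, 32, 16
    word_embed = torch.randn(batch_size, seq_len, word_dim)
    seq_representation = torch.randn(batch_size, seq_rep_dim)

    # Broadcast the sequence (context) representation over every token position and
    # concatenate it with the word embeddings before computing the word representation.
    expanded = seq_representation.unsqueeze(1).expand(-1, seq_len, -1)
    joint_word_input = torch.cat([word_embed, expanded], dim=2)   # batch x seq_len x (word_dim + seq_rep_dim)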

pytext.models.representations.deepcnn module

class pytext.models.representations.deepcnn.DeepCNNRepresentation(config: pytext.models.representations.deepcnn.DeepCNNRepresentation.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

DeepCNNRepresentation implements a CNN representation layer preceded by a dropout layer. The CNN representation layer is based on the encoder in the architecture proposed by Gehring et al. in Convolutional Sequence to Sequence Learning.

Parameters:
  • config (Config) – Configuration object of type DeepCNNRepresentation.Config.
  • embed_dim (int) – The number of expected features in the input.
forward(inputs: torch.Tensor, *args) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.deepcnn.SeparableConv1d(input_channels: int, output_channels: int, kernel_size: int, padding: int, dilation: int, bottleneck: int)[source]

Bases: torch.nn.modules.module.Module

Implements a 1d depthwise separable convolutional layer. In regular convolutional layers, the input channels are mixed with each other to produce each output channel. Depthwise separable convolutions decompose this process into two smaller convolutions – a depthwise and pointwise convolution.

The depthwise convolution spatially convolves each input channel separately, then the pointwise convolution projects this result into a new channel space. This process reduces the number of FLOPS used to compute a convolution and also exhibits a regularization effect. The general behavior – including the input parameters – is equivalent to nn.Conv1d.

bottleneck controls the behavior of the pointwise convolution. Instead of upsampling directly, we split the pointwise convolution into two pieces: the first convolution downsamples into a (sufficiently small) low dimension and the second convolution upsamples into the target (higher) dimension. Creating this bottleneck significantly cuts the number of parameters with minimal loss in performance.
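
A minimal sketch of the depthwise-separable decomposition with a bottleneck, built from plain nn.Conv1d layers (the layer names and sizes are illustrative, not the module's actual attributes):

    import torch
    import torch.nn as nn

    in_channels, out_channels, kernel_size, padding, dilation, bottleneck = 64, 128, 3, 1, 1, 32

    # Depthwise: each input channel is convolved separately (groups=in_channels).
    depthwise = nn.Conv1d(in_channels, in_channels, kernel_size,
                          padding=padding, dilation=dilation, groups=in_channels)
    # Pointwise, split into a down-projection and an up-projection through the bottleneck.
    pointwise_down = nn.Conv1d(in_channels, bottleneck, 1)
    pointwise_up = nn.Conv1d(bottleneck, out_channels, 1)

    x = torch.randn(2, in_channels, 10)               # batch x channels x time
    y = pointwise_up(pointwise_down(depthwise(x)))    # batch x out_channels x time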

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.deepcnn.Trim1d(trim)[source]

Bases: torch.nn.modules.module.Module

Trims a 1d convolutional output. Used to implement history-padding by removing excess padding from the right.
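
An illustrative sketch of the trimming operation (it is assumed that trim positions are removed from the right of the time dimension):

    import torch

    trim = 2
    x = torch.randn(2, 8, 10)        # batch x channels x time, history-padded on the right
    trimmed = x[:, :, :-trim]        # batch x 8 x 8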

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.deepcnn.create_conv_package(index: int, activation: pytext.config.module_config.Activation, in_channels: int, out_channels: int, kernel_size: int, causal: bool, dilated: bool, separable: bool, bottleneck: int, weight_norm: bool)[source]

Creates a convolutional layer with the specified arguments.

Parameters:
  • index (int) – Index of a convolutional layer in the stack.
  • activation (Activation) – Activation function.
  • in_channels (int) – Number of input channels.
  • out_channels (int) – Number of output channels.
  • kernel_size (int) – Size of 1d convolutional filter.
  • causal (bool) – Whether the convolution is causal or not. If set, it accounts for the temporal ordering of the inputs.
  • dilated (bool) – Whether the convolution is dilated or not. If set, the receptive field of the convolutional stack grows exponentially.
  • separable (bool) – Whether to use depthwise separable convolutions or not – see SeparableConv1d.
  • bottleneck (int) – Bottleneck channel dimension for depthwise separable convolutions. See SeparableConv1d for an in-depth explanation.
  • weight_norm (bool) – Whether to add weight normalization to the regular convolutions or not.
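
A hedged call sketch with illustrative argument values; Activation.RELU is assumed to be a valid member of the Activation enum:

    from pytext.config.module_config import Activation
    from pytext.models.representations.deepcnn import create_conv_package

    conv = create_conv_package(
        index=0,
        activation=Activation.RELU,
        in_channels=64,
        out_channels=64,
        kernel_size=3,
        causal=False,
        dilated=True,
        separable=True,
        bottleneck=32,
        weight_norm=False,
    )
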
pytext.models.representations.deepcnn.pool(pooling_type, words)[source]

pytext.models.representations.docnn module

class pytext.models.representations.docnn.DocNNRepresentation(config: pytext.models.representations.docnn.DocNNRepresentation.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

CNN based representation of a document.

conv_and_pool(x, conv)[source]
forward(embedded_tokens: torch.Tensor, *args) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.huggingface_bert_sentence_encoder module

class pytext.models.representations.huggingface_bert_sentence_encoder.HuggingFaceBertSentenceEncoder(config: pytext.models.representations.huggingface_bert_sentence_encoder.HuggingFaceBertSentenceEncoder.Config, output_encoded_layers: bool, *args, **kwargs)[source]

Bases: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase

Generate sentence representation using the open source HuggingFace BERT model. This class implements loading the model weights from a pre-trained model file.

pytext.models.representations.huggingface_electra_sentence_encoder module

class pytext.models.representations.huggingface_electra_sentence_encoder.HuggingFaceElectraSentenceEncoder(config: pytext.models.representations.huggingface_electra_sentence_encoder.HuggingFaceElectraSentenceEncoder.Config, output_encoded_layers: bool, *args, **kwargs)[source]

Bases: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase

Generate sentence representation using the open source HuggingFace Electra model. This class implements loading the model weights from a pre-trained model file.

pytext.models.representations.jointcnn_rep module

class pytext.models.representations.jointcnn_rep.JointCNNRepresentation(config: pytext.models.representations.jointcnn_rep.JointCNNRepresentation.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

forward(embedded_tokens: torch.Tensor, *args) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.jointcnn_rep.SharedCNNRepresentation(config: pytext.models.representations.jointcnn_rep.SharedCNNRepresentation.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

forward(embedded_tokens: torch.Tensor, *args) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.ordered_neuron_lstm module

class pytext.models.representations.ordered_neuron_lstm.OrderedNeuronLSTM(config: pytext.models.representations.ordered_neuron_lstm.OrderedNeuronLSTM.Config, embed_dim: int, padding_value: Optional[float] = 0.0)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

forward(rep: torch.Tensor, seq_lengths: torch.Tensor, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.ordered_neuron_lstm.OrderedNeuronLSTMLayer(embed_dim: int, lstm_dim: int, padding_value: float, dropout: float)[source]

Bases: pytext.models.module.Module

forward(embedded_tokens: torch.Tensor, states: Tuple[torch.Tensor, torch.Tensor], seq_lengths: List[int]) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.pair_rep module

class pytext.models.representations.pair_rep.PairRepresentation(config: pytext.models.representations.pair_rep.PairRepresentation.Config, embed_dim: Tuple[int, ...])[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

Wrapper representation for a pair of inputs.

Takes a tuple of inputs: the left sentence, and the right sentence(s). Returns a representation of the pair of sentences, either as a concatenation of the two sentence embeddings or as a “siamese” representation which also includes their difference and elementwise product (arXiv:1705.02364). If more than two inputs are provided, the extra inputs are assumed to be extra “right” sentences, and the output will be the stacked pair representations of the left sentence together with all right sentences. This is more efficient than separately computing all these pair representations, because the left sentence will not need to be re-embedded multiple times.

forward(embeddings: Tuple[torch.Tensor, ...], *lengths) → torch.Tensor[source]

Computes the pair representations.

Parameters:
  • embeddings – token embeddings of the left sentence, followed by the token embeddings of the right sentence(s).
  • lengths – the corresponding sequence lengths.
Returns:

A tensor of shape (num_right_inputs, batch_size, rep_size), with the first dimension squeezed out if it is 1.
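
An illustrative sketch of the "siamese" combination referenced above (arXiv:1705.02364); it mirrors the description, not the module's code:

    import torch

    left = torch.randn(4, 128)       # batch x rep_dim, left sentence embedding
    right = torch.randn(4, 128)      # batch x rep_dim, right sentence embedding
    siamese = torch.cat([left, right, left - right, left * right], dim=1)   # batch x (4 * rep_dim)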

pytext.models.representations.pass_through module

class pytext.models.representations.pass_through.PassThroughRepresentation(config: pytext.config.component.ComponentMeta.__new__.<locals>.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

forward(embedded_tokens: torch.Tensor, *args) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.pooling module

class pytext.models.representations.pooling.BoundaryPool(config: pytext.models.representations.pooling.BoundaryPool.Config, n_input: int)[source]

Bases: pytext.models.module.Module

forward(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.pooling.LastTimestepPool(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]

Bases: pytext.models.module.Module

forward(inputs: torch.Tensor, seq_lengths: torch.Tensor) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.pooling.MaxPool(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]

Bases: pytext.models.module.Module

forward(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.pooling.MeanPool(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]

Bases: pytext.models.module.Module

forward(inputs: torch.Tensor, seq_lengths: torch.Tensor) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.pooling.NoPool(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]

Bases: pytext.models.module.Module

forward(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.pooling.SelfAttention(config: pytext.models.representations.pooling.SelfAttention.Config, n_input: int)[source]

Bases: pytext.models.module.Module

forward(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights(init_range: float = 0.1) → None[source]

pytext.models.representations.pure_doc_attention module

class pytext.models.representations.pure_doc_attention.PureDocAttention(config: pytext.models.representations.pure_doc_attention.PureDocAttention.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

Pooling (e.g. max pooling or self attention) followed by an optional MLP.

forward(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor = None, *args) → Any[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.representation_base module

class pytext.models.representations.representation_base.RepresentationBase(config)[source]

Bases: pytext.models.module.Module

forward(*inputs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

get_representation_dim()[source]

pytext.models.representations.seq_rep module

class pytext.models.representations.seq_rep.SeqRepresentation(config: pytext.models.representations.seq_rep.SeqRepresentation.Config, embed_dim: int)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

Representation for a sequence of sentences. Each sentence is first embedded with a DocNN model, then the sequence of sentence embeddings is encoded with another DocNN/BiLSTM model.

forward(embedded_seqs: torch.Tensor, seq_lengths: torch.Tensor, *args) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.slot_attention module

class pytext.models.representations.slot_attention.SlotAttention(config: pytext.models.representations.slot_attention.SlotAttention.Config, n_input: int, batch_first: bool = True)[source]

Bases: pytext.models.module.Module

forward(inputs: torch.Tensor) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.sparse_transformer_sentence_encoder module

class pytext.models.representations.sparse_transformer_sentence_encoder.SparseTransformerSentenceEncoder(config: pytext.models.representations.sparse_transformer_sentence_encoder.SparseTransformerSentenceEncoder.Config, output_encoded_layers: bool, padding_idx: int, vocab_size: int, *args, **kwarg)[source]

Bases: pytext.models.representations.transformer_sentence_encoder.TransformerSentenceEncoder

Implementation of the Transformer Sentence Encoder. This directly makes use of the TransformerSentenceEncoder module in Fairseq.

A few interesting config options:
  • encoder_normalize_before determines whether the layer norm is applied before or after self_attention. This is similar to the original implementation from Google.
  • activation_fn can be set to ‘gelu’ instead of the default of ‘relu’.
  • project_representation adds a linear projection + tanh to the pooled output in the style of BERT.

pytext.models.representations.stacked_bidirectional_rnn module

class pytext.models.representations.stacked_bidirectional_rnn.RnnType[source]

Bases: enum.Enum

An enumeration.

GRU = 'gru'
LSTM = 'lstm'
RNN = 'rnn'
class pytext.models.representations.stacked_bidirectional_rnn.StackedBidirectionalRNN(config: pytext.models.representations.stacked_bidirectional_rnn.StackedBidirectionalRNN.Config, input_size: int, padding_value: float = 0.0)[source]

Bases: pytext.models.module.Module

StackedBidirectionalRNN implements a multi-layer bidirectional RNN with an option to return the outputs from all the layers of the RNN.

Parameters:
  • config (Config) – Configuration object of type StackedBidirectionalRNN.Config.
  • input_size (int) – The number of expected features in the input.
  • padding_value (float) – Value for the padded elements. Defaults to 0.0.
padding_value

Value for the padded elements.

Type:float
dropout

Dropout layer preceding the LSTM.

Type:nn.Dropout
lstm

LSTM layer that operates on the inputs.

Type:nn.LSTM
representation_dim

The calculated dimension of the output features of BiLSTM.

Type:int
forward(tokens, tokens_mask)[source]
Parameters:
  • tokens – batch, max_seq_len, hidden_size
  • tokens_mask – batch, max_seq_len (1 for padding, 0 for true)
Output:
tokens_encoded: batch, max_seq_len, hidden_size * num_layers if concat_layers is True; otherwise batch, max_seq_len, hidden_size

pytext.models.representations.traced_transformer_encoder module

class pytext.models.representations.traced_transformer_encoder.TraceableTransformerWrapper(eager_encoder: fairseq.modules.transformer_sentence_encoder.TransformerSentenceEncoder)[source]

Bases: torch.nn.modules.module.Module

forward(tokens: torch.Tensor, segment_labels: torch.Tensor = None, positions: torch.Tensor = None, token_embeddings: torch.Tensor = None, attn_mask: torch.Tensor = None) → Tuple[torch.Tensor, torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class pytext.models.representations.traced_transformer_encoder.TracedTransformerEncoder(eager_encoder: fairseq.modules.transformer_sentence_encoder.TransformerSentenceEncoder, tokens: torch.Tensor, segment_labels: torch.Tensor = None, positions: torch.Tensor = None, token_embeddings: torch.Tensor = None, attn_mask: torch.Tensor = None)[source]

Bases: torch.nn.modules.module.Module

forward(tokens: torch.Tensor, segment_labels: torch.Tensor = None, positions: torch.Tensor = None, token_embeddings: torch.Tensor = None, attn_mask: torch.Tensor = None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

pytext.models.representations.transformer_sentence_encoder module

class pytext.models.representations.transformer_sentence_encoder.TransformerSentenceEncoder(config: pytext.models.representations.transformer_sentence_encoder.TransformerSentenceEncoder.Config, output_encoded_layers: bool, padding_idx: int, vocab_size: int, *args, **kwarg)[source]

Bases: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase

Implementation of the Transformer Sentence Encoder. This directly makes use of the TransformerSentenceEncoder module in Fairseq.

A few interesting config options:
  • encoder_normalize_before determines whether the layer norm is applied before or after self_attention. This is similar to the original implementation from Google.
  • activation_fn can be set to ‘gelu’ instead of the default of ‘relu’.
load_state_dict(state_dict)[source]

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Parameters:
  • state_dict (dict) – a dict containing parameters and persistent buffers.
  • strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True
Returns:

  • missing_keys is a list of str containing the missing keys
  • unexpected_keys is a list of str containing the unexpected keys

Return type:

NamedTuple with missing_keys and unexpected_keys fields

upgrade_state_dict_named(state_dict)[source]

pytext.models.representations.transformer_sentence_encoder_base module

class pytext.models.representations.transformer_sentence_encoder_base.PoolingMethod[source]

Bases: enum.Enum

Pooling methods are chosen from the "Feature-based Approach" section in https://arxiv.org/pdf/1810.04805.pdf

AVG_CONCAT_LAST_4_LAYERS = 'avg_concat_last_4_layers'
AVG_LAST_LAYER = 'avg_last_layer'
AVG_SECOND_TO_LAST_LAYER = 'avg_second_to_last_layer'
AVG_SUM_LAST_4_LAYERS = 'avg_sum_last_4_layers'
CLS_TOKEN = 'cls_token'
NO_POOL = 'no_pool'
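
An illustrative sketch of two of the pooling methods above, applied to a list of per-layer outputs of shape B x T x C (this mirrors the descriptions, not the encoder's internal code):

    import torch

    encoded_layers = [torch.randn(2, 7, 16) for _ in range(13)]   # e.g. 12 transformer layers + embedding output

    # AVG_LAST_LAYER: average the last layer over the time dimension.
    avg_last_layer = encoded_layers[-1].mean(dim=1)               # B x C

    # AVG_CONCAT_LAST_4_LAYERS: concatenate the last four layers on the feature
    # dimension, then average over time.
    avg_concat_last_4 = torch.cat(encoded_layers[-4:], dim=2).mean(dim=1)   # B x 4C
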
class pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase(config: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase.Config, output_encoded_layers=False, *args, **kwargs)[source]

Bases: pytext.models.representations.representation_base.RepresentationBase

Base class for all Bi-directional Transformer based Sentence Encoders. All children of this class should implement an _encoder function which takes as input: tokens, [optional] segment labels and a pad mask and outputs both the sentence representation (output of _pool_encoded_layers) and the output states of all the intermediate Transformer layers as a list of tensors.

The input tuple consists of the following elements:
1) tokens: torch tensor of size B x T which contains token ids
2) pad_mask: torch tensor of size B x T generated with the condition tokens != self.vocab.get_pad_index()
3) segment_labels: torch tensor of size B x T which contains the segment id of each token

The output tuple consists of the following elements:
1) encoded_layers: list of torch tensors where each tensor has shape B x T x C and there are num_transformer_layers + 1 of these. Each tensor represents the output of an intermediate transformer layer, with the 0th element being the input to the first transformer layer (token + segment + position embedding).
2) [Optional] pooled_output: output of the pooling operation associated with config.pooling_method applied to the encoded_layers. Size B x C (or B x 4C if pooling = AVG_CONCAT_LAST_4_LAYERS).
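
An illustrative construction of the input tuple described above (the token ids and pad index are made up for the example):

    import torch

    pad_index = 1
    tokens = torch.tensor([[5, 9, 4, pad_index],
                           [7, 3, pad_index, pad_index]])      # B x T
    pad_mask = (tokens != pad_index).long()                    # B x T, 1 for real tokens, 0 for padding
    segment_labels = torch.zeros_like(tokens)                  # single-segment input
    input_tuple = (tokens, pad_mask, segment_labels)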

forward(input_tuple: Tuple[torch.Tensor, ...], *args) → Tuple[torch.Tensor, ...][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_config(config: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase.Config, output_encoded_layers=False, *args, **kwargs)[source]

Module contents