pytext.models.representations package¶
Subpackages¶
- pytext.models.representations.transformer package
- Submodules
- pytext.models.representations.transformer.multihead_attention module
- pytext.models.representations.transformer.multihead_linear_attention module
- pytext.models.representations.transformer.positional_embedding module
- pytext.models.representations.transformer.representation module
- pytext.models.representations.transformer.residual_mlp module
- pytext.models.representations.transformer.sentence_encoder module
- pytext.models.representations.transformer.transformer module
- Module contents
Submodules¶
pytext.models.representations.attention module¶
-
class
pytext.models.representations.attention.
DotProductSelfAttention
(input_dim)[source]¶ Bases:
pytext.models.module.Module
Given a vector w and token vectors {t_1, t_2, …, t_n}, computes self-attention weights used to weight the tokens: a_j = softmax(w . t_j).
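The formula a_j = softmax(w . t_j) can be sketched in plain Python as follows. This is only an illustration of the arithmetic, not the module's implementation: the helper names are hypothetical, tensors are replaced by lists, and the weighted sum returned as `pooled` is an assumption about how the weights are typically used.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def dot_product_self_attention(w, tokens):
    """Return (weights, pooled), where weights[j] = softmax_j(w . t_j)."""
    weights = softmax([dot(w, t) for t in tokens])
    dim = len(tokens[0])
    # Assumption: the weights are used to form a weighted sum of the tokens.
    pooled = [sum(a * t[d] for a, t in zip(weights, tokens)) for d in range(dim)]
    return weights, pooled


w = [1.0, 0.0]
tokens = [[2.0, 0.0], [0.0, 5.0], [2.0, 1.0]]
weights, pooled = dot_product_self_attention(w, tokens)
```

Tokens 1 and 3 have the same dot product with w, so they receive equal weight; token 2 is orthogonal to w and receives the smallest weight.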
-
class
pytext.models.representations.attention.
MultiplicativeAttention
(p_hidden_dim, q_hidden_dim, normalize)[source]¶ Bases:
pytext.models.module.Module
Given sequence P and vector q, computes attention weights for each element in P by matching q with each element in P using multiplicative attention: a_i = softmax(p_i . W . q).
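A minimal plain-Python sketch of a_i = softmax(p_i . W . q), under the assumption that W has shape (p_hidden_dim x q_hidden_dim). The function name is hypothetical and the `normalize` constructor argument is not modelled, since its behavior is not described here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multiplicative_attention(P, W, q):
    """Return one weight per element of P: softmax_i(p_i . W . q)."""
    # First project q into P's space: Wq[r] = sum_c W[r][c] * q[c].
    Wq = [sum(w_rc * q_c for w_rc, q_c in zip(row, q)) for row in W]
    # Then score each p_i against the projected query.
    scores = [sum(p_r * wq_r for p_r, wq_r in zip(p, Wq)) for p in P]
    return softmax(scores)


P = [[1.0, 0.0], [0.0, 1.0]]               # two elements, p_hidden_dim = 2
W = [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]]     # shape (2 x 3)
q = [0.5, 9.0, 1.0]                        # q_hidden_dim = 3
attn = multiplicative_attention(P, W, q)
```

Here W . q = [0.5, 2.0], so the raw scores are 0.5 and 2.0 and the second element of P receives the larger weight.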
-
class
pytext.models.representations.attention.
SequenceAlignedAttention
(proj_dim)[source]¶ Bases:
pytext.models.module.Module
Given sequences P and Q, computes attention weights for each element in P by matching Q with each element in P: a_i_j = softmax(p_i . q_j), where the softmax is computed by summing over q_j.
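The alignment a_i_j = softmax(p_i . q_j) produces one row of weights per element of P, each row normalized over the elements of Q. The sketch below shows only that normalization; it is hypothetical and omits any projection that `proj_dim` may imply.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sequence_aligned_attention(P, Q):
    """Return a len(P) x len(Q) matrix; row i is softmax over j of p_i . q_j."""
    return [softmax([dot(p, q) for q in Q]) for p in P]


P = [[1.0, 0.0], [0.0, 1.0]]
Q = [[3.0, 0.0], [0.0, 3.0], [0.0, 0.0]]
attn = sequence_aligned_attention(P, Q)
```

Each row of `attn` sums to 1, and each p_i puts most of its weight on the q_j it is most aligned with.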
pytext.models.representations.augmented_lstm module¶
-
class
pytext.models.representations.augmented_lstm.
AugmentedLSTM
(config: pytext.models.representations.augmented_lstm.AugmentedLSTM.Config, embed_dim: int, padding_value: float = 0.0)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
AugmentedLSTM implements a generic AugmentedLSTM representation layer. AugmentedLSTM is an LSTM that optionally appends a highway network to the output layer. Furthermore, the dropout controls the level of variational dropout applied.
Parameters: - config (Config) – Configuration object of type AugmentedLSTM.Config.
- embed_dim (int) – The number of expected features in the input.
- padding_value (float) – Value for the padded elements. Defaults to 0.0.
-
padding_value
¶ Value for the padded elements.
Type: float
-
forward_layers
¶ A module list of unidirectional AugmentedLSTM layers moving forward in time.
Type: nn.ModuleList
-
backward_layers
¶ A module list of unidirectional AugmentedLSTM layers moving backward in time.
Type: nn.ModuleList
-
representation_dim
¶ The calculated dimension of the output features of AugmentedLSTM.
Type: int
-
forward
(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]¶ Given an input batch of sequential data such as word embeddings, produces an AugmentedLSTM representation of the sequential input and new state tensors.
Parameters: - embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
- seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
- states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers x num_directions * nhid). Defaults to None.
Returns: AugmentedLSTM representation of input and the state of the LSTM at t = seq_len. Shape of representation is (bsize x seq_len x representation_dim). Shape of each state is (bsize x num_layers * num_directions x nhid).
Return type: Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]
-
class
pytext.models.representations.augmented_lstm.
AugmentedLSTMCell
(embed_dim: int, lstm_dim: int, use_highway: bool, use_bias: bool = True)[source]¶ Bases:
torch.nn.modules.module.Module
AugmentedLSTMCell implements an AugmentedLSTM cell.
Parameters: - embed_dim (int) – The number of expected features in the input.
- lstm_dim (int) – Number of features in the hidden state of the LSTM. Defaults to 32.
- use_highway (bool) – If True we append a highway network to the outputs of the LSTM.
- use_bias (bool) – If True we use a bias in our LSTM calculations, otherwise we don’t.
-
input_linearity
¶ Fused weight matrix which computes a linear function over the input.
Type: nn.Module
-
state_linearity
¶ Fused weight matrix which computes a linear function over the states.
Type: nn.Module
-
forward
(x: torch.Tensor, states=typing.Tuple[torch.Tensor, torch.Tensor], variational_dropout_mask: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor][source]¶ Warning: DO NOT USE THIS LAYER DIRECTLY, INSTEAD USE the AugmentedLSTM class
Parameters: - x (torch.Tensor) – Input tensor of shape (bsize x input_dim).
- states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x nhid). Defaults to None.
Returns: Returned states. Shape of each state is (bsize x nhid).
Return type: Tuple[torch.Tensor, torch.Tensor]
-
-
class
pytext.models.representations.augmented_lstm.
AugmentedLSTMUnidirectional
(embed_dim: int, lstm_dim: int, go_forward: bool = True, recurrent_dropout_probability: float = 0.0, use_highway: bool = True, use_input_projection_bias: bool = True)[source]¶ Bases:
torch.nn.modules.module.Module
AugmentedLSTMUnidirectional implements a one-layer single directional AugmentedLSTM layer. AugmentedLSTM is an LSTM that optionally appends a highway network to the output layer. Furthermore, the dropout controls the level of variational dropout applied.
Parameters: - embed_dim (int) – The number of expected features in the input.
- lstm_dim (int) – Number of features in the hidden state of the LSTM. Defaults to 32.
- go_forward (bool) – Whether to compute features left to right (forward) or right to left (backward).
- recurrent_dropout_probability (float) – Variational dropout probability to use. Defaults to 0.0.
- use_highway (bool) – If True we append a highway network to the outputs of the LSTM.
- use_input_projection_bias (bool) – If True we use a bias in our LSTM calculations, otherwise we don’t.
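Variational dropout differs from ordinary dropout in that one mask is sampled per sequence and reused at every timestep. The sketch below illustrates only that idea in plain Python; the function names are hypothetical, and which tensor the layer actually masks is not stated here.

```python
import random


def variational_dropout_mask(hidden_dim, p, rng):
    """Sample one inverted-dropout mask; kept units are scaled by 1 / (1 - p)."""
    keep = 1.0 - p
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(hidden_dim)]


def apply_over_time(hidden_states, mask):
    """Apply the same mask to the hidden state at every timestep."""
    return [[h * m for h, m in zip(state, mask)] for state in hidden_states]


rng = random.Random(0)
mask = variational_dropout_mask(hidden_dim=8, p=0.5, rng=rng)
hidden_states = [[1.0] * 8 for _ in range(5)]   # 5 timesteps, 8 features
dropped = apply_over_time(hidden_states, mask)
```

Because the mask is fixed for the whole sequence, the same hidden units are zeroed at every timestep, which is what distinguishes this scheme from resampling a mask per step.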
-
cell
¶ AugmentedLSTMCell that is applied at every timestep.
Type: AugmentedLSTMCell
-
forward
(inputs: torch.nn.utils.rnn.PackedSequence, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.nn.utils.rnn.PackedSequence, Tuple[torch.Tensor, torch.Tensor]][source]¶ Warning: DO NOT USE THIS LAYER DIRECTLY, INSTEAD USE the AugmentedLSTM class
Given an input batch of sequential data such as word embeddings, produces a single layer unidirectional AugmentedLSTM representation of the sequential input and new state tensors.
Parameters: - inputs (PackedSequence) – Input tensor of shape (bsize x seq_len x input_dim).
- states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (1 x bsize x num_directions * nhid). Defaults to None.
Returns: AugmentedLSTM representation of input and the state of the LSTM at t = seq_len. Shape of representation is (bsize x seq_len x representation_dim). Shape of each state is (1 x bsize x nhid).
Return type: Tuple[PackedSequence, Tuple[torch.Tensor, torch.Tensor]]
pytext.models.representations.bilstm module¶
-
class
pytext.models.representations.bilstm.
BiLSTM
(config: pytext.models.representations.bilstm.BiLSTM.Config, embed_dim: int, padding_value: float = 0.0)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
BiLSTM implements a multi-layer bidirectional LSTM representation layer preceded by a dropout layer.
Parameters: - config (Config) – Configuration object of type BiLSTM.Config.
- embed_dim (int) – The number of expected features in the input.
- padding_value (float) – Value for the padded elements. Defaults to 0.0.
-
padding_value
¶ Value for the padded elements.
Type: float
-
dropout
¶ Dropout layer preceding the LSTM.
Type: nn.Dropout
-
lstm
¶ LSTM layer that operates on the inputs.
Type: nn.LSTM
-
representation_dim
¶ The calculated dimension of the output features of BiLSTM.
Type: int
-
forward
(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]¶ Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation of the sequential input and new state tensors.
Parameters: - embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
- seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
- states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns: Bidirectional LSTM representation of input and the state of the LSTM at t = seq_len. Shape of representation is (bsize x seq_len x representation_dim). Shape of each state is (bsize x num_layers * num_directions x nhid).
Return type: Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]
pytext.models.representations.bilstm_doc_attention module¶
-
class
pytext.models.representations.bilstm_doc_attention.
BiLSTMDocAttention
(config: pytext.models.representations.bilstm_doc_attention.BiLSTMDocAttention.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
BiLSTMDocAttention implements a multi-layer bidirectional LSTM based representation for documents with or without pooling. The pooling can be max pooling, mean pooling or self attention.
Parameters: - config (Config) – Configuration object of type BiLSTMDocAttention.Config.
- embed_dim (int) – The number of expected features in the input.
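The max and mean pooling options reduce a (bsize x seq_len x dim) sequence of LSTM outputs to one vector per batch element. The following plain-Python sketch shows both reductions over padded sequences; it is hypothetical, and it assumes padded positions are excluded using `seq_lengths`, which this page does not confirm for the library's pooling modules.

```python
def mean_pool(outputs, seq_lengths):
    """Average each feature over the first seq_lengths[b] timesteps."""
    pooled = []
    for seq, n in zip(outputs, seq_lengths):
        dim = len(seq[0])
        pooled.append([sum(step[d] for step in seq[:n]) / n for d in range(dim)])
    return pooled


def max_pool(outputs, seq_lengths):
    """Take the per-feature maximum over the first seq_lengths[b] timesteps."""
    pooled = []
    for seq, n in zip(outputs, seq_lengths):
        dim = len(seq[0])
        pooled.append([max(step[d] for step in seq[:n]) for d in range(dim)])
    return pooled


# Batch of 2 sequences padded to length 3, with 2 features per timestep.
outputs = [
    [[1.0, 4.0], [3.0, 2.0], [0.0, 0.0]],      # last step is padding
    [[5.0, -1.0], [7.0, -3.0], [9.0, -5.0]],
]
seq_lengths = [2, 3]
mean_rep = mean_pool(outputs, seq_lengths)
max_rep = max_pool(outputs, seq_lengths)
```

In the second sequence every second feature is negative, so a max that included zero padding would be wrong there; slicing by `seq_lengths` avoids that.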
-
dropout
¶ Dropout layer preceding the LSTM.
Type: nn.Dropout
-
lstm
¶ Module that implements the LSTM.
Type: nn.Module
-
attention
¶ Module that implements the attention or pooling.
Type: nn.Module
-
dense
¶ Module that implements the non-linear projection over attended representation.
Type: nn.Module
-
representation_dim
¶ The calculated dimension of the output features of the BiLSTMDocAttention representation.
Type: int
-
forward
(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, *args, states: Tuple[torch.Tensor, torch.Tensor] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]¶ Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation with or without pooling of the sequential input and new state tensors.
Parameters: - embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
- seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
- states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns: Bidirectional LSTM representation of input and the state of the LSTM at t = seq_len.
Return type: Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]
pytext.models.representations.bilstm_doc_slot_attention module¶
-
class
pytext.models.representations.bilstm_doc_slot_attention.
BiLSTMDocSlotAttention
(config: pytext.models.representations.bilstm_doc_slot_attention.BiLSTMDocSlotAttention.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
BiLSTMDocSlotAttention implements a multi-layer bidirectional LSTM based representation with support for various attention mechanisms.
In default mode, when attention configuration is not provided, it behaves like a multi-layer LSTM encoder and returns the output features from the last layer of the LSTM, for each t. When document_attention configuration is provided, it produces a fixed-sized document representation. When slot_attention configuration is provided, it attends on the output of each cell of the LSTM module to produce a fixed-sized word representation.
Parameters: - config (Config) – Configuration object of type BiLSTMDocSlotAttention.Config.
- embed_dim (int) – The number of expected features in the input.
-
dropout
¶ Dropout layer preceding the LSTM.
Type: nn.Dropout
-
relu
¶ An instance of the ReLU layer.
Type: nn.ReLU
-
lstm
¶ Module that implements the LSTM.
Type: nn.Module
-
use_doc_attention
¶ If True, indicates using document attention.
Type: bool
-
doc_attention
¶ Module that implements document attention.
Type: nn.Module
-
projection_d
¶ A sequence of dense layers for projection over document representation.
Type: nn.Sequential
-
use_word_attention
¶ If True, indicates using word attention.
Type: bool
-
word_attention
¶ Module that implements word attention.
Type: nn.Module
-
projection_w
¶ A sequence of dense layers for projection over word representation.
Type: nn.Sequential
-
representation_dim
¶ The calculated dimension of the output features of the BiLSTMDocSlotAttention representation.
Type: int
-
forward
(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, *args, states: torch.Tensor = None) → Tuple[torch.Tensor, torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]¶ Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation with the appropriate attention.
Parameters: - embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
- seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
- states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns: Tensors containing the document and the word representation of the input.
Return type: Tuple[torch.Tensor, torch.Tensor, Tuple[torch.Tensor, torch.Tensor]]
pytext.models.representations.bilstm_slot_attn module¶
-
class
pytext.models.representations.bilstm_slot_attn.
BiLSTMSlotAttention
(config: pytext.models.representations.bilstm_slot_attn.BiLSTMSlotAttention.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
BiLSTMSlotAttention implements a multi-layer bidirectional LSTM based representation with attention over slots.
Parameters: - config (Config) – Configuration object of type BiLSTMSlotAttention.Config.
- embed_dim (int) – The number of expected features in the input.
-
dropout
¶ Dropout layer preceding the LSTM.
Type: nn.Dropout
-
lstm
¶ Module that implements the LSTM.
Type: nn.Module
-
attention
¶ Module that implements the attention.
Type: nn.Module
-
dense
¶ Module that implements the non-linear projection over attended representation.
Type: nn.Module
-
representation_dim
¶ The calculated dimension of the output features of the SlotAttention representation.
Type: int
-
forward
(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor, *args, states: torch.Tensor = None, **kwargs) → torch.Tensor[source]¶ Given an input batch of sequential data such as word embeddings, produces a bidirectional LSTM representation with or without Slot attention.
Parameters: - embedded_tokens (torch.Tensor) – Input tensor of shape (bsize x seq_len x input_dim).
- seq_lengths (torch.Tensor) – Sequence lengths of each batch element.
- states (Tuple[torch.Tensor, torch.Tensor]) – Tuple of tensors containing the initial hidden state and the cell state of each element in the batch. Each of these tensors has a dimension of (bsize x num_layers * num_directions x nhid). Defaults to None.
Returns: Bidirectional LSTM representation of input with or without slot attention.
Return type: torch.Tensor
pytext.models.representations.biseqcnn module¶
-
class
pytext.models.representations.biseqcnn.
BSeqCNNRepresentation
(config: pytext.models.representations.biseqcnn.BSeqCNNRepresentation.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
This class is an implementation of the paper https://arxiv.org/pdf/1606.07783. It is a bidirectional CNN model that captures context like RNNs do.
The module expects that input mini-batch is already padded.
TODO: Current implementation has a single layer conv-maxpool operation.
-
forward
(inputs: torch.Tensor, *args) → torch.Tensor[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
pytext.models.representations.biseqcnn.
ContextualWordConvolution
(in_channels: int, out_channels: int, kernel_sizes: List[int])[source]¶ Bases:
torch.nn.modules.module.Module
-
forward
(words: torch.Tensor)[source]¶
-
pytext.models.representations.contextual_intent_slot_rep module¶
-
class
pytext.models.representations.contextual_intent_slot_rep.
ContextualIntentSlotRepresentation
(config: pytext.models.representations.contextual_intent_slot_rep.ContextualIntentSlotRepresentation.Config, embed_dim: Tuple[int, ...])[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
Representation for a contextual intent slot model
The inputs are two embeddings: word level embedding containing dictionary features, sequence (contexts) level embedding. See following diagram for the representation implementation that combines the two embeddings. Seq_representation is concatenated with word_embeddings.
+-----------+
| word_embed|--------------------------->+   +--------------------+
+-----------+                            |   | doc_representation |
+-----------+   +-------------------+    |-->+--------------------+
| seq_embed |-->| seq_representation|--->+   | word_representation|
+-----------+   +-------------------+        +--------------------+
                                              joint_representation
-
forward
(word_seq_embed: Tuple[torch.Tensor, torch.Tensor], word_lengths: torch.Tensor, seq_lengths: torch.Tensor, *args) → List[torch.Tensor][source]¶
-
pytext.models.representations.deepcnn module¶
-
class
pytext.models.representations.deepcnn.
DeepCNNRepresentation
(config: pytext.models.representations.deepcnn.DeepCNNRepresentation.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
DeepCNNRepresentation implements a CNN representation layer preceded by a dropout layer. The CNN representation layer is based on the encoder in the architecture proposed by Gehring et al. in Convolutional Sequence to Sequence Learning.
Parameters: - config (Config) – Configuration object of type DeepCNNRepresentation.Config.
- embed_dim (int) – The number of expected features in the input.
-
forward
(inputs: torch.Tensor, *args) → torch.Tensor[source]¶
-
class
pytext.models.representations.deepcnn.
SeparableConv1d
(input_channels: int, output_channels: int, kernel_size: int, padding: int, dilation: int, bottleneck: int)[source]¶ Bases:
torch.nn.modules.module.Module
Implements a 1d depthwise separable convolutional layer. In regular convolutional layers, the input channels are mixed with each other to produce each output channel. Depthwise separable convolutions decompose this process into two smaller convolutions – a depthwise and pointwise convolution.
The depthwise convolution spatially convolves each input channel separately, then the pointwise convolution projects this result into a new channel space. This process reduces the number of FLOPS used to compute a convolution and also exhibits a regularization effect. The general behavior – including the input parameters – is equivalent to nn.Conv1d.
bottleneck controls the behavior of the pointwise convolution. Instead of upsampling directly, we split the pointwise convolution into two pieces: the first convolution downsamples into a (sufficiently small) low dimension and the second convolution upsamples into the target (higher) dimension. Creating this bottleneck significantly cuts the number of parameters with minimal loss in performance.
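The parameter savings described above can be checked with simple arithmetic. The sketch below counts weights (biases ignored) for a regular 1d convolution, a depthwise separable one, and a separable one with a bottleneck; the function names and the example channel sizes are illustrative assumptions, not values taken from the library.

```python
def regular_conv_params(c_in, c_out, k):
    """Every output channel mixes all input channels across the kernel."""
    return c_in * c_out * k


def separable_conv_params(c_in, c_out, k):
    """Depthwise (one k-wide filter per input channel) plus 1x1 pointwise."""
    return c_in * k + c_in * c_out


def bottleneck_separable_params(c_in, c_out, k, bottleneck):
    """Depthwise, then pointwise split into down- and up-projection."""
    return c_in * k + c_in * bottleneck + bottleneck * c_out


c_in, c_out, k, bottleneck = 256, 512, 3, 32
regular = regular_conv_params(c_in, c_out, k)                         # 393216
separable = separable_conv_params(c_in, c_out, k)                     # 131840
squeezed = bottleneck_separable_params(c_in, c_out, k, bottleneck)    # 25344
```

For these sizes the separable layer needs roughly a third of the weights of the regular one, and the bottleneck variant under a fifteenth.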
-
forward
(x)[source]¶
-
-
class
pytext.models.representations.deepcnn.
Trim1d
(trim)[source]¶ Bases:
torch.nn.modules.module.Module
Trims a 1d convolutional output. Used to implement history-padding by removing excess padding from the right.
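One way to read "history-padding" is: pad the input on both sides, convolve, then drop the outputs that looked at right-hand padding so that each position depends only on current and earlier inputs. The plain-Python sketch below follows that reading; the helper names are hypothetical and the exact padding amount used by the library is an assumption.

```python
def conv1d_padded(x, kernel, pad):
    """Cross-correlate x with kernel after zero-padding both sides by `pad`."""
    padded = [0.0] * pad + list(x) + [0.0] * pad
    k = len(kernel)
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(padded) - k + 1)
    ]


def trim1d(y, trim):
    """Remove the last `trim` outputs, i.e. the excess on the right."""
    return list(y[:-trim]) if trim > 0 else list(y)


def causal_conv1d(x, kernel):
    """Pad by kernel_size - 1 on both sides, convolve, then trim the right."""
    pad = len(kernel) - 1
    return trim1d(conv1d_padded(x, kernel, pad), pad)


out = causal_conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0])
changed = causal_conv1d([1.0, 2.0, 3.0, 100.0], [1.0, 1.0, 1.0])
```

With an all-ones kernel of width 3, `out` is a running sum of the current and two previous inputs. Changing only the last input leaves every earlier output untouched, which is the causal property the trim is meant to provide.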
-
forward
(x)[source]¶
-
-
pytext.models.representations.deepcnn.
create_conv_package
(index: int, activation: pytext.config.module_config.Activation, in_channels: int, out_channels: int, kernel_size: int, causal: bool, dilated: bool, separable: bool, bottleneck: int, weight_norm: bool)[source]¶ Creates a convolutional layer with the specified arguments.
Parameters: - index (int) – Index of a convolutional layer in the stack.
- activation (Activation) – Activation function.
- in_channels (int) – Number of input channels.
- out_channels (int) – Number of output channels.
- kernel_size (int) – Size of 1d convolutional filter.
- causal (bool) – Whether the convolution is causal or not. If set, it accounts for the temporal ordering of the inputs.
- dilated (bool) – Whether the convolution is dilated or not. If set, the receptive field of the convolutional stack grows exponentially.
- separable (bool) – Whether to use depthwise separable convolutions or not -- see SeparableConv1d.
- bottleneck (int) – Bottleneck channel dimension for depthwise separable convolutions. See SeparableConv1d for an in-depth explanation.
- weight_norm (bool) – Whether to add weight normalization to the regular convolutions or not.
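The effect of the dilated flag on the receptive field can be illustrated with a short calculation. The sketch assumes the dilation of the layer at position `index` is 2 ** index, a common choice that this page does not confirm; the function name is hypothetical.

```python
def receptive_field(num_layers, kernel_size, dilated):
    """Number of input positions visible to one output of the conv stack."""
    field = 1
    for index in range(num_layers):
        # Assumption: dilation doubles with each layer when `dilated` is set.
        dilation = 2 ** index if dilated else 1
        field += (kernel_size - 1) * dilation
    return field


plain = receptive_field(num_layers=4, kernel_size=3, dilated=False)   # 9
wide = receptive_field(num_layers=4, kernel_size=3, dilated=True)     # 31
```

Without dilation the receptive field grows linearly with depth (2 extra positions per layer here); with doubling dilation it grows as 2 ** (num_layers + 1) - 1 for a kernel of size 3.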
pytext.models.representations.docnn module¶
-
class
pytext.models.representations.docnn.
DocNNRepresentation
(config: pytext.models.representations.docnn.DocNNRepresentation.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
CNN based representation of a document.
-
forward
(embedded_tokens: torch.Tensor, *args) → torch.Tensor[source]¶
-
pytext.models.representations.huggingface_bert_sentence_encoder module¶
-
class
pytext.models.representations.huggingface_bert_sentence_encoder.
HuggingFaceBertSentenceEncoder
(config: pytext.models.representations.huggingface_bert_sentence_encoder.HuggingFaceBertSentenceEncoder.Config, output_encoded_layers: bool, *args, **kwargs)[source]¶ Bases:
pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase
Generate sentence representation using the open source HuggingFace BERT model. This class implements loading the model weights from a pre-trained model file.
pytext.models.representations.huggingface_electra_sentence_encoder module¶
-
class
pytext.models.representations.huggingface_electra_sentence_encoder.
HuggingFaceElectraSentenceEncoder
(config: pytext.models.representations.huggingface_electra_sentence_encoder.HuggingFaceElectraSentenceEncoder.Config, output_encoded_layers: bool, *args, **kwargs)[source]¶ Bases:
pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase
Generate sentence representation using the open source HuggingFace Electra model. This class implements loading the model weights from a pre-trained model file.
pytext.models.representations.jointcnn_rep module¶
-
class
pytext.models.representations.jointcnn_rep.
JointCNNRepresentation
(config: pytext.models.representations.jointcnn_rep.JointCNNRepresentation.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
-
forward
(embedded_tokens: torch.Tensor, *args) → List[torch.Tensor][source]¶
-
Bases:
pytext.models.representations.representation_base.RepresentationBase
pytext.models.representations.ordered_neuron_lstm module¶
-
class
pytext.models.representations.ordered_neuron_lstm.
OrderedNeuronLSTM
(config: pytext.models.representations.ordered_neuron_lstm.OrderedNeuronLSTM.Config, embed_dim: int, padding_value: Optional[float] = 0.0)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
-
forward
(rep: torch.Tensor, seq_lengths: torch.Tensor, states: Optional[Tuple[torch.Tensor, torch.Tensor]] = None) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]¶
-
-
class
pytext.models.representations.ordered_neuron_lstm.
OrderedNeuronLSTMLayer
(embed_dim: int, lstm_dim: int, padding_value: float, dropout: float)[source]¶ Bases:
pytext.models.module.Module
-
forward
(embedded_tokens: torch.Tensor, states: Tuple[torch.Tensor, torch.Tensor], seq_lengths: List[int]) → Tuple[torch.Tensor, Tuple[torch.Tensor, torch.Tensor]][source]¶
-
pytext.models.representations.pair_rep module¶
-
class
pytext.models.representations.pair_rep.
PairRepresentation
(config: pytext.models.representations.pair_rep.PairRepresentation.Config, embed_dim: Tuple[int, ...])[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
Wrapper representation for a pair of inputs.
Takes a tuple of inputs: the left sentence, and the right sentence(s). Returns a representation of the pair of sentences, either as a concatenation of the two sentence embeddings or as a “siamese” representation which also includes their difference and elementwise product (arXiv:1705.02364). If more than two inputs are provided, the extra inputs are assumed to be extra “right” sentences, and the output will be the stacked pair representations of the left sentence together with all right sentences. This is more efficient than separately computing all these pair representations, because the left sentence will not need to be re-embedded multiple times.
-
forward
(embeddings: Tuple[torch.Tensor, ...], *lengths) → torch.Tensor[source]¶ Computes the pair representations.
Parameters: - embeddings – token embeddings of the left sentence, followed by the token embeddings of the right sentence(s).
- lengths – the corresponding sequence lengths.
Returns: A tensor of shape (num_right_inputs, batch_size, rep_size), with the first dimension squeezed if one.
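The two ways of combining a pair can be sketched on already-encoded sentence vectors. This is a hypothetical plain-Python illustration with the batch dimension omitted; it assumes the "difference" is the absolute elementwise difference, as in the cited paper, and the function names are not part of the library.

```python
def concat_pair(left, right):
    """Plain concatenation of the two sentence vectors."""
    return left + right


def siamese_pair(left, right):
    """Concatenation plus absolute difference and elementwise product."""
    diff = [abs(a - b) for a, b in zip(left, right)]
    prod = [a * b for a, b in zip(left, right)]
    return left + right + diff + prod


def pair_representations(left, rights, siamese=True):
    """Pair one left vector with every right vector, reusing `left`."""
    combine = siamese_pair if siamese else concat_pair
    return [combine(left, r) for r in rights]


left = [1.0, 2.0]
rights = [[3.0, 5.0], [0.0, 2.0]]
stacked = pair_representations(left, rights)
concat_only = pair_representations(left, rights, siamese=False)
```

With a sentence vector of size d, concatenation yields 2d features per pair and the siamese form 4d; the outer list plays the role of the num_right_inputs dimension.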
-
pytext.models.representations.pass_through module¶
-
class
pytext.models.representations.pass_through.
PassThroughRepresentation
(config: pytext.config.component.ComponentMeta.__new__.<locals>.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
-
forward
(embedded_tokens: torch.Tensor, *args) → torch.Tensor[source]¶
-
pytext.models.representations.pooling module¶
-
class
pytext.models.representations.pooling.
BoundaryPool
(config: pytext.models.representations.pooling.BoundaryPool.Config, n_input: int)[source]¶ Bases:
pytext.models.module.Module
-
forward
(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]¶
-
-
class
pytext.models.representations.pooling.
LastTimestepPool
(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]¶ Bases:
pytext.models.module.Module
-
forward
(inputs: torch.Tensor, seq_lengths: torch.Tensor) → torch.Tensor[source]¶
-
-
class
pytext.models.representations.pooling.
MaxPool
(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]¶ Bases:
pytext.models.module.Module
-
forward
(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
pytext.models.representations.pooling.
MeanPool
(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]¶ Bases:
pytext.models.module.Module
-
forward
(inputs: torch.Tensor, seq_lengths: torch.Tensor) → torch.Tensor[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
pytext.models.representations.pooling.
NoPool
(config: pytext.config.module_config.ModuleConfig, n_input: int)[source]¶ Bases:
pytext.models.module.Module
-
forward
(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
pytext.models.representations.pooling.
SelfAttention
(config: pytext.models.representations.pooling.SelfAttention.Config, n_input: int)[source]¶ Bases:
pytext.models.module.Module
-
forward
(inputs: torch.Tensor, seq_lengths: torch.Tensor = None) → torch.Tensor[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
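The pooling classes in this module reduce a sequence of per-timestep vectors to a single vector. The sketch below shows the arithmetic that the class names suggest for mean, max and last-timestep pooling, using plain Python lists in place of tensors. The function names are hypothetical, and whether the PyText implementations mask padding in exactly this way is an assumption not confirmed by this page.

```python
from typing import List

Sequence = List[List[float]]  # T x C: one feature vector per timestep


def mean_pool(seq: Sequence, length: int) -> List[float]:
    # Average each feature over the first `length` (non-padded) timesteps.
    return [sum(col) / length for col in zip(*seq[:length])]


def max_pool(seq: Sequence, length: int) -> List[float]:
    # Take the per-feature maximum over the non-padded timesteps.
    return [max(col) for col in zip(*seq[:length])]


def last_timestep_pool(seq: Sequence, length: int) -> List[float]:
    # Return the vector at the last non-padded timestep.
    return list(seq[length - 1])


padded = [[1.0, 4.0], [3.0, 2.0], [0.0, 0.0]]  # final row is padding
length = 2  # plays the role of one entry of seq_lengths

mean_vec = mean_pool(padded, length)
max_vec = max_pool(padded, length)
last_vec = last_timestep_pool(padded, length)
```

In a batched setting, `seq_lengths` would supply one such `length` per example.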
-
pytext.models.representations.pure_doc_attention module¶
-
class
pytext.models.representations.pure_doc_attention.
PureDocAttention
(config: pytext.models.representations.pure_doc_attention.PureDocAttention.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
Pooling (e.g. max pooling or self attention) followed by an optional MLP.
-
forward
(embedded_tokens: torch.Tensor, seq_lengths: torch.Tensor = None, *args) → Any[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
pytext.models.representations.representation_base module¶
-
class
pytext.models.representations.representation_base.
RepresentationBase
(config)[source]¶ Bases:
pytext.models.module.Module
-
forward
(*inputs)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
pytext.models.representations.seq_rep module¶
-
class
pytext.models.representations.seq_rep.
SeqRepresentation
(config: pytext.models.representations.seq_rep.SeqRepresentation.Config, embed_dim: int)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
Representation for a sequence of sentences. Each sentence is embedded with a DocNN model, then the sequence of sentences is embedded with another DocNN/BiLSTM model.
-
forward
(embedded_seqs: torch.Tensor, seq_lengths: torch.Tensor, *args) → torch.Tensor[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
pytext.models.representations.slot_attention module¶
-
class
pytext.models.representations.slot_attention.
SlotAttention
(config: pytext.models.representations.slot_attention.SlotAttention.Config, n_input: int, batch_first: bool = True)[source]¶ Bases:
pytext.models.module.Module
-
forward
(inputs: torch.Tensor) → torch.Tensor[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
pytext.models.representations.sparse_transformer_sentence_encoder module¶
-
class
pytext.models.representations.sparse_transformer_sentence_encoder.
SparseTransformerSentenceEncoder
(config: pytext.models.representations.sparse_transformer_sentence_encoder.SparseTransformerSentenceEncoder.Config, output_encoded_layers: bool, padding_idx: int, vocab_size: int, *args, **kwarg)[source]¶ Bases:
pytext.models.representations.transformer_sentence_encoder.TransformerSentenceEncoder
Implementation of the Transformer Sentence Encoder. This directly makes use of the TransformerSentenceEncoder module in Fairseq.
- A few interesting config options:
- encoder_normalize_before determines whether the layer norm is applied before or after self_attention. This is similar to the original implementation from Google.
- activation_fn can be set to ‘gelu’ instead of the default of ‘relu’.
- project_representation adds a linear projection + tanh to the pooled output in the style of BERT.
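As a rough illustration of the project_representation option, the sketch below applies a linear projection followed by tanh to a pooled vector. The `project` helper, the weights and the input values are hypothetical and chosen only for this example; the actual layer is defined in the PyText source.

```python
import math
from typing import List


def project(pooled: List[float], weight: List[List[float]], bias: List[float]) -> List[float]:
    # Linear projection (weight @ pooled + bias) followed by elementwise tanh.
    return [
        math.tanh(sum(w * x for w, x in zip(row, pooled)) + b)
        for row, b in zip(weight, bias)
    ]


pooled = [0.5, -1.0]               # hypothetical pooled sentence vector
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability
bias = [0.0, 0.0]

projected = project(pooled, weight, bias)
```

Because of the tanh, every component of the projected vector lies strictly between -1 and 1.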
pytext.models.representations.stacked_bidirectional_rnn module¶
-
class
pytext.models.representations.stacked_bidirectional_rnn.
RnnType
[source]¶ Bases:
enum.Enum
An enumeration.
-
GRU
= 'gru'¶
-
LSTM
= 'lstm'¶
-
RNN
= 'rnn'¶
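The members above map to lowercase string values, so a type can be looked up from a string. The sketch below redefines an enum with the same members locally so that it runs without PyText installed; it mirrors the listing above and is not the imported class itself.

```python
from enum import Enum


class RnnType(Enum):
    # Local mirror of the members documented above.
    RNN = "rnn"
    LSTM = "lstm"
    GRU = "gru"


# Look up a member by its string value, e.g. a value read from a config file.
rnn_type = RnnType("lstm")
```

Passing a string that is not one of the three values raises `ValueError`.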
-
-
class
pytext.models.representations.stacked_bidirectional_rnn.
StackedBidirectionalRNN
(config: pytext.models.representations.stacked_bidirectional_rnn.StackedBidirectionalRNN.Config, input_size: int, padding_value: float = 0.0)[source]¶ Bases:
pytext.models.module.Module
StackedBidirectionalRNN implements a multi-layer bidirectional RNN with an option to return outputs from all the layers of RNN.
Parameters: - config (Config) – Configuration object of type StackedBidirectionalRNN.Config.
- input_size (int) – The number of expected features in the input.
- padding_value (float) – Value for the padded elements. Defaults to 0.0.
-
padding_value
¶ Value for the padded elements.
Type: float
-
dropout
¶ Dropout layer preceding the LSTM.
Type: nn.Dropout
-
lstm
¶ LSTM layer that operates on the inputs.
Type: nn.LSTM
-
representation_dim
¶ The calculated dimension of the output features of BiLSTM.
Type: int
pytext.models.representations.traced_transformer_encoder module¶
-
class
pytext.models.representations.traced_transformer_encoder.
TraceableTransformerWrapper
(eager_encoder: fairseq.modules.transformer_sentence_encoder.TransformerSentenceEncoder)[source]¶ Bases:
torch.nn.modules.module.Module
-
forward
(tokens: torch.Tensor, segment_labels: torch.Tensor = None, positions: torch.Tensor = None, token_embeddings: torch.Tensor = None, attn_mask: torch.Tensor = None) → Tuple[torch.Tensor, torch.Tensor][source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
-
class
pytext.models.representations.traced_transformer_encoder.
TracedTransformerEncoder
(eager_encoder: fairseq.modules.transformer_sentence_encoder.TransformerSentenceEncoder, tokens: torch.Tensor, segment_labels: torch.Tensor = None, positions: torch.Tensor = None, token_embeddings: torch.Tensor = None, attn_mask: torch.Tensor = None)[source]¶ Bases:
torch.nn.modules.module.Module
-
forward
(tokens: torch.Tensor, segment_labels: torch.Tensor = None, positions: torch.Tensor = None, token_embeddings: torch.Tensor = None, attn_mask: torch.Tensor = None)[source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
pytext.models.representations.transformer_sentence_encoder module¶
-
class
pytext.models.representations.transformer_sentence_encoder.
TransformerSentenceEncoder
(config: pytext.models.representations.transformer_sentence_encoder.TransformerSentenceEncoder.Config, output_encoded_layers: bool, padding_idx: int, vocab_size: int, *args, **kwarg)[source]¶ Bases:
pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase
Implementation of the Transformer Sentence Encoder. This directly makes use of the TransformerSentenceEncoder module in Fairseq.
- A few interesting config options:
- encoder_normalize_before determines whether the layer norm is applied before or after self_attention. This is similar to the original implementation from Google.
- activation_fn can be set to ‘gelu’ instead of the default of ‘relu’.
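The two options above can be sketched on a single vector. In this hedged example, `residual_block`, `layer_norm`, `relu` and `gelu` are hypothetical helpers written for illustration, and an elementwise activation stands in for the self-attention sublayer; it shows only the ordering of the layer norm relative to the sublayer, not the Fairseq implementation.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    # Normalize to zero mean and (approximately) unit variance, without learned scale or shift.
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def relu(v: float) -> float:
    return max(0.0, v)


def gelu(v: float) -> float:
    # Exact GELU: v * Phi(v), with Phi the standard normal CDF.
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))


def residual_block(x: Vector, sublayer: Callable[[Vector], Vector], normalize_before: bool) -> Vector:
    if normalize_before:
        # Norm first: x + sublayer(LayerNorm(x))
        return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]
    # Norm last: LayerNorm(x + sublayer(x))
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])


x = [1.0, 2.0, 3.0]
sublayer = lambda h: [gelu(v) for v in h]  # stand-in for self-attention

pre_norm = residual_block(x, sublayer, normalize_before=True)
post_norm = residual_block(x, sublayer, normalize_before=False)
```

With the norm applied last, the block output is itself normalized; with the norm applied first, the residual path carries the unnormalized input through.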
-
load_state_dict
(state_dict)[source]¶ Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.
Parameters: - state_dict (dict) – a dict containing parameters and persistent buffers.
- strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True
Returns: - missing_keys is a list of str containing the missing keys
- unexpected_keys is a list of str containing the unexpected keys
Return type: NamedTuple with missing_keys and unexpected_keys fields
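The key matching described above can be sketched in plain Python. `KeyReport`, `compare_keys`, `check_strict` and the parameter names are hypothetical and exist only for this example; they show how missing and unexpected keys relate to the two sets of keys, not how PyTorch implements the check.

```python
from typing import Dict, List, NamedTuple


class KeyReport(NamedTuple):
    missing_keys: List[str]
    unexpected_keys: List[str]


def compare_keys(module_keys: List[str], state_dict: Dict[str, object]) -> KeyReport:
    # Missing: expected by the module but absent from state_dict.
    missing = [k for k in module_keys if k not in state_dict]
    # Unexpected: present in state_dict but unknown to the module.
    unexpected = [k for k in state_dict if k not in module_keys]
    return KeyReport(missing, unexpected)


def check_strict(report: KeyReport, strict: bool = True) -> None:
    # In strict mode any mismatch is an error; otherwise it is only reported.
    if strict and (report.missing_keys or report.unexpected_keys):
        raise RuntimeError(f"key mismatch: {report}")


module_keys = ["encoder.weight", "encoder.bias", "pooler.weight"]  # hypothetical names
state_dict = {"encoder.weight": 1, "encoder.bias": 2, "old_head.weight": 3}

report = compare_keys(module_keys, state_dict)
check_strict(report, strict=False)  # non-strict: mismatches are tolerated
```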
pytext.models.representations.transformer_sentence_encoder_base module¶
-
class
pytext.models.representations.transformer_sentence_encoder_base.
PoolingMethod
[source]¶ Bases:
enum.Enum
Pooling methods are chosen from the “Feature-based Approaches” section in https://arxiv.org/pdf/1810.04805.pdf
-
AVG_CONCAT_LAST_4_LAYERS
= 'avg_concat_last_4_layers'¶
-
AVG_LAST_LAYER
= 'avg_last_layer'¶
-
AVG_SECOND_TO_LAST_LAYER
= 'avg_second_to_last_layer'¶
-
AVG_SUM_LAST_4_LAYERS
= 'avg_sum_last_4_layers'¶
-
CLS_TOKEN
= 'cls_token'¶
-
NO_POOL
= 'no_pool'¶
-
-
class
pytext.models.representations.transformer_sentence_encoder_base.
TransformerSentenceEncoderBase
(config: pytext.models.representations.transformer_sentence_encoder_base.TransformerSentenceEncoderBase.Config, output_encoded_layers=False, *args, **kwargs)[source]¶ Bases:
pytext.models.representations.representation_base.RepresentationBase
Base class for all bidirectional Transformer-based sentence encoders. All children of this class should implement an _encoder function which takes as input tokens, [optional] segment labels and a pad mask, and outputs both the sentence representation (output of _pool_encoded_layers) and the output states of all the intermediate Transformer layers as a list of tensors.
Input tuple consists of the following elements:
1) tokens: torch tensor of size B x T which contains token ids
2) pad_mask: torch tensor of size B x T generated with the condition tokens != self.vocab.get_pad_index()
3) segment_labels: torch tensor of size B x T which contains the segment id of each token
Output tuple consists of the following elements:
1) encoded_layers: List of torch tensors where each tensor has shape B x T x C and there are num_transformer_layers + 1 of these. Each tensor represents the output of the intermediate transformer layers, with the 0th element being the input to the first transformer layer (token + segment + position embedding).
2) [Optional] pooled_output: Output of the pooling operation associated with config.pooling_method applied to the encoded_layers. Size B x C (or B x 4C if pooling = AVG_CONCAT_LAST_4_LAYERS)
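The shapes described above can be checked with a small sketch for a single sentence (B = 1), using nested lists in place of tensors. The pad index, token ids, layer values and helper names are hypothetical, and the pooling arithmetic (masked average over tokens, concatenation of the last four layers) is an assumption based on the method names rather than a copy of _pool_encoded_layers.

```python
from typing import List

Layer = List[List[float]]  # T x C for one sentence

PAD_INDEX = 0               # hypothetical pad index
tokens = [11, 12, PAD_INDEX]
pad_mask = [int(t != PAD_INDEX) for t in tokens]  # 1 for real tokens, 0 for padding

# Five "layers" (embedding output + 4 transformer layers), each T=3 by C=2.
encoded_layers: List[Layer] = [
    [[float(l), float(l + t)] for t in range(3)] for l in range(5)
]


def avg_tokens(layer: Layer, mask: List[int]) -> List[float]:
    # Average each channel over the non-padded tokens only.
    n = sum(mask)
    return [sum(v for v, m in zip(col, mask) if m) / n for col in zip(*layer)]


def cls_token(layer: Layer) -> List[float]:
    # Representation of the first token.
    return list(layer[0])


def avg_concat_last_4(layers: List[Layer], mask: List[int]) -> List[float]:
    # Concatenate the token averages of the last four layers: size 4C.
    out: List[float] = []
    for layer in layers[-4:]:
        out.extend(avg_tokens(layer, mask))
    return out


avg_last = avg_tokens(encoded_layers[-1], pad_mask)   # size C
cls = cls_token(encoded_layers[-1])                   # size C
concat = avg_concat_last_4(encoded_layers, pad_mask)  # size 4C
```

Here C = 2, so the concatenated output has 8 components, matching the B x 4C size noted for AVG_CONCAT_LAST_4_LAYERS.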
-
forward
(input_tuple: Tuple[torch.Tensor, ...], *args) → Tuple[torch.Tensor, ...][source]¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-