pytext.metric_reporters package

Submodules

pytext.metric_reporters.calibration_metric_reporter module

class pytext.metric_reporters.calibration_metric_reporter.CalibrationMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], pad_index: int = -1)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
aggregate_scores(batch_scores: torch.Tensor)[source]
aggregate_targets(batch_targets: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config: pytext.config.pytext_config.PyTextConfig, pad_index: int = -1)[source]
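A minimal construction sketch, not taken from the library's docs (the ConsoleChannel choice is illustrative, and the meaning of pad_index is assumed from its name):

    from pytext.metric_reporters.calibration_metric_reporter import (
        CalibrationMetricReporter,
    )
    from pytext.metric_reporters.channel import ConsoleChannel

    # Report calibration metrics to the console; pad_index is assumed to mark
    # padding label positions to skip (-1 mirrors the default above).
    calibration_reporter = CalibrationMetricReporter(
        channels=[ConsoleChannel()],
        pad_index=-1,
    )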

pytext.metric_reporters.channel module

class pytext.metric_reporters.channel.Channel(stages: Tuple[pytext.common.constants.Stage, ...] = (<Stage.TRAIN: 'Training'>, <Stage.EVAL: 'Evaluation'>, <Stage.TEST: 'Test'>, <Stage.OTHERS: 'Others'>))[source]

Bases: object

Channel defines how to format and report the result of a PyText job to an output stream.

stages

The stages in which the report will be triggered. The default is all stages, which include train, eval, and test.

close()[source]
export(model, input_to_model=None, **kwargs)[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
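As an illustration of this interface, a minimal hypothetical Channel that only prints the loss during training and evaluation (a sketch, assuming nothing beyond the constructor and report() signature shown above):

    from pytext.common.constants import Stage
    from pytext.metric_reporters.channel import Channel

    class LossOnlyChannel(Channel):
        # Hypothetical channel: print only the average loss per epoch.
        def __init__(self):
            super().__init__(stages=(Stage.TRAIN, Stage.EVAL))

        def report(self, stage, epoch, metrics, model_select_metric, loss,
                   preds, targets, scores, context, *args):
            print(f"[{stage}] epoch={epoch} loss={loss:.4f}")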
class pytext.metric_reporters.channel.ConsoleChannel(stages: Tuple[pytext.common.constants.Stage, ...] = (<Stage.TRAIN: 'Training'>, <Stage.EVAL: 'Evaluation'>, <Stage.TEST: 'Test'>, <Stage.OTHERS: 'Others'>))[source]

Bases: pytext.metric_reporters.channel.Channel

Simple Channel that prints results to console.

report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
class pytext.metric_reporters.channel.FileChannel(stages, file_path)[source]

Bases: pytext.metric_reporters.channel.Channel

Simple Channel that writes results to a TSV file.

gen_content(metrics, loss, preds, targets, scores, context)[source]
get_title(context_keys=())[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
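A usage sketch (the output path is hypothetical; subclasses override get_title()/gen_content() to control the TSV columns):

    from pytext.common.constants import Stage
    from pytext.metric_reporters.channel import FileChannel

    # Write results to a TSV file at test time only.
    test_channel = FileChannel((Stage.TEST,), "/tmp/test_results.tsv")
    # Typically attached to a metric reporter, e.g.
    # reporter.add_channel(test_channel)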
class pytext.metric_reporters.channel.TensorBoardChannel(summary_writer=None, metric_name='accuracy')[source]

Bases: pytext.metric_reporters.channel.Channel

TensorBoardChannel defines how to format and report the result of a PyText job to TensorBoard.

summary_writer

An instance of the TensorBoard SummaryWriter class, or an object that implements the same interface. https://pytorch.org/docs/stable/tensorboard.html

metric_name

The name of the default metric to display on the TensorBoard dashboard; defaults to “accuracy”

train_step

The training step count
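A construction sketch (the log directory is hypothetical; SummaryWriter comes from PyTorch's torch.utils.tensorboard):

    from torch.utils.tensorboard import SummaryWriter
    from pytext.metric_reporters.channel import TensorBoardChannel

    # metric_name selects the headline metric shown on the dashboard.
    writer = SummaryWriter(log_dir="/tmp/pytext_runs/exp1")
    tb_channel = TensorBoardChannel(summary_writer=writer, metric_name="accuracy")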

add_scalars(prefix, metrics, epoch)[source]

Recursively flattens the metrics object and adds each field name and value as a scalar for the corresponding epoch using the summary writer.

Parameters:
  • prefix (str) – The tag prefix for the metric. Each field name in the metrics object will be prepended with the prefix.
  • metrics (Any) – The metrics object.
add_texts(tag, metrics)[source]

Recursively flattens the metrics object and adds each field name and value as a text using the summary writer. For example, if tag = “test”, and metrics = { accuracy: 0.7, scores: { precision: 0.8, recall: 0.6 } }, then under “tag=test” we will display “accuracy=0.7”, and under “tag=test/scores” we will display “precision=0.8” and “recall=0.6” in TensorBoard.

Parameters:
  • tag (str) – The tag name for the metric. If a field needs to be flattened further, it will be prepended as a prefix to the field name.
  • metrics (Any) – The metrics object/dict.
close()[source]

Closes the summary writer.

export(model, input_to_model=None, **kwargs)[source]

Draws the neural network representation graph in TensorBoard.

Parameters:
  • model (Any) – the model object.
  • input_to_model (Any) – the input to the model (required for PyTorch models, since the execution graph is defined by running the model).
log_loss(prefix, loss, epoch)[source]
log_vector(key, val, epoch)[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, meta, model, optimizer, log_gradient, gradients, *args)[source]

Defines how to format and report data to TensorBoard using the summary writer. In the current implementation, during the train/eval phase we recursively report each metric field as scalars, and during the test phase we report the final metrics to be displayed as texts.

Also visualizes the internal model states (weights, biases) as histograms in TensorBoard.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
  • meta (Dict[str, Any]) – global metadata, such as target names
  • model (nn.Module) – the PyTorch neural network model

pytext.metric_reporters.classification_metric_reporter module

class pytext.metric_reporters.classification_metric_reporter.ClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
classmethod from_config_and_label_names(config, label_names: List[str])[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric[source]

Bases: enum.Enum

An enumeration.

ACCURACY = 'accuracy'
LABEL_AVG_PRECISION = 'label_avg_precision'
LABEL_F1 = 'label_f1'
LABEL_ROC_AUC = 'label_roc_auc'
MACRO_F1 = 'macro_f1'
MCC = 'mcc'
NEGATIVE_LOSS = 'negative_loss'
ROC_AUC = 'roc_auc'
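A hypothetical construction sketch combining the reporter with one of the enum values above (the label set and channel choice are illustrative):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.classification_metric_reporter import (
        ClassificationMetricReporter,
        ComparableClassificationMetric,
    )

    # Select the best model by macro-F1 instead of the default accuracy.
    reporter = ClassificationMetricReporter(
        label_names=["negative", "positive"],
        channels=[ConsoleChannel()],
        model_select_metric=ComparableClassificationMetric.MACRO_F1,
    )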
class pytext.metric_reporters.classification_metric_reporter.MultiLabelClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.classification_metric_reporter.ClassificationMetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

pytext.metric_reporters.compositional_metric_reporter module

class pytext.metric_reporters.compositional_metric_reporter.CompositionalMetricReporter(actions_vocab, channels: List[pytext.metric_reporters.channel.Channel], text_column_name: str = 'tokenized_text', tokenizer: pytext.data.tokenizers.tokenizer.Tokenizer = None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

create_frame_prediction_pairs()[source]
classmethod from_config(config, metadata: pytext.data.data_handler.CommonMetadata = None, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer] = None)[source]
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

static node_to_metrics_node(node: Union[pytext.data.data_structures.annotation.Intent, pytext.data.data_structures.annotation.Slot], start: int = 0) → pytext.metrics.intent_slot_metrics.Node[source]

The input start is the absolute start position in the utterance.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

static tree_from_tokens_and_indx_actions(token_str_list: List[str], actions_vocab: List[str], actions_indices: List[int], validate_tree: bool = True)[source]
static tree_to_metric_node(tree: pytext.data.data_structures.annotation.Tree) → pytext.metrics.intent_slot_metrics.Node[source]

Creates a Node from a tree, assuming the utterance is the tokens joined by whitespace. The function does not necessarily reproduce the original utterance, since extra whitespace can be introduced.

pytext.metric_reporters.compositional_utils module

pytext.metric_reporters.compositional_utils.extract_beam_subtrees(beam: List[List[str]]) → List[List[str]][source]
pytext.metric_reporters.compositional_utils.extract_subtree(beam: List[str]) → Optional[List[str]][source]
pytext.metric_reporters.compositional_utils.filter_invalid_beams(beam: List[List[str]]) → List[List[str]][source]
pytext.metric_reporters.compositional_utils.is_valid_tree(beam: List[str]) → bool[source]

pytext.metric_reporters.dense_retrieval_metric_reporter module

class pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricNames[source]

Bases: enum.Enum

An enumeration.

ACCURACY = 'accuracy'
AVG_RANK = 'avg_rank'
MEAN_RECIPROCAL_RANK = 'mean_reciprocal_rank'
NEGATIVE_LOSS = 'negative_loss'
class pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], text_column_names: List[str], model_select_metric: pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricNames, task_batch_size: int, num_negative_ctxs: int = 0)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(preds, context)[source]
batch_context(raw_batch, batch) → Dict[str, Any][source]
calculate_metric() → pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metrics: pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

pytext.metric_reporters.disjoint_multitask_metric_reporter module

class pytext.metric_reporters.disjoint_multitask_metric_reporter.DisjointMultitaskMetricReporter(reporters: Dict[str, pytext.metric_reporters.metric_reporter.MetricReporter], loss_weights: Dict[str, float], target_task_name: Optional[str], use_subtask_select_metric: bool)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
add_channel(channel)[source]
batch_context(raw_batch, batch)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = False
report_metric(model, stage, epoch, reset=True, print_to_channels=True, optimizer=None)[source]

Calculate metrics and the average loss, and report all statistics to the channels.

Parameters:
  • model (nn.Module) – the PyTorch neural network model.
  • stage (Stage) – training, evaluation or test
  • epoch (int) – current epoch
  • reset (bool) – if all data should be reset after report, default is True
  • print_to_channels (bool) – if report data to channels, default is True
report_realtime_metric(stage)[source]
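A hypothetical wiring sketch (the task names, loss weights, and the use of PureLossMetricReporter as stand-in sub-reporters are all illustrative):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.disjoint_multitask_metric_reporter import (
        DisjointMultitaskMetricReporter,
    )
    from pytext.metric_reporters.metric_reporter import PureLossMetricReporter

    # Stand-in sub-task reporters; real tasks would use task-specific
    # reporters such as ClassificationMetricReporter.
    intent_reporter = PureLossMetricReporter([ConsoleChannel()])
    lm_reporter = PureLossMetricReporter([ConsoleChannel()])

    multitask_reporter = DisjointMultitaskMetricReporter(
        reporters={"intent": intent_reporter, "lm": lm_reporter},
        loss_weights={"intent": 1.0, "lm": 0.5},
        target_task_name="intent",  # model selection follows this task
        use_subtask_select_metric=False,
    )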

pytext.metric_reporters.intent_slot_detection_metric_reporter module

class pytext.metric_reporters.intent_slot_detection_metric_reporter.IntentSlotMetricReporter(doc_label_names: List[str], word_label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel], slot_column_name: str = 'slots', text_column_name: str = 'text', token_tensorizer_name: str = 'tokens')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers: Optional[Dict[KT, VT]] = None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

get_raw_slot_str(raw_data_row)[source]
predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

pytext.metric_reporters.intent_slot_detection_metric_reporter.create_frame(text, intent_label, slot_names_str, byte_len)[source]
pytext.metric_reporters.intent_slot_detection_metric_reporter.frame_to_str(frame: pytext.metrics.intent_slot_metrics.Node)[source]

pytext.metric_reporters.language_model_metric_reporter module

class pytext.metric_reporters.language_model_metric_reporter.LanguageModelChannel(stages, file_path)[source]

Bases: pytext.metric_reporters.channel.FileChannel

gen_content(metrics, loss, preds, targets, scores, contexts)[source]
get_title(context_keys=())[source]
class pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter(channels, metadata, tensorizers, aggregate_metrics, perplexity_type, pep_format, log_gradient=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

LABELS_COLUMN = 'labels'
RAW_TEXT_COLUMN = 'text'
TOKENS_COLUMN = 'tokens'
UTTERANCE_COLUMN = 'utterance'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_context(context)[source]
aggregate_scores(scores)[source]
batch_context(raw_batch, batch)[source]
calculate_loss() → float[source]

Calculate the average loss over all aggregated batches.

calculate_metric() → pytext.metrics.language_model_metrics.LanguageModelMetric[source]

Calculate metrics; each subclass should implement this method.

compute_scores(logits, targets)[source]
classmethod from_config(config: pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter.Config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
get_model_select_metric(metrics) → float[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = True
class pytext.metric_reporters.language_model_metric_reporter.MaskedLMMetricReporter(channels, metadata, tensorizers, aggregate_metrics, perplexity_type, pep_format, log_gradient=False)[source]

Bases: pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
calculate_loss() → float[source]

Calculate the average loss over all aggregated batches.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
report_realtime_metric(stage)[source]
pytext.metric_reporters.language_model_metric_reporter.get_perplexity_func(perplexity_type)[source]

pytext.metric_reporters.mask_compositional module

pytext.metric_reporters.metric_reporter module

class pytext.metric_reporters.metric_reporter.MetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.config.component.Component

MetricReporter is responsible for three things:

  1. Aggregate output from the trainer, which includes model inputs, predictions, targets, scores, and loss.
  2. Calculate metrics using the aggregated output, and define how the metric is used to find the best model.
  3. Optionally report the metrics and aggregated output to various channels.
lower_is_better

Whether a lower metric value indicates better performance; set to True for metrics like perplexity and False for metrics like accuracy. Default is False.

Type: bool
channels

A list of Channel objects that receive the metrics and the aggregated trainer output, then format and report them in any customized way.

Type: List[Channel]

MetricReporter is tightly coupled with metric aggregation and computation, which makes it hard for subclasses to reuse the parent's functionality and attributes. The next step is to decouple metric aggregation and computation from metric reporting.
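As a sketch of the intended subclassing pattern (hypothetical reporter; it assumes the per-example lists such as self.all_preds and self.all_targets that the base class accumulates via add_batch_stats()):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.metric_reporter import MetricReporter

    class ExactMatchMetricReporter(MetricReporter):
        # Hypothetical reporter: exact-match accuracy over aggregated outputs.
        lower_is_better = False  # higher accuracy means a better model

        @classmethod
        def from_config(cls, config, *args, **kwargs):
            return cls(channels=[ConsoleChannel()])

        def calculate_metric(self):
            # self.all_preds / self.all_targets are assumed to hold the
            # per-example predictions and targets aggregated so far.
            correct = sum(p == t for p, t in zip(self.all_preds, self.all_targets))
            return correct / max(len(self.all_targets), 1)

        def get_model_select_metric(self, metrics):
            return metrics  # calculate_metric() already returns a single float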

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
add_channel(channel)[source]
add_gradients(model)[source]
classmethod aggregate_data(all_data, new_batch)[source]

Aggregate a batch of data; basically just convert tensors to lists of native Python data.

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

compare_metric(new_metric, old_metric)[source]

Check if the new metric indicates better model performance.

Returns: bool, True if the model with new_metric performs better
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_gradients()[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

log_gradient = False
lower_is_better = False
predictions_to_report()[source]

Generate human readable predictions

report_metric(model, stage, epoch, reset=True, print_to_channels=True, optimizer=None)[source]

Calculate metrics and the average loss, and report all statistics to the channels.

Parameters:
  • model (nn.Module) – the PyTorch neural network model.
  • stage (Stage) – training, evaluation or test
  • epoch (int) – current epoch
  • reset (bool) – if all data should be reset after report, default is True
  • print_to_channels (bool) – if report data to channels, default is True
report_realtime_metric(stage)[source]
targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.metric_reporter.PureLossMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, **kwargs)[source]
lower_is_better = True

pytext.metric_reporters.pairwise_ranking_metric_reporter module

class pytext.metric_reporters.pairwise_ranking_metric_reporter.PairwiseRankingMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

pytext.metric_reporters.regression_metric_reporter module

class pytext.metric_reporters.regression_metric_reporter.RegressionMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers=None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = False

pytext.metric_reporters.seq2seq_compositional module

class pytext.metric_reporters.seq2seq_compositional.CompositionalSeq2SeqFileChannel(stages, file_path, tensorizers, accept_flat_intents_slots)[source]

Bases: pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqFileChannel

gen_content(metrics, loss, preds, targets, scores, context)[source]
get_title(context_keys=())[source]
validated_annotation(predicted_output_sequence)[source]
class pytext.metric_reporters.seq2seq_compositional.Seq2SeqCompositionalMetricReporter(channels, log_gradient, tensorizers, accept_flat_intents_slots)[source]

Bases: pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqMetricReporter

aggregate_preds(new_batch, context=None)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

create_frame_prediction_pairs()[source]
classmethod from_config(config: pytext.metric_reporters.seq2seq_compositional.Seq2SeqCompositionalMetricReporter.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]
get_annotation_from_string(stringified_tree_str: str) → pytext.data.data_structures.annotation.Annotation[source]
stringify_annotation_tree(tree_tokens, tree_vocab)[source]

pytext.metric_reporters.seq2seq_metric_reporter module

class pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqFileChannel(stages, file_path, tensorizers)[source]

Bases: pytext.metric_reporters.channel.FileChannel

gen_content(metrics, loss, preds, targets, scores, context)[source]
get_title(context_keys=())[source]
class pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqMetricReporter(channels, log_gradient, tensorizers)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_preds(new_batch, context=None)[source]
aggregate_src_tokens(new_batch)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config: pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqMetricReporter.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = True

pytext.metric_reporters.seq2seq_utils module

pytext.metric_reporters.seq2seq_utils.stringify(token_indices, vocab)[source]

pytext.metric_reporters.squad_metric_reporter module

class pytext.metric_reporters.squad_metric_reporter.SquadFileChannel(stages, file_path)[source]

Bases: pytext.metric_reporters.channel.FileChannel

gen_content(metrics, loss, preds, targets, scores, contexts, *args)[source]
get_title(context_keys=())[source]
class pytext.metric_reporters.squad_metric_reporter.SquadMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], n_best_size: int, max_answer_length: int, ignore_impossible: bool, has_answer_labels: List[str], tensorizer=None, false_label='False')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

ANSWERS_COLUMN = 'answers'
DOC_COLUMN = 'doc'
QUES_COLUMN = 'question'
ROW_INDEX = 'id'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **contexts)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_preds(new_batch, context=None)[source]
aggregate_scores(new_batch)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metric: pytext.metrics.squad_metrics.SquadMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

pytext.metric_reporters.word_tagging_metric_reporter module

class pytext.metric_reporters.word_tagging_metric_reporter.MultiLabelSequenceTaggingMetricReporter(label_names, pad_idx, channels, label_vocabs=None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
aggregate_tuple_data(all_data, new_batch)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.word_tagging_metric_reporter.NERMetricReporter(label_names: List[str], pad_idx: int, channels: List[pytext.metric_reporters.channel.Channel], use_bio_labels: bool = True)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric() → pytext.metrics.PRF1Metrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.
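A hypothetical construction sketch (the BIO label vocabulary and pad index are illustrative; pad_idx is assumed to point at the padding label in label_names):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.word_tagging_metric_reporter import NERMetricReporter

    ner_reporter = NERMetricReporter(
        label_names=["<pad>", "O", "B-PER", "I-PER", "B-LOC", "I-LOC"],
        pad_idx=0,
        channels=[ConsoleChannel()],
        use_bio_labels=True,
    )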

class pytext.metric_reporters.word_tagging_metric_reporter.SequenceTaggingMetricReporter(label_names, pad_idx, channels)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.word_tagging_metric_reporter.Span(label, start, end)[source]

Bases: tuple

end

Alias for field number 2

label

Alias for field number 0

start

Alias for field number 1

class pytext.metric_reporters.word_tagging_metric_reporter.WordTaggingMetricReporter(label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel])[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

process_pred(pred: List[int]) → List[str][source]

pred is a list of token label indices

pytext.metric_reporters.word_tagging_metric_reporter.convert_bio_to_spans(bio_sequence: List[str]) → List[pytext.metric_reporters.word_tagging_metric_reporter.Span][source]

Process the output and convert to spans for evaluation.
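A small usage sketch (the tag sequence is illustrative; each returned Span is a (label, start, end) tuple as documented above):

    from pytext.metric_reporters.word_tagging_metric_reporter import (
        convert_bio_to_spans,
    )

    # Four tokens tagged in BIO format.
    tags = ["B-city", "I-city", "O", "B-date"]
    for span in convert_bio_to_spans(tags):
        print(span.label, span.start, span.end)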

pytext.metric_reporters.word_tagging_metric_reporter.get_slots(word_names)[source]

Module contents

class pytext.metric_reporters.Channel(stages: Tuple[pytext.common.constants.Stage, ...] = (<Stage.TRAIN: 'Training'>, <Stage.EVAL: 'Evaluation'>, <Stage.TEST: 'Test'>, <Stage.OTHERS: 'Others'>))[source]

Bases: object

Channel defines how to format and report the result of a PyText job to an output stream.

stages

The stages in which the report will be triggered. The default is all stages, which include train, eval, and test.

close()[source]
export(model, input_to_model=None, **kwargs)[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
class pytext.metric_reporters.MetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.config.component.Component

MetricReporter is responsible for three things:

  1. Aggregate output from the trainer, which includes model inputs, predictions, targets, scores, and loss.
  2. Calculate metrics using the aggregated output, and define how the metric is used to find the best model.
  3. Optionally report the metrics and aggregated output to various channels.
lower_is_better

Whether a lower metric value indicates better performance; set to True for metrics like perplexity and False for metrics like accuracy. Default is False.

Type: bool
channels

A list of Channel objects that receive the metrics and the aggregated trainer output, then format and report them in any customized way.

Type: List[Channel]

MetricReporter is tightly coupled with metric aggregation and computation, which makes it hard for subclasses to reuse the parent's functionality and attributes. The next step is to decouple metric aggregation and computation from metric reporting.

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
add_channel(channel)[source]
add_gradients(model)[source]
classmethod aggregate_data(all_data, new_batch)[source]

Aggregate a batch of data; basically just convert tensors to lists of native Python data.

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

compare_metric(new_metric, old_metric)[source]

Check if the new metric indicates better model performance.

Returns: bool, True if the model with new_metric performs better
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_gradients()[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

log_gradient = False
lower_is_better = False
predictions_to_report()[source]

Generate human readable predictions

report_metric(model, stage, epoch, reset=True, print_to_channels=True, optimizer=None)[source]

Calculate metrics and the average loss, and report all statistics to the channels.

Parameters:
  • model (nn.Module) – the PyTorch neural network model.
  • stage (Stage) – training, evaluation or test
  • epoch (int) – current epoch
  • reset (bool) – if all data should be reset after report, default is True
  • print_to_channels (bool) – if report data to channels, default is True
report_realtime_metric(stage)[source]
targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.CalibrationMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], pad_index: int = -1)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
aggregate_scores(batch_scores: torch.Tensor)[source]
aggregate_targets(batch_targets: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config: pytext.config.pytext_config.PyTextConfig, pad_index: int = -1)[source]
class pytext.metric_reporters.ClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
classmethod from_config_and_label_names(config, label_names: List[str])[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.MultiLabelClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.classification_metric_reporter.ClassificationMetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.MultiLabelSequenceTaggingMetricReporter(label_names, pad_idx, channels, label_vocabs=None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
aggregate_tuple_data(all_data, new_batch)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.RegressionMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers=None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = False
class pytext.metric_reporters.IntentSlotMetricReporter(doc_label_names: List[str], word_label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel], slot_column_name: str = 'slots', text_column_name: str = 'text', token_tensorizer_name: str = 'tokens')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers: Optional[Dict[KT, VT]] = None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

get_raw_slot_str(raw_data_row)[source]
predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.LanguageModelMetricReporter(channels, metadata, tensorizers, aggregate_metrics, perplexity_type, pep_format, log_gradient=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

LABELS_COLUMN = 'labels'
RAW_TEXT_COLUMN = 'text'
TOKENS_COLUMN = 'tokens'
UTTERANCE_COLUMN = 'utterance'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_context(context)[source]
aggregate_scores(scores)[source]
batch_context(raw_batch, batch)[source]
calculate_loss() → float[source]

Calculate the average loss over all aggregated batches.

calculate_metric() → pytext.metrics.language_model_metrics.LanguageModelMetric[source]

Calculate metrics; each subclass should implement this method.

compute_scores(logits, targets)[source]
classmethod from_config(config: pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter.Config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
get_model_select_metric(metrics) → float[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = True
class pytext.metric_reporters.SquadMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], n_best_size: int, max_answer_length: int, ignore_impossible: bool, has_answer_labels: List[str], tensorizer=None, false_label='False')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

ANSWERS_COLUMN = 'answers'
DOC_COLUMN = 'doc'
QUES_COLUMN = 'question'
ROW_INDEX = 'id'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **contexts)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_preds(new_batch, context=None)[source]
aggregate_scores(new_batch)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metric: pytext.metrics.squad_metrics.SquadMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.WordTaggingMetricReporter(label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel])[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

process_pred(pred: List[int]) → List[str][source]

pred is a list of token label indices

class pytext.metric_reporters.CompositionalMetricReporter(actions_vocab, channels: List[pytext.metric_reporters.channel.Channel], text_column_name: str = 'tokenized_text', tokenizer: pytext.data.tokenizers.tokenizer.Tokenizer = None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

create_frame_prediction_pairs()[source]
classmethod from_config(config, metadata: pytext.data.data_handler.CommonMetadata = None, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer] = None)[source]
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

static node_to_metrics_node(node: Union[pytext.data.data_structures.annotation.Intent, pytext.data.data_structures.annotation.Slot], start: int = 0) → pytext.metrics.intent_slot_metrics.Node[source]

The input start is the absolute start position in the utterance.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

static tree_from_tokens_and_indx_actions(token_str_list: List[str], actions_vocab: List[str], actions_indices: List[int], validate_tree: bool = True)[source]
static tree_to_metric_node(tree: pytext.data.data_structures.annotation.Tree) → pytext.metrics.intent_slot_metrics.Node[source]

Creates a Node from a tree, assuming the utterance is the tokens joined by whitespace. The function does not necessarily reproduce the original utterance, since extra whitespace can be introduced.

class pytext.metric_reporters.PairwiseRankingMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.SequenceTaggingMetricReporter(label_names, pad_idx, channels)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.PureLossMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, **kwargs)[source]
lower_is_better = True
class pytext.metric_reporters.NERMetricReporter(label_names: List[str], pad_idx: int, channels: List[pytext.metric_reporters.channel.Channel], use_bio_labels: bool = True)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric() → pytext.metrics.PRF1Metrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.DenseRetrievalMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], text_column_names: List[str], model_select_metric: pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricNames, task_batch_size: int, num_negative_ctxs: int = 0)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(preds, context)[source]
batch_context(raw_batch, batch) → Dict[str, Any][source]
calculate_metric() → pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metrics: pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.