pytext.metric_reporters package

Submodules

pytext.metric_reporters.calibration_metric_reporter module

class pytext.metric_reporters.calibration_metric_reporter.CalibrationMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], pad_index: int = -1)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
aggregate_scores(batch_scores: torch.Tensor)[source]
aggregate_targets(batch_targets: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config: pytext.config.pytext_config.PyTextConfig, pad_index: int = -1)[source]
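A minimal construction sketch, not taken from the library's docs (the ConsoleChannel choice is illustrative, and the meaning of pad_index is assumed from its name):

    from pytext.metric_reporters.calibration_metric_reporter import (
        CalibrationMetricReporter,
    )
    from pytext.metric_reporters.channel import ConsoleChannel

    # Report calibration metrics to the console; pad_index is assumed to mark
    # padding label positions to skip (-1 mirrors the default above).
    calibration_reporter = CalibrationMetricReporter(
        channels=[ConsoleChannel()],
        pad_index=-1,
    )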

pytext.metric_reporters.channel module

class pytext.metric_reporters.channel.Channel(stages: Tuple[pytext.common.constants.Stage, ...] = (<Stage.TRAIN: 'Training'>, <Stage.EVAL: 'Evaluation'>, <Stage.TEST: 'Test'>, <Stage.OTHERS: 'Others'>))[source]

Bases: object

Channel defines how to format and report the result of a PyText job to an output stream.

stages

The stages in which the report will be triggered. The default is all stages, which include train, eval, and test.

close()[source]
export(model, input_to_model=None, **kwargs)[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
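As an illustration of this interface, a minimal hypothetical Channel that only prints the loss during training and evaluation (a sketch, assuming nothing beyond the constructor and report() signature shown above):

    from pytext.common.constants import Stage
    from pytext.metric_reporters.channel import Channel

    class LossOnlyChannel(Channel):
        # Hypothetical channel: print only the average loss per epoch.
        def __init__(self):
            super().__init__(stages=(Stage.TRAIN, Stage.EVAL))

        def report(self, stage, epoch, metrics, model_select_metric, loss,
                   preds, targets, scores, context, *args):
            print(f"[{stage}] epoch={epoch} loss={loss:.4f}")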
class pytext.metric_reporters.channel.ConsoleChannel(stages: Tuple[pytext.common.constants.Stage, ...] = (<Stage.TRAIN: 'Training'>, <Stage.EVAL: 'Evaluation'>, <Stage.TEST: 'Test'>, <Stage.OTHERS: 'Others'>))[source]

Bases: pytext.metric_reporters.channel.Channel

Simple Channel that prints results to console.

report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
class pytext.metric_reporters.channel.FileChannel(stages, file_path)[source]

Bases: pytext.metric_reporters.channel.Channel

Simple Channel that writes results to a TSV file.

gen_content(metrics, loss, preds, targets, scores, context)[source]
get_title(context_keys=())[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
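A usage sketch (the output path is hypothetical; subclasses override get_title()/gen_content() to control the TSV columns):

    from pytext.common.constants import Stage
    from pytext.metric_reporters.channel import FileChannel

    # Write results to a TSV file at test time only.
    test_channel = FileChannel((Stage.TEST,), "/tmp/test_results.tsv")
    # Typically attached to a metric reporter, e.g.
    # reporter.add_channel(test_channel)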
class pytext.metric_reporters.channel.TensorBoardChannel(summary_writer=None, metric_name='accuracy')[source]

Bases: pytext.metric_reporters.channel.Channel

TensorBoardChannel defines how to format and report the result of a PyText job to TensorBoard.

summary_writer

An instance of the TensorBoard SummaryWriter class, or an object that implements the same interface. https://pytorch.org/docs/stable/tensorboard.html

metric_name

The name of the default metric to display on the TensorBoard dashboard; defaults to “accuracy”

train_step

The training step count
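A construction sketch (the log directory is hypothetical; SummaryWriter comes from PyTorch's torch.utils.tensorboard):

    from torch.utils.tensorboard import SummaryWriter
    from pytext.metric_reporters.channel import TensorBoardChannel

    # metric_name selects the headline metric shown on the dashboard.
    writer = SummaryWriter(log_dir="/tmp/pytext_runs/exp1")
    tb_channel = TensorBoardChannel(summary_writer=writer, metric_name="accuracy")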

add_scalars(prefix, metrics, epoch)[source]

Recursively flattens the metrics object and adds each field name and value as a scalar for the corresponding epoch using the summary writer.

Parameters:
  • prefix (str) – The tag prefix for the metric. Each field name in the metrics object will be prepended with the prefix.
  • metrics (Any) – The metrics object.
add_texts(tag, metrics)[source]

Recursively flattens the metrics object and adds each field name and value as a text using the summary writer. For example, if tag = “test”, and metrics = { accuracy: 0.7, scores: { precision: 0.8, recall: 0.6 } }, then under “tag=test” we will display “accuracy=0.7”, and under “tag=test/scores” we will display “precision=0.8” and “recall=0.6” in TensorBoard.

Parameters:
  • tag (str) – The tag name for the metric. If a field needs to be flattened further, it will be prepended as a prefix to the field name.
  • metrics (Any) – The metrics object/dict.
close()[source]

Closes the summary writer.

export(model, input_to_model=None, **kwargs)[source]

Draws the neural network representation graph in TensorBoard.

Parameters:
  • model (Any) – the model object.
  • input_to_model (Any) – the input to the model (required for PyTorch models, since the execution graph is defined by running the model).
log_loss(prefix, loss, epoch)[source]
log_vector(key, val, epoch)[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, meta, model, optimizer, log_gradient, gradients, *args)[source]

Defines how to format and report data to TensorBoard using the summary writer. In the current implementation, during the train/eval phase we recursively report each metric field as scalars, and during the test phase we report the final metrics to be displayed as texts.

Also visualizes the internal model states (weights, biases) as histograms in TensorBoard.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
  • meta (Dict[str, Any]) – global metadata, such as target names
  • model (nn.Module) – the PyTorch neural network model

pytext.metric_reporters.classification_metric_reporter module

class pytext.metric_reporters.classification_metric_reporter.ClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
classmethod from_config_and_label_names(config, label_names: List[str])[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric[source]

Bases: enum.Enum

An enumeration.

ACCURACY = 'accuracy'
LABEL_AVG_PRECISION = 'label_avg_precision'
LABEL_F1 = 'label_f1'
LABEL_ROC_AUC = 'label_roc_auc'
MACRO_F1 = 'macro_f1'
MCC = 'mcc'
NEGATIVE_LOSS = 'negative_loss'
ROC_AUC = 'roc_auc'
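A hypothetical construction sketch combining the reporter with one of the enum values above (the label set and channel choice are illustrative):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.classification_metric_reporter import (
        ClassificationMetricReporter,
        ComparableClassificationMetric,
    )

    # Select the best model by macro-F1 instead of the default accuracy.
    reporter = ClassificationMetricReporter(
        label_names=["negative", "positive"],
        channels=[ConsoleChannel()],
        model_select_metric=ComparableClassificationMetric.MACRO_F1,
    )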
class pytext.metric_reporters.classification_metric_reporter.MultiLabelClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.classification_metric_reporter.ClassificationMetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

pytext.metric_reporters.compositional_metric_reporter module

class pytext.metric_reporters.compositional_metric_reporter.CompositionalMetricReporter(actions_vocab, channels: List[pytext.metric_reporters.channel.Channel], text_column_name: str = 'tokenized_text', tokenizer: pytext.data.tokenizers.tokenizer.Tokenizer = None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

create_frame_prediction_pairs()[source]
classmethod from_config(config, metadata: pytext.data.data_handler.CommonMetadata = None, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer] = None)[source]
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

static node_to_metrics_node(node: Union[pytext.data.data_structures.annotation.Intent, pytext.data.data_structures.annotation.Slot], start: int = 0) → pytext.metrics.intent_slot_metrics.Node[source]

The input start is the absolute start position in the utterance.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

static tree_from_tokens_and_indx_actions(token_str_list: List[str], actions_vocab: List[str], actions_indices: List[int], validate_tree: bool = True)[source]
static tree_to_metric_node(tree: pytext.data.data_structures.annotation.Tree) → pytext.metrics.intent_slot_metrics.Node[source]

Creates a Node from a tree, assuming the utterance is the tokens joined by whitespace. The function does not necessarily reproduce the original utterance, since extra whitespace can be introduced.

pytext.metric_reporters.compositional_utils module

pytext.metric_reporters.compositional_utils.extract_beam_subtrees(beam: List[List[str]]) → List[List[str]][source]
pytext.metric_reporters.compositional_utils.extract_subtree(beam: List[str]) → Optional[List[str]][source]
pytext.metric_reporters.compositional_utils.filter_invalid_beams(beam: List[List[str]]) → List[List[str]][source]
pytext.metric_reporters.compositional_utils.is_valid_tree(beam: List[str]) → bool[source]

pytext.metric_reporters.dense_retrieval_metric_reporter module

class pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricNames[source]

Bases: enum.Enum

An enumeration.

ACCURACY = 'accuracy'
AVG_RANK = 'avg_rank'
MEAN_RECIPROCAL_RANK = 'mean_reciprocal_rank'
NEGATIVE_LOSS = 'negative_loss'
class pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], text_column_names: List[str], model_select_metric: pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricNames, task_batch_size: int, num_negative_ctxs: int = 0)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(preds, context)[source]
batch_context(raw_batch, batch) → Dict[str, Any][source]
calculate_metric() → pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metrics: pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

pytext.metric_reporters.disjoint_multitask_metric_reporter module

class pytext.metric_reporters.disjoint_multitask_metric_reporter.DisjointMultitaskMetricReporter(reporters: Dict[str, pytext.metric_reporters.metric_reporter.MetricReporter], loss_weights: Dict[str, float], target_task_name: Optional[str], use_subtask_select_metric: bool)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
add_channel(channel)[source]
batch_context(raw_batch, batch)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = False
report_metric(model, stage, epoch, reset=True, print_to_channels=True, optimizer=None)[source]

Calculate metrics and the average loss, and report all statistics to the channels.

Parameters:
  • model (nn.Module) – the PyTorch neural network model.
  • stage (Stage) – training, evaluation or test
  • epoch (int) – current epoch
  • reset (bool) – if all data should be reset after report, default is True
  • print_to_channels (bool) – if report data to channels, default is True
report_realtime_metric(stage)[source]
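A hypothetical wiring sketch (the task names, loss weights, and the use of PureLossMetricReporter as stand-in sub-reporters are all illustrative):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.disjoint_multitask_metric_reporter import (
        DisjointMultitaskMetricReporter,
    )
    from pytext.metric_reporters.metric_reporter import PureLossMetricReporter

    # Stand-in sub-task reporters; real tasks would use task-specific
    # reporters such as ClassificationMetricReporter.
    intent_reporter = PureLossMetricReporter([ConsoleChannel()])
    lm_reporter = PureLossMetricReporter([ConsoleChannel()])

    multitask_reporter = DisjointMultitaskMetricReporter(
        reporters={"intent": intent_reporter, "lm": lm_reporter},
        loss_weights={"intent": 1.0, "lm": 0.5},
        target_task_name="intent",  # model selection follows this task
        use_subtask_select_metric=False,
    )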

pytext.metric_reporters.intent_slot_detection_metric_reporter module

class pytext.metric_reporters.intent_slot_detection_metric_reporter.IntentSlotMetricReporter(doc_label_names: List[str], word_label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel], slot_column_name: str = 'slots', text_column_name: str = 'text', token_tensorizer_name: str = 'tokens')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers: Optional[Dict[KT, VT]] = None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

get_raw_slot_str(raw_data_row)[source]
predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

pytext.metric_reporters.intent_slot_detection_metric_reporter.create_frame(text, intent_label, slot_names_str, byte_len)[source]
pytext.metric_reporters.intent_slot_detection_metric_reporter.frame_to_str(frame: pytext.metrics.intent_slot_metrics.Node)[source]

pytext.metric_reporters.language_model_metric_reporter module

class pytext.metric_reporters.language_model_metric_reporter.LanguageModelChannel(stages, file_path)[source]

Bases: pytext.metric_reporters.channel.FileChannel

gen_content(metrics, loss, preds, targets, scores, contexts)[source]
get_title(context_keys=())[source]
class pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter(channels, metadata, tensorizers, aggregate_metrics, perplexity_type, pep_format, log_gradient=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

LABELS_COLUMN = 'labels'
RAW_TEXT_COLUMN = 'text'
TOKENS_COLUMN = 'tokens'
UTTERANCE_COLUMN = 'utterance'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_context(context)[source]
aggregate_scores(scores)[source]
batch_context(raw_batch, batch)[source]
calculate_loss() → float[source]

Calculate the average loss over all aggregated batches.

calculate_metric() → pytext.metrics.language_model_metrics.LanguageModelMetric[source]

Calculate metrics; each subclass should implement this method.

compute_scores(logits, targets)[source]
classmethod from_config(config: pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter.Config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
get_model_select_metric(metrics) → float[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = True
class pytext.metric_reporters.language_model_metric_reporter.MaskedLMMetricReporter(channels, metadata, tensorizers, aggregate_metrics, perplexity_type, pep_format, log_gradient=False)[source]

Bases: pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
calculate_loss() → float[source]

Calculate the average loss over all aggregated batches.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
report_realtime_metric(stage)[source]
pytext.metric_reporters.language_model_metric_reporter.get_perplexity_func(perplexity_type)[source]

pytext.metric_reporters.mask_compositional module

pytext.metric_reporters.metric_reporter module

class pytext.metric_reporters.metric_reporter.MetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.config.component.Component

MetricReporter is responsible for three things:

  1. Aggregate output from the trainer, which includes model inputs, predictions, targets, scores, and loss.
  2. Calculate metrics using the aggregated output, and define how the metric is used to find the best model.
  3. Optionally report the metrics and aggregated output to various channels.
lower_is_better

Whether a lower metric value indicates better performance; set to True for metrics like perplexity and False for metrics like accuracy. Default is False.

Type: bool
channels

A list of Channel objects that receive the metrics and the aggregated trainer output, then format and report them in any customized way.

Type: List[Channel]

MetricReporter is tightly coupled with metric aggregation and computation, which makes it hard for subclasses to reuse the parent's functionality and attributes. The next step is to decouple metric aggregation and computation from metric reporting.
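As a sketch of the intended subclassing pattern (hypothetical reporter; it assumes the per-example lists such as self.all_preds and self.all_targets that the base class accumulates via add_batch_stats()):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.metric_reporter import MetricReporter

    class ExactMatchMetricReporter(MetricReporter):
        # Hypothetical reporter: exact-match accuracy over aggregated outputs.
        lower_is_better = False  # higher accuracy means a better model

        @classmethod
        def from_config(cls, config, *args, **kwargs):
            return cls(channels=[ConsoleChannel()])

        def calculate_metric(self):
            # self.all_preds / self.all_targets are assumed to hold the
            # per-example predictions and targets aggregated so far.
            correct = sum(p == t for p, t in zip(self.all_preds, self.all_targets))
            return correct / max(len(self.all_targets), 1)

        def get_model_select_metric(self, metrics):
            return metrics  # calculate_metric() already returns a single float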

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
add_channel(channel)[source]
add_gradients(model)[source]
classmethod aggregate_data(all_data, new_batch)[source]

Aggregate a batch of data; basically just convert tensors to lists of native Python data.

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

compare_metric(new_metric, old_metric)[source]

Check if the new metric indicates better model performance.

Returns: bool, True if the model with new_metric performs better
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_gradients()[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

log_gradient = False
lower_is_better = False
predictions_to_report()[source]

Generate human readable predictions

report_metric(model, stage, epoch, reset=True, print_to_channels=True, optimizer=None)[source]

Calculate metrics and the average loss, and report all statistics to the channels.

Parameters:
  • model (nn.Module) – the PyTorch neural network model.
  • stage (Stage) – training, evaluation or test
  • epoch (int) – current epoch
  • reset (bool) – if all data should be reset after report, default is True
  • print_to_channels (bool) – if report data to channels, default is True
report_realtime_metric(stage)[source]
targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.metric_reporter.PureLossMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, **kwargs)[source]
lower_is_better = True

pytext.metric_reporters.pairwise_ranking_metric_reporter module

class pytext.metric_reporters.pairwise_ranking_metric_reporter.PairwiseRankingMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

pytext.metric_reporters.regression_metric_reporter module

class pytext.metric_reporters.regression_metric_reporter.RegressionMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers=None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = False

pytext.metric_reporters.seq2seq_compositional module

class pytext.metric_reporters.seq2seq_compositional.CompositionalSeq2SeqFileChannel(stages, file_path, tensorizers, accept_flat_intents_slots)[source]

Bases: pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqFileChannel

gen_content(metrics, loss, preds, targets, scores, context)[source]
get_title(context_keys=())[source]
validated_annotation(predicted_output_sequence)[source]
class pytext.metric_reporters.seq2seq_compositional.Seq2SeqCompositionalMetricReporter(channels, log_gradient, tensorizers, accept_flat_intents_slots)[source]

Bases: pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqMetricReporter

aggregate_preds(new_batch, context=None)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

create_frame_prediction_pairs()[source]
classmethod from_config(config: pytext.metric_reporters.seq2seq_compositional.Seq2SeqCompositionalMetricReporter.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]
get_annotation_from_string(stringified_tree_str: str) → pytext.data.data_structures.annotation.Annotation[source]
stringify_annotation_tree(tree_tokens, tree_vocab)[source]

pytext.metric_reporters.seq2seq_metric_reporter module

class pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqFileChannel(stages, file_path, tensorizers)[source]

Bases: pytext.metric_reporters.channel.FileChannel

gen_content(metrics, loss, preds, targets, scores, context)[source]
get_title(context_keys=())[source]
class pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqMetricReporter(channels, log_gradient, tensorizers)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_preds(new_batch, context=None)[source]
aggregate_src_tokens(new_batch)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config: pytext.metric_reporters.seq2seq_metric_reporter.Seq2SeqMetricReporter.Config, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer])[source]
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = True

pytext.metric_reporters.seq2seq_utils module

pytext.metric_reporters.seq2seq_utils.stringify(token_indices, vocab)[source]

pytext.metric_reporters.squad_metric_reporter module

class pytext.metric_reporters.squad_metric_reporter.SquadFileChannel(stages, file_path)[source]

Bases: pytext.metric_reporters.channel.FileChannel

gen_content(metrics, loss, preds, targets, scores, contexts, *args)[source]
get_title(context_keys=())[source]
class pytext.metric_reporters.squad_metric_reporter.SquadMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], n_best_size: int, max_answer_length: int, ignore_impossible: bool, has_answer_labels: List[str], tensorizer=None, false_label='False')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

ANSWERS_COLUMN = 'answers'
DOC_COLUMN = 'doc'
QUES_COLUMN = 'question'
ROW_INDEX = 'id'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **contexts)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_preds(new_batch, context=None)[source]
aggregate_scores(new_batch)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metric: pytext.metrics.squad_metrics.SquadMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

pytext.metric_reporters.word_tagging_metric_reporter module

class pytext.metric_reporters.word_tagging_metric_reporter.MultiLabelSequenceTaggingMetricReporter(label_names, pad_idx, channels, label_vocabs=None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
aggregate_tuple_data(all_data, new_batch)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.word_tagging_metric_reporter.NERMetricReporter(label_names: List[str], pad_idx: int, channels: List[pytext.metric_reporters.channel.Channel], use_bio_labels: bool = True)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric() → pytext.metrics.PRF1Metrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.
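A hypothetical construction sketch (the BIO label vocabulary and pad index are illustrative; pad_idx is assumed to point at the padding label in label_names):

    from pytext.metric_reporters.channel import ConsoleChannel
    from pytext.metric_reporters.word_tagging_metric_reporter import NERMetricReporter

    ner_reporter = NERMetricReporter(
        label_names=["<pad>", "O", "B-PER", "I-PER", "B-LOC", "I-LOC"],
        pad_idx=0,
        channels=[ConsoleChannel()],
        use_bio_labels=True,
    )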

class pytext.metric_reporters.word_tagging_metric_reporter.SequenceTaggingMetricReporter(label_names, pad_idx, channels)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.word_tagging_metric_reporter.Span(label, start, end)[source]

Bases: tuple

end

Alias for field number 2

label

Alias for field number 0

start

Alias for field number 1

class pytext.metric_reporters.word_tagging_metric_reporter.WordTaggingMetricReporter(label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel])[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

process_pred(pred: List[int]) → List[str][source]

pred is a list of token label indices

pytext.metric_reporters.word_tagging_metric_reporter.convert_bio_to_spans(bio_sequence: List[str]) → List[pytext.metric_reporters.word_tagging_metric_reporter.Span][source]

Process the output and convert to spans for evaluation.
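A small usage sketch (the tag sequence is illustrative; each returned Span is a (label, start, end) tuple as documented above):

    from pytext.metric_reporters.word_tagging_metric_reporter import (
        convert_bio_to_spans,
    )

    # Four tokens tagged in BIO format.
    tags = ["B-city", "I-city", "O", "B-date"]
    for span in convert_bio_to_spans(tags):
        print(span.label, span.start, span.end)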

pytext.metric_reporters.word_tagging_metric_reporter.get_slots(word_names)[source]

Module contents

class pytext.metric_reporters.Channel(stages: Tuple[pytext.common.constants.Stage, ...] = (<Stage.TRAIN: 'Training'>, <Stage.EVAL: 'Evaluation'>, <Stage.TEST: 'Test'>, <Stage.OTHERS: 'Others'>))[source]

Bases: object

Channel defines how to format and report the result of a PyText job to an output stream.

stages

The stages in which the report will be triggered. The default is all stages, which include train, eval, and test.

close()[source]
export(model, input_to_model=None, **kwargs)[source]
report(stage, epoch, metrics, model_select_metric, loss, preds, targets, scores, context, *args)[source]

Defines how to format and report data to the output channel.

Parameters:
  • stage (Stage) – train, eval or test
  • epoch (int) – current epoch
  • metrics (Any) – all metrics
  • model_select_metric (double) – a single numeric metric to pick best model
  • loss (double) – average loss
  • preds (List[Any]) – list of predictions
  • targets (List[Any]) – list of targets
  • scores (List[Any]) – list of scores
  • context (Dict[str, List[Any]]) – dict of any additional context data, each context is a list of data that maps to each example
class pytext.metric_reporters.MetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.config.component.Component

MetricReporter is responsible for three things:

  1. Aggregate output from the trainer, which includes model inputs, predictions, targets, scores, and loss.
  2. Calculate metrics using the aggregated output, and define how the metric is used to find the best model.
  3. Optionally report the metrics and aggregated output to various channels.
lower_is_better

Whether a lower metric value indicates better performance; set to True for metrics like perplexity and False for metrics like accuracy. Default is False.

Type: bool
channels

A list of Channel objects that receive the metrics and the aggregated trainer output, then format and report them in any customized way.

Type: List[Channel]

MetricReporter is tightly coupled with metric aggregation and computation, which makes it hard for subclasses to reuse the parent's functionality and attributes. The next step is to decouple metric aggregation and computation from metric reporting.

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
add_channel(channel)[source]
add_gradients(model)[source]
classmethod aggregate_data(all_data, new_batch)[source]

Aggregate a batch of data; basically just convert tensors to lists of native Python data.

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

compare_metric(new_metric, old_metric)[source]

Check if the new metric indicates better model performance.

Returns: bool, True if the model with new_metric performs better
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_gradients()[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

log_gradient = False
lower_is_better = False
predictions_to_report()[source]

Generate human readable predictions

report_metric(model, stage, epoch, reset=True, print_to_channels=True, optimizer=None)[source]

Calculate metrics and the average loss, and report all statistics to the channels.

Parameters:
  • model (nn.Module) – the PyTorch neural network model.
  • stage (Stage) – training, evaluation or test
  • epoch (int) – current epoch
  • reset (bool) – if all data should be reset after report, default is True
  • print_to_channels (bool) – if report data to channels, default is True
report_realtime_metric(stage)[source]
targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.CalibrationMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], pad_index: int = -1)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
aggregate_scores(batch_scores: torch.Tensor)[source]
aggregate_targets(batch_targets: torch.Tensor, batch_context=typing.Dict[str, typing.Any])[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config: pytext.config.pytext_config.PyTextConfig, pad_index: int = -1)[source]
class pytext.metric_reporters.ClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
classmethod from_config_and_label_names(config, label_names: List[str])[source]
get_meta()[source]

Get global metadata that is not specific to any batch; this data will be passed along to the channels.

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.MultiLabelClassificationMetricReporter(label_names: List[str], channels: List[pytext.metric_reporters.channel.Channel], model_select_metric: pytext.metric_reporters.classification_metric_reporter.ComparableClassificationMetric = <ComparableClassificationMetric.ACCURACY: 'accuracy'>, target_label: Optional[str] = None, text_column_names: List[str] = ['text'], additional_column_names: List[str] = [], recall_at_precision_thresholds: List[float] = [0.2, 0.4, 0.6, 0.8, 0.9], is_memory_efficient: bool = False)[source]

Bases: pytext.metric_reporters.classification_metric_reporter.ClassificationMetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.MultiLabelSequenceTaggingMetricReporter(label_names, pad_idx, channels, label_vocabs=None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context=None)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context=None)[source]
aggregate_tuple_data(all_data, new_batch)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.RegressionMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers=None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = False
class pytext.metric_reporters.IntentSlotMetricReporter(doc_label_names: List[str], word_label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel], slot_column_name: str = 'slots', text_column_name: str = 'text', token_tensorizer_name: str = 'tokens')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(batch_preds, batch_context)[source]
aggregate_scores(batch_scores)[source]
aggregate_targets(batch_targets, batch_context)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizers: Optional[Dict[KT, VT]] = None)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

get_raw_slot_str(raw_data_row)[source]
predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

class pytext.metric_reporters.LanguageModelMetricReporter(channels, metadata, tensorizers, aggregate_metrics, perplexity_type, pep_format, log_gradient=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

LABELS_COLUMN = 'labels'
RAW_TEXT_COLUMN = 'text'
TOKENS_COLUMN = 'tokens'
UTTERANCE_COLUMN = 'utterance'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_context(context)[source]
aggregate_scores(scores)[source]
batch_context(raw_batch, batch)[source]
calculate_loss() → float[source]

Calculate the average loss over all aggregated batches.

calculate_metric() → pytext.metrics.language_model_metrics.LanguageModelMetric[source]

Calculate metrics; each subclass should implement this method.

compute_scores(logits, targets)[source]
classmethod from_config(config: pytext.metric_reporters.language_model_metric_reporter.LanguageModelMetricReporter.Config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
get_model_select_metric(metrics) → float[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

lower_is_better = True
class pytext.metric_reporters.SquadMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], n_best_size: int, max_answer_length: int, ignore_impossible: bool, has_answer_labels: List[str], tensorizer=None, false_label='False')[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

ANSWERS_COLUMN = 'answers'
DOC_COLUMN = 'doc'
QUES_COLUMN = 'question'
ROW_INDEX = 'id'
add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **contexts)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
aggregate_preds(new_batch, context=None)[source]
aggregate_scores(new_batch)[source]
aggregate_targets(new_batch, context=None)[source]
batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metric: pytext.metrics.squad_metrics.SquadMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.WordTaggingMetricReporter(label_names: List[str], use_bio_labels: bool, channels: List[pytext.metric_reporters.channel.Channel])[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_loss()[source]

Calculate the average loss over all aggregated batches.

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata)[source]
get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

process_pred(pred: List[int]) → List[str][source]

pred is a list of token label indices

class pytext.metric_reporters.CompositionalMetricReporter(actions_vocab, channels: List[pytext.metric_reporters.channel.Channel], text_column_name: str = 'tokenized_text', tokenizer: pytext.data.tokenizers.tokenizer.Tokenizer = None)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

create_frame_prediction_pairs()[source]
classmethod from_config(config, metadata: pytext.data.data_handler.CommonMetadata = None, tensorizers: Dict[str, pytext.data.tensorizers.Tensorizer] = None)[source]
gen_extra_context(*args)[source]

Generate any extra intermediate context data for metric calculation

get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

static node_to_metrics_node(node: Union[pytext.data.data_structures.annotation.Intent, pytext.data.data_structures.annotation.Slot], start: int = 0) → pytext.metrics.intent_slot_metrics.Node[source]

The input start is the absolute start position in the utterance.

predictions_to_report()[source]

Generate human readable predictions

targets_to_report()[source]

Generate human readable targets

static tree_from_tokens_and_indx_actions(token_str_list: List[str], actions_vocab: List[str], actions_indices: List[int], validate_tree: bool = True)[source]
static tree_to_metric_node(tree: pytext.data.data_structures.annotation.Tree) → pytext.metrics.intent_slot_metrics.Node[source]

Creates a Node from a tree, assuming the utterance is the tokens joined by whitespace. The function does not necessarily reproduce the original utterance, since extra whitespace can be introduced.

class pytext.metric_reporters.PairwiseRankingMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

add_batch_stats(n_batches, preds, targets, scores, loss, m_input, **context)[source]

Aggregates a batch of output data (predictions, scores, targets/true labels and loss).

Parameters:
  • n_batches (int) – number of current batch
  • preds (torch.Tensor) – predictions of current batch
  • targets (torch.Tensor) – targets of current batch
  • scores (torch.Tensor) – scores of current batch
  • loss (double) – average loss of current batch
  • m_input (Tuple[torch.Tensor, ..]) – model inputs of current batch
  • context (Dict[str, Any]) – any additional context data, it could be either a list of data which maps to each example, or a single value for the batch
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, meta: pytext.data.data_handler.CommonMetadata = None, tensorizers=None)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.SequenceTaggingMetricReporter(label_names, pad_idx, channels)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.PureLossMetricReporter(channels, log_gradient=False, pep_format=False)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

calculate_metric()[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, **kwargs)[source]
lower_is_better = True
class pytext.metric_reporters.NERMetricReporter(label_names: List[str], pad_idx: int, channels: List[pytext.metric_reporters.channel.Channel], use_bio_labels: bool = True)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

batch_context(raw_batch, batch)[source]
calculate_metric() → pytext.metrics.PRF1Metrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, tensorizer)[source]
static get_model_select_metric(metrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.

class pytext.metric_reporters.DenseRetrievalMetricReporter(channels: List[pytext.metric_reporters.channel.Channel], text_column_names: List[str], model_select_metric: pytext.metric_reporters.dense_retrieval_metric_reporter.DenseRetrievalMetricNames, task_batch_size: int, num_negative_ctxs: int = 0)[source]

Bases: pytext.metric_reporters.metric_reporter.MetricReporter

aggregate_preds(preds, context)[source]
batch_context(raw_batch, batch) → Dict[str, Any][source]
calculate_metric() → pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics[source]

Calculate metrics; each subclass should implement this method.

classmethod from_config(config, *args, tensorizers=None, **kwargs)[source]
get_model_select_metric(metrics: pytext.metrics.dense_retrieval_metrics.DenseRetrievalMetrics)[source]

Return a single numeric metric value that is used for model selection. By default this returns the metric itself, but metrics are usually more complex data structures.