TSVDataSource.Config

Component: TSVDataSource

class TSVDataSource.Config

Bases: RootDataSource.Config

All Attributes (including base classes)

column_mapping: dict[str, str] = {}

train_filename: Optional[str] = None

Filename of training set. If not set, iteration will be empty.

test_filename: Optional[str] = None

Filename of testing set. If not set, iteration will be empty.

eval_filename: Optional[str] = None

Filename of eval set. If not set, iteration will be empty.

field_names: Optional[list[str]] = None

Field names for the TSV. If this is not set, the first line of each file will be assumed to be a header containing the field names.
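The header fallback described above can be illustrated with Python's standard csv module alone. This is a minimal sketch, not the TSVDataSource implementation: the helper read_rows and the sample data are hypothetical, and it assumes field_names behaves like the fieldnames argument of csv.DictReader.

```python
import csv
import io


def read_rows(text, field_names=None, delimiter="\t"):
    """Hypothetical helper: parse delimited text into one dict per row.

    With field_names=None, csv.DictReader consumes the first line
    as the header, which mirrors the behavior described above.
    """
    reader = csv.DictReader(
        io.StringIO(text),
        fieldnames=field_names,
        delimiter=delimiter,
        quoting=csv.QUOTE_NONE,
    )
    return list(reader)


# File content that starts with a header line.
with_header = "text\tlabel\nhello world\tgreeting\n"
# File content without a header; the names are supplied explicitly.
without_header = "hello world\tgreeting\n"

rows_a = read_rows(with_header)
rows_b = read_rows(without_header, field_names=["text", "label"])
```

Both calls produce the same rows. If a file has a header and field_names is also given, this sketch would treat the header line as data, so under these assumptions only one of the two should supply the names.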

delimiter: str = '\t'

The column delimiter passed to Python’s csv library. Change to ',' for CSV files.

quoted: bool = False

Whether the columns can use quotes to include delimiters or not. Rows with unclosed quotes will be merged with \n inside. Change to True for quoted csv.
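The effect of delimiter and quoted can be tried directly against Python's csv module. The sketch below assumes that quoted=False corresponds to csv.QUOTE_NONE and quoted=True to csv.QUOTE_MINIMAL; that mapping and the helper parse are assumptions made for illustration, not something this page confirms.

```python
import csv
import io


def parse(text, delimiter="\t", quoted=False):
    """Hypothetical helper: parse text into a list of rows."""
    # Assumed mapping from the `quoted` flag to a csv quoting mode.
    quoting = csv.QUOTE_MINIMAL if quoted else csv.QUOTE_NONE
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quoting=quoting)
    return list(reader)


line = 'a,"b,c",d\n'

# Quote characters are ordinary text, so the comma inside them still splits.
unquoted_rows = parse(line, delimiter=",", quoted=False)

# Quotes protect the embedded comma, giving three fields.
quoted_rows = parse(line, delimiter=",", quoted=True)

# An unclosed quote swallows the line break: the two physical lines
# come back as a single row whose last field contains "\n".
broken = 'x,"unclosed\ny,z\n'
merged_rows = parse(broken, delimiter=",", quoted=True)
```

The last case is the reason to leave quoted off for data that contains stray quote characters: a single unbalanced quote can silently merge several rows into one.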

drop_incomplete_rows: bool = False

Subclasses

BlockShardedTSVDataSource.Config
MultilingualTSVDataSource.Config
SessionTSVDataSource.Config

Default JSON

{
    "column_mapping": {},
    "train_filename": null,
    "test_filename": null,
    "eval_filename": null,
    "field_names": null,
    "delimiter": "\t",
    "quoted": false,
    "drop_incomplete_rows": false
}
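
As an illustration of adapting these defaults to a quoted CSV dataset, the sketch below overlays a few overrides on the default values and prints the resulting JSON. The file names train.csv and eval.csv are hypothetical, and how the resulting object is nested inside a full configuration file is not covered here.

```python
import json

# Defaults copied from the JSON above, written as a Python dict.
DEFAULTS = {
    "column_mapping": {},
    "train_filename": None,
    "test_filename": None,
    "eval_filename": None,
    "field_names": None,
    "delimiter": "\t",
    "quoted": False,
    "drop_incomplete_rows": False,
}

# Overrides for a comma-separated, quoted dataset (hypothetical file names).
overrides = {
    "train_filename": "train.csv",
    "eval_filename": "eval.csv",
    "delimiter": ",",
    "quoted": True,
}

# Later keys win, so the overrides replace the matching defaults.
csv_config = {**DEFAULTS, **overrides}

print(json.dumps(csv_config, indent=4))
```

Fields left out of the overrides keep their defaults, so test_filename stays null here and, per the attribute description above, iteration over the test set would be empty.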