scandeval package
Subpackages
- scandeval.benchmarks package
- Subpackages
- scandeval.benchmarks.abstract package
- Submodules
- scandeval.benchmarks.abstract.base module
- scandeval.benchmarks.abstract.dep module
- scandeval.benchmarks.abstract.ner module
- scandeval.benchmarks.abstract.pos module
- scandeval.benchmarks.abstract.text_classification module
- scandeval.benchmarks.abstract.token_classification module
- Module contents
- Submodules
- scandeval.benchmarks.absabank_imm module
- scandeval.benchmarks.angry_tweets module
- scandeval.benchmarks.dalaj module
- scandeval.benchmarks.dane module
- scandeval.benchmarks.ddt_dep module
- scandeval.benchmarks.ddt_pos module
- scandeval.benchmarks.dkhate module
- scandeval.benchmarks.europarl module
- scandeval.benchmarks.fdt_dep module
- scandeval.benchmarks.fdt_pos module
- scandeval.benchmarks.idt_dep module
- scandeval.benchmarks.idt_pos module
- scandeval.benchmarks.lcc module
- scandeval.benchmarks.mim_gold_ner module
- scandeval.benchmarks.ndt_nb_dep module
- scandeval.benchmarks.ndt_nb_pos module
- scandeval.benchmarks.ndt_nn_dep module
- scandeval.benchmarks.ndt_nn_pos module
- scandeval.benchmarks.nordial module
- scandeval.benchmarks.norec module
- scandeval.benchmarks.norec_fo module
- scandeval.benchmarks.norec_is module
- scandeval.benchmarks.norne_nb module
- scandeval.benchmarks.norne_nn module
- scandeval.benchmarks.sdt_dep module
- scandeval.benchmarks.sdt_pos module
- scandeval.benchmarks.suc3 module
- scandeval.benchmarks.twitter_sent module
- scandeval.benchmarks.wikiann_fo module
- Module contents
Submodules
scandeval.benchmark module
Fetches an updated list of all Scandinavian models on the HuggingFace Hub
- class scandeval.benchmark.Benchmark(progress_bar: bool = True, save_results: bool = False, language: Union[str, List[str]] = ['da', 'sv', 'no', 'nb', 'nn', 'is', 'fo'], task: Union[str, List[str]] = 'all', evaluate_train: bool = False, verbose: bool = False)¶
Bases:
object
Benchmarking all the Scandinavian language models.
- Parameters
progress_bar (bool, optional) – Whether progress bars should be shown. Defaults to True.
save_results (bool, optional) – Whether to save the benchmark results to ‘scandeval_benchmark_results.json’. Defaults to False.
language (str or list of str, optional) – The language codes of the languages to include in the list. Set this to ‘all’ if all languages (also non-Scandinavian) should be considered. Defaults to [‘da’, ‘sv’, ‘no’, ‘nb’, ‘nn’, ‘is’, ‘fo’].
task (str or list of str, optional) – The tasks to consider in the list. Set this to ‘all’ if all tasks should be considered. Defaults to ‘all’.
evaluate_train (bool, optional) – Whether to evaluate the training set as well. Defaults to False.
verbose (bool, optional) – Whether to print additional output. Defaults to False.
- progress_bar¶
Whether progress bars should be shown.
- Type
bool
- save_results¶
Whether to save the benchmark results.
- Type
bool
- language¶
The languages to include in the list.
- Type
str or list of str
- task¶
The tasks to consider in the list.
- Type
str or list of str
- evaluate_train¶
Whether to evaluate the training set as well.
- Type
bool
- verbose¶
Whether to print additional output.
- Type
bool
- benchmark_results¶
The benchmark results.
- Type
dict
- benchmark(model_id: Optional[Union[str, List[str]]] = None, dataset: Optional[Union[str, List[str]]] = None, progress_bar: Optional[bool] = None, save_results: Optional[bool] = None, language: Optional[Union[str, List[str]]] = None, task: Optional[Union[str, List[str]]] = None, evaluate_train: Optional[bool] = None, verbose: Optional[bool] = None) Dict[str, Dict[str, dict]] ¶
Benchmarks models on datasets.
- Parameters
model_id (str, list of str or None, optional) – The model ID(s) of the models to benchmark. If None then all relevant model IDs will be benchmarked. Defaults to None.
dataset (str, list of str or None, optional) – The datasets to benchmark on. If None then all datasets will be benchmarked. Defaults to None.
progress_bar (bool or None, optional) – Whether progress bars should be shown. If None then the default value from the constructor will be used. Defaults to None.
save_results (bool or None, optional) – Whether to save the benchmark results to ‘scandeval_benchmark_results.json’. If None then the default value from the constructor will be used. Defaults to None.
language (str, list of str or None, optional) – The language codes of the languages to include in the list. Set this to ‘all’ if all languages (also non-Scandinavian) should be considered. If None then the default value from the constructor will be used. Defaults to None.
task (str, list of str or None, optional) – The tasks to consider in the list. Set this to ‘all’ if all tasks should be considered. If None then the default value from the constructor will be used. Defaults to None.
evaluate_train (bool or None, optional) – Whether to evaluate the training set as well. If None then the default value from the constructor will be used. Defaults to None.
verbose (bool or None, optional) – Whether to print additional output. If None then the default value from the constructor will be used. Defaults to None.
- Returns
A nested dictionary of the benchmark results. The keys are the names of the datasets, with values being new dictionaries having the model IDs as keys.
- Return type
dict
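The "None means use the constructor default" convention and the shape of the returned dictionary can be sketched without the library. The class `SketchBenchmark`, its `_resolve` helper and the contents of the innermost dictionaries are hypothetical stand-ins for illustration; they are not scandeval's implementation.

```python
from typing import Dict, List, Optional, Union


class SketchBenchmark:
    """Hypothetical stand-in that mimics only the argument handling."""

    def __init__(self, progress_bar: bool = True, verbose: bool = False):
        self.progress_bar = progress_bar
        self.verbose = verbose

    def _resolve(self, name: str, value):
        # A None argument falls back to the value given to the constructor.
        return getattr(self, name) if value is None else value

    def benchmark(
        self,
        model_id: Union[str, List[str]],
        dataset: Union[str, List[str]],
        progress_bar: Optional[bool] = None,
        verbose: Optional[bool] = None,
    ) -> Dict[str, Dict[str, dict]]:
        progress_bar = self._resolve("progress_bar", progress_bar)
        verbose = self._resolve("verbose", verbose)

        # Accept either a single string or a list of strings.
        model_ids = [model_id] if isinstance(model_id, str) else list(model_id)
        datasets = [dataset] if isinstance(dataset, str) else list(dataset)

        # Outer keys are dataset names, inner keys are model IDs.
        # The innermost dict only records the resolved settings (made up).
        return {
            ds: {
                m: {"progress_bar": progress_bar, "verbose": verbose}
                for m in model_ids
            }
            for ds in datasets
        }


bench = SketchBenchmark(progress_bar=False)
results = bench.benchmark(model_id="model-a", dataset=["dane", "suc3"], verbose=True)
```

Here `progress_bar` is left as None in the call and therefore takes the constructor value, while `verbose` is overridden for this call only.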
scandeval.cli module
Command-line interface for benchmarking
scandeval.datasets module
Functions that load datasets
- scandeval.datasets.load_absabank_imm() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the ABSAbank-Imm dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_angry_tweets() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the AngryTweets dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_dalaj() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the DaLaJ dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_dane() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the DaNE dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_dataset(name: str) Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load a benchmark dataset.
- Parameters
name (str) – Name of the dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- Raises
RuntimeError – If name is not a valid dataset name.
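The lookup-by-name behaviour described above, including the RuntimeError for an unknown name, can be sketched as a small dispatch table. Everything here is hypothetical: the `LOADERS` registry, the `load_toy_sentiment` loader and the `"toy-sentiment"` name are made up, and plain lists stand in for the pandas dataframes so that the sketch has no dependencies.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for the four-dataframe return value.
Split = Tuple[List[str], List[str], List[str], List[str]]


def load_toy_sentiment() -> Split:
    """Hypothetical loader returning X_train, X_test, y_train, y_test."""
    X_train = ["god film", "dårlig film"]
    X_test = ["fin bog"]
    y_train = ["positive", "negative"]
    y_test = ["positive"]
    return X_train, X_test, y_train, y_test


# Hypothetical registry mapping dataset names to loading functions.
LOADERS: Dict[str, Callable[[], Split]] = {"toy-sentiment": load_toy_sentiment}


def load_dataset_sketch(name: str) -> Split:
    loader = LOADERS.get(name)
    if loader is None:
        # Mirrors the documented behaviour for an invalid dataset name.
        raise RuntimeError(f"The dataset '{name}' is not a valid dataset name.")
    return loader()


X_train, X_test, y_train, y_test = load_dataset_sketch("toy-sentiment")

try:
    load_dataset_sketch("no-such-dataset")
    raised = False
except RuntimeError:
    raised = True
```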
- scandeval.datasets.load_ddt_dep() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the dependency parsing part of the DDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_ddt_pos() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the POS part of the DDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_dkhate() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the DKHate dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_europarl() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Europarl dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_fdt_dep() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the dependency parsing part of the FDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_fdt_pos() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the POS part of the FDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_idt_dep() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the dependency parsing part of the IDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_idt_pos() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the POS part of the IDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_lcc() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the LCC dataset.
This dataset is the concatenation of the LCC1 and LCC2 datasets.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_mim_gold_ner() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the MIM-GOLD-NER dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_ndt_nb_dep() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Bokmål dependency parsing part of the NDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_ndt_nb_pos() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Bokmål POS part of the NDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_ndt_nn_dep() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Nynorsk dependency parsing part of the NDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_ndt_nn_pos() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Nynorsk POS part of the NDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_nordial() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Bokmål/Nynorsk part of the NorDial dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_norec() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the NoReC dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_norec_fo() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the NoReC-FO dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_norec_is() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the NoReC-IS dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_norne_nb() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Bokmål part of the NorNE dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_norne_nn() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Nynorsk part of the NorNE dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_sdt_dep() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the dependency parsing part of the SDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_sdt_pos() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the POS part of the SDT dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_suc3() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the SUC 3.0 dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_twitter_sent() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the TwitterSent dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
- scandeval.datasets.load_wikiann_fo() Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame] ¶
Load the Faroese WikiANN dataset.
- Returns
Four dataframes, X_train, X_test, y_train and y_test, where X_train and X_test correspond to the feature matrices for the training and test split, respectively, and y_train and y_test contain the target vectors.
- Return type
tuple
scandeval.utils module
Utility functions to be used in other scripts
- class scandeval.utils.DocInherit(mthd: Callable)¶
Bases:
object
Docstring inheriting method descriptor.
The class itself is also used as a decorator.
- get_no_inst(cls)¶
- get_with_inst(obj, cls)¶
- use_parent_doc(func, source)¶
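The idea of a docstring-inheriting method descriptor can be illustrated with a minimal sketch. The `InheritDoc` class below is a hypothetical simplification written for this example; it does not reproduce the `get_no_inst`, `get_with_inst` or `use_parent_doc` methods listed above, and the `Shape` and `Square` classes are made up.

```python
from functools import wraps


class InheritDoc:
    """Descriptor that fills in a missing docstring from a parent class."""

    def __init__(self, method):
        self.method = method
        self.name = method.__name__

    def __get__(self, obj, cls):
        # Look through the parents of cls for a same-named attribute
        # that carries a docstring.
        parent_doc = None
        for parent in cls.__mro__[1:]:
            parent_attr = getattr(parent, self.name, None)
            if parent_attr is not None and parent_attr.__doc__:
                parent_doc = parent_attr.__doc__
                break

        method = self.method

        @wraps(method)
        def bound(*args, **kwargs):
            # Bind to the instance when accessed through one.
            if obj is not None:
                return method(obj, *args, **kwargs)
            return method(*args, **kwargs)

        # Keep the method's own docstring if it has one.
        bound.__doc__ = method.__doc__ or parent_doc
        return bound


class Shape:
    def area(self) -> float:
        """Return the area of the shape."""
        return 0.0


class Square(Shape):
    def __init__(self, side: float):
        self.side = side

    @InheritDoc  # the class is used directly as a decorator
    def area(self) -> float:
        return self.side ** 2
```

With this sketch, `Square.area.__doc__` is taken from `Shape.area`, while calling `Square(3).area()` still runs the overriding method.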
- exception scandeval.utils.InvalidBenchmark(message: str = 'This model cannot be benchmarked on the given dataset.')¶
Bases:
Exception
- class scandeval.utils.NeverLeaveProgressCallback¶
Bases:
transformers.trainer_callback.ProgressCallback
Progress callback which never leaves the progress bar
- on_prediction_step(args, state, control, eval_dataloader=None, **kwargs)¶
Event called after a prediction step.
- on_train_begin(args, state, control, **kwargs)¶
Event called at the beginning of training.
- class scandeval.utils.TwolabelTrainer(split_point: int, **kwargs)¶
Bases:
transformers.trainer.Trainer
Trainer class which deals with two labels.
- compute_loss(model, inputs, return_outputs=False)¶
How the loss is computed by Trainer. By default, all models return the loss in the first element.
Subclass and override for custom behavior.
- scandeval.utils.block_terminal_output()¶
Blocks libraries from writing output to the terminal
- scandeval.utils.doc_inherit¶
alias of
scandeval.utils.DocInherit
- scandeval.utils.get_all_datasets() list ¶
Load a list of all datasets.
- Returns
First entry in each tuple is the short name of the dataset, second entry the long name, third entry the benchmark class and fourth entry the loading function.
- Return type
list of tuples
- scandeval.utils.is_module_installed(module: str) bool ¶
Check if a module is installed.
- Parameters
module (str) – The name of the module.
- Returns
Whether the module is installed or not.
- Return type
bool
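One common way to perform such a check with the standard library is `importlib.util.find_spec`. The function below is a sketch under that assumption and is not necessarily how scandeval implements `is_module_installed`; the name `is_module_installed_sketch` is hypothetical.

```python
import importlib.util


def is_module_installed_sketch(module: str) -> bool:
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(module) is not None


has_json = is_module_installed_sketch("json")
has_missing = is_module_installed_sketch("surely_not_a_real_module_xyz")
```

The sketch is intended for top-level module names; for a dotted name, `find_spec` imports the parent package first and raises `ModuleNotFoundError` if that parent is missing.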