evaluation

A set of functions for evaluating the results.

A call to the evaluate_results() function takes the output of the atoml tests (the predictions folder) and the algorithm-descriptions folder as input. From these, the module generates result metrics and plots. The results can be saved in an archive.

class atoml_cmp.evaluation.Algorithm(filename: str, path: Optional[str] = None, print_all: bool = True)[source]

Holds all information about one specific implementation tested on one specific dataset.

filename

name of the csv data file

Type

str

path

path to the csv file folder

Type

str

framework

framework of the implementation

Type

str

name

name of the implemented algorithm

Type

str

dataset_type

name of the dataset the implementation was tested with

Type

str

training_as_test

flag for the use of training data as test data

Type

bool

predictions

predicted labels of the implementation on the data set

Type

pd.Series

probabilities

tuple of the predicted probabilities for both classes, produced by the implementation on the data set

Type

pd.Series

actuals

actual labels of the data set

Type

pd.Series

class atoml_cmp.evaluation.Archive(name: Optional[str] = None, archive_folder=None, yaml_folder=None, pred_folder=None, test_folder=None, print_all: bool = True)[source]

Instance of an archive where all the generated metrics and plots can be saved.

The archive folder name is generated with a time stamp by default. Alternatively, a name can be given to the constructor. In addition to the evaluation results, the foundation can also be saved, i.e. the current yaml folder. Be aware that this can lead to inconsistencies if the folder’s content changes during runtime. Moreover, the predictions can be saved for reproducing the evaluations.

path

path of the archive

Type

str

print_all

if the flag is set, results are printed

Type

bool

archive_data_frame(df: pandas.core.frame.DataFrame, filename: str = 'result_table.csv', by_dataset: bool = False, by_algorithm: bool = False)[source]

Archives a dataframe in a csv file in the archive path.

Parameters
  • df – dataframe to be archived

  • filename – filename of the generated csv file

  • by_dataset – True if the dataframe is a view for a specific dataset and should be saved in a separate folder.

  • by_algorithm – True if the dataframe is a view for a specific algorithm type and should be saved in a separate folder. This is only done if by_dataset is False.
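As an illustrative sketch of this behaviour (not the module's actual code; the subfolder names by_dataset and by_algorithm are assumptions), the archiving logic could look like:

```python
import tempfile
from pathlib import Path

import pandas as pd


def archive_data_frame(df: pd.DataFrame, archive_path, filename="result_table.csv",
                       by_dataset=False, by_algorithm=False) -> Path:
    # Sketch: dataset and algorithm views go to their own subfolders;
    # the folder names here are assumptions for illustration.
    folder = Path(archive_path)
    if by_dataset:
        folder = folder / "by_dataset"
    elif by_algorithm:
        folder = folder / "by_algorithm"
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / filename
    df.to_csv(target, index=False)
    return target


df = pd.DataFrame({"metric": ["accuracy"], "value": [0.97]})
saved = archive_data_frame(df, tempfile.mkdtemp(), by_algorithm=True)
print(saved.parent.name)  # by_algorithm
```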

atoml_cmp.evaluation.chi2_statistic(pred1: pandas.core.series.Series, pred2: pandas.core.series.Series, print_all: bool = True)[source]

Performs the chi-squared test on two sets of prediction labels.

Parameters
  • pred1 – Series of predicted labels

  • pred2 – Series of predicted labels

  • print_all – if the flag is set, results are printed within the function

Returns: p-value of chi-squared test
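A comparable test can be reproduced with scipy by building a table of label counts for the two prediction series; this is a sketch of the idea, not the module's implementation:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def chi2_pvalue(pred1: pd.Series, pred2: pd.Series) -> float:
    # Count how often each class label occurs in either prediction series
    # and test whether the two label distributions differ significantly.
    labels = sorted(set(pred1) | set(pred2))
    counts = np.array([[(pred1 == lab).sum() for lab in labels],
                       [(pred2 == lab).sum() for lab in labels]])
    _, p_value, _, _ = chi2_contingency(counts)
    return float(p_value)


p = chi2_pvalue(pd.Series([0, 0, 1, 1]), pd.Series([0, 1, 1, 1]))
p_same = chi2_pvalue(pd.Series([0, 0, 1, 1]), pd.Series([0, 0, 1, 1]))
```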

atoml_cmp.evaluation.compare_two_algorithms(x: atoml_cmp.evaluation.Algorithm, y: atoml_cmp.evaluation.Algorithm, df: pandas.core.frame.DataFrame, print_all: bool = True) pandas.core.frame.DataFrame[source]

Compares two prediction results and creates several metrics: the confusion matrix, the Kolmogorov–Smirnov test result, the chi-squared test result, and the accuracy of both prediction sets compared to the actual values.

Parameters
  • x – the results of the first algorithm implementation on one dataset

  • y – the results of the second algorithm implementation on one dataset

  • df – result overview dataframe with different metrics

  • print_all – if the flag is set, results are printed within the function

Returns: result overview dataframe with different metrics

atoml_cmp.evaluation.create_confusion_matrix(predictions1: pandas.core.series.Series, predictions2: pandas.core.series.Series, print_all: bool = True) Tuple[bool, numpy.ndarray][source]

Creates the confusion matrix between two sets of labels.

Parameters
  • predictions1 – Series of predicted labels

  • predictions2 – Series of predicted labels

  • print_all – if the flag is set, results are printed within the function

Returns

  • Equality flag - True, if predicted labels are identical

  • Confusion matrix
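The return values can be sketched with pandas alone (an illustration of the described behaviour, not the library code): identical predictions put all counts on the diagonal of the cross-tabulation.

```python
import pandas as pd


def confusion_and_equality(pred1: pd.Series, pred2: pd.Series):
    # Cross-tabulate the two label series to get the confusion matrix;
    # the equality flag is True when the predictions match element-wise.
    matrix = pd.crosstab(pred1, pred2).to_numpy()
    identical = bool(pred1.reset_index(drop=True).equals(pred2.reset_index(drop=True)))
    return identical, matrix


equal, cm = confusion_and_equality(pd.Series([0, 0, 1, 1]), pd.Series([0, 1, 1, 1]))
print(equal, cm.tolist())  # False [[1, 1], [0, 2]]
```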

atoml_cmp.evaluation.create_views_by_algorithm(df: Optional[pandas.core.frame.DataFrame] = None, csv_file: Optional[str] = None, archive: Optional[atoml_cmp.evaluation.Archive] = None, print_all: bool = True)[source]

Creates views on a dataframe, grouped by algorithm.

The function extracts smaller dataframes from a larger one, each containing only the comparison results for a single algorithm. It works either directly with a DataFrame as input or with the path of a csv file containing the input dataframe. If both are given, the DataFrame is used. The result can be shown and/or saved in an archive.

Parameters
  • df – DataFrame from which to create views for the single algorithms

  • csv_file – path to the csv file with the Dataframe from which to create views for the single algorithms

  • archive – archive instance which is used to save data

  • print_all – if the flag is set, results are printed within the function

Raises

RuntimeError – if neither a DataFrame nor a csv file is given as input.
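The splitting step amounts to a pandas groupby; a minimal sketch (the column name "algorithm" is an assumption about the result table's layout):

```python
import pandas as pd

# Hypothetical comparison-overview dataframe for illustration only.
df = pd.DataFrame({
    "algorithm": ["GaussianNB", "GaussianNB", "LogisticRegression"],
    "framework_pair": ["SKLEARN/WEKA", "SKLEARN/SPARK", "SKLEARN/WEKA"],
    "ks_pvalue": [0.98, 0.87, 0.64],
})

# One smaller dataframe (view) per algorithm name.
views = {name: view for name, view in df.groupby("algorithm")}
print(sorted(views))  # ['GaussianNB', 'LogisticRegression']
```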

atoml_cmp.evaluation.evaluate_results(prediction_folder: str, yaml_folder: Optional[str] = None, archive_folder: Optional[str] = None, gen_tests_folder: Optional[str] = None, print_all: bool = True) int[source]

Main function for the evaluation of the prediction csv files.

The function reads all csv files from the given folder, gathers metadata from the csv file names, and evaluates the content of the files. From this it creates various metrics and histograms, which can be saved in an archive folder together with the current yaml folder used for the csv file creation.

Parameters
  • prediction_folder – relative path to the folder with the prediction files

  • yaml_folder – relative path to the folder with the yaml definitions of the ML algorithms. If no folder is given, the yaml files will not be saved in the archive.

  • archive_folder – relative path to the folder where the archive should be saved. If no folder is given, no archive will be created.

  • gen_tests_folder – relative path to the folder where the test cases are located. This is only used for archiving. If no folder is given, the tests will not be stored in the archive.

  • print_all – if the flag is set, results are printed within the function

Returns: number of csv files read

atoml_cmp.evaluation.get_data_from_csv(filename: str, print_all: bool = True) Tuple[pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series][source]

Reads in a csv file with the specified format and extracts the different columns.

Parameters
  • filename – relative or absolute filepath

  • print_all – if the flag is set, results are printed within the function

Returns

  • predicted labels

  • predicted probability for class 0

  • predicted probability for class 1

  • actual labels
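Extracting the four series can be sketched with pandas; the column names used here ("actual", "prediction", "prob_0", "prob_1") are assumptions about the prediction file layout, chosen only for illustration:

```python
import io

import pandas as pd

# Stand-in for a prediction csv file; header names are assumed.
csv_text = io.StringIO(
    "actual,prediction,prob_0,prob_1\n"
    "0,0,0.9,0.1\n"
    "1,0,0.6,0.4\n"
    "1,1,0.2,0.8\n"
)

df = pd.read_csv(csv_text)
predictions = df["prediction"]   # predicted labels
prob_class0 = df["prob_0"]       # predicted probability for class 0
prob_class1 = df["prob_1"]       # predicted probability for class 1
actuals = df["actual"]           # actual labels
print(predictions.tolist())  # [0, 0, 1]
```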

atoml_cmp.evaluation.get_delta_of_scores(pred_prob1: pandas.core.series.Series, pred_prob2: pandas.core.series.Series) int[source]

Compares the scores (pred_prob) of two algorithms and calculates a delta value.

Parameters
  • pred_prob1 – prediction probabilities (scores) of the first algorithm

  • pred_prob2 – prediction probabilities (scores) of the second algorithm

Returns: number of results where the difference between the scores is greater than a defined epsilon
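A minimal sketch of this comparison (the epsilon value below is an assumption; the module defines its own tolerance):

```python
import pandas as pd

EPSILON = 0.1  # assumed tolerance, for illustration only


def get_delta_of_scores(prob1: pd.Series, prob2: pd.Series) -> int:
    # Count the positions where the two score series disagree
    # by more than epsilon.
    return int(((prob1 - prob2).abs() > EPSILON).sum())


n = get_delta_of_scores(pd.Series([0.10, 0.50, 0.90]),
                        pd.Series([0.15, 0.90, 0.20]))
print(n)  # 2 (differences: 0.05, 0.40, 0.70)
```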

atoml_cmp.evaluation.get_pred_file_metadata(filename: str) Tuple[str, str, str, bool][source]

Splits a prediction csv filename to get the information it contains.

Parameters

filename – The filename should consist of the keyword ‘pred’ and three identifiers: the framework (in capital letters), the algorithm, and the data set type used. Example: ‘pred_FRAMEWORK_Algorithm_TestDataType.csv’

Returns

  • Name of the framework

  • Name of the algorithm

  • Name of the dataset

  • Training data as test data flag

Raises
  • RuntimeError – if the filename is not a csv filename

  • RuntimeWarning – if the filename does not contain the expected number of components
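An illustrative parser for this naming convention (not the library's code; in particular, deriving the training-as-test flag from a dataset name containing "Train" is an assumption):

```python
from pathlib import Path


def parse_pred_filename(filename: str):
    # Sketch of the 'pred_FRAMEWORK_Algorithm_TestDataType.csv' convention.
    name = Path(filename).name
    if not name.endswith(".csv"):
        raise RuntimeError(f"not a csv filename: {filename}")
    parts = name[:-len(".csv")].split("_")
    if len(parts) != 4 or parts[0] != "pred":
        raise RuntimeWarning("filename does not match the expected pattern")
    _, framework, algorithm, dataset = parts
    # Assumption: training-as-test runs carry 'Train' in the dataset name.
    training_as_test = "train" in dataset.lower()
    return framework, algorithm, dataset, training_as_test


print(parse_pred_filename("pred_SKLEARN_LogisticRegression_Uniform.csv"))
```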

atoml_cmp.evaluation.ks_statistic(pred_prob1: pandas.core.series.Series, pred_prob2: pandas.core.series.Series, print_all: bool = True) Tuple[float, float][source]

Performs the Kolmogorov–Smirnov test on two probability distributions.

Parameters
  • pred_prob1 – Series of predicted probabilities (scores)

  • pred_prob2 – Series of predicted probabilities (scores)

  • print_all – if the flag is set, results are printed within the function

Returns

  • p-value of KS test

  • KS test statistic
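The two return values match what scipy's two-sample KS test provides; a sketch under that assumption:

```python
import pandas as pd
from scipy.stats import ks_2samp


def ks_pvalue_and_statistic(prob1: pd.Series, prob2: pd.Series):
    # Two-sample Kolmogorov-Smirnov test on the score distributions.
    result = ks_2samp(prob1, prob2)
    return result.pvalue, result.statistic


same = pd.Series([0.2, 0.4, 0.6, 0.8])
p_value, statistic = ks_pvalue_and_statistic(same, same)
print(statistic)  # 0.0 for identical samples
```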

atoml_cmp.evaluation.plot_probabilities(algorithms: List[atoml_cmp.evaluation.Algorithm], archive: Optional[atoml_cmp.evaluation.Archive] = None, show_plot=False, print_all: bool = True)[source]

Plots the probability distributions of a list of implementations.

Parameters
  • algorithms – list of implementation instances whose probabilities are to be plotted

  • archive – archive instance which is used to save data

  • show_plot – if the flag is set, the plot is displayed

  • print_all – if the flag is set, results are printed within the function
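Such a distribution plot can be sketched with matplotlib (the data below is synthetic and the framework labels are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering, no display required
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical class-1 scores from two implementations of the same algorithm.
rng = np.random.default_rng(0)
scores_a = rng.beta(2, 5, size=200)
scores_b = rng.beta(2, 5, size=200)

fig, ax = plt.subplots()
ax.hist(scores_a, bins=20, alpha=0.5, label="framework A")
ax.hist(scores_b, bins=20, alpha=0.5, label="framework B")
ax.set_xlabel("predicted probability for class 1")
ax.set_ylabel("count")
ax.legend()
fig.savefig("probabilities.png")
```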

atoml_cmp.evaluation.set_pandas_print_full_df()[source]

Sets the pandas options to print a full dataframe without truncating rows or columns.
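This typically amounts to lifting pandas' display limits; a sketch of the relevant options (the exact set the module changes may differ):

```python
import pandas as pd

# Remove pandas' display limits so print(df) shows every row and column.
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)
pd.set_option("display.max_colwidth", None)
```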