evaluation

A set of functions for evaluating the results.

A call to the evaluate_results() function takes the output of the atoml tests (the predictions folder) and the algorithm-descriptions folder as input. From these, the module generates result metrics and plots. The results can be saved in an archive.

class atoml_cmp.evaluation.Algorithm(filename: str, path: Optional[str] = None, print_all: bool = True)[source]

Holds all information about one specific implementation tested on one specific dataset.

filename

name of the csv data file

Type

str

path

path to the csv file folder

Type

str

framework

framework of the implementation

Type

str

name

name of the implemented algorithm

Type

str

dataset_type

name of the dataset the implementation was tested with

Type

str

training_as_test

flag for the use of training data as test data

Type

bool

predictions

predicted labels of the implementation on the data set

Type

pd.Series

probabilities

tuple of the predicted probabilities for both classes, produced by the implementation on the data set

Type

pd.Series

actuals

actual labels of the data set

Type

pd.Series

class atoml_cmp.evaluation.Archive(name: Optional[str] = None, archive_folder=None, yaml_folder=None, pred_folder=None, test_folder=None, print_all: bool = True)[source]

Instance of an archive where all the generated metrics and plots can be saved.

The archive folder name is generated with a time stamp by default. Alternatively, a name can be given to the constructor. In addition to the evaluation results, the foundation can also be saved, i.e. the current yaml folder. Be aware that this can lead to inconsistencies if the folder’s content changes during runtime. Moreover, the predictions can be saved for reproducing the evaluations.

path

path of the archive

Type

str

print_all

if the flag is set, results are printed

Type

bool

archive_data_frame(df: pandas.core.frame.DataFrame, filename: str = 'result_table.csv', by_dataset: bool = False, by_algorithm: bool = False)[source]

Archives a dataframe in a csv file in the archive path.

Parameters
  • df – dataframe to be archived

  • filename – filename of the generated csv file

  • by_dataset – True if the dataframe is a view for a specific dataset and should be saved in a separate folder.

  • by_algorithm – True if the dataframe is a view for a specific algorithm type and should be saved in a separate folder. This is only done if by_dataset is False.
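As an illustrative sketch of this behaviour (not the module's actual code; the subfolder names by_dataset and by_algorithm are assumptions), the archiving logic could look like:

```python
import tempfile
from pathlib import Path

import pandas as pd


def archive_data_frame(df: pd.DataFrame, archive_path, filename="result_table.csv",
                       by_dataset=False, by_algorithm=False) -> Path:
    # Sketch: dataset and algorithm views go to their own subfolders;
    # the folder names here are assumptions for illustration.
    folder = Path(archive_path)
    if by_dataset:
        folder = folder / "by_dataset"
    elif by_algorithm:
        folder = folder / "by_algorithm"
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / filename
    df.to_csv(target, index=False)
    return target


df = pd.DataFrame({"metric": ["accuracy"], "value": [0.97]})
saved = archive_data_frame(df, tempfile.mkdtemp(), by_algorithm=True)
print(saved.parent.name)  # by_algorithm
```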

atoml_cmp.evaluation.chi2_statistic(pred1: pandas.core.series.Series, pred2: pandas.core.series.Series, print_all: bool = True)[source]

Performs the chi-squared test on two sets of prediction labels.

Parameters
  • pred1 – Series of predicted labels

  • pred2 – Series of predicted labels

  • print_all – if the flag is set, results are printed within the function

Returns: p-value of chi-squared test
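A comparable test can be reproduced with scipy by building a table of label counts for the two prediction series; this is a sketch of the idea, not the module's implementation:

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency


def chi2_pvalue(pred1: pd.Series, pred2: pd.Series) -> float:
    # Count how often each class label occurs in either prediction series
    # and test whether the two label distributions differ significantly.
    labels = sorted(set(pred1) | set(pred2))
    counts = np.array([[(pred1 == lab).sum() for lab in labels],
                       [(pred2 == lab).sum() for lab in labels]])
    _, p_value, _, _ = chi2_contingency(counts)
    return float(p_value)


p = chi2_pvalue(pd.Series([0, 0, 1, 1]), pd.Series([0, 1, 1, 1]))
p_same = chi2_pvalue(pd.Series([0, 0, 1, 1]), pd.Series([0, 0, 1, 1]))
```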

atoml_cmp.evaluation.compare_two_algorithms(x: atoml_cmp.evaluation.Algorithm, y: atoml_cmp.evaluation.Algorithm, df: pandas.core.frame.DataFrame, print_all: bool = True) pandas.core.frame.DataFrame[source]

Compares two prediction results and creates several metrics: the confusion matrix, the Kolmogorov–Smirnov test result, the chi-squared test result, and the accuracy of both prediction sets compared to the actual values.

Parameters
  • x – the results of the first algorithm implementation on one dataset

  • y – the results of the second algorithm implementation on one dataset

  • df – result overview dataframe with different metrics

  • print_all – if the flag is set, results are printed within the function

Returns: result overview dataframe with different metrics

atoml_cmp.evaluation.create_confusion_matrix(predictions1: pandas.core.series.Series, predictions2: pandas.core.series.Series, print_all: bool = True) Tuple[bool, numpy.ndarray][source]

Creates the confusion matrix between two sets of labels.

Parameters
  • predictions1 – Series of predicted labels

  • predictions2 – Series of predicted labels

  • print_all – if the flag is set, results are printed within the function

Returns

  • Equality flag - True, if predicted labels are identical

  • Confusion matrix
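The return values can be sketched with pandas alone (an illustration of the described behaviour, not the library code): identical predictions put all counts on the diagonal of the cross-tabulation.

```python
import pandas as pd


def confusion_and_equality(pred1: pd.Series, pred2: pd.Series):
    # Cross-tabulate the two label series to get the confusion matrix;
    # the equality flag is True when the predictions match element-wise.
    matrix = pd.crosstab(pred1, pred2).to_numpy()
    identical = bool(pred1.reset_index(drop=True).equals(pred2.reset_index(drop=True)))
    return identical, matrix


equal, cm = confusion_and_equality(pd.Series([0, 0, 1, 1]), pd.Series([0, 1, 1, 1]))
print(equal, cm.tolist())  # False [[1, 1], [0, 2]]
```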

atoml_cmp.evaluation.create_views_by_algorithm(df: Optional[pandas.core.frame.DataFrame] = None, csv_file: Optional[str] = None, archive: Optional[atoml_cmp.evaluation.Archive] = None, print_all: bool = True)[source]

Creates views on a dataframe, grouped by algorithm.

The function extracts smaller dataframes from a larger one, each containing only the comparison results for a single algorithm. It works either directly with a DataFrame as input or with the path of a csv file containing the input dataframe. If both are given, the DataFrame is used. The result can be shown and/or saved in an archive.

Parameters
  • df – DataFrame from which to create views for the single algorithms

  • csv_file – path to the csv file with the Dataframe from which to create views for the single algorithms

  • archive – archive instance which is used to save data

  • print_all – if the flag is set, results are printed within the function

Raises

RuntimeError – if neither a DataFrame nor a csv file is given as input.
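The splitting step amounts to a pandas groupby; a minimal sketch (the column name "algorithm" is an assumption about the result table's layout):

```python
import pandas as pd

# Hypothetical comparison-overview dataframe for illustration only.
df = pd.DataFrame({
    "algorithm": ["GaussianNB", "GaussianNB", "LogisticRegression"],
    "framework_pair": ["SKLEARN/WEKA", "SKLEARN/SPARK", "SKLEARN/WEKA"],
    "ks_pvalue": [0.98, 0.87, 0.64],
})

# One smaller dataframe (view) per algorithm name.
views = {name: view for name, view in df.groupby("algorithm")}
print(sorted(views))  # ['GaussianNB', 'LogisticRegression']
```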

atoml_cmp.evaluation.evaluate_results(prediction_folder: str, yaml_folder: Optional[str] = None, archive_folder: Optional[str] = None, gen_tests_folder: Optional[str] = None, print_all: bool = True) int[source]

Main function for the evaluation of the prediction csv files.

The function reads all csv files from the given folder, gathers metadata from the csv file names, and evaluates the content of the files. From this it creates various metrics and histograms, which can be saved in an archive folder together with the current yaml folder used for the csv file creation.

Parameters
  • prediction_folder – relative path to the folder with the prediction files

  • yaml_folder – relative path to the folder with the yaml definitions of the ML algorithms. If no folder is given, the yaml files will not be saved in the archive.

  • archive_folder – relative path to the folder where the archive should be saved. If no folder is given, no archive will be created.

  • gen_tests_folder – relative path to the folder where the test cases are located. This is only used for archiving. If no folder is given, the tests will not be stored in the archive.

  • print_all – if the flag is set, results are printed within the function

Returns: number of csv files read

atoml_cmp.evaluation.get_data_from_csv(filename: str, print_all: bool = True) Tuple[pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series][source]

Reads in a csv file with the specified format and extracts the different columns.

Parameters
  • filename – relative or absolute filepath

  • print_all – if the flag is set, results are printed within the function

Returns

  • predicted labels

  • predicted probability for class 0

  • predicted probability for class 1

  • actual labels
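Extracting the four series can be sketched with pandas; the column names used here ("actual", "prediction", "prob_0", "prob_1") are assumptions about the prediction file layout, chosen only for illustration:

```python
import io

import pandas as pd

# Stand-in for a prediction csv file; header names are assumed.
csv_text = io.StringIO(
    "actual,prediction,prob_0,prob_1\n"
    "0,0,0.9,0.1\n"
    "1,0,0.6,0.4\n"
    "1,1,0.2,0.8\n"
)

df = pd.read_csv(csv_text)
predictions = df["prediction"]   # predicted labels
prob_class0 = df["prob_0"]       # predicted probability for class 0
prob_class1 = df["prob_1"]       # predicted probability for class 1
actuals = df["actual"]           # actual labels
print(predictions.tolist())  # [0, 0, 1]
```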

atoml_cmp.evaluation.get_delta_of_scores(pred_prob1: pandas.core.series.Series, pred_prob2: pandas.core.series.Series) int[source]

Compares the scores (pred_prob) of two algorithms and calculates a delta value.

Parameters
  • pred_prob1 – prediction probabilities (scores) of the first algorithm

  • pred_prob2 – prediction probabilities (scores) of the second algorithm

Returns: number of results where the difference between the scores is greater than a defined epsilon
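A minimal sketch of this comparison (the epsilon value below is an assumption; the module defines its own tolerance):

```python
import pandas as pd

EPSILON = 0.1  # assumed tolerance, for illustration only


def get_delta_of_scores(prob1: pd.Series, prob2: pd.Series) -> int:
    # Count the positions where the two score series disagree
    # by more than epsilon.
    return int(((prob1 - prob2).abs() > EPSILON).sum())


n = get_delta_of_scores(pd.Series([0.10, 0.50, 0.90]),
                        pd.Series([0.15, 0.90, 0.20]))
print(n)  # 2 (differences: 0.05, 0.40, 0.70)
```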

atoml_cmp.evaluation.get_pred_file_metadata(filename: str) Tuple[str, str, str, bool][source]

Splits a prediction csv filename to get the information it contains.

Parameters

filename – The filename should consist of the keyword ‘pred’ and three identifiers: the framework (in capital letters), the algorithm, and the data set type used. Example: ‘pred_FRAMEWORK_Algorithm_TestDataType.csv’

Returns

  • Name of the framework

  • Name of the algorithm

  • Name of the dataset

  • Training data as test data flag

Raises
  • RuntimeError – if the filename is not a csv filename

  • RuntimeWarning – if the filename does not contain the expected number of components
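An illustrative parser for this naming convention (not the library's code; in particular, deriving the training-as-test flag from a dataset name containing "Train" is an assumption):

```python
from pathlib import Path


def parse_pred_filename(filename: str):
    # Sketch of the 'pred_FRAMEWORK_Algorithm_TestDataType.csv' convention.
    name = Path(filename).name
    if not name.endswith(".csv"):
        raise RuntimeError(f"not a csv filename: {filename}")
    parts = name[:-len(".csv")].split("_")
    if len(parts) != 4 or parts[0] != "pred":
        raise RuntimeWarning("filename does not match the expected pattern")
    _, framework, algorithm, dataset = parts
    # Assumption: training-as-test runs carry 'Train' in the dataset name.
    training_as_test = "train" in dataset.lower()
    return framework, algorithm, dataset, training_as_test


print(parse_pred_filename("pred_SKLEARN_LogisticRegression_Uniform.csv"))
```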

atoml_cmp.evaluation.ks_statistic(pred_prob1: pandas.core.series.Series, pred_prob2: pandas.core.series.Series, print_all: bool = True) Tuple[float, float][source]

Performs the Kolmogorov–Smirnov test on two probability distributions.

Parameters
  • pred_prob1 – Series of predicted probabilities (scores)

  • pred_prob2 – Series of predicted probabilities (scores)

  • print_all – if the flag is set, results are printed within the function

Returns

  • p-value of KS test

  • KS test statistic
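The two return values match what scipy's two-sample KS test provides; a sketch under that assumption:

```python
import pandas as pd
from scipy.stats import ks_2samp


def ks_pvalue_and_statistic(prob1: pd.Series, prob2: pd.Series):
    # Two-sample Kolmogorov-Smirnov test on the score distributions.
    result = ks_2samp(prob1, prob2)
    return result.pvalue, result.statistic


same = pd.Series([0.2, 0.4, 0.6, 0.8])
p_value, statistic = ks_pvalue_and_statistic(same, same)
print(statistic)  # 0.0 for identical samples
```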

atoml_cmp.evaluation.plot_probabilities(algorithms: List[atoml_cmp.evaluation.Algorithm], archive: Optional[atoml_cmp.evaluation.Archive] = None, show_plot=False, print_all: bool = True)[source]

Plots the probability distributions of a list of implementations.

Parameters
  • algorithms – list of implementation instances whose probabilities are to be plotted

  • archive – archive instance which is used to save data

  • show_plot – if the flag is set, the plot is displayed

  • print_all – if the flag is set, results are printed within the function
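Such a distribution plot can be sketched with matplotlib (the data below is synthetic and the framework labels are hypothetical):

```python
import matplotlib
matplotlib.use("Agg")  # off-screen rendering, no display required
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical class-1 scores from two implementations of the same algorithm.
rng = np.random.default_rng(0)
scores_a = rng.beta(2, 5, size=200)
scores_b = rng.beta(2, 5, size=200)

fig, ax = plt.subplots()
ax.hist(scores_a, bins=20, alpha=0.5, label="framework A")
ax.hist(scores_b, bins=20, alpha=0.5, label="framework B")
ax.set_xlabel("predicted probability for class 1")
ax.set_ylabel("count")
ax.legend()
fig.savefig("probabilities.png")
```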

atoml_cmp.evaluation.set_pandas_print_full_df()[source]

Sets the pandas options to print a full dataframe without truncating rows or columns.
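This typically amounts to lifting pandas' display limits; a sketch of the relevant options (the exact set the module changes may differ):

```python
import pandas as pd

# Remove pandas' display limits so print(df) shows every row and column.
pd.set_option("display.max_rows", None)
pd.set_option("display.max_columns", None)
pd.set_option("display.width", None)
pd.set_option("display.max_colwidth", None)
```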