evaluation
Set of functions needed for the evaluation of the results.
A call to the evaluate_results() function takes the output of the atoml tests (the predictions folder) and the algorithm-descriptions folder as input. From that it generates result metrics and plots. The results can be saved in an archive.
- class atoml_cmp.evaluation.Algorithm(filename: str, path: Optional[str] = None, print_all: bool = True)[source]
Holds all information about one specific implementation tested on one specific dataset.
- filename
csv file name of the data file
- Type
str
- path
path to the csv file folder
- Type
str
- framework
framework of the implementation
- Type
str
- name
name of the implemented algorithm
- Type
str
- dataset_type
name of the dataset the implementation was tested with
- Type
str
- training_as_test
flag for the use of training data as test data
- Type
bool
- predictions
predicted labels of the implementation on the data set
- Type
pd.Series
- probabilities
tuple of predicted probabilities for both classes of the implementation on the data set
- Type
pd.Series
- actuals
actual labels of the data set
- Type
pd.Series
- class atoml_cmp.evaluation.Archive(name: Optional[str] = None, archive_folder=None, yaml_folder=None, pred_folder=None, test_folder=None, print_all: bool = True)[source]
Instance of an archive where all the generated metrics and plots can be saved.
The archive folder name is generated with a time stamp by default; alternatively, a name can be given to the constructor. In addition to the evaluation results, the underlying inputs can be saved as well, i.e. the current yaml folder. Be aware that this can lead to inconsistencies if the folder’s content changes during runtime. Moreover, the predictions can be saved so that the evaluations can be reproduced.
- path
path of the archive
- Type
str
- print_all
if set, results are printed by the functions
- Type
bool
- archive_data_frame(df: pandas.core.frame.DataFrame, filename: str = 'result_table.csv', by_dataset: bool = False, by_algorithm: bool = False)[source]
Archives a dataframe in a csv file in the archive path.
- Parameters
df – dataframe to be archived
filename – filename of the generated csv file
by_dataset – True if the dataframe is a view for a specific dataset and should be saved in a separate folder.
by_algorithm – True if the dataframe is a view for a specific algorithm type and should be saved in a separate folder. This is only done if by_dataset is False.
- atoml_cmp.evaluation.chi2_statistic(pred1: pandas.core.series.Series, pred2: pandas.core.series.Series, print_all: bool = True)[source]
Performs the chi-squared test on two sets of prediction labels.
- Parameters
pred1 – Series of predicted labels
pred2 – Series of predicted labels
print_all – if set, results are printed by the function
Returns: p-value of the chi-squared test
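The underlying computation can be approximated with scipy. This is a sketch under the assumption that the test compares the class-count distributions of the two label sets; the library’s internals may differ:

```python
import pandas as pd
from scipy.stats import chisquare

# Two hypothetical sets of predicted labels from different implementations
pred1 = pd.Series([0, 1, 1, 0, 1, 0, 1, 1])
pred2 = pd.Series([0, 1, 0, 0, 1, 0, 1, 1])

# Compare the class-count distributions of the two label sets
counts1 = pred1.value_counts().sort_index()
counts2 = pred2.value_counts().sort_index()
_, p_value = chisquare(f_obs=counts1, f_exp=counts2)
print(p_value)  # a high p-value means no significant difference in label counts
```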
- atoml_cmp.evaluation.compare_two_algorithms(x: atoml_cmp.evaluation.Algorithm, y: atoml_cmp.evaluation.Algorithm, df: pandas.core.frame.DataFrame, print_all: bool = True) pandas.core.frame.DataFrame[source]
Compares two prediction results and creates different metrics: the confusion matrix, the Kolmogorov–Smirnov test result, the chi-squared test result, and the accuracy of both prediction sets compared to the actual values.
- Parameters
x – the results for the first algorithm implementation on one dataset
y – the results for the second algorithm implementation on the same dataset
df – result overview dataframe with different metrics
print_all – if set, results are printed by the function
Returns: result overview dataframe with different metrics
- atoml_cmp.evaluation.create_confusion_matrix(predictions1: pandas.core.series.Series, predictions2: pandas.core.series.Series, print_all: bool = True) Tuple[bool, numpy.ndarray][source]
Creates the confusion matrix between 2 sets of labels.
- Parameters
predictions1 – Series of predicted labels
predictions2 – Series of predicted labels
print_all – if set, results are printed by the function
- Returns
Equality flag - True, if predicted labels are identical
Confusion matrix
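A minimal sketch of such a comparison with hypothetical data (not the library’s actual code):

```python
import numpy as np
import pandas as pd

pred1 = pd.Series([0, 1, 1, 0, 1])
pred2 = pd.Series([0, 1, 0, 0, 1])

# Count how often each label pair (pred1, pred2) occurs
matrix = np.zeros((2, 2), dtype=int)
for a, b in zip(pred1, pred2):
    matrix[a, b] += 1

equal = bool((pred1 == pred2).all())
print(equal)   # → False: the two label sets differ in one position
print(matrix)  # off-diagonal entries count the disagreements
```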
- atoml_cmp.evaluation.create_views_by_algorithm(df: Optional[pandas.core.frame.DataFrame] = None, csv_file: Optional[str] = None, archive: Optional[atoml_cmp.evaluation.Archive] = None, print_all: bool = True)[source]
Creates views on a dataframe, split by algorithm.
The function extracts, out of a bigger dataframe, smaller dataframes that only contain comparison results for the same algorithm. It works either directly with a DataFrame as input or with the path of a csv file containing the input dataframe. If both are given, the DataFrame is used. The result can be shown and/or saved in an archive.
- Parameters
df – DataFrame from which to create views for the single algorithms
csv_file – path to the csv file with the Dataframe from which to create views for the single algorithms
archive – archive instance which is used to save data
print_all – if set, results are printed by the function
- Raises
RuntimeError – if neither Dataframe nor csv file are given as input.
- atoml_cmp.evaluation.evaluate_results(prediction_folder: str, yaml_folder: Optional[str] = None, archive_folder: Optional[str] = None, gen_tests_folder: Optional[str] = None, print_all: bool = True) int[source]
Main function for the evaluation of the prediction csv files.
The function reads in all csv files from the given folder, gathers metadata from the csv file names, and evaluates the files’ content. For that it creates different metrics and histograms, which can be saved in an archive folder together with the current yaml folder used for the csv file creation.
- Parameters
prediction_folder – relative path to the folder with the prediction files
yaml_folder – relative path to the folder with the yaml definitions of the ML algorithms. If no folder is given, the yaml files will not be saved in the archive.
archive_folder – relative path to the folder where the archive should be saved. If no folder is given, no archive will be created.
gen_tests_folder – relative path to the folder where the test cases are located. This is only for the archiving. If no folder is given, the tests will not be stored in archive.
print_all – if set, results are printed by the function
Returns: number of csv files read
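A typical invocation might look as follows. This is a sketch: the folder names are taken from the module description above and assume the standard atoml_cmp project layout with atoml_cmp installed:

```python
from atoml_cmp.evaluation import evaluate_results

# Evaluate all prediction csv files and archive the results
num_files = evaluate_results(
    prediction_folder="predictions",
    yaml_folder="algorithm-descriptions",
    archive_folder="archive",
    print_all=True,
)
print(f"read {num_files} csv files")
```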
- atoml_cmp.evaluation.get_data_from_csv(filename: str, print_all: bool = True) Tuple[pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series, pandas.core.series.Series][source]
Reads in a csv file with the specified format and extracts the different columns.
- Parameters
filename – relative or absolute filepath
print_all – if set, results are printed by the function
- Returns
predicted labels
predicted probability for class 0
predicted probability for class 1
actual labels
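The column extraction can be sketched with pandas. The header names below are hypothetical; the actual names are defined by the atoml prediction format:

```python
import io
import pandas as pd

# Hypothetical csv content mimicking a prediction file
csv_text = """prediction,prob_0,prob_1,actual
0,0.8,0.2,0
1,0.3,0.7,1
0,0.6,0.4,1
"""

df = pd.read_csv(io.StringIO(csv_text))
predictions = df["prediction"]   # predicted labels
prob_class0 = df["prob_0"]       # predicted probability for class 0
prob_class1 = df["prob_1"]       # predicted probability for class 1
actuals = df["actual"]           # actual labels
print(list(predictions))  # → [0, 1, 0]
```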
- atoml_cmp.evaluation.get_delta_of_scores(pred_prob1: pandas.core.series.Series, pred_prob2: pandas.core.series.Series) int[source]
Compares the scores (pred_prob) of two algorithms and calculates a delta value.
- Parameters
pred_prob1 – prediction probabilities (scores) of the first algorithm
pred_prob2 – prediction probabilities (scores) of the second algorithm
Returns: number of results where the difference between the scores is greater than a defined epsilon
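The comparison can be sketched as follows. EPSILON is a placeholder here; the module’s actual tolerance is not documented in this section:

```python
import pandas as pd

EPSILON = 1e-6  # hypothetical tolerance

scores1 = pd.Series([0.90, 0.50, 0.10, 0.75])
scores2 = pd.Series([0.90, 0.55, 0.10, 0.75])

# Count predictions whose scores differ by more than EPSILON
delta = int(((scores1 - scores2).abs() > EPSILON).sum())
print(delta)  # → 1
```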
- atoml_cmp.evaluation.get_pred_file_metadata(filename: str) Tuple[str, str, str, bool][source]
Splits a prediction csv filename to get the information it contains.
- Parameters
filename – The filename should consist of the keyword ‘pred’ and three identifiers: the framework (in capital letters), the algorithm, and the dataset type used. Example: ‘pred_FRAMEWORK_Algorithm_TestDataType.csv’
- Returns
Name of the framework
Name of the algorithm
Name of the dataset
Training data as test data flag
- Raises
RuntimeError – if the filename is not a csv filename
RuntimeWarning – if the filename does not contain the expected number of parts
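The naming convention can be illustrated with a small stand-alone parser. This is a sketch, not the library’s code; in particular, how the training-as-test flag is derived from the dataset identifier is an assumption:

```python
from typing import Tuple

def parse_pred_filename(filename: str) -> Tuple[str, str, str, bool]:
    """Split 'pred_FRAMEWORK_Algorithm_TestDataType.csv' into its parts."""
    if not filename.endswith(".csv"):
        raise RuntimeError(f"not a csv filename: {filename}")
    parts = filename[: -len(".csv")].split("_")
    if len(parts) != 4 or parts[0] != "pred":
        raise RuntimeWarning(f"unexpected number of parts: {filename}")
    _, framework, algorithm, dataset = parts
    # Assumed heuristic for the training-as-test flag
    training_as_test = "train" in dataset.lower()
    return framework, algorithm, dataset, training_as_test

print(parse_pred_filename("pred_SKLEARN_LogisticRegression_Uniform.csv"))
# → ('SKLEARN', 'LogisticRegression', 'Uniform', False)
```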
- atoml_cmp.evaluation.ks_statistic(pred_prob1: pandas.core.series.Series, pred_prob2: pandas.core.series.Series, print_all: bool = True) Tuple[float, float][source]
Performs the Kolmogorov–Smirnov test on two probability distributions.
- Parameters
pred_prob1 – Series of predicted probabilities (scores)
pred_prob2 – Series of predicted probabilities (scores)
print_all – if set, results are printed by the function
- Returns
p-value of KS test
KS test statistic
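The test corresponds to scipy’s two-sample KS test; a sketch with synthetic scores standing in for predicted probabilities:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)
scores1 = rng.uniform(size=200)  # stand-ins for predicted probabilities
scores2 = rng.uniform(size=200)

statistic, p_value = ks_2samp(scores1, scores2)
print(statistic, p_value)  # small statistic / large p-value: similar distributions
```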
- atoml_cmp.evaluation.plot_probabilities(algorithms: List[atoml_cmp.evaluation.Algorithm], archive: Optional[atoml_cmp.evaluation.Archive] = None, show_plot=False, print_all: bool = True)[source]
Plots the probability distributions of a list of implementations.
- Parameters
algorithms – list of implementation instances whose probabilities are to be plotted
archive – archive instance which is used to save data
show_plot – if set, the plot is displayed
print_all – if set, results are printed by the function