IO#

A module dedicated to various IO operations.

Users are encouraged to use this when trying to import confusion matrices from disk.

File Formats#

load_csv #

load_csv(
    location: str | Path,
    encoding: str = "utf-8",
    newline: str = "\n",
    dialect: str = "excel",
    delimiter: str = ",",
    lineterminator: str = "\r\n",
    dtype: DTypeLike = int64,
) -> Int[ndarray, " num_classes num_classes"]

Loads a CSV file into memory, and parses it as if it were a valid confusion matrix.

Parameters:

location (str | Path) –

the location of the CSV file containing the confusion matrix
encoding (str, default: 'utf-8' ) –

the encoding of the confusion matrix file
newline (str, default: '\n' ) –

the newline character used in the confusion matrix file
dialect (str, default: 'excel' ) –

the csv dialect, passed to csv.reader
delimiter (str, default: ',' ) –

the csv delimiter character, passed to csv.reader
lineterminator (str, default: '\r\n' ) –

the csv lineterminator character, passed to csv.reader
dtype (DTypeLike, default: int64 ) –

the desired dtype of the numpy array. Defaults to int64.

Returns:

Int[ndarray, ' num_classes num_classes'] –

the parsed confusion matrix
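To make the parameters above concrete, here is a minimal sketch of how such a loader could be put together from the standard `csv` module and NumPy. The name `load_csv_sketch` is hypothetical and the body is an assumption about the general approach, not the library's actual implementation; only the parameter names and defaults are taken from the signature above.

```python
import csv
import tempfile
from pathlib import Path

import numpy as np


def load_csv_sketch(
    location,
    encoding="utf-8",
    newline="\n",
    dialect="excel",
    delimiter=",",
    lineterminator="\r\n",
    dtype=np.int64,
):
    """Hypothetical stand-in for load_csv; not the library's code."""
    # encoding and newline are forwarded to open(); the remaining
    # options are forwarded to csv.reader.
    with Path(location).open("r", encoding=encoding, newline=newline) as file:
        reader = csv.reader(
            file,
            dialect=dialect,
            delimiter=delimiter,
            lineterminator=lineterminator,
        )
        # Skip empty rows and convert every cell to an integer count.
        rows = [[int(cell) for cell in row] for row in reader if row]
    return np.array(rows, dtype=dtype)


# Write a small 2-class confusion matrix to disk and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "confusion_matrix.csv"
    csv_path.write_text("5,1\n2,7\n", encoding="utf-8")
    matrix = load_csv_sketch(csv_path)

print(matrix)
```

The sketch only parses the file; checking that the result is a valid confusion matrix is a separate step, described under Utilities.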

Utilities#

validate_confusion_matrix #

validate_confusion_matrix(
    confusion_matrix: Int[ArrayLike, " num_classes num_classes"],
    dtype: DTypeLike = int64,
) -> Int[ndarray, " num_classes num_classes"]

Validates a confusion matrix so that malformed input is caught early rather than causing problems later.

For a confusion matrix to be valid, it:

1. Must be a square matrix (i.e., a two-dimensional array with as many rows as columns)
2. Must contain only non-negative integers
3. Must contain at least 2 classes
4. Must have at least one record for each ground-truth class
5. Should have at least one record for each predicted class

Parameters:

confusion_matrix (Int[ArrayLike, 'num_classes num_classes']) –

the confusion matrix
dtype (DTypeLike, default: int64 ) –

the desired dtype of the numpy array. Defaults to int64.

Returns:

Int[ndarray, ' num_classes num_classes'] –

the validated confusion matrix as a numpy ndarray
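The five conditions listed for `validate_confusion_matrix` can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the name `validate_sketch` is hypothetical, the "must" conditions are assumed to raise a `ValueError`, the "should" condition is assumed to emit only a warning, and zero counts are assumed to be allowed in individual cells.

```python
import warnings

import numpy as np


def validate_sketch(confusion_matrix, dtype=np.int64):
    """Hypothetical stand-in for validate_confusion_matrix."""
    arr = np.asarray(confusion_matrix)

    # 1. Square, two-dimensional array.
    if arr.ndim != 2 or arr.shape[0] != arr.shape[1]:
        raise ValueError(f"Expected a square 2D array, got shape {arr.shape}.")
    # 2. Integer counts only, none of them negative (assumption: zeros are fine).
    if not np.issubdtype(arr.dtype, np.integer):
        raise ValueError(f"Expected an integer dtype, got {arr.dtype}.")
    if (arr < 0).any():
        raise ValueError("Counts must not be negative.")
    # 3. At least two classes.
    if arr.shape[0] < 2:
        raise ValueError("Expected at least 2 classes.")
    # 4. Every ground-truth class (row) has at least one record.
    if (arr.sum(axis=1) == 0).any():
        raise ValueError("Some ground-truth classes have no records.")
    # 5. Every predicted class (column) should have at least one record.
    if (arr.sum(axis=0) == 0).any():
        warnings.warn("Some classes are never predicted.")

    return arr.astype(dtype)


validated = validate_sketch([[5, 1], [2, 7]])
print(validated)
```

Rows are treated as the ground-truth (condition) axis and columns as the prediction axis, matching the convention stated for the conversion utilities on this page.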

pred_cond_to_confusion_matrix #

pred_cond_to_confusion_matrix(
    pred_cond: Int[ndarray, " num_samples 2"], *, pred_first: bool = True
) -> Int[ndarray, " num_classes num_classes"]

Converts an array-like of (model prediction, ground truth) pairs into an unnormalized confusion matrix.

The resulting confusion matrix always has the predictions on the columns and the condition (ground truth) on the rows.

Parameters:

pred_cond (Int[ndarray, ' num_samples 2']) –

the array-like collection of (prediction, ground truth) pairs, one pair per sample
pred_first (bool, default: True ) –

whether the first column holds the model prediction (True) or the ground-truth label (False). Defaults to True.

Returns:

Int[ndarray, ' num_classes num_classes'] –

the unnormalized confusion matrix, with the condition on the rows and the predictions on the columns
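As an illustration of the conversion, the sketch below counts each pair into the cell indexed by (condition, prediction). The name `pred_cond_to_cm_sketch` is hypothetical, and the sketch assumes that class labels are the integers 0 to num_classes - 1; how the library itself infers the number of classes is not stated here.

```python
import numpy as np


def pred_cond_to_cm_sketch(pred_cond, *, pred_first=True):
    """Hypothetical stand-in for pred_cond_to_confusion_matrix."""
    pred_cond = np.asarray(pred_cond)
    if pred_first:
        pred, cond = pred_cond[:, 0], pred_cond[:, 1]
    else:
        cond, pred = pred_cond[:, 0], pred_cond[:, 1]

    # Assumption: labels are 0..K-1, so the largest label fixes K.
    num_classes = int(pred_cond.max()) + 1
    confusion_matrix = np.zeros((num_classes, num_classes), dtype=np.int64)
    # Rows index the condition, columns index the prediction.
    # np.add.at accumulates correctly when the same cell occurs repeatedly.
    np.add.at(confusion_matrix, (cond, pred), 1)
    return confusion_matrix


# Each row is (prediction, ground truth).
pairs = [[0, 0], [1, 0], [1, 0], [1, 1]]
cm_pred_first = pred_cond_to_cm_sketch(pairs)
cm_cond_first = pred_cond_to_cm_sketch(pairs, pred_first=False)
print(cm_pred_first)  # [[1 2], [0 1]]
print(cm_cond_first)  # [[1 0], [2 1]]
```

Reading the same pairs with `pred_first=False` swaps the roles of the two columns, which transposes the resulting matrix.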

confusion_matrix_to_pred_cond #

confusion_matrix_to_pred_cond(
    confusion_matrix: Int[ndarray, " num_classes num_classes"],
    *,
    pred_first: bool = True,
) -> Int[ndarray, " num_samples 2"]

Converts an unnormalized confusion matrix into an array of (model prediction, ground truth) pairs.

Assumes predictions on the columns, condition on the rows of the confusion matrix.

Parameters:

confusion_matrix (Int[ndarray, ' num_classes num_classes']) –

the unnormalized confusion matrix
pred_first (bool, default: True ) –

whether the first column of the output should hold the model prediction (True) or the ground-truth label (False). Defaults to True.

Returns:

Int[ndarray, ' num_samples 2'] –

the array of (model prediction, ground truth) pairs, one row per sample
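The reverse conversion can be sketched by repeating each (condition, prediction) index as many times as its cell count. The name `cm_to_pred_cond_sketch` is hypothetical, and the row-major ordering of the output pairs is an artefact of this sketch rather than documented behaviour.

```python
import numpy as np


def cm_to_pred_cond_sketch(confusion_matrix, *, pred_first=True):
    """Hypothetical stand-in for confusion_matrix_to_pred_cond."""
    cm = np.asarray(confusion_matrix)
    # Row index = condition, column index = prediction.
    cond_idx, pred_idx = np.indices(cm.shape)
    counts = cm.ravel()
    # Emit each cell's labels once per record counted in that cell.
    cond = np.repeat(cond_idx.ravel(), counts)
    pred = np.repeat(pred_idx.ravel(), counts)
    columns = (pred, cond) if pred_first else (cond, pred)
    return np.stack(columns, axis=1)


confusion_matrix = np.array([[1, 2], [0, 1]])
pred_cond = cm_to_pred_cond_sketch(confusion_matrix)
print(pred_cond.tolist())  # [[0, 0], [1, 0], [1, 0], [1, 1]]
```

The number of output rows equals the sum of all entries in the confusion matrix, which is why the input has to be unnormalized integer counts.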