IO

A module dedicated to various IO operations.

Users are encouraged to use this module when importing confusion matrices from disk.

File Formats

load_csv

load_csv(
    location: str | Path,
    encoding: str = "utf-8",
    newline: str = "\n",
    dialect: str = "excel",
    delimiter: str = ",",
    lineterminator: str = "\r\n",
    dtype: DTypeLike = int64,
) -> Int[ndarray, " num_classes num_classes"]

Loads a CSV file into memory, and parses it as if it were a valid confusion matrix.

Parameters:

  • location (str | Path) –

    the location of the CSV file containing the confusion matrix

  • encoding (str, default: 'utf-8' ) –

    the encoding of the confusion matrix file

  • newline (str, default: '\n' ) –

    the newline character used in the confusion matrix file

  • dialect (str, default: 'excel' ) –

    the csv dialect, passed to csv.reader

  • delimiter (str, default: ',' ) –

    the csv delimiter character, passed to csv.reader

  • lineterminator (str, default: '\r\n' ) –

    the csv line terminator string, passed to csv.reader

  • dtype (DTypeLike, default: int64 ) –

    the desired dtype of the numpy array. Defaults to int64.

Returns:

  • Int[ndarray, ' num_classes num_classes']

    the parsed confusion matrix
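
As a rough illustration of what the parameters above control, here is a minimal stdlib-only sketch. It is not the library's implementation: the helper name `load_csv_sketch` is hypothetical, it assumes `encoding` and `newline` are forwarded to `open` while the remaining options go to `csv.reader`, and it returns nested lists instead of a numpy array with a chosen `dtype`.

```python
import csv
import tempfile
from pathlib import Path


def load_csv_sketch(
    location,
    encoding="utf-8",
    newline="\n",
    dialect="excel",
    delimiter=",",
    lineterminator="\r\n",
):
    """Hypothetical stand-in for load_csv that returns nested lists of ints."""
    with open(location, "r", encoding=encoding, newline=newline) as f:
        reader = csv.reader(
            f,
            dialect=dialect,
            delimiter=delimiter,
            # csv.reader accepts this option but ignores it when reading;
            # it only matters for csv.writer.
            lineterminator=lineterminator,
        )
        # Skip blank lines and convert every cell to an integer count.
        return [[int(cell) for cell in row] for row in reader if row]


# Write a small 2x2 confusion matrix to a temporary file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "confusion_matrix.csv"
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write("5,1\n2,7\n")
    matrix = load_csv_sketch(path)

print(matrix)  # [[5, 1], [2, 7]]
```

The result of such a load would still need to pass the validity checks described under validate_confusion_matrix.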

Utilities

validate_confusion_matrix

validate_confusion_matrix(
    confusion_matrix: Int[ArrayLike, " num_classes num_classes"],
    dtype: DTypeLike = int64,
) -> Int[ndarray, " num_classes num_classes"]

Validates a confusion matrix to prevent any future funny business.

For a confusion matrix to be valid, it:

1. Must be a square matrix (i.e., a 2-dimensional array with as many rows as columns)
2. Must contain only non-negative integers
3. Must contain at least 2 classes
4. Must have at least one record for each ground-truth class
5. Should have at least one record for each prediction

Parameters:

  • confusion_matrix (Int[ArrayLike, ' num_classes num_classes']) –

    the confusion matrix

  • dtype (DTypeLike, default: int64 ) –

    the desired dtype of the numpy array. Defaults to int64.

Returns:

  • Int[ndarray, ' num_classes num_classes']

    the validated confusion matrix as a numpy ndarray
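
The validity conditions listed above can be sketched with plain Python lists. This is an illustrative approximation, not the library's code: `validate_sketch` is a hypothetical name, condition 2 is interpreted as "no negative counts" (zeros are allowed), ground-truth classes are assumed to lie on the rows, and the "should" in condition 5 is assumed to mean a warning rather than an error.

```python
import warnings


def validate_sketch(confusion_matrix):
    """Hypothetical check of the five conditions on a nested list of counts."""
    rows = [list(row) for row in confusion_matrix]
    n = len(rows)

    # 1. Square: every row must have as many entries as there are rows.
    if any(len(row) != n for row in rows):
        raise ValueError("confusion matrix must be square")

    # 2. Only integer counts, none of them negative (assumed interpretation).
    for row in rows:
        for value in row:
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError("confusion matrix must contain integers")
            if value < 0:
                raise ValueError("confusion matrix must not contain negatives")

    # 3. At least two classes.
    if n < 2:
        raise ValueError("confusion matrix must have at least 2 classes")

    # 4. Every ground-truth class (row) needs at least one record.
    if any(sum(row) == 0 for row in rows):
        raise ValueError("every ground-truth class needs at least one record")

    # 5. Every predicted class (column) should have at least one record.
    for j in range(n):
        if sum(row[j] for row in rows) == 0:
            warnings.warn(f"class {j} is never predicted")

    return rows


print(validate_sketch([[5, 1], [2, 7]]))  # [[5, 1], [2, 7]]
```

Under these assumptions, a matrix such as `[[3, 0], [2, 0]]` passes with a warning, because class 1 is never predicted, while `[[0, 0], [1, 2]]` is rejected, because ground-truth class 0 has no records.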

pred_cond_to_confusion_matrix

pred_cond_to_confusion_matrix(
    pred_cond: Int[ndarray, " num_samples 2"], *, pred_first: bool = True
) -> Int[ndarray, " num_classes num_classes"]

Converts an array of (model prediction, ground truth) pairs into an unnormalized confusion matrix.

The resulting confusion matrix always has predictions on the columns and the condition (ground truth) on the rows.

Parameters:

  • pred_cond (Int[ndarray, ' num_samples 2']) –

    the array-like collection of (model prediction, ground truth) pairs, one pair per sample

  • pred_first (bool, default: True ) –

    whether the model prediction is in the first column (True) or the ground truth label is (False). Defaults to True.

Returns:

  • Int[ndarray, ' num_classes num_classes']

    the unnormalized confusion matrix, with predictions on the columns and condition on the rows
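
To make the row and column convention concrete, here is a small stdlib-only sketch of the counting step. It is not the library's implementation: `pred_cond_to_cm_sketch` is a hypothetical name, class labels are assumed to be integers starting at 0, and the number of classes is assumed to be inferred from the largest label seen.

```python
def pred_cond_to_cm_sketch(pred_cond, *, pred_first=True):
    """Hypothetical counting of (prediction, condition) pairs into a matrix."""
    pairs = [tuple(pair) for pair in pred_cond]

    # Assumption: labels are 0-based integers, so the largest label + 1
    # gives the number of classes.
    num_classes = max(max(pair) for pair in pairs) + 1
    matrix = [[0] * num_classes for _ in range(num_classes)]

    for first, second in pairs:
        pred, cond = (first, second) if pred_first else (second, first)
        # Condition (ground truth) indexes the row, prediction the column.
        matrix[cond][pred] += 1

    return matrix


samples = [(0, 0), (1, 0), (1, 0), (1, 1)]

# Read as (prediction, condition): class 0 is mistaken for class 1 twice.
print(pred_cond_to_cm_sketch(samples))  # [[1, 2], [0, 1]]

# Read as (condition, prediction): the off-diagonal counts move to the other side.
print(pred_cond_to_cm_sketch(samples, pred_first=False))  # [[1, 0], [2, 1]]
```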

confusion_matrix_to_pred_cond

confusion_matrix_to_pred_cond(
    confusion_matrix: Int[ndarray, " num_classes num_classes"],
    *,
    pred_first: bool = True,
) -> Int[ndarray, " num_samples 2"]

Converts an unnormalized confusion matrix into an array of (model prediction, ground truth) pairs.

Assumes predictions on the columns and the condition (ground truth) on the rows of the confusion matrix.

Parameters:

  • confusion_matrix (Int[ndarray, ' num_classes num_classes']) –

    the unnormalized confusion matrix

  • pred_first (bool, default: True ) –

    whether the model prediction should be in the first column (True) or the ground truth label should be (False). Defaults to True.

Returns:

  • Int[ndarray, ' num_samples 2']

    the array of (model prediction, ground truth) pairs, one row per sample
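
The reverse conversion can be sketched as repeating each (prediction, condition) pair as many times as its cell count. This is a stdlib-only illustration, not the library's code: `cm_to_pred_cond_sketch` is a hypothetical name, and the row-major ordering of the output pairs is an assumption, since the ordering of the library's output is not documented here.

```python
def cm_to_pred_cond_sketch(confusion_matrix, *, pred_first=True):
    """Hypothetical expansion of cell counts back into per-sample pairs."""
    pairs = []
    # Rows are the condition (ground truth), columns the prediction.
    for cond, row in enumerate(confusion_matrix):
        for pred, count in enumerate(row):
            pair = (pred, cond) if pred_first else (cond, pred)
            # Emit one pair per counted sample in this cell.
            pairs.extend([pair] * count)
    return pairs


matrix = [[1, 2], [0, 1]]

print(cm_to_pred_cond_sketch(matrix))
# [(0, 0), (1, 0), (1, 0), (1, 1)]

print(cm_to_pred_cond_sketch(matrix, pred_first=False))
# [(0, 0), (0, 1), (0, 1), (1, 1)]
```

The number of pairs equals the sum of all cells in the matrix, which is why the input has to be unnormalized integer counts.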