IO#
A module dedicated to IO operations.
Users are encouraged to use it when importing confusion matrices from disk.
File Formats#
load_csv#
load_csv(
location: str | Path,
encoding: str = "utf-8",
newline: str = "\n",
dialect: str = "excel",
delimiter: str = ",",
lineterminator: str = "\r\n",
dtype: DTypeLike = int64,
) -> Int[ndarray, " num_classes num_classes"]
Loads a CSV file into memory and parses it as if it were a valid confusion matrix.
Parameters:
- location (str | Path) – the location of the CSV file containing the confusion matrix
- encoding (str, default: 'utf-8') – the encoding of the confusion matrix file
- newline (str, default: '\n') – the newline character used in the confusion matrix file
- dialect (str, default: 'excel') – the CSV dialect, passed to csv.reader
- delimiter (str, default: ',') – the CSV delimiter character, passed to csv.reader
- lineterminator (str, default: '\r\n') – the CSV line terminator, passed to csv.reader
- dtype (DTypeLike, default: int64) – the desired dtype of the numpy array
Returns:
- Int[ndarray, 'num_classes num_classes'] – the parsed confusion matrix
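As a rough illustration of the parsing step described above, the sketch below reads a small CSV into nested integer lists using csv.reader with the same dialect and delimiter defaults. The helper name read_confusion_matrix_csv and the in-memory sample data are hypothetical, and plain lists stand in for the NumPy array that load_csv returns. It is a sketch under those assumptions, not the library's implementation.

```python
import csv
import io


def read_confusion_matrix_csv(handle, dialect="excel", delimiter=","):
    """Parse CSV text from an open file-like object into rows of integers.

    Hypothetical helper for illustration only; it is not part of the module.
    """
    reader = csv.reader(handle, dialect=dialect, delimiter=delimiter)
    # Skip empty rows and convert every cell to an integer count.
    return [[int(cell) for cell in row] for row in reader if row]


# In-memory stand-in for a file on disk (made-up counts).
csv_text = "5,1\r\n2,7\r\n"
matrix = read_confusion_matrix_csv(io.StringIO(csv_text, newline=""))
print(matrix)  # [[5, 1], [2, 7]]
```

According to the Python csv documentation, csv.reader recognises both '\r' and '\n' as line endings and ignores lineterminator, so that setting mainly matters when writing files.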
Utilities#
validate_confusion_matrix#
validate_confusion_matrix(
confusion_matrix: Int[ArrayLike, " num_classes num_classes"],
dtype: DTypeLike = int64,
) -> Int[ndarray, " num_classes num_classes"]
Validates a confusion matrix so that malformed input is caught before it causes problems downstream.
For a confusion matrix to be valid, it:
1. Must be a square matrix (i.e., a 2-dimensional array with as many rows as columns)
2. Must contain only non-negative integers
3. Must contain at least 2 classes
4. Must have at least one record for each ground-truth class
5. Should have at least one record for each predicted class
Parameters:
- confusion_matrix (Int[ArrayLike, 'num_classes num_classes']) – the confusion matrix
- dtype (DTypeLike, default: int64) – the desired dtype of the numpy array
Returns:
- Int[ndarray, 'num_classes num_classes'] – the validated confusion matrix as a numpy ndarray
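The five validity conditions can be sketched as plain Python checks. The function name check_confusion_matrix is hypothetical, and several behaviours here are assumptions rather than documented facts: that the "Must" conditions raise an error while the "Should" condition only produces a warning message, that zero counts are allowed in individual cells (otherwise conditions 4 and 5 could never fail), and that rows hold the ground-truth classes.

```python
def check_confusion_matrix(matrix):
    """Raise ValueError on hard failures; return a list of soft warnings.

    Hypothetical helper operating on nested lists, for illustration only.
    """
    num_classes = len(matrix)
    # 1. Square: every row must have as many entries as there are rows.
    if any(len(row) != num_classes for row in matrix):
        raise ValueError("confusion matrix must be square")
    # 2. Integer counts only, none of them negative (assumption: zeros are fine).
    if any(not isinstance(v, int) or v < 0 for row in matrix for v in row):
        raise ValueError("entries must be non-negative integers")
    # 3. At least two classes.
    if num_classes < 2:
        raise ValueError("need at least 2 classes")
    # 4. Every ground-truth class (row) needs at least one record.
    if any(sum(row) == 0 for row in matrix):
        raise ValueError("every ground-truth class needs at least one record")
    # 5. Every predicted class (column) should have a record; warn only.
    notes = []
    for j in range(num_classes):
        if sum(row[j] for row in matrix) == 0:
            notes.append(f"class {j} is never predicted")
    return notes


print(check_confusion_matrix([[5, 1], [2, 7]]))  # []
print(check_confusion_matrix([[3, 0], [2, 0]]))  # ['class 1 is never predicted']
```

Running the cheap structural checks first means the later row and column sums can rely on the matrix being square.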
pred_cond_to_confusion_matrix#
pred_cond_to_confusion_matrix(
pred_cond: Int[ndarray, " num_samples 2"], *, pred_first: bool = True
) -> Int[ndarray, " num_classes num_classes"]
Converts an array-like of (model prediction, ground truth) pairs into an unnormalized confusion matrix.
The resulting confusion matrix always has predictions on the columns and conditions (ground truth) on the rows.
Parameters:
- pred_cond (Int[ndarray, 'num_samples 2']) – the array-like collection of (prediction, ground truth) pairs
- pred_first (bool, default: True) – whether the first column holds the model prediction (True) or the ground truth label (False)
Returns:
- Int[ndarray, 'num_classes num_classes'] – the unnormalized confusion matrix, with predictions on the columns and conditions on the rows
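To make the row and column convention concrete, the sketch below counts pairs into a nested-list matrix indexed as matrix[condition][prediction]. The function name pairs_to_confusion_matrix is hypothetical, and inferring the number of classes from the largest label is an assumption made for this example; the library's own handling may differ.

```python
def pairs_to_confusion_matrix(pairs, pred_first=True):
    """Count (prediction, condition) pairs into a square nested-list matrix.

    Hypothetical helper for illustration only.
    """
    # Assumption: labels are 0-based integers, so the largest label + 1
    # gives the number of classes.
    num_classes = max(max(pair) for pair in pairs) + 1
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for first, second in pairs:
        pred, cond = (first, second) if pred_first else (second, first)
        # Rows are the condition (ground truth), columns are the prediction.
        matrix[cond][pred] += 1
    return matrix


pairs = [(0, 0), (1, 0), (1, 0), (1, 1)]
print(pairs_to_confusion_matrix(pairs))                    # [[1, 2], [0, 1]]
print(pairs_to_confusion_matrix(pairs, pred_first=False))  # [[1, 0], [2, 1]]
```

Flipping pred_first transposes the result, which is a quick way to check that the column order of the input has been declared correctly.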
confusion_matrix_to_pred_cond#
confusion_matrix_to_pred_cond(
confusion_matrix: Int[ndarray, " num_classes num_classes"],
*,
pred_first: bool = True,
) -> Int[ndarray, " num_samples 2"]
Converts an unnormalized confusion matrix into an array of (model prediction, ground truth) pairs.
Assumes the confusion matrix has predictions on the columns and conditions (ground truth) on the rows.
Parameters:
- confusion_matrix (Int[ndarray, 'num_classes num_classes']) – the unnormalized confusion matrix
- pred_first (bool, default: True) – whether the first column of the output should hold the model prediction (True) or the ground truth label (False)
Returns:
- Int[ndarray, 'num_samples 2'] – the array of (model prediction, ground truth) pairs, one row per sample
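The reverse direction can be sketched by repeating each (prediction, condition) pair as many times as its cell count. The function name confusion_matrix_to_pairs is hypothetical, nested lists stand in for NumPy arrays, and the order in which pairs are emitted (row by row) is an assumption; the library may order samples differently.

```python
def confusion_matrix_to_pairs(matrix, pred_first=True):
    """Expand cell counts into one pair per sample.

    Hypothetical helper for illustration only.
    """
    pairs = []
    # Rows are the condition (ground truth), columns are the prediction.
    for cond, row in enumerate(matrix):
        for pred, count in enumerate(row):
            pair = (pred, cond) if pred_first else (cond, pred)
            # A cell holding `count` records contributes `count` identical pairs.
            pairs.extend([pair] * count)
    return pairs


matrix = [[1, 2], [0, 1]]
print(confusion_matrix_to_pairs(matrix))
# [(0, 0), (1, 0), (1, 0), (1, 1)]
print(confusion_matrix_to_pairs(matrix, pred_first=False))
# [(0, 0), (0, 1), (0, 1), (1, 1)]
```

The number of pairs equals the sum of all cells, which is why the input has to be an unnormalized matrix of counts rather than proportions.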