**Q1:** A confusion matrix shows the performance of a classification model. Each row represents the instances of an actual class, while each column represents the instances of a predicted class.

It is a matrix of numbers that tells us where a model gets confused. It is a class-wise distribution of the predictive performance of a classification model—that is, the confusion matrix is an organized way of mapping the predictions to the original classes to which the data belong.

The diagonal values in a confusion matrix generally represent true positives for each class. This means that the model correctly predicted the class. For instance, if a diagonal value for a specific class is 50, it indicates that the model correctly identified 50 instances of that class.
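A minimal sketch in plain Python (the labels and counts below are made-up illustration data) showing how the diagonal entries, where the actual and predicted class coincide, count the correct predictions per class:

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a confusion matrix as a nested dict: counts[actual][predicted]."""
    counts = {a: {p: 0 for p in labels} for a in labels}
    for actual, predicted in zip(y_true, y_pred):
        counts[actual][predicted] += 1
    return counts

# Hypothetical example data
y_true = ["cat", "cat", "dog", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird"]

cm = confusion_matrix(y_true, y_pred, ["bird", "cat", "dog"])

# Diagonal entries (actual == predicted) are the true positives per class.
diagonal = {label: cm[label][label] for label in ["bird", "cat", "dog"]}
print(diagonal)  # {'bird': 1, 'cat': 1, 'dog': 2}
```

Here the model correctly identified 2 of the 3 actual "dog" instances, so the "dog" diagonal entry is 2.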

**Q2**: What you're observing, where the diagonal values are larger than the actual true positives, could be due to several factors. One possibility is that the tool you are using includes other types of correct predictions (such as true negatives for other classes) in this count. It might also be a result of how the tool aggregates or displays data. When you click into a value and see fewer true positives, the tool is showing a more detailed or filtered view, which may exclude certain types of correct predictions that are included in the broader diagonal count.
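One way to check the tool's numbers is to recompute the per-class counts directly from the raw labels and compare them against the displayed diagonal. This is a hedged sketch with hypothetical data, not your tool's actual logic:

```python
def per_class_counts(y_true, y_pred, label):
    """Recompute TP, FP, FN for one class directly from the raw labels."""
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == label and p == label)
    fp = sum(1 for a, p in zip(y_true, y_pred) if a != label and p == label)
    fn = sum(1 for a, p in zip(y_true, y_pred) if a == label and p != label)
    return tp, fp, fn

# Hypothetical example data
y_true = ["cat", "cat", "dog", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird"]

print(per_class_counts(y_true, y_pred, "dog"))  # (2, 1, 1)
```

If the recomputed true positives for a class differ from the diagonal value the tool shows, the tool is likely aggregating something beyond plain true positives.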

**Q3:** Typically, confusion matrices contain integer values, since they count instances. However, float values can appear. This happens if the matrix is normalized (dividing each value by a row, column, or overall total) to show proportions instead of absolute counts. Another possibility is that the model outputs probabilities or confidence levels instead of definite class predictions, which are then translated into float values in the matrix.
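A minimal sketch of row normalization, one common reason floats appear: each count is divided by its row total, so each row then sums to 1.0. The matrix values are made-up illustration data:

```python
# Hypothetical raw confusion matrix (rows = actual, columns = predicted)
matrix = [
    [1, 0, 0],  # bird
    [0, 1, 1],  # cat
    [0, 1, 2],  # dog
]

# Divide each entry by its row total to get per-class proportions.
normalized = [
    [count / sum(row) if sum(row) else 0.0 for count in row]
    for row in matrix
]

print(normalized[0])  # [1.0, 0.0, 0.0]
print(normalized[1])  # [0.0, 0.5, 0.5]
```

The "dog" row, for example, becomes roughly [0.0, 0.33, 0.67]: the model predicted "dog" correctly for two thirds of the actual dogs.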

You can learn more here: https://learn.microsoft.com/en-us/training/modules/machine-learning-confusion-matrix/