How To Add Correct Labels For Seaborn Confusion Matrix
I have plotted my data into a confusion matrix using seaborn but I ran into a problem. The problem is that it is only showing numbers from 0 to 11, on both axes, because I have 12
Solution 1:
When you factorize your categories, you should retain the levels that pd.factorize returns, so you can use them together with pd.crosstab instead of confusion_matrix to build the table you plot. Using iris as an example:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
# The file has no header row; its columns are sepal length, sepal width, petal length, petal width, class
df = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data",
                 header=None, names=["s.len","s.wid","p.len","p.wid","species"])
X = df.iloc[:,:4]
y,levels = pd.factorize(df['species'])
At this point, y holds the integer codes 0, 1, 2 and levels holds the original labels those codes correspond to:
Index(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype='object')
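To make the relationship between the codes and the levels concrete, here is a small sketch on a made-up four-element series (the values are illustrative, not taken from the iris file). It shows that pd.factorize numbers the categories in order of first appearance, and that indexing levels with the codes recovers the original strings:

```python
import pandas as pd

# Hypothetical toy data, only to illustrate the round trip
species = pd.Series(["Iris-setosa", "Iris-virginica", "Iris-setosa", "Iris-versicolor"])

# codes are integers; levels holds one entry per distinct value,
# in order of first appearance (not alphabetical)
codes, levels = pd.factorize(species)

# Indexing levels with the codes maps every integer back to its label
decoded = levels[codes]

print(codes)          # integer code per row
print(list(levels))   # distinct labels
print(list(decoded))  # same strings as the input
```

In the iris file the species happen to appear in alphabetical order, so the codes there coincide with the sorted order, but that is a property of the file rather than of pd.factorize.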
Then fit the classifier and predict, as you already do:
clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(X,y)
y_pred = clf.predict(X)
print(classification_report(y,y_pred,target_names=levels))
Plotting the output of confusion_matrix directly gives axes labelled 0, 1, 2:
cf_matrix = confusion_matrix(y, y_pred)
sns.heatmap(cf_matrix, linewidths=1, annot=True, fmt='g')
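To get the original labels on the axes, map the codes back through levels and cross-tabulate:
cf_matrix = pd.crosstab(levels[y], levels[y_pred], rownames=["Actual"], colnames=["Predicted"])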
fig, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cf_matrix, linewidths=1, annot=True, ax=ax, fmt='g')
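One thing to watch for with pd.crosstab: it only creates rows and columns for labels that actually occur, so a class that is never predicted has no column and the table is no longer square. A minimal sketch of one way to handle this, assuming you still have the full set of levels, is to reindex the table against them. The toy labels below are made up for illustration:

```python
import pandas as pd

# Full set of classes, as returned by pd.factorize in the example above
levels = pd.Index(["Iris-setosa", "Iris-versicolor", "Iris-virginica"])

# Hypothetical labels where "Iris-virginica" is never predicted
y_true = pd.Series(["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-virginica"])
y_pred = pd.Series(["Iris-setosa", "Iris-versicolor", "Iris-versicolor", "Iris-versicolor"])

cf = pd.crosstab(y_true, y_pred, rownames=["Actual"], colnames=["Predicted"])
print(cf.shape)  # 3 rows but only 2 columns

# Add the missing class back as an all-zero column and fix the ordering
cf_square = (
    cf.reindex(index=levels, columns=levels, fill_value=0)
      .rename_axis(index="Actual", columns="Predicted")
)
print(cf_square)
```

The reindexed table can be passed to sns.heatmap in the same way as cf_matrix above.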
Solution 2:
confusion_matrix orders the classes by sorted label value, which is alphabetical for strings. np.unique(true_label) returns the distinct labels as a sorted ndarray, so it matches the row and column order of the matrix:
import numpy as np
import pandas as pd
import seaborn as sn
from sklearn.metrics import confusion_matrix
cm_labels = np.unique(true_label)
cm_array = confusion_matrix(true_label, predict_label)
cm_array_df = pd.DataFrame(cm_array, index=cm_labels, columns=cm_labels)
sn.heatmap(cm_array_df, annot=True, annot_kws={"size": 12})
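Rather than relying on the default ordering, you can also pass the label order to confusion_matrix through its labels parameter, so the matrix and the DataFrame labels cannot drift apart. The sketch below uses made-up string labels (true_label and predict_label here are illustrative stand-ins for your own arrays) and leaves out the plotting step:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import confusion_matrix

# Hypothetical labels, for illustration only
true_label = ["cat", "dog", "bird", "dog", "cat", "bird"]
predict_label = ["cat", "dog", "bird", "cat", "cat", "dog"]

# Sorted distinct labels: bird, cat, dog
cm_labels = np.unique(true_label)

# labels= pins the row/column order explicitly
cm_array = confusion_matrix(true_label, predict_label, labels=cm_labels)

# Rows are true classes, columns are predicted classes
cm_array_df = pd.DataFrame(cm_array, index=cm_labels, columns=cm_labels)
print(cm_array_df)
```

If predict_label can contain classes that never occur in true_label, building cm_labels from both arrays (for example with np.union1d) keeps those predictions in the matrix.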