
Plotting a confusion matrix with matplotlib in Python: an example

Views: 170 | Date: 2022-07-21 09:43:50

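As mentioned earlier, the confusion matrix is an important metric when working on classification problems. So how can it be presented well? You can build a table by hand or visualize it in a front end. I once tried rendering it with a front-end library (D5) and taking a screenshot, but the result did not look very good. The matplotlib function below gives a cleaner figure.

Code: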

    import itertools

    import matplotlib.pyplot as plt
    import numpy as np


    def plot_confusion_matrix(cm, classes, normalize=False,
                              title='Confusion matrix', cmap=plt.cm.Blues):
        """
        This function prints and plots the confusion matrix.
        Normalization can be applied by setting `normalize=True`.
        """
        if normalize:
            # Divide each row by its total so that every row sums to 1.
            cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
            print('Normalized confusion matrix')
        else:
            print('Confusion matrix, without normalization')
        print(cm)

        plt.imshow(cm, interpolation='nearest', cmap=cmap)
        plt.title(title)
        plt.colorbar()
        tick_marks = np.arange(len(classes))
        plt.xticks(tick_marks, classes, rotation=45)
        plt.yticks(tick_marks, classes)

        # Write each cell value, in white on dark cells and black on light ones.
        fmt = '.2f' if normalize else 'd'
        thresh = cm.max() / 2.
        for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
            plt.text(j, i, format(cm[i, j], fmt),
                     horizontalalignment='center',
                     color='white' if cm[i, j] > thresh else 'black')

        plt.ylabel('True label')
        plt.xlabel('Predicted label')
        plt.tight_layout()
        plt.show()
        # plt.savefig('confusion_matrix', dpi=200)


    cnf_matrix = np.array([
        [4101, 2, 5, 24, 0],
        [50, 3930, 6, 14, 5],
        [29, 3, 3973, 4, 0],
        [45, 7, 1, 3878, 119],
        [31, 1, 8, 28, 3936],
    ])
    class_names = ['Buildings', 'Farmland', 'Greenbelt', 'Wasteland', 'Water']

    # plt.figure()
    # plot_confusion_matrix(cnf_matrix, classes=class_names,
    #                       title='Confusion matrix, without normalization')

    # Plot normalized confusion matrix
    plt.figure()
    plot_confusion_matrix(cnf_matrix, classes=class_names, normalize=True,
                          title='Normalized confusion matrix')

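Replace cnf_matrix with your own confusion matrix and the function will plot it. The visualization step can also be done directly while the model is running.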
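The row normalization used when `normalize=True` can be checked on its own, without opening a figure. The following is a minimal sketch, not part of the original code: the helper name `normalize_rows` and the guard against empty rows are assumptions added for illustration.

```python
import numpy as np


def normalize_rows(cm):
    """Divide each row by its sum, so each row holds per-class fractions."""
    cm = np.asarray(cm, dtype=float)
    row_sums = cm.sum(axis=1, keepdims=True)
    # Assumption: a class with no true samples keeps an all-zero row
    # instead of producing 0/0.
    safe_sums = np.where(row_sums == 0, 1.0, row_sums)
    return cm / safe_sums


# Same matrix as in the article.
cnf_matrix = np.array([
    [4101, 2, 5, 24, 0],
    [50, 3930, 6, 14, 5],
    [29, 3, 3973, 4, 0],
    [45, 7, 1, 3878, 119],
    [31, 1, 8, 28, 3936],
])

normalized = normalize_rows(cnf_matrix)
print(np.round(normalized, 2))
```

With rows representing true classes, the diagonal of the normalized matrix is the fraction of each class that was predicted correctly.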

Supplementary notes: how the confusion matrix works and how to use it (scikit-learn and TensorFlow)

How it works

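In machine learning, a confusion matrix is an error matrix commonly used to visualize and evaluate the performance of a supervised learning algorithm. It is a square matrix of size (n_classes, n_classes), where n_classes is the number of classes. Each row represents the instances of a true class and each column represents the instances of a predicted class (the convention used by TensorFlow and scikit-learn). The opposite layout is also possible, with each row representing a predicted class and each column a true class (the definition given in the Wikipedia article "Confusion matrix"). The matrix makes it easy to see whether the system confuses two classes, which is where the name comes from.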
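The two layouts differ only by a transpose. Below is a minimal NumPy sketch of the counting, assuming integer class labels in the range 0 to n_classes - 1; the function name `confusion_counts` and the sample labels are made up for illustration.

```python
import numpy as np


def confusion_counts(y_true, y_pred, n_classes):
    """Count samples; rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


# Hypothetical labels for a 3-class problem.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_counts(y_true, y_pred, n_classes=3)
print(cm)    # rows = true class, columns = predicted class
print(cm.T)  # transposed layout: rows = predicted class, columns = true class
```

The off-diagonal entry `cm[0, 1]` counts samples of class 0 that were predicted as class 1, which is exactly the kind of confusion the matrix is meant to expose.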

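A confusion matrix is a special kind of contingency table, also called a cross tabulation or crosstab. It has two dimensions (the actual value and the predicted value), and both dimensions share the same set of classes. In a contingency table, each combination of dimension and class is a variable, and the table shows the frequency distribution of those variables in tabular form.

Using the confusion matrix (scikit-learn and TensorFlow)

This section first introduces the API functions for computing a confusion matrix in scikit-learn and TensorFlow, and then uses both functions in one example.

The scikit-learn confusion matrix function: sklearn.metrics.confusion_matrix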

    sklearn.metrics.confusion_matrix(
        y_true,              # array, ground truth (correct) target values
        y_pred,              # array, estimated targets as returned by a classifier
        labels=None,         # array, list of labels to index the matrix
        sample_weight=None   # array-like of shape = [n_samples], optional sample weights
    )

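In scikit-learn, the confusion matrix is computed to evaluate the accuracy of a classification.

By definition, the element C[i, j] of the confusion matrix C is the number of observations whose true class is i and whose predicted class is j. For a binary classification task, this means the number of true negatives (TN) is C[0, 0], the number of false negatives (FN) is C[1, 0], the number of true positives (TP) is C[1, 1], and the number of false positives (FP) is C[0, 1].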
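As a small worked illustration of these four cells, the sketch below unpacks a 2x2 matrix in row-major order and derives precision and recall from it. The matrix values are made up for illustration, and the layout assumed is the one described above (rows are true classes, columns are predicted classes).

```python
import numpy as np

# Hypothetical binary confusion matrix:
# rows = true class (0, 1), columns = predicted class (0, 1).
C = np.array([
    [50, 10],   # true 0: 50 predicted 0 (TN), 10 predicted 1 (FP)
    [5, 35],    # true 1:  5 predicted 0 (FN), 35 predicted 1 (TP)
])

# Row-major flattening gives C[0,0], C[0,1], C[1,0], C[1,1].
tn, fp, fn, tp = C.ravel()

precision = tp / (tp + fp)  # of everything predicted positive, how much was right
recall = tp / (tp + fn)     # of everything truly positive, how much was found
print(tn, fp, fn, tp)
print(precision, recall)
```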

If labels is None, scikit-learn collects every value that appears in y_true or y_pred into the label list and sorts it.

The TensorFlow confusion matrix function: tf.confusion_matrix
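The default label handling can be mimicked in NumPy to see which row and column each label ends up in. This is a sketch of the described behaviour, not scikit-learn's actual implementation; `infer_labels` is a hypothetical helper name.

```python
import numpy as np


def infer_labels(y_true, y_pred):
    """Sorted union of all values seen in y_true or y_pred."""
    return np.unique(np.concatenate([y_true, y_pred]))


y_true = [1, 2, 4]
y_pred = [2, 2, 4]

labels = infer_labels(y_true, y_pred)
# Map each label value to its row/column position in the matrix.
position = {label: i for i, label in enumerate(labels.tolist())}

cm = np.zeros((len(labels), len(labels)), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[position[t], position[p]] += 1

print(labels)
print(cm)
```

Because only the values 1, 2 and 4 occur, the matrix is 3x3 and the label 4 sits at index 2, not at index 4.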

    tf.confusion_matrix(
        labels,            # 1-D Tensor of real labels for the classification task
        predictions,       # 1-D Tensor of predictions for a given classification
        num_classes=None,  # The possible number of labels the classification task can have
        dtype=tf.int32,    # Data type of the confusion matrix
        name=None,         # Scope name
        weights=None       # An optional Tensor whose shape matches predictions
    )

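The num_classes parameter of tf.confusion_matrix plays a role similar to the labels parameter of sklearn.metrics.confusion_matrix. Both describe the label set, but num_classes gives only the total number of classes and does not list the label values themselves. TensorFlow generally expects integer labels, so labels of other types, such as strings, must first be converted to integers. If num_classes is None, the largest value found in labels and predictions, plus 1, is used as num_classes.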
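The effect of the default num_classes on the matrix shape can be reproduced in plain NumPy. The sketch below only imitates the rule described above and does not call TensorFlow; the variable names are chosen for illustration.

```python
import numpy as np

labels = np.array([1, 2, 4])
predictions = np.array([2, 2, 4])

# Default rule described above: largest value seen, plus one.
num_classes = int(max(labels.max(), predictions.max())) + 1

cm = np.zeros((num_classes, num_classes), dtype=np.int32)
for t, p in zip(labels, predictions):
    cm[t, p] += 1

print(num_classes)
print(cm)
```

Unlike the sorted-label approach, the label value is used directly as the index here, so classes 0 and 3 get all-zero rows and columns even though they never occur in the data.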

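The weights parameter of tf.confusion_matrix and the sample_weight parameter of sklearn.metrics.confusion_matrix have the same meaning. Each sample contributes its weight, instead of 1, to the cell selected by its true and predicted labels, and the cell values of the confusion matrix are the resulting sums.

Usage example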
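A weighted confusion matrix is just a weighted count per cell. The NumPy sketch below imitates that accumulation under the assumption of integer labels used directly as indices; it is an illustration, not the code either library runs.

```python
import numpy as np

y_true = np.array([1, 2, 4])
y_pred = np.array([2, 2, 4])
weights = np.array([0.3, 0.4, 0.3])
num_classes = 6

cm = np.zeros((num_classes, num_classes), dtype=np.float32)
# np.add.at accumulates correctly even when the same cell is hit several times.
np.add.at(cm, (y_true, y_pred), weights)

print(cm)
```

Each of the three samples lands in a different cell here, so the non-zero entries are simply the three weights, and the whole matrix sums to the total weight.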

    #!/usr/bin/env python
    # -*- coding: utf8 -*-
    """
    Author: klchang
    Description: A simple example for tf.confusion_matrix and sklearn.metrics.confusion_matrix.
    Date: 2018.9.8
    """
    from __future__ import print_function

    import tensorflow as tf
    import sklearn.metrics

    y_true = [1, 2, 4]
    y_pred = [2, 2, 4]

    # Build graph with tf.confusion_matrix operation
    # (session-based API, as in the TensorFlow 1.x releases this example targets)
    sess = tf.InteractiveSession()
    op = tf.confusion_matrix(y_true, y_pred)
    op2 = tf.confusion_matrix(y_true, y_pred, num_classes=6, dtype=tf.float32,
                              weights=tf.constant([0.3, 0.4, 0.3]))

    # Execute the graph
    print('confusion matrix in tensorflow: ')
    print('1. default: \n', op.eval())
    print('2. customized: \n', sess.run(op2))
    sess.close()

    # Use sklearn.metrics.confusion_matrix function
    print('\nconfusion matrix in scikit-learn: ')
    print('1. default: \n', sklearn.metrics.confusion_matrix(y_true, y_pred))
    print('2. customized: \n',
          sklearn.metrics.confusion_matrix(y_true, y_pred, labels=list(range(6)),
                                           sample_weight=[0.3, 0.4, 0.3]))

