python - Urgent question about iteritems not working
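Problem description
Tomorrow is the midterm defense for my graduation project and I am trying to catch up. I am building a Bayesian network prediction model, found an example online, and tried to run it, but it fails with an error saying iteritems cannot be used. It seems to be a Python version issue. Is there a reasonably quick fix? This is urgent; thanks in advance, everyone.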
# Example of Naive Bayes implemented from Scratch in Python
import csv
import random
import math

def loadCsv(filename):
    lines = csv.reader(open(filename, 'rb'))
    dataset = list(lines)
    for i in range(len(dataset)):
        dataset[i] = [float(x) for x in dataset[i]]
    return dataset

def splitDataset(dataset, splitRatio):
    trainSize = int(len(dataset) * splitRatio)
    trainSet = []
    copy = list(dataset)
    while len(trainSet) < trainSize:
        index = random.randrange(len(copy))
        trainSet.append(copy.pop(index))
    return [trainSet, copy]

def separateByClass(dataset):
    separated = {}
    for i in range(len(dataset)):
        vector = dataset[i]
        if (vector[-1] not in separated):
            separated[vector[-1]] = []
        separated[vector[-1]].append(vector)
    return separated

def mean(numbers):
    return sum(numbers)/float(len(numbers))

def stdev(numbers):
    avg = mean(numbers)
    variance = sum([pow(x-avg,2) for x in numbers])/float(len(numbers)-1)
    return math.sqrt(variance)

def summarize(dataset):
    summaries = [(mean(attribute), stdev(attribute)) for attribute in zip(*dataset)]
    del summaries[-1]
    return summaries

def summarizeByClass(dataset):
    separated = separateByClass(dataset)
    summaries = {}
    for classValue, instances in separated.iteritems():
        summaries[classValue] = summarize(instances)
    return summaries

def calculateProbability(x, mean, stdev):
    exponent = math.exp(-(math.pow(x-mean,2)/(2*math.pow(stdev,2))))
    return (1 / (math.sqrt(2*math.pi) * stdev)) * exponent

def calculateClassProbabilities(summaries, inputVector):
    probabilities = {}
    for classValue, classSummaries in summaries.iteritems():
        probabilities[classValue] = 1
        for i in range(len(classSummaries)):
            mean, stdev = classSummaries[i]
            x = inputVector[i]
            probabilities[classValue] *= calculateProbability(x, mean, stdev)
    return probabilities

def predict(summaries, inputVector):
    probabilities = calculateClassProbabilities(summaries, inputVector)
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.iteritems():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def getPredictions(summaries, testSet):
    predictions = []
    for i in range(len(testSet)):
        result = predict(summaries, testSet[i])
        predictions.append(result)
    return predictions

def getAccuracy(testSet, predictions):
    correct = 0
    for i in range(len(testSet)):
        if testSet[i][-1] == predictions[i]:
            correct += 1
    return (correct/float(len(testSet))) * 100.0

def main():
    filename = 'pima-indians-diabetes.data.csv'
    splitRatio = 0.67
    dataset = loadCsv(filename)
    trainingSet, testSet = splitDataset(dataset, splitRatio)
    print('Split {0} rows into train={1} and test={2} rows').format(len(dataset), len(trainingSet), len(testSet))
    # prepare model
    summaries = summarizeByClass(trainingSet)
    # test model
    predictions = getPredictions(summaries, testSet)
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: {0}%').format(accuracy)

main()
The error is as follows:
Traceback (most recent call last):
  File "E:/Data/BEYES/data_test.py", line 101, in <module>
    main()
  File "E:/Data/BEYES/data_test.py", line 91, in main
    dataset = loadCsv(filename)
  File "E:/Data/BEYES/data_test.py", line 8, in loadCsv
    dataset = list(lines)
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)
The data format is as follows; the file is named pima-indians-diabetes.data.csv:
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
0,137,40,35,168,43.1,2.288,33,1
5,116,74,0,0,25.6,0.201,30,0
3,78,50,32,88,31.0,0.248,26,1
10,115,0,0,0,35.3,0.134,29,0
2,197,70,45,543,30.5,0.158,53,1
8,125,96,0,0,0.0,0.232,54,1
4,110,92,0,0,37.6,0.191,30,0
10,168,74,0,0,38.0,0.537,34,1
10,139,80,0,0,27.1,1.441,57,0
1,189,60,23,846,30.1,0.398,59,1
5,166,72,19,175,25.8,0.587,51,1
7,100,0,0,0,30.0,0.484,32,1
0,118,84,47,230,45.8,0.551,31,1
7,107,74,0,0,29.6,0.254,31,1
1,103,30,38,83,43.3,0.183,33,0
1,115,70,30,96,34.6,0.529,32,1
3,126,88,41,235,39.3,0.704,27,0
8,99,84,0,0,35.4,0.388,50,0
7,196,90,0,0,39.8,0.451,41,1
9,119,80,35,0,29.0,0.263,29,1
11,143,94,33,146,36.6,0.254,51,1
Answers
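Answer 1: lines = csv.reader(open(filename, 'rb'))
Who taught you that? In Python 3, csv.reader expects a file opened in text mode, while 'rb' yields bytes, which is exactly what the error message complains about. I suggest writing it like this: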
def loadCsv(filename):
    dataset = []
    with open(filename) as f:
        reader = csv.reader(f)
        for row in reader:
            dataset.append([float(x) for x in row])
    return dataset
The documentation has examples of this.
Of course, it could be shortened by another three lines, but that would be less friendly to beginners.
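For reference, the shorter form hinted at might look roughly like the sketch below. It is one guess at what the answer had in mind, not the answerer's own code. The newline='' argument is an addition here, following the recommendation in the csv module documentation for files passed to csv.reader.

```python
import csv

def loadCsv(filename):
    # Open in text mode (the default); newline='' is an assumption added here,
    # as recommended by the csv module documentation.
    with open(filename, newline='') as f:
        # Convert every field of every row to float in a nested comprehension.
        return [[float(x) for x in row] for row in csv.reader(f)]
```

The nested comprehension replaces the explicit loop and the intermediate list, which is shorter but harder to step through in a debugger.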
PS: The error message does not mention iteritems at all. That method exists only in Python 2.x; in Python 3.x, just use items instead.
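As a rough illustration of that change, here is a minimal sketch of the loop from predict rewritten with items. The helper names predictFromProbabilities and formatAccuracy are hypothetical and only serve this example. It also covers a second Python 3 difference that the posted script would hit later: print() returns None, so .format has to be called on the string inside the parentheses.

```python
def predictFromProbabilities(probabilities):
    # Hypothetical helper mirroring the loop in predict():
    # dict.items() is the Python 3 replacement for dict.iteritems().
    bestLabel, bestProb = None, -1
    for classValue, probability in probabilities.items():
        if bestLabel is None or probability > bestProb:
            bestProb = probability
            bestLabel = classValue
    return bestLabel

def formatAccuracy(accuracy):
    # Hypothetical helper: format the string first, then print the result.
    # print('...').format(...) fails in Python 3 because print() returns None.
    return 'Accuracy: {0}%'.format(accuracy)

print(formatAccuracy(76.5))
```

The same substitution applies to the other two iteritems calls, in summarizeByClass and calculateClassProbabilities.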