I am following this tutorial to write a Naive Bayes classifier:
http://machinelearningmastery.com/naive-bayes-classifier-scratch-python/
I keep getting this error:
    dataset[i] = [float(x) for x in dataset[i]]
    ValueError: could not convert string to float:
Here is the part of my code where the error occurs:
    def loadDatasetNB(filename):
        lines = csv.reader(open(filename, "rt"))
        dataset = list(lines)
        for i in range(len(dataset)):
            dataset[i] = [float(x) for x in dataset[i]]
        return dataset
And here is how the function is called:
    def NB_Analysis():
        filename = 'fvectors.csv'
        splitRatio = 0.67
        dataset = loadDatasetNB(filename)
        trainingSet, testSet = splitDatasetNB(dataset, splitRatio)
        print('Split {0} rows into train={1} and test={2} rows'.format(len(dataset), len(trainingSet), len(testSet)))
        # prepare model
        summaries = summarizeByClassNB(trainingSet)
        # test model
        predictions = getPredictionsNB(summaries, testSet)
        accuracy = getAccuracyNB(testSet, predictions)
        print('Accuracy: {0}%'.format(accuracy))

    NB_Analysis()
My file fvectors.csv looks like this
What is going wrong here and how do I fix it?
Kenil Vasani
Try skipping the header row. The empty header cell in the first column is what causes the error, because an empty string cannot be converted to a float.
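To see why the header row trips the conversion, here is a small sketch. The helper name `conversion_error` and the sample rows are made up for illustration, since the real contents of fvectors.csv are not shown here.

```python
def conversion_error(text):
    """Return the ValueError message float() raises for text, or None if it converts."""
    try:
        float(text)
    except ValueError as exc:
        return str(exc)
    return None


# Hypothetical rows: a header with an empty first cell, then a numeric row.
header_row = ["", "f1", "f2"]
data_row = ["1", "0.5", "2"]

# Every header cell fails to convert, including the empty one.
print([conversion_error(x) for x in header_row])
# Every cell of the numeric row converts, so each entry is None.
print([conversion_error(x) for x in data_row])
```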
(1) If you want to skip the header, you can do it with:
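A minimal sketch of the header-skipping approach, assuming the first row is the only non-numeric row in the file. The function name `load_dataset_skip_header` is hypothetical; it stands in for `loadDatasetNB` from the question.

```python
import csv


def load_dataset_skip_header(filename):
    """Load a CSV of numeric rows, discarding the first (header) row."""
    with open(filename, "rt", newline="") as f:
        reader = csv.reader(f)
        # Consume the header row; the None default avoids StopIteration on an empty file.
        next(reader, None)
        return [[float(x) for x in row] for row in reader]
```

Any later row that still contains text will raise the same ValueError, which is useful if you want bad data to be reported rather than hidden.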
(2) Or you can just ignore the exception:
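A sketch of the exception-ignoring approach, again with a hypothetical function name (`load_dataset_ignore_errors`). It drops every row in which any field fails to convert, not just the header.

```python
import csv


def load_dataset_ignore_errors(filename):
    """Load a CSV, keeping only rows whose every field converts to float."""
    dataset = []
    with open(filename, "rt", newline="") as f:
        for row in csv.reader(f):
            try:
                dataset.append([float(x) for x in row])
            except ValueError:
                # The row contains non-numeric text (for example a header); skip it.
                continue
    return dataset
```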
If you go with option (2), make sure you are skipping only the first row, or only rows that you know for certain contain text. Otherwise, malformed data rows will be dropped silently.