python - How to use k-fold cross validation in scikit with naive bayes classifier and NLTK
I have a small corpus and want to calculate the accuracy of a naive Bayes classifier using 10-fold cross validation. How can I do it?
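Your options are either to set this up yourself or to use something like NLTK-Trainer, since NLTK doesn't directly support cross-validation for machine learning algorithms.
I'd recommend using an existing module for this, but if you want to write your own code, you could do something like the following.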
Supposing you want 10-fold, you would have to partition your training set into 10 subsets, train on 9/10, test on the remaining 1/10, and repeat this for each combination of subsets (10 in total).
Assuming your training set is in a list named training, a simple way to accomplish this would be:
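    num_folds = 10
    # integer division, so the slice indices below stay ints in Python 3
    subset_size = len(training) // num_folds
    for i in range(num_folds):
        # the i-th block of subset_size items is held out for testing
        testing_this_round = training[i*subset_size:][:subset_size]
        # everything before and after that block is used for training
        training_this_round = training[:i*subset_size] + training[(i+1)*subset_size:]
        # train using training_this_round
        # evaluate against testing_this_round
        # save accuracy
    # find mean accuracy over all rounds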
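To make the loop above reusable, here is a minimal sketch that wraps the splitting and averaging in two helper functions. The names k_fold_splits, cross_validate, train_fn and accuracy_fn are hypothetical (they are not part of NLTK or scikit-learn), and the fold boundaries are computed slightly differently from the snippet above so that leftover items are spread across folds instead of never being tested.

```python
from statistics import mean


def k_fold_splits(data, num_folds=10):
    """Yield (train, test) lists; every item appears in exactly one test fold."""
    n = len(data)
    for i in range(num_folds):
        # Integer boundaries spread any remainder across the folds.
        start = i * n // num_folds
        end = (i + 1) * n // num_folds
        yield data[:start] + data[end:], data[start:end]


def cross_validate(data, train_fn, accuracy_fn, num_folds=10):
    """Train on each training split, score on the held-out fold, return the mean."""
    scores = []
    for train, test in k_fold_splits(data, num_folds):
        model = train_fn(train)                   # hypothetical: returns a trained model
        scores.append(accuracy_fn(model, test))   # hypothetical: returns a float in [0, 1]
    return mean(scores)
```

With NLTK, and assuming training is a list of (featureset, label) pairs, something like cross_validate(training, nltk.NaiveBayesClassifier.train, nltk.classify.accuracy) should work. If the list is ordered by label, shuffle it first (for example with random.shuffle) so each fold contains a mix of classes.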