Read a random subset of a CSV file in MATLAB
I have a large dataset (150000 records) in CSV format. The data set has noise and errors in some of the fields. I want to read the file and perform classification with an SVM (using LIBSVM) on it. I need to read a subset of the data that is clean and usable: I am choosing 10000 random records that are clean, with none of the fields noisy. Fields that are noisy have the value 0 or na. How can I do this in MATLAB?
If you want a proper MATLAB solution, you will need to write a custom file reader. That's not worth the effort, though.
The fastest solution I can think of is to filter out the erroneous lines using a tool such as grep prior to loading the file in MATLAB with csvread. If you have grep, you can get rid of the lines containing 'na':
cat file | grep --invert-match na > file.filtered
You can then read file.filtered without issues using MATLAB's csvread function. You can get rid of the rows containing 0's within MATLAB easily.
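As a rough sketch of that last step, the following assumes file.filtered contains only numeric, comma-separated rows with no header line, and that a 0 in any column (including a label column, if there is one) marks a noisy record. The variable names are made up for illustration, and the two-argument form of randperm needs a reasonably recent MATLAB release.

```matlab
% Load the pre-filtered file (assumed: purely numeric, no header row).
data = csvread('file.filtered');

% Keep only the rows in which no field equals 0.
% Assumption: a 0 in ANY column marks the record as noisy.
clean = data(all(data ~= 0, 2), :);

% Draw up to 10000 of the clean rows at random, without replacement.
n = min(10000, size(clean, 1));
idx = randperm(size(clean, 1), n);
subset = clean(idx, :);
```

If 0 is a legitimate value in some column (a class label, say), restrict the comparison to the feature columns instead of the whole row before passing subset on to LIBSVM.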