python - Is there a better way to do csv/namedtuple with urlopen?


Using the namedtuple documentation example as my template in Python 3.3, I have the following code to download a CSV and turn it into a series of namedtuple subclass instances:

    from collections import namedtuple
    from csv import reader
    from urllib.request import urlopen

    securitytype = namedtuple('securitytype', 'sector, name')

    url = 'http://bsym.bloomberg.com/sym/pages/security_type.csv'
    for sec in map(securitytype._make, reader(urlopen(url))):
        print(sec)

This raises the following exception:

    Traceback (most recent call last):
      File "scrap.py", line 9, in <module>
        for sec in map(securitytype._make, reader(urlopen(url))):
    _csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

I know the issue is that urlopen is returning bytes and not strings, and that I need to decode the output at some point. Here's how I'm doing it now, using StringIO:

    from collections import namedtuple
    from csv import reader
    from urllib.request import urlopen
    import io

    securitytype = namedtuple('securitytype', 'sector, name')

    url = 'http://bsym.bloomberg.com/sym/pages/security_type.csv'
    # Read the whole response, decode it, then wrap the text in a new buffer.
    reader_input = io.StringIO(urlopen(url).read().decode('utf-8'))

    for sec in map(securitytype._make, reader(reader_input)):
        print(sec)

This smells funny because I'm iterating over the bytes buffer, decoding, rebuffering, and then iterating over the new string buffer. Is there a more Pythonic way to do this without two iterations?

Use io.TextIOWrapper() to decode the urllib response:

    reader_input = io.TextIOWrapper(urlopen(url), encoding='utf8', newline='')

Now csv.reader is passed the exact same interface it would get when opening a regular file on the filesystem in text mode.
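
To try the same pattern without touching the network, here is a minimal sketch that substitutes an in-memory io.BytesIO for the urlopen response. The sample bytes and the helper name rows_from_binary are made up for illustration, and the sketch assumes the payload is UTF-8 encoded.

```python
import io
from collections import namedtuple
from csv import reader

securitytype = namedtuple('securitytype', 'sector, name')

# Hypothetical sample data standing in for the body of the HTTP response.
SAMPLE_BYTES = b'Market Sector,Security Type\r\nComdty,Calendar Spread Option\r\n'


def rows_from_binary(binary_stream, encoding='utf-8'):
    """Decode a binary stream and yield one namedtuple per CSV row."""
    # newline='' leaves line-ending handling to the csv module.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline='')
    yield from map(securitytype._make, reader(text_stream))


# io.BytesIO is a stand-in for urlopen(url): both are binary file-like objects.
for sec in rows_from_binary(io.BytesIO(SAMPLE_BYTES)):
    print(sec)
```

Because TextIOWrapper decodes in chunks as csv.reader asks for lines, the response does not have to be read, decoded, and rebuffered in full up front, which is the double handling the question wanted to avoid.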

With this change your example URL works for me on Python 3.3.1:

    >>> for sec in map(securitytype._make, reader(reader_input)):
    ...     print(sec)
    ...
    securitytype(sector='market sector', name='security type')
    securitytype(sector='comdty', name='calendar spread option')
    securitytype(sector='comdty', name='financial commodity future.')
    securitytype(sector='comdty', name='financial commodity generic.')
    securitytype(sector='comdty', name='financial commodity option.')
    ...
    securitytype(sector='muni', name='zero coupon, oid')
    securitytype(sector='pfd', name='private')
    securitytype(sector='pfd', name='public')
    securitytype(sector='', name='')
    securitytype(sector='', name='')
    securitytype(sector='', name='')
    securitytype(sector='', name='')
    securitytype(sector='', name='')
    securitytype(sector='', name='')
    securitytype(sector='', name='')
    securitytype(sector='', name='')
    securitytype(sector='', name='')

The last lines appear to yield empty tuples; the original file indeed has lines with nothing more than a comma on them.
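
If those comma-only lines are unwanted, one option is to drop rows whose fields are all empty before building the tuples. This is a small sketch rather than part of the original answer; the helper name non_empty_rows and the sample text are hypothetical.

```python
import io
from collections import namedtuple
from csv import reader

securitytype = namedtuple('securitytype', 'sector, name')

# Hypothetical sample: the last two lines contain nothing but a comma.
SAMPLE_TEXT = 'Pfd,Private\r\nPfd,Public\r\n,\r\n,\r\n'


def non_empty_rows(rows):
    """Yield only rows that have at least one non-blank field."""
    for row in rows:
        # A completely blank line parses to [], which any() also rejects,
        # so _make never sees a row with the wrong number of fields here.
        if any(field.strip() for field in row):
            yield row


text_stream = io.StringIO(SAMPLE_TEXT, newline='')
for sec in map(securitytype._make, non_empty_rows(reader(text_stream))):
    print(sec)
```

The filter works on any iterable of parsed rows, so it can sit between reader(reader_input) and the map() call from the answer without other changes.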

