python - Requesting a URL outside the allowed domains to get the status code from the response
I'm looking for a way to make requests to domains that are not in the allowed list, in order to check the status of outbound links, but the function "parse_outboundlinks" is never called.
Do I have to modify the allowed domains?
Thanks for your help.
My code:
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.http import Request

class MySpider(CrawlSpider):
    name = "myspider"
    allowed_domains = ["monsite.fr"]
    start_urls = ["http://www.monsite.fr/"]
    rules = [Rule(SgmlLinkExtractor(allow=()), follow=True, callback='parse_item')]

    def parse_item(self, response):
        xlink = SgmlLinkExtractor(deny_domains=(self.allowed_domains[0]))
        for link in xlink.extract_links(response):
            Request(link.url, callback=self.parse_outboundlinks)

    def parse_outboundlinks(self, response):
        print response.status
The callback is only called if the request is yielded, which makes parse_item a generator. Change

    Request(link.url, callback=self.parse_outboundlinks)

to

    yield Request(link.url, callback=self.parse_outboundlinks)
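With the yield in place, a corrected parse_item would look like the sketch below. Two caveats beyond the fix itself, both assumptions about Scrapy's default middleware stack rather than part of the original answer: OffsiteMiddleware still drops requests to domains outside allowed_domains unless the request sets dont_filter=True, and HttpErrorMiddleware normally swallows non-2xx responses, so the error status codes the question is after only reach the callback if they are whitelisted (via the handle_httpstatus_list request meta key on recent versions, or a spider attribute of the same name on older ones).

    def parse_item(self, response):
        xlink = SgmlLinkExtractor(deny_domains=(self.allowed_domains[0]))
        for link in xlink.extract_links(response):
            # yield hands the request to the engine; a bare call just builds
            # the Request object and throws it away
            yield Request(link.url,
                          callback=self.parse_outboundlinks,
                          # assumption: bypass OffsiteMiddleware so requests to
                          # domains outside allowed_domains are not dropped
                          dont_filter=True,
                          # assumption: let 4xx/5xx responses reach the callback
                          # instead of being filtered by HttpErrorMiddleware
                          meta={'handle_httpstatus_list': range(400, 600)})

    def parse_outboundlinks(self, response):
        # response.status carries the HTTP status code of the outbound link
        print response.status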
A similar problem is discussed in another thread:
Scrapy's Request function not being called