python - Requesting a URL outside the allowed domains to get the status code from the response
I'm looking for a way to make requests to domains that are not in the allowed list, in order to check the status of outbound links, but the function "parse_outboundlinks" is never called.
Do I have to modify the allowed domains?
Thanks for your help.
My code:
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.http import Request

class MySpider(CrawlSpider):
    name = "myspider"
    allowed_domains = ["monsite.fr"]
    start_urls = ["http://www.monsite.fr/"]
    rules = [Rule(SgmlLinkExtractor(allow=()), follow=True, callback='parse_item')]

    def parse_item(self, response):
        xlink = SgmlLinkExtractor(deny_domains=(self.allowed_domains[0]))
        for link in xlink.extract_links(response):
            Request(link.url, callback=self.parse_outboundlinks)

    def parse_outboundlinks(self, response):
        print response.status
The callback is only called if the request is yielded, which makes parse_item a generator. Change

    Request(link.url, callback=self.parse_outboundlinks)

to

    yield Request(link.url, callback=self.parse_outboundlinks)
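With the yield in place, a corrected parse_item would look like the sketch below. Two caveats beyond the fix itself, both assumptions about Scrapy's default middleware stack rather than part of the original answer: OffsiteMiddleware still drops requests to domains outside allowed_domains unless the request sets dont_filter=True, and HttpErrorMiddleware normally swallows non-2xx responses, so the error status codes the question is after only reach the callback if they are whitelisted (via the handle_httpstatus_list request meta key on recent versions, or a spider attribute of the same name on older ones).

    def parse_item(self, response):
        xlink = SgmlLinkExtractor(deny_domains=(self.allowed_domains[0]))
        for link in xlink.extract_links(response):
            # yield hands the request to the engine; a bare call just builds
            # the Request object and throws it away
            yield Request(link.url,
                          callback=self.parse_outboundlinks,
                          # assumption: bypass OffsiteMiddleware so requests to
                          # domains outside allowed_domains are not dropped
                          dont_filter=True,
                          # assumption: let 4xx/5xx responses reach the callback
                          # instead of being filtered by HttpErrorMiddleware
                          meta={'handle_httpstatus_list': range(400, 600)})

    def parse_outboundlinks(self, response):
        # response.status carries the HTTP status code of the outbound link
        print response.status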
A similar problem is discussed in another thread:
Scrapy's Request function not being called