Scrapy Following Links - Scrapy

How to follow links in Scrapy?


This chapter explains how to extract the links you are interested in from a page, follow them, and extract data from the pages they lead to. To do this, the spider from the previous chapter needs a few changes.

The modified spider relies on the following methods:

  • parse() − Extracts the links of interest from the response.
  • response.urljoin() − Called by parse() to build an absolute URL from each extracted link. parse() then yields a new request for that URL, and the response to it is later passed to the callback.
  • parse_dir_contents() − The callback that actually scrapes the data of interest.

Here, Scrapy uses a callback mechanism to follow links. With this mechanism you can build larger crawlers that follow the links of interest and scrape the desired data from different pages. The usual pattern is a callback method that extracts the items, looks for a link to the next page, and then yields a request for that page with the same method as its callback.

Because each new response is handled by the same callback, this produces a loop that follows the links from one page to the next until no further link is found.
