In some cases, you may have a list of similarly structured URLs (like a batch of product URLs) on hand, and you want to extract the data from them directly. In this tutorial, we will introduce an easy and powerful way to extract data from multiple web pages in Octoparse by using a list of URLs.

When should you consider scraping by using a list of URLs? Here are some cases where you can start the task with a list of URLs for extraction. The most important requirement is that all the URLs are under the same domain and share the same webpage structure.

You already have the URLs. Example: I have a list of product URLs, and I want to start a task with that list directly to scrape updated pricing data regularly.

The website uses infinite scrolling or a "load more" button to load the content. If you need to collect data by clicking on each URL to scrape details on the deeper layer, then you'll need to split the task into two. One task loads the page and scrapes the URLs, and the other uses the list of extracted URLs to scrape the detailed info. Example: Zara's search result page uses infinite scrolling to keep loading new items. If the data you need is on the item page, then you need to set the scrolling times and collect enough product URLs first for the next task.

The website applies AJAX (see Deal with AJAX) to load new content. This means that after clicking into the first product page, the system fails to go back to the listing page automatically, so it cannot click into the second product page from there.
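The two-task split described above can also be sketched in plain Python for readers who want to see the idea outside Octoparse. This is only an illustration under stated assumptions, not how Octoparse works internally: the function names, the `/product/` path marker, and the `fetch` and `extract` callables are all hypothetical, and the listing HTML is assumed to be already fully loaded (for example, after scrolling).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects <a href> values that contain a given path marker."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker  # hypothetical marker for product links
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            absolute = urljoin(self.base_url, href)
            if absolute not in self.urls:  # skip duplicates, keep order
                self.urls.append(absolute)


def collect_product_urls(listing_html, base_url, marker="/product/"):
    """Task 1: pull product URLs out of an already-loaded listing page."""
    parser = LinkCollector(base_url, marker)
    parser.feed(listing_html)
    return parser.urls


def same_domain(urls):
    """True when every URL has the same host (the key requirement)."""
    return len({urlparse(u).netloc for u in urls}) <= 1


def scrape_details(urls, fetch, extract):
    """Task 2: visit each URL in the list and extract its fields.

    `fetch` (URL -> HTML) and `extract` (HTML -> dict) are supplied by
    the caller; both are assumptions made for this sketch.
    """
    if not same_domain(urls):
        raise ValueError("URLs must be under the same domain")
    return [{"url": u, **extract(fetch(u))} for u in urls]
```

Because the second task only needs the URL list, it can be rerun on a schedule (for example, to refresh prices) without loading the listing page again.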