Scrape Data from Amazon Using Octoparse

Upload: dianne-hung

Post on 11-Apr-2017

Category: Data & Analytics


Page 1: Scrape Data from Amazon Using Octoparse

Collect Data from Amazon

www.octoparse.com

Click “start” to build a new task, or hit the “Quick start” button in the Navigation Panel to create a new task. (Here we use Advanced Mode.)

Step 3. Complete the basic information. ➜ Click “Next”.

Step 4. Design the workflow to configure the extraction rule. If something goes wrong, you can check the configured rule in Workflow Designer.

Create a list of links to all the subcategories. Wait until the page has loaded, then click the first subcategory. ➜ Choose “create a list of items”.

Select “Add current item to the list” ➜ “Continue to edit the list” ➜ Click the second subcategory.

Select “Add current item to the list” again.

When you get all the subcategory links, click “Finish Creating List”. ➜ Select “Loop” to process the list.

Now you can see that it automatically enters the first category page.

Click “Next Page” ➜ “loop click next page” to create a loop action that processes all the web pages. The pagination action has now been added to the extraction rule.
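
The pagination loop can be pictured as repeatedly following a "next" link until none is left. The sketch below is only an illustration of that idea, not Octoparse's own implementation: the in-memory PAGES dictionary, the function name and the max_pages safety limit are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory "site": each page lists its items and its next page.
PAGES: Dict[str, dict] = {
    "page-1": {"items": ["A", "B"], "next": "page-2"},
    "page-2": {"items": ["C"], "next": "page-3"},
    "page-3": {"items": ["D", "E"], "next": None},
}


def loop_click_next_page(start: str, max_pages: int = 100) -> List[str]:
    """Visit pages by following 'next' until there is no next page."""
    collected: List[str] = []
    current: Optional[str] = start
    visited = 0
    while current is not None and visited < max_pages:
        page = PAGES[current]
        collected.extend(page["items"])  # work done on the current page
        current = page["next"]           # stands in for clicking "Next Page"
        visited += 1
    return collected


print(loop_click_next_page("page-1"))
```

The max_pages limit is a guard against a page whose "next" link points back to an earlier page.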

Then go back to the first product section. To capture the information inside the product section, you have to click the detail link to get into the detail page. ➜ Choose the detail link. ➜ Click the first product title to “create a list of items”.

Click “Add current item to the list” ➜ “Continue to edit the list”.

Then click the second product title. ➜ Click “Add current item to the list” ➜ “Finish Creating List”.

As can be seen, all the detail links on the first page are now in the list. Click “Loop” to process the list.
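
One way to picture why two example clicks are enough to pick up every link on the page is path generalization: if the two clicked elements differ only in a position index, dropping that index yields a pattern that matches all their siblings. This is an assumption about the general technique, not a description of how Octoparse works internally, and the paths and function names below are hypothetical.

```python
import re
from typing import List


def split_steps(path: str) -> List[str]:
    """Split an XPath-like path into its steps, ignoring empty parts."""
    return [step for step in path.split("/") if step]


def generalize(path_a: str, path_b: str) -> str:
    """Drop the position index wherever two same-depth paths disagree."""
    steps_a, steps_b = split_steps(path_a), split_steps(path_b)
    if len(steps_a) != len(steps_b):
        raise ValueError("paths have different depth")
    result = []
    for a, b in zip(steps_a, steps_b):
        if a == b:
            result.append(a)
            continue
        # Strip a trailing [n] index and check that the tag names agree.
        tag_a = re.sub(r"\[\d+\]$", "", a)
        tag_b = re.sub(r"\[\d+\]$", "", b)
        if tag_a != tag_b:
            raise ValueError("paths differ in more than a position index")
        result.append(tag_a)
    return "/" + "/".join(result)


# Hypothetical paths of the first and second product titles.
first = "/html/body/div[2]/ul/li[1]/h2/a"
second = "/html/body/div[2]/ul/li[2]/h2/a"
print(generalize(first, second))
```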

Now you're on the detail page, where you can extract any information you need. Click on the product title to extract it.

Click “Extract Text”.

Click on the price to extract it. ➜ Then click “Extract Text”. You now have the product title and price in the Customize Current Action box.
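
For readers who want to see what "extract text" amounts to outside the tool, here is a minimal sketch using only Python's standard library. The markup, the element ids "title" and "price", and the class and function names are all made up for illustration; a real product page is structured differently and changes over time.

```python
from html.parser import HTMLParser
from typing import Dict, Iterable, Optional

# Hypothetical detail-page markup, not taken from a real site.
DETAIL_HTML = """
<html><body>
  <span id="title">  Example Widget </span>
  <span id="price">$19.99</span>
</body></html>
"""


class TextById(HTMLParser):
    """Collect the text of elements whose id is in `wanted`.

    Simplification: capture stops at the first closing tag, so this
    only handles elements without nested children.
    """

    def __init__(self, wanted: Iterable[str]) -> None:
        super().__init__()
        self.wanted = set(wanted)
        self.current: Optional[str] = None
        self.fields: Dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id in self.wanted:
            self.current = element_id
            self.fields[element_id] = ""

    def handle_data(self, data):
        if self.current is not None:
            self.fields[self.current] += data

    def handle_endtag(self, tag):
        self.current = None


def extract_text(html: str, wanted: Iterable[str]) -> Dict[str, str]:
    """Return the stripped text of each wanted element, keyed by id."""
    parser = TextById(wanted)
    parser.feed(html)
    return {key: value.strip() for key, value in parser.fields.items()}


print(extract_text(DETAIL_HTML, ["title", "price"]))
```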

Drag the second “Loop Item” before the “Click to paginate” action.
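
The reason for this ordering is that every product on the current page should be processed before the task moves on to the next page. As a rough sketch, the finished workflow behaves like the nested loops below. The SITE data, the function name and the exact nesting are assumptions for illustration, not an export of the actual Octoparse rule.

```python
from typing import Dict, List, Tuple

# Hypothetical site: subcategory -> list of pages -> list of products.
SITE: Dict[str, List[List[dict]]] = {
    "Subcategory 1": [
        [{"title": "P1", "price": "$1"}, {"title": "P2", "price": "$2"}],
        [{"title": "P3", "price": "$3"}],
    ],
    "Subcategory 2": [
        [{"title": "P4", "price": "$4"}],
    ],
}


def run_workflow(site: Dict[str, List[List[dict]]]) -> List[Tuple[str, str, str]]:
    """Process every product on a page before paginating."""
    rows: List[Tuple[str, str, str]] = []
    for subcategory, pages in site.items():      # first Loop Item
        page_index = 0
        while True:                               # pagination loop
            for product in pages[page_index]:     # second Loop Item
                rows.append((subcategory, product["title"], product["price"]))
            if page_index + 1 >= len(pages):      # no "Next Page" left
                break
            page_index += 1                       # Click to paginate
    return rows


for row in run_workflow(SITE):
    print(row)
```

If the pagination click came first, the products on the first page of each subcategory would be skipped.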

Now we are done configuring the extraction rule! Click “Next” to process the configured rule. If images are not needed, you can choose not to load them to speed up the extraction.

Now the task is complete! Choose “Local extraction” to run the task on your computer.

The extracted data will be shown in the “Data Extracted” pane. Click the “Export” button to export the results to an Excel file, a database, or another format, and save the file to your computer.
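
If you post-process the results yourself, the same title and price rows can be written out with a few lines of Python. The sketch below uses CSV as a stand-in for "other formats", since it needs only the standard library and opens in Excel. The sample rows, column names and function name are hypothetical.

```python
import csv
import io
from typing import Dict, List

# Hypothetical extracted rows with the two fields captured in this walkthrough.
ROWS: List[Dict[str, str]] = [
    {"title": "Example Widget", "price": "$19.99"},
    {"title": "Another Widget", "price": "$5.00"},
]


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(ROWS)
print(csv_text)

# To save to disk instead (the file name is an example):
# with open("products.csv", "w", newline="", encoding="utf-8") as handle:
#     handle.write(csv_text)
```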

Happy Data Hunting

www.octoparse.com