Webscraping-project

The instructions below explain how to run each scraper.

Running BeautifulSoup scraper

To use the BeautifulSoup scraper, run "laptops-soup.py" from a terminal or from a Python-capable editor such as VS Code.

To scrape more than 100 pages, set the pages100 variable to False (the Python boolean, not the quoted string).
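As a rough illustration of what such a flag does, here is a minimal sketch. Only the name pages100 comes from this README; the helper pages_to_scrape, its parameters, and the page counts are hypothetical and may not match the code in "laptops-soup.py".

```python
# Hypothetical sketch: how a boolean flag such as pages100 could cap a crawl.
# Only the variable name pages100 is taken from the README.
pages100 = True  # set to False to lift the 100-page cap


def pages_to_scrape(total_pages, limit_enabled, limit=100):
    """Return how many pages to visit, capped at `limit` when the flag is on."""
    if limit_enabled:
        return min(total_pages, limit)
    return total_pages


if __name__ == "__main__":
    # With the flag on, a 250-page listing is cut down to the first 100 pages.
    print(pages_to_scrape(250, pages100))
```

Note that a quoted "False" is a non-empty string and therefore truthy in Python, which is why the unquoted boolean is the safer choice.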

Collected data will be saved in the "laptops-bs4.csv" file.

Running Scrapy scraper

The Scrapy spiders must be run in the following order:

1. Run "pages" spider and save results to pages.csv
2. Run "laptops_links" spider and save results to laptops_links.csv
3. Run "laptops" spider and save results to laptops.csv
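The three steps above can be sketched as a small Python helper that calls the standard "scrapy crawl <spider> -o <file>" command once per spider. The spider names and CSV file names come from the list above; the helper names build_commands and run_all, and the assumption that the commands are launched from the Scrapy project directory (the one containing scrapy.cfg), are illustrative.

```python
import subprocess

# Spider names and output files, in the order given in the README.
SPIDER_RUNS = [
    ("pages", "pages.csv"),
    ("laptops_links", "laptops_links.csv"),
    ("laptops", "laptops.csv"),
]


def build_commands(runs=SPIDER_RUNS):
    """Build one `scrapy crawl <spider> -o <file>` command per spider."""
    return [["scrapy", "crawl", spider, "-o", output] for spider, output in runs]


def run_all(project_dir="."):
    """Run the spiders one after another, stopping at the first failure.

    Assumes `project_dir` is the Scrapy project directory (with scrapy.cfg)
    and that the `scrapy` command is available on PATH.
    """
    for command in build_commands():
        subprocess.run(command, cwd=project_dir, check=True)
```

Calling run_all() from the project directory runs all three spiders in sequence. The -o option appends to an existing file, so it may be worth deleting old CSV files before a fresh run.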

Running Selenium scraper

To use the Selenium scraper, run "laptops-selenium.py" from a terminal or from a Python-capable editor such as VS Code.

If your webdriver, web browser, or path to geckodriver differs from the one set in the script, update it before running the code. The webdriver used in the Python file was copied from the class materials.
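One possible way to make the geckodriver location less brittle is sketched below, assuming Selenium 4 and Firefox. The function names resolve_geckodriver and make_firefox_driver are hypothetical and are not taken from "laptops-selenium.py".

```python
import shutil
from pathlib import Path


def resolve_geckodriver(configured_path=None):
    """Return a usable geckodriver path, or None if none is found.

    Prefers the explicitly configured path, then falls back to PATH.
    """
    if configured_path and Path(configured_path).is_file():
        return str(Path(configured_path))
    return shutil.which("geckodriver")


def make_firefox_driver(configured_path=None):
    """Start Firefox through Selenium 4 using the resolved geckodriver."""
    # Imported here so the path check above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service

    driver_path = resolve_geckodriver(configured_path)
    if driver_path is None:
        raise FileNotFoundError(
            "geckodriver not found; set its path or add it to PATH"
        )
    return webdriver.Firefox(service=Service(executable_path=driver_path))
```

With this approach the script only needs an explicit path when geckodriver is not already on PATH.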

To scrape more than 100 pages, set the pages100 variable to False (the Python boolean, not the quoted string).

Collected data will be saved in the "laptops-selenium.csv" file.
