
Web scraping script that extracts data from an online book store. It saves all product names and images in an organized, self-generated folder structure.


Pascal273/2nd-Project-Market-Analysis


Introduction

This is the second project of the OpenClassrooms Python path. The goal is to scrape the Books to Scrape website and extract the following:

  1. All categories
  2. All products (books) in each category
  3. All required details for each product
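The first step, collecting the categories, can be sketched as follows. This is a minimal illustration and not the project's actual code: the base URL, the CSS selector, and the `extract_categories` helper are assumptions, and the HTML is an inline sample rather than a live page, so no network access is needed.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumed site root; the README does not state the URL.
BASE_URL = "https://books.toscrape.com/"

# Inline sample that imitates a sidebar with nested category links.
SAMPLE_HTML = """
<div class="side_categories">
  <ul class="nav nav-list">
    <li><a href="catalogue/category/books_1/index.html">Books</a>
      <ul>
        <li><a href="catalogue/category/books/travel_2/index.html">
            Travel
        </a></li>
        <li><a href="catalogue/category/books/mystery_3/index.html">
            Mystery
        </a></li>
      </ul>
    </li>
  </ul>
</div>
"""


def extract_categories(html, base_url=BASE_URL):
    """Return {category name: absolute URL} for every nested sidebar link."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumed markup: real categories sit in the inner <ul>, which skips
    # the top-level "Books" link in the outer list.
    links = soup.select("div.side_categories ul ul a")
    return {a.get_text(strip=True): urljoin(base_url, a["href"]) for a in links}


categories = extract_categories(SAMPLE_HTML)
for name, url in categories.items():
    print(name, url)
```

In a real run the HTML would come from `requests.get(BASE_URL).text`, and each category URL would then be visited to collect its books.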

The program then saves the collected details as separate lists in a CSV file (one file for each category).
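Writing one CSV file per category might look like the sketch below. The column names in `FIELDS`, the `write_category_csv` helper, and the one-row-per-book layout are hypothetical; the project may arrange its lists differently.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column names; the real project may record other details.
FIELDS = ["title", "price", "product_page_url"]


def write_category_csv(category, books, out_dir):
    """Write one CSV file named after the category, one row per book."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{category}.csv"
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(books)
    return path


# Made-up sample data, written to a temporary directory for the demo.
demo_books = [
    {"title": "Sample Book", "price": "10.00", "product_page_url": "page-1.html"},
    {"title": "Another Book", "price": "12.50", "product_page_url": "page-2.html"},
]
csv_path = write_category_csv("Travel", demo_books, tempfile.mkdtemp())
print(csv_path)
```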

It also downloads the image of each product, names the file after the book it belongs to, and stores it in a separate folder organized by the book's category.
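Because book titles can contain characters that are not allowed in file names, storing images by title usually needs a cleanup step. The sketch below shows one way to do it; `safe_name`, `image_path`, `save_image`, the `.jpg` extension, and the length limit are illustrative assumptions, not the project's own functions.

```python
import re
from pathlib import Path


def safe_name(text, max_len=100):
    """Strip characters that are invalid in file names on common systems."""
    cleaned = re.sub(r'[<>:"/\\|?*]', "", text)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return cleaned[:max_len].rstrip() or "untitled"


def image_path(root, category, title, ext=".jpg"):
    """Build <root>/<category>/<title><ext> from sanitized parts."""
    return Path(root) / safe_name(category) / f"{safe_name(title)}{ext}"


def save_image(data, root, category, title):
    """Write raw image bytes, creating the category folder if needed."""
    target = image_path(root, category, title)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


print(image_path("images", "Travel", "What If?: Serious Answers"))
```

In a real run, `data` would be the `content` attribute of the response returned by `requests.get` for the image URL.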

Required Setup to run the program:

  1. Python version 3.9.5 or higher must be installed.
  2. Create the directory in which you want to keep the program.
  3. Open your terminal.
  4. Navigate to the folder that contains main.py and requirements.txt.
  5. Create your virtual environment by running the command: python -m venv .venv
  6. Activate the environment by running: .venv\Scripts\activate.bat (Windows) or source .venv/bin/activate (macOS/Linux)
  7. Install the Requirements by running the command: pip install -r requirements.txt

How to run the program:

  1. Open your terminal
  2. Navigate to the directory that contains the main.py file
  3. Activate the environment by running: .venv\Scripts\activate.bat (Windows) or source .venv/bin/activate (macOS/Linux)
  4. Run the command: python main.py
  5. While the program runs, it continuously displays the current category and the title of each book it has just saved.

Notes

  • The program will create all files and folders automatically in the program directory.

Technologies

  • Python version 3.9.5

  • BeautifulSoup4 version 4.9.3

  • Requests version 2.6.0
