azza16/scraping

Web scraping

This repository contains scripts for a range of scraping jobs.

Each folder and its contents are described briefly below:

  1. dynamicPages: Contains a script that collects online newspaper articles from dynamic websites
    • Written in Python, using the Selenium package
  2. eventRegistry: Contains a script that collects articles from the Event Registry API
    • Written in Python
  3. facebook: Contains a script that collects posts from public Facebook pages
    • Written in JavaScript, using the Puppeteer package
  4. news: Contains multiple scripts that collect articles from various online newspaper websites, based on a provided configuration
    • Written in Python, using the Scrapy framework
  5. reddit: Contains a script that collects Reddit comments through the official Reddit API
    • Written in Python, using the PRAW package
  6. twitter: Contains a script that collects tweets through the official Twitter API
    • Written in Python
  7. youtube: Contains a script that collects YouTube video comments through the YouTube Data API
    • Written in Python
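
To illustrate what "based on a provided configuration" might look like for the news scripts, here is a minimal sketch of configuration-driven extraction. It is not the repository's code: it uses Python's standard-library `html.parser` instead of Scrapy so that it runs without extra dependencies, and the names `SITE_CONFIG`, `FieldExtractor`, `extract_article`, the site key `example-news`, and the tag/class pairs are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical per-site configuration: field name -> (tag, CSS class) to extract.
# The real scripts may structure their configuration differently.
SITE_CONFIG = {
    "example-news": {
        "title": ("h1", "headline"),
        "body": ("div", "article-body"),
    },
}


class FieldExtractor(HTMLParser):
    """Collects the text inside the first element matching (tag, css_class)."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.depth = 0      # > 0 while inside the matched element
        self.done = False   # True once the first match has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth > 0:
            # Track nested elements of the same tag so we close at the right one.
            if tag == self.tag:
                self.depth += 1
        elif tag == self.tag:
            classes = (dict(attrs).get("class") or "").split()
            if self.css_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_article(html, site):
    """Applies the configured rules for `site` and returns a dict of fields."""
    article = {}
    for field, (tag, css_class) in SITE_CONFIG[site].items():
        parser = FieldExtractor(tag, css_class)
        parser.feed(html)
        # Join text chunks and collapse runs of whitespace.
        article[field] = " ".join(" ".join(parser.chunks).split())
    return article


if __name__ == "__main__":
    sample = (
        '<h1 class="headline">Sample headline</h1>'
        '<div class="article-body"><p>First paragraph.</p></div>'
    )
    print(extract_article(sample, "example-news"))
```

With this layout, supporting another website would only require a new entry in the configuration rather than a new script. In a Scrapy spider, the same idea would typically be expressed with per-site CSS or XPath selectors.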
