Leader Behaviour Prediction

This project extracts and gathers information about the behaviour or misconduct of a leader or representative (matched against a predefined list of adjectives) by continuously scraping news websites.

The output is converted to a JSON file.

Installing required libraries

sudo pip install -r requirements.txt

Scraping Times of India website

I scraped the Times of India website specifically for this purpose.

Dataset obtained after scraping the Times of India website

This dataset holds the details of the scraped articles. The article text is scraped and names are extracted from it. The predefined adjectives found in the text are then matched with the extracted names.
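The matching step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name match_adjectives_to_names, the sample article, the names, and the adjective list are all hypothetical, and "matching" is assumed here to mean that a name and an adjective appear in the same sentence.

```python
import re
from collections import defaultdict


def match_adjectives_to_names(text, names, adjectives):
    """Map each known name to the predefined adjectives sharing a sentence with it."""
    matches = defaultdict(set)
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    for sentence in sentences:
        lowered = sentence.lower()
        words = set(re.findall(r"[a-z']+", lowered))
        found_adjectives = {adj for adj in adjectives if adj.lower() in words}
        if not found_adjectives:
            continue
        for name in names:
            # Case-insensitive substring match on the full name.
            if name.lower() in lowered:
                matches[name] |= found_adjectives
    return {name: sorted(adjs) for name, adjs in matches.items()}


# Hypothetical sample data, for illustration only.
article = (
    "Jane Doe was called corrupt by the opposition. "
    "John Roe opened a new school on Monday."
)
result = match_adjectives_to_names(
    article,
    names=["Jane Doe", "John Roe"],
    adjectives=["corrupt", "dishonest"],
)
print(result)  # {'Jane Doe': ['corrupt']}
```

Sentence-level co-occurrence is a crude signal: it does not check that the adjective actually describes the named person, so results would likely need manual review.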

The dataset is located at:

LeaderBehaviour/leaderBehaviour/leaderBehaviour/spiders/newsTOI.sqlite
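The table and column names inside this SQLite file are not documented here, so a quick way to explore it is to list them first. The sketch below uses only Python's standard sqlite3 module; the helper name list_tables is hypothetical and nothing about the schema is assumed.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: [column names]} for every table in the database."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # Escape embedded double quotes so the identifier is quoted safely.
            quoted = table.replace('"', '""')
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table] = [col[1] for col in columns]
        return schema
    finally:
        conn.close()
```

Calling list_tables with the path above should print the schema of the scraped dataset. Note that sqlite3.connect creates an empty file if the path does not exist, so check the path before running it.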

Scraped names of US legislators:

LeaderBehaviour/getUSNames/getUSNames/spiders/getUSNames.json

Scraped names of the members of parliament in India:

LeaderBehaviour/getIndianPolNames/getIndianPolNames/spiders/getIndianPolNames.json

Additional objectives:

* Done: custom headers / user-agent are set in Scrapy.
* To do: use a proxy or integrate with Tor to make the scraper harder to trace.
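The two objectives above could look roughly like this in a Scrapy project. This is a sketch under stated assumptions, not the project's configuration: the user-agent string, the header values, the proxy address, and both helper names are hypothetical. USER_AGENT and DEFAULT_REQUEST_HEADERS are standard Scrapy settings, and Scrapy's built-in HttpProxyMiddleware reads the proxy key from a request's meta dict. The proxy address assumes a local HTTP proxy such as Privoxy forwarding to Tor, since Tor itself exposes a SOCKS port.

```python
# Hypothetical values; replace them with your own.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64) LeaderBehaviourBot/0.1"
# Assumes a local HTTP proxy (for example Privoxy in front of Tor) on port 8118.
HTTP_PROXY = "http://127.0.0.1:8118"


def build_custom_settings(user_agent=USER_AGENT):
    """Build a dict suitable for a spider's `custom_settings` class attribute."""
    return {
        "USER_AGENT": user_agent,
        "DEFAULT_REQUEST_HEADERS": {
            "Accept": "text/html,application/xhtml+xml",
            "Accept-Language": "en",
        },
    }


def build_request_meta(proxy_url=HTTP_PROXY):
    """Build the `meta` dict for a scrapy.Request so it is routed through a proxy."""
    return {"proxy": proxy_url}


# Intended use inside a spider (shown as comments so this block runs without Scrapy):
#   class NewsSpider(scrapy.Spider):
#       custom_settings = build_custom_settings()
#       def start_requests(self):
#           yield scrapy.Request(url, meta=build_request_meta())
```

Keeping these values in small helper functions makes it easy to switch the proxy off for local testing by passing a different meta dict.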

Probable names extracted from the scraped text:

LeaderBehaviour/leaderBehaviour/leaderBehaviour/spiders/extractNamesTOI.py
LeaderBehaviour/leaderBehaviour/leaderBehaviour/spiders/probable_names_extracted.json
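How extractNamesTOI.py picks out names is not described here. As one possible approach, and purely as an assumption, a simple heuristic treats any run of two or more capitalised words as a probable name. The function name extract_probable_names and the sample sentence are hypothetical.

```python
import re

# Two or more consecutive capitalised words, e.g. "Jane Doe".
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def extract_probable_names(text):
    """Return probable names in order of first appearance, without duplicates."""
    seen = []
    for candidate in NAME_PATTERN.findall(text):
        if candidate not in seen:
            seen.append(candidate)
    return seen


# Hypothetical sample text, for illustration only.
sample = "The minister Jane Doe met John Roe in Delhi. Jane Doe declined to comment."
print(extract_probable_names(sample))  # ['Jane Doe', 'John Roe']
```

This heuristic over-matches when a capitalised sentence-initial word directly precedes a name, and it misses single-word names, which is one reason to cross-check candidates against the scraped lists of legislators.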

Note

Go to the directory real_shit, copy scrapTOI.sqlite into it, then run:

python get_neg.py
