ScrapPyJS

The ScrapPyJS class provides web scraping on top of Selenium: you scrape data by running a JavaScript snippet on the page directly from Python.

Installing

pip install ScrapPyJS

How to Use

Including and Initiating

from ScrapPyJS import ScrapPyJS

# initiate ScrapPyJS
scrappy = ScrapPyJS()

# set js script
JS_SCRIPT = "return 'ScrapPy scrapping!'"
scrappy.set_script(JS_SCRIPT)

# rest of the code goes here...

# close ScrapPyJS
scrappy.end()
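
Since the session should be closed with end() even when scraping fails, one option is a small context manager. The sketch below is not part of ScrapPyJS: `scraper_session` and its `factory` argument are hypothetical names, and it relies only on the set_script and end methods shown above.

```python
from contextlib import contextmanager


@contextmanager
def scraper_session(factory, script):
    """Create a scraper, load the script, and always call end() on exit.

    `factory` is any zero-argument callable returning an object with
    set_script() and end(), for example the ScrapPyJS class (assumption).
    """
    scraper = factory()
    try:
        scraper.set_script(script)
        yield scraper
    finally:
        # Runs on normal exit and on exceptions alike.
        scraper.end()
```

Usage would then look like `with scraper_session(ScrapPyJS, JS_SCRIPT) as scrappy:` followed by the scraping calls inside the block.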

Simple way

Use the scrap method to scrape a webpage:

result = scrappy.scrap(url, wait=True, wait_for='id', wait_target='elementId')

Retrieve the result of the scraping operation:
```
print(result)
```
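
The script does not have to return a plain string. As a sketch, the script below collects every link on the page. The selector, the `LINKS_SCRIPT` name, and the `scrape_links` helper are illustrative assumptions, and how ScrapPyJS serializes the returned array is not covered here.

```python
# Hypothetical script: gather the text and target of each <a> element.
LINKS_SCRIPT = """
return Array.from(document.querySelectorAll('a[href]')).map(function (a) {
    return { text: a.textContent.trim(), href: a.href };
});
"""


def scrape_links(url, wait_target):
    """Run LINKS_SCRIPT on `url`, passing `wait_target` as the id to wait for."""
    # Imported here so this module can be loaded without Selenium installed.
    from ScrapPyJS import ScrapPyJS

    scrappy = ScrapPyJS()
    try:
        scrappy.set_script(LINKS_SCRIPT)
        # Same keyword arguments as the scrap call shown above.
        return scrappy.scrap(url, wait=True, wait_for='id', wait_target=wait_target)
    finally:
        scrappy.end()
```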

Loop through list of URLs

Set up a list of target URLs:

URLS = [
    'https://url1.com/',
    'https://url2.com/homepage/',
    'https://url2.com/about',
]

Use the loop_through method to scrape each of the target webpages:

# The result value will be a list if save mode is on, else a JSON string
# (arguments assumed to mirror scrap; see CLASS_STRUCTURE.md for the exact signature)
result = scrappy.loop_through(URLS, wait=True, wait_for='id', wait_target='elementId')

Retrieve the result of the scraping operation:
```
print(result)
```
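
Because the return type depends on save mode (a list or a JSON string, per the comment above), downstream code may want a single shape. A minimal sketch, where `as_python` is a hypothetical helper and not part of ScrapPyJS:

```python
import json


def as_python(result):
    """Return `result` as Python data, decoding it first if it is a JSON string."""
    if isinstance(result, str):
        return json.loads(result)
    return result
```

Calling `as_python(result)` then yields Python objects in either mode, provided the string really is valid JSON.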

Save results to a file

Activate save mode

Via toggle:
```
scrappy.toggle_save_mode()
```
Here, save mode, which is False by default, is toggled to True. The save file settings keep their default values.

Via set_save_info method:
```
scrappy.set_save_info(save=True)
```
Here, we set save mode to True directly, leaving the other settings at their defaults.

Configure save mode

Via set_save_info method:

FILE_NAME = "output"
FILE_FORMAT = "json"
SAVE_LOCATION = "path/to/file/"

scrappy.set_save_info(save=True, file_name=FILE_NAME, file_format=FILE_FORMAT, location=SAVE_LOCATION)
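
Whether ScrapPyJS creates a missing save directory is not stated here, so it may be safer to create it up front. The sketch below builds the keyword arguments used in the example above. `build_save_info` is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def build_save_info(file_name, file_format, location):
    """Ensure `location` exists and return keyword arguments for the save settings."""
    # Create the directory (and any parents) if it is missing.
    Path(location).mkdir(parents=True, exist_ok=True)
    return {
        "save": True,
        "file_name": file_name,
        "file_format": file_format,
        "location": location,
    }
```

The result can be unpacked into the call, e.g. `scrappy.set_save_info(**build_save_info(FILE_NAME, FILE_FORMAT, SAVE_LOCATION))`, assuming set_save_info accepts these keywords.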

Please note that you will need to have the necessary Selenium and WebDriver dependencies installed to use this code.
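
A quick way to check these prerequisites before running a scrape is sketched below. The driver executable names are assumptions (which browser ScrapPyJS drives is not stated here), and `check_dependencies` is a hypothetical helper.

```python
import importlib.util
import shutil


def check_dependencies(driver_names=("chromedriver", "geckodriver")):
    """Report whether the selenium package and the listed driver binaries are found."""
    return {
        "selenium": importlib.util.find_spec("selenium") is not None,
        # shutil.which returns None when the executable is not on PATH.
        "drivers": {name: shutil.which(name) is not None for name in driver_names},
    }
```

Printing `check_dependencies()` gives a small report that can be inspected before initiating ScrapPyJS.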

Documentation

Details of the ScrapPyJS class are available in .\CLASS_STRUCTURE.md

License

This code is licensed under the MIT open source license.

Author

Name: Hind Sagar Biswas

Website: coderaptors.epizy.com

