finscraper
The library provides an easy-to-use API for fetching data from various Finnish websites:
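Website | Type | Spider API class |
---|---|---|
Ilta-Sanomat | News article | ISArticle |
Iltalehti | News article | ILArticle |
YLE Uutiset | News article | YLEArticle |
Suomi24 | Discussion thread | Suomi24Page |
Vauva | Discussion thread | VauvaPage |
Oikotie Asunnot | Apartment ad | OikotieApartment |
Tori | Item deal | ToriDeal |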
Documentation is available at https://finscraper.readthedocs.io. A simple online demo is also available.
Installation
pip install finscraper
Quickstart
Fetch 10 news articles as a pandas DataFrame from Ilta-Sanomat:
from finscraper.spiders import ISArticle
spider = ISArticle().scrape(10)
articles = spider.get()
The API is similar for all the spiders.
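As a minimal sketch of that shared pattern, the helper below wraps the two calls from the quickstart so that any spider class can be passed in. The names `fetch_items` and `demo` are hypothetical and not part of finscraper. The sketch assumes only what the quickstart shows: `scrape(n)` returns the spider, and `get()` returns the collected items.

```python
def fetch_items(spider_cls, n_items=10):
    """Scrape n_items with the given spider class and return the collected data.

    Hypothetical helper: it relies only on the two calls shown in the
    quickstart, scrape(n) returning the spider and get() returning the results.
    """
    spider = spider_cls().scrape(n_items)
    return spider.get()


def demo():
    """Fetch 10 Ilta-Sanomat articles (needs finscraper installed and network access)."""
    # Imported lazily so that fetch_items can be used without finscraper present.
    from finscraper.spiders import ISArticle

    return fetch_items(ISArticle, 10)
```

Passing a different spider class from `finscraper.spiders` in place of `ISArticle` should work the same way, provided it follows the same `scrape` and `get` pattern.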
Contributing
Please see CONTRIBUTING.md for more information.
Jesse Myrberg (jesse.myrberg@gmail.com)
- finscraper package
  - Subpackages
    - finscraper.scrapy_spiders package
      - Submodules
        - finscraper.scrapy_spiders.ilarticle module
        - finscraper.scrapy_spiders.isarticle module
        - finscraper.scrapy_spiders.mixins module
        - finscraper.scrapy_spiders.oikotieapartment module
        - finscraper.scrapy_spiders.suomi24page module
        - finscraper.scrapy_spiders.torideal module
        - finscraper.scrapy_spiders.vauvapage module
        - finscraper.scrapy_spiders.ylearticle module
      - Module contents
  - Submodules
    - finscraper.extensions module
    - finscraper.middlewares module
    - finscraper.pipelines module
    - finscraper.request module
    - finscraper.settings module
    - finscraper.spiders module
    - finscraper.text_utils module
    - finscraper.utils module
    - finscraper.wrappers module
  - Module contents