finscraper.scrapy_spiders package

Submodules

finscraper.scrapy_spiders.ilarticle module

Module for ILArticle spider.

finscraper.scrapy_spiders.isarticle module

Module for ISArticle spider.

finscraper.scrapy_spiders.mixins module

Module for Scrapy spider mixins.

class finscraper.scrapy_spiders.mixins.FollowAndParseItemMixin(follow_meta=None, items_meta=None, follow_selenium_callback=False, items_selenium_callback=False)

Bases: object

Parse items and follow links based on defined link extractors.

The following must be defined when inheriting:
  1. item_link_extractor attribute: a LinkExtractor that defines the links to parse items from.

  2. follow_link_extractor attribute: a LinkExtractor that defines the links to follow in order to find item pages.

  3. parse_item method: parses the item from the response.

Parameters
  • follow_meta (dict or None, optional) – Dictionary to pass within link follow requests. Defaults to None.

  • items_meta (dict or None, optional) – Dictionary to pass within item link requests. Defaults to None.

  • follow_selenium_callback (function, bool or None, optional) – Selenium callback to use for follow requests. If a function, it takes the parameters (request, spider, driver) and returns a response. If None, the default behavior of SeleniumCallbackRequest is used. If False, a normal Scrapy Request is used. Defaults to False.

  • items_selenium_callback (function, bool or None, optional) – Selenium callback to use for item requests. If a function, it takes the parameters (request, spider, driver) and returns a response. If None, the default behavior of SeleniumCallbackRequest is used. If False, a normal Scrapy Request is used. Defaults to False.
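
The three documented cases for the Selenium callback parameters (a function, None, or False) can be summarized in a small sketch. This is not finscraper's code: choose_request_kind and example_callback are hypothetical names, the returned strings only label the request type the documentation describes, and the dictionary returned by the callback merely stands in for a real response object.

```python
from types import SimpleNamespace


def choose_request_kind(selenium_callback):
    """Label the request type the docs describe for a given callback value."""
    if selenium_callback is False:
        return "scrapy.Request"
    if selenium_callback is None:
        return "SeleniumCallbackRequest (default behavior)"
    if callable(selenium_callback):
        return "SeleniumCallbackRequest (custom callback)"
    # The documentation does not describe other values (for example True);
    # rejecting them is an assumption of this sketch.
    raise TypeError(f"Unsupported selenium callback: {selenium_callback!r}")


def example_callback(request, spider, driver):
    # Documented shape: takes (request, spider, driver), returns a response.
    # Here the dict is a placeholder for a response object.
    return {"request": request, "body": driver.page_source}


# Stand-in for a Selenium driver; only page_source is used above.
fake_driver = SimpleNamespace(page_source="<html></html>")
```

Passing example_callback as follow_selenium_callback or items_selenium_callback would correspond to the "custom callback" case, while the default of False corresponds to a plain Scrapy Request.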

Raises

AttributeError – If the required attributes are not defined when inheriting.
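
The inheritance contract can be illustrated with a minimal, standard-library-only sketch. ContractCheckedMixin below is a hypothetical stand-in, not finscraper's implementation: it assumes the check runs at construction time, and it uses plain placeholder objects where a real spider would assign Scrapy LinkExtractor instances.

```python
class ContractCheckedMixin:
    """Hypothetical stand-in for a mixin that enforces required members."""

    # Member names taken from the documentation above.
    REQUIRED = ("item_link_extractor", "follow_link_extractor", "parse_item")

    def __init__(self, follow_meta=None, items_meta=None):
        missing = [name for name in self.REQUIRED if not hasattr(self, name)]
        if missing:
            raise AttributeError(
                "Missing required attributes: " + ", ".join(missing))
        self.follow_meta = follow_meta
        self.items_meta = items_meta


class CompleteSpider(ContractCheckedMixin):
    # Placeholders; a real spider would assign LinkExtractor instances here.
    item_link_extractor = object()
    follow_link_extractor = object()

    def parse_item(self, resp):
        # Placeholder item; a real spider would extract fields from resp.
        return {"url": getattr(resp, "url", None)}


class IncompleteSpider(ContractCheckedMixin):
    # Defines only one of the three required members.
    item_link_extractor = object()


def try_build(spider_cls):
    """Return the instance, or the error message if the check fails."""
    try:
        return spider_cls()
    except AttributeError as exc:
        return str(exc)
```

In this sketch, try_build(CompleteSpider) returns an instance, while try_build(IncompleteSpider) returns a message naming the members that were left undefined.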

itemcount = 0

parse(resp, to_parse=False)

Parse items and follow links based on defined link extractors.

start_requests()

finscraper.scrapy_spiders.mnetpage module

Module for Muusikoiden.net forum spider.

finscraper.scrapy_spiders.oikotieapartment module

Module for OikotieApartment spider.

finscraper.scrapy_spiders.suomi24page module

Module for Suomi24Page spider.

finscraper.scrapy_spiders.torideal module

Module for ToriDeal spider.

finscraper.scrapy_spiders.vauvapage module

Module for VauvaPage spider.

finscraper.scrapy_spiders.ylearticle module

Module for YLEArticle spider.

Module contents