Puppeteer Scraper readme and input schema improvements #122

@mnmkng

Web Scraper and Cheerio Scraper already have READMEs and input schemas of sufficient quality, but Puppeteer Scraper's are lacking. The structure and format should be exactly the same as the existing ones, but the contents will differ: only slightly in some places, substantially in others.

Whoever attempts this should:

  • read the tutorials for all the scrapers, to understand how they work
  • make sure they understand the differences between Web/Cheerio and Puppeteer scrapers (read this for the difference between Web Scraper and Puppeteer Scraper)
  • reuse as much as possible from the existing READMEs; there is no need to reinvent the wheel. Make sure, though, that the differences are not missed or obscured in the README. Ideally, we would point them out where appropriate
  • while writing, run the scraper regularly with different inputs to see what each input actually does and how it works
  • update the description fields in INPUT_SCHEMA.json as well; see the input schemas of Web/Cheerio for inspiration. The descriptions are shown in the scraper UI as tooltips, so they need to look good there.
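
As a rough aid for the INPUT_SCHEMA.json point, something like the sketch below could list the fields whose descriptions still need attention. It assumes the schema keeps its fields under a top-level `properties` object and that each field has a `description` string. The helper names, the `min_length` threshold, and the example schema are hypothetical and not taken from the real file.

```python
import json
from pathlib import Path


def find_weak_descriptions(schema: dict, min_length: int = 20) -> list[str]:
    """Return names of properties whose description is missing or very short."""
    weak = []
    # Assumption: fields live under a top-level "properties" object.
    for name, spec in schema.get("properties", {}).items():
        description = (spec.get("description") or "").strip()
        if len(description) < min_length:
            weak.append(name)
    return weak


def check_file(path: str) -> list[str]:
    """Load a schema file from disk and run the same check on it."""
    return find_weak_descriptions(json.loads(Path(path).read_text(encoding="utf-8")))


# Hypothetical schema fragment for illustration only; not the real file.
example_schema = {
    "title": "Puppeteer Scraper input",
    "type": "object",
    "properties": {
        "startUrls": {
            "title": "Start URLs",
            "type": "array",
            "description": "URLs where the scraper begins crawling.",
        },
        "pageFunction": {
            "title": "Page function",
            "type": "string",
            "description": "",
        },
    },
}

print(find_weak_descriptions(example_schema))
```

Running `check_file("INPUT_SCHEMA.json")` from the scraper's directory would apply the same check to the actual file. A length check only catches empty or stub descriptions, so the tooltips still need to be read in the UI.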
