Crawler Architecture
SB-GitHub Public edited this page Jan 15, 2018
·
6 revisions

The Crawler works as follows:
- Extractor starts the Crawler through the API, using the Crawler's ID.
  a. The API sends a request to Orchestra.
  b. Orchestra calls the Crawler itself.
- The Crawler checks job sites for vacancies that contain search_word and writes them to the raw_vacancy document.
- A Cron Job checks the raw_vacancy and parsed_vacancy documents for new vacancies (status: "new").
  a. If there are new records in the raw_vacancy document, Cron sends a request to Orchestra to run Parser and process the vacancies.
  b. If there are new records in the parsed_vacancy document, Cron sends a request to Orchestra to run graph_macker and process the skills.
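
The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the `Store` and `Orchestra` classes, the `cron_tick` and `has_new` helpers, the in-memory lists standing in for the raw_vacancy and parsed_vacancy documents, and the "processed" status are all hypothetical. Only the worker names, search_word, and the "new" status come from the description.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Store:
    """In-memory stand-in (assumption) for the two vacancy documents."""
    raw_vacancy: List[dict] = field(default_factory=list)
    parsed_vacancy: List[dict] = field(default_factory=list)


class Orchestra:
    """Hypothetical dispatcher: looks up a worker by name and calls it."""

    def __init__(self) -> None:
        self._workers: Dict[str, Callable[..., None]] = {}

    def register(self, name: str, worker: Callable[..., None]) -> None:
        self._workers[name] = worker

    def run(self, name: str, *args) -> None:
        self._workers[name](*args)


def crawler(store: Store, pages: List[str], search_word: str) -> None:
    """Save every page that mentions search_word as a new raw vacancy."""
    for text in pages:
        if search_word.lower() in text.lower():
            store.raw_vacancy.append({"text": text, "status": "new"})


def parser(store: Store) -> None:
    """Placeholder parsing: the first sentence becomes the vacancy title."""
    for record in store.raw_vacancy:
        if record["status"] == "new":
            title = record["text"].split(".")[0]
            store.parsed_vacancy.append({"title": title, "status": "new"})
            record["status"] = "processed"  # assumed follow-up status


def graph_macker(store: Store) -> None:
    """Placeholder skill processing: only marks parsed records as handled."""
    for record in store.parsed_vacancy:
        if record["status"] == "new":
            record["status"] = "processed"  # assumed follow-up status


def has_new(records: List[dict]) -> bool:
    return any(r["status"] == "new" for r in records)


def cron_tick(store: Store, orchestra: Orchestra) -> List[str]:
    """One Cron pass: inspect both documents, then request the workers."""
    raw_pending = has_new(store.raw_vacancy)
    parsed_pending = has_new(store.parsed_vacancy)
    requested = []
    if raw_pending:
        orchestra.run("parser", store)
        requested.append("parser")
    if parsed_pending:
        orchestra.run("graph_macker", store)
        requested.append("graph_macker")
    return requested
```

In this sketch, both documents are inspected before any worker runs, so a vacancy parsed in one pass is picked up by graph_macker on the next pass. Whether the real Cron Job behaves this way is not stated in the description.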