HTTPError should be caught. A SiteChecksum that raises the error should be skipped so it does not block the others behind it.
@app.task
def check_new_release():
    scheduled_jobs = []
    site_checksums: list[Type[SiteChecksum]] = [
        AlterChecksum,
        GSCChecksum,
        NativeChecksum,
    ]
    with pgsql_session():
        scrapy_util = ScrapydUtil(
            os.getenv("SCRAPYD_URL", "http://127.0.0.1:6800"), "product_crawler"
        )
        for site_checksum in site_checksums:
            checksum = site_checksum(scrapyd_util=scrapy_util)
            if checksum.is_changed:
                spider_jobs = checksum.trigger_crawler()
                scheduled_jobs.extend(spider_jobs)
                checksum.update()
    return scheduled_jobs
hook_tasks/hook_tasks/periodic/tasks.py
Lines 35 to 53 in d1826a3
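A minimal sketch of the suggested change, assuming the error is `requests.exceptions.HTTPError`. The stub classes (`OKChecksum`, `FailingChecksum`) are hypothetical stand-ins for the real checksum classes; the point is only the try/except-and-continue shape of the loop:

```python
import logging
from requests.exceptions import HTTPError  # assumption: requests is the HTTP client in use

logger = logging.getLogger(__name__)


class OKChecksum:
    """Hypothetical stand-in for a healthy SiteChecksum."""
    is_changed = True

    def trigger_crawler(self):
        return ["job-1"]

    def update(self):
        pass


class FailingChecksum:
    """Hypothetical stand-in for a site whose checksum fetch fails."""
    @property
    def is_changed(self):
        raise HTTPError("503 Server Error")  # simulated upstream failure


def check_new_release(checksums):
    scheduled_jobs = []
    for checksum in checksums:
        try:
            if checksum.is_changed:
                scheduled_jobs.extend(checksum.trigger_crawler())
                checksum.update()
        except HTTPError:
            # Skip this site so the remaining checksums still run.
            logger.exception("checksum check failed for %r, skipping", checksum)
            continue
    return scheduled_jobs
```

With this shape, a failure in one site's checksum only logs and moves on, e.g. `check_new_release([FailingChecksum(), OKChecksum()])` still returns the healthy site's jobs instead of raising.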