This repository has been archived by the owner on Feb 1, 2022. It is now read-only.
Problem
I tried to get an overview of the crawler's current status, specifically how many districts are parsed successfully. By running a few scripts at random, I noticed that some of them can no longer extract the current case numbers (probably because the structure of the corresponding websites has changed). To identify failing scripts, it would be nice to have some kind of common error signalling.
Suggestion
My first idea is based on UNIX exit codes: the parser simply exits with code 1 if it is not able to extract the data.
I already included this approach in one script which I will link to this issue.
dasmur added a commit to dasmur/corona_landkreis_fallzahlen_scraping that referenced this issue on Jun 19, 2020:
The current implementation simply crashes in the case of
the regex not matching anything. This commit checks the regex match
(at least for `status`) and uses the UNIX exit code to signal
parsing errors to the caller (e.g. `run.py`).
See also: corona-zahlen-landkreis#70
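The check described in the commit message could look roughly like the sketch below. It is only an illustration, not the code from the linked commit: the regex `STATUS_PATTERN`, the functions `extract_status` and `main`, and the "Stand: DD.MM.YYYY" page format are all assumptions made up for this example.

```python
import re
import sys

# Hypothetical pattern: assumes the page contains a line like "Stand: 19.06.2020".
# Each real district page would need its own regex.
STATUS_PATTERN = re.compile(r"Stand:\s*(\d{2}\.\d{2}\.\d{4})")


def extract_status(html):
    """Return the matched status date, or None if the regex does not match."""
    match = STATUS_PATTERN.search(html)
    return match.group(1) if match else None


def main(html):
    """Return a UNIX-style exit code: 0 on success, 1 on a parsing error."""
    status = extract_status(html)
    if status is None:
        # Report the failure instead of crashing on a missing match.
        print("parse error: status not found", file=sys.stderr)
        return 1
    print(status)
    return 0


# A parser script would end with something like:
#     sys.exit(main(page_text))
# so that the caller (e.g. run.py) can inspect the exit code.
```

Returning the code from `main` and passing it to `sys.exit` only at the very end keeps the parsing logic testable without terminating the interpreter.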
OK, while my suggestion (using UNIX exit codes to let failing parsers exit early) might improve the overall coding style, it is not really necessary for answering the question:
Q: How many scripts are currently able to extract district case numbers?
The answer: 23 of the 62 scripts currently run without errors (defined as an exit code of 0). In other words, 39 parsers are currently failing with an exit code of 1.
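A count like this can be obtained by running every script and tallying the exit codes. The following is a minimal sketch under stated assumptions, not the way the numbers above were actually produced: `run_parser` and `summarize` are hypothetical helpers, and the script paths in the usage comment are placeholders.

```python
import subprocess
import sys


def run_parser(script_path, timeout=60):
    """Run one parser script in a subprocess and return its exit code."""
    try:
        result = subprocess.run(
            [sys.executable, script_path],
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # Treat a hanging parser as a failure.
        return 1
    return result.returncode


def summarize(exit_codes):
    """Split a {script: exit_code} mapping into sorted lists of passing and failing scripts."""
    ok = sorted(name for name, code in exit_codes.items() if code == 0)
    failed = sorted(name for name, code in exit_codes.items() if code != 0)
    return ok, failed


# Usage (paths are placeholders):
#     codes = {path: run_parser(path) for path in ["district_a.py", "district_b.py"]}
#     ok, failed = summarize(codes)
#     print(f"{len(ok)} of {len(codes)} parsers succeeded")
```

Note that an uncaught Python exception also produces exit code 1, so this count does not distinguish scripts that crash from scripts that signal a parsing error deliberately.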
Component: crawler