Add exit code to pforzheim parser
The current implementation crashes with an `IndexError` when the
regex does not match anything. This commit checks the regex match
(at least for `status`) and uses the UNIX exit code to signal
parsing errors to the caller (e.g. `run.py`).

See also: corona-zahlen-landkreis#70
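
A caller could react to the new exit code roughly as sketched below. This is only an illustration: how `run.py` actually invokes the parsers is not shown in this commit, and the helper name `run_parser` is hypothetical.

```python
import subprocess
import sys


def run_parser(command):
    """Run a parser command and report whether it exited with code 0.

    `run_parser` is a hypothetical helper, not part of the repository.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # A non-zero code means the parser signalled a failure,
        # e.g. exit code 1 when the `status` regex found nothing.
        print(
            f"parser failed with exit code {result.returncode}: {command}",
            file=sys.stderr,
        )
        return False
    return True


if __name__ == "__main__":
    # Stand-in for the real parser script: a child process that exits
    # with code 1, as the parser now does on a failed regex match.
    failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(run_parser(failing))
```

Checking `returncode` explicitly, rather than passing `check=True`, lets the caller log the failure and continue with the remaining parsers.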
dasmur committed Jun 19, 2020
1 parent 6f9062a commit 530a2dc
Showing 1 changed file with 16 additions and 2 deletions.
18 changes: 16 additions & 2 deletions landkreise/get-pforzheim-enzkreis.py
@@ -1,8 +1,18 @@
+"""Parser for Pforzheim's Corona case numbers
+This script parses the website of Pforzheim to extract the case numbers
+of Pforzheim itself and Enzkreis.
+Return code:
+* 1: if the parser could not extract the case number
+"""
 from bs4 import BeautifulSoup

 import requests
 import datetime
 import re
+import sys

 import locale
 locale.setlocale(locale.LC_TIME, "de_DE.utf-8")
@@ -22,8 +32,12 @@
 text=clear_text_of_ambigous_chars(bs.getText())
 text=remove_chars_from_text(text,["\n"])

-status_raw = re.findall("Stand: .* Uhr\)", text)[0]
-status= get_status(status_raw)
+status_raw = re.findall("Stand: .* Uhr\)", text)
+if not status_raw:
+    # - early exit, because the regex did not match anything
+    # Possible reason: website changed the structure?
+    sys.exit(1)
+status= get_status(status_raw[0])


 cases_pforzheim_raw = re.findall(cases_pforzheim_pattern,text)[0]
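
The commit message says the match is checked "at least for `status`", and the `cases_pforzheim_raw` line above still indexes `[0]` directly. One possible way to apply the same check to every pattern is sketched here. The helper `first_match_or_exit` and the sample text are assumptions for illustration and do not appear in the repository.

```python
import re
import sys


def first_match_or_exit(pattern, text, exit_code=1):
    """Return the first match of `pattern` in `text`, or exit the process.

    Hypothetical helper: it wraps the check this commit adds for `status`.
    """
    matches = re.findall(pattern, text)
    if not matches:
        # Nothing matched, for example because the website structure changed.
        sys.exit(exit_code)
    return matches[0]


if __name__ == "__main__":
    # Made-up sample text, not taken from the real website.
    sample = "Fallzahlen (Stand: 19.06.2020, 12 Uhr)"
    print(first_match_or_exit(r"Stand: .* Uhr\)", sample))
```

Note that `re.findall` returns group contents rather than the whole match when the pattern contains capturing groups, so what the helper returns depends on how patterns such as `cases_pforzheim_pattern` are written.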
