122 add fig anchor metadata to validations (#124)
Closes #122

I created a script that reads through fig_source.html, which I saved from a browser, and finds each element containing the validation ID for each SBLCheck. It then grabs the href and stores it as the fig_anchor in phase_validations.py. I also created a pytest that uses the same HTML to loop through both the phase validation fig_anchors and the hrefs and confirm that each validation ID has the correct FIG anchor, and I updated the existing test_cli.py formats to include the new fig_anchor field.

It might be a good idea to store this tool somewhere; it's all local currently. In case they change something, it makes it easy to loop through the .py file and insert the fig_anchor instead of manually copying and pasting.

Updated the schema to include the fig_anchor in the check, which automatically carries over into the validation results. Updated df_to_json to include the fig_anchor in the JSON we send to clients. The other df_to_* functions get the fig_anchor automatically. I don't think we need another story in the filing-api, since the fig_anchor for each result will be in the JSON blob.

I would LOVE to figure out how to automatically pull down the FIG HTML in a form the pytests can actually use. However, because the page is rendered with JavaScript, the hrefs don't come across as full links when you fetch it with requests.get, curl, or wget. The only way I've found to get both the full href and the Validation ID associated with it is to save the page from a browser.
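The extraction step described above (find every link whose text contains a validation ID, then keep its href) can be sketched roughly as follows. This is not the script from the commit, which is not included here. It uses the standard-library `html.parser` rather than BeautifulSoup so it runs without dependencies, and the names `ValidationAnchorParser` and `extract_fig_anchors`, the sample markup, the IDs, and the example.org URLs are all made up for illustration.

```python
from html.parser import HTMLParser

MARKER = "Validation ID:"


class ValidationAnchorParser(HTMLParser):
    """Collect {validation_id: href} from <a> tags whose text contains MARKER."""

    def __init__(self):
        super().__init__()
        self.anchors = {}
        self._href = None  # href of the <a> currently open, if it has one
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            text = "".join(self._text)
            if MARKER in text:
                validation_id = text.split(MARKER)[1].strip()
                self.anchors[validation_id] = self._href
            self._href = None


def extract_fig_anchors(html_text):
    """Parse saved page HTML and return a mapping of validation ID to href."""
    parser = ValidationAnchorParser()
    parser.feed(html_text)
    parser.close()
    return parser.anchors


# Hypothetical stand-in for a page saved from the browser.
SAMPLE_HTML = """
<h4><a href="https://example.org/fig/#4.1.1">Validation ID: E0001</a></h4>
<p><a href="https://example.org/fig/#top">Back to top</a></p>
<h4><a href="https://example.org/fig/#4.1.2">Validation ID: <code>E0002</code></a></h4>
"""
sample_anchors = extract_fig_anchors(SAMPLE_HTML)
```

A mapping like `sample_anchors` could then be used to fill in the fig_anchor for each check, though how the real tool writes into phase_validations.py is not shown in this commit message.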
Showing 7 changed files with 717 additions and 156 deletions.
@@ -0,0 +1,39 @@

    import requests

    from regtech_data_validator.phase_validations import get_phase_1_and_2_validations_for_lei
    from regtech_data_validator.global_data import fig_base_url
    from bs4 import BeautifulSoup


    class TestFigAnchors:

        def test_fig_links(self):
            # Fetch the published FIG page and parse it.
            html_text = requests.get(
                "https://www.consumerfinance.gov/data-research/small-business-lending/filing-instructions-guide/2024-guide/#4"
            ).text
            source_links = BeautifulSoup(html_text, 'html.parser')

            validators = get_phase_1_and_2_validations_for_lei()
            checks = []
            validator_anchors = []
            fig_links = []
            # Flatten the nested validator mapping into a single list of checks.
            for k in validators.keys():
                v = validators[k]
                for p in v.keys():
                    checks.extend(v[p])
            # Record the ID (check.title) and FIG link stored on each check.
            for check in checks:
                validator_anchors.append({"id": check.title, "anchor": check.fig_link})
            # Collect the href of every link on the page labelled "Validation ID:".
            elements = source_links.find_all(lambda tag: tag.name == "a" and "Validation ID:" in tag.text)
            for e in elements:
                anchor = e.get('href')
                validation_id = e.text.split("Validation ID:")[1].strip()
                fig_links.append({"id": validation_id, "anchor": fig_base_url + anchor})
            # Sort both lists by ID and compare them pairwise.
            validator_anchors = sorted(validator_anchors, key=lambda d: d['id'])
            fig_links = sorted(fig_links, key=lambda d: d['id'])
            anchors = zip(validator_anchors, fig_links)
            assert not any(x != y for x, y in anchors)
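One caveat with the final assertion in the test above: `zip` stops at the shorter of its two inputs, so if one side has extra entries at the end, the pairwise comparison can still pass. The sketch below shows one possible stricter comparison that keys both sides by ID and reports what differs. It is an illustration, not part of the commit: `compare_anchors` is a hypothetical helper, the sample IDs and URLs are made up, and it assumes each ID appears at most once per side.

```python
def compare_anchors(validator_anchors, fig_links):
    """Return a list of readable problems; an empty list means both sides agree."""
    expected = {d["id"]: d["anchor"] for d in fig_links}
    actual = {d["id"]: d["anchor"] for d in validator_anchors}
    problems = []
    for vid in sorted(expected.keys() - actual.keys()):
        problems.append(f"{vid}: in FIG but not in validations")
    for vid in sorted(actual.keys() - expected.keys()):
        problems.append(f"{vid}: in validations but not in FIG")
    for vid in sorted(expected.keys() & actual.keys()):
        if expected[vid] != actual[vid]:
            problems.append(f"{vid}: {actual[vid]!r} != {expected[vid]!r}")
    return problems


# Made-up data: the validations carry one more entry than the FIG page.
sample_validator_anchors = [
    {"id": "E0001", "anchor": "https://example.org/fig/#4.1.1"},
    {"id": "E0002", "anchor": "https://example.org/fig/#4.1.2"},
    {"id": "E0003", "anchor": "https://example.org/fig/#4.1.3"},
]
sample_fig_links = sample_validator_anchors[:2]

# The pairwise zip check sees only the first two pairs and reports no mismatch.
zip_check_passes = not any(
    x != y for x, y in zip(sample_validator_anchors, sample_fig_links)
)
problems = compare_anchors(sample_validator_anchors, sample_fig_links)
```

In a pytest, `assert not problems, "\n".join(problems)` would also put the offending IDs in the failure output. On Python 3.10 and later, `zip(..., strict=True)` is a lighter alternative that raises `ValueError` when the lengths differ.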