
5. Metadata Fixes

Christina Chortaria edited this page Jan 30, 2020 · 1 revision

Metadata fixes

The second major part of the workshop involves the remediation of a metadata record in order to achieve a desired look and feel or function within GeoBlacklight.

Background on the GeoBlacklight schema

Example issue: HTML metadata view won't render for Cornell Adirondack Park monitoring stations

For example, take a look at the issue "HTML metadata view won't render for Cornell Adirondack Park monitoring stations." As the issue explains, an error appears when you load the test app, click on the Adirondack Park record, and then attempt to view the HTML rendering of the layer's metadata. The item record itself loads fine in GeoBlacklight, but the error suggests that the problem lies in the JSON record (i.e., the GeoBlacklight metadata) rather than in a software bug.

Diagnosing the problem

In this case, the application itself doesn't generate an error message, but the NoSuchBucket... message tells us that the metadata document is not resolving. The first step is to examine the JSON view of the record as it is indexed in the application. To do this, append /raw to the end of the catalog record URL in your test app.

http://localhost:3000/catalog/cugir-007741/raw

It helps if you have a JSON viewer installed in your browser, which lets you view the record in a "pretty-print" format. Take a look at the dct_references_s element.

"dct_references_s": "{\"http://schema.org/downloadUrl\":\"https://s3.amazonaws.com/cugir-data/00/77/41/cugir-007741.zip\",\"http://www.opengis.net/cat/csw/csdgm\":\"https://s3.amazonaws.com/cugir-data/00/77/41/fgdc.xml\",\"http://www.w3.org/1999/xhtml\":\"https://s3.amazonaws.com/cugir-data-missing-ID/fgdc.html\",\"http://www.opengis.net/def/serviceType/ogc/wfs\":\"https://cugir.library.cornell.edu/geoserver/cugir/wfs\",\"http://www.opengis.net/def/serviceType/ogc/wms\":\"https://cugir.library.cornell.edu/geoserver/cugir/wms\"}"

The dct_references_s element is a single-valued field whose value is a serialized JSON string: a series of key-value pairs in which each key is a URI identifying a web content standard and each value is a URL that should resolve to a resource of that type. The source of the error starts to become apparent. According to the GeoBlacklight Schema, the key that signals an HTML document is http://www.w3.org/1999/xhtml. The value here is https://s3.amazonaws.com/cugir-data-missing-ID/fgdc.html, which looks incorrect. It clearly points to a document in an Amazon S3 bucket, but the bucket name (cugir-data-missing-ID) does not match the cugir-data bucket used by the other references, and the ID path is missing.
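As a quick way to see what the HTML key currently points to, the serialized string can be unpacked from the command line. The following is a minimal sketch rather than a proper JSON parser: it assumes the record line is escaped exactly as shown above, the sample is trimmed to two of the five references, and the variable names are placeholders.

```shell
# Sample line from the raw record, trimmed to two references for brevity.
record='"dct_references_s": "{\"http://www.opengis.net/cat/csw/csdgm\":\"https://s3.amazonaws.com/cugir-data/00/77/41/fgdc.xml\",\"http://www.w3.org/1999/xhtml\":\"https://s3.amazonaws.com/cugir-data-missing-ID/fgdc.html\"}"'

# Key that the GeoBlacklight schema uses for an HTML metadata document.
html_key='http://www.w3.org/1999/xhtml'

# First sed: turn every \" back into a plain quote.
# Second sed: print only the value that follows the HTML key.
html_url=$(printf '%s\n' "$record" \
  | sed 's/\\"/"/g' \
  | sed -n "s|.*\"$html_key\":\"\([^\"]*\)\".*|\1|p")

printf '%s\n' "$html_url"
```

Because this relies on text matching, it is only suitable for eyeballing a single record. A JSON-aware tool is the safer choice for anything automated.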

Making the fix

Step one is to determine what the URL should be. Cornell has established a workflow in which the files for each metadata record are stored under a path built from the record's six-character ID (here 007741, which becomes 00/77/41). You can see this by looking at the other "live" URLs in the dct_references_s field. We can surmise that the URL should very likely be https://s3.amazonaws.com/cugir-data/00/77/41/fgdc.html. Just to make sure, try pasting this URL into a separate browser tab to see if it resolves.
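The same inference can be written out as a few lines of shell. This is a sketch under the assumption that the path pattern seen in the other references (the six characters of the ID split into three pairs) also applies to the HTML file. The variable names are illustrative only.

```shell
# Record ID as it appears in the catalog URL.
id="cugir-007741"

# Strip the "cugir-" prefix, leaving the six-character ID.
digits="${id#cugir-}"

# Split the ID into three two-character path segments: 007741 -> 00/77/41.
path=$(printf '%s\n' "$digits" | sed 's|\(..\)\(..\)\(..\)|\1/\2/\3|')

# Assumed location of the HTML metadata, modeled on the working fgdc.xml URL.
expected_url="https://s3.amazonaws.com/cugir-data/${path}/fgdc.html"
printf '%s\n' "$expected_url"

# Optional check from the terminal instead of the browser (needs network access):
# curl -sI "$expected_url" | head -n 1
```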

It does resolve! Now we want to check whether making the fix allows this record to appear in the metadata viewer in GeoBlacklight. The sample metadata records, or fixtures, that we indexed earlier are in the /solr_documents/ directory of the app. If you navigate to it on your hard drive, you should see a file called cornell_broken_html_metadata.json. This is the problematic record. To test locally, open this file in a text editor and make the correction (replace https://s3.amazonaws.com/cugir-data-missing-ID/fgdc.html with https://s3.amazonaws.com/cugir-data/00/77/41/fgdc.html).
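If you prefer the command line to a text editor, the replacement can be sketched with sed. The example below runs against a scratch file containing a cut-down, one-reference stand-in for the record, so it is safe to try anywhere. To apply it for real, you would point the fixture variable at your copy of cornell_broken_html_metadata.json instead.

```shell
# Scratch copy standing in for the real fixture (one reference only).
fixture=$(mktemp)
printf '%s\n' '{"dct_references_s": "{\"http://www.w3.org/1999/xhtml\":\"https://s3.amazonaws.com/cugir-data-missing-ID/fgdc.html\"}"}' > "$fixture"

bad='https://s3.amazonaws.com/cugir-data-missing-ID/fgdc.html'
good='https://s3.amazonaws.com/cugir-data/00/77/41/fgdc.html'

# Write the corrected text to a temporary file, then move it into place.
# This avoids "sed -i", whose syntax differs between GNU and BSD sed.
sed "s|$bad|$good|" "$fixture" > "$fixture.tmp" && mv "$fixture.tmp" "$fixture"

# Count the lines that now contain the corrected URL.
grep -c "$good" "$fixture"
```

Using | as the sed delimiter keeps the slashes in the URLs from needing to be escaped.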

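Next, refresh the browser where your local GeoBlacklight is running and go to the record. The HTML view should now work. If the old URL still appears, the fixture probably needs to be re-indexed the same way it was indexed earlier, since the app serves records from its Solr index rather than directly from the file.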

Next steps

Fixing this permanently could involve making a pull request to OpenGeoMetadata. But if you are trying to maintain clean, functional fixture records in your own GeoBlacklight app (which you should do for testing purposes), it's a good idea to submit this fix as a pull request as well. Let's do this. Assuming that you began the fix process by running git pull to get the most current version of the master branch, check out a new branch named after the issue at hand.

$ git checkout -b cornell-html-metadata-fix

Then, edit the cornell_broken_html_metadata.json file by replacing the bad URL with the correct one. Once you've done that, check the status of your branch.

$ git status

You should see only one changed file. If so, you're ready to commit the change (the -a flag stages the modified file as part of the commit) and push the branch.

$ git commit -a -m "submits a fix to a broken metadata record"
$ git push origin cornell-html-metadata-fix