In 1096 ocw workflow #91
"dc.contributor.author": {
    "source_field_name": "instructor",
    "language": "en_US",
    "delimiter": "|"
}
This mapping is made possible by the transformation of the `instructors` property when the `_read_metadata_json_file()` method is called.
def _construct_instructor_name(instructor: dict[str, str]) -> str:
    """Given a dictionary of name fields, derive instructor name."""
    if not (last_name := instructor.get("last_name")) or not (
        first_name := instructor.get("first_name")
    ):
        return ""
    return f"{last_name}, {first_name} {instructor.get("middle_initial", "")}".strip()
While it is plausible that all the metadata in `data.json` will always be formatted as needed (i.e., all instructor name fields provided), it would be a good idea to check in with stakeholders (IN-1156) on the "minimum required instructor name fields" needed to construct an instructor name.
The sample mapping file we received, ocw_json_to_dspace_mapping.xlsx, indicates that instructor names must be formatted as:
<last_name>, <first_name> <middle_initial>
The code above returns an empty string if either `last_name` or `first_name` is missing; it allows for missing `middle_initial` values.
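To make the behavior described above concrete, here is a minimal standalone sketch. It is not the code under review: the function name `construct_instructor_name`, the sample instructor records, and the step that drops empty names before joining with the `|` delimiter are all assumptions made for illustration.

```python
def construct_instructor_name(instructor: dict[str, str]) -> str:
    """Derive '<last_name>, <first_name> <middle_initial>' from name fields.

    Hypothetical standalone version of the helper discussed above.
    """
    last_name = instructor.get("last_name")
    first_name = instructor.get("first_name")
    # Both last and first name are required; otherwise return an empty string.
    if not last_name or not first_name:
        return ""
    middle_initial = instructor.get("middle_initial", "")
    # strip() removes the trailing space left when middle_initial is absent.
    return f"{last_name}, {first_name} {middle_initial}".strip()


# Sample records invented for this example.
instructors = [
    {"first_name": "Kate", "last_name": "Nguyen", "middle_initial": "A."},
    {"first_name": "Sam", "last_name": "Ortiz"},
    {"first_name": "Lee"},  # missing last_name
]

names = [construct_instructor_name(instructor) for instructor in instructors]

# Assumption: empty names are dropped before joining with the "|" delimiter.
joined = "|".join(name for name in names if name)
print(names)
print(joined)
```

Whether a record with a missing required field should be silently dropped, as in this sketch, or reported is one of the questions worth raising with stakeholders.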
Why these changes are being introduced:
* Support OpenCourseWare deposits requested by Technical Services staff.

How this addresses that need:
* Define custom methods to extract metadata from 'data.json'
* Define custom 'get_bitstream_s3_uris' to filter to zip files

Side effects of this change:
* None

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/IN-1096
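The two ideas in this commit message can be sketched with the standard library alone. This is an illustration under stated assumptions, not the workflow's implementation: `filter_zip_uris`, `read_metadata_from_zip`, and `make_zip` are hypothetical helper names, the file names are invented, and the sketch assumes `data.json` sits at the root of each zip file. It works on in-memory bytes rather than S3 so that it runs without a MinIO server.

```python
import io
import json
import zipfile


def filter_zip_uris(s3_uris: list[str]) -> list[str]:
    """Keep only URIs that point at zip files (hypothetical helper)."""
    return [uri for uri in s3_uris if uri.lower().endswith(".zip")]


def read_metadata_from_zip(zip_bytes: bytes) -> dict:
    """Return the parsed 'data.json' from a zip archive held in memory.

    Raises FileNotFoundError if the archive has no 'data.json' at its root
    (the root location is an assumption of this sketch).
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        if "data.json" not in archive.namelist():
            raise FileNotFoundError("data.json not found in zip file")
        with archive.open("data.json") as metadata_file:
            return json.load(metadata_file)


def make_zip(files: dict[str, str]) -> bytes:
    """Build a small zip archive in memory for demonstration purposes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# File names below are invented for this example.
uris = [
    "s3://dsc/opencourseware/batch-00/course-a.zip",
    "s3://dsc/opencourseware/batch-00/notes.txt",
]
zip_uris = filter_zip_uris(uris)

good_zip = make_zip({"data.json": '{"title": "Sample course"}'})
bad_zip = make_zip({"readme.txt": "no metadata here"})

metadata = read_metadata_from_zip(good_zip)

try:
    read_metadata_from_zip(bad_zip)
    missing_metadata_raised = False
except FileNotFoundError:
    missing_metadata_raised = True
```

Raising `FileNotFoundError` for an archive without `data.json` mirrors the behavior the PR description reports for `item_metadata_iter()`.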
Purpose and background context
This PR creates the `OpenCourseWare` DSC workflow.
This also introduces the following changes:
`SimpleCSV` workflow. The result is that when we run `make test`, workflow test results appear in the terminal together.
How can a reviewer manually see the effects of these changes?
A. Review the added unit tests.
Note: The only custom method defined for `OpenCourseWare` without a unit test is the `item_metadata_iter` method. See method B for testing with MinIO server.
`OpenCourseWare` methods.
B. Optional but highly recommended (especially for future development). Run `OpenCourseWare` commands using local MinIO server.
Prerequisite
Follow instructions in README: Running a Local MinIO Server.
Note: As of this writing, the root password set for the local MinIO server must be at least 8 characters long. This requirement is not documented in the README because it may change if/when we download updated versions of the MinIO Docker image.
Mock out the local MinIO server with test zip files.
Note: I did these steps via the WebUI.
`dsc` bucket: `dsc/opencourseware/batch-00/`
It is not important to mock other files as the bitstream for OpenCourseWare deposits is the zip file itself.
data.json
.data.json
dsc/opencourseware/batch-01/
Add the following environment variables in your `.env` file.
`OpenCourseWare` commands
Launch Python in your terminal:
pipenv run python
`item_metadata_iter()` result for `batch-00`.
You should see the following output:
`item_metadata_iter()` result for `batch-01`.
You should see the following output:
A `FileNotFoundError` is raised if any zip file is missing metadata (i.e., the `data.json` file).
Includes new or updated dependencies?
NO
Changes expectations for external applications?
NO
What are the relevant tickets?
Developer
Code Reviewer(s)