REST API to export modulestore XBlocks as OLX #23068

Merged: 1 commit merged on Feb 25, 2020

@bradenmacdonald bradenmacdonald commented Feb 10, 2020

This PR adds openedx-olx-rest-api into the core edx-platform. It provides a Studio API that any user with course authoring permission can use to get the OLX of an individual XBlock or a unit. Without this, the only way to get an XBlock's OLX was to download the tarball of the entire course.

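Examples of usage (be logged in to Studio on devstack):

Simple HTML XBlock:
http://localhost:18010/api/olx-export/v1/xblock/block-v1:edX+DemoX+Demo_Course+type@html+block@030e35c4756a4ddc8d40b95fbbfff4d4/

Exporting a unit:
http://localhost:18010/api/olx-export/v1/xblock/block-v1:edX+DemoX+Demo_Course+type@vertical+block@134df56c516a4a0dbb24dd5facef746e/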

Example output for an HTML block:

{ 
   "root_block_id":"block-v1:edX+DemoX+Demo_Course+type@html+block@030e35c4756a4ddc8d40b95fbbfff4d4",
   "blocks":{ 
      "block-v1:edX+DemoX+Demo_Course+type@html+block@030e35c4756a4ddc8d40b95fbbfff4d4":{ 
         "olx":"<html display_name=\"Blank HTML Page\"><![CDATA[\n<p><strong>Welcome to the edX Demo Course Introduction.</strong></p>\n]]></html>\n"
      }
   }
}

The code is designed primarily for use when importing content into Blockstore. So it will:

  • Export HTML blocks as a combined OLX/HTML file, with the HTML in a CDATA section
  • Convert vertical blocks to unit blocks (unit is like a vertical but has no UI elements)
  • Detect static files (such as images) used by the XBlock and list the absolute URL of each static file in the "static_files": {...} JSON element for each XBlock that has at least one static file usage.
    • this can handle static files that are in mongo ("contentstore" / "Files & Uploads") as well as files generated on-the-fly during OLX serialization via the export_fs API (mostly this is video transcripts).
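For anyone consuming this from a script, here is a minimal sketch of building the endpoint URL and reading the response. The URL pattern and the `root_block_id` / `blocks` / `olx` keys come from the examples above; the helper names (`export_url`, `summarize_export`) are hypothetical, and the inner shape of `"static_files"` (relative path mapped to an object with a `"url"` key) is an assumption, since the description only says it lists absolute URLs.

```python
from typing import Any, Dict

# Path taken from the example URLs in this PR description.
API_PATH = "/api/olx-export/v1/xblock/"


def export_url(studio_root: str, usage_key: str) -> str:
    """Build the export URL for one XBlock usage key (hypothetical helper)."""
    return f"{studio_root.rstrip('/')}{API_PATH}{usage_key}/"


def summarize_export(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Pull the root block's OLX and any static file URLs out of a response.

    ASSUMPTION: each block's optional "static_files" entry maps a relative
    path to {"url": "<absolute URL>"}. Adjust if the real shape differs.
    """
    root_id = payload["root_block_id"]
    static_urls: Dict[str, str] = {}
    for block in payload["blocks"].values():
        # Blocks without static file usages have no "static_files" key.
        for path, info in block.get("static_files", {}).items():
            static_urls[path] = info["url"]
    return {
        "root_olx": payload["blocks"][root_id]["olx"],
        "block_count": len(payload["blocks"]),
        "static_urls": static_urls,
    }
```

The actual HTTP request is left out on purpose: it needs an authenticated Studio session, so fetch the JSON however you normally talk to Studio and pass the decoded dict to `summarize_export`.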

@openedx-webhooks

Thanks for the pull request, @bradenmacdonald! I've created OSPR-4087 to keep track of it in JIRA. JIRA is a place for product owners to prioritize feature reviews by the engineering development teams.

Feel free to add as much of the following information to the ticket as you can:

  • supporting documentation
  • edx-code email threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will still be done via the GitHub pull request interface. As a reminder, our process documentation is here.

@bradenmacdonald

@natabene I'll work with @ormsbee directly for this review.

@bradenmacdonald

jenkins run a11y


@ormsbee ormsbee left a comment

A couple minor comments/requests.

openedx/core/djangoapps/olx_rest_api/adapters.py (outdated)
# Replace static urls like '/static/foo.png'
static_paths = []
# Drag-and-drop-v2 has
# &quot;/static/blah.png&quot;

(sigh)... totally legit comment... but (sigh)
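
For context on the quoted diff: drag-and-drop-v2 stores its static references with HTML-escaped quotes, so a scan for `"/static/..."` has to normalize `&quot;` first. A minimal sketch of that idea, not the code in `adapters.py`; `find_static_paths` and its regex are hypothetical stand-ins for whatever the platform's own static URL replacement does.

```python
import re
from typing import List

# Matches a quoted path that starts with /static/ (single or double quotes).
STATIC_PATH_RE = re.compile(r"""["'](/static/[^"'\s]+)["']""")


def find_static_paths(text: str) -> List[str]:
    """Return the unique /static/ paths referenced in text, in first-seen order."""
    # Turn &quot;/static/blah.png&quot; into "/static/blah.png" so that the
    # same pattern covers both plain and HTML-escaped quoting.
    normalized = text.replace("&quot;", '"')
    paths: List[str] = []
    for path in STATIC_PATH_RE.findall(normalized):
        if path not in paths:
            paths.append(path)
    return paths
```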

openedx/core/djangoapps/olx_rest_api/block_serializer.py (outdated)
openedx/core/djangoapps/olx_rest_api/urls.py (outdated)
openedx/core/djangoapps/olx_rest_api/views.py (outdated)
@bradenmacdonald

Thanks @ormsbee. Comments addressed, rebased, and squashed.

@bradenmacdonald force-pushed the olx-rest-api branch 2 times, most recently from 59edb84 to 20084ae (February 20, 2020 21:59)
@ormsbee

ormsbee commented Feb 21, 2020

@bradenmacdonald: Please update your commit message to include the context from your PR description. Also, please make a note of this new API on the Juniper page: https://openedx.atlassian.net/wiki/spaces/COMM/pages/940048716/Juniper

Thank you.

This was originally a separate plugin called openedx-olx-rest-api.

It provides a Studio API that any user with course authoring permission can use to get the OLX of an individual XBlock or a unit. Without this, the only way to get an XBlock's OLX was to download the tarball of the entire course.

Examples of usage (be logged in to Studio on devstack):

Simple HTML XBlock:
http://localhost:18010/api/olx-export/v1/xblock/block-v1:edX+DemoX+Demo_Course+type@html+block@030e35c4756a4ddc8d40b95fbbfff4d4/

Exporting a unit:
http://localhost:18010/api/olx-export/v1/xblock/block-v1:edX+DemoX+Demo_Course+type@vertical+block@134df56c516a4a0dbb24dd5facef746e/

Example output for an HTML block:

    { 
       "root_block_id":"block-v1:edX+DemoX+Demo_Course+type@html+block@030e35c4756a4ddc8d40b95fbbfff4d4",
       "blocks":{ 
          "block-v1:edX+DemoX+Demo_Course+type@html+block@030e35c4756a4ddc8d40b95fbbfff4d4":{ 
             "olx":"<html display_name=\"Blank HTML Page\"><![CDATA[\n<p><strong>Welcome to the edX Demo Course Introduction.</strong></p>\n]]></html>\n"
          }
       }
    }

The code is designed primarily for use when importing content into Blockstore. So it will:
* Export HTML blocks as a combined OLX/HTML file, with the HTML in a CDATA section
* Convert vertical blocks to unit blocks (unit is like a vertical but has no UI elements)
* Detect static files (such as images) used by the XBlock and list the absolute URL of each static file in the "static_files": {...} JSON element for each XBlock that has at least one static file usage. This can handle static files that are in mongo ("contentstore" / "Files & Uploads") as well as files generated on-the-fly during OLX serialization via the export_fs API (mostly this is video transcripts).
@edx-status-bot

Your PR has finished running tests. There were no failures.

@bradenmacdonald

@ormsbee Done (both) - thanks!

@ormsbee ormsbee merged commit 8c7dc22 into openedx:master Feb 25, 2020
@openedx-webhooks

@bradenmacdonald 🎉 Your pull request was merged!

Please take a moment to answer a two question survey so we can improve your experience in the future.

@edx-pipeline-bot

EdX Release Notice: This PR has been deployed to the staging environment in preparation for a release to production.

@edx-pipeline-bot

EdX Release Notice: This PR has been deployed to the production environment.

olx_node = etree.Element("html")
if block.display_name:
    olx_node.attrib["display_name"] = block.display_name
olx_node.text = etree.CDATA("\n" + block.data + "\n")
Member

@bradenmacdonald Something I stumbled into while unit testing upstream_sync: this line causes html_block.data to be wrapped in newlines, if it's not wrapped already. Does that sound right? It doesn't change the meaning of the HTML, so I think it's fine?

Minimal repro:

# Boilerplate...
from organizations.api import ensure_organization
from organizations.models import Organization
from openedx.core.djangoapps.content_libraries import api as libs
from openedx.core.djangoapps.xblock import api as xblock
from django.contrib.auth.models import User
user = User.objects.get(username="openedx")
ensure_organization("TestX")
library = libs.create_library(
    org=Organization.objects.get(short_name="TestX"), slug="TestLib", title="Test Library"
)
block_key = libs.create_library_block(library.key, "html", "test-newlines").usage_key

# Initial block data: No newlines
block = xblock.load_block(block_key, user)
block.data = "<html><body>Hello</body></html>"
print(repr(block.data))  # '<html><body>Hello</body></html>'
block.save()

# First round-trip: data is wrapped in newlines
block = xblock.load_block(block_key, user)
print(repr(block.data))  # '\n<html><body>Hello</body></html>\n'
block.save()

# Second round-trip: data is still wrapped in newlines
# Fortunately, it didn't add another pair of newlines.
block = xblock.load_block(block_key, user)
print(repr(block.data))  # '\n<html><body>Hello</body></html>\n'

Contributor Author

Hmm, yeah I don't think we need the newlines.

@kdmccormick kdmccormick Sep 23, 2024

@bradenmacdonald Actually, it's a little worse than I thought. As long as any block fields are edited, then it always adds a pair of newlines to .data, regardless of whether it's already wrapped in a pair.

print(repr(block.data))  # '\n<html><body>Hello</body></html>\n'
block.display_name = "blah"
block.save()

block = xblock.load_block(block_key, user)
print(repr(block.data))  # '\n\n<html><body>Hello</body></html>\n\n'

This still wouldn't break anything for end users AFAICT, but it'll make unit tests awkward, and it'll probably bother folks who edit OLX directly. I'll open a bug ticket. (EDIT: #35525)
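
The accumulation can also be reproduced without a devstack. A minimal sketch, assuming the serializer wraps `data` in newlines inside the CDATA section (as in the quoted line) and the parser hands back the CDATA contents verbatim; `serialize_html`, `parse_html`, `round_trip`, and the `wrap` flag are hypothetical stand-ins, not edx-platform functions.

```python
import re


def serialize_html(data: str, wrap: bool = True) -> str:
    """Mimic the quoted serializer line; wrap=False models dropping the newlines."""
    body = f"\n{data}\n" if wrap else data
    # Simplification: data containing "]]>" is not handled here.
    return f"<html><![CDATA[{body}]]></html>\n"


def parse_html(olx: str) -> str:
    """Return the CDATA contents exactly as stored (assumed parser behavior)."""
    match = re.search(r"<!\[CDATA\[(.*)\]\]>", olx, flags=re.DOTALL)
    if match is None:
        raise ValueError("no CDATA section found")
    return match.group(1)


def round_trip(data: str, times: int, wrap: bool = True) -> str:
    """Serialize and re-parse data the given number of times."""
    for _ in range(times):
        data = parse_html(serialize_html(data, wrap=wrap))
    return data
```

With `wrap=True`, every serialize/parse cycle adds one more pair of newlines, which matches the behavior seen whenever a field edit forces re-serialization. With `wrap=False` the round trip leaves `data` unchanged, which is what removing the newlines would aim for.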

Member

I'll remove the newlines and see if any unit tests break.
