Normalized Output in HTML

Brian Feldman edited this page Dec 3, 2018 · 2 revisions

Output in HTML

In the output JSON, each formatted field contains three variants: "raw", "normalized" and "plaintext". The normalized variant is the HTML described here. This HTML can be displayed in a web browser, and it is a best attempt to map all Patent document types (Greenbook, SGML, PAP, RedBook XML) onto one standardized document structure.
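As a minimal sketch of reading the HTML out of the output JSON: the record below is hypothetical. The field name `abstract` and its sample values are made up for illustration; only the three keys "raw", "normalized" and "plaintext" come from the description above.

```python
import json

# Hypothetical output record: the field name "abstract" and its values are
# illustrative assumptions; only the three keys come from the text above.
record_json = """
{
  "abstract": {
    "raw": "<abstract><p id=\\"p-0001\\">A widget.</p></abstract>",
    "normalized": "<p id=\\"P-00001\\" level=\\"0\\">A widget.</p>",
    "plaintext": "A widget."
  }
}
"""


def normalized_html(record: dict, field: str) -> str:
    """Return the normalized (HTML) variant of one formatted field."""
    return record[field]["normalized"]


record = json.loads(record_json)
html = normalized_html(record, "abstract")
print(html)
```

The same lookup with "plaintext" instead of "normalized" would give the tag-free text of the field.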

HTML Tags

BR
P
B, U
SUB, SUP
TABLE, THEAD, TGROUP, TR, TD
UL, OL, LI
DL, DT, DD
H1, H2, H3, H4, H5, H6
A, SPAN, PRE
Q, DEL, INS

Added Tags

O, SMALLCAPS, SUB2, SUP2

Rules

  • XHTML: close all tags, including BR tags
  • Each entity has a class denoting its type
  • Each entity instance has an id
  • The HTML link tag "a" denotes a reference and could support clickable links in the future
  • The HTML "span" tag annotates text
  • Subscript and superscript values are replaced with their Unicode equivalents when a mapping exists
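The last rule can be sketched as follows. This is an illustration, not the tool's actual code: the mapping tables, the function name `to_unicode_script`, and the choice to keep the whole tag whenever any character is unmappable are all assumptions.

```python
# Characters that have dedicated Unicode subscript/superscript forms.
# Which characters the real tool maps is not stated; this set is an assumption.
SUBSCRIPTS = str.maketrans("0123456789+-=()", "₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎")
SUPERSCRIPTS = str.maketrans("0123456789+-=()", "⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾")


def to_unicode_script(text: str, kind: str) -> str:
    """Map text to Unicode sub/superscript characters when every character
    is mappable; otherwise keep the explicit <sub>/<sup> tag."""
    table = SUBSCRIPTS if kind == "sub" else SUPERSCRIPTS
    if text and all(ord(ch) in table for ch in text):
        return text.translate(table)
    # At least one character has no Unicode form: fall back to the tag.
    return f"<{kind}>{text}</{kind}>"


print("H" + to_unicode_script("2", "sub") + "O")
print("V" + to_unicode_script("max", "sub"))
```

Under these assumptions a numeric subscript collapses into plain text, while a subscript such as "max" stays tagged so the browser can still render it.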

Example Usage

Section Header

<h2 id="H-0001" level="1"></h2>

Header within Section

Usually denoted in the source XML as a paragraph whose id starts with "H".

<h4 id="H-0001" level="1"></h4>

Paragraph

<p id="P-00001" level="0"></p>

Figure Reference

<a id="FR-0001" idref="FIG-1A" class="figref">FIG. 1A</a>

Claim Reference

<a id="CR-0001" idref="CLM-00001" class="claim">claim 1</a>
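Because references are ordinary "a" elements with an idref and a class, they can be collected with a standard XML parser. The sketch below assumes the normalized fragment is well-formed XHTML without named HTML entities (which `xml.etree.ElementTree` would reject); the sample paragraph and the helper `collect_references` are illustrative.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment built from the reference examples above.
fragment = (
    '<p id="P-00001" level="0">As shown in '
    '<a id="FR-0001" idref="FIG-1A" class="figref">FIG. 1A</a>, '
    'the device of '
    '<a id="CR-0001" idref="CLM-00001" class="claim">claim 1</a> ...</p>'
)


def collect_references(xhtml: str) -> list[tuple[str, str, str]]:
    """Return (class, idref, text) for every <a> element carrying an idref."""
    # Wrap in a dummy root so fragments with several top-level tags parse.
    root = ET.fromstring(f"<root>{xhtml}</root>")
    return [
        (a.get("class", ""), a.get("idref"), "".join(a.itertext()))
        for a in root.iter("a")
        if a.get("idref") is not None
    ]


for ref_class, target, text in collect_references(fragment):
    print(ref_class, target, text)
```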

Formula inline

<span id="FOR-0001" class="formula">c=a+b</span>

MathML inline

<span id="MTH-0001" class="math" format="mathml">
  <math> ... </math>
</span>

Note: At the time of writing, Chrome doesn't display MathML natively, so you will need to include a JavaScript library such as MathJax. See the example in the "Display in Browser" section below.

List

<ul id="ul0002">
  <li id="ul0002-0001">element 1</li>
  <li id="ul0002-0002">element 2</li>
</ul>

Table

<table id="TBL-0001">
  <tr>
    <td>cell1</td>
    <td>cell2</td>
  </tr>
</table>
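A table in this form can be turned into rows of cell text with a few lines of Python. This is a sketch assuming well-formed XHTML and simple tables without nested tables or spanning cells; `table_rows` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

table_xhtml = """
<table id="TBL-0001">
  <tr>
    <td>cell1</td>
    <td>cell2</td>
  </tr>
</table>
"""


def table_rows(xhtml: str) -> list[list[str]]:
    """Return the text of each cell, grouped by table row."""
    table = ET.fromstring(xhtml.strip())
    return [
        # Direct children of <tr> that are cells; itertext() flattens any
        # inline markup (B, SUB, ...) inside a cell.
        ["".join(cell.itertext()).strip() for cell in tr if cell.tag in ("td", "th")]
        for tr in table.iter("tr")
    ]


print(table_rows(table_xhtml))
```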

PRE (Table in Greenbook Patent format)

<pre id="TBL-0001" class="freetext-table"></pre>

Display in Browser

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:400,400italic,500,500italic,700"/>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Product+Sans"/>
<style type="text/css">
body {counter-reset:paragraph;}
body, table {font-family: 'Roboto', sans-serif;background-color:#fff;color:#333;}
body ::selection{background-color: #C6DAFC;color: #333;}
p{padding-left:45px;font-size:14.5px;line-height:22px;text-indent:15px;display:block;word-break:break-word; -webkit-margin-before:1em; -webkit-margin-after:1em;-webkit-margin-start:0px; -webkit-margin-end:0px;}
p:before {position:absolute; margin-left:-70px; color:#CCC; content:counter(paragraph); counter-increment: paragraph;}
table.pgwide-1{width:100%}
table.pgwide-0{width:100%}
table {border-collapse:collapse;}
table.border-all{box-shadow:0 2px 3px rgba(0,0,0,0.06);}
table.border-sides{box-shadow:0 0 3px rgba(0,0,0,0.06);}
table.border-topbot{box-shadow:0 3px 0 rgba(0,0,0,0.06);}
td, th {padding-left:12px;padding-right: 12px;}
th {background-color:#f1f1f1;line-height:23px;}
.border-all{border:1px solid rgba(150,150,150,0.3);border-bottom:1px solid rgba(125,125,125,0.3);}
.border-none{border: none;}
.border-topbot{border-top:1px solid rgba(150,150,150,0.3);border-bottom:1px solid rgba(125,125,125,0.3);}
.border-sides{border-left:1px solid rgba(150,150,150,0.4);border-right:1px solid rgba(150,150,150,0.4);}
.border-top{border-top:1px solid rgba(150,150,150,0.3);}
.border-bottom{border-bottom:1px solid rgba(125,125,125,0.3);}
.border-undefined{border-collapse: collapse;}
h2{font-size:18px;text-align:center;}
h4{font-size:13.5px}
h2.level-1, h4.level-1{text-indent:12px;}
h2.level-2, h4.level-2{text-indent:24px;}
h2.level-3, h4.level-3{text-indent:36px;}
span.figref, span.clmref, span.patcite, span.nplcite{font-weight:bold;}
entry{display:table-column;}
sup2{vertical-align:65%;font-size:smaller;}
sub2{vertical-align:-65%;font-size:smaller;}
ul.ul-dash{list-style:none;margin-left:0;padding-left:1em;}
ul.ul-dash > li:before {display:inline-block;content:"-";width:1em;margin-left:-1em;}
o{text-decoration:overline;}
o.single{text-decoration:overline;}
u.single{text-decoration:underline;text-decoration-style:solid;}
u.double, o.double{text-decoration-style:double;}
u.dots, o.dots{text-decoration-style:dotted;}
u.dash, o.dash{text-decoration-style:dashed;}
smallcaps{font-variant: small-caps;}
</style>
<script>window.MathJax = { MathML: { extensions: ["mml3.js"]}};</script>
<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=MML_HTMLorMML"></script>
</head>
<body>

.... PLACE CONTENT HERE ....

</body>
</html>
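Filling the page can also be scripted. The sketch below uses a shortened copy of the page above with a hypothetical `<!--CONTENT-->` placeholder in place of the "PLACE CONTENT HERE" line; the stylesheet is left out for brevity and would need to be pasted in. `render_page` is an illustrative name.

```python
# Shortened page shell; paste the full <style> block from the page above
# where indicated. The <!--CONTENT--> marker is an assumption of this sketch.
PAGE_TEMPLATE = """<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<style type="text/css">
/* paste the stylesheet from the page above here */
</style>
<script type="text/javascript" async src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js?config=MML_HTMLorMML"></script>
</head>
<body>
<!--CONTENT-->
</body>
</html>
"""


def render_page(normalized_html: str) -> str:
    """Insert normalized HTML into the page shell.

    Plain string replacement is used instead of str.format because the
    CSS in the full stylesheet contains literal braces.
    """
    return PAGE_TEMPLATE.replace("<!--CONTENT-->", normalized_html)


page = render_page('<p id="P-00001" level="0">Example paragraph.</p>')
with open("patent.html", "w", encoding="utf-8") as fh:
    fh.write(page)
```

Opening the written file in a browser should show the content styled by whatever stylesheet was pasted in.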