PDF/UA support, part 2 #2018

hvbtup · 2025-01-02T09:11:06Z

Added some example PDF/UA reports.
Add support for declaring newer PDF/A versions.
Add support for declaring PDF/UA-2.
Add support for declaring PDF 2.0.
Use "auto" as the default value of the tagType property for some report element types, so the tag can be set automatically depending on context or set explicitly.
Support generating "Caption" tag for tables.
Fix page-break handling of table rows and cells.
Create corresponding PDF tags for some HTML elements.
Add hyperlink support.
Allow creation of a PDF that conforms to PDF/A and PDF/UA at the same time.
Fixed specifying the document language based on report locale setting.
Added THead, TBody, TFoot tags for tables.
Fixed PDF syntax errors (contentByte.setColorStroke was called at wrong place in drawing sequence).
Removed support for Acrobat Flash.
Refactored attributes needed for split tracking into a dedicated class.
Refactoring of some PDF-tag related constants.
Added or improved some comments.
Removed some warnings.

I originally developed this using my hobby account hvbargen over several weeks.

…rfluous NonStruct tags for table caption

…hich is only allocated when a container is actually split.

…pends on context, eg TD or TDF

…pping

… remove Flash support

…nts for string literals

hvbargen · 2025-01-04T15:54:03Z

Reminder: Page numbering with the auto-text item causes an exception. I already have a commit that fixes this, together with an example report, which I should add here when I'm at work again.

speckyspooky · 2025-01-04T17:10:16Z

Did the issue exist before, or is it a result of these changes?

hvbargen · 2025-01-05T11:37:22Z

The issue already existed before, whenever you tried to generate PDF/UA.

hvbtup · 2025-01-06T13:44:42Z

I'll see what I can do this weekend.

speckyspooky

Most of the requested changes concern comments and coding style.
I also added a question: does it make sense to add PDF 2.0 support if we throw an exception as soon as the report developer uses it?

speckyspooky · 2025-01-06T13:52:00Z

...birt.report.engine.emitter.pdf/src/org/eclipse/birt/report/engine/emitter/pdf/PDFRender.java

+				PdfArray kids;
+				PdfObject kido = currentPageDevice.structureCurrentLeaf.get(PdfName.K);
+				if (kido == null) {
+					kids = new PdfArray();


The standard convention is to use names like "child" and "children" instead of "kid" and "kids".
Please change it to child and children.

This naming is used inside OpenPDF and in the rare examples I could find on the net, so I followed the same convention.

But the BIRT developers used child and children; you are the first to use kids.
There is no method "getKids", but you will find "getChildren" and "getFirstChild()".

So it would be good to stick with child and children.

Even in English it reads poorly, and I don't recall seeing such terminology used elsewhere.

Yes, let's use the child and children convention. I understand your idea of following their pattern, but we should stick to our own.

speckyspooky · 2025-01-06T13:53:27Z

...birt.report.engine.emitter.pdf/src/org/eclipse/birt/report/engine/emitter/pdf/PDFRender.java

 					break;
 				}
 			} catch (Exception e) {
 				logger.log(Level.WARNING, e.getMessage(), e);
 			}
+			if (currentPageDevice.isTagged()) {
+				PdfArray kids;
+				PdfObject kido = currentPageDevice.structureCurrentLeaf.get(PdfName.K);


"kido" (= "kid-object") needs a better name. Normally we name it child, e.g. "childPdfObj" or "childPdf".
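
To illustrate the suggested naming, here is a minimal sketch of the get-or-create logic from the hunk above, using "childPdfObj" and "children". It is not the PR's code: to stay runnable without OpenPDF, a plain `Map` stands in for the structure element dictionary, a `List` stands in for `PdfArray`, and the string `"K"` stands in for `PdfName.K`. The class name, the method name, and the handling of a single non-array child are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in sketch: a Map plays the role of a PDF structure element dictionary. */
final class ChildNamingSketch {

    /** Hypothetical key standing in for PdfName.K (the "kids" entry). */
    static final String KEY_K = "K";

    private ChildNamingSketch() {
    }

    /**
     * Returns the children list stored under K, creating it if necessary.
     * A single non-list child is wrapped into a new list (assumed behaviour).
     */
    @SuppressWarnings("unchecked")
    static List<Object> getOrCreateChildren(Map<String, Object> structureElement) {
        Object childPdfObj = structureElement.get(KEY_K);
        if (childPdfObj instanceof List) {
            // K already holds a list of children: reuse it.
            return (List<Object>) childPdfObj;
        }
        List<Object> children = new ArrayList<>();
        if (childPdfObj != null) {
            // K held a single child: keep it as the first list entry.
            children.add(childPdfObj);
        }
        structureElement.put(KEY_K, children);
        return children;
    }

    public static void main(String[] args) {
        Map<String, Object> structureCurrentLeaf = new HashMap<>();
        List<Object> children = getOrCreateChildren(structureCurrentLeaf);
        children.add("first child");
        System.out.println(structureCurrentLeaf);
    }
}
```

With the real OpenPDF types, the same shape would presumably use `PdfDictionary.get(PdfName.K)` and a `PdfArray`, as in the original hunk.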

speckyspooky · 2025-01-06T13:55:20Z

....birt.report.engine.emitter.pdf/src/org/eclipse/birt/report/engine/emitter/pdf/PdfNames.java

+import com.lowagie.text.pdf.PdfName;
+
+/**
+ * @since 4.19


A description of the class is missing. You are the developer with the deepest knowledge, and a short description would be helpful for other developers.

speckyspooky · 2025-01-06T13:55:32Z

...se.birt.report.engine.emitter.pdf/src/org/eclipse/birt/report/engine/emitter/pdf/PdfTag.java

+package org.eclipse.birt.report.engine.emitter.pdf;
+
+/**
+ * @since 4.19


A description of the class is missing. You are the developer with the deepest knowledge, and a short description would be helpful for other developers.

speckyspooky · 2025-01-06T13:57:03Z

....birt.report.engine.emitter.pdf/src/org/eclipse/birt/report/engine/emitter/pdf/PdfNames.java

+ * @since 4.19
+ *
+ */
+@SuppressWarnings("javadoc")


The core pain of the old code is that there is no documentation at the class, method, and property level.
So it would be much better to avoid "SuppressWarnings" and add proper comments to all elements.
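
As a rough illustration of what this could look like, the sketch below documents a small constants class at class, field, and method level, so that no `@SuppressWarnings("javadoc")` is needed. It is not the PR's `PdfNames` class: the class name `PdfTagNamesSketch`, the helper `isRowGroup`, and the use of plain strings instead of `PdfName` objects are assumptions made to keep the example self-contained. The tag names themselves (Caption, THead, TBody, TFoot) are the ones mentioned in the PR description.

```java
/**
 * Hypothetical stand-in for a class that collects PDF structure tag names
 * used for tagged (PDF/UA) table output.
 *
 * @since 4.19
 */
final class PdfTagNamesSketch {

    /** Tag for the caption of a table. */
    static final String CAPTION = "Caption";

    /** Tag for the header row group of a table. */
    static final String THEAD = "THead";

    /** Tag for the body row group of a table. */
    static final String TBODY = "TBody";

    /** Tag for the footer row group of a table. */
    static final String TFOOT = "TFoot";

    /** Not instantiable: this class only holds constants and helpers. */
    private PdfTagNamesSketch() {
    }

    /**
     * Checks whether a tag denotes a table row group.
     *
     * @param tag the structure tag name to check, may be null
     * @return true if the tag is THead, TBody or TFoot
     */
    static boolean isRowGroup(String tag) {
        return THEAD.equals(tag) || TBODY.equals(tag) || TFOOT.equals(tag);
    }
}
```

Each one-line comment costs little to write and removes the need to suppress the Javadoc warning for the whole class.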

speckyspooky · 2025-01-06T14:32:03Z