Enhance auto-rename functionality with additional options #1604

Open
tanseer123 wants to merge 1 commit into base: main

Enhance auto-rename functionality with additional options

  • Added ability to rename files based on the content of specific lines.
  • Implemented renaming based on text between specified markers.
  • Added option to rename files using content from specific line numbers.
  • Enabled renaming based on text before or after a specified keyword.
  • Included regex-based renaming to allow more complex patterns.

Description

This pull request enhances the auto-rename functionality by adding keyword-based renaming options and a fallback mechanism. The updates are motivated by the need for more flexible and user-friendly file renaming methods, especially when dealing with documents that lack meaningful filenames or metadata.

Summary of Changes

  1. Keyword-Based Renaming:
    Entire Line Containing Keyword: The filename is generated using the entire line that contains the specified keyword.
    Text After Keyword: The filename is generated using the text that follows the specified keyword.

  2. Fallback Method:
    If no suitable filename is found using the keyword-based method, the existing method that renames based on the largest font is used as a fallback.
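The two keyword options above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: the class name `KeywordTitleExtractor`, the method `extract`, and the assumption that the page text is already available as a plain string are all hypothetical.

```java
import java.util.Optional;

/** Hypothetical helper for illustration; not the PR's actual implementation. */
final class KeywordTitleExtractor {

    /**
     * Scans the text line by line and returns a filename candidate from the
     * first line that contains the keyword.
     * useAfter = false: the entire line; useAfter = true: only the text after the keyword.
     */
    static Optional<String> extract(String pageText, String keyword, boolean useAfter) {
        if (pageText == null || keyword == null || keyword.isBlank()) {
            return Optional.empty();
        }
        for (String line : pageText.split("\\R")) {
            int idx = line.indexOf(keyword);
            if (idx < 0) {
                continue; // keyword not on this line
            }
            String candidate = useAfter ? line.substring(idx + keyword.length()) : line;
            candidate = candidate.strip();
            if (!candidate.isEmpty()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty(); // caller is expected to fall back to another method
    }
}
```

Returning an empty `Optional` instead of `null` makes the "no suitable filename" case explicit, which is the point where the fallback would take over.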

Motivation and Context

User Flexibility: The enhancement allows users to specify keywords for more precise and meaningful filenames, catering to various document types and user preferences.
Practical Use Cases: Examples include renaming payslips based on date or company name, making it easier to manage and locate specific documents.
Improved Usability: The fallback mechanism ensures that the renaming process remains robust even if the keyword-based method does not yield a suitable filename.

Implementation Details

Files Updated:
auto-rename.html
AutoRenameController.java
ExtractHeaderRequest.java
Approach: The implementation checks for the presence of specified keywords first. If no keywords are provided or no matches are found, the process falls back to the original method of using the largest font for renaming.
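The control flow described in the approach could look something like the sketch below. The names `TitleResolver` and `resolveTitle` are made up for illustration, and both lookups are passed in as functions because the actual keyword search and largest-font detection live in the controller and are not reproduced here.

```java
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of the keyword-first, largest-font-second flow. */
final class TitleResolver {

    /**
     * Tries the keyword lookup first when a keyword was provided; if no keyword
     * was given or nothing matched, uses the largest-font lookup instead.
     */
    static String resolveTitle(
            String keyword,
            Function<String, Optional<String>> keywordLookup,
            Supplier<String> largestFontLookup) {
        if (keyword != null && !keyword.isBlank()) {
            Optional<String> match = keywordLookup.apply(keyword);
            if (match.isPresent()) {
                return match.get();
            }
        }
        // Fallback: the pre-existing behaviour based on the largest font.
        return largestFontLookup.get();
    }
}
```

Because the fallback is a `Supplier`, the largest-font scan only runs when the keyword path produced nothing.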

Testing
Test Cases: Tested with documents having various structures and content to ensure the keyword-based renaming and fallback methods work correctly.
Results: The changes have been verified to work as intended, providing more accurate and meaningful filenames based on user-specified keywords.

Please review the changes and let me know if there are any additional adjustments or improvements needed. Your feedback is highly appreciated.

Closes #(issue_number)

Screenshots

Entire line containing keyword: the filename is generated from the entire line that contains the specified keyword.
[screenshot]

Text after keyword: the filename is generated from the text that follows the specified keyword.
[screenshot]

Fallback: if no suitable filename is found using the keyword-based method, the existing method that renames based on the largest font is used.
[screenshot]

Checklist:

  • I have read the Contribution Guidelines
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings

Contributor License Agreement

By submitting this pull request, I acknowledge and agree that my contributions will be included in Stirling-PDF and that they can be relicensed in the future under the MPL 2.0 (Mozilla Public License Version 2.0) license.

(This does not change the general open-source nature of Stirling-PDF, simply moving from one license to another license)

tanseer123 requested a review from Frooodle as a code owner on July 29, 2024, 09:15
github-actions bot added the Java, Front End, and API labels on Jul 29, 2024

<div class="form-group">
<label for="useAfter">Text to use:</label>
<select class="form-control" id="useAfter" name="useAfter">
<option value="false">Entire line containing keyword</option>
Member

All text should use the translation feature, i.e.
th:text="#{auto-rename.useAfter.false}"

<br>
<div class="form-group">
<label for="useAfter">Text to use:</label>
<select class="form-control" id="useAfter" name="useAfter">
Member

It might be good to support text before the keyword, and/or regex, as well.
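As a rough idea of what those two extra modes might involve, here is a small sketch. `ExtraMatchModes`, `textBefore`, and `byRegex` are hypothetical names, and the choice to prefer the first capture group over the whole match is an assumption, not something specified in this thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical sketch of "text before keyword" and regex-based extraction. */
final class ExtraMatchModes {

    /** Returns the trimmed text that precedes the keyword on the given line, if any. */
    static Optional<String> textBefore(String line, String keyword) {
        int idx = line.indexOf(keyword);
        if (idx <= 0) {
            return Optional.empty(); // keyword missing, or nothing in front of it
        }
        String before = line.substring(0, idx).strip();
        return before.isEmpty() ? Optional.empty() : Optional.of(before);
    }

    /**
     * Returns the first capture group of the first match, or the whole match
     * when the pattern has no groups. An invalid pattern yields empty.
     */
    static Optional<String> byRegex(String text, String regex) {
        Pattern pattern;
        try {
            pattern = Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            return Optional.empty(); // user-supplied pattern did not compile
        }
        Matcher m = pattern.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        String value = m.groupCount() >= 1 ? m.group(1) : m.group();
        return (value == null || value.isBlank()) ? Optional.empty() : Optional.of(value.strip());
    }
}
```

Since the pattern would come from user input, a real endpoint would probably also want a length limit or timeout to guard against pathological patterns.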

@@ -20,8 +20,29 @@
<form method="post" enctype="multipart/form-data" th:action="@{'/api/v1/misc/auto-rename'}">
<div th:replace="~{fragments/common :: fileSelector(name='fileInput', multiple=false, accept='application/pdf')}"></div>
<br>
<div class="form-group">
<label for="keyword">Keyword:</label>
<input type="text" class="form-control" id="keyword" name="keyword" placeholder="e.g., Company, Name, Invoice" required>
Member

What if people want to automatically grab the title without using any keywords, as in the previous functionality? We still need to support that and make it clear that the keyword is optional.
Perhaps add a parameter to switch between "Autodetect from title" and "Using keyword", and show the relevant options based on that choice.
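One possible shape for such a parameter is an enum that defaults to the previous behaviour. Everything here (`RenameMode`, `fromParam`, the accepted spellings) is hypothetical and only meant to illustrate the suggestion.

```java
import java.util.Locale;

/** Hypothetical mode switch between title auto-detection and keyword-based renaming. */
enum RenameMode {
    AUTO_TITLE,
    KEYWORD;

    /**
     * Parses a request parameter leniently. Missing or unknown values map to
     * AUTO_TITLE so existing API clients keep the previous behaviour.
     */
    static RenameMode fromParam(String raw) {
        if (raw == null || raw.isBlank()) {
            return AUTO_TITLE;
        }
        try {
            return valueOf(raw.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
        } catch (IllegalArgumentException e) {
            return AUTO_TITLE;
        }
    }

    /** Only the keyword mode needs the keyword field to be filled in. */
    boolean requiresKeyword() {
        return this == KEYWORD;
    }
}
```

With a switch like this, the keyword input would no longer need to be unconditionally `required` in the form; the UI could show it only when the keyword mode is selected.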

@@ -16,4 +18,9 @@ public class ExtractHeaderRequest extends PDFFile {
required = false,
defaultValue = "false")
private boolean useFirstTextAsFallback;

private MultipartFile fileInput;
Member

This MultipartFile input is already part of this class; it is inherited from PDFFile, which this class extends.
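To illustrate the point with simplified stand-ins: `BaseUploadRequest` plays the role of PDFFile and `HeaderRequestSketch` the role of ExtractHeaderRequest. Both classes are hypothetical, and `byte[]` replaces `MultipartFile` only so the sketch has no Spring dependency; the real classes may differ.

```java
/** Stand-in for the PDFFile base class, assumed to own the uploaded file. */
class BaseUploadRequest {
    // byte[] stands in for MultipartFile in this sketch.
    private byte[] fileInput;

    public byte[] getFileInput() {
        return fileInput;
    }

    public void setFileInput(byte[] fileInput) {
        this.fileInput = fileInput;
    }
}

/** Stand-in for ExtractHeaderRequest: only the new options are declared here. */
class HeaderRequestSketch extends BaseUploadRequest {
    // No fileInput field: redeclaring it would hide the inherited one, and the
    // two copies could get out of sync.
    private String keyword;
    private boolean useAfter;

    public String getKeyword() {
        return keyword;
    }

    public void setKeyword(String keyword) {
        this.keyword = keyword;
    }

    public boolean isUseAfter() {
        return useAfter;
    }

    public void setUseAfter(boolean useAfter) {
        this.useAfter = useAfter;
    }
}
```

The subclass still exposes `getFileInput()` through inheritance, so the controller can read the upload without any extra field.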

@@ -16,4 +18,9 @@ public class ExtractHeaderRequest extends PDFFile {
required = false,
defaultValue = "false")
private boolean useFirstTextAsFallback;

private MultipartFile fileInput;
private String keyword;
Member

Any API documentation for these?

@@ -16,4 +18,9 @@ public class ExtractHeaderRequest extends PDFFile {
required = false,
defaultValue = "false")
private boolean useFirstTextAsFallback;

private MultipartFile fileInput;
Member

fileInput is already included via the extended PDFFile class.

Ludy87 added the Stale label on Jan 8, 2025
Labels: API, Front End, Java, needs-changes, Stale