Improves branch autolinks experiments #3898

Open
wants to merge 3 commits into
base: main
from

Conversation

nzaytsev
Contributor

Description

Added chunk matching logic for non-prefixed numbers:

  • XX: most likely an issue number
  • XX.XX: less likely an issue number, but still possible
  • XX.XX.XX: most likely not an issue number
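
The three shapes above could be told apart roughly as follows. This is only an illustrative sketch, not the code in this PR: `chunkLikelihood` and the `Likelihood` labels are hypothetical names, and it assumes the parts of a chunk are separated by dots.

```typescript
// Hypothetical helper for illustration; not the PR's actual implementation.
type Likelihood = 'likely' | 'possible' | 'unlikely';

function chunkLikelihood(chunk: string): Likelihood | undefined {
	// Only consider chunks made of digit groups joined by dots, e.g. "12", "16.1", "1.2.3"
	if (!/^\d+(\.\d+)*$/.test(chunk)) return undefined;

	const parts = chunk.split('.').length;
	if (parts === 1) return 'likely'; // XX
	if (parts === 2) return 'possible'; // XX.XX
	return 'unlikely'; // XX.XX.XX or longer, which looks like a version
}
```

With this sketch, `chunkLikelihood('3898')` is classified as `'likely'`, while a version-like `'16.2.0'` is classified as `'unlikely'`.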

considered the distance from the edges of the chunk as a priority sign

the chunk that is more close to the end is more likely actual issue number

Checklist

  • I have followed the guidelines in the Contributing document
  • My changes follow the coding style of this project
  • My changes build without any errors or warnings
  • My changes have been formatted and linted
  • My changes include any required corresponding changes to the documentation (including CHANGELOG.md and README.md)
  • My changes have been rebased and squashed to the minimal number (typically 1) of relevant commits
  • My changes have a descriptive commit message with a short title, including a Fixes #XXX - or Closes #XXX - prefix to auto-close the issue that your PR addresses

@nzaytsev nzaytsev linked an issue Dec 20, 2024 that may be closed by this pull request
@nzaytsev nzaytsev marked this pull request as ready for review December 20, 2024 05:30
Contributor

@axosoft-ramint axosoft-ramint left a comment

I don't quite follow the logic of the added code (specifically, the priority calculation) - some comments throughout might help. @eamodio might have more insight into it.

It didn't seem to help with this case I encountered in my repo. Not a blocker for merge but would be nice if it can accommodate cases like this (the first issue is correct, the second is not):

image

Comment on lines 316 to 322
console.log(
    JSON.stringify(match.value),
    match.value.groups.numberChunk,
    match.value.groups.numberChunkBeginning,
    match.value.input,
    match.value.groups.issueKeyNumber,
);
Contributor

This console log should be removed. Likely left over from when you were testing it locally.

Contributor Author

removed

@nzaytsev nzaytsev force-pushed the bugs/3894-improve-issue-matching-with-branch-names branch 2 times, most recently from 625e1ab to 8e05baa Compare December 24, 2024 08:14
@nzaytsev nzaytsev force-pushed the bugs/3894-improve-issue-matching-with-branch-names branch from 66a9fd1 to e082063 Compare January 15, 2025 09:05
@justinrobots
Collaborator

@axosoft-ramint should this one make it into 16.2 or push to the next release?

@axosoft-ramint
Contributor

axosoft-ramint commented Jan 21, 2025

@nzaytsev To help me review this PR, please explain this a bit:

considered the distance from the edges of the chunk as a priority sign

Some specific examples, including what would happen before your PR and after your PR, and an argument for why you believe this is the case:

the chunk that is more close to the end is more likely actual issue number

...would help me a lot.

Thanks!

@nzaytsev
Contributor Author

nzaytsev commented Jan 22, 2025

@axosoft-ramint - all of these claims are my assumptions, based on our discussions and on my own experience or feeling. All of these examples were created to "tune" the manual parsing algorithm. I realize that these claims are not always true, but we cannot tune this algorithm to work for all users and all use cases. There are many contradictory cases, and I still don't know how to handle the release-notes-16 branch name, as it has a number at the end that doesn't match any release-like tag name. Ideally, we should use some ML, store the dataset on our servers, and use users' manual (de)associations as additional datasets for learning. Otherwise we're doomed to tune the current algorithm endlessly.

More concretely:

considered the distance from the edges of the chunk as a priority sign

On Dec 20th, 2024 at 9:43 AM, when we were discussing the two numbers in the branch name release-notes-16-1 and choosing between 16 and 1, @eamodio said:

They both certainly could be. I was just suggesting that if we limit the matching to one match, we'd still have a match with the -1. Assuming we'd prioritize a number at the start or end first and then inner matches

But I didn't agree that 1 is more likely to be the issue number than 16, so I proposed number chunks. Now 16-1 is a number chunk whose meaningful part is the first number. Sometimes devs name their branches like this:

  • bugs/1-foo
  • bugs/1-2-foo-solve-bar
  • bugs/1-3-foo-fix-baz

This means the devs are working on issue 1 and, for some reason, have 3 revisions of the branch.
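
A minimal sketch of the number-chunk idea, assuming a chunk is a run of digit groups joined by `-`, `.` or `_` and bounded by non-alphanumeric characters. The function name `chunkHeads` and the exact pattern are illustrative assumptions, not the regex used in this PR.

```typescript
// Hypothetical sketch: find number chunks in a branch name and keep only
// the first number of each chunk as the issue candidate.
function chunkHeads(branch: string): string[] {
	// Group 1: start of string or a non-alphanumeric boundary.
	// Group 2: the chunk itself, e.g. "16-1" or "1-2".
	// The lookahead rejects chunks glued to letters or digits, e.g. "2fa".
	const pattern = /(^|[^A-Za-z0-9])(\d+(?:[-._]\d+)*)(?![A-Za-z0-9])/g;
	const heads: string[] = [];

	let match: RegExpExecArray | null;
	while ((match = pattern.exec(branch)) !== null) {
		// The meaningful part of a chunk is its first number
		heads.push(match[2].split(/[-._]/)[0]);
	}
	return heads;
}
```

Under these assumptions, `release-notes-16-1` yields the single candidate `16`, and `bugs/1-3-foo-fix-baz` yields `1`; the trailing number of each chunk is treated as a revision rather than as an issue number.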

the chunk that is more close to the end is more likely actual issue number

Here I applied the breadcrumbs logic. Maybe devs work on an epic with the key X and have a scope of issues inside the epic:

  • X/1
  • X/2
  • X/3

In this case X is still an issue key, but the keys 1, 2 and 3 are more relevant and specific. The same goes for subtasks:

  • X/1/4
  • X/1/5
  • X/1/6
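
The breadcrumbs ordering can be sketched as below. This assumes every `/`-separated segment is a candidate key and that priority simply grows with depth; `rankBreadcrumbs` and the `Candidate` shape are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch of the breadcrumbs idea: deeper segments rank higher.
interface Candidate {
	key: string;
	priority: number;
}

function rankBreadcrumbs(branch: string): Candidate[] {
	return branch
		.split('/')
		.filter(segment => segment.length > 0)
		// Depth 0 is the epic; each deeper segment gets a higher priority
		.map((key, depth) => ({ key: key, priority: depth + 1 }))
		// Most specific (deepest) candidate first
		.sort((a, b) => b.priority - a.priority);
}
```

For `X/1/4` this orders the candidates as 4, then 1, then X, so the subtask wins while the epic key remains a lower-priority match.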

Successfully merging this pull request may close these issues.

Improve issue matching with branch names
3 participants