Thank you for sharing the code implementation of your work. I have been carefully studying your paper and codebase, but I have come across a discrepancy that I would like to clarify.
In your paper, the reduce_sim term is described as part of the loss function, where it helps optimize the prompt selection process. In the provided code, however, reduce_sim is computed but commented out of the loss. As a result, prompt_key_dict, which is randomly initialized, never receives gradient updates during training.
Here are my specific questions:
Was the exclusion of reduce_sim from the loss function intentional in the provided code? If so, could you elaborate on the reasoning behind this decision?
How does the omission of reduce_sim affect the effectiveness of the learned prompts, especially since the keys (prompt_key_dict) remain randomly initialized without updates?
If this was an oversight, could you provide guidance on how to properly incorporate reduce_sim into the loss function and ensure that prompt_key_dict is updated?
Did the implementation consider the diversity of keys during training, such as tracking how frequently each key was selected? For example, is there a mechanism to ensure that less frequently used keys are adapted or penalized to promote diversity?
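To make questions 3 and 4 concrete, here is a minimal sketch of what I imagine the fix might look like, in the style of the L2P prompt pool. All names here (prompt_key, query, task_loss, key_freq, lambda_sim) are placeholders standing in for whatever the repository actually uses, e.g. the entries of prompt_key_dict; this is only an illustration under those assumptions, not a claim about your implementation.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

pool_size, dim, batch, top_k = 10, 16, 4, 3

# Hypothetical stand-in for prompt_key_dict: a learnable pool of keys.
prompt_key = torch.nn.Parameter(torch.randn(pool_size, dim))
query = torch.randn(batch, dim)  # per-sample query features

# Cosine similarity between each query and each key, then select top-k keys.
sim = F.normalize(query, dim=1) @ F.normalize(prompt_key, dim=1).t()  # (B, P)
topk_sim, idx = sim.topk(top_k, dim=1)

# reduce_sim: mean similarity of the selected keys (the pull term).
reduce_sim = topk_sim.mean()

task_loss = torch.tensor(1.0)  # placeholder for the real task loss
lambda_sim = 0.5               # weight of the surrogate term

# Subtracting reduce_sim pulls the selected keys toward their queries,
# so prompt_key now receives gradients when loss.backward() runs.
loss = task_loss - lambda_sim * reduce_sim
loss.backward()
assert prompt_key.grad is not None

# Question 4 (diversity), L2P-style: divide the similarity by a running
# count of how often each key was selected, so over-used keys are
# de-prioritised and rarely used keys get a chance to adapt.
key_freq = torch.ones(pool_size)     # hypothetical running selection counts
penalized_sim = sim / key_freq       # broadcasts (B, P) / (P,)
_, diverse_idx = penalized_sim.topk(top_k, dim=1)
```

Is something along these lines what the paper intends, or does the semi-soft design make the key update unnecessary?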
I appreciate your time and effort in addressing these queries, as they are critical to understanding and reproducing the results presented in your paper.
Thank you!
Best regards,
Jason
Thank you for your interest in our TEMPO work! This code corresponds to our ICLR camera-ready version, in which the prompt is designed in a semi-soft manner. The prompt-pool implementation remains in this repository solely for future research discussion; it is not currently used in training or inference.