Thanks for your excellent work! I have a question about STAR-Prompt. I came across an ICLR 2025 submission that is very similar to your work, and some of its reviewers raised concerns about the fairness of incorporating CLIP.
For instance: "The use of CLIP’s text and image encoders, which contain extensive pre-trained knowledge, raises concerns about data overlap. The testing data may overlap with or be highly correlated to CLIP’s training data, making the observed performance gains somewhat expected. It’s unclear if the improvement is due to the novel aspects of TIPS or simply the inclusion of CLIP, as other methods might similarly benefit from using CLIP. This ambiguity makes it unclear to identify the key design elements driving performance improvement."
How do you view this issue? Did you encounter similar questions during the submission process?
I am looking forward to your reply. Thank you!
Hi @kkkcLi! It seems that your question is related to this paper currently under review. We had a similar experience with a previous submission of STAR-Prompt. At the time, we added datasets that differ significantly from CLIP's pretraining data in the rebuttal, but the paper was rejected anyway.
In the new submission, we decided to cover as many datasets as possible (medical and fine-grained datasets, as suggested by the reviewer of the paper you referenced, but also satellite and aerial datasets). Since the actual CLIP pretraining data is private, we added datasets on which CLIP has low zero-shot performance.
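For readers who want to apply a similar criterion, here is a minimal sketch of picking datasets by low zero-shot accuracy. It is not the STAR-Prompt code: the function names, the 0.5 threshold, and the toy 2-D embeddings are all assumptions for illustration. In practice the embeddings would come from CLIP's image and text encoders (for example, text embeddings of one prompt per class).

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def zero_shot_accuracy(image_embs, labels, class_text_embs):
    """Fraction of images whose most similar class text embedding is the true class."""
    correct = 0
    for emb, label in zip(image_embs, labels):
        pred = max(range(len(class_text_embs)),
                   key=lambda c: cosine(emb, class_text_embs[c]))
        correct += int(pred == label)
    return correct / len(labels)


def select_low_zero_shot(accuracies, threshold=0.5):
    """Names of datasets whose zero-shot accuracy is below `threshold`.

    The threshold is a hypothetical choice, not a value from the thread.
    """
    return sorted(name for name, acc in accuracies.items() if acc < threshold)


# Toy stand-ins for encoder outputs (assumed values, two classes).
class_text_embs = [[1.0, 0.0], [0.0, 1.0]]
image_embs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]]
labels = [0, 1, 1, 0]

toy_accuracy = zero_shot_accuracy(image_embs, labels, class_text_embs)
selected = select_low_zero_shot({"natural": toy_accuracy, "satellite": 0.30})
```

Low zero-shot accuracy is only a proxy for distance from the pretraining distribution, which is why it is used here in place of a direct overlap check against data that is not public.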