Hi, could you please help me understand how preprocessing is done for the CovidQA corpus? I ask because the contexts in the CovidQA dataset seem to be much larger than the maximum length set in the code (which is 300+, while BERT's max_length is 512 tokens). How is the data processed to fit within that limit? I couldn't find the code for that in the repo. Please advise. Thank you.
To deal with long contexts you use moving (sliding) windows: split the context into overlapping windows, compute the possible answers per window, and combine the answers afterwards. Hope that helps! If you have detailed questions about the processing, please ask in FARM or Hugging Face transformers directly.
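For reference, here is a minimal sketch of that sliding-window idea using the Hugging Face transformers fast-tokenizer API. This is illustrative, not this repo's actual preprocessing code; the model name and the `max_length=384` / `stride=128` values are assumptions, not taken from this repository:

```python
# Minimal sliding-window sketch for long QA contexts (illustrative only).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

question = "What is the incubation period of the virus?"
# Stand-in for a CORD-19 paper far longer than BERT's 512-token limit.
long_context = " ".join(["Some sentence from a long COVID-19 paper."] * 200)

# truncation="only_second" truncates only the context, never the question;
# return_overflowing_tokens=True emits one feature per window, each window
# overlapping the previous one by `stride` tokens so an answer span that
# straddles a window boundary still appears whole in some window.
encoded = tokenizer(
    question,
    long_context,
    max_length=384,
    stride=128,
    truncation="only_second",
    return_overflowing_tokens=True,
    return_offsets_mapping=True,
)

# One entry in input_ids per window; the model scores answer spans in each
# window independently, and the best span across windows is kept afterwards.
print(f"context split into {len(encoded['input_ids'])} windows")
```

FARM builds these windowed QA features internally (via a doc-stride setting) when it prepares inputs, which is likely why you don't see explicit chunking code in this repo.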