Question about the simulation dataset #4
Hi Fangfei, I just uploaded the simulation datasets used in the paper to the GitHub repo. Please try them out. Yan
Thank you very much!!
Hi Yan, I just tried one simulation dataset with TideHunter, but I only got 26 records in the result file. I'm not sure why this happens. Could you help me figure it out? Kind regards,
Hi Fangfei, Thanks for pointing this out. Yan
Also, TideHunter could output multiple tandem repeats if possible. Yan
Thank you, Yan. I tried the latest TideHunter with Fangfei
Hi Yan, Sorry to bother you again. I'm trying to generate more simulation datasets with different sizes of repeat patterns (for example, 20 and 50). But I'm a little confused about the use of Kind regards,
The repeat pattern size and copy number were set without using pbsim. The different error rates and error ratios were set directly by passing different parameters to the pbsim program. Yan
The current version of TideHunter has some improvements over the old one, so this is expected.
May I ask about the base frequencies when generating random flanking sequences? Did you use equal base frequencies?
The 100 bp sequence was also randomly extracted from the reference genome.
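For anyone following along, here is a rough Python sketch of how an error-free tandem repeat template could be assembled before handing it to a read simulator. It is not the script used for the paper. The function names, the `flank_size` parameter, the integer `copy_number`, and the idea of drawing both the repeat unit and the flanks from the reference are all assumptions made for illustration; the thread only states that the pattern size and copy number were set outside pbsim and that a 100 bp sequence was randomly extracted from the reference genome. Error rates and error ratios are not handled here, since those were set through pbsim.

```python
import random


def extract_random_segment(reference: str, length: int, rng: random.Random) -> str:
    """Return a random substring of `reference` with the requested length."""
    if length <= 0 or length > len(reference):
        raise ValueError("length must be between 1 and len(reference)")
    start = rng.randrange(len(reference) - length + 1)
    return reference[start:start + length]


def build_tandem_repeat_template(
    reference: str,
    unit_size: int,
    copy_number: int,
    flank_size: int = 100,  # assumed illustrative value, not confirmed by the thread
    seed: int = 0,
) -> str:
    """Build flank + (unit repeated copy_number times) + flank, with no errors.

    Assumption: the repeat unit and both flanks are drawn at random from the
    reference sequence. Sequencing errors would be added afterwards by a read
    simulator such as pbsim.
    """
    rng = random.Random(seed)
    unit = extract_random_segment(reference, unit_size, rng)
    left_flank = extract_random_segment(reference, flank_size, rng)
    right_flank = extract_random_segment(reference, flank_size, rng)
    return left_flank + unit * copy_number + right_flank


if __name__ == "__main__":
    # Toy stand-in for a reference genome; a real run would load a FASTA file.
    toy_rng = random.Random(42)
    toy_reference = "".join(toy_rng.choice("ACGT") for _ in range(5000))
    for size in (20, 50):
        template = build_tandem_repeat_template(toy_reference, size, copy_number=10)
        print(size, len(template))
```

Each template could then be written to a FASTA file and used as simulator input, with the unit size and copy number varied per dataset.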
Thanks a lot! And for the different repeat sizes, I notice you used a 15% error rate. Is the error ratio the same as 15%-a or 15%-b? Sorry for so many questions... Fangfei
They are different; please refer to the TideHunter paper published in Bioinformatics.
Hi,
I'm a master's student at the University of Melbourne, and my research project is also to develop an approach for tandem repeat detection. To evaluate the performance of our approach, my supervisor suggested that I compare it with TideHunter using the 15 simulation datasets mentioned in your paper. Would you mind telling me where I could find the simulation data?
Many thanks. I look forward to your reply.
Kind regards,
Fangfei