005. May 28 to June 1
aradu12 edited this page Jul 4, 2018
- task 1: mine more repos using filtering and linking to issues ✔️
- task 2: mine some python repos ✔️
- linking commits to issues definitely gives more true positives
- filtering to .java files and filtering out "typo", "NPE" reduced the number of false positives
- stemming of keywords hasn't yet resulted in significant improvements, but probably will in the long run
- gathering data for the 50 most-starred Java repos on GitHub that haven't already been mined
- filtered to .py files and kept filtering out "typo", "NPE" to reduce false positives
- have looked at 8 repos so far
- an interesting situation that keeps coming up: some repos have lots of stars but very few commits (<30), and these repos don't give many hits
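
The filtering steps above could be sketched roughly as follows. This is only an illustration and not the actual mining script: the `Commit` class, the keyword set `BUG_STEMS`, the `#123`-style issue pattern, and the suffix-stripping `crude_stem` (a stand-in for a real stemmer) are all assumptions. Only the `.java`/`.py` extension filter and the excluded words "typo" and "NPE" come from the notes.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed keyword stems; the notes do not list the keywords actually used.
BUG_STEMS = {"fix", "bug"}
# Exclusions named in the notes, lowercased.
EXCLUDED_STEMS = {"typo", "npe"}
# Assumed convention: issues are referenced as "#123" in the commit message.
ISSUE_REF = re.compile(r"#(\d+)")


@dataclass
class Commit:
    """Hypothetical minimal view of a commit: message plus touched file paths."""
    message: str
    files: List[str]


def crude_stem(word: str) -> str:
    """Very rough suffix stripping, standing in for a proper stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def is_candidate(commit: Commit, extension: str = ".java") -> bool:
    """Keep commits that touch a file with the given extension, mention a
    bug-like keyword, and do not mention an excluded word."""
    if not any(path.endswith(extension) for path in commit.files):
        return False
    stems = {crude_stem(w) for w in re.findall(r"[a-z]+", commit.message.lower())}
    if stems & EXCLUDED_STEMS:
        return False
    return bool(stems & BUG_STEMS)


def linked_issues(commit: Commit) -> List[int]:
    """Issue numbers referenced in the commit message."""
    return [int(n) for n in ISSUE_REF.findall(commit.message)]
```

For the python repos the same check would be called with `extension=".py"`. Commits for which `linked_issues` is non-empty are the ones that can be cross-checked against the issue tracker.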
-- from last week:
- filtering commit messages to those affecting .java files and using stemming
- using issues to find misuses
- separating project-specific misuses from project-independent ones
- adding API and 'correct usage' info to data
- adding a general 'rule' message to data
- removed the data from the 'plaid' repo that was irrelevant
- was thinking of adding a 'lang' description to data to clarify if it is a java or python project
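
One possible shape for a data entry with the fields mentioned above, including the proposed 'lang' field. The class name, all field names, and the example values are made up for illustration; the notes do not specify the actual data format.

```python
from dataclasses import dataclass, asdict

ALLOWED_LANGS = ("java", "python")


@dataclass
class MisuseRecord:
    """Hypothetical record for one mined misuse."""
    repo: str               # repository the commit was mined from
    commit: str             # commit identifier
    api: str                # API involved in the misuse
    correct_usage: str      # short description of the correct usage
    rule: str               # general 'rule' message
    project_specific: bool  # True if the misuse only applies to this project
    lang: str               # "java" or "python"

    def __post_init__(self):
        # Reject anything other than the two languages mined so far.
        if self.lang not in ALLOWED_LANGS:
            raise ValueError(f"unsupported lang: {self.lang!r}")


# Illustrative values only, not real mined data.
example = MisuseRecord(
    repo="example/repo",
    commit="abc123",
    api="java.util.Iterator",
    correct_usage="call hasNext() before next()",
    rule="check that an element exists before reading it",
    project_specific=False,
    lang="java",
)
row = asdict(example)  # plain dict, ready to be written out as JSON or CSV
```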