-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong implementation of the SM2 algorithm #1
Comments
Hey man! Thanks for the comments, I will make sure to read more into it tomorrow and try to fix it. |
Hello @brumar, I don't understand how you assume that AFAIU, Could you share your thought about that please? I'm not familiar with python, and feel really lost in anki's source code. |
Hello @gquemener, public function calcInterval($time = 1, $factor = 2.5, $oldinterval=1)
{
if ($time < 1) {
throw new RangeException('The number of repetitions must be 1 or higher');
}
if ($time == 1) {
$interval = 1;
} elseif ($time == 2) {
$interval = 6;
} else {
$interval = $oldinterval * $factor;
}
return ceil($interval);
} and public function calcNewFactor($oldFactor = 2.5, $quality = 4)
{
if ($quality > 5 || $quality < 0) {
throw new RangeException('Quality must be between 0 and 5');
}
$newFactor = $oldFactor+(0.1-(5-$quality)*(0.08+(5-$quality)*0.02));
$fac=max($newFactor, 1.3);
return $fac; // 1.3 < factor
} I read carefully your comment, and I think the critical point is there _I don't see any recursive call in nextInterval. The reason is that it should not be computed in a recursive fashion. The current EF, or EF' as you call it, does only concern the relation between the old interval and the new one. This is why I think the original description of the SM2 can be misleading to this aspect. If you are still not convinced, think about how I(n) increases with a EF going from 2.5 to 2.8 with - let's say- n=8. Then the new interval is around 6 times higher than the previous one, this is not practical at all. Because from 1 to 7 the interval are successively multiplied by 2.8 instead of 2.5, then multiplied a last time with 2.8. The ratio of (I(n)/I(n-1) is computed with (2.8/2.5)^7 * 2.8 = 6.2. |
I'm a bit confused, There must be an error in my understanding of the algorithm. https://gist.github.com/gquemener/de9daa5d6891c83384e9 The interval does not increase with your solution... |
I gave you messy explanations, please accept my sincere apologies. |
I've updated the comparison gist, I think results are corrects now |
Cool. Few comments : $newInterval = calcInterval($n, $oldFactor, $oldInterval);
$newFactor = calcNewFactor($oldFactor, $q); and not $newFactor = calcNewFactor($oldFactor, $q);
$newInterval = calcInterval($n, $newFactor , $oldInterval); (2)The recursive one gives smaller intervals because the factor can't reach an higher value than 2.5. This is not a big problem, but as I stated in my first comment, I believe this is a misinterpretation of the SM2. http://www.supermemo.com/english/ol/sm2.htm : according to my understanding, this upper boundary is only related to the very first SM algorithm. |
(1) Actually the lines are: $oldInterval = calcInterval($n, $oldFactor, $oldInterval);
$oldFactor = calcNewFactor($oldFactor, $q); This is because, AFAIU, the newly calculated factor will only be used in the next interval computation (and not the current one). Am I wrong? Can you post an usage example if so please? (2) Indeed, plus the distribution of intervals is wider with your solution (with the recursive solution, answering 1 or 2 has the same result on the interval computation, same for answering 4 or 5). |
(1) Oh indeed. Sorry. Concerning the updating process, maybe we are not on the same page. According to me, aside the backtracking of EF, the idea of the main process is : subject answer → update Interval accordingly. Maybe I missed something, but reading your code, it seems that the interval is updated not with the current answer given but the previous one. It does not makes sense from a user perspective. But maybe I missed something. (2) Yes. (3) I forgot to mention something. I don't remember for @wiwofone implementation but to respect strict SM2, you should reset interval to the starting ones (1 then 6) when answer is below 4. Check the link I sent if you want to be sure. Personally, despite I think interval reset is a good idea, I think SM2 is a bit rough to consider that all answer below 4 force the rehearsal cycle to start over. IMHO, only 3 values of $q should be offered to the user : 0,3,4,5 for (blackout,hard,ok,easy). And no reset for the answer 3, only for 0. That's how anki proceed, and I think it's more clear for the user this way. |
(1) I see, indeed it makes more sense to use current question evaluation to evalute the next question interval! I've updated the gist. (3) I'll think about the interval reseting. In fact, the number of values is something i don't understand here. I've installed Anki, and it only proposes me 3 grades to evaluate my answers : Review (<1d), Correct (<10m) and Easy (4d). I don't see any reference to these grades in the SM-2 link you've provided. I guess this is the "modified" part in Anki of the SM-2 algorithm you mentionned but I don't find any documentation about it (apart from http://ankisrs.net/docs/manual.html#learning but I can't make any correlation between it and SM-2). I don't see how the interval written in the buttons could be respected either, because they will always be greater than 1 (day) and never be 4 (days), if it respects SM-2. Could you enlight me about that too please? |
Sorry if there is any errors or unclear elements in my answer. I have to hurry. I may get back to you in the evening to explain further some points if you need so. The behavior of Anki on new items and reseted items is special and do not strictly follow SM2. I won't be sure if I try to describe it, the best is to see by yourself how it works. For the latter rehearsal cycles, however, I can ensure you that it's more in line with the original SM2 proposition with a 4 buttons UI roughly related to the 0,2,3,5 possible qualities values I mentioned before and with no reset for 2. I know I mentioned 0,3,4,5 later, but it seems incorrect after reviewing the anki code source. def updateFactor(self, card, ease):
"Update CARD's factor based on EASE."
card.lastFactor = card.factor
if not card.reps:
# card is new, inherit beginning factor
card.factor = self.averageFactor
if card.successive and not self.cardIsBeingLearnt(card):
if ease == 1:
card.factor -= 0.20
elif ease == 2:
card.factor -= 0.15
if ease == 4:
card.factor += 0.10
card.factor = max(1.3, card.factor) I did the math one day, and if I remember well, this factor update is roughly equivalent in SM2 with what happens when $q equals to 2,3 or 5. So "ease" should be like $q-1 . I hope this helps. |
Hello guys, Thank again guys 😅 |
First of all, thank you a lot for this library coded with high quality standards. It saved me a lot of time. Here is my contribution :
I just noticed that your interval computation does not strictly follow the SM2 algorithm
I think you had a strict interpretation of this line :_''' I(n):=I(n-1)_EF'''* , that's why you got into a recursive computation to find I(n). Indeed, this mathematical notation seems dubious to me and should have been rewritten as '''I(n):=I(n-1)*EF(n)''' to be less ambiguous. If you are skeptic about my comment, you can check the implementation this function in anki .
Your computation of the new interval can be problematic for cards with a lot of repetitions small changes on EF involves big changes on the interval (it can suddenly double or divided by two for cards with more than 5 repetitions).
Diving more into SM2 for the purpose of this comment, I also realised that another misunderstanding might have occurred.
Avoiding $newFactor > 2.5 seems not consistent with the SM2 algorithm. I think you got mislead (and once again, the original article is not really clear) because the statement "E-Factors were allowed to vary between 1.1 for the most difficult items and 2.5 for the easiest ones" seems to be only related to the SM-0 (or SM-1 ?) algorithm. I also checked on anki, and there is no track of this upper bound.
Best Regards
The text was updated successfully, but these errors were encountered: