Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New practice exercise - Perceptron #678

Closed
wants to merge 30 commits into from
Closed

New practice exercise - Perceptron #678

wants to merge 30 commits into from

Conversation

depial
Copy link
Contributor

@depial depial commented Dec 10, 2023

This is a first attempt at contributing an exercise, so there may be some deficiencies. I haven't received a reply to the issue I opened, so I will leave this as a PR in case it's of interest to Exercism.

  • Since this is my first contribution, I would appreciate any help in bettering the exercise so I can contribute more efficiently in the future.
  • I was unable to get configlet to automatically create the UUIDs for the tests, so I've done it manually (using the UUID function).
  • The test setup uses pseudo-random dataset generation. I haven't seen this technique in other Exercism exercises, so I don't know if this creates a the above problem.

@depial
Copy link
Contributor Author

depial commented Dec 10, 2023

It looks like a lot of checks failed, so I apologize since I'm not sure what happened.

@yctai1994
Copy link
Contributor

Would you mind explaining what you intend to do in your commit? Particularly for the test cases, it would be better to provide any math explanation. If so, I can try to help with your pull request after next week.

@depial
Copy link
Contributor Author

depial commented Jan 15, 2024

Edit: I just had a look at the Checks tab and one issue is that the dotest is not defined. So there might be an issue with how that function is being called. I'll look a little more into this to see if I can find a fix. Below is my original response to your question.

TLDR: I have experience creating coding exercises, but, while I've done my best to read the Exercism docs, I still have quite limited experience with GitHub and the Exercism contribution procedure, so that is likely where I'll need the most help.

Excuse me if I misunderstand your question, since I'm new in this process, but I'll give a bit more mathematical description of the functions I use in creating the test cases and testing the returned values.

In the test cases, there are two types: fixed and pseudorandom. The fixed cases are just an array of six points and an array of labels corresponding to the points' class. The pseudorandom tests use the function population in testtools.jl to create linearly separable arrays of many more points and labels, but a seed is used, so the same test set is returned every time for reliability. Effectively, the function first creates a line (pseudorandomly), then uses it as a reference to create the separated populations of the two classes to be passed to the student. Care has been taken to keep the populations close to the origin, while allowing for some lopsidedness (more of one class than the other). If you would like more technical mathematical description, please let me know.

The task, put simply, is to return a line (in hyperplane format) which can completely separate the points with one class on one side and the other class on the other side. The other function in testtools.jl, dotest, tests the hyperplane returned by the student via an elementwise dot product and scalar multiplication which can determine on which side of the hyperplane the points are. If $h = hyperplane$, $p=point$ and $l = label$, then the equation to see if the point is on the "correct" side is $$(h \cdot p) l>0$$ However, a hyperplane still separates the points if $(h \cdot p) l<0$ for all points. However, this isn't the convention in ML, so this is tested, but does not result in a test failure. For more mathematical detail on these aspects, please see instructions.md

That being said, I believe there should be no problem with the mechanics/mathematics of the exercise, since I've tested it quite thoroughly a couple of places. I figure there might be an issue with configuration, since this is my first contribution here and I've likely messed something up. For example, I'm not sure if Exercism supports this type of pseudorandom test generation since I've only seen the fixed test type which usually get a separate UUID for each one in tests.toml (meanwhile the whole pseudorandom group of 40 tests gets a singe UUID in my test setup). One other thing is that there could perhaps be version issues, since I wrote this in Julia 1.8 and I see 1.6 and 1 in the failed checks above. While I didn't have any issues when testing with 1.5, I can no longer be sure I haven't made any version specific changes since those tests.

Please feel free to ask any further clarifications.

@depial
Copy link
Contributor Author

depial commented Jan 15, 2024

After looking over the errors more, it looks like runtests.jl fails to include testtools.jl and the test runner check fails with error message: "LoadError: SystemError: opening file \".//testtools.jl\": No such file or directory....

I've seen extra files added to the two normal files (slug.jl and runtests.jl) such as in Alphametics which has the permutations.jl file added. However, permutations.jl is not called in the test runner (it's just meant to be accessed by students), so I figure the issue is if including an 'outside' file from the test runner is possible.

In other words, is it possible to get testtools.jl to be visible to runtests.jl in the test runner?

@cmcaine
Copy link
Contributor

cmcaine commented Jan 30, 2024

I pushed a merge commit so we can see how CI likes it when running on a newer version of the repo. If that is inconvenient for you, please let me know and I won't do it again :)

@depial
Copy link
Contributor Author

depial commented Jan 30, 2024

Thanks Colin! It looks like there's still an issue with testing. However, I think it's a different (but possibly related) error from what I saw before. The LoadError for testtools.jl is gone and now I'm seeing UndefVarError: dotest not defined, where dotest is a helper function in testtools.jl

I didn't want to put the testing function dotest directly into runtests.jl because it has code which could be copied into a solution. That's why, instead of having all the testing helper functions in runtests.jl, I created testtools.jl in the same folder hoping the functions could be called from runtests.jl. I'm not sure if this is possible or if all functions have to be defined explicitly in runtests.jl.

@cmcaine
Copy link
Contributor

cmcaine commented Jan 31, 2024

You're welcome :)

The error is probably because we override include() when running the tests for the whole track at once.

I'm not sure there is a good way to hide the test code from the student. I'd just put any spoilery functions at the bottom of the test file with a note saying "Reading past here will give you the answer, so watch out!"

Added helper functions from `testtools.jl` to bottom of file so it doesn't have to be included. Since the `dotest` function does not show a working solution (only how to check for one), I think a spoiler warning is unnecessary, and could just end up attracting more attention than without.
Tests still failed with helper functions at the bottom of the file, so I'm moving them to the top to see if that changes things.
@depial
Copy link
Contributor Author

depial commented Jan 31, 2024

It seems to be working now* :)

I've put the helper functions at the head of runtests.jl since the test runner still produced an error (dotest undefined etc) when I put them at the bottom. Is there a way to put them at the bottom of the file? I'd much prefer that.

I've left the testtools.jl in the PR for now, but maybe that should be deleted?

It's too bad that we can't include other files for testing. It would make testing coding problems with pseudorandom test generation much easier since we could use a working reference solution to check against the students'. However, this exercise doesn't have an explicit working solution, so it's not a huge concern.

*just a note: the julia 1.6 - windows-2022 check took ~10 minutes just to load dependencies

@cmcaine
Copy link
Contributor

cmcaine commented Jan 31, 2024

The main motivation is that you want testtools.jl not to appear in the web UI, right?

I'm not sure that's practical while also having it be downloaded for people who use the exercism CLI.

@cmcaine
Copy link
Contributor

cmcaine commented Jan 31, 2024

We can put the functions at the end of the file if we invoke the testset within a function and only call that function at the end of runtests.jl

Wrapped the test set in a function to be called at the end of the file so the helper functions can be put at the bottom of the file.
Deleting since test helper functions have been moved to `runtests.jl`
@depial
Copy link
Contributor Author

depial commented Jan 31, 2024

I'm not sure that's practical while also having it be downloaded for people who use the exercism CLI.

Oh, good point! I hadn't considered compatibility with the CLI. Cheers!

I've put the test set in a function and called it at the end of the file, as you suggested, and it looks like all the tests have passed. I've also deleted the testtools.jl file since it's superfluous now.

@cmcaine One other thing I would like to ask about is attribution. I had put my name as author and yours as contributor just to not leave those fields blank, but is it okay to leave them blank? I don't really care to be listed as an author and I don't know if you want to be credited as a contributor or not.

@cmcaine
Copy link
Contributor

cmcaine commented Feb 1, 2024

Cool, thanks for fixing these things up :) I'll read through it properly in the next couple of days!

I don't know what happens if no authors are listed. If you strongly prefer not to be listed as the author on the website then you could try to credit @JuliaLang or @exercism. I don't know if that would work, though.

@depial
Copy link
Contributor Author

depial commented Feb 1, 2024

Thanks for the help! :)

I'm quite flexible on the author situation, so if it's any hassle at all, I certainly don't mind going with the norm. Feel free to modify/leave that (or anything else) as you see fit!

exercises/practice/perceptron/.docs/instructions.md Outdated Show resolved Hide resolved
exercises/practice/perceptron/.docs/instructions.md Outdated Show resolved Hide resolved
exercises/practice/perceptron/.meta/config.json Outdated Show resolved Hide resolved
@@ -0,0 +1,25 @@
# This is an auto-generated file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it? I don't see a perceptron exercise in problem-specifications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I fixed this or not. I've deleted the auto-generated comment. I had to manually create the UUID since when I ran configlet, it kept saying everything was done even when there were no UUIDs in this file (sorry I can't remember the exact message). Let me know if I need to do something else

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not very familiar with what configlet needs when we're writing a practice exercise from scratch (one where there is not an upstream definition in problem-specifications). Someone in @exercism/reviewers might know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at some problem-specifications, I'm wondering if psedorandom test generation is possible on the platform. Checking canonical-data.json for Alphametics shows each exercise with data such as a unique UUID, plus the input and expected result. If this file is necessary and needs the hardcoded input and expected output, although possible, it'll be a bit more work to produce.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

canonical-data isn't required. You can see parametric tests in the rational-numbers exercise and elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I'd forgotten about the testset at the end of Rational Numbers :)

using Test, Random
include("perceptron.jl")

function runtestset()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should include some tests with manually specified input data of just a few points to make this more approachable and also as good practice (e.g. first few tests could be spaces with just 2-4 points placed manually).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not sure if I've understood, but there are four tests with (six) manually specified points to illustrate a couple of different possible orientations of a hyperplane (testset "Low population"). After that the 40 pseudorandomly generated tests begin (testset "Increasing Populations"). Was this what you meant?

Copy link
Contributor

@cmcaine cmcaine Feb 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant we could take some of the manually specified examples out of the runtestset() function and test them without the support functions, just so that it's less mystifying to a student reading the tests.

And maybe we should have the student write their own function for finding the computed label of a point?

e.g.

# Student must implement both `perceptron(points, labels) -> boundary (a vector of 3 weights)` and `classify(boundary, point)`

@testset "Boundary is a vector of 3 weights" begin
    boundary = perceptron([[0,0], [3, 3]], [1, -1])
    @test eltype(boundary) <: Real
    @test length(boundary) == 3
end

@testset "Originally provided points should be classified correctly" begin
    boundary = perceptron([[0,0], [3, 3]], [1, -1])
    @test classify(boundary, [0, 0]) == 1
    @test classify(boundary, [3, 3]) == -1
end

@testset "Given 3 labeled points, an unseen point is classified correctly" begin
    # Adding more points constrains the location of the boundary so that we can test
    # the classification of unseen points.
    boundary = perceptron([[0,0], [1, 0], [0, 1]], [-1, 1, 1])
    @test classify(boundary, [0, 0]) == -1
    @test classify(boundary, [2, 5]) == 1
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! I can throw a couple in there like your first examples, but I've got a question about the classify:
Due to the wide range of possible decision boundaries returned by Perceptron, beyond carefully selected examples, I'm not sure if testing classification of unseen points is viable. Also, since the algo is effectively just the three parts of classify + update + repeat, couldn't requiring a separate classification function introduce an unnecessary constraint on the task?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

I don't see why it would constrain the student or the task for us to ask them to provide a classify function. What do you mean by that?

I agree that testing with unseen points is a bit tricky, but it's not super hard given that we control the input data and it's not too bad for us to either only include hand-written tests or to do some geometry to ensure our unseen pointd will always be classified correctly by a valid linear boundary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the learning rate trivial?

Sorry, over-generalization from me there :) I guess I'm considering the basic (pedagogical) Perceptron, i.e. one provided with small, dense, separable populations, since, under separability, the learning rate doesn't affect either the final loss or the upper bound of the number of errors the algorithm can make. That said, I was wrong to say the initial hyperplane affects these, since it doesn't either. It just seems non-trivial to me because it can be returned.

I'll try to make the necessary changes to the PR today and/or tomorrow :)

Copy link
Contributor Author

@depial depial Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything you would add (e.g. subtyping) to my suggestion for a slug, or do you think I could just copy and paste?

Edit: In my initial suggestion for the slug, conversion to Float64 is enforced, but the exercise (as presented) could also be handled entirely with integers. Should I drop all references to Float64? Something like:

mutable struct Perceptron
    # instantiates Perceptron with a decision boundary 
    # this struct can remain unmodified
    dbound
    Perceptron() = new([0, 0, 0])
end

function fit!(model::Perceptron, points, labels)
    # updates the field dbound of model (model.dbound) and returns it as a valid decision boundary
    # your code here
end

function predict(model::Perceptron, points)
    # returns a vector of the predicted labels of points against the model's decision boundary
    # your code here
end

It might make it appear cleaner and less intimidating for students unfamiliar with the type system.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the struct is really adding much, so I'd remove it. You can pick what you like, tho.

If I try the exercise and hate the struct then I might veto it, but not until then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should include some tests with manually specified input data of just a few points to make this more approachable and also as good practice (e.g. first few tests could be spaces with just 2-4 points placed manually).

Agree with this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with this

Could you be more specific? There are already four tests with six manually specified points which check for different possible orientations of a decision boundary.

Beyond this, we've had an extensive conversation about using "unseen points". (TLDR: I believe testing unseen points to be potentially more detrimental to understanding than beneficial)

The other ideas for tests were of the type to see if the student is returning the correct object (vector of three real numbers), etc. Is there something else you were thinking?


We will be dealing with a two dimensional space, so our divider will be a line. The standard equation for a line is usually written as $y = ax+b$, where $a,b \in \Re$, however, in order to help generalize the idea to higher dimensions, it's convenient to reformulate this equation as $w_0 + w_1x + w_2y = 0$. This is the form of the [hyperplane](https://en.wikipedia.org/wiki/Hyperplane) we will be using, so your output should be $[w_0, w_1, w_2]$. In machine learning, ${w_0,w_1,w_2}$ are usually referred to as weights.

While hyperplanes are equivalent under scalar multiplication, there is a difference between $[w_0, w_1, w_2]$ and $[-w_0, -w_1, -w_2]$ in that the normal to the hyperplane points in opposite directions. By convention, the perceptron normal points towards the class defined as positive, so this property will be checked but not result in a test failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should maybe be a test failure. It would simplify the messaging for the student in the instruction and from the tests. Do you think that would be okay?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, negative numbers are scalars (at least to me?), so I would probably drop the first part of the first sentence.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should maybe be a test failure. It would simplify the messaging for the student in the instruction and from the tests. Do you think that would be okay?

I didn't want to strictly penalize a flipped normal since it is just a convention (logical though it may be) and it would at still tell the student they have successfully found a decision boundary. In any case, I think it would be nice to leave this in, but if you think it's better to enforce the convention (which of course is good practice), we can turn this into a failed test, with it's own error message.

Also, negative numbers are scalars (at least to me?), so I would probably drop the first part of the first sentence.

I've rephrase the scalar multiplication part to make more clear what I was trying to say (that a hyperplane looks the same multiplied by any scalar, but negative scalars flip what we define to be the normal). Hopefully that's clearer :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we ask the student to define a classify(boundary, point) function then they'll have to get consistent about the boundary anyway?

See #678 (comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: moving this down here to split up topics

I agree that testing with unseen points is a bit tricky, but it's not super hard given that we control the input data and it's not too bad for us to either only include hand-written tests or to do some geometry to ensure our unseen pointd will always be classified correctly by a valid linear boundary.

I was thinking about this, and it appears it is redundant to test unseen points which are guaranteed to be on the correct side of any possible decision boundary for a population. That's because there are two hyperplanes which define the limits of a cone of possibility, and each of these limits can be defined by two points, one from each class. These two points work like support vectors, and are robust to any additional (guaranteed valid) unseen points. This means only the support vectors need to be classified correctly to verify that the hyperplane is a valid decision boundary, and this is already done under the current testing.

For this reason I think it would be best to leave out testing of unseen points, since, for students encountering this for the first time, it could give the unintended impression that Perceptron will always correctly classify unseen points.

Any thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I understand your intentions, but I want to check if I'm getting everything:

  1. Confused students won't be able to encode the seen points in the perceptron and pass the tests.

I can possibly understand this two ways, so I'll take a stab at both:

  • Do you mean students won't have seen-points to use locally to debug? If so, the first set of fixed tests have the points and labels shown explicitly.
  • Or, do you mean they might hard code the answers into their function in order to pass the tests (i.e. 'cheat')? If so, the pseudorandom tests don't explicitly show the seen points, and someone would have to hard code over 1200 points across 40+ tests to pass, which, while doable, seems unlikely (also couldn't they just look at other students' solutions after submitting once?).

We can have unseen points without producing ambiguous tests by picking points that are geometrically guaranteed...

This actually still leaves an ambiguity. If a student presents an incorrect decision boundary (DB), the unseen point is not guaranteed to be either correctly or incorrectly classified. It all depends on the particular DB. Furthermore, if the point happens to be classified correctly, it can give the impression that the DB is correct instead of properly relaying the actual information of being just not incorrect, thereby misleading the student.

Just a note: the reason I said that any test that can be done with unseen points can be done as well or better with seen-points is that correct classification of a maximum of four special points (support vectors) out of the seen-points is both necessary and sufficient to tell if the DB is 100% correct or not. Any other test is technically deficient on its own, and deficiency introduces ambiguity. On the other hand, any other in conjunction with classifying the support vectors is superfluous. However, finding them is a non-trivial task, so every point is exhaustively classified in testing (to make sure they are found and tested).

Another option would be to train on and classify larger datasets and assert that the accuracy is above some threshold that it is either impossible or implausible for a correct implementation to miss.

I believe this is what the pseudorandom test set does and accuracy is required to be 100%. Is there something further that I'm missing?

I think it's fine to just call both functions for every test.

I believe this adds unnecessary confusion when there is a failed test as to which function is causing the failure.

  1. the tests will make more sense to students who read them.

At this point, even if this is the case, the added possibility of confusion created through various ambiguities, and/or the difficulties in trying to mitigate them, make including these features seem to be not worth it. I don't mind including ambiguity in exercises, since it provides 'real world' experience, but it depends on the audience. From what I gather from your concern in this point, you are looking out for less experienced students, so I would presume less ambiguity would be better?

E.g. for linearly separable datasets, I think the centroid of any triangle drawn from points of the same set will be classified correctly by any correct implementation.

Thanks! I was trying to think of the simple way you had mentioned, but I had lost the trees for the forest :)

Copy link
Contributor Author

@depial depial Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that it's necessary to test the training and classification separately, though we can choose to do so. I think it's fine to just call both functions for every test, as shown above.

Sorry, I just wanted to clarify, I am on board with a classification function if it's tested separately before being used in conjunction with the perceptron function.

It's really the unseen points I'm most hesitant to accept. I think they would be very appropriate in the case of SVM, but the nature of Perceptron just seems to make them more of a bane than a benefit.

Copy link
Contributor

@cmcaine cmcaine Feb 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your belief that unseen points will cause ambiguity (test failures for valid decision boundaries) is making you think that they're a big problem.

But, as I hope the triangle centroid thing shows, we can generate unseen points with zero chance of them causing false test failures.

If you don't believe the triangle thing, then maybe you can believe that, for linearly separable datasets, any point contained by the convex hull of the points of one label will be given that label by any linear boundary-based classifier that has zero error on the training set. For a linear boundary and a 2D space that one should be provable geometrically with a ruler and some drawings.

Or you can easily show that any point on the line segment between the support vectors for a label must be classified correctly by any valid linear decision boundary.

Copy link
Contributor Author

@depial depial Feb 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have failed to unmuddle my concern :) I'm okay with testing to demonstrate how classification of unseen points works, but testing as proposed is too easily confused with testing unseen points to verify the correctness of a decision boundary. My thoughts on why testing unseen points to verify correctness is undesirable follow:

I'm with you on the centroid idea (which I quite liked) and familiar with the other arguments you've given, so I'm not concerned about test failures for valid decision boundaries. The minor issue I have in the case of valid decision boundaries is that testing with unseen points to validate a decision boundary is simply redundant. Why it is redundant can be gleaned from your arguments on why and how we can choose correct unseen points: namely they are chosen entirely in reference to seen points.

One issue that concerns me involves invalid decision boundaries. With an invalid decision boundary, there is no guarantee how an unseen point will be classified (just as there is no guarantee how any point will be classified). There are two cases:

  1. The unseen point is classified incorrectly and there is a test failure. The student is alerted to the fact that their decision boundary is incorrect. This I'm okay with, although it is still redundant.
  2. The unseen point is classified correctly and the test passes. The student is not alerted to the fact their decision boundary is incorrect and either receives conflicting information from other tests or may indeed end up believing that their decision boundary is correct. This is the first aspect I dislike.

The other issue that concerns me is on a more psychological note: People often take partial information, over-generalize and create false impressions (e.g. stereotypes/prejudices/etc). That's how some of the more naive or inattentive students, seeing testing on unseen points in this way, could walk away with the mistaken impression that Perceptron can correctly classify any unseen point (not just carefully selected points). This is the second aspect I dislike.

So, with what the possible outcomes are from testing unseen points, redundant good info or unnecessary potentially misleading info, and the possible misconception about Perceptron that can be conveyed simply by including the testing, I don't feel the testing of unseen points in the proposed manner brings enough to the table to be included.

To boil it all down: I feel that 'testing' unseen points to demonstrate how classification works has value, but I think this is too easily confused with testing to verify correctness of the decision boundary. I currently can't think of tests which can eradicate my concerns above. However, there are related ideas we could demonstrate.

For example, we could show that the introduction of a new point to the data will (in all likelihood) result in a different decision boundary. This doesn't involve classification of unseen points though. As an aside: with SVM, we could introduce an element of a demonstration of unseen point classification switching with a before and after approach, but it just doesn't work with Perceptron because of the non-uniqueness of the decision boundary (i.e. we can't be guaranteed a switch in classification).

Do you have any other ideas on ways to demonstration classification of unseen points?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cost vs benefit analysis that is going on in my head is the following:

  1. Testing unseen points to verify correctness is both redundant and carries downsides
  2. Testing unseen points to demonstrate classification is desirable, but unnecessary to satisfy the aim of the exercise.

The exercise is first and foremost meant to have students learn about the Perceptron algorithm. Including a demonstration of how classification works would be a bonus, but I just am not sure this type of testing environment properly supports demonstration of this kind.


Now that we know how to tell which side of the hyperplane an object lies on, we can look at how perceptron updates a hyperplane. If an object is on the correct side of the hyperplane, no update is performed on the weights. However, if we find an object on the wrong side, the update rule for the weights is:

$$[w_0', w_1', w_2'] = [w_0 \pm l_{class}, w_1 \pm x*l_{class}, w_2 \pm y*l_{class}]$$
Copy link
Contributor

@cmcaine cmcaine Feb 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This equation and the next two paragraphs are not clear enough, imo.

The update rule is where the magic happens, so it would be nice to explain this better and try to give the student some insight into how the rule works.

The $\pm\mp$ stuff will also be impenetrable to most readers, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got rid of the plus/minus stuff, since I agree it's confusing/distracting and it really isn't terribly important anyway.

The update rule is where the magic happens, so it would be nice to explain this better and try to give the student some insight into how the rule works.

Were you thinking along the lines of more explanation of how the update makes the line move? Or were you thinking more clarity on how to implement it? I had left the implementation details a little bit vague intentionally (stopping short of pseudocode), hoping to leave room for the student to be creative or do some research for their implementation. However, I'm aware that some students are at a learning level where this is frustrating. Let me know you thoughts :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll see if I have time and energy to attempt the exercise myself during the week and maybe that will help me clarify my ideas :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$$[w_0', w_1', w_2'] = [w_0 \pm l_{class}, w_1 \pm xl_{class}, w_2 \pm yl_{class}]$$

We also don't support rendering equations as far as I can recall.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also don't support rendering equations as far as I can recall.

Fixed.

Better blurb
Delete auto-generated comment as these were manually created
Updates to instructions including clarifications and an added gif from https://commons.wikimedia.org/wiki/File:Perceptron_training_without_bias.gif
exercises/practice/perceptron/.docs/instructions.md Outdated Show resolved Hide resolved
exercises/practice/perceptron/.docs/instructions.md Outdated Show resolved Hide resolved
using Test, Random
include("perceptron.jl")

function runtestset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should include some tests with manually specified input data of just a few points to make this more approachable and also as good practice (e.g. first few tests could be spaces with just 2-4 points placed manually).

Agree with this

depial and others added 2 commits February 16, 2024 12:46
Co-authored-by: Victor Goff <keeperotphones@gmail.com>
@cmcaine
Copy link
Contributor

cmcaine commented Feb 22, 2024

Sorry for radio silence. I've not had the energy to do the exercise yet. Maybe next week. If anyone else fancies giving it a go and posting their comments about it then that's welcome and about the only thing that will speed up the process.

Otherwise, no worries, I'll get to it when I can.

@depial
Copy link
Contributor Author

depial commented Feb 22, 2024

Sorry for radio silence. I've not had the energy to do the exercise yet.

No worries! I'm in no hurry and have been learning a fair bit, which leaves me quite content :)

I've updated a few files:

  • Example solution in example.jl has been exploded (i.e. ala Russian nesting doll), so it is in a more modular form. That way, it wouldn't have to be changed if a classify function gets included in the tests for the exercise.
  • runtests.jl has had the element type and length tests added, and tests.toml has had manually created UUIDs inserted for said tests.
  • runtests.jl has had the manually specified six-point populations changed to four-point populations, so they are easier for the student to visualize. Also the dotest function has been renamed to isvalidboundary, to be more informative in testing.
  • Minor changes to variable name have been made throughout to be more informative.

After a lot of contemplation on the other suggested tests, I've come to the conclusion that the potential collateral effects are too detrimental in this particular learning environment for me to justify adding them. Of course, discussion on this is still welcome.

I'll look at splitting the introduction.jl into introduction.jl and instructions.jl files, but I'm still not sure if they have to be self-contained or can reference one another. I believe referencing is not a problem on the website, since the two seem to show up one after the other, but I'm unsure about how it would look on the CLI since I'm not very familiar with it.

@cmcaine
Copy link
Contributor

cmcaine commented Feb 22, 2024

Glad to hear you're content and those changes sound fine :)

For the unseen points, I think your complaint is that we can pick valid unseen points but we can't ensure that an invalid decision boundary or implementation will fail those tests. Whereas if we test only the support vectors then a decision boundary that passes is by definition valid.

Is that your position?

I agree that that is true. I don't think it's a problem, though. It's extremely common for an incorrect solution to pass some tests and not others. My suggestion is that we test the support vectors (or all the labeled points, whatever) and also some unseen points.

I can't help much regarding the introduction/instructions split. CLI users probably get either two files or the two files are concatenated into a single README.md

@ErikSchierboom
Copy link
Member

The CLI will concatenate them

@depial
Copy link
Contributor Author

depial commented Feb 22, 2024

For the unseen points, I think your complaint is that we can pick valid unseen points but we can't ensure that an invalid decision boundary or implementation will fail those tests. Whereas if we test only the support vectors then a decision boundary that passes is by definition valid.

This is one worry. However, I don't worry that the student will get past all the other tests, but more about what implicit information is being given to the student by using this testing. For example, my biggest concern is that some (inattentive/naive) students, seeing testing of this type, can form the misconception that Perceptron will always classify unseen points correctly.

Don't get me wrong, I like to induce misconceptions when teaching in person, but that's because dispelling them often produces a memorable lesson. However, here, we don't have the luxury of noticing or dispelling misconceptions. That's why I think it's better to mitigate the possibility of their inclusion.

Another possible misconception that could come from the proposed classify function, comes from the fact that we would essentially be testing the students' code against itself, since classification is a vital part of the Perceptron algorithm (e.g. code function A, then use function A in function B, then test function B with function A). This can leave some students thinking it's okay to test their code against itself, and this is the reason that I insisted on fully testing the classify function before using it in the proposed manner.

In short, I think that the other tests are overstepping the bounds of what can be reliably accomplished with this type of exercise on this platform in that they are getting into where we would need to test for understanding.

@depial
Copy link
Contributor Author

depial commented Feb 22, 2024

I've realized that I haven't detailed the logic of the current testing sequence, which is the following progression:

  1. Four points centered on the origin (e.g. optimal decision boundary would be a line through origin with slope = -1)
  2. Same four points, but with labels negated, which emphasizes the importance of the correct normal
  3. Same four points as Test 1, but offset, so a decision boundary cannot pass through the origin
  4. Four points which require a vertical or nearly vertical decision boundary (discourages geometric solutions)
  5. The remaining parametric tests present increasing populations, while incorporating the aspects from the first four tests

Is there something significant the proposed tests add which warrants all the drawbacks or which couldn't be remedied by further testing of the sort above?

Cleaned up obsolete functionality, and tried to include reporting of students' decision boundaries in testset titles, for easier debugging
@cmcaine
Copy link
Contributor

cmcaine commented Feb 23, 2024

my biggest concern is that some (inattentive/naive) students, seeing testing of this type, can form the misconception that Perceptron will always classify unseen points correctly.

Cool, I just don't think this is likely and it's not clear to me why this would be worse than students getting confused about why we only ever test with the same data we provided as input.

I've realized that I haven't detailed the logic of the current testing sequence, which is the following progression:

These sound good and well-thought out :) I would like unseen points to be tested as well, but whatever, let's leave that disagreement until the rest of the exercise is reviewed.

@depial
Copy link
Contributor Author

depial commented Feb 23, 2024

Maybe there is a slight source of confusion in using an output as a proxy to test for the correctness of an algorithm versus using an output as a proxy to test the accuracy of an algorithm? The former is the intended scope of this exercise, while I believe testing unseen points does the latter. A quick analogy to show the difference:

We can test the Alphametics algorithm by inputting a puzzle, then taking the output dictionary and testing it against unseen puzzles which have the same encoding as the original puzzle.

This test is redundant in the same way as the proposed testing with unseen points*.

I believe the two testing paradigms aren't usually conflated on Exercism because most (if not all) the current exercises are not ML related, and testing for accuracy is the paradigm most associated with ML. While Perceptron is used in ML, I don't want to mix the two paradigms here because correctness should be shown before testing accuracy, and trying to do the two together introduces a level of complexity which I don't think is beneficial at this stage. We could always make another exercise which tests Perceptron accuracy, but it would likely involve (or be entirely centered around) hyperparameter tuning, like an actual use case in ML.

To briefly highlight the key differences in the testing paradigms:

  • Correctness: An algorithm is fed many different inputs, resulting in many separate outputs which are then tested against a known standard.
  • Accuracy: An algorithm is fed one input (whole or part of a population) and the output is tested against either the same input or (numerous) examples from the same population as the input.

I'll be happy to respond to your other comments after the rest of the exercise is reviewed as you suggested, but I just wanted leave this for your consideration because I feel it nicely encapsulates the underlying source of many of my misgivings.

*Most of the italicized word have fairly obvious analogues in the Perceptron case, with the possible exception of encoding, whose analogue is support vectors.

@depial
Copy link
Contributor Author

depial commented Mar 12, 2024

I had made a new practice exercise from the library (binary-search-tree) and thought I would be able to open a new PR with it, but it got included in this one. I'm not sure if I've done something wrong or if this is expected. Let me know if there's anything I should do about it, but I'd like to explain the new exercise, since the tests from canonical-data.json seem to be fairly limited in scope (only testing instantiation and sorting), unless there's another way to implement them.

Also, it looks like there are checks that are not successful due to three tests in Circular Buffer failing in the Exercise CI / Julia nightly - OS group of tests. Does this have anything to do with the PR?

@@ -0,0 +1,2 @@
# Create a (Mutable) Struct BinarySearchTree, which has fields: data, left, right
# Also write a sort method, which returns a sorted array of the elements in a BinarySearchTree
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Also write a sort method, which returns a sorted array of the elements in a BinarySearchTree
# Also write a sort method, which returns a sorted array of the elements in a BinarySearchTree.

It seems that this should be sentences in a comment, but surely the EOL is important as well for the last line of this stub.

@depial
Copy link
Contributor Author

depial commented Mar 13, 2024

Sorry, I feel like I've made a mess of things here. Closing

@depial depial closed this Mar 13, 2024
@kotp
Copy link
Member

kotp commented Mar 13, 2024

Sorry, I feel like I've made a mess of things here. Closing

Another option is if you have a commit that you can go back to that you can identify as the point at which you feel things are where they should be, you can reset hard and force push to your own branch. That can bring you back to where you have less of that feeling, while keeping the work that you feel has some potential.

It might be better than opening a new PR, and can keep the communication in one place (those that are following alone, etc) will not have to re-subscribe to the work being done that they may be interested in.

Of course, it is up to you how you would want to manage it.

I appreciate all the work you have done on this, it is not going unnoticed.

@cmcaine
Copy link
Contributor

cmcaine commented Mar 13, 2024 via email

@depial
Copy link
Contributor Author

depial commented Mar 13, 2024

@kotp Thanks for the suggestion and comments! I'll consider how is best to move forward. I just felt the PR was becoming unwieldy because of the new exercise (which may also be able to use a good bit of feedback).

I'll keep the branch, as @cmcaine suggests, so anyone can use whatever I've done in the meantime. I don't know if there is anything more I could add to the Perceptron exercise, so feel free to take it and modify it as you please.

Likewise, the Binary Search Tree exercise is also available if anyone wants it. I feel it is rather 'bare-bones' since I merely followed what was in canonical-data.json for the tests. I may add some "optional tests" (ala Circular Buffer) for checking membership and adding elements, to allow for testing to be brought more in line with the description if desired.

@depial
Copy link
Contributor Author

depial commented Mar 16, 2024

Just an update: I've opened an issue with a list of the exercises I have finished making and a link to the branch in order to make more people aware of the available code. Hopefully this is an acceptable solution.

@depial depial mentioned this pull request Jul 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants