Breaking Syntax Changes, July 2016
The changes to Pyret released in July 2016 have some syntactic differences and additions, some backwards-incompatible. This post just details the syntax changes without diving into any new features.
Pyret encourages writing mostly-functional programs. This means that most function bodies, if/else bodies, cases branches, and so on compute a single value with no side effects. A simple case is when the body is a single expression, like in lam(x): f(g(x)) end.
But such bodies sometimes span several lines, with intermediate declarations of other names:
fun dist(x1, x2, y1, y2):
  fun s(n): n * n end
  dx = x2 - x1
  dy = y2 - y1
  num-sqrt(s(dx) + s(dy))
end
However, if there is work done other than new declarations with a single, final expression, it probably means the function is stateful:
var x = 0
fun get-next-id(name):
  x := x + 1
  name + tostring(x)
end
Or maybe the programmer just made a mistake:
fun move-character(obj, key):
  ask:
    | key == "left" then: { x: obj.x - 1, y: obj.y }
    | key == "right" then: { x: obj.x + 1, y: obj.y }
  end
  ask:
    | key == "up" then: { x: obj.x, y: obj.y + 1 }
    | key == "down" then: { x: obj.x, y: obj.y - 1 }
  end
end
Do you spot the bug? This program never returns an object with a different value for x than in the original, because the result of the first ask is computed and then discarded. We've seen this kind of mistake come up in beginning functional programming, as students internalize the idea of always producing a single new value instead of doing piecemeal updates.
So, we decided to make the program above an error. Every place that allows a block of expressions now permits only a sequence of declarations followed by a single final expression, and violating that constraint is a well-formedness error.
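One way the programmer might repair move-character under this rule is to fold both ask expressions into one, so the body is a single final expression. This is only a sketch: the otherwise branch, which returns the object unchanged for unrecognized keys, is an assumption and is not part of the original example.

```pyret
fun move-character(obj, key):
  # A single ask expression: every branch produces the whole new object
  ask:
    | key == "left" then: { x: obj.x - 1, y: obj.y }
    | key == "right" then: { x: obj.x + 1, y: obj.y }
    | key == "up" then: { x: obj.x, y: obj.y + 1 }
    | key == "down" then: { x: obj.x, y: obj.y - 1 }
    # Assumed fallback: leave the object where it is
    | otherwise: obj
  end
end

check:
  move-character({ x: 0, y: 0 }, "left") is { x: -1, y: 0 }
  move-character({ x: 0, y: 0 }, "up") is { x: 0, y: 1 }
end
```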
However, we still want to be able to write programs that use multiple expressions in a block; Pyret is mostly functional, but still supports mutable operations and programs that have direct effects like printing.
Of course, a programmer can still write the above programs using the block construct: for instance,
var x = 0
fun get-next-id(name):
  block:
    x := x + 1
    name + tostring(x)
  end
end
However, this both (a) wastes precious vertical space, and (b) forces the program to be indented further to the right. It also makes it difficult to quickly convert a single expression into a block and back, which programmers sometimes do if, for instance, they want to quickly add a profiling counter.
So, it's useful to have a way to explicitly indicate that we want a block of multiple expressions, rather than just a list of bindings and a single expression. We do this by adding block: after the header for an expression that has a blocky body:
var x = 0
fun get-next-id(name) block: # block added here
  x := x + 1
  name + tostring(x)
end
The documentation for the shorthand is here.
While tuple syntax:
{ expr; ... expr; }
isn't formally ambiguous with the presence of semicolons as aliases for ends, or with object literals, it's quite close, and has some jarring interactions. For example, consider a tuple of anonymous functions that end in semicolons:
{ lam(x): x + 1;; lam(y): y + 2;; }
This gets into some odd-looking situations quickly, and depending on future uses of optional ending semicolons in tuples and other block-shaped expressions in tuples, this is an unnecessary clash. So the first breaking syntactic change is to simply disallow ; as a synonym for end, and only allow it as a tuple element separator.
The fix for existing programs is simple: just change any existing semicolons in the program to be end instead, and then rejoice in your ability to program with tuples.
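As a small illustration of that fix, here is a sketch with a hypothetical function square (not from the original post). The old spelling is shown only in a comment, since it is no longer accepted.

```pyret
# Before (rejected after this change): `;` closing the function
# fun square(n): n * n;

# After: `end` closes the function
fun square(n): n * n end

# `;` now only separates tuple elements
pair = { square(2); square(3) }

check:
  pair.{0} is 4
  pair.{1} is 9
end
```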
One reason for having ; was to make writing anonymous functions like lam(x, y): (x * x) + (y * y) end less laborious. Pyret has long lacked a good, concise anonymous function syntax anyway (; was always a symptom rather than a solution), so this prompted the addition of a new shorter syntax for anonymous functions (though lam still works just as it did before):
{(x, y): (x * x) + (y * y)}
This new form saves 5 characters (instead of lam end, with a typically-required space before the end, there are two braces). It also avoids a common case where end would be "floating" in the middle of a line.
The new form still has all the same ability to specify annotations, with precisely the same syntax as fun and lam use, so you can also write:
{(x :: Number, y :: Number) -> Number: (x * x) + (y * y)}
to fully specify the function's argument and return types.
Similarly, for parametric polymorphism, you can transform lam<a>(x :: List<a>): x.length() end to
{<a>(x :: List<a>): x.length()}
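The short form is most useful when the function is passed straight to another function, which is where a trailing end would otherwise float mid-line. A minimal sketch, assuming the built-in list functions map and filter; the names squares and big are hypothetical:

```pyret
# Short anonymous functions passed directly as arguments
squares = map({(n): n * n}, [list: 1, 2, 3])

# Annotations work the same way as with lam
big = filter({(n :: Number) -> Boolean: n > 2}, [list: 1, 2, 3, 4])

check:
  squares is [list: 1, 4, 9]
  big is [list: 3, 4]
end
```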
Since the beginning, you could write an object with method fields in Pyret by writing the name of the method before an open parenthesis, as in dist-from-0 below:
{
  x: 10,
  y: 20,
  dist-from-0(self):
    num-sqrt((self.x * self.x) + (self.y * self.y))
  end
}
With the addition of tuples, these method clauses come much closer to being ambiguous. Consider the following two expressions, each of which, at first glance, might be a singleton tuple or might be an object:
# an object
{
  dist-from-0(self, x, y):
    num-sqrt((x * x) + (y * y))
  end
}
# a tuple with a single element
{
  dist-from-0-callback(x, y, lam(dist):
      dist * dist
    end)
}
When we get to expressions like this, while it's reasonable to write a general parser that can sort out the ambiguities (especially since we distinguish , for objects and ; for tuples), it becomes surprisingly difficult to write editor modes that can do reasonable indenting and begin/end matching. This is because editor modes generally have much less than full parsing support available.
Because of both the difficulty of working with editor tools, and the potential for visual similarity problems, a method keyword is now required in front of method fields:
{
  x: 10,
  y: 20,
  method dist-from-0(self):
    num-sqrt((self.x * self.x) + (self.y * self.y))
  end
}
This makes it easier to identify object fields: they are either a name: expr field, or a method name(...): ... end field. The first few characters always make it clear what's going on, without needing to read all the way to the end of a method header to determine if it's a function application or a method field.
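To see the two field shapes side by side in use, here is a short sketch. The binding name p and the values 3 and 4 are hypothetical, chosen so the distance comes out to a whole number.

```pyret
p = {
  # a name: expr field
  x: 3,
  y: 4,
  # a method field, introduced by the method keyword
  method dist-from-0(self):
    num-sqrt((self.x * self.x) + (self.y * self.y))
  end
}

check:
  p.x is 3
  p.dist-from-0() is 5
end
```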