thoughts #1

Open
opened 2024-04-18 20:50:51 -05:00 by zephyr · 8 comments
Owner

design goals

  • programming language is the dx: includes tools, tests, language server, api versioning, ...
  • exclusively structured concurrency
  • expressive types; sum types and product types are similar syntactic cost to express
    • parametric polymorphism
    • parametricity (up to unrestricted recursion)?
  • no reflection, and hopefully no rtti at all
  • system abi, platform interop matters
  • featureful stdlib
  • given bocchi depends on ryo, a change to only bocchi never requires a recompile of ryo
  • dependencies: namespaced, resolved quickly, cryptographically signed sbom, package identity not tied to location

some stuff

// .Maybe demonstrates a sum type.
type .Maybe(T: type) =
    / .None: **
    / .Some: T

// .Null demonstrates a product type.
type .Null(T: type) =
    * .Value: T
    * .Exists: bool

// .Pair demonstrates a product type with "numeric" column names.
// It works because the . character always introduces an identifier.
type .Pair(Fst: type, Snd: type) =
    * .0: Fst
    * .1: Snd

// bool demonstrates a sum type with distinct variants of the same type.
type bool = / false: ** / true: **

** spells the unit type, evocative of an empty product

# design goals - programming language is the dx: includes tools, tests, language server, api versioning, ... - exclusively structured concurrency - expressive types; sum types and product types are similar syntactic cost to express + parametric polymorphism + parametricity (up to unrestricted recursion)? - no reflection, and hopefully no rtti at all - system abi, platform interop matters - featureful stdlib - given bocchi depends on ryo, a change to only bocchi never requires a recompile of ryo - dependencies: namespaced, resolved quickly, cryptographically signed sbom, package identity *not* tied to location # some stuff ``` // .Maybe demonstrates a sum type. type .Maybe(T: type) = / .None: ** / .Some: T // .Null demonstrates a product type. type .Null(T: type) = * .Value: T * .Exists: bool // .Pair demonstrates a product type with "numeric" column names. // It works because the . character always introduces an identifier. type .Pair(Fst: type, Snd: type) = * .0: Fst * .1: Snd // bool demonstrates a sum type with distinct variants of the same type. type bool = / false: ** / true: ** ``` `**` spells the unit type, evocative of an empty product
Author
Owner

re: subtype polymorphism without rtti, seg suggested looking at scala's and kotlin's sealed hierarchies. it seems like having a rule like "a property shared by every variant of a sum type is promoted to the sum type itself via dynamic dispatch" accomplishes effectively the same thing while still avoiding rtti, since we just need to know the variant in use to dispatch.

also of note that this rule is kind of like backward nominal interfaces: instead of types in disparate places declaring themselves subtypes of bocchi, we have bocchi declaring itself the supertype of types in disparate places. that seems fine for things like mocking or strategy pattern, but need to think about other places where subtype polymorphism is useful in other languages. though, it's probably fine to simply have closures and let people make their own vtables, like zig does.

vaguely interesting to think about the dual of this rule, which would read like "a property on exactly one field of a product type is promoted to the product type itself." not sure that's a good idea to actually do, though.

do note that regardless of whether we have subtype polymorphism, we do need a mechanism to express "any type that maps these messages onto these types" for bounded parametric polymorphism. rust uses its bounds language (i.e. traits) as the language for subtypes (i.e. dyn), and go roughly does as well (if general interfaces were types).

re: subtype polymorphism without rtti, seg suggested looking at scala's and kotlin's sealed hierarchies. it seems like having a rule like "a property shared by every variant of a sum type is promoted to the sum type itself via dynamic dispatch" accomplishes effectively the same thing while still avoiding rtti, since we just need to know the variant in use to dispatch. also of note that this rule is kind of like backward nominal interfaces: instead of types in disparate places declaring themselves subtypes of bocchi, we have bocchi declaring itself the supertype of types in disparate places. that seems fine for things like mocking or strategy pattern, but need to think about other places where subtype polymorphism is useful in other languages. though, it's probably fine to simply have closures and let people make their own vtables, like zig does. vaguely interesting to think about the dual of this rule, which would read like "a property on exactly one field of a product type is promoted to the product type itself." not sure that's a good idea to actually do, though. do note that regardless of whether we have subtype polymorphism, we do need a mechanism to express "any type that maps these messages onto these types" for bounded parametric polymorphism. rust uses its bounds language (i.e. traits) as the language for subtypes (i.e. dyn), and go roughly does as well (if general interfaces were types).
Author
Owner

Maybe is built into the language. unary ? introduces an optional type. the variants should have special spellings, but not sure what. >< for Nothing and <bocchi> for Just bocchi would be cute, but probably a bit too surprising.

all properties on bocchi bind to ?bocchi in the normal way for the Maybe monad: if bocchi responds to ryou to produce a nijika, then ?bocchi responds to ryou to produce a ?nijika. both would be written like b ryou. it's like automatic dereferencing, but monad.

an idea: ?bocchi also has the superpower of responding to a block by executing it iff it is not Nothing. then we can make bool be implemented as ?**, and then the usual if statement arises naturally. though, how does else work? easy to just say we take two functions instead of one and execute the second when Nothing.

this idea implies we need a special definition of equality at least for ?**, s.t. Nothing == Nothing = true and Just ** == Just ** = true. (otherwise we would automatically derive a version of == for bool that leads to false == false = false and true == true = Just true.) and the fact that we need it for ?bool suggests that we should have it for ?bocchi given any bocchi. and it seems weird to stop at ==. so basically we reach the want for type classes.

Maybe is built into the language. unary `?` introduces an optional type. the variants should have special spellings, but not sure what. `><` for Nothing and `<bocchi>` for Just bocchi would be cute, but probably a bit too surprising. all properties on `bocchi` bind to `?bocchi` in the normal way for the Maybe monad: if `bocchi` responds to `ryou` to produce a `nijika`, then `?bocchi` responds to `ryou` to produce a `?nijika`. both would be written like `b ryou`. it's like automatic dereferencing, but monad. an idea: `?bocchi` also has the superpower of responding to a block by executing it iff it is not Nothing. then we can make bool be implemented as `?**`, and then the usual if statement arises naturally. though, how does else work? easy to just say we take two functions instead of one and execute the second when Nothing. this idea implies we need a special definition of equality at least for `?**`, s.t. `Nothing == Nothing = true` and `Just ** == Just ** = true`. (otherwise we would automatically derive a version of == for bool that leads to `false == false = false` and `true == true = Just true`.) and the fact that we need it for `?bool` suggests that we should have it for `?bocchi` given any `bocchi`. and it seems weird to stop at ==. so basically we reach the want for type classes.
Author
Owner

thinking about statically typed dynamic scope. if this works, it provides a powerful kind of dependency injection with minimal syntax, so i think it might be a good idea after all.

the idea requires that a function's type indicate the transitive closure of elements of dynamic scope it will use. if bocchi depends on ryou and ryou is modified to change how it uses dynamic scope, then ryou's type changes, therefore bocchi's type changes. however, bocchi's function body doesn't change, even in compiled form. so it seems like this doesn't conflict with expression problem, even though the type itself changes? strange.

worth noting that if bocchi is a function that takes a closure ryou, then bocchi needs to add ryou's dynamic scope to its own.

a bit thinky about how this interacts with structured concurrency, which in general has a type. if a function takes a lifetime and adds a coroutine to it, that lifetime needs to be annotated with the dynamic scope fields that coroutine uses. (no, it is the function taking the lifetime that observes the scope changes. the lifetime itself doesn't care.)

this seems also naturally related to row polymorphism. https://arxiv.org/pdf/1406.2061.pdf might help with understanding how to infer the dynamic scope type, especially because it allows rows with repeated types and that seems like something we want.

thinking about statically typed dynamic scope. if this works, it provides a powerful kind of dependency injection with minimal syntax, so i think it might be a good idea after all. the idea requires that a function's type indicate the transitive closure of elements of dynamic scope it will use. if bocchi depends on ryou and ryou is modified to change how it uses dynamic scope, then ryou's type changes, therefore bocchi's type changes. however, bocchi's function body doesn't change, even in compiled form. so it seems like this doesn't conflict with expression problem, even though the type itself changes? strange. worth noting that if bocchi is a function that takes a closure ryou, then bocchi needs to add ryou's dynamic scope to its own. a bit thinky about how this interacts with structured concurrency, which in general has a type. ~~if a function takes a lifetime and adds a coroutine to it, that lifetime needs to be annotated with the dynamic scope fields that coroutine uses.~~ (no, it is the function taking the lifetime that observes the scope changes. the lifetime itself doesn't care.) this seems also naturally related to row polymorphism. https://arxiv.org/pdf/1406.2061.pdf might help with understanding how to infer the dynamic scope type, especially because it allows rows with repeated types and that seems like something we want.
Author
Owner

given that we have very lightweight product type syntax, we can investigate dropping currying and instead preferring product types for function arguments. we can come up with some additional interesting properties from this. but first we need syntax to construct values of product types.

since we unify structs and tuples into products, we'd like to avoid having different syntax for the two. that said, for reasons relating to function application, we'd like to be able to avoid writing the field name for every field. (bocchi, ryou) is probably fine for that, especially if we make (T,) automatically promote to T (which ironically does hearken back to the "promote properties on exactly one field onto a product type" rule mentioned before). then, (.0 = bocchi, .1 = ryou) makes sense for the struct-y syntax.

if we take these spellings for product types, then we can easily write application as f x in the usual functional programming way, and f(bocchi, ryou) looks like the usual procedural programming way as well. we also get named arguments for free: f(bocchi, ryou) is identical to f(.1 = ryou, .0 = bocchi) assuming .0 and .1 as field names, or if f : (* .x: U * .y: V) -> W instead then we can write f(.y = ryou, .x = bocchi).

i think it is preferable to disallow the tuple-like syntax on products with any invisible (i.e. unexported and outside a scope that can see them) fields, rather than allowing unexported names to be filled in where they can be inferred. this implies that functions often have to use exported names for their arguments. i think that's fine.

an option we can consider for a kind of "partial application" is to allow appending fields to products. e.g., if x : (* .0: U) and y : V, we can write something along the lines of let z = x (.1 = y) to create z : (* .0: U * .1: V). that lets us pass around a tuple representing a function's arguments, even with unexported fields, and invoke the function once the arguments are completed. we could hypothetically even leave closures to a pair of a function and the arguments it's been passed, although not sure that's actually a good approach. observe that, syntactically, this idea is application of a product to a product. although we probably do need an operator, so that we can distinguish between this kind of intersection with a variable and field selection.

i think if we do allow appending product types, then only exported names should be able to be appended to products this way. otherwise i foresee having to regularly explain that "the unexported names you append to your arguments in module A are different from the unexported names in the arguments of f in module B even though they are spelled the same in the source code for each." seems confusing, even if it does arise naturally from other rules.

given that we have very lightweight product type syntax, we can investigate dropping currying and instead preferring product types for function arguments. we can come up with some additional interesting properties from this. but first we need syntax to construct values of product types. since we unify structs and tuples into products, we'd like to avoid having different syntax for the two. that said, for reasons relating to function application, we'd like to be able to avoid writing the field name for every field. `(bocchi, ryou)` is probably fine for that, especially if we make (T,) automatically promote to T (which ironically does hearken back to the "promote properties on exactly one field onto a product type" rule mentioned before). then, `(.0 = bocchi, .1 = ryou)` makes sense for the struct-y syntax. if we take these spellings for product types, then we can easily write application as `f x` in the usual functional programming way, and `f(bocchi, ryou)` looks like the usual procedural programming way as well. we also get named arguments for free: `f(bocchi, ryou)` is identical to `f(.1 = ryou, .0 = bocchi)` assuming .0 and .1 as field names, or if `f : (* .x: U * .y: V) -> W` instead then we can write `f(.y = ryou, .x = bocchi)`. i think it is preferable to disallow the tuple-like syntax on products with any invisible (i.e. unexported and outside a scope that can see them) fields, rather than allowing unexported names to be filled in where they can be inferred. this implies that functions often have to use exported names for their arguments. i think that's fine. an option we can consider for a kind of "partial application" is to allow appending fields to products. e.g., if `x : (* .0: U)` and `y : V`, we can write something along the lines of `let z = x (.1 = y)` to create `z : (* .0: U * .1: V)`. that lets us pass around a tuple representing a function's arguments, even with unexported fields, and invoke the function once the arguments are completed. we could hypothetically even leave closures to a pair of a function and the arguments it's been passed, although not sure that's actually a good approach. observe that, syntactically, this idea is application of a product to a product. although we probably do need an operator, so that we can distinguish between this kind of intersection with a variable and field selection. i think if we do allow appending product types, then only exported names should be able to be appended to products this way. otherwise i foresee having to regularly explain that "the unexported names you append to your arguments in module A are different from the unexported names in the arguments of f in module B even though they are spelled the same in the source code for each." seems confusing, even if it does arise naturally from other rules.
Author
Owner

torn between directory is a module as in go, which i really like, and refer only to things earlier in the same file as in f#, which is tremendously easier to type check, easier to read since you can see any reference, but would imply importing other files in the same directory. i very strongly dislike the magic entry points and general system that rust uses, and even the rust programmers i have talked to agree that it's bad even under the new design for it.

we could try the f# approach with a rule like "modules in the same directory are implicitly imported" so that you still have to qualify their names but we keep the lexical precedence property. we'd also want "modules in the same directory can access each others' unexported identifiers." e.g., if bocchi.lig and ryou.lig are in the same directory, and ryou.lig defines some unexported identifier nijika, then bocchi.lig can write ryou nijika to access it without requiring an explicit import of ryou. this seems verbose, but maybe it's ok.

mutually recursive references are impossible in general with f#'s approach. there are a couple approaches to resolve this:

  • we can simply drop the type checking advantages and implement mutually recursive references the same way go does, by type checking in multiple passes.
  • we can not worry about it and require the programmer to Add a Type Parameter to the earlier referent, to abstract the recursion.

all that said, i think the go approach might be the winner for me. it's just way easier to get started writing go code.

torn between directory is a module as in go, which i really like, and refer only to things earlier in the same file as in f#, which is tremendously easier to type check, easier to read since you can see any reference, but would imply importing other files in the same directory. i very strongly dislike the magic entry points and general system that rust uses, and even the rust programmers i have talked to agree that it's bad even under the new design for it. we could try the f# approach with a rule like "modules in the same directory are implicitly imported" so that you still have to qualify their names but we keep the lexical precedence property. we'd also want "modules in the same directory can access each others' unexported identifiers." e.g., if bocchi.lig and ryou.lig are in the same directory, and ryou.lig defines some unexported identifier `nijika`, then bocchi.lig can write `ryou nijika` to access it without requiring an explicit import of ryou. this seems verbose, but maybe it's ok. mutually recursive references are impossible in general with f#'s approach. there are a couple approaches to resolve this: - we can simply drop the type checking advantages and implement mutually recursive references the same way go does, by type checking in multiple passes. - we can not worry about it and require the programmer to Add a Type Parameter to the earlier referent, to abstract the recursion. all that said, i think the go approach might be the winner for me. it's just way easier to get started writing go code.
Author
Owner

i've been trying to imagine structured concurrency lifetimes as applicatives, but i think that is the wrong view. it is certainly the case that we might want to operate on an applicative concurrently, but we also might want to arbitrarily change structure – e.g. we can imagine a coroutine spawning two more coroutines in the same lifetime on some inputs, and that's definitely something we want to support. so, what we're looking at isn't even a functor.

that said, one of the very important things about structured concurrency is that a lifetime has a type: every concurrency computes a single value. thinking about it that way, maybe what we want is actually for a structured concurrency lifetime to operate over a monoid. constructing a list of all results is obviously something we want for some use cases. discarding results and eventually returning ** or maybe an error type makes sense for something like an http server with respect to its request handlers.

so what are some things we actually want to do with structured concurrency?

  • operations along the lines of js's Promise.all, Promise.any, Promise.race, and maybe Promise.allSettled. in js these operate on arrays. if we have type classes (which i'm not sure is a definite add; gleam forgoes them due to being confusing on their own and producing opaque error messages), we could hypothetically make them work on any applicative.
  • a coroutine should be able to take a lifetime as argument and add arbitrary new coroutines to it, with no structural relation between them. all the coroutines in a concurrency need to share a result type.
  • hard cancellation. it should be possible to force all coroutines in a concurrency to exit immediately (though with cleanup on the way out). the exit function needs to provide a result, or maybe a function computing a result, for the coroutines it's killing, since they need to return a value.

i have more thoughts about how to approach the applicative operations and ...coinduction? uniformly, and considerations regarding syntax, but now i am seepy.

i've been trying to imagine structured concurrency lifetimes as applicatives, but i think that is the wrong view. it is certainly the case that we might want to operate on an applicative concurrently, but we also might want to arbitrarily change structure – e.g. we can imagine a coroutine spawning two more coroutines in the same lifetime on some inputs, and that's definitely something we want to support. so, what we're looking at isn't even a functor. that said, one of the very important things about structured concurrency is that a lifetime has a type: every concurrency computes a single value. thinking about it that way, maybe what we want is actually for a structured concurrency lifetime to operate over a monoid. constructing a list of all results is obviously something we want for some use cases. discarding results and eventually returning `**` or maybe an error type makes sense for something like an http server with respect to its request handlers. so what are some things we actually want to do with structured concurrency? - operations along the lines of js's Promise.all, Promise.any, Promise.race, and maybe Promise.allSettled. in js these operate on arrays. if we have type classes (which i'm not sure is a definite add; gleam forgoes them due to being confusing on their own and producing opaque error messages), we could hypothetically make them work on any applicative. - a coroutine should be able to take a lifetime as argument and add arbitrary new coroutines to it, with no structural relation between them. all the coroutines in a concurrency need to share a result type. - hard cancellation. it should be possible to force all coroutines in a concurrency to exit immediately (though with cleanup on the way out). the exit function needs to provide a result, or maybe a function computing a result, for the coroutines it's killing, since they need to return a value. i have more thoughts about how to approach the applicative operations and ...coinduction? uniformly, and considerations regarding syntax, but now i am seepy.
Author
Owner

been focusing on writing thoughts on wobleg for a bit now but wanted to write down this idea for further refinement another day:

if function args are just product types, then we can write methods by giving the name of a defined type in place of the arguments. something like:

// plain function
func bocchi(* .x: int * .y: int) int { ... }
// type and method
type ryou = * .x: int * .y: int
func bocchi ryou int { ... }

perhaps this keyword refers to the argument to a method? w/b additional parameters to a method?

thinking about this also gave me some thinkies about nominal vs structural typing. maybe we can say something like product and sum types are structurally typed, and the name of the type is just a set of operations on that structure. again, requires additional braining to refine.


unrelated, thinking more about the whole mutability thing. one idea might be to have mutable variables but not mutable fields. if you have a product type, you need to create a whole new value of that type to replace a field. does that actually solve any problems?

if we do that, native support for higher-kinded data (as opposed to higher-kinded types) would allow succinctly expressing "create a new value of this type but every field is a mutable memory place," in addition to operations like Partial<T>. would it also give us a general mechanism for aos/soa representation choice?

been focusing on writing thoughts on wobleg for a bit now but wanted to write down this idea for further refinement another day: if function args are just product types, then we can write methods by giving the name of a defined type in place of the arguments. something like: ``` // plain function func bocchi(* .x: int * .y: int) int { ... } // type and method type ryou = * .x: int * .y: int func bocchi ryou int { ... } ``` perhaps `this` keyword refers to the argument to a method? w/b additional parameters to a method? thinking about this also gave me some thinkies about nominal vs structural typing. maybe we can say something like product and sum types are structurally typed, and the name of the type is just a set of operations on that structure. again, requires additional braining to refine. ---- unrelated, thinking more about the whole mutability thing. one idea might be to have mutable *variables* but not mutable *fields*. if you have a product type, you need to create a whole new value of that type to replace a field. does that actually solve any problems? if we do that, native support for higher-kinded data (as opposed to higher-kinded types) would allow succinctly expressing "create a new value of this type but every field is a mutable memory place," in addition to operations like `Partial<T>`. would it also give us a general mechanism for aos/soa representation choice?
Author
Owner
Module ::= TypeDecl Module | FuncDecl Module | LetDecl Module | eof .

TypeDecl ::= 'type' Ident Type .
FuncDecl ::= 'fn' Ident Type Type BlockExpr .
LetDecl  ::= 'let' Assignment .

Ident      ::= unexported | exported .
TypedIdent ::= Ident Type .

Type ::=
      UnitType
    | ProductType
    | SumType
    | OptionalType
    | NamedType
    | '(' Type ')'
    .
UnitType     ::= '**' .
ProductType  ::= '*' TypedIdent ProductType | '*' TypedIdent .
SumType      ::= '/' TypedIdent SumType | '/' TypedIdent .
OptionalType ::= '?' Type .
NamedType    ::= Ident .

Assignment ::= Ident '=' Expr | Ident Type '=' Expr .

Expr ::=
      UnitExpr
    | NothingExpr
    | JustExpr
    | ProductExpr
    | IntersectExpr
    | ApplyExpr
    | NameExpr
    | BlockExpr
    | '(' Expr ')'
    .
UnitExpr          ::= '(' ')' .
NothingExpr       ::= '><' .
JustExpr          ::= '<' Expr '>' .
ProductExpr       ::= '(' ProductExprFields ')' .
ProductExprFields ::= Assignment ',' ProductExprFields | Assignment ',' | Assignment .
IntersectExpr     ::= Expr '&' Expr .
ApplyExpr         ::= Expr Expr .
NameExpr          ::= Ident .
BlockExpr         ::= '{' StmtList '}' .

StmtList ::= Stmt Stmt | Stmt .
Stmt     ::= Expr ';' .
``` Module ::= TypeDecl Module | FuncDecl Module | LetDecl Module | eof . TypeDecl ::= 'type' Ident Type . FuncDecl ::= 'fn' Ident Type Type BlockExpr . LetDecl ::= 'let' Assignment . Ident ::= unexported | exported . TypedIdent ::= Ident Type . Type ::= UnitType | ProductType | SumType | OptionalType | NamedType | '(' Type ')' . UnitType ::= '**' . ProductType ::= '*' TypedIdent ProductType | '*' TypedIdent . SumType ::= '/' TypedIdent SumType | '/' TypedIdent . OptionalType ::= '?' Type . NamedType ::= Ident . Assignment ::= Ident '=' Expr | Ident Type '=' Expr . Expr ::= UnitExpr | NothingExpr | JustExpr | ProductExpr | IntersectExpr | ApplyExpr | NameExpr | BlockExpr | '(' Expr ')' . UnitExpr ::= '(' ')' . NothingExpr ::= '><' . JustExpr ::= '<' Expr '>' . ProductExpr ::= '(' ProductExprFields ')' . ProductExprFields ::= Assignment ',' ProductExprFields | Assignment ',' | Assignment . IntersectExpr ::= Expr '&' Expr . ApplyExpr ::= Expr Expr . NameExpr ::= Ident . BlockExpr ::= '{' StmtList '}' . StmtList ::= Stmt Stmt | Stmt . Stmt ::= Expr ';' . ```
Sign in to join this conversation.
No Label
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: ligma/lang#1
No description provided.