Features of a dream programming language: 3rd draft

Have you ever dreamt of what features your ideal programming language would have? Have you ever tried to make a list of the best features (or non-features) of existing languages that are particularly inspired in some way...?

This article explores and lists features of an ideal programming language, focusing on readability, reasonability, and fast onboarding. It emphasizes the importance of the alignment between execution order and reading order, and the ability to trace the flow of execution by simply reading the code chronologically. The desire is to utilize Functional Programming (FP), while alleviating the weirdness of FP by taking inspiration from OOP in areas such as ease of domain modeling, and from Alan Kay's vision of scalable computing. The article also talks about the importance of some esoteric features, such as evolvability, content-addressable code, and transparent upgrades. It begins by outlining some general guiding principles, such as that all programming languages are built to overcome human limitations, not machine limitations.


Last update: 16 June 2024.

Summary

I long for a very constrained language for web-app + systems development, prioritizing readability (left-to-right and top-down) and reason-ability above all, which is designed for fast onboarding of complete beginners (as opposed to catering to a specific language community who already have the curse of expertise).

FAMILIAR features it should have (the most important ones):

  • Functional Programming, but based on data-first piping.

  • Immutability, but w/ opportunistic in-place mutation.

  • Gradually typed: dynamic for development, static for production (w/ fully sound type inference).

  • Concurrency via goroutines and (async) unbounded buffered channels.

  • Ecosystem: interoperable with existing languages.

  • Transpiles to JS and/or compiles to WASM.

  • Adaptive runtime.

ESOTERIC features it should have (the most important ones):

  • Crucial evolvability / backward- and forward-compatibility.

    • Content-addressable code.

    • Transparent upgrades without any breaking changes.

  • Data First Functional Programming w/ SVO style syntax: subject.verb(object)

  • Interpreted, for development. But compiled, incrementally, for production.

  • Interactive: facilitates an IDE-plugin (VS Code) that shows the contents of data structures while coding. Enable REPL'ing into a live system.

  • Aggressively parallelizable and concurrent. Inspired by Verilog and Golang.

  • Scales transparently from single CPU to multi-core to distributed services, without the language necessitating a refactoring of the code.

OTHER features it should have (a short-list of some familiar ones):

  • Eager evaluation (strict, call-by-value), Strong & Static Typing with Fully Sound Type Inference, Generics, Algebraic Data Types (ADT), No Null, No Exceptions by default, No undefined behavior, No Garbage Collector (GC), Async via a blocking/sync interface but non-blocking I/O, Reliable Package Management, Tree Shakeable, Serializable, Helpful Error Messages, Good Interoperability, and First-Class Functions and Higher-Order Functions (HOF), of course (since FP).

DEFAULT features it would be nice-to-have and align the language towards (instead of as special cases):

  • Async, Event-driven, Streamable, Parallelizable, all by default. Which would effectively take these concerns out of the equation, freeing the application programmer from thinking about implementation details (like how the computer should work efficiently, which is the language/framework designer's job), so they can focus on restructuring or rephrasing the problem into sub-problems, and making those explicit, so the computer can efficiently parallelize computation at the right points. (Read more under the section on 'Performance'.)

NON-features, i.e. features the language should NOT have:

  • Multi-paradigm, meta-programming, DSL's. (These decisions might be controversial, so search the article for these terms to read the reasoning why. I'm always open to changing my mind.)

The complete set of desirable (or particularly undesirable) features is detailed below (including some limited rationale for each).


Prelude

The original article received some encouraging feedback from Ruby's creator Matz, and positive feedback and excellent critique from Roc's creator Richard Feldman. Since I improved the original article / 1st draft quite a bit, I saved it for posterity, to respect the contents of the original URL (where it may have been shared), and to respect the ensuing original Hacker News discussion based on it. The 2nd draft greatly expanded the article, based on the feedback and further research.

This 3rd draft now synthesizes the various features into overarching sections, based on overall language goals (thanks to brucou's feedback on the 2nd draft).

The article is a long read, so I suggest first skimming it and dipping into the sections where you find interest or disagreement. Then, if your curiosity is piqued by the ideas, I hope you will take time to read it carefully over a few evenings, mull over the ideas presented (a very select few may be novel), consider your own dream language features, and contribute those, as well as your own insights and experience, in the comments below.

Even if you disagree with my wishes, you could treat the list as an overview of many of the language features you ought to consider when designing a programming language, since this list is effectively compiled from a few such feature lists I've come across.

The article is one part meta-philosophy of programming languages, and one part wishlist based on a study of the good ideas out there. I won't specify a concrete language directly. “If I had an hour to solve a problem I'd spend 55 minutes thinking about the problem and five minutes thinking about solutions.” -- commonly attributed to Einstein. But I will attempt to give you some ideas on what features I think would be ideal, and why. Maybe some of the many linked resources I've made sure to include will be inspirational in the design of your own language.


Guiding principles

  • All programming languages are first and foremost, in actuality, built to overcome human limitations. (Not primarily for overcoming machine limitations. Even though we have historically treated programming languages as tools to tinker with machine/hardware.) If they weren't, we might as well be using a lower-level language like Assembly, or typing 0's and 1's. The machine would be just as happy (or happier) with that.

    • Software architecture in general (and frameworks in particular) is a way to organize the mind of the developer(s), categorising the conceptual world into what's closely or merely remotely related (giving rise to principles like 'cohesion over coupling' etc.). (This might explain OOP's popularity. Articles like those on syntonicity seem to confirm this suspicion.) The machine would be perfectly content with executing even spaghetti code. “Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” -- Martin Fowler.

    • A programming language has certain affordances, allowing you to talk specifically about/with some concepts (typically the first-class citizens of the language), and avoid having to talk about other things (e.g. memory management, language runtime concerns). This does not only apply to Domain Specific Languages (DSL's).

    • The Sapir-Whorf hypothesis (linguistic relativity) is especially applicable to programming: "each programming language has a tendency to create a certain mind set in its programmers. ... you tend to have a mental model of how to do things based on that language. ... Such a mind set may make it difficult to conceive of solutions outside of the model defined by the language." - Dennis J. Frailey

    • Programming needs to get away from the notion that the programmer is giving instructions to the machine. Instead, programming ought to be thought of in terms of modelling a set of things that interact with one another (causal relations). The programmer ("software engineer", really) should not need to be an inherent part of that model. The software engineer should develop software by merely supplying the machine with a description of such a model. This was foreseen by the great computer scientist Edsger Dijkstra: "Progress is possible only if we train ourselves to think about programs without thinking of them as pieces of executable code." He was thinking in terms of mathematical modeling, and hated OOP, but OOP became a success for a reason (human intuitiveness / mental model), and I believe it is possible to marry the two notions (mathematical modeling, through compositional FP, and modeling causal-relations of domain entities, in a way similar, but not equal, to OOP).

  • Only by accounting for human limitations (like cognitive capacity, and familiarity), could one derive a specification for the ideal programming language.

  • A bug is an error in thinking. Either by the developer, or the language-designer for not sufficiently accounting for human psychology (Sapir-Whorf: the language you write/speak determines what you can/do think). Even Dijkstra himself equates programming with thinking. Programming does not simply require thinking, but structured thinking. Languages guide thought, and what matters is not only what programming languages let you do, but what they shepherd you to do. That's why "You can write good code in this language, you just have to be disciplined" isn't a good argument, even if it is commonly employed. "If we were seeking power we would have stopped at GOTO's. ... The point is to reduce our expressivity in a principled way ... [to] something which is still powerful enough for our daily uses." -- Cheng Lou of ReScript.

    • To reduce bugs, a language should ensure simple, safe, and scalable ways of thinking. For instance:

      • Type systems are a way to use the compiler to help us verify our beliefs about our own code: they help us think consistently.

      • Closures enable the programmer to specify and share behaviors that are already half-way thought through (i.e. already set up with some external data/state). (Closures, partial application, and composition are sketched in the example after this list.)

      • Transducers allow the programmer to define and compose behaviors/processes without having to think explicitly about the particular thing which is behaving. (It's currently only possible for a subset of behaviors, but see the Qi flow-oriented DSL for Racket for a potential generalization.)

      • Partial application allows the programmer to take one grand behavior and break it down into smaller behaviors that can be reused independently (i.e. specialization) or in sequence.

      • Composition allows the programmer to think & build piece by piece, instead of all at once, and without the context influencing too much. It also allows the programmer to reuse materialized thoughts.

      • "Open for extension, closed for modification": A programmer can recognize something useful, and add more pieces to it, without having to change the original thing (e.g. extension methods in C#), and without tying those new pieces too closely with the original thing (e.g. subclassing) which would limit their reuse.

  • A language determines WHAT you can & have to think about, but also HOW you have to think about it.

  • Prioritize scalability in every facet of the language: The '0, 1, or infinity' principle: "You will either need zero of a thing, one of a thing, or an arbitrary number of the thing." So "Arbitrary fixed limits are a CodeSmell". This principle could be applied to everything from the number of parameters, the length of code blocks, and basic language constructs (if-else-if...........else; "else what, now again?"), to the sizes of collections (maybe: array lengths could be supplied as mere hints that the compiler can use for optimization purposes and warn/error if the length is exceeded; but arrays should not be their own hard-coded type since that prevents abstraction and introduces type wrangling/conversions). Inspired by Clojure. Also, if you have one of a thing (a block of code, or whatever), then assume it can become arbitrarily long, and account for that in the design of the syntax (scalability). An example of scalability criteria taken into consideration in syntax formatting can be seen in the new Prettier ternaries format. Notably using un-indentation of the false-condition to prevent excessive indentation, since flattened/linear code reads better. Formatting is not a trivial concern, and fast top-to-bottom readability is paramount.

  • "Things that are different should look different". Counter-inspired by D. Inspired by Lary Wall on Perl's postmodernism and my own frustrations with modern component frameworks like React, and my impression that Lisp/Clojure is perceived as hard to learn because it has so little syntax: when everything looks the same it is hard to tell things apart. Syntax matters, so when people balk at noise like too many parentheses, you ought to listen (like React did, rather than ignore it, like Clojure unfortunately did and still does, even though the Lisp inventor McCarthy didn't even like the S-expressions syntax himself, and it likely caused Lisp to lose a lot of potential users). So making a syntax which is easy to pick up for newcomers, is important to remove friction from onboarding and thus ensure the growth of the language. Some unfamiliarity to experts is tolerable, since they likely have the Curse of Expertise from many other languages, but are also adept at quickly learning new ones. Seemingly silly things can be the barrier between mass adoption and remaining niche. A language should optimize for mass adoption, due to network effects, since then everyone wins: learning, communication, portability, ecosystem etc. It doesn't mean pleasing everyone's ultimate desires, just avoid having most people turn at the door. Counter-inspired by Lisp/Clojure, and Haskell. Although it is prize-worthy to stay very frugal with syntax, since more syntax necessitates more learning/documentation (knowledge debt, info overload), more avenues for confusion (the best code is no code), and more complications (language intricacies can lead to software intricacies, which can lead to bugs). But most importantly: less syntax enables better composition (all things are lego blocks that fit with each other). Inspired by Lisp/Clojure. My philosophy leans more towards Golang (less features, readability is reliability, simplicity scales best) and Python ("explicit over implicit", "one way over multiple ways"), than Ruby (provide sharp knives) and Perl (postmodern plurality, coolness/easiness is justification enough in itself, aka. the smell of a toy language). Even though I come from Ruby and love it, and also cannot help admiring Lisp for its elegance and crucial evolvability.

  • Programming should be fun and painless. Inspired by Ruby.

  • Make it predictable: Eliminate entropy/chaos/variability. Reduce choice. Uniformity helps (and "polymorphism" by its definition smells; "ad-hoc" smells even more). Uniformity enables simple, clear and robust rules to be formed and used, which is well suited for straightforward programming/automation (and maintenance). Variability makes programming harder (e.g. "Did you forget an else-if case? Can you rely on this value being present?"). (Counter-inspired by JSON not allowing commas after all values.) Too often variability is merely accidental complexity in the name of sophistication, done without forethought / holistic design. Variability in a language makes interpreting that language harder (or writing a compiler for it; or using code-as-data/homoiconicity). Predictability is beneficial for both man and machine. Constraints liberate, liberties constrain. Premature abstraction is dangerous. Abstraction is not about power. Abstraction is about reducing expressivity in a principled way: "If we were seeking power we would have stopped at GOTO's. ... The point is to reduce our expressivity in a principled way ... [to] something which is still powerful enough for our daily uses." -- Cheng Lou of ReScript. Use the Principle of Least Power where possible. Principled abstraction and uniformity enable componentization and composability. Much of the pain in programming languages comes from harmonizing wildly different forms of abstraction (too much heterogeneity/entropy from sophistication, in the name of familiarity/easiness). "A computer language is like a system of tightly intertwined components: every time you introduce a new linguistic abstraction, you have to make sure that it plays well with the rest of the system to avoid bugs and inconsistencies." -- Hirrolot. "It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." -- Alan Perlis in Epigrams on Programming (1982).

  • Make the performant thing the default. Inspired by QwikJS. Developers shouldn't need to rewrite/refactor large swaths of their application just to make it performant. There should be one way, and that way should be the correct and performant way. See the point on uniformity and predictability called "Make it predictable: Eliminate entropy/chaos/variability". Inspired by Rust's 'zero-cost abstractions', but counter-inspired by Rust's many string types and Rust's semantic complexity.

  • Prefer static over dynamic where possible. It increases predictability for both humans and the machine. Static typing can predict bugs during development, before deployment to production. A less powerful language is easier to analyze and build tooling for. A stateless system is better than a stateful one. Similarly, recreating state from a known point ("just restart it") is much simpler and preferable to modifying a dynamic system into a desired state. Especially in bug reproduction. Inspired by React. Counter-inspired by Microservices, Actors and live/REPL programming.


Purpose: What would the language be for?

What should this dream language of mine primarily be for?

I believe there is enough cross-over between web app (in-browser + web server) and systems development that a generally powerful (i.e. Turing-complete) language could address both successfully. The language Go / Golang tries to do that (for web servers and systems), for instance. Clojure (for backend) with the ClojureScript compiler (for frontend) is another example, or similarly F# with the Fable compiler. Though the question is up for debate whether or not it is a good idea, or if we should always have specialized languages for different domains. But I believe there are benefits if we don't have to.

So, from various sources of inspiration, and the aforementioned principles in mind, here is the list of features that my dream programming language would have.


Features in bold are considered most important. Features are divided into the following sections, based on what all languages ought to address. Learnability, productivity, and scalability are considered cross-cutting concerns, and are thus by and large included within each section.

Readability & reasonability

  • No magic / hidden control. Control-flow should be easy to trace, because that makes it easy to understand and debug. Less magic. Counter-inspired by Ruby on Rails. Inspired by Elixir Phoenix routing / endpoint plugs. Explicitness makes testing isolated parts of the system possible. So Explicit > Implicit, almost always. Inspired by the Zen of Python. Because explicitness typically reduces ambiguity and increases predictability. Although you can go overboard with it too, like in programming languages for enterprise development, where everything tends to become over-specified. Furthermore, implicitness is preferred when one may intuitively and robustly determine the convention through the context, and might as well have an implicit sane default. E.g. self. references to access class variables inside class methods represent noise, when the access could be an implicit default. This is counter-inspired by Python, and inspired by Ruby. However, using self and this is considered an anti-pattern in general.

    • Libraries over frameworks, as a strongly recommended community convention (since a language cannot prevent the creation of frameworks, afaik). Inspired by Elixir, where its Phoenix framework is a notable exception to the rule. Frameworks typically utilize inversion of control ("don't call us, we'll call you"), and ultimately serve to take away control from the programmer. That creates stack traces which are really hard to debug, because they reference the framework and not your own code, esp. problematic with concurrency. And when yielding control to various (micro-) frameworks, compatibility becomes a specific issue. The programmer shouldn't ever have to ask: "Is this library/framework compatible with this other one?". Counter-inspired by JS Fatigue. Nor have to ask "Where is the execution path of this program?". Counter-inspired by the magic of Ruby on Rails. When control is always returned to the programmer (no IoC), they can mix and match more as they please, without worrying up-front about compatibility (which leads to analysis paralysis).

    • No meta-programming / no runtime evaluated macros. No first-class macros (runtime) since it is a powerful footgun, and breaks our principle of "prefer static over dynamic (for predictability)". But the language should possibly have compile-time macros. Inspired by Clojure and Babel. So that the language can be extended by the community, and so that legacy code could be updated to the latest language version by processing the code to transform the syntax (novel feature?). Sort of like the polyfilling Babel does, but in reverse (see reverse polyfills; polyfills that upgrade/modify/interpret old code for features that are removed from the language, so the language evolution/adaptation and growth is not impeded by compromises due to eternal backwards-compatibility, yet still prevent breaking old code. See section on 'Backward- and forward-compatible'). Also inspired by Babel's simple user-defined code transforms. The problem with even compile-time macros is that it is hard to combine them with (the preservation of) static typing (i.e. types are not checked). If compile-time macros are included, they should not be string-based (find-and-replace) macros, but based on Abstract Syntax Tree (AST) 'syntax objects', since macros should respect scopes, namespaces etc., and should probably also be hygienic while having some escape hatches (like syntax parameters). Inspired by Racket. The problem with compile-time macros for our concerns is that they are essentially direct instructions to the computer (compiler, runtime), which violates the principle of having the computer only indirectly interpret the model (of the system) that the programmer has described (see section on 'Few keywords and operators'). There also is the problem that there is a formal bound to what macros can do for you in terms of the expressiveness of a language: a macro expansion is syntax sugar, and a language can only express something new if it cannot possibly be macro expanded from something existing in the language. So if we want to allow extending the language easily, then maybe instead of compile-time macros the language should make it easy to create some sort of compiler plugins (which are written and run as a separate program). In any case, we want to disallow mixing macros into application code, since they make reasoning about the program and its control-flow harder: what you see may then not be what you get.
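
For illustration, here is a minimal sketch of the kind of user-defined, AST-based, compile-time transform that Babel already allows (the plugin and the `oldPrint`/`print` identifiers are hypothetical). A "reverse polyfill" could work along these lines, rewriting a removed construct into its modern equivalent before compilation:

```typescript
import type { PluginObj } from "@babel/core";

// Rewrites the (hypothetical) removed identifier `oldPrint` to `print`.
// Because it operates on the AST rather than on raw strings, the traversal
// machinery respects scopes and namespaces, unlike find-and-replace macros.
export default function legacyPrintTransform(): PluginObj {
  return {
    name: "legacy-print-transform",
    visitor: {
      Identifier(path) {
        if (path.node.name === "oldPrint") {
          path.node.name = "print";
        }
      },
    },
  };
}
```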

    • Expressions over statements. The calling code should always get something back (Is. 55:11). Because the result should be able to be further manipulated (chained, for instance). Inspired by Clojure and Haskell. Counter-inspired by JavaScript. Statements suck, as even the inventor of JavaScript, Brendan Eich, admits. A goal should be to eliminate the subjective/anthropocentric bias that afflicts programming (especially the Imperative kind), because: It is not you, the programmer, who should be calling code, but code should be calling code. Code should not terminate in the void, as if it's you the programmer who is at every step acting on the machine. It should be the machine acting on itself. Which it actually is. So this is a matter of fact. But it should also be a matter of our language. So our programming language matches the fact. As programmers we should model/describe causal interactions between entities, not simply encode our own interactions with the machine.
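
A small TypeScript contrast of the two styles (illustrative only):

```typescript
const score = 95;

// Statement style: the if/else returns nothing, so we must mutate
// a variable on the side; the code "terminates in the void".
let label: string;
if (score > 90) {
  label = "high";
} else {
  label = "low";
}

// Expression style: the conditional itself yields a value,
// so the calling code gets something back and can chain on it.
const loudLabel = (score > 90 ? "high" : "low").toUpperCase();
```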

Syntax, composition

  • Readability and reasonability as top priority. Local > Global: The language should afford the developer the ability to perform local reasoning (instead of global reasoning), to only focus on the code at hand (not having to jump around many places or files, and worry about potential 'spooky action at a distance'). Reduce dev mind cycles > reduce CPU cycles. Human-oriented and DX-oriented. Willing to sacrifice some performance, but not much, and not merely to gain resemblance to natural language. Counter-inspired by SQL. Willing to sacrifice immediate power in the language itself, esp. if that can be achieved through abstracted-away libraries.

    • Should always be able to be read top-to-bottom, left-to-right. The execution order should also always follow the reading order. No <expression> if <condition> like Ruby allows, since it doesn't afford a scalable way of thinking (just imagine <expression> growing very large, and the "joy" of discovering a tiny if at the end, invalidating your initial assumption that reading the <expression> was relevant to your debugging). Certainly no <code_block> while <condition>. The alignment between execution order and reading order was some of Dijkstra's wisdom: "our intellectual powers to visualize processes evolving in time are relatively poorly developed, we should shorten the conceptual gap between the static program and the dynamic process, by spreading out the process in text space". In simpler terms: enabling the programmer to trace the flow of execution (aka. 'control-flow' or 'control') by simply reading the code. To be able to point to a precise location in the code/text, and ask what state the program/machine is in at that time (i.e. the ostensibility of code). This is of course a feature of the Von Neumann model of computing, so the language would to some extent be tied to it, but it would be a pragmatic choice, since it is the most prevalent model of computing anyway. Relatedly, it is said that: "a core difficulty with [Von Neumann style] languages was that programmers were reduced to reasoning about a long sequence of small state changes to understand their programs, and that a much simpler and more compositional method of reasoning would be to think in terms of the net effect of a computation, where the only thing that mattered was the mapping from function inputs to function outputs." I.e. imperative vs. declarative. So there might be some dissonance with the functional programming model here, which the language should aim to resolve, since we desire both the "reasoning about a long sequence of small state changes to understand their programs" and the "more compositional method of reasoning would be to think in terms of the net effect of a computation, where the only thing that mattered was the mapping from function inputs to function outputs". The programmer should be able to consider each, at both development and debugging, where each has its strength. NB: There are some arguments against paying any heed to control-flow at all (see: "4.2 Complexity caused by Control", Out of the Tar Pit, 2006), but unless we'd want an entirely lazy programming language, of which we are not (yet) convinced (cf. eager evaluation by default), then we're out of luck, as far as I know.

    • Reasonability and safety > Power. "In other words, when I focus on reasonability, I don’t care what my language will let me do, I care more about what my language won’t let me do. I want a language that stops me doing stupid things by mistake. That is, if I had to choose between language A that didn’t allow nulls, or language B that had higher-kinded types but still allowed objects to be null easily, I would pick language A without hesitation." -- Scott Wlaschin.

    • Syntax matters (and homoiconicity is a plus): Readability should not imply a one-to-one match with natural language (counter-inspired by SQL), since natural language is inconsistent, duplicitous, ambivalent and multi-faceted. Consistency is key to a programming language. But it should borrow some of the sub-structures from commonly used natural languages such as English (like its popular Subject-Verb-Object, SVO, structure; see also the point on 'DFFP') to make adoption easier (more at-hand/intuitive) for most. Since such grammatical sub-structures are indicative of how we tend to model the world (maybe derived from our shared familiarity with physical objects acting on one another). (This can relate to Chomsky's Universal Grammar theory in linguistics). The SVO syntax also aligns elegantly with the Graph Data model of RDF (subject-predicate-object triples). So a language based on Subject-Verb-Object style could be homoiconic, since subject-predicate-object is already a data structure (RDF). Furthermore, if code-is-data (i.e. homoiconicity or pseudo-homoiconicity is preserved) it could be interesting to have the code map well to a graph database, opening up avenues for analysis in the form of advanced graph algorithms, which could be useful for, say, code complexity analysis (e.g. more straightforward cyclomatic complexity analysis) or deadlock detection. There is already precedent in the use of combinator graph reductions in FP languages. Homoiconicity (code structure mirroring a data structure) could potentially also help with respect to typing, since we want to be able to execute the same code at compile-time (statics) and run-time (dynamics), to avoid the biformity and inconsistency of static languages: "the ideal linguistic abstraction is both static and dynamic; however, it is still a single concept and not two logically similar concepts but with different interfaces". Counter-inspired by JavaScript & TypeScript, which have plenty of duplicate abstractions, for talking to either the runtime (JS) or the compiler (TS). I want to simply talk once, and have the runtime and the compiler interpret what it needs. "One of the most fundamental features of Idris is that types and expressions are part of the same language – you use the same syntax for both." -- Edwin Brady, the author of Idris (Edwin Brady, n.d.). Inspired by Idris. But, "Idris, however, is not a simple language", so the ideal solution here is wanting... (but see the sub-point below). Maybe patterns as types could be a way... Or types defined with the same language constructs as the runtime language, like in Zig's comptime concept... Data types are a way to manually describe the shape of data, but it seems what you want is to automatically derive/infer the shape of data (as far as you can, based on a closed-world assumption of your own code, where third party code would use type bindings, and data received at runtime would need to fit into a type declared statically). In any case, the goal is to avoid duplication, avoid types being a declarative meta-language on top of the language, and potentially allow constructing custom types imperatively (within some constraints, due to the Halting problem). This is also inspired by NexusJS and io-ts, which allow the inverse, namely using types at runtime (for I/O validation): "The advantage of using io-ts to define the runtime type is that we can validate the type at runtime, and we can also extract the corresponding static type, so we don’t have to define it twice."
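
As a sketch of that homoiconicity idea (the representation below is hypothetical): an SVO expression is already a subject-predicate-object triple, so a program could double as a graph:

```typescript
// One SVO expression as an RDF-style triple.
type Triple = { subject: string; predicate: string; object: string };

// "helloMom.joinWith(helloDad)" seen as data rather than as syntax:
const expr: Triple = { subject: "helloMom", predicate: "joinWith", object: "helloDad" };

// A whole program then becomes a list of edges, i.e. a graph that
// graph algorithms (complexity analysis, deadlock detection) can query.
const program: Triple[] = [expr];
```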

    • A potential solution to the biformity and inconsistency of static languages may be to avoid declaring types at all, but rather declare concrete data (mock data and/or default data), from which its type is inferred. This is expounded upon under the point on 'concretization'. This could also help with the homoiconicity problem that code = data doesn't play well with the presence of types. If we treat mock data as default data, and assign it as such (to variables, arguments, API calls etc.), we could run the program entirely without dynamic input, at the time of development or compilation, if we wanted to. Or, in production, we could have sane default/backup values to fallback to (without erroring) in case some third party API didn't provide us with all the data we wanted.
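
TypeScript already hints at how this could work: a static type can be derived from concrete default data, and that same data can double as a runtime fallback (the names below are hypothetical):

```typescript
// Declare concrete default/mock data once...
const defaultUser = {
  name: "Ada",
  age: 36,
  tags: ["admin"],
};

// ...and infer the static type from it, instead of declaring it twice.
type User = typeof defaultUser;

// In production, the same data serves as a sane fallback when a
// third-party API doesn't provide all the data we wanted.
function loadUser(fromApi: Partial<User>): User {
  return { ...defaultUser, ...fromApi };
}

console.log(loadUser({ name: "Grace" })); // { name: "Grace", age: 36, tags: ["admin"] }
```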

    • The syntax should as much as possible favor words over special characters (like curly braces, etc., and brackets and parentheses to a lesser extent). (This must be weighed against the desire for homoiconicity..). Plaintext words are faster to write (counter-intuitively enough, but learned from the shift towards password phrases over passwords with special characters, and from the benefit of not having to use modifier keys, which are esp. cumbersome on mobile keyboards), and more aesthetic to read, helpful for visually impaired users, and more self-documenting to novices. Inspired by Ruby (if .. end is vertically more symmetric than a vertically aligned if ... }) and Tailwind (fast-to-type is a feature). The IDE should support auto-completion on language keywords (similar to how it can auto-complete references to your own code), so it's even faster to type the valid language keywords. The IDE should also allow toggling auto-compaction/tersification, for the times when you need to process a lot of information or many details of a complex algorithm at once (though that should be a sign that you should refactor instead of writing more tersely). But the full text of language keywords should always be present, so people won't constantly need documentation to understand the gist of code that is shared online (otherwise you could get "what does cons and conj mean, now again?" scenarios). That should also ease on-boarding of novices, and thus benefit the growth of the ecosystem.

    • Isolation / Encapsulation. To analyze a program (i.e. break it down and understand it in detail), you need to be able to understand parts of the program in isolation. So everything should be able to be encapsulated (all code, whether on back-end and front-end), since encapsulation affords reasonability (and testability), by limiting places bugs (i.e. errors in thinking) can hide. Counter-inspired by Rails views (sharing a global scope) and instance variables. Inspired by the testability of pure functions.

    • Params: We want to avoid problems with ordering and mysterious arguments. So, arguments to functions can be given in any order, but must always be named with labels corresponding to the parameters the function takes. If the label is equal to the argument variable name then the label can be omitted, since you shouldn't need to repeat yourself. (If the argument variable is referenced from a module, then the module path should be disregarded in the name matching.) Inspired by keyword arguments in Ruby and JS object params. Even if a function takes only 1 parameter, it turns out to be useful/informative to have a label, unless the function takes a variable with the same name as its parameter (in which case the label should be omitted: you can write output(message) instead of output(message: message), which is repetition that frankly looks silly and represents noise). (In the SVO-syntax, the Subject can in principle be bound to any of the function's parameters that has the same data type, so to differentiate, the Subject also needs a label to designate which one. E.g.: helloMom→message.joinWith(helloDad→secondMessageParam), where the keywords are put after the input data, due to 'DFFP'.) Counter-inspired by the Mysterious Tuple Problem in Lisp, and inspired by the labeled arguments in ReasonML, and the interleaving of keywords and arguments in Smalltalk.
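
A rough TypeScript approximation, using an object parameter as labeled arguments (the function names are hypothetical):

```typescript
// Parameters are labeled and order-independent at the call site.
function output({ message, urgent = false }: { message: string; urgent?: boolean }) {
  console.log(urgent ? message.toUpperCase() : message);
}

output({ urgent: true, message: "hello" }); // any order, always labeled

// When the argument variable name equals the parameter label,
// shorthand lets you omit the repetition:
const message = "hi mom";
output({ message }); // instead of output({ message: message })
```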

    • No currying. Inspired by ReScript. But should still allow specialization / partial application of functions by other means (see the difference). Currying is related to how a function is declared, whilst partial application is related to how it is used. Currying is dependent on parameter ordering when declared, but we want to allow any order of parameters (see section on "No Place-oriented programming" which talks about "parameter lists", and the section on "Params" which talks about "keyword arguments"). Partial function application is a common technique to be able to create new functions that are specialized versions of general functions that take many parameters. It allows providing the function with some arguments at one place in the code, but waiting until another place in the code to provide the rest of the needed arguments and then to execute it. Or supplying a function to code that expects to work with functions taking fewer parameters. We want to achieve such specialization by using closures rather than currying. Inspired by Python's functools.partial function, the application of which allows partial function application to be decided upon at the place of use/application (not at the place of function declaration). Currying is also typically tied to a strict and enforced ordering of arguments, which we don't want for flexibility reasons (see point on 'No Place-oriented programming'). Normal currying would also limit reuse: you shouldn't, at the time of declaration, have to predict which (or in what order) arguments will be available at the time of use. Counter-inspired by Haskell. But if we were ever convinced to allow currying, then input params should at least be explicit at every step (for clarity, refactorability and to aid the compiler). Counter-inspired by argument/point free style in FP and concatenative languages, due to the principle that explicit is better than implicit (inspired by the Zen of Python). Should in any case probably never auto-curry functions, since it confuses partial function invocation with regular function invocation ("is the result of this function a function or a value?"), and curried functions make default, optional, and keyword parameters difficult to implement.
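
A sketch of specialization via closures at the place of use, rather than via curried declaration (names are hypothetical; loosely analogous to Python's functools.partial):

```typescript
// An ordinary, uncurried function.
function joinWith(separator: string, parts: string[]): string {
  return parts.join(separator);
}

// Specialize at the use site, fixing whichever argument is known first;
// no dependence on the declared parameter order.
const joinWithDash = (parts: string[]) => joinWith("-", parts);
const joinXYZ = (separator: string) => joinWith(separator, ["x", "y", "z"]);

console.log(joinWithDash(["2024", "06", "16"])); // "2024-06-16"
console.log(joinXYZ(" + "));                     // "x + y + z"
```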

    • No Place-oriented programming (PLOP), in other words, avoid order-dependence (aka. positional semantics) at almost any cost, since it isn't adaptable/scalable. Inspired by Clojure. Counter-inspired by Haskell and Elm, and Lisp (and to some extent Clojure). This goes for reorderability of expressions due to pure functions having no side-effects. Such reordering is desired (see: "4.2 Complexity caused by Control", Out of the Tar Pit, 2006) since it allows designing, structuring, and reading programs in a finish-to-start/high-to-low-level manner, enabling the reader to incrementally drill down into the code with the underlying implementation aka. "top-down program decomposition" aka. "call-before-declare" (the same reason that JS has function hoisting). Order-independence also goes for parameter lists to functions. I don't want to have to use a _ placeholder for places where there could be a parameter, just because I didn't supply one. Shouldn't have to sacrifice piping just to get named arguments, either (piping should use an explicit pipe operator). Counter-inspired by Elm, and inspired by Hack. Consequence (?): would need a data structure like a record but which ideally can be accessed in an order-independent manner (similar to a map). Plus, functions should be able to take in such records but also a rest parameter that can represent an arbitrary number of extra fields (to make functions more reusable and less coupled to their initial context, e.g. they should be able to be moved up/down in a component hierarchy without major changes to their parameter lists). Counter-inspired by Elm. But Records are useful for enumerating what's possible, and when used in pattern-matching the type system could warn you when you are forgetting to account for all fields, and show you where you need to update the code. This is esp. useful when refactoring, and esp. when working in a team. To achieve this benefit without resorting to Records and PLOP, the type system should warn you of the (pattern-matching) places you could want to update to handle all the Record's new fields, but you shouldn't be required to update them, if you don't want to. This way you, and the code, can stay flexible (and functions would be more reusable), but you'd also receive the needed guidance.
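
A sketch of the record-plus-rest-parameter idea in TypeScript (types and names are hypothetical):

```typescript
// The function names only the fields it needs; arbitrary extra fields ride
// along, so it can move up/down a component hierarchy without signature churn.
type Labeled = { label: string };

function renderBadge({ label, ...rest }: Labeled & Record<string, unknown>) {
  console.log(`[${label}]`, rest); // uses what it needs, forwards the rest
}

renderBadge({ label: "new", color: "red", priority: 1 }); // extra fields tolerated
```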

    • No unless or other counter-intuitive-prone operators. Counter-inspired by Ruby. See also the rationale for disallowing <expression> if <condition> as it also applies to <expression> unless <condition>. Even Python's substitute for a ternary operator, <expression> if <condition> else <expression>, is prone to misuse. Such operators are a symptom of a deeper need. The lack of such operators could be alleviated with 'Dynamic code re-arrangement' (see the point with same name in this article), since that allows you to focus on the chosen thing of importance when reading (e.g. the result, or the way you get to the result). In all, the reason such syntax is invented I think comes down to: Some times you read code and just care about the result (WHAT), and only once you've found the end results you care about tracing HOW it got there. Because why bother tracing long and intricate code if its result is ultimately not what you are after? While other times you read code and want to go through all the steps chronologically to see what happens (HOW), and the result is of secondary importance. Typically when debugging, or when you know you are looking in the right place in the first place. This points to the need for tooling to be able to reorder code depending on what the programmer is looking for. Maybe instead of treating code just as text characters, code should be able to be treated as blocks in the IDE, as many web based publishing platforms (Medium, Authorea, Hashnode) have discovered could be a smart and powerful thing to do with text.

    • No abstract mathematical jargon. Counter-inspired by Haskell. As it impedes onboarding, and induces a mindset of theorizing and premature abstraction/generalization that impedes rapid development. Should be accessible for as wide a community as possible, with as little foreknowledge as possible. Inspired by Quorum. Also apply some pragmatic constraints to conventions from functional programming ways of writing code. In the interest of legibility and onboarding. FP conventions are not considered paramount (e.g. currying, argument/point free style and fold), so they might not be supported, but their utility should be considered on a case-by-case basis. Inspired by Don Syme of F#. On the dislike of argument/point free style: If you are to think about something, it is an advantage that it is there, in front of you, reminding you about what it is you are thinking about. You should always be able to see what you are working on. See later section on 'No need to manipulate data structures in the human mind'.

    • Do not presume contextual knowledge. In UX this is known as "No modes!". Code should be able to be read from start to finish without having been educated/preloaded with any foreknowledge (like 'in this context, you have these things implicitly available'). Counter-inspired by class inheritance and Ruby magic, and JavaScript's runtime bound this keyword and associated scoping problems and safety problems. Turns out too much dynamism (runtime contextualisation) can be harmful. Counter-inspired by JavaScript: "In JS, this is one truly awful part. this is a dynamically scoped variable that takes on values depending on how the current function was invoked." -- sinelaw.

    • Should facilitate and nudge programming in the language towards code with low Cognitive Complexity score.

    • Dynamic Verbosity: Should be able to show more/less of syntax terms in the code (length of variable names could be auto-shortened). Beginners will want more self-documenting code. Whereas experts typically desire terser code, so they can focus on the problem domain without clutter from the language (e.g. mathematics). "By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race." -- A.N. Whitehead (cited in Notation as a Tool of Thought). A programmer will typically gradually go from beginner to expert on a code-base, even his own. See: content-addressable code. Content-addressable code would afford dynamic verbosity, which is important because: "A language should be designed in terms of an abstract syntax and it should have perhaps, several forms of concrete syntax: one which is easy to write and maybe quite abbreviated; another which is good to look at and maybe quite fancy... and another, which is easy to make computers manipulate... all should be based on the same abstract syntax... the abstract syntax is what the theoreticians will use and one or more of the concrete syntaxes is what the practitioners will use." -- John McCarthy, creator of Lisp. Content-addressable code is also important for Dynamic Verbosity because of Stroustrup's Rule: "For new features, people insist on LOUD explicit syntax. For established features, people want terse notation." -- Bjarne Stroustrup. "When designing Rust, we quickly learned that the idea of picking short names to satisfy experienced users at the cost of new ones doesn't work in practice. Too many users complain that the language is too hard to learn and ugly; it took years for 'Rust is so ugly, it's the resurrection of Perl' to finally stop being a meme. If we had stuck to our guns with short keywords, Rust might have been dead by now. ... history has shown that readable languages much more frequently go on to be successful than languages with idiosyncratic syntactic choices." -- Patrick Walton (on unfamiliar syntax in Lisp such as car and cdr vs. first and rest).

    • Dynamic code re-arrangement: code should be able to be re-arranged in order of start-to-finish/low-to-high-level or finish-to-start/high-to-low-level, because each is beneficial at various points when writing/reading code. Some times you care only about the results, and some times you care about how the program got to those results. But today, code is fixed in the order it is written. Typically in either an imperative way, displaying steps chronologically from start-to-finish, akin to chaining e.g. data.func1.func2.func3, or in a functional way: from outside-in, e.g.: func3(func2(func1(data))), but when reading to understand the order of execution you have to read from the innermost to the outermost expression. This is problematic, at various times, and has arguably made functional programming less accessible to newcomers. But both ways of reading are desirable, at different times: At one time you just care about viewing the overall result/conclusion (e.g. func3 and what it returns), and potentially working your way backwards/inwards a little bit. Maybe you start with an end goal in mind (name of a function), and then drill down to a more and more concrete implementation. But at another time you care about going the other way around: seeing how it is executed from start to the finishing result/conclusion (think: piping). This duality of thinking/reading mirrors how we approach reading in other domains. This feature could be enabled by content-addressable code, since the arrangement of code itself could be made more malleable and dynamic. See: content-addressable code.

  • Piping, or some form of it. But always top-to-bottom or left-to-right. Inspired by Bash, and functional programming with pipes (Elixir, BuckleScript, and ts-belt). Data-first instead of data-last.
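
A minimal data-first pipe in TypeScript (the `pipe`/`to` helper is hypothetical, standing in for a built-in pipe operator):

```typescript
interface Piped<T> {
  to<U>(fn: (t: T) => U): Piped<U>;
  value: T;
}

// Data comes first; transformations read top-to-bottom, left-to-right.
function pipe<T>(value: T): Piped<T> {
  return {
    to: (fn) => pipe(fn(value)),
    value,
  };
}

const result = pipe([1, 2, 3, 4])
  .to((xs) => xs.filter((x) => x % 2 === 0)) // [2, 4]
  .to((xs) => xs.map((x) => x * 10))         // [20, 40]
  .to((xs) => xs.reduce((a, b) => a + b, 0)) // 60
  .value;

console.log(result); // 60
```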

  • Not indentation based / whitespace should not be significant (counter-inspired by Python), since it is brittle: copy/paste bugs when the source and destination are indented differently, variable renaming resulting in bugs (and more). Sharing code online or by email can also fail when indentation is made meaningful, due to various editors treating indentation differently, some even stripping it out entirely (which would make code interpretation ambiguous). Whitespace should not implicitly determine the meaning of the program, rather: explicit meaning should determine indentation (which enables auto-formatting). Inspired by Pyret, and Golang's auto-formatter gofmt. This also eases mechanical source transformations (with clearer git diffs) which eases maintenance on large code bases (i.e. scales better). It should also go without saying that invisible characters (whitespace; space, tabs) should not affect the meaning or interpretation of a program (Who would like to debug something they can't see? I once spent 2 days debugging a fragile API call which turned out to be due to an extra whitespace at some point in the XML message...). With explicit closing tags, the IDE can then help to re-indent code appropriately (since seemingly improper indentation couldn't incidentally carry semantic meaning). But the language should also not require semicolons. Inspired by Ruby, and semi-colon-free JS. Even though newline characters could be deemed to be significant whitespace, and subject to the aforementioned problems, it should only be a problem if the language is based around statements, not expressions (like intended).

    • BUT: Might consider allowing (though never requiring) indentation-based syntax, if the language has a standard style and formatter, and if the readability and ergonomics turn out to be vastly superior at scale, with the particular language syntax. In that case, tabs should be forbidden in favor of simple whitespace (IDE's can easily turn tabs into whitespace characters anyway).

    • Similar-looking and non-interacting code-lines should be able to change place without breaking anything. Counter-inspired by not being able to add a comma to the last line in JSON, not being able to reorder/extract from comma-separated multi-line variable declarations in JS, and also counter-inspired by the contextualised expression terminators in Erlang.

    • Consistent syntax, optimized for code refactoring: "The language syntax facilitates code reuse by making it easy to move code from local block → local function → global function". Inspired by Jai.

  • Data First Functional Programming (DFFP). Based on the solid theoretical foundation of Lambda calculus. Should mimic the style of object-orientation, but is simply structs and functions under-the-hood. (Also: functional programming patterns over procedural code.) Because it is human to see and visualize the world (as well as computing) in terms of objects and verbs, and to use verbs to signify the causal relations between objects. Focusing too heavily on only one of the paradigms (OOP or FP) can typically lead to anti-patterns (God classes/objects, Factory objects, and Singletons, in OOP), or program structures far removed from the business domain model which also have linguistically unintuitive syntax, as in FP (c.f. FP is not popular because it is backwards). This is because only 12% of natural languages start with the verb, either Verb-Subject-Object (VSO) or Verb-Object-Subject (VOS) (see subject-verb-object on Wikipedia), as FP tends to do, e.g. verb(subject object) or verb(object) or (verb object). But 88% of natural languages start with something concrete, the Subject/Object (and the Object of one sentence is typically the Subject of the next sentence; similar to call chains). So Subject-Verb-Object (SVO) should be preferred. The programming language could account for this, for instance like: subject.verb(object) or subject.verb(subject.verb(object)).verb or subject.verb.verb or subject.verb(object).verb(object) etc. (see 'syntax matters' on 'homoiconicity'). Because it enables the programmer to cognitively carry forward (dare I say.. iteratively mutate) a mental model, step by step (without relying on short or long term memory, which should be freed for higher level concerns than mere parsing). Inspired by method chaining and Fluent interfaces aka. Fluent API's. But without Fluent interfaces' associated problems like mutation/impurity, large and cancerous classes/objects, and functions coupled with (and inside of) those classes, so they would be hard to reuse across classes, or relocate across modules. Instead, since the Fluent API would be solved with a simple transform to Lisp/Clojure-style FP underneath (and IDE autocomplete would be done by looking up functions applicable to the resulting data type), we could respect the open-closed principle and extend the functionality of a data type by simply writing new functions (wherever we want) that take in that data type. This is enabled by the subject.verb(object) to (verb subject object) transform, since the latter is simply a Lisp/Clojure style function (in JS terms it would be verb(subject, object)). The programmer would not have to mentally do this transform while coding and when looking at function definitions, because functions would also be defined in the same pattern, f.ex.: subject.verb(object).defined_as(...). I think the SVO alignment is one significant but under-appreciated reason for OOP's success (since it affords syntonicity, both of the body and ego kind). That, together with enabling a stepwise/imperative construction of programs (as Imperative programming styles capitalize on), makes for a more intuitive approach for beginners, which is vitally important for onboarding & growth. "Objects and methods" could be merely syntax sugar for structs and functions (see: interchangeability of method-style and procedure-call-style, or the pipe first operator in ReScript, which also illustrates emulating object-oriented programming), if one leaves out troublesome inheritance (which might be good, since composition > inheritance). Inspired by Golang. The quote "Data, not behavior, is the more crucial part of programming." is attributed to Linus Torvalds and Fred Brooks. If data is the focus-point, the language should mirror that. Interestingly, in addition to the more intuitive API, data-first also affords better IDE integration, simpler compiler errors, and more accurate type inference. Inspired by ReScript.
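
A sketch of the subject.verb(object) → verb(subject, object) transform described above (all names are hypothetical):

```typescript
// Under the hood: just a struct and a free function.
type Message = { text: string };

// Extending the Message "type" means writing a new function anywhere --
// no class to reopen, respecting the open-closed principle.
function joinWith(subject: Message, object: Message): Message {
  return { text: `${subject.text} ${object.text}` };
}

const helloMom: Message = { text: "hello mom" };
const helloDad: Message = { text: "hello dad" };

// The SVO surface syntax helloMom.joinWith(helloDad) would desugar to:
const combined = joinWith(helloMom, helloDad);
console.log(combined.text); // "hello mom hello dad"
```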

    • If all the language allows you to pass around is functions, then you could possibly choose whether to execute the program (functions) eagerly or lazily. If the program executes eagerly, then it could reify/materialize the result of functions down to the lowest level (primitive values) as it goes along. It would be interesting and potentially very useful to have functions all the way down. Loosely inspired by Unlambda. Since composing functions without materializing values enables some powerful programming constructs (such as 'Transducers'; see elsewhere in this article) and the programmer always has an option to intercede and return a different value from a basic function used throughout the program, if need be.
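
A thunk-based sketch of "functions all the way down", where the caller decides when values materialize (names are hypothetical):

```typescript
type Thunk<T> = () => T;

// Composing functions without materializing values:
const lazyAdd = (a: Thunk<number>, b: Thunk<number>): Thunk<number> =>
  () => a() + b();

const expensive: Thunk<number> = () => {
  console.log("computing...");
  return 40;
};

const sum = lazyAdd(expensive, () => 2); // nothing has run yet
console.log(sum()); // logs "computing..." then 42 -- materialized only on demand
```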

    • Containability and explicitness. Inspired by pure functions. Perhaps the language should even restrict a function's scope only to what's sent in through its parameters. So no one can reference hidden inputs (i.e. side-causes). Thus enforcing more predictable functions, where it is always apparent, at the site where the function is used, what it takes in and what it returns. So a way to achieve partial application of functions (i.e. useful closures), without addressing the outer scope implicitly, could be to supply variables from the outer scope as explicit default/preset/front-loaded parameters (e.g. in pseudo-JS: let closureState = 2; function someFunction(a, b: closureState){...}). This both makes the input and the closure more explicit, and explicit is better than implicit (inspired by the Zen of Python). That way, input coming from closures (usually considered side-causes) would be declared in the function signature, so you don't have to dive into the (potentially long) function body to discover them. With the added benefit that a function could always be customized by the caller by overriding the closure values given as inputs by default (e.g. called as let customState = 3; someFunction(a:1, b: customState);). But importantly, the supplied custom variables should not be able to be shared with other functions (aka. global variables) unless they are full constants, because that would create the dreaded shared mutable state which would introduce side-effects. If we should allow stateful functions at all, then we (at least here) favor functions being able to mutate (local, but persistent) state, rather than allow functions being able to share state with each other (e.g. via global variables). But we do prefer functions that always return the same result even when called repeatedly (i.e. idempotent, but also without side-effects i.e. being a pure function).
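
The inline pseudo-JS above, made runnable in TypeScript: default parameter values are evaluated at call time, so the closure state surfaces in the signature and stays overridable by the caller:

```typescript
let closureState = 2;

// The outer-scope input is declared in the signature, not hidden in the body.
function someFunction(a: number, b: number = closureState): number {
  return a + b;
}

console.log(someFunction(1)); // 3 -- b falls back to the explicit closure value

const customState = 3;
console.log(someFunction(1, customState)); // 4 -- the caller overrides it
```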

    • Composable. Favour composition over inheritance. Inspired by Robert C. Martin, Martin Fowler, and JSX in React. Composability entails it should be easy to write code that is declarative, isolated and order-independent. See "strongly typed".

    • Functional programming patterns like .map, .filter, over procedural code like for-loops etc., since the latter would encourage mutating state, and we want immutability.

    • Immutability enables composability, because it enables order-independence through managed effects.

    • Referentially transparent expressions. Which means variables cannot be reassigned, so a name will always refer to the same value (see principle: "Things that are different should look different"). Inspired by Haskell. Counter-inspired by Ruby: "Of course, functional programming is possible in Ruby, but it's not the natural style. You often end up with many side effects, partly because it's the same syntax for value declaration and variable mutation." according to Laurent Le-Brun. Referential transparency should enable a high degree of modularization but could also lead to easy automatic parallelization and memoization.

    • Automatic TCO (tail-call optimization). To keep the processing lean, and avoid potential stack overflow, by avoiding allocating new stack frames for recursive function calls. Counter-inspired by Clojure / JVM. Inspired by Scheme and ML-languages, and Lua. TCO should align well with the desire to have a language where the programmer can have and mutate a mental model to carry forward, without either the programmer or the computer having to rely on remembering and returning to previously remembered values.
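
For illustration, a tail-recursive function written so the recursive call is the very last action — the shape a TCO-capable compiler needs (a sketch in TypeScript; note that most JS engines do not actually perform TCO):

```typescript
// The accumulator carries the running result, so no work remains after
// the recursive call — the stack frame can be reused under TCO.
const sum = (xs: number[], acc = 0): number =>
  xs.length === 0 ? acc : sum(xs.slice(1), acc + xs[0]);

console.log(sum([1, 2, 3, 4])); // 10, without growing the stack under TCO
```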

    • Formally verifiable / mathematically provable. Nice-to-have, not must-have. The functional programming aspect of the language would support this. Provability of algorithms written in the language could reduce the need for specifying test cases for various forms of input. See also "Safety & Correctness".

  • Very constrained. Since "constraints liberate, liberties constrain", as Bjarnason said. Inspired by Golang's minimalism, Austral's anti-features, and Elm's guardrails. For learnability and maintainability. Since discipline doesn't scale (obligatory xkcd: with too much power, and the wrong nudges, all it takes is a moment of laziness/crunch-time to corrupt a strong foundation), and a complex language affords nerd-sniping kinds of puzzles, and bikeshedding and idiomatic analysis-paralysis. Counter-inspired by Haskell. The virtue of functional programming is that it subtracts features that are too-powerful/footguns (compared to OOP), namely: shared mutable state & side-effects. The language designers should take care of and standardize all the idiomacy (natural modes of expression in the language). "Inside every big ugly language there is a small beautiful language trying to come out." -- sinelaw. The language should assume the developer is an inexperienced, lazy, (immediately) forgetful, and habitual creature. As long as software development is done by mere humans. This assumption sets the bar (the worst case), and is a good principle for DX, as well as UX. The constrained nature of the language should allow for quick learning and proficiency. Complexity should lie in the system and domain, not the language. When the language restricts what can be done, it's easier to understand what was done (a smaller space of possibilities reduces ambiguity and increases predictability, which gives speed for everyone, at a small initial learning cost). The language should avoid Pit of Despair programming, and leave the programmer in the Pit of Success: where its rules encourage you to write correct code in the first place. Inspired by Eric Lippert (of C#), but also by Rust.

    • Few keywords and operators: I don’t want to talk to / instruct the compiler. I want the compiler to understand how I write my program. Even if that limits the ways in which I can write my program. Counter-inspired by F#, C# and Java. Inspired by Clojure's extreme frugality with syntax (macros aside). Since every keyword and operator has to be implemented by the language, and potentially has to be learned by the reader. I'd rather have just functions with self-explaining and easily distinguishable names. (Inspired by Clojure, but also counter-inspired by Clojure's cons and conj.) Even if the names may be longer to write. If it prevents a documentation lookup and reduces the size of the meta-language, then it's worth typing a few extra characters (instead of cons maybe name it construct or even build). Code is read 10x more than it's written. You could argue learning a new keyword is a one-time front-loading cost that is amortized over the number of times it's later encountered (and saves writing and reading time by being terse). But a language should not be reserved for the few who are "in the know" (e.g. a sociolect), but be as accessible to everyone as possible. Also, even if certain keywords are encountered seldom, if there are a multitude of them, every reader is bound to have to reach for the docs for some new keyword frequently enough, when reading code/projects written by others. A programmer's memory is better spent elsewhere.

    • Declarative over imperative (data-oriented not capability-oriented syntax): The syntax of programming languages is typically based on the notion that you either directly reference hardware-related constructs (memory regions etc., think C or Rust), or give instructions to some language runtime, which then carries out those instructions (think Ruby or JS). Both of which mean that the language invariably becomes closely coupled to either a certain conceptual model (i.e. capabilities) of the hardware that it needs to run on (or the limitations of LLVM, and its supported platforms), or needs to run on its own runtime. But what if you turned the language inside-out? So that the language does not envelop its runtime or platform, but is sufficiently abstracted (into describing only data, and dependencies between data) so that it can be consumed by any kind of runtime? Inspired by XState. That way people could implement various runtimes that have various sorts of capabilities/behaviors when it comes to executing the program code. Since the code does not presume anything about how the data is supposed to be processed (sync/async, eager/lazy etc.). Various runtimes may be tailored for different environments: local cpu-first, or distributed network-first, for example. Since sometimes synchronous operations are more performant, but other times async is unavoidable. Which means that the language syntax should not distinguish between sync/async operations, but leave the decision, of HOW the program is run, up to the runtime (where the responsibility should naturally be located: the runtime should decide the run time). Inspired by Golang's elimination of the distinction between sync and async code. All this ties back to the aforementioned notion that Programming needs to get away from the notion that the programmer is giving instructions to the machine. «Progress is possible only if we train ourselves to think about programs without thinking of them as pieces of executable code.» -- Edsger Dijkstra. To "separate the meaning of a program from the implementation details. ... Saying less about implementation should also make programs more flexible." -- Paul Graham in The Hundred-Year Language. So that the compiler or the runtime could choose the appropriate implementation details. See 'adaptive runtime'. (But doesn't it all come down to imperative machine instructions in the end? Yes, but the declarative foundation of the language could either be made in an imperative language like C, or there might exist a way to model even fundamental machine operations in terms of a declarative language, so the declarative language could bootstrap itself and be declarative all the way down...)
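
A toy sketch of this inside-out idea in TypeScript: the program is pure data describing dependencies between data, and a runtime (here a naive, eager, single-machine interpreter) decides how to execute it. All names are illustrative, not a real API:

```typescript
// The "program" makes no assumptions about sync/async or eager/lazy.
type Node =
  | { op: "source"; value: unknown }
  | { op: "map"; input: string; fn: string };

const program: Record<string, Node> = {
  prices: { op: "source", value: [1, 2, 3] },
  doubled: { op: "map", input: "prices", fn: "double" },
};

// One possible runtime among many; a lazy or distributed runtime could
// consume the very same description.
const fns: Record<string, (x: number) => number> = { double: (n) => n * 2 };

function evaluate(name: string): unknown {
  const node = program[name];
  if (node.op === "source") return node.value;
  return (evaluate(node.input) as number[]).map(fns[node.fn]);
}

console.log(evaluate("doubled")); // [2, 4, 6]
```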

    • Names: No alias names for keywords in the language or for functions in the standard library (except for as documentation reference to other languages). Inspired by Python ("explicit over implicit", "one way over multiple ways"). Counter-inspired by Perl (postmodern plurality) and aliasing in the Ramda library. All things tend toward disorder, as programmers it is our job to Fight Entropy (aka. chaos or variability). The language should favor one consistent vocabulary, since it increases predictability and reduces variability. Even at the cost of expressiveness (the language should afford just enough expressiveness for the domain, see: configurable language, community grown). Names should not mimic any other programming language per se, but attempt to cater to complete beginners, because notation has a large impact on novices, a principle inspired by Quorum. There should be a VS Code plugin that allows people coming from various languages to type function names as they know them and the editor will translate on the fly. E.g. typing in array.filter gets turned into array.keep in the code.

    • Guardrails: "<insert your favorite programming paradigm here> works extremely well if used correctly." as Willy Schott said. The ideal programming language should both work extremely well even when used incorrectly (which all powerful tools will be), but first and foremost be extremely hard to use incorrectly. Inspired by Rust and Elm.

    • Not overly terse. Counter-inspired by C. "Legibility trumps succinctness". Maybe give compiler warnings if the programmer writes names with less than about 4 characters. Reading >>> writing, since time spent reading is well over 10x time spent writing (inspired by Robert C. Martin), and writing can be alleviated with auto-complete, text macro expansions, and snippets, in the IDE.

    • No runtime reflection. Counter-inspired by meta-programming and runtime type inspection in Ruby.

    • Not overly verbose. Counter-inspired by XML and Java. Maybe compiler warnings if the programmer writes names with more than about 20 characters.

    • The Rule of Least Power (by W3C) suggests that a language should be the least powerful language still suited for its purpose. To minimise its complexity and surface-area. For better reuse, but more importantly: to make programs, data, and (I will include) data flows, easier to analyse and predict. Inspired by FSM & XState. It needs, however, to be just powerful enough to be generally useful (and not limited to a DSL). Possibly Turing-complete. Given these considerations, a Lisp-style language comes to mind. But there are reasons Lisp never became hugely popular. My guess: readability. So while it could be a Lisp-language (or compile to one), it should read better than one. "If we were seeking power we would have stopped at GOTO's. ... The point is to reduce our expressivity in a principled way ... [to] something which is still powerful enough for our daily uses." -- Cheng Lou of ReScript.

    • See: Escape hatches.

  • Pattern matching. Inspired by Elixir, Rust and ReScript. The expression-oriented nature of the language should make this natural, without extra/fancy syntax. Pattern matching could preferably replace if/else conditional logic, perform totality checking to ensure you've covered every possible condition, and even enable conditional branching based on type.
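
As a rough TypeScript approximation of what exhaustive, type-driven branching buys you (a discriminated union plus a `never` check emulates totality checking):

```typescript
type Fetch =
  | { state: "loading" }
  | { state: "success"; data: string }
  | { state: "failure"; error: string };

function render(f: Fetch): string {
  switch (f.state) {
    case "loading": return "…";
    case "success": return f.data;  // type narrowed: data is available
    case "failure": return f.error; // type narrowed: error is available
    default: {
      // Totality check: if a new variant is added but not handled here,
      // this assignment no longer compiles.
      const unreachable: never = f;
      return unreachable;
    }
  }
}
```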

  • Niceties. Inspired by Bagel.

    • "Only single-quotes can be used for strings, and all strings are template strings and can be multiline ('Hello, ${firstName}!')"

    • No triple-equals; double-equals will work the way triple-equals does in JS.

    • Range operator for creating iterators of numbers (5..10, 0..arr.length, etc).

  • Simple primitives, that compose well, so that you are able to make powerful abstractions. Inspired by SolidJS, Jotai and Radix UI. But, their composability/orthogonality should be vetted and coherent before inclusion in the language, so not every programmer ends up in a tangle, and when programming you should be able to "foresee how your imaginary interface would fit with language semantics" (see Rust is hard). If this means fewer allowable primitives, so be it. Counter-inspired by Rust. So, simple primitives instead of directly supplying powerful abstractions that you have to customize for various use cases, or wait for someone else to release an update for. Covered in the principle of Composition over Configuration (not to be confused with Convention over Configuration). Maybe homoiconicity... since it would make writing the compiler slightly easier / more elegant, and make the language more readily available to evolve on its own (permissionlessly) in the community. Inspired by Lisp and Clojure's Rich Hickey. But homoiconicity would allow meta-programming, and the associated complexity..?

    • The language should maybe also not be so powerful that programs become entirely composed by very high-level domain-specific abstractions, since it encourages esotericity and sociolects, but most importantly: code indirection when reading/browsing. Coding should not feel like designing an AST, so should try to encourage keeping the code flattened (by piping perhaps?) and as down-to-earth as possible. Could maybe be alleviated by an IDE plugin which would allow temporary automatic code inlining (editable previews).

    • Removing variability in the syntax makes it more targetable for tooling and static analysis. This further benefits the ecosystem.

  • Transducers, under-the-hood, to compose and collate/reduce transformation functions (chains of map, filter etc. turn into a single function, visualised here). Chaining function calls should use language-supported transducers implicitly. Maybe one could get transducers for free through use of multiple return values, as inspired by Qi (a flow-oriented DSL for Racket, the Lisp-like language). The language should at least not require a special compose syntax.
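
A minimal transducer sketch in TypeScript, to show the mechanics the language would hide: map and filter steps fuse into one reducer, so a chain traverses the data once, with no intermediate arrays:

```typescript
type Reducer<Acc, T> = (acc: Acc, value: T) => Acc;
// A transducer transforms one reducer into another.
type Transducer<A, B> = <Acc>(next: Reducer<Acc, B>) => Reducer<Acc, A>;

const map = <A, B>(f: (a: A) => B): Transducer<A, B> =>
  (next) => (acc, value) => next(acc, f(value));

const filter = <A>(pred: (a: A) => boolean): Transducer<A, A> =>
  (next) => (acc, value) => (pred(value) ? next(acc, value) : acc);

// "double, then keep > 4", fused into a single pass:
const xform = <Acc>(next: Reducer<Acc, number>): Reducer<Acc, number> =>
  map((n: number) => n * 2)(filter((n: number) => n > 4)(next));

const result = [1, 2, 3, 4].reduce(
  xform((acc: number[], v) => [...acc, v]),
  [] as number[],
); // [6, 8]
```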

  • No Exceptions. Inspired by Golang. But Recoverable and Unrecoverable errors. Inspired by Rust. (Definitely no checked exceptions, as they break encapsulation by imposing behavior on the caller (the caller should only have to handle the function's specified result). Counter-inspired by Java). This has implications for tracing the control-flow of the program, and for how you write the program. See also: "Safety & Correctness".

  • No null, and clear Error handling: Goal is to eliminate timid coding patterns like null checks everywhere. Counter-inspired by Golang. No implicit null or nil value. Meaning no runtime null errors (typically occurring far removed from their point of inception). Inspired by Elm and Rust. Ideally without having to explicitly declare Maybe aka. Option types (inspired by Hickey's Maybe Not). Could either automatically represent nilable variables as a union between the type and nil (think String?), so that the compiler can do null reference checks at compile-time. Inspired by Crystal. Or, automatically but statically infer and create/augment a function's return type to a nullable reference type indicated by a ? after the typename, whenever there is an unhandled condition that could result in a null value. Or automatically create a NullObject (see: NullObject pattern) of the function's declared return type. Maybe even better: let every type declare and handle their own empty state. If all types are defined in terms of Monoids, then null could be replaced by the identity value (of each Monoid), so that combinations within that type never fail, and never alter the result. NB: would make it hard to express something which was supposed to be there but which is missing, like a missing point on a graph curve, instead of plotting a definite 0. So would need careful consideration to choose this approach. There are some benefits of Nullability as GraphQL has demonstrated, especially in a networked setting. But even then I'd prefer some standardized Error object (with a necessary error message) that the client would have to handle. Because otherwise, how do you know the reason for the failure? Did a service go down, so the resolved field was nulled out, or did a programmer forget to implement something, or was it a bug in the server code..? Maybe nullability will be considered for over-the-network responses, but operations on a machine should be treated as having a greater guarantee to complete, and thus the data type should have stricter rules. In all: “Non-nullability is the sort of thing you want baked into a type system from day one, not something you want to retrofit 12 years later” -- Eric Lippert of C#.

    • Function independence: When a function becomes more capable (by widening its allowed input, e.g. string to Option<string>, or tightening its returned result, e.g. Option<string> to string) it shouldn't break callers (which could otherwise result in cascading refactors, cf.: what color is your function?). A way to solve this would be if the language could automatically perform casting of such arguments/results, and have a type system that could account for that. Inspired by Flow, and counter-inspired by TS. Furthermore, the return type from functions using I/O (like IOMonad in Haskell), should always be augmented/inferred from static analysis.

    • Variant Types for error-handling using return values (like Result<Type, Error>, inspired by Rust), instead of special syntax. Counter-inspired by Golang.
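
For instance, sketched in TypeScript (the dream language would presumably ship such a type in its standard library):

```typescript
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

// Illustrative example: failure is an ordinary return value, not a throw.
function parsePort(raw: string): Result<number, string> {
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 && n < 65536
    ? { ok: true, value: n }
    : { ok: false, error: `invalid port: ${raw}` };
}

const r = parsePort("8080");
if (r.ok) console.log(r.value); // narrowed to number
else console.error(r.error);    // narrowed to string
```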

    • So that you have less avenues to explore when debugging and fewer branches to check when programming, so you can write Confident Code focused on the happy-path.

    • No possibility of failing silently during runtime (due to syntax errors). Counter-inspired by JS.

  • No @decorators. Counter-inspired by Angular and NestJS. Decorators feel like “magic” that make the runtime control-flow unobvious. I don’t like macro-expansions, either. I don’t want to talk to / instruct the compiler. I want the compiler to understand how I write my program. Even if that limits the ways in which I can write my program. Rather than allowing a wide range of styles, and then having to decorate certain styles ad-hoc, to disambiguate them. I don’t like the aesthetics of decorators either.

Execution model

  • Eager evaluation by default (strict, call-by-value). Since it is more straightforward to reason about in most cases, simpler to analyze/monitor/debug, and spreads CPU and memory consumption out more in time, than lazy evaluation (aka. call-by-need, aka. memoized call-by-name) which would pile up work and in worst case could overflow memory at an unexpected time (in any case, the programmer shouldn't have to worry about evaluation strategies, including space usage performance and evaluation stack usage). Inspired by Idris. But it should use the generally more effective (do-less-work) lazy evaluation approach if currying functions, or chaining methods, unless intermediate error-handling or similar requires value realization aka. data materialization (and even here, transducers could potentially alleviate unnecessary value realization). Inspired by Lazy.js. But this is an optimisation that could wait. Concurrent operations across threads/processes should not be lazy. Since you'd want to start exercising the machine(s) as soon as possible. Counter-inspired by Haskell. Although it must be said: I am eager to be convinced that lazy in general is better and that space leakage and the bookkeeping overhead can be minimized. But in general, the programmer shouldn't need to worry about when the machine executes some piece of code. Why wouldn't it be possible for a compiler to figure out at compile-time how and where functions are referenced, and choose eager or lazy evaluation depending on which is more suitable? For sequential chaining of operations on data structures, it could be lazy, and for other operations (potentially further apart in the program, with potentially memory intensive operations in between..) it could choose to be eager (get the work done, so the memory can be free'd asap). Or?

  • Async should be eager, but await should be lazy. Compared with JS. The machine should start async operations immediately, but you shouldn't have to denote when to await them. The result should be automatically awaited (blocking the process) only when it is needed (i.e. lazy).
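
Written out explicitly in TypeScript, the intended semantics would look like this (fetchUser/fetchPosts are hypothetical; the dream language would insert the awaits itself):

```typescript
declare function fetchUser(id: string): Promise<{ name: string }>;
declare function fetchPosts(id: string): Promise<string[]>;

async function profile(id: string) {
  const user = fetchUser(id);   // starts immediately (eager)
  const posts = fetchPosts(id); // starts immediately, runs concurrently
  // Blocks only at the point each result is actually needed (lazy):
  return `${(await user).name}: ${(await posts).length} posts`;
}
```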

  • Async: blocking/sync interface, but non-blocking I/O. Inspired by Golang, and to a lesser extent JS / Node.js too. But should not have to litter code with async/await repeatedly (see: what color is your function? and the problem with function annotations, and async everything, even though async/await can help nudge developers towards a "functional core, imperative shell" architecture..). Could be solved with Async Transparency, inspired by Hyperscript. But hiding the async nature with synchronous seeming abstractions could create a dangerous model-code gap with a potential impedance-mismatch and cause for design errors and bugs (inspired by Simon Brown)... So the language should make some abstractions around async simple (like goroutines in Golang). But also inspired by declarative and easily statically analysable async contexts, made with JSX, like Suspense (which is basically an async if-statement), in React and SolidJS.

    • Alternatively: Async everything? The referential transparency feature, obtained if the language enforces Pure Functions (i.e. no side-effects), could potentially open up an avenue for making everything async by default (and letting the compiler insert await instructions where it figures out functions are not I/O bound and thus can be optimized into direct/synchronous CPU execution instead, without the overhead of asynchronicity).

    • Ease of reasonability is first priority. Inspired by F# (Is your language reasonable? by Scott Wlaschin). I believe it is best afforded by simple and clear abstractions (without model/code impedance mismatch, as made important by failures of ORM's and distributed contexts). The choice of sync interface here as opposed to async, is similar to how the wish for lazy evaluation by default was discarded for eager evaluation by default. One argument by Ryan Dahl of Node.js is that sync by default with explicit async (he mentions goroutines in Go) is a nicer programming model than async everything (like in Node). Because it's easier to think through what the program is doing in one sequential control flow, than jumping into other function calls like in Node.js (if you are using async callbacks). See the "fragments your logic" point below. Reasonability is a top priority, so we cannot make a compromise here.

    • Async: Unbounded Buffered Channels, which simply put messages onto the queue/buffer of the channel (see also "Machines" concept under the "Scalable" feature). Inspired by Golang and Clojure. So that the sender can continue working without having to wait for the receiver to synchronize for sending the message (thus freeing CPU time at the expense of some memory). The channel buffer should ideally be unbounded, as it is hard to predict in advance an accurate buffer limit (and reaching the limit will also mean the end of concurrent operations). So the channel should not block the sender when it's writing to it, but it should block the reader when it's reading from an empty channel (until the channel receives a value). Inspired by Alexey Soshin, and inspired by BlockingQueue in Java. Counter-inspired by Golang and Clojure (whose bounded channels can block the sender when the buffer is full). Maybe there should be some confluence of CSP and the Actor model, since each works best at different abstraction levels, and we ideally want loosely coupled and flexible mechanisms which are equivalent to backpressure (inspired by samuell from HN). Ideally, a receiver shouldn't need to use backpressure to signal a desired reduction in messages, but simply control how fast it is reading from the buffered channel, since it should be in control of its consumption anyway. The language should provide abstractions so that the user doesn't have to worry about these things, and then choose the appropriate model under-the-hood depending on whether it's running on one machine or distributed (see: 'Adaptive runtime'). This idea of 'abstracting away the network' should not be adopted lightly. Since programmers might make mistakes when important distinctions are hidden (i.e. using convenient Ruby chaining with the Rails ORM can quickly lead to inefficiencies like excessive queries). We also have a principle "Make similar things look similar, and different things look different". So unless the abstraction actually abstracts away all important differences, and the location (its local/distributed nature) of the called service is apparent from the usage context (by conventions in naming or otherwise), such abstractions can be dangerous and should be avoided in favor of explicit primitives instead.
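
A minimal sketch of such a channel in TypeScript, under the assumed semantics above (send never blocks; receive blocks only while the buffer is empty):

```typescript
class Channel<T> {
  private buffer: T[] = [];
  private readers: ((value: T) => void)[] = [];

  send(value: T): void {
    const reader = this.readers.shift();
    if (reader) reader(value);    // hand off to a waiting reader
    else this.buffer.push(value); // otherwise enqueue; the sender never waits
  }

  receive(): Promise<T> {
    if (this.buffer.length > 0) return Promise.resolve(this.buffer.shift()!);
    return new Promise((resolve) => this.readers.push(resolve));
  }
}

// The sender proceeds immediately; the reader paces its own consumption.
const ch = new Channel<string>();
ch.send("hello");
ch.receive().then((msg) => console.log(msg)); // "hello"
```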

    • Good means of async control: Being able to cancel tasks/jobs, set timeouts, and easily be able to wait for a task to finish. Channels should be able to contain Tasks that return a Result (which may contain an error), and are cancellable.

      • Rich Hickey has some good arguments against async by default (when implemented with callbacks as in JS), namely that it:

        • fragments your logic (spread out into handlers), instead of keeping it together. Programmer has to deal with multiple contexts at once (complicated), instead of one overarching context (simple).

        • callback handlers perform some action once in the future, but the state they are operating on may have mutated in the meanwhile. So it may give a false confidence in being able to get back to the state as it were when the callback was made. We want to avoid the dreaded Shared Mutable State. Which may be solved with only allowing immutable constructs (like Clojure).

      • On the other hand, having sync by default, and async through Channels:

        • gives the control back immediately (in line with functional composition) instead of functions that effectively evoke side-effects on the real world on the other end (as callback handlers do). In line with our principle: Always give control back to the programmer.

        • channels are generalized pieces of code that can handle many connections (pub/sub).

        • channels afford safe concurrency (thread handling), whilst with callback handlers (unless used in an event-loop system such as JS) the programmer has to ensure safe concurrency (which we don't want).

        • channels afford choice on when to handle an event, whereas with a callback it gets called whenever it gets called (event-loop). Channels work in line with our principle: Always give control back to the programmer.

    • All of the above have implications for reasonability. Needs to be investigated further... Golang's way of handling async seems to be the current gold standard, touted by many bright people, since "Golang has eliminated the distinction between synchronous and asynchronous code" (by letting the programmer code everything in a sync fashion, but doing async I/O under the hood). Golang's principle of "Don't communicate by sharing memory; share memory by communicating." avoids the dreaded Shared Mutable State and lends itself better to simple, safe, and scalable modes of thinking (our core principle): It's hard to think of something, if it has changed the next time you think about it (thus: immutability). Or if thinking about it changes it (manifesting in code the cognitive equivalent of Heisenbugs): Programmers need to be able to reason about a program's state without simultaneously modifying that state (inspired by CQRS).

    • Another, but more radical idea: The programmer shouldn't have to think about when or where the code will run. It should be managed by the language runtime, based on the specified platform. If the program is a local program for one machine then it could be specified to run the work synchronously. If run over multiple machines, it could be specified to use async by default, to delegate work. But then if results don't arrive in time (from a remote machine/CPU-core), it could choose to perform the work itself, but lazily when the result is needed. So there should be some built-in semi-lazy evaluation measure based on CPU monitoring. Also, for the work it decides to do itself, the runtime should decide when to perform it: if the CPU-cores are idle, then it should eagerly execute the work, but if not then it should postpone just enough work so that the CPUs are adequately exercised. Currently, in languages without this nuanced model, the programmer has to make an either-or distinction based on a generalized heuristic of whether or not async or lazy makes sense, and apply it in a fixed fashion. But these assumptions do not necessarily hold for operational scenarios. Ideally, the programmer shouldn't have to think about such operational, low-level matters.

  • Ergonomic to type. Prefer text over special characters like curly brackets (they are hard to tell apart from parentheses in JS). No littering of parentheses. Inspired by Ruby. Counter-inspired by JavaScript, Lisp, and JSON.

Safety & correctness

  • Well-tested.

  • Memory safe, ergonomic, and fast: memory should be safely and implicitly handled by the language, without a runtime GC.

  • No Garbage Collector (GC), but also no garbage. Garbage is a problem that ought to be solved in a language's syntactical and semantical design (through shepherding the programmer's thoughts, and giving feedback at compile-time or even better: immediately at development-time). It could be solved by having Deterministic Object lifetimes, and Ownership tracking (affinity type system). Inspired by Rust and Carp. Alternatively, the language could take inspiration from concatenative programming languages, which don't generate garbage by design, have other desirable properties, and use the stack heavily. Inspired by Kitten. Garbage is a symptom of memorizing (keeping track of loose ends), which is tedious for the programmer, as well as the runtime (and potentially the compiler). Garbage comes when you have to clean up something you memorized (something you allocated memory for, but somehow stopped using further on). Concatenative programming is closely related to FP through continuation-passing style (CPS) and tail-call optimization (TCO). The language/compiler should utilize CPS where possible, so as to reduce/optimize usage of the stack. It should be able to store a continuation (equivalent to persisting the stack to RAM/Disk) so that stateless programs (like a web server) could be restarted near a point of interruption/error (when the client makes the request again), to simulate statefulness. Inspired by Scheme.

  • Memory-management & safety. The language should be low-level enough that it enables creation of various memory management strategies, and afford systems-level control of hardware resources: pointers to memory addresses (for mutation purposes), memory allocators, etc. Inspired by C and Rust. But each Platform (see 'Configurable language') should have a general and default memory management strategy, so that application code written for a given platform can avoid explicitly managing memory. Some variations within the general memory management strategy should be configurable for each Platform. A 'Platform' should specify an implementation of the memory management strategy. Inspired by Roc. It should for example enable choosing Arena-allocation strategy for HTTP request-response cycles, which would be optimal there. This type of performance enhancing platform config ties back into the point 'Configurable language' (elsewhere in this article). The language could have Automatic Reference Counting, but with optimized throughput by Static Reference Counting to perform reference counting during compile time while also canceling out redundant reference checks (update coalescing). Inspired by Roc. Or maybe a Borrow Checker, for memory-safety. Inspired by Rust. But ideally, Ownership and Borrowing should be implicitly handled by the programming language, so the programmer wouldn't have to think about low-level concerns such as memory management (e.g. what goes on the stack vs. the heap) or various kinds of references. Inspired by smaller Rust. To avoid conceptual overhead of manual memory management (as with explicit borrowing semantics), the language should perhaps use or take inspiration from Koka's Perceus Optimized Reference Counting. Koka apparently allows even more precise reference counting (see sect: 2.2) than Rust. Inspired by Koka. An idea that might be worth exploring is to use Arena allocation (aka. Region-based memory management) as the default memory management strategy for the language, since it has the lowest overhead possible. Coupled with stack based semantics and potentially also linear types it could make for a very efficient language.

  • Secure from the start. Secure runtime. Inspired by Deno. Safety has to be a built-in design-goal from the start, it cannot be added on later. As evidenced by the justification of existence of Deno (Node was unsafe), and Rust (C++ was unsafe). Also, see: memory safe.

  • Crash-safe. Can crash at any time and resume computation at the exact same spot when restarted. Inspired by Erlang.

Data types, type system

  • Gradually typed, as types can add boilerplate, create unnecessary friction, obstruct a programmer's tinkering flow-state, and create noise in the code. Counter-inspired by TypeScript, and inspired by Elm and Jai. As many types as possible should be inferred. Inspired by TypeScript but even more inspired by OCaml and ReScript.

    • Development speed vs. Robustness: When prototyping you don't want to worry about robustness that much. You want to develop as fast as possible, and just want to see if your assumptions hold under some particular circumstance (not every possible one), and if the program crashes that's not too consequential. In production, robustness and handling of all edge cases obviously becomes more important.

    • Compiling for a Platform (a distributed system, for example), should give compiler errors if robustness is not ensured (static types, async processing, error handling, possible loading states, how to deal with missing/null values from third party API's: accept or reject the request, partly or in full etc.). Whereas when compiling for a local Platform (single-machine), some of the errors/concerns of a distributed system can be exempted: one can assume calls return a response synchronously and only once (the fallacies of distributed computing, idempotency, overhead of serializing data, potentially independent API versions, etc. don't need to be accounted for to the same degree). Therefore, a lot of the developer's work can be alleviated, and the compiler can be less strict, to allow for greater development speed. If compiling the program for a distributed Platform, the developer would only then need to account for the compiler errors that ensure that the program will be robust in such a setting.
  • No runtime type errors. Inspired by Elm (and Haskell). See 'Error Handling & Non-Nullability'.

  • Union Types: Types should be associative/commutative/composable/symmetric (i.e. A|B should equal B|A), inspired by Dotty/Scala3, and the 'Maybe Not' talk by Rich Hickey.

  • Types should be enforced statically at program exit boundaries (so external libraries or outgoing I/O are ensured existing typings).

  • Structural subtyping (inspired by TypeScript, OCaml), instead of nominally typed (counter-inspired by Java and Haskell). Since it is the closest you'll get to duck-typing within a statically typed language. But it should also have support for nominal types at the few cases where that might be beneficial (i.e. opaque types, not possible with only structural subtyping). Inspired by ReScript, and counter-inspired by TypeScript.
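
Roughly, in TypeScript terms (the branding trick is a common convention to emulate opaque/nominal types; a language like ReScript supports them directly):

```typescript
// Structural: anything with the right shape is accepted — no declared
// relationship between types is needed.
type Point = { x: number; y: number };
const draw = (p: Point) => console.log(p.x, p.y);
draw({ x: 1, y: 2 }); // OK: the shape matches

// Nominal/opaque where beneficial, via branding:
type UserId = string & { readonly __brand: "UserId" };
const toUserId = (raw: string) => raw as UserId;
const lookup = (id: UserId) => id;

// lookup("plain string"); // rejected: a bare string is not a UserId
lookup(toUserId("u-42"));   // OK
```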

  • Strongly typed (checked at compile time), not weakly typed, since implicit type coercion (at runtime) can be unpredictable, and variables that can potentially change their type at runtime are madness. Inspired by TypeScript and ReScript. Counter-inspired by JavaScript.

  • No runtime polymorphism (aka. 'ad-hoc polymorphism', 'dynamic dispatch' or 'monkey-patching'). Counter-inspired by Python, Ruby and JavaScript. Also, even compile-time function/operator overloading should NOT be possible. E.g. + can't be used both to sum integers and join strings, so you'd have to use more operators like ++ or similar to join strings; explicit is better than implicit, and even better than ambiguous/context-sensitive. Without these features we'd gain the more important ability to fully infer static types for programs, without having to write type annotations, and compiling could get really really fast. Inspired by OCaml (and by extension also ReScript). Counter-inspired by Clojure, Java, Ruby.

  • Generics / Type parameters / Parametric polymorphism. Inspired by ReScript and OCaml. Counter-inspired by how C++ and Java handles generics. Basically, to make generics sane, coherent, and pragmatic, without nudging developers into going too much overboard with generic abstractions (like generics induced function coloring). So maybe some kind of limitations to generics, like Rust. Since I'd prefer a little duplication instead of a complex abstraction. Due to reason-ability, time to onboard new developers to a project, and the roughly 10x more time programmers spend reading than writing code, as the saying goes.

  • Type inference, sound and fully decidable and with 100% coverage. Inspired by OCaml and Roc. To not have to declare types everywhere. For increased readability and convenience (though not essential, cf. popularity of Rust). But local type inference inside the body of functions is what's most important (inspired by Scala), since functions' input/output types should always be declared, for documentation purposes. But they could probably be generated after the prototyping phase / exploratory coding is done and you want to ossify the code. In that case, they should not be inline (like in TS), but next to the function definition (like in Elm).

  • Pragmatic type bindings for external libraries: should allow you to write type bindings that mirror how you will use the library in your own project, instead of getting stuck at generalizing potentially complex types. Inspired by ReScript.

  • Typed Holes / Meta Variables. Inspired by Idris. Since it "allows computing with incomplete programs", and "allow us to inspect the local typing context from where the hole was placed. This allows us to see what information is directly available when looking to fill the hole". I.e. the compiler provides hints about its attempt to infer the type of the missing value (aka. hole). As opposed to requiring that a program is fully typed, this can afford a Live programming environment that gives feedback to the programmer, while editing, about how the program would be executed.

  • No type classes or highly abstract type level programming. (See also 'First-class modules', since type classes are considered incompatible with that feature. See also no 'ad-hoc polymorphism'.) When your type system becomes Turing-complete and you start programming in the type system (aka. wrangling type-level prolog), or even unit testing the type system(!), then I consider it a language design smell. See statics-dynamics biformity and "types are a (distracting) puzzle" for why.

    • A potential way(?) out of the statics-dynamics biformity could perhaps be to declare data types in terms of concrete mock/default data (concretization by adding information/exemplification to the code so that dynamics may be treated the same as statics), and then infer the data type from there. F.ex. instead of const name: string = api.getName() you'd have const name: “John Doe” = api.getName(), where the string data type is inferred from "John Doe". That way you could potentially run some functions at compile time using the default values. Could this end the statics-dynamics biformity? IDE tooling could show the programmer real default/mock data propagating through the app during development (Bret Victor style), like it shows the type inference today. Maybe we can compromise with Rich Hickey's aversion to static types: instead of having one or more optional tests of the code through some extra code (like Clojure's spec), we could have those tests enforced declaratively throughout the code, akin to static types, but with actual values (like exemplified above). Functions could more easily be run at compile-time (since they'd have default/mock values, not merely a type specification), without needing to create a separate (potentially turing-complete..) additional language (like a static type system does) to declare the types of data those functions would take. Nor would the programmer have to reason about data types in the abstract, and how they can combine (i.e. type theory). In this, we are more inspired by Clojure's focus on raw data over classes or nominal types of data. The structural type system would presumably work with the types inferred from the mock data specified, and allow a certain flexibility, while also warning the programmer about inconsistent usage. See also point on 'Syntax matters (and homoiconicity is a plus)', for how this solution aids in combining homoiconicity with static typing.
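
TypeScript can already gesture at this with `typeof` over an example value, which may help make the idea concrete (`api.getUser` is hypothetical):

```typescript
// The "type declaration" doubles as mock/default data:
const exampleUser = { name: "John Doe", age: 42 };
type User = typeof exampleUser; // inferred: { name: string; age: number }

declare const api: { getUser: () => User };

// During development, tooling could propagate exampleUser through the app
// (Bret Victor style), instead of showing only the inferred type.
const user: User = api.getUser();
```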

Maintainability, refactorability, reuse

  • Be general purpose enough to at least write scripts and CLIs, but also web servers/clients.

  • No super-powerful tools which may hurt you or others in the long run. Counter-inspired by meta-programming in Ruby.

  • The Expression Problem: How can we “define a datatype by cases, where one can add new cases [aka. variants] to the datatype and new functions over the datatype, without recompiling [or refactoring] existing code, and while retaining static type safety (e.g., no casts).” (orig. definition)? We want to be able to both “introduce new functions acting on existing data types” and “introduce new cases/variants of existing data types and be able to act on them by using existing functions”. The goal is to achieve this extensibility in both dimensions (data types and functions) simultaneously, while retaining static type safety, and without requiring modification (recompilation or refactoring) of existing code (since it may be potentially unknown/uncontrolled third party source code that we might not have access to). (In object-oriented programming (OOP), it is relatively easy to add new data variants by creating new subclasses, but adding new operations typically requires modifying existing classes, violating the open/closed principle. Conversely, in functional programming (FP), it is easy to add new operations by defining new functions, but adding new data variants requires modifying existing functions that pattern match on the data type.) Solution: Overload functions statically (compile-time) based on function name and parameter labels (aka. keyword arguments). No dynamic dispatch (runtime polymorphism), no multiple dispatch, and no dispatch based on argument types (since it's not fully disambiguating, if overloaded functions use the same data types). Disallow dynamic arguments (aka. rest parameters / arguments object), since they complicate static overloading of functions. (Types must be resolved at program boundaries, to ensure determinism within the program.) To solve the first part of the expression problem: Use opaque types and type extensions. Inspired by Gleam and inspired by Clojure's extend-type. To solve the last part: Use function composition and higher-order functions. Inspired by Gleam. Basically create and use a new function that delegates to a new function for the new data type variant, and to the existing function for the rest (sketched below). The language should attempt to keep data (types) and functions separate/orthogonal, as far as is both possible and pragmatic. Inspired by Don Syme’s philosophy for F#, and Clojure's open methods / multimethods. Also inspired by Rust traits (which are also explicit and local; enabling safe/localized monkey-patching...). But ideally, maybe the expression problem is premised on the wrong thing, since: We don't really want different 'variants' of data types. Because it introduces variability, which represents entropy/unpredictability, and we want to "get to concrete as fast as possible" (inspired by Christopher Alexander, via Basecamp), which will also help the compiler (see 'concretization').
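
A small TypeScript sketch of the delegation idea (names are illustrative): the existing type and function stay untouched, while a new variant and a new wrapper function extend both dimensions:

```typescript
// Existing, possibly third-party code we cannot modify:
type Circle = { kind: "circle"; r: number };
type Square = { kind: "square"; side: number };
type Shape = Circle | Square;
const area = (s: Shape): number =>
  s.kind === "circle" ? Math.PI * s.r ** 2 : s.side ** 2;

// New variant, added without recompiling or refactoring the code above:
type Triangle = { kind: "triangle"; base: number; height: number };
type ExtShape = Shape | Triangle;

// New function: handles the new case itself, delegates the rest.
const extArea = (s: ExtShape): number =>
  s.kind === "triangle" ? (s.base * s.height) / 2 : area(s);
```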

  • Reactive and Streamable. Inspired by Functional Reactive Programming, and Elm, and The Reactive Manifesto. Though the latter is geared at distributed systems, it could also be a model for local computation (cf. Actor model, and Akka). The programming language should make default and implicit the features of reactivity and streaming, as opposed to preloading and batch processing. (Reactive Streaming Data: Asynchronous non-blocking stream processing with backpressure.) The programmer shouldn't have to think of reactivity and streaming explicitly. Preloading and batch processing should just be a special case (stream 1 value/batch instead of several). Counter-inspired by the Promise API in JavaScript. Going from rather static data structures like arrays, into dynamic ones, such as a stream, shouldn't need a code re-write. They're both sequences of values, after all, and WHEN the data arrives should not need to be a concern to the algorithms implemented on that data (see 'The Hairy Vision' section). See the related concept of functional transducers, which could afford such operations (effectively eliminating unnecessary data materialization during program execution). The language should assume that values can arrive over time, and that processing values in an end-to-end pipeline is better done one value at a time, so that values start arriving at the other end sooner rather than later. Inspired by Lean manufacturing principles (WIP-limit is set to 1 when on a single CPU, or potentially higher WIP limit with more CPU's, effectively parallel processing a fast arriving stream of incoming values, f.ex. within a web service request as opposed to only parallel processing between requests) vs. Waterfall (data materialization and batch-processing) processing pipelines. Also inspired by Clojure's seq (sequence). But how to deal with streams rather than arrays, when arrays are more optimal/predictable to deal with in terms of memory management? The compiler could possibly deduce the size of an array from its usage in the code, instead of the programmer having to declare it up front, which would then need to incidentally/manually correspond with its later usage. In short: Why not go the other way around than what's normal, and treat the static sizing of an array as an optimization step? This optimization could be done for the code the compiler has access to, but for values arriving in streams from external third party sources, it could obviously not be performed. If you are fundamentally dealing with streams, rather than arrays, it may pave the ground for easy Dataflow-programming, using FP. When new values arrive, they follow the same functional pipeline as those before. Arrays would simply be finite streams, and as far as possible they would be streamed into a continuous block of memory. The bounds of the memory block could be deduced by the compiler, or alternatively the programmer can set aside an initial decent minimal chunk of memory upon declaration. Similar to specifying size upon array creation in various languages today, but with the possibility to overflow into free store memory. (Another interesting idea here is to cater to stream matrices instead of just array-like sequences; maybe as streams within streams).
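
Sketched with (async) iterables in TypeScript: one pipeline serves both a plain array (a finite stream) and values arriving over time, with no rewrite (names are illustrative):

```typescript
async function* doubled(source: Iterable<number> | AsyncIterable<number>) {
  for await (const n of source) yield n * 2; // one value at a time
}

async function main() {
  // A finite "stream": just an array.
  for await (const n of doubled([1, 2, 3])) console.log(n); // 2, 4, 6

  // An unbounded stream arriving over time: same pipeline, no rewrite.
  async function* ticks() {
    for (let i = 0; ; i++) {
      await new Promise((r) => setTimeout(r, 1000));
      yield i;
    }
  }
  for await (const n of doubled(ticks())) console.log(n); // 0, 2, 4, ...
}

main();
```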

  • Content-addressable code: names of functions are simply a uniquely identifiable hash of their contents. The name (and the type) is only materialized in a single place, and stored alongside the AST in the codebase. Avoids renaming leading to breaking third-parties, and avoids defensively supporting and deprecating several versions of functions. Avoids codebase-wide text-manipulation, eliminates builds and dependency conflicts, robustly supports dynamic code deployment. Code would also need to be stored immutably and append-only for this to work. All inspired by Unison.
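
To illustrate the mechanics (a loose sketch of the Unison idea, not its actual implementation): the identity of a definition is a hash of its contents, and the human-readable name is just metadata stored alongside:

```typescript
import { createHash } from "node:crypto";

// An illustrative serialized AST of some function definition:
const definition = "(fn [x] (* x 2))";

// The function's identity is the hash of its contents...
const id = createHash("sha256").update(definition).digest("hex");

// ...while its name lives in a separate, mutable mapping.
const names = new Map([[id, "double"]]);

// Renaming only updates the mapping; call sites reference the hash,
// so nothing downstream breaks.
names.set(id, "times_two");
```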

  • Backward- and forward-compatible. Should be able to not worry about (or make poor future tradeoffs due to) backward-compatibility. (Counter-inspired by ECMAScript and C++). To make the language optimally and freely evolvable, and worriless to upgrade. Backward-compatibility and Forwards-compatibility: Code in one language version should be transformable (in a legible way) to another version (both ways; backward and forward). In the case of lossy changes, the old version should be stored (so it is revertible), and in the case of a "gainy" change, the compiler should notify the programmer where in the code it is now missing explicit information (based on identifying locations in the code where the old language constructs are used). There should also be solutions to either: have simple CLI tools to automatically refactor old code to new language versions, to always stay optimally adaptable, without having breaking changes. Maybe using some form of built-in self-to-self transpilation. Will likely need to be able to treat code-as-data. Might need compile-time macros. Or a solution could be to: with every breaking language revision, include an incremental language adapter, which would allow upgrading whilst ensuring backward compatibility. Could be solved with Mechanical Source Transformation, enabled by gofmt, so developers can use gofix to automatically rewrite programs that use old APIs to use newer ones. Which is crucial in managing breaking changes. A breaking (aka. widely deviating) change should, in effect, not actually break anything (that current languages and systems do break things is considered a "pretty costly" design flaw). "Successful long-lived open systems owe their success to building decades-long micro-communities around extensions/plugins", and to enable that requires great care for backwards-compatibility, as Steve Yegge pointed out. But tool-based upgrades, as mentioned, are better than keeping old APIs around forever. This philosophy is also applied by Carbon (C++ successor language). See 11:43 @ https://youtu.be/omrY53kbVoA

  • Scalable: From single core to multiple core CPUs, and from one to a distributed set of machines. Without needing refactors (or only through swapping out some keywords/libraries). Inspired by Smalltalk, FP principles, and Venkat Subramaniam's talk. This is called Location Transparency, and "Embracing this fact means that there is no conceptual difference between scaling vertically on multicore or horizontally on the cluster". Inspired by Alan Kay's vision of computing, and the purpose of the Actor Model, utilized in Erlang/Elixir and Pony's actors (w/ async functions). But rather than using state-driven Actors, I'd rather want it implemented with stateless "Machines" (a concept I made up), which would simply give stateless functions a call queue each. Inspired by Smalltalk, but stateless. They call each other by sending Messages (containing the parameters) to the other function's call queue, taking into consideration error-handling and the unreliability of distributed computing. (The caller chooses if their call should be sync/blocking or async/non-blocking, since sync/async aka. blocking/non-blocking is not a feature of the called function, but of the call itself. Async is not an adjective but an adverbial! Counter-inspired by JS, and inspired by the go keyword of Golang on function invocation/callsite.) We name such functions "Machines". Each of them is in fact a mini-computer, or a computer-within-the-computer, if you will, but unlike Actors they don't have inherent state (which rather ought to be stored and managed by a DB, or in-memory DB). Such Machines should be able to be moved to distributed systems without rewriting the code. Inspired by Alan Kay and Actor Model systems (Akka), Big-bang FP and Syndicated Actors which are based on the converse of Metcalfe's idea: "interprocess communication might be networking communication", that is, "primitives we choose for our programming languages and programming models could be drawn more directly from our experience of computer networking". The Machine concept is also inspired by languages where actors are first-class citizens, like in Pony. A single Machine could work as a minimal microservice, or better yet, a Cloud Function, but likely you'd want multiple endpoints which expose a Machine each.
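
A toy single-process sketch of the "Machine" concept in TypeScript (the concept is the article's own; this just illustrates a stateless function fronted by a call queue):

```typescript
type Message<In, Out> = { params: In; reply: (result: Out) => void };

class Machine<In, Out> {
  private queue: Message<In, Out>[] = [];
  private running = false;
  constructor(private fn: (params: In) => Out | Promise<Out>) {}

  // Non-blocking call: enqueue a message, get a Promise of the reply.
  call(params: In): Promise<Out> {
    return new Promise((reply) => {
      this.queue.push({ params, reply });
      void this.drain();
    });
  }

  private async drain(): Promise<void> {
    if (this.running) return;
    this.running = true;
    while (this.queue.length > 0) {
      const { params, reply } = this.queue.shift()!;
      reply(await this.fn(params)); // stateless: nothing survives the call
    }
    this.running = false;
  }
}

// The same interface could front a local function or a remote endpoint.
const greeter = new Machine((name: string) => `Hello, ${name}!`);
greeter.call("Ada").then(console.log); // "Hello, Ada!"
```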

    • No global variables: Because global variables are a bad practice, and they don't translate well to a distributed system setting, so scaling up a code-base from single machine to multi-machine when using global variables can't be done without a rewrite. Instead, a function can call another function by simply sending a message. Inspired by Smalltalk. See aforementioned "Machine" concept. Messages can be passed through chains of function calls by piping and/or ...rest parameters.

    • No variable shadowing within functions, but the inside of a function may shadow the outside. Inspired by C#. Since variable shadowing is a reason to have keywords such as let, then not allowing variable shadowing could afford the opportunity to avoid having such keywords as let or const, for minimalism (since all values should be immutable anyway). So functions should allow shadowing external variables to the function, since it ensures the writer/reader of the function doesn't need to know about, or be constrained by, potential name collisions with external/global variables (see: function independence). In general, the language should restrict global variables to module namespaces. When global variables are used, they should be accessed locally by calling a global function. Inspired by Python's use of the global keyword. Or, globals should be accessed directly using a @@ prefix to their name, like Ruby does for static variables. But in all: There shouldn't be various shadowing rules for various kinds of scope, but one simple shadowing rule for functions (and functions should be the general scope of choice).

    • Facilitate and nudge developers towards creating Functional Core, Imperative Shell architectures (inspired by Bernhardt at 31:56 in his Boundaries talk), to preserve the purity of functions as far as possible, while also containing side-effects:

      • Configurable language: Platform framework/config that encapsulates all I/O primitives, which introduces a separation between trusting a particular platform and trusting the language runtime. Inspired by Roc. This could make programs more portable, since the programmer does not directly hardcode with platform I/O primitives afforded by the language runtime, but writes code via the Platform API. It could even make certain features of the language only available on certain platforms, i.e. the Browser platform doesn't have access to low level memory management. So that the language can be as restrictive as possible for the environment, ensuring that code is written idiomatically for the target platform / environment, since a restricted language has value: for a given platform the programmer would encounter less diversity in the language and thus have less to learn. This strikes a balance between on one hand providing sharp knives as global tools programmers can apply anywhere (i.e. potential footguns, leading to The Pit of Despair like in C++), and on the other hand avoiding being so restrictive that programmers can't talk/write/think about what they want/need to for their given environment. The language itself should be massively configurable: It is not reasonable to assume that the language designers will have accounted for all possible use cases (various memory management strategies etc.). So the language primitives/keywords should be able to be given different underlying effects (e.g. stack vs. heap allocation) based on which platform or use-case is specified (without having to be explicit about every such effect in every environment). But the effects should be inconsequential for the reasonability of the code. Meaning that they should be at the bare-metal performance level, not at the level where operators are overloaded to do something different, cf. our principle that things that are different (i.e. have different effects at the language level the programmer is operating at) should look different. The platform config will depend on which implementation makes most sense for that platform (or use-case?) (i.e. browser webapp vs. systems development, vs. game development, potentially). The language should be configurable by libraries, that will define how it works, and can extend the core to platforms where the programmer needs to think about specific matters to that platform. Inspired by Clojure. The same program specification should be able to have different runtime characteristics on different platforms, depending on the platform configuration. This could be enabled by the programming language concerning itself with modeling causal relationships, instead of place-oriented-programming.

      • Encapsulated I/O, so functions can avoid having side-effects. Inspired by Haskell.

        • Alternative #1: Algebraic Effects for I/O, so that side-effects can be contained in a given context. Algebraic Effects are also a powerful general concept that could help with concurrency, async/await, generators, backtracking, etc. Inspired by OCaml.
        • Alternative #2: use an IO action of an IO type (inaccurately named "IO Monad" at 30:44 in the Boundaries talk), transparently (without actually having to deal with the concept of a Monad), where you effectively construct a sequence of I/O operations to be executed later (see the sketch after this list of alternatives). Inspired by Haskell's separation between pure functional code and code that produces external effects. Compare with the concept of a functional core, imperative shell architecture (at 31:56 in Bernhardt's Boundaries talk and the 'functional core imperative shell' code talk), a concept which was inspired by Haskell. Something like this is needed because the Mailbox is stateful (it is constructive/destructive, like a queue), and I/O messaging would be a side-effect. The Machine/Mailbox is inspired by the Actor Model from Erlang. Ideally, since all I/O is wrapped, it should be possible to turn the execution of IO actions on/off based on some initial config. This could be useful for testing. You could even do a sample run to collect data, which you could snapshot to use as mock data for test runs.
        • Alternative #3: Potentially by syntactic rules/enforcement: you have functions, which always return a value (and have no side-effects, i.e. are always pure), and procedures, which never return a value, but imperatively perform side-effects (e.g. akin to a function executing side-effects and simply returning void). Procedures can contain functions (since functions are pure they are predictable), but functions cannot contain procedures (since that would make them impure, and side-effects are unpredictable and would be hidden/surprising). Inspired by Algol and Pascal. This way you could automatically enforce functional core, imperative shell architectures, as you'd have to do all side-effects inside procedures. Programs would be a nested hierarchy of procedures, with functions interspersed at any level, as needed. Functions could contain their own pure functional hierarchies. Importantly, I/O (or other side-effects) would not be allowed from within functions, only from within procedures (even for console.log, so function hierarchies would have to be debugged through unit-testing individual sub-functions, inspecting their result). NB: Could lead to the colored functions problem, since if procedures are more powerful/permissive than functions, then programmers may just default to using procedures everywhere, and programs would simply be nested procedures. But at least it would be more predictable if you could easily/visually tell functions and procedures apart. Unless people used procedures where they could/should have used functions... but that should not be a problem, since procedures can't return anything (only void), so when you want a result back you need to use a function. Maybe this could de-complect functions and procedures, which have been too closely intertwined through historical circumstances (maybe it was too tempting to have the console.log ability from inside functions?). Maybe colored functions aren't so bad, if they can help nudge developers towards a "functional core, imperative shell" architecture, effectively keeping procedures and functions separate.
        • Alternative #4: Use Uniqueness Type, which allows mutability and pass-by-value while also preserving the crucial referential transparency (since side-effects are ok in a pure language as long as variables are never used more than once). Inspired by Clean and Idris. Possibly use Simplified Uniqueness Typing, inspired by Morrow.
        • Alternative #5: Simply be able to turn side-effects, like external output operations, on/off. If all output operations are done through an IO module in the standard library, it could afford a simple "off" switch for testing, preventing side-effects from acting on the outside world during test runs. The challenge, however, is side-causes (aka hidden inputs). The language could have the IO module require a default/fallback parameter to be set for external input operations: IO.readFromFile(fileName, "Default file content fallback."). That fallback would be used during testing (the benefit being that the mocks would already be present). Another problem with side-effects has to do with the ecosystem (especially interop with other languages): if you use a 3rd party package, how do you know it won't leak data to a 3rd party server during runtime? This ought to be solved by a sandboxed runtime environment (inspired by Deno), which should automatically log any attempts at I/O access not explicitly made through your own application code (using the language's IO module). Inspired and counter-inspired by Elm.
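
To make Alternative #2 concrete: below is a thunk-based sketch in TypeScript of an IO type that only describes effects, leaving execution to the imperative shell (an illustration of the idea, not how Haskell implements IO; all names are hypothetical):

```ts
// An IO<A> merely *describes* an effect; nothing happens until an
// interpreter runs it at the program's edge.
type IO<A> = () => A;

const pure = <A>(value: A): IO<A> => () => value;
const chain = <A, B>(io: IO<A>, f: (a: A) => IO<B>): IO<B> =>
  () => f(io())();

// Effect descriptions (still pure: constructing them runs nothing).
const readLine: IO<string> = pure("42"); // stub standing in for real input
const printLine = (s: string): IO<void> => () => console.log(s);

// A pure value assembled from smaller descriptions.
const program: IO<void> = chain(readLine, (answer) =>
  printLine(`The answer is ${answer}`)
);

// The imperative shell executes the description exactly once, at the
// edge — and a test harness could substitute a no-op interpreter here.
program();
```
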
      • Ideally, for performance, when the code is compiled to be run on a single machine, the compiler should be able to optimise away the Mailboxes, so that Machines can be turned into (simpler and faster) synchronously executed functions.

Performance

  • Zero-cost abstractions. Inspired by Rust.

    • A single string type. Counter-inspired by Rust. The compiler should check whether the string is modified; if it is, it should use a dynamic/variable-length string, otherwise it should optimize by using a static fixed-size string. The programmer should not have to worry about this optimization difference. Be as generic as possible, but make static optimisations for different contexts.
    • The compiler should optimize performance by making calls synchronous on a local Platform (single machine), that would otherwise be made async on a networked/distributed Platform.
    • Compile-time macros could presumably help with this.
  • No single-threaded event loop that can block the main thread. Counter-inspired by JS. Instead, inspired by Golang, create a set of threads in the core runtime and run hundreds of goroutines on them by multiplexing. But unlike Golang, the runtime should not only create a goroutine per web request, but also create goroutines for every parallelizable part of the program, so that even a single web request / program can be divided and executed in parallel (if it needs to be, due to being computationally expensive). The adaptive runtime and/or user config should decide the amount of access to the core threads, and the allowed run time for each program (so a heavy program doesn't starve other web requests).
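
The multiplexing idea, approximated in TypeScript (true M:N scheduling needs runtime support; promises on a single JS thread only imitate it, and all names here are hypothetical):

```ts
// Many lightweight tasks are multiplexed onto a small, fixed pool of
// workers that drain a shared queue — Golang-style M:N scheduling.
async function runPool(tasks: (() => Promise<void>)[], poolSize: number) {
  const queue = [...tasks];
  const worker = async () => {
    let task: (() => Promise<void>) | undefined;
    while ((task = queue.shift()) !== undefined) {
      await task(); // each await is a yield point where tasks interleave
    }
  };
  await Promise.all(Array.from({ length: poolSize }, worker));
}

// 200 "goroutines" served by 4 workers:
runPool(
  Array.from({ length: 200 }, (_, i) => async () => {
    await new Promise((resolve) => setTimeout(resolve, 10)); // simulated I/O
    console.log(`task ${i} done`);
  }),
  4
);
```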

  • Async, Event-driven, Streamable, Parallelizable, all by default. Which would effectively take these concerns out of the equation, freeing the application programmer from thinking of implementation details (like how the computer should work efficiently, which is the language/framework designer's job), to restructuring or rephrasing the problem into sub-problems, and making those explicit, so the computer can efficiently parallelize computation at the right points.

    • The largest performance gains are often not in working faster, but in eliminating/reducing the work to be done. Since "constraints liberate, liberties constrain", as Runar Bjarnason said: "As programmers, we tend to think of expressive power of a language or library as an unmitigated good. In this talk I want to show the contrary; that restraint and precision are usually better than power and flexibility. A constraint on component design leads to freedom and power when putting those components together into systems. What’s more, this feature is built into the very nature of language and reasoning." When all functions and data structures in the language support these features, you'd gain unrivaled program composability (and scalability: from single machine to distributed).
    • Imagine a program taking in an array, item by item, streaming it into parallel pipelines (which transform the data according to a sequence of transducers), and simultaneously, while the pipelines are processing, they stream back the response, item by item, into either the array itself, or a new data structure, or some output destination. Streams breaking apart, parallel processing, and weaving together again, multiple times at different points throughout the program, and every step funneling through item-by-item - Lean Manufacturing style, according to a pre-designated WIP-limit (a limit of work allowed to be in process at any given time) - not having to wait for the full processing completion of the previous items, and never having to fully (re-)materialize the data on each step in the pipelines.
    • Syntax-wise you'd write the pipelines data-first, left-to-right, i.e. in the chronological way the data would flow. So it will be easy to follow and trace the execution step-by-step from beginning to end as you read along. But the compiler should read the functions from right-to-left first (alternatively read left-to-right but, upon encountering a function, stop data processing and start collecting all subsequent functions), so that it can compile the list of functions into powerful composed (partial) functional expressions that can operate on data without materializing the data at every step when executing left-to-right (inspired by 'transducers' and by the concatenative Om language; see the transducer sketch after this list). (The IDE should give notice about which data is potentially missing as an input to a step in the functional pipeline, either when you try to call a function at a step, or based on a partial function resulting from a constructed pipeline that doesn't (yet) include all input data.) This is the exact inverse of functional languages that read function compositions left-to-right first but then execute with data right-to-left. Since I think it's vitally important to enable the programmer to trace the flow of execution by simply reading the code left-to-right (re: Dijkstra's impetus to "spreading out the process in text space"). This may open up a potential for optimal memory management: if the compiler reads the functions from right-to-left first (i.e. from the end of the program back to the beginning), then when it first encounters an input variable to a function it knows it is the last use of the data referenced by that input variable (see object lifetimes). So one idea is that it can automatically insert an instruction to free that memory after the function has given its output. Effectively avoiding compile-time reference counting, runtime reference counting, and garbage collection alike.
    • Locality of the data and the code should be handled by the adaptive runtime (see point on 'adaptive runtime'). Meaning that the runtime should decide whether or not to execute something locally or across the network. Maybe either depending on the given size of the data, if determinable upon compilation, or by runtime analysis of the data size, or runtime adaptations based on processing time (akin to how JIT compilation optimizes program execution during runtime). Alternatively it could be managed by configuring the adaptive runtime more statically, into various modes: local-only, or full-network-distribution of the data and code. To take advantage of the environment and available hardware the programmer should know about. But ideally, the adaptive runtime should take care of all of this, and be able to adapt to changing hardware and network environments, to achieve the optimal performance at all times without reconfiguration.
    • Error handling (restarting failing processes, inspired by Erlang) and potential hardware stragglers that fall behind in processing, should be taken care of by the adaptive runtime, for optimizing the speed of getting the end output from the weaved pipelines. It could choose to execute some parallel pipelines locally instead of across the network, or reuse hardware that has become idle, if the stragglers take too long. Similarly with deciding what parts of the data it should keep in RAM vs. on-disk, and how often it should persist to disk (according to some fault-tolerance and persistence configuration/policy).
    • Mutation: Data-flow languages are typically based on immutably consuming and generating tokens, which generates overhead (processing and re-materializing a large token store / large arrays of data), especially for applications with a low degree of parallelism. So the language should allow mutating existing data structures: you could stream the result of data (passed through a pipeline of composed functions) back into the origin data structure, value-by-value. Just as if you were imperatively performing multiple operations on each value in an array for each step in a single pass / for-loop over the array. Even when functionally oriented, it has to be as efficient as that to be performant. You don't necessarily need immutability everywhere, as long as you can restrict mutation. Inspired by Rust's ownership-borrowing system. The language should be similar to data-flow languages in purpose and use, but similar to control-flow languages in implementation: streaming instructions over static data is faster and more scalable than always streaming data (token-by-token) over static instructions/operands (like data-flow languages do). The number of instructions is generally far smaller than the number of data items, and it's always a benefit if you can minimize the shuffling around of data. However, when functions are called over a network boundary (RPC), the data must be immutably materialized (and serialized) at those function boundaries and transmitted across the network (instead of functions merely mutating the data, as done locally). But crucially, the same flow-oriented functional pipeline should be able to be used either locally or over-the-wire without any changes to the code (other than imports / platform config).
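
A tiny transducer-style sketch in TypeScript of the idea referenced above: the pipeline is composed into one function first, then each item flows through every step in a single pass, with no intermediate arrays materialized (inspired by Clojure's transducers; all names are hypothetical):

```ts
type Reducer<Acc, T> = (acc: Acc, item: T) => Acc;
type Transducer<A, B> = <Acc>(next: Reducer<Acc, B>) => Reducer<Acc, A>;

// Each step wraps the next reducer instead of producing a new array.
const mapT = <A, B>(f: (a: A) => B): Transducer<A, B> =>
  (next) => (acc, item) => next(acc, f(item));
const filterT = <A>(pred: (a: A) => boolean): Transducer<A, A> =>
  (next) => (acc, item) => (pred(item) ? next(acc, item) : acc);

// Composed once; written to read in data-flow order: double, then keep > 4.
const xform = <Acc>(next: Reducer<Acc, number>): Reducer<Acc, number> =>
  mapT((n: number) => n * 2)(filterT((n: number) => n > 4)(next));

const append: Reducer<number[], number> = (acc, n) => (acc.push(n), acc);
const result = [1, 2, 3, 4].reduce(xform(append), [] as number[]);
console.log(result); // [6, 8] — each item passed through both steps in one go
```
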
  • Aggressively Parallelizable: Parallelization made natural. Aided by pure functions, which are ideal for parallelization (and even for streams). In FP code the structure of sequential and parallel code is very closely aligned (as opposed to in imperative code). See also the point on 'DFFP'. The language should nudge programmers and make it easy/natural to use parallelism, through language constructs like executing several sequential lines simultaneously (inside fork/join constructs or similar). To avoid common overly sequential thinking, which can lead to suboptimal performance (due to not parallelizing work). But humans think sequentially. So we ought to pay heed to Dijkstra's wisdom that since "our intellectual powers to visualize processes evolving in time are relatively poorly developed, we should shorten the conceptual gap between the static program and the dynamic process, by spreading out the process in text space". In simpler terms: enabling the programmer to trace the flow of execution by simply reading the code. As mentioned elsewhere: execution order should follow the reading order. Another important reason the language should steer the programmer to aggressively utilize parallelization is Amdahl's Law, which states that, when parallelizing, the limiting factor will be the serialized portion of the program; notably the queueing delay caused by contention over shared resources, like CPU time. So whatever part of the program is not parallelized will eventually, under high enough load, turn into a bottleneck. The language construct nudging developers towards parallelization could be inspired by Verilog's fork/join construct, or the very similar Nurseries, which are an alternative to go statements (since go statements don't afford local reasoning, automatic error propagation and reliable resource cleanup, though some of this may be achieved with a WaitGroup). But as opposed to the fork/join example, the language should enforce a deterministic order upon joining, which should simply be guaranteed by the sequential top-down order of the lines in the fork/join code block (a novel idea, to my knowledge, which would need to be experimented with thoroughly... more thoughts in this issue). NB: Need to research whether, on today's hardware, automatic parallelization could in fact be a pessimization in practice instead of an optimization, as Richard Feldman pointed out to me. In any case, we do not envision making every function call parallelized, but making simple, contained constructs (like nurseries, fork/join) that the programmer can use to signify separable pieces of the problem/algorithm. The runtime should then parallelize those portions aggressively. "Whether to be sequential or parallel is actually a separable and orthogonal question. ... allow not just the compiler, but the runtime to make the decision on the fly based on available resources to decide whether to break up sub-problems in a linear fashion or in a divide-and-conquer fashion." -- Guy Steele, in How to Think about Parallel Programming: Not!. This ties back to the point that programming needs to get away from the notion that the programmer is giving instructions to the machine. See 'Declarative over imperative'. It could be helpful to "separate the meaning of a program from the implementation details. ... Saying less about implementation should also make programs more flexible." -- Paul Graham in The Hundred-Year Language.
In all, the language should afford algorithms to be broken up into their separable and independent pieces of code, and then the runtime should decide whether to run the pieces sequentially or in parallel (see the fork/join sketch after this list).

    • Alternatively: Take inspiration from Chapel, by providing core primitives to control parallelization directly. But preferably through declarative means and not through such imperative control structures as for-loops, which Chapel uses.

    • Alternatively: Take inspiration from Golang's elimination of the sync/async distinction and allow programming everything in a sequential manner, but with a degree of parallelism under the hood (so concurrency, in practice). The sync/async barrier elimination, however, doesn't necessarily nudge programmers towards using parallelization (spinning off new threads) within the context of a program (thread). That style might conflict, or it might be synergistic, with the goal of nudging programmers towards making more use of parallelization.

    • Syntax enabled parallelization. Inspired by Verilog and Chapel. Ideally, the language runtime should be able to use parallelization to handle multiple independent processes (like client/server requests; goroutines for concurrency), but also automatically distribute a single program across multiple CPU cores (facilitated implicitly by the language constructs/structure, without special imperative directives like thread/go) when those cores are idle. To do that, the language should not attempt to automatically make specifically sequential code parallel, since such automatic parallelization requires complex program analysis based on parameters not available at compile-time. Instead, the language should nudge towards constructs that afford natural use of multi-threading instead of single-threading (cf. principle that a language should afford scalable modes of thinking). But without compromising readability/reasonability, which is the top priority. The programmer should be concerned with, and simply describe independent sets of causal/logical connections, and the language runtime should automatically take care of as much parallelization as possible/needed. Inspired by Haskell.

    • Safe parallelization. Inspired by Haskell.
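
A rough approximation in TypeScript of the nursery/fork-join construct discussed above, where forked branches run concurrently but join in deterministic source order (the envisioned language would make this a syntactic construct; the fetchers here are hypothetical stand-ins):

```ts
// Hypothetical fetchers standing in for independent, parallelizable work.
const fetchUser = async (id: number) => ({ id, name: "Ada" });
const fetchOrders = async (id: number) => [{ orderId: 1, total: 99 }];

async function main() {
  // "Fork": both branches start concurrently. "Join": Promise.all
  // returns results in the top-down order the branches are written,
  // regardless of which finishes first, and it propagates the first
  // error to the caller (nursery-like structured concurrency).
  const [user, orders] = await Promise.all([
    fetchUser(42),   // branch 1
    fetchOrders(42), // branch 2
  ]);
  console.log(user.name, orders.length);
}
main();
```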

  • Adaptive runtime. The language should have a small but adaptive/capable runtime. Small in the sense that the runtime should have no GC, and its implementations should have a high level of mechanical sympathy with the target platforms (see 'Configurable language'), so that the CPU operating cost of the runtime is as small as possible. But I'd accept a larger runtime if it means app developers have less to be concerned with. The runtime could be worked on by specialist programmers who could hyper-optimize it. Inspired by V8 for JavaScript. Better that specialists optimize the runtime for everyone, than every app developer having to make (the same) performance optimizations themselves.

    • Single-threaded to multi-threaded: The same code (or only with slight modifications) should be able to run in a single thread in one context (e.g. in a client's browser), and take advantage of multi-threading in another context (e.g. server-side). Multi-threading should be the language default, but the runtime should be able to support concurrency (interleaving threads into a single thread) for use cases that require it (e.g. if it compiles to JS and needs to execute in a browser or Node.js runtime). But it should be so configurable that the programmer could configure the runtime such that, e.g., the compiled JS running on the client side would use Web Workers for threads in addition to the main thread (where possible, so not for operations on the DOM).
  • Compiled, interpreted and/or incrementally compiled (for dev mode). Inspired by Dart.

    • Fast compilation. Inspired by ReScript. Fast compilation speed is more important than high-level language features. Using high-level language features like polymorphism can even severely de-optimize an otherwise fast program ("Clean Code", Horrible Performance). So it should be possible to restrict such features (or disallow them entirely in favor of simpler approaches, e.g. using if or switch statements, or some form of explicit branching, instead of polymorphism), to gain faster compilation speed. Inspired by Molly Rocket in that video, and Cheng Lou behind the fast ReScript compiler. Configurable language (think strict mode). But readability and reason-ability should be prioritized over compilation speed. Conserving developer mind cycles is more important than conserving CPU cycles. As long as compilation doesn't become a significant local development burden. So incremental compilation is important. Inspired by the Salsa compilation framework for Rust (via Lark), which is aided by memoization of pure functions. Local development has to be fast, but longer CI/CD build times are acceptable, as long as they scale (non-exponentially). Also: "Compile time is important, but it’s ok to sacrifice it to reduce design time, debug time and runtime. Machine time is much cheaper than human time, and once you automate a task, a machine runs it more efficiently and reliably. And runtime gets multiplied by the number of devices that run it and the number of times it is run." -- natewind

    • Interpreted / incrementally compiled during local development: So the developer can write quick scripts and get fast feedback. Sacrifices some runtime speed in local development for compile-speed. But it also needs quick startup/load time.

    • Compiled: For production. Sacrifices compile-speed for runtime speed. Should use static linking to compile all dependency libraries, modules, and the language itself to a single binary, for easy deployment to a Lambda function or Docker container (without a base image). Inspired by Deno and Golang.

    • Small core language: Compiled down to a small instruction set, which can be used/targeted as a starting point to generate code for other programming languages (i.e. generate JS).

    • Portability: Be able to target and run on multiple computer architectures.

    • Easy to build from source. Inspired by Zig and Golang.

  • Best of Immutability with the best of Mutability. Using immutable variable names (the programmer can treat all variables as constants), while having opportunistic in-place mutation (the compiler can mutate under the hood, for performance, where possible). Inspired by Roc. The goal is to avoid Shared Mutable State, but without excessive copying. So the compiler should detect when a data structure that is Shared (passed into another function) is Mutated there, plus also referenced again later, e.g. later in the calling function, as if it had an identity with a State that changes over time. Only if it detects all three conditions should it create an incremental copy of the data structure (copy-on-write), using structural sharing. The programmer semantically deals with it as if a copy were made each time (i.e. an immutable API when reading values), and thus has to declare a new variable/name for every new/updated version of the data structure (since it does refer to something different from the original; identity across time is a mirage). Like when passing a data structure into a function and receiving a seemingly mutated version as the result: the programmer treats it as if a new copy of the data structure was made (i.e. the API is immutable). But the compiler, which sees the inside of that function, knows whether the data structure was just read/borrowed (in which case nothing special needs to be done), or whether there was an attempt to mutate it (using a mutable API, think Immer). If there was, and the compiler sees that no later references to the original variable are made (like in the calling function's body, after the variable was passed into the other function), then it can selectively optimize by simply mutating the original data structure (since the data structure effectively isn't "shared"; its ownership was effectively just passed around). This is opportunistic in-place mutation. Inspired by Roc. The programmer shouldn't need to think about this optimization, but may enjoy the performance benefit of having data structures that are linearly mutated under-the-hood where possible, and optimally (only incrementally) copied when shared. This is achieved through persistent data structures (like Lean-HAMT) and structural sharing, while also avoiding duplication of data. Inspired by Clojure.

    • Ownership/borrowing should have no special syntax. It should be opaque to the programmer, not explicitly handled by the programmer, as it is a concern of the machine how it optimizes its handling of data structures (while providing the needed semantic guarantees to the programmer, through strict enforcement in the programming language / compiler). Special knowledge/syntax for (such inherently internal) ownership/borrowing concerns - like needing to know the in-memory representation of various types of values, or using special & references to refer to a value without taking ownership of it - is considered a design inadequacy in the programming language. Counter-inspired by Rust. The programmer should not need to deal with such internal machine concerns.

      • In-place mutation, where data structures only become immutable when they're shared (presumes keeping track of borrowing / reference counting). Inspired by Rust, Roc, and Clojure's transients. Immutability gives the benefit of facilitating concurrency and avoiding race conditions. As a bonus you could get time-series and thus time-travel for data.

      • Mutable API: The desirability of a mutable API (mutating objects instead of always having to pass in functions) is inspired by the JS libraries Immer and Valtio. But for algorithms, instead of using the mutable API in an imperative style, it should allow keeping to a functional style, possibly with something akin to Clojure's transients. Alternatively: A mutable context (block scope) could be mandated for mutations (similar to Immer), which could also afford resource cleanup (if we want to avoid having a GC). Inspired by Rust.

      • Deep immutability: Cloning/copying a data structure should not silently copy only references below the first level, nor behave differently depending on which data structures are contained. Because that is unintuitive/unexpected: a copy should be a full copy (at least as far as the programmer is concerned; it can use structural sharing under-the-hood). Counter-inspired by JS/TS. Alternatively: instead of using immutability to defeat Shared Mutable State, restrict the split-brain duplicity of keeping a reference to some data while also sharing a reference to it, like Rust does: by simply disallowing local references to data after it has been shared (aka "moving" data).

      • Memoization, automatically, but measured and only applied dynamically when the runtime finds it beneficial. Aided by pure functions. The programmer shouldn't have to think about memoization when programming, but should be able to tune the degree of memoization (since it is a space/time tradeoff) through general configuration, for advanced cases not optimally served by the defaults. Runtime optimisations such as these are not critical features, but certainly nice to have, and should be considered in the language design where they can affect the implementation of the language runtime. Memoization of math might not always be worth it (06:43 @ Andrew Kelley on Data-oriented design), so the adaptive runtime should measure math calculations and decide whether to memoize them in main memory (RAM), or just recompute the calculations because the CPU and its cache are so fast that accessing RAM would be slower. These are hardware concerns that are subject to change as hardware progresses, and such concerns should thus not be encoded in the language syntax, but transparently taken care of by the language's runtime environment. See: adaptive runtime.
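
The mechanism itself is simple; a minimal TypeScript sketch, assuming a pure single-argument function (in the dream language the runtime, not the programmer, would decide when caching beats recomputation):

```ts
// Wrap a pure function with a cache: compute once per distinct input,
// trading memory (RAM) for time — exactly the tradeoff the adaptive
// runtime would have to measure.
function memoize<A, R>(f: (a: A) => R): (a: A) => R {
  const cache = new Map<A, R>();
  return (a) => {
    if (!cache.has(a)) cache.set(a, f(a));
    return cache.get(a)!;
  };
}

const slowSquare = (n: number): number => n * n; // imagine heavy math here
const fastSquare = memoize(slowSquare);
fastSquare(12); // computed and cached
fastSquare(12); // served from cache
```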

  • Concurrency. For multi-core and distributed computing, using Channels like in Golang/CSP, but asynchronous ones (see: Buffered Channels, and the channel sketch after this list). The important point is to produce readable stack traces that exclude framework code, like CSP tends to give, since concurrency is then vastly easier to debug. Inspired by Golang and CSP. Async is also important for the distributed part. Alternatively: for multi-core just use regular Channels like CSP, since it is proven, and for distributed use the goroutines, which are async.

    • Async: Concurrency should integrate well with the async feature of the language (see: Buffered Channels). The default should be to ship tasks off to be completed elsewhere (other thread/process/worker/server), while continuing own work in the meanwhile (fire & forget). Inspired by JS. But without fragmenting the logic into dispersed callback handlers throughout the codebase which are run at unknown points in time (as Hickey points out under the 'Buffered Channels' point elsewhere in this article). Counter-inspired by JS.

    • Probably not implemented as an Actor Model, since Actors' statefulness is complex. Also, events going all over the place in a non-stateful app are harder to reason about than stricter promise-based operations (using callbacks under-the-hood). Counter-inspired by StimulusJS. Inspired by ReactJS.

    • Concurrency vs. Parallelism should be up to the runtime, not something the programmer should have to worry about. If the runtime has multiple CPU cores, then parallelize the tasks onto those cores. If the runtime only has one core to work with, then interleave the execution of the tasks concurrently on that single core.
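
A minimal sketch in TypeScript of the kind of unbounded buffered (async) channel meant above: sends never block, receives await the next value (loosely CSP-inspired; the class and method names are hypothetical):

```ts
class Channel<T> {
  private buffer: T[] = [];                    // unbounded buffer
  private takers: ((value: T) => void)[] = []; // receivers awaiting data

  send(value: T): void {
    const taker = this.takers.shift();
    if (taker) taker(value);      // hand off directly to a waiting receiver
    else this.buffer.push(value); // otherwise buffer it: send never blocks
  }

  receive(): Promise<T> {
    if (this.buffer.length > 0) return Promise.resolve(this.buffer.shift()!);
    return new Promise((resolve) => this.takers.push(resolve)); // await next send
  }
}

// Producer fires & forgets; consumer awaits values as they arrive.
const ch = new Channel<string>();
ch.send("hello");
ch.receive().then((msg) => console.log(msg)); // "hello"
```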

Tooling

  • Small and predictable language surface: The language should be small and easy to understand, and straightforward (homoiconic?) to target with tooling-code (transpiling, linting, type checking/inference etc.), so that it is easy for the community to implement good and diverse tooling for the language. Inspired by the goal for Kitten, and the homoiconicity of Lisp/Clojure.

  • Fast feedback to the programmer is the second-highest priority. Inspired by TypeScript hints, QuokkaJS (!), Webpack Hot Reload, and Expo Live Reload.

  • Great syntax highlighting. Counter-inspired by Clojure.

  • Namespaces should be able to be reversed in the code, through a hotkey. Since presenting information general-to-specific (top-down) vs. specific-to-general (bottom-up) serves different uses. Top-down is often preferable when writing (accessing information), and bottom-up is sometimes preferable when reading (cutting to the chase). So it should be possible to swap between showing the module path to a function (like com.some-example.my-app in Clojure, or Js.Array.indexOfFrom in ReScript, or ion-accordion in Ionic), and the inverse, placing the most specific part first (like my-app.some-example.com or indexOfFrom.Array.Js or accordion-ion). Because it aids readability when you are already in a context, since you then care about the most specific information first, not always having to read the path to that information. Imagine how much better the Ionic accordion example would read if you didn't have to read the ion- prefix before every component (like ion-accordion, ion-item and ion-label), but could read/scan the code and see the most specific information first, followed by a suffix, like accordion-ion, item-ion, label-ion. Autocomplete should be able to work with both displays, so you can still drill top-down in the namespace (like typing ion-) and then have the alternatives presented.

  • No need to manipulate data structures in the human mind. Counter-inspired by almost every language, but this problem is especially prevalent in stack-based languages like Popr, and to some extent also dynamic/non-typed languages. You should always be able to see the data structure the code is working on, at any given time, in the code, or ideally inferred as an example alongside the code. Inspired by Bret Victor, and Smalltalk. Ideally with some actual example data, not only its data type. Though statically typed languages do alleviate this need to some extent, since seeing the type of the data is better than nothing. Counter-inspired by dynamic languages such as Ruby and Python, but also counter-inspired by the argument/point-free style often employed in FP. Also, it should be possible to visualise/animate an algorithm. Since "An algorithm has to be seen to be believed", as D.E. Knuth said. It shouldn't be necessary for the programmer to take the effort to visualize it in their mind (with the error-proneness that entails). So the language should make such visualization and code-augmentation easy for tooling to support. But without being a whole isolated universe in its own right, like a VM or an isolated image. Counter-inspired by Smalltalk. Some have described this as REPL-driven development, or interactive programming. "This leads to a substantially different experience from running a program, examining its results (or failures) and trying again. In particular, you can grow your program, with data loaded, adding features, fixing bugs, testing, in an unbroken stream." from Clojure.org. Especially good for debugging: getting the exact program state from loading up an image of it that someone sent you. Inspired by Clojure. BUT it should have the ability to show the content of the data structures within your IDE, so you can stay and work, interactively, entirely within your files in your IDE (not via any kind of separate console). Inspired by QuokkaJS. The REPL-driven development approach should ideally afford simply changing code in the code editor, detecting the change, and showing the result then and there, without you having to go back-and-forth to a separate REPL-shell and copy-pasting / retyping code. Inspired by ClojureScript. In fact, since a program is about binding values to the symbols in your code, when running your code the IDE (enabled by the content-addressable code feature) could replace variables in the text with their bound values, successively. Effectively animating the flow of data through your code. Without you having to go to an external context like a debug window to see the bindings.

    • REPL / interactive shell. Can be done even if compiled, by having an interpreter on top of a VM alongside the compiler.
  • Sensible, friendly, and directly helpful error messages. Inspired by Elm.

  • See section on 'Dynamic code re-arrangement'. This points to the need for tooling to be able to reorder code depending on what the programmer is looking for. Maybe instead of treating code just as text characters, code should be able to be treated as blocks in the IDE, as many web based publishing platforms (Medium, Authorea, Hashnode) have discovered could be a smart and powerful thing to do with text. Preferably as a plugin to the VS Code IDE or similar, so developers won't have to switch to a new IDE for this feature.

  • Quick to get started and produce something. Inspired by JS. Counter-inspired by JS tooling.

    • Not too unfamiliar (to a large group of programmers, and to what they teach in universities). "Familiarity and a smooth upgrade path is a really big deal." source
  • Code-Formatter, like gofmt, inspired by Golang. A tool to auto-format code into a standard. Since standardisation creates readability and faster onboarding of new developers. It also enables mechanical source transformation, which is crucial for language evolvability. Beautiful formatting is important, but Consistency & Determinism > Beauty. Since even "a bad deterministic formatter is better than a non-deterministic formatter", inspired by Dart. The language should have a default standardized formatter, so code from a newbie and a pro looks similar, and jumping from project to project is easier and faster.

  • Reversible debugging / time-travel debugging (TTD). “Reverse debugging is the ability of a debugger to stop after a failure in a program has been observed and go back into the history of the execution to uncover the reason for the failure.” -- Jakob Engblom. Inspired by Elm. Re: accounting for human limitations and affording the most natural way of thinking: "The problem you are trying to fix is at the end of a trail of breadcrumbs in the program’s execution history. You know the endpoint but you need to find where the beginning is, so working backward is the logical approach." source. The language should at least have this. It could be enabled by, but doesn't necessarily need:

    • Reversible / invertible control flow: "A reversible programming language produces code that can be stopped at any point, reversed to any point and executed again. Every state change can be undone." source. Maybe. Might not be feasible, or desirable, when it comes down to it. Might be aided by immutability, and persistent data structures (if they are extended with history-traversal / operation logging features, in addition to structural sharing).
  • Serializable: A program (with its functions!) should itself be easy to turn into a format that can be stored (to file, memory, buffer, DB) and transmitted over a network (to potentially be reconstructed or analyzed later), and then potentially run/re-run. It should support serializing the state of its data structures as well, even if they contain functions. Counter-inspired by JS. Inspired by Clojure (see section: ‘Interactive’, on hot upgrades).
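
JS illustrates the pain point: closures don't survive JSON.stringify. A workaround sketch in TypeScript, where functions are serialized by name via a registry (content-addressable code would allow stable hashes instead of names, making this robust; everything here is a hypothetical illustration):

```ts
// Functions live in a registry and are referenced by name, so the
// *data* (including which function to apply) survives the round-trip.
const registry = {
  double: (n: number) => n * 2,
} as const;

type Task = { input: number; fn: keyof typeof registry };

const task: Task = { input: 21, fn: "double" };
const wire = JSON.stringify(task);      // store to file/DB or send over network
const revived: Task = JSON.parse(wire); // reconstruct later, possibly elsewhere
console.log(registry[revived.fn](revived.input)); // 42
```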

  • Transpiler, configurable, so it could translate between all language dialects and variations. So that the language could evolve in multiple directions, and consolidate later, without harm. The concern here is that for this to work the core language/AST may have to be the lowest common denominator to work across all those dialects, limiting how good any of the dialects could be (?).

    • Homoiconicity could perhaps give affordance to such interlinguality. Inspired by Racket. Making the language homoiconic could however tie the code structure too closely to a particular data structure (e.g. a list). On one hand, limiting the diversity of data structures is sometimes seen as a good and simplifying thing (enabling program composability): "It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures." -- Alan Perlis in Epigrams on Programming (1982). Inspired by Clojure. But on the other hand, a diverse set of data structures is better able to accommodate different needs and will likely be more optimal/performant for specific use cases (not necessarily the case for Clojure, for various reasons). Mechanical sympathy is important, and essential for optimal performance. Furthermore, "code as data" is basically to enable code to easily be targeted by other code, something which you can achieve by other means, such as directly manipulating the code's Abstract Syntax Tree (AST) through 'syntax objects' which carry info about the relevant scope, namespace etc. (like in Racket). But there is some merit to having the data structure (syntax tree) be the very thing the programmer sees in the code, so mental transformations are reduced to a minimum when operating on it. In the end, homoiconicity may only be relevant if we want to allow for meta-programming. See section on 'meta-programming'.
  • Editor integration: Should afford simple integration into editors/IDEs like VS Code, typically via the Language Server Protocol (LSP). Inspired by Rust. Syntax highlighting, plus a language server (for autocomplete, error-checking (diagnostics), jump-to-definition etc.) via the LSP.

    • Interactive: facilitates an IDE-plugin (VS Code) that shows the contents of data structures while coding. Enable REPL'ing into a live system. Inspired by Clojure. But with some security, so that a rogue/unwitting programmer can't destroy the system / state. Counter-inspired by Smalltalk. Some form of Hot Reload / Hot Upgrades at runtime, even though the language is statically typed. Perhaps by requiring that the swapped-in functions must take in and return the same types as the previous version (i.e. an enforced interface). Inspired by Facebook's usage of Haskell. NB: Might conflict with compiling to WASM, since WASM gives a restricted environment. See section on WASM environment.

    • "Comments should be separate from code, joined at IDE or joined via manual tooling. This would allow comments to span multiple lines/function and files. IDE could also alert when breaking changes are made. Pairs well with the Content-addressable code wish." Inspired by supermancho @ HN. You could also show/hide comments, and click on a particular piece of code or variable to see the comments for that. Without having to visually map references on a comment line to the actual variable, which is also prone to documentation drifting out of sync with the code it is documenting. With comments tied to content-addressable code, then when deleting/updating the code you also delete/update the comment. Renamings would be transparent and automatic, but when the code changes structurally the IDE could warn that the corresponding comment/doc needs to be updated.

    • The expansion of function definitions inline, on demand. "Take the definition, splice it into the call site, rename local variables to match the caller", as JonChesterfield @ HN said. So you don't have to jump around different files, which may make you lose your state of programming flow. Content/code should even be editable then and there, and simply stored back to the files where it resides. Inspired by supermancho @ HN, Lisp IDEs, and TailwindCSS. Content/code (and navigating it) should be freed from file boundaries (see also: content-addressable code). Inspired by Git.

  • Well-documented. Documentation on language syntax should be accessible from the editor/IDE, via the LSP.

    • Docs should be versioned, so that docs for old versions never disappear, either from the web or from the IDE integration. Inspired by ReScript, and counter-inspired by Emotion (CSS-in-JS).

    • Comprehensive: Centralized and freely hosted repository for documentation for all libraries, so that all packages can be documented the same way. Reducing barriers to entry for authors and users. Inspired by Elixir's HexDocs.

      • Fast and uniform way to search. Not having to rely on generic internet search engines.
      • Discovery of libraries. A centralized resource leads to more reuse of existing libraries, and can enforce some standards, such as writing a "why" section first and foremost for a given library, so there's never a question of what the intended purpose or use case for a library is.
      • Integration with testing: Tests should be executable directly from the documentation website, providing a practical approach to demonstrating functionality.
    • Generated docs from Markdown written directly alongside the code. For convenience and efficiency. So the docs can live and be kept up to date alongside the relevant code. The IDE should be able to show/hide them if they grow large. Even though the code generally should be self-documenting and newbie-readable on its own, some insight into intended use and gotchas is always helpful.

      • For new libraries, not already present in the ecosystem, the search should suggest best-in-class libraries from similar languages which could easily be ported using ChatGPT or a similar LLM AI.

Ecosystem

  • The language should be MIT licensed. Permissive copyright and trademark. MIT + use however you want, just not maliciously. Favorable to creators and teachers. People should be able to make paid courses. Counter-inspired by the Rust trademark drama.

  • The language should be "open to extension" by any in the community, without permission. So that it can evolve and converge to a consensus, based on real-world experience and feedback. This is mirrored in the important talk Growing a Language, by Guy Steele, and the point on crucial evolvability. But the culture of the language community should not encourage bending the language in unintended ways just for the sake of it, as staying close to the overarching general-purpose language (GPL) makes knowledge transfer/usage generally applicable across domains (being able to move between projects within the same language should in general be made easy), and afford a more cohesive ecosystem. Counter-inspired by DSL's (see: avoid DSL's).

  • Optional configuration, but providing sane and conventional defaults so you can get started quickly and without worry. (Not rely solely on convention over configuration (CoC), due to potentially too much implicitness/magic. Counter-inspired by Ruby on Rails.) "if you apply [CoC] dogmatically you end up with an awful lot of convention that you have to keep in your head. It's always a question of balance; Hard Coding vs. Configuring vs. Convention, and it's not easy to hit the optimum (which depends on the circumstances)." as Peer Reynders reminds us.

  • It should be small, but extensible by using simple primitives (see: community grown, configurable language). Pragmatically, it should use LLVM to compile to a single binary (a standalone binary on each of the mainstream UNIX and Windows operating systems). Inspired by Roc and Golang. The language should probably be built using OCaml (since it affords pragmatic sound static typing, and is more approachable than Rust). But other candidates are Rust (esp. due to availability of programmers), or Zig, or Rust for the compiler and Zig for the standard library (since they afford safety and speed). Another alternative is Racket (since it is a Lisp geared towards creating languages), or maybe Haskell (since it is good for working with ASTs). LLVM has bindings to these languages, and they are typically hailed as suitable for creating other languages. Unison could be a candidate, too, since it is Haskell-like and supports content-addressable code, but it is still early days for it. Available programmers (i.e. community size) in these languages should be considered (see: self-hosting). Should do more with less. Inspired by Lisp. Since predictability is good for humans reading, and for machines interpreting; and if it's predictable to machines, humans also benefit. Important: "As one adds features to a language, it ramps up the complexity of the interpreter. The complexity of an analyzer rises in tandem." - Matt Might, on static analysis

  • Modularity. A sensible module system. Counter-inspired by the NodeJS controversy. Code-splittable and tree-shakeable. Inspired by Rollup. Function-level dead code elimination, inspired by Elm. This is possible in Elm because functions in Elm cannot be redefined or removed at runtime. This could potentially conflict with the envisioned Hot Upgrade feature inspired by Clojure (see: Interactive). This problem could perhaps be removed by disallowing modification/overloading of functions and data types in the standard library (or any 3rd party library). Alternatively: it should be possible to specify which parts of the application should be tied down and optimized (the client code), and which parts should, at the potential expense of larger assets, be Hot Upgradeable (the server code).

    • First-class modules: Modules as a data structure (record of functions), that you can pass around in the program, and select and use at runtime. For increased dynamism (might be reconsidered if it compromises static analysis guarantees at compile time... like the aforementioned function-level dead code elimination). Inspired by OCaml.
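
A sketch in TypeScript of what OCaml-style first-class modules amount to in practice: a module is a record of functions you can pass around and select at runtime (the `Codec` example is hypothetical):

```ts
// A "module" as a plain record of functions.
interface Codec {
  encode(data: object): string;
  decode(text: string): object;
}

const JsonCodec: Codec = {
  encode: (data) => JSON.stringify(data),
  decode: (text) => JSON.parse(text),
};

// Modules become ordinary values: pass them around, pick one at runtime.
function save(data: object, codec: Codec): string {
  return codec.encode(data);
}

save({ a: 1 }, JsonCodec); // '{"a":1}'
```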

    • Tree-shakeable code (esp. useful for client-server webapps). So it should need a source code dependency between the calling code and the called function. Which makes the language more FP than OOP, according to one definition of FP vs. OOP. In general, shifting concerns from runtime to compile-time is considered a good thing, as it makes the language more predictable, optimizable, and affords helpful coding tools. Having consequences of code changes appear at runtime is a bad thing (see: The Fragile Base Class problem of OOP).

    • Explicit imports, so tree-shaking (to remove unused code) can be done. Inspired by JS. Also, so that it is clear where imported functions come from. Counter-inspired by Golang.

    • Standard library should even be tree-shakeable. Inspired by Zig.

    • Predictability: Making dynamic/runtime things static/compile-time enables predictability and optimizations (such as tree-shaking). Inspired by ESM and counter-inspired by CJS in JavaScript (see How does it work on tree-shaking).

    • Not file boundary dependent. Can be split into files, but execution shouldn't be dependent on file boundaries. So the programmer is free to keep code tightly together. Inspired by SolidJS.

  • Self-hosting: In the future, the language should maybe be made self-hosting, meaning its compiler would be built in its own language. For portability and language independence. But it's more important that the language is built initially (using LLVM) to facilitate the targeted use cases (webapp + systems dev.), rather than being implicitly optimized for writing a compiler. Also: building a compiler in the language could potentially mean dealing with so many low-level concerns that the restricted and high-level nature of the language would be compromised. But then again, the language should ideally be suitable for systems development, and writing a fast compiler for itself is a good test case for that... Another important side-effect of self-hosting is that when the language is written in itself, the community is more empowered to expand the language, and not rely on others to do it for them (in another language, which they might not know themselves). This is important for a community-grown language, to avoid democracy turning into bikeshedding to death. The impetus is placed with the builders (not the vocal onlookers), and self-hosting empowers people to take matters into their own hands. Users become builders. The important part is to simplify merging of different directions, so that the language can converge. The language being about composing independent primitive abstractions should make this merging easier, since there are fewer intertwined features to de-complect and figure out how they will interact. (See: 'Community grown'.)

  • Powerful primitives over batteries-included: Few, but powerful and composable, core primitives. Based on very few fundamental concepts to learn. Inspired by Lisp and Clojure. Prefer uniformity and consistency. Counter-inspired by JavaScript's only half-way interchangeable expressions and statements. Without feature uniformity, programmers will learn, use and prefer slightly different subsets of the language, which leads to extensive knowledge being required to read others' code (Farooq et al., 2014). NB! But beware Tesler's Law of conservation of complexity: less complex primitives would mean more complex programs (and more complexity for the programmer to write and read), since the irreducible complexity has to be accounted for somewhere. The overall goal is to eliminate accidental complexity, through a curated set of powerful abstractions.

    • Levers which give developers options for various kinds of tradeoffs. Inspired by Remix. Ideally, if at all possible: an opt-in Garbage Collector (GC). Maybe enabled through a modular/plugin-oriented runtime (or as an added library). So the language would be easy to make a web API in, since: "Rust makes you think about dimensions of your code that matter tremendously for systems programming. It makes you think about how memory is shared or copied. It makes you think about real but unlikely corner cases and make sure that they’re handled. It helps you write code that’s incredibly efficient in every possible way. These are all valid concerns. But for most web applications, they’re not the most important concerns." -- Tom MacWright.

    • But avoid DSLs, since Domain-Specific Languages typically become mini-languages in their own right, so enabling and encouraging them can increase the difficulty of reading programs. Such languages are akin to dialects/sociolects that hinder generalised understanding and learnability (they add knowledge debt). Counter-inspired by Lisp (e.g. Lisp being too powerful, since Lisp shepherds you into building your own language/DSL), and stack-based languages like Factor, which maybe has too powerful (re-)factoring capabilities. It might be true that “Domain-Specific Languages are the ultimate abstractions.”, as Paul Hudak put it in 1998. But they are only ultimate within some particular domain. DSLs are not general, and they are underpowered precisely because they are domain-specific (they sacrifice general expressivity for expressivity in one specific domain). But how do you know that you know your domain? That you have perfectly captured your domain in your DSL, and don't need to rework the DSL entirely to account for some new understanding? Most domains are moving targets, to some degree. Generalized languages seem a better way to go, even though they might entail slightly more work within a domain than a DSL. (A sharp knife is in general preferable to kitchen appliances for every single use case.) Some cross-domain terms (keywords like function, if, return, etc.) are usually helpful for onboarding programmers, since they afford familiar knobs on which to hang the other unfamiliar code. Even if you don't understand the domain (or its plethora of abstractions), you would at least understand something, from which you could build your further understanding.

    • Small focused core, with powerful composable primitives, that lends itself well to abstraction. Language extensible by library authors. Strong convention and encouragement for abstractions based on generalizable JTBD naming, instead of business/domain-specific DSL's (reasoning above).

    • Modular composition, configurable. Inspired by Rollup plugins.

    • A language for library authors. Inspired by the success of C++. The language should be able to evolve by community convention, not by centralised specification: the language itself should be extensible with libraries (would probably need to have some limited metaprogramming facility in the form of compile-time macros... good idea?). See: community grown, configurable language.

    • Fast branching and merging: What would be important is to facilitate fast and simple language merging, due to all the community divergence and implementation branching that would appear (for the aforementioned reasons; library-driven). Inspired by Git (fast branching and merging was the big idea behind much of Git's success). So the community is enabled to easily find back together after a split/branch (if their ideas and goals come back in alignment, and they have converged to an agreement on a new feature). See also: "forward- and backward-compatibility". A standardization process can be good for cohesion, but can also become a roadblock for progress and innovation, due to taking a long time and effectively introducing a monopoly on the mind-share (causing the majority of the community to wait for the standardization process before adopting nearly anything, instead of taking a risk). Counter-inspired by the history of Lisp (and the ANSI standard of Common Lisp).

    • Type 2 bootstrapped, using a suitable base language that provides the small core of necessary abstractions, from which the rest of the language can be built.

  • Compilation should be able to target some popular language & ecosystem, like transpile to JavaScript or compile to WASM, or potentially even the JVM, to get cross-platform interoperability. But not any target at any cost, if it would put unwieldy constraints on the language design. WASM seems like the best candidate. The goal of no Garbage Collector (GC) would align with targeting WASM, since WASM currently has no GC of its own, which makes it difficult to target WASM from GC-based languages (without shipping every client a huge bundle containing a custom GC). See Notes on the Future of WASM and JS. See: "No Garbage Collector". I would also be open to transpiling to Rust: coding in a stricter, more idiomatic, single-paradigm language that is more beginner-friendly than Rust could make systems-level and performant web-server programming even more accessible (see also Smaller Rust). One could then leverage the Rust crate ecosystem, too. If the dream language had no GC and had a borrow checker (and was maybe even initially written in Rust, or perhaps OCaml), transpiling to Rust might well be possible.
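
For what it's worth, consuming a WASM module from the JavaScript ecosystem is already simple with the standard WebAssembly API; a sketch (the module file and its `add` export are hypothetical stand-ins for the dream language's compiler output):

```typescript
// Hypothetical: "dream_lang_output.wasm" was produced by the dream
// language's compiler and exports an `add` function.
const bytes = await fetch("dream_lang_output.wasm").then((response) =>
  response.arrayBuffer(),
);
const { instance } = await WebAssembly.instantiate(bytes);

// Call the compiled function from the host (JS/TS) side:
const add = instance.exports.add as (a: number, b: number) => number;
console.log(add(2, 3)); // 5
```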

    • Transpilation to another language should output human-readable code, so that it can be used for debugging. Counter-inspired by ClojureScript. Inspired by ReScript. The transpiled output should also be mappable back to the original language, ideally by simple visual inspection, alternatively by tooling. So that communication with community members of the target language (e.g. JavaScript) is made easier, and debugging help in the target language can be applied back to code in the original language. (See the sketch below.)
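
To make that concrete, here is an invented dream-language line and the kind of human-readable JavaScript/TypeScript it could transpile to (both sides are hypothetical; `sumBy` is not a real construct):

```typescript
// Hypothetical dream-language source:
//   orders.filter(order => order.paid).sumBy(order => order.total)
//
// A human-readable transpilation target, mappable back by visual inspection:
const orders = [
  { total: 120, paid: true },
  { total: 80, paid: false },
];
const total = orders
  .filter((order) => order.paid)
  .reduce((sum, order) => sum + order.total, 0);
console.log(total); // 120
```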

    • Escape hatches: The language should have escape hatches that facilitate integration with other ecosystems, to aid rapid adoption. This could compromise the strictness and guarantees of the language, but it should be possible for those who want to take on that risk/burden. (The language runtime should give proper warnings about loss of safety guarantees, in case someone unwittingly uses third-party dependencies that cause it; see the sketch below.) For them, the language's guarantees will only be as reliable as the guarantees of the older languages it interoperates with (see 'Bindings for types' below, and also later 'Typed Holes'). Escape hatches enable "making the easy jobs easy without making the hard jobs impossible", as Larry Wall of Perl said. In general, the language philosophy should lean towards uniformity and consistency with a small cohesive vocabulary. For those (10% of) tasks that potentially don't fit well within the constraints of the language, we ideally don't want to bolt on features and impurities (making the language more powerful at the expense of making it more complex, harder to learn, more footgun-ridden, and less uniform/consistent). Counter-inspired by Rust. Ideally, such features should be afforded by good interop with third-party libraries written in more suitable languages.
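
A toy TypeScript sketch of such a warned escape hatch (the `unsafeForeign` wrapper is purely illustrative, not a real API):

```typescript
// Hypothetical escape hatch: wrapping a foreign, untrusted function drops
// the language's guarantees, so the runtime warns at the boundary.
function unsafeForeign<Args extends unknown[], Result>(
  name: string,
  fn: (...args: Args) => Result,
): (...args: Args) => Result {
  console.warn(`[interop] '${name}' bypasses the language's safety guarantees`);
  return fn;
}

// Usage: the caller knowingly takes on the risk/burden.
const parseConfig = unsafeForeign("parseConfig", (s: string) => JSON.parse(s));
```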

    • Library compatibility tool. So you can input a list of your stack of libraries, and it will tell you if and where they are incompatible. Counter-inspired by JS Fatigue.

    • Bindings for types. There should be official bindings for the most popular libraries/frameworks in other languages. Better yet, there should be some way to automatically generate bindings. Or, better still, for interop with each language: an official adapter that automatically translates all primitive types to the language's safe default types (for simpler, transparent use).
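
A small TypeScript sketch of that adapter idea, where nullable foreign values are translated to a safe default type at the boundary (the `Option` type and `fromNullable` helper are hypothetical):

```typescript
// Hypothetical safe default type: no null/undefined leaks through interop.
type Option<T> = { kind: "some"; value: T } | { kind: "none" };

// Adapter: wraps a nullable value from a foreign (e.g. JS) library into
// the language's safe Option type.
function fromNullable<T>(value: T | null | undefined): Option<T> {
  return value == null ? { kind: "none" } : { kind: "some", value };
}

// Usage at the interop boundary (process.env values are string | undefined):
const apiUrl = fromNullable(process.env.API_URL); // Option<string>
if (apiUrl.kind === "some") {
  console.log(apiUrl.value);
}
```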

  • Small standard library. To have some common ground of consolidation, and to provide the basic and most common utils. So usage will be fairly standard, and coming into a new codebase won't feel too foreign. But not too big a standard library, since it would be tied to language updates, which are slow, and community competition is better for adaptability over the long run.

    • The minimal standard library should be designed and decided by one leader with good insight into what users need, and a strong appreciation for consistency. To avoid endless bikeshedding. This is the only place where the language should have a benevolent leader for a limited time.

    • Community grown / hands-off leadership: No BDFL, since it impedes evolution & diversity. Every top-10 programming language has a single creator (aka. Benevolent Dictator For Life, BDFL). But the undertaking of a programming language is a massive task, so its development ought to be parallelized from the start, and grown by a community, so that it doesn't take the typical 7-10 years to develop and gain adoption. Since a language is dependent on a community of speakers, it is very beneficial for it to be a community effort from the start. Even though final decisions may come down to a BDFL as the chief designer/architect (to avoid design-by-committee, or death by a thousand incoherent compromises), the community should always try to resolve disagreements first. "When a language accepts bottom-up adaptations (from the users) it will handle new topics and new problems more efficiently than when it needs to wait for top-down approval of such adaptations." (from Will ugly languages always bury pretty ones?). The language designer should be more of an arbitrator the community can lean on in discussions about what the default conventions should be. The designer(s) and stewards of the language should also be nice (so the community will be welcoming and thus flourish). Inspired by Ruby's "Matz is nice, so we are nice". Even though none of this is strictly a language "feature", it nonetheless has a major impact on a language and its development, so it deserves a mention. Furthermore, it is vital for contributions and growth that all people have actual ownership of the development of the language. The language might not grow exactly where the designer intends, but a centralizing authority (like a BDFL) may just as well stifle growth (and cause pain) as lead it. Counter-inspired by Elm. Yes, wild growth might lead to some weeds (bad dialects/libraries), but leadership through conventions and good defaults could alleviate the potential analysis paralysis and decision fatigue experienced by language users. Counter-inspired by JS fatigue. Ultimately, the power and impetus should reside with The Man in The Arena, and that man/woman should always be able to be you. The language standard/mainstream should upstream/incorporate changes found to be popular in the wild (amongst all the various dialects/customizations), while ensuring they are incorporated well (coherently and consistently). This is opposed to initial/top-down design by committee, which often lacks vision, coherence, and pragmatic connection with the real world.

      • Free experimentation on branches that can be upstreamed: It is hard to predict the effect of novel language features, especially before they've been tried in the wild for some time. Languages evolving by centralized committee tend to evolve slowly, since for each new feature the committee has to come to a consensus, and predict and test its use. Whereas in extensible languages anyone in the community may modify the language without permission, and test the modification on their own. This is much faster, and dialects can be tried in parallel. They could then be upstreamed back into the mainstream language dialect. This could be a sweet spot.

      • How could the language be very constrained while at the same time being community grown? The language core should be very constrained around composition of a few core primitives (self-hosting), but it could be modified or built upon by others. So it could evolve along multiple avenues of exploration, and gain from the competition. It would then be up to the community to decide whether to use the constrained version(s) (suitable for large-scale complex environments), which I prefer, or the bring-your-own-syntax version(s) (suitable for small-scale playful experimentation and research) which would inevitably appear. Inspirations here would be Lisp, Clojure and Racket.

  • Package manager and build system. For installing, upgrading, and configuring, as well as building, compiling, and distributing code written in the language. Inspired by Cargo for Rust.

    • Package managers should be separate from the core runtime. So alternate package managers could be developed (for performance or other reasons) by the community, and versioned and published independently. Inspired by the Node NPM debacle.
    • It should be possible to download the core runtime with or without a compatible default package manager. For ease of installation and use, while allowing flexibility of choice.
  • Single package directory: Some sort of singular reference point for library package information. So the community can organise around one common point, instead of scattering. Inspired by NPM. But it doesn't necessarily need to be centralised package download/storage; the storage/download could be decentralised. It would need to be safe, though. Certificate signing?

  • Runtime environment: Be able to run on some existing popular cross-platform runtime (like WASM, or maybe the JVM?). Inspired by Clojure. And/Or have a very minimal programming language runtime (without a GC). Inspired by Rust. But the runtime should in any case handle the scheduling of goroutines, inspired by Golang.

  • Ecosystem: Interoperable with one or more existing programming language ecosystems, to import or reuse libraries without too much ceremony. So the ecosystem doesn't have to start from scratch. Counter-inspired by Elm. Smooth interoperability with existing ecosystems and other systems, minimising glue code, is one of the most underestimated features of a language in terms of enabling its success. Inspired by C. While a fully integrated system can be very nice, it inevitably risks being disrupted by a thousand small cuts (i.e. being made irrelevant to a project because other tools/services outperform it on one or two critical features, or because it needs interop). Counter-inspired by Ruby on Rails, Lamdera, Elm and Dart. The world is heterogeneous, and no single system yet has been able to solve all relevant problems for all people. Many have tried to own the world, like Smalltalk, Imba, Darklang, etc., but this can be an impediment to mass adoption. A language as a small focused tool which lives well in a heterogeneous environment is the way to go.

    • The language should be a good citizen, and have a good Foreign-Function Interface (FFI), so that it can use libraries written in other languages, be embedded into apps written in other languages, or interact with distributed services in other languages. Being able to adopt the language alongside an existing software stack is very important. Inspired by Kitten and JavaScript. It should not assume it is the main language you're using. Inspired by Kitten. Counter-inspired by C# and Java.

    • C ABI: Compatible with the C language Application Binary Interface (ABI). So code in the language is usable from other languages. Inspired by Zig. Since compiling to WASM is desirable, WASM's C ABI could probably be used, instead of a separate implementation towards the C ABI.

    • "WebAssembly [WASM] describes a memory-safe, sandboxed execution environment" where WASM's security guarantees eliminates "dangerous features from its execution semantics, while maintaining compatibility with programs written for C/C++." NB: But WASM's restrictions might conflict with runtime dynamism and the desired live REPL feature (inspired by Lisp/Clojure)...? "Since compiled code is immutable and not observable at runtime, WebAssembly programs are protected from control flow hijacking [code injection] attacks."

  • Open-source, team: Developed as open-source from the start, of course. By more than 1 hero programmer (see: bus factor). Preferably by 4-5 people collaborating, with at least some overlap in their work.

Sobering notion

  • All the while, the language should avoid the fate of the Vasa.😂 Which means a feature-creep-resistant core language. (I am aware of the irony of this feature list, but read on...) So the feature set should be designed and decided upon as early as possible (while the degrees of freedom in the design space are as wide as possible), with a holistic view. Boring > clever. Designed to reach the 80% sweet spot of the most important features, forgoing the most exotic and esoteric features, and forgoing the ability to solve edge cases (such cases should be relegated to interoperability with other, more specialized programming languages). Since 80% of the work and complexity would come from the last 20% (the Pareto Principle). This might include forgoing some of the more esoteric or novel features (see the summary for those). The language designer(s) should actively work to make the language smaller and more elegant, not bloat the syntax with every new feature. Counter-inspired by C++.

    • Not multi-paradigm: The language should not attempt to be multi-paradigm, but rather be holistically and uniformly designed around a single big vision from the start. Multi-paradigm languages often have many warts and unintended consequences (as a result of the interaction between the paradigms; mixed metaphors make for confused thinking), and they make the experience of using the language less coherent and less predictable (when moving from one codebase/project to another). A language catering exclusively to its own idioms would be more cohesive and give a more predictable experience. Predictability is key in programming. A great deal of complexity in programming and programming languages today seems to be accidental rather than essential. We create bugs in our systems because we don't really understand them, and we don't understand them because they are so multi-faceted and complected that we aren't able to easily reason about them. A small meta-language is also important: most things afforded by the language should be self-explanatory within its own idioms, and those idioms should be close to the natural world and most natural languages (see 'SVO syntax'). The focus of the language should be on onboarding complete beginners, not catering to some existing programming language community (which would likely experience an Uncanny Valley of misleading familiarity anyway). Existing programmers would also be able to learn quickly, if a transition guide is made (maybe using ChatGPT or RosettaCode or something). It's the complete beginners that usually present the greatest onboarding challenge.
  • "Are you quite sure that all those bells and whistles, all those wonderful facilities of your so-called 'powerful' programming languages belong to the solution set rather than to the problem set?" — Edsger W. Dijkstra

  • "In anything at all, perfection is finally attained not when there is no longer anything to add, but when there is no longer anything to take away" -- Antoine de Saint-Exupéry (Thanks for the reminder in your comment on the first draft, Costin Manda.)

  • "Inside every big ugly language there is a small beautiful language trying to come out." -- sinelaw

Future reference / reading

The Hairy Vision

The vision is to reduce complexity for app developers, through abstraction and wise platform defaults. I'll try to paint a picture of the overarching vision, even though I know it is a bit hairy.

The vision ties back to the principle that:

  • A language determines WHAT you can & have to think about, but also HOW you have to think about it.

And the desired features that the language should be:

  • Designed for fast onboarding of complete beginners (as opposed to catering to a specific language community who already have the curse of expertise).

  • Very high level. Abstract away as much detail as possible. (Abstraction means to "draw away" the concrete nature of things, so that their commonalities remain.)

  • Have a small Meta-Language. For onboarding with low overhead. Counter-inspired by BuckleScript.

  • Simple, with a well thought out vocabulary. Inspired by Clojure (except cons and conj, which are too similar-looking).

Most languages presume the app developer will deal with a lot of relatively low-level concerns, to get pretty obvious benefits we should be able to take for granted (e.g. concurrency and parallelism). The sentiment "I don't know, I don't wanna know", as Rich Hickey put it, applies to this. However, it does not mean hostility towards learning, but a certain amount of healthy scepticism: if you have to document something to a large degree, have you really simplified it enough? (See: meta-language.) "Everything should be made as simple as possible, but not simpler", according to Occam, Sessions, and Einstein. Even too much documentation (i.e. meta-language) is a code smell, since code should be self-documenting. A language's complexity consists of its syntax and semantics, but also its meta-language (which should be minimised). A language should not burden the speaker/thinker with unnecessary complexity, as that cognitive effort is better spent on the task at hand: the domain is often complex enough in itself! We shouldn't invent problems to solve, even if the solutions could be beautiful.

Here is the start of a non-exhaustive list of what the application developer should and should not have to be concerned with. I.e. what the language should afford as syntax and semantics when I'm coding (which does not exclude how the language libraries/runtime implement it under the hood):

| What I want to think about: | I don't want to know or think about: |
| --- | --- |
| Splitting up the problem/data into separate pieces. | Concurrency vs. Parallelism, explicit parallelism constructs, goroutines, threads, Fibers, Actors, Channels, processes, CPU cores, Microservices, Distributed system topology, Mutexes, Locking, ... |
| Function composition. | Monads, Monoids, Class structure, Contextual preconditions, Inheritance rules, Type / Category Theory, ... |
| Choosing appropriate algorithms and data structures. | Immutability, Mutable vs. immutable references, Pointers, State management, Data-flow architecture, Complex types, type inference/conversion, ... |
| Expressing what should go together, co-location of code. | Pointers, Memory management, Stack vs. Heap allocation, Fundamental security measures, Sandboxing, Scopes, Closures, ... |
| How to organize code to communicate the ideas better to the reader, how to conform to conventions. | Performance, Syntax/keywords, Esoteric operators, ... |
| WHAT should be done, and to a limited extent also HOW it should be done. | WHEN it should be done (sync vs. async, eager vs. lazy), WHERE it should be done (runtime-environment concerns, but also whether the work should be done on another CPU core or another machine), ... |
| The User Experience (UX) and the actual problem domain! | Anything that detracts my thoughts from the UX and the actual problem domain (everything above in this column). |
| ... | ... |

You should simply be able to describe to the machine what the problem looks like, how it could be divided up, and how (i.e. with which algorithm) to solve it. Then the machine (i.e. the language runtime) should decide when and where to solve it (based on its hardware/environmental constraints): whether to single- or multi-thread parts of the work, or, in case local resources are or become strained, whether to distribute the work over multiple machines (depending on the measured latency of their inter-network connection). So the language should have an adaptive runtime; in lieu of that, it should at least have a platform-configurable compiler that could make some generally applied decisions based on the configured platform constraints. (A sketch of the idea follows below.)
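
A hand-wavy TypeScript sketch of that division of labour: the application code only describes how the work splits into independent chunks, while a (hypothetical) adaptive runtime owns the execution plan. Here it just uses Promise concurrency, but it could equally dispatch to worker threads, cores, or remote machines:

```typescript
// Hypothetical adaptive runtime: the caller expresses the split,
// the runtime decides the when/where of execution.
async function runAdaptively<T, R>(
  chunks: T[],
  work: (chunk: T) => R,
): Promise<R[]> {
  // A real runtime could inspect CPU count, current load, or network
  // latency here, and choose threads, cores, or machines accordingly.
  return Promise.all(chunks.map(async (chunk) => work(chunk)));
}

// Application code: only the problem decomposition is explicit.
const results = await runAdaptively([1, 2, 3, 4], (n) => n * n);
console.log(results); // [1, 4, 9, 16]
```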

I want a capable tool, so I can write lean programs. Lean, as in: not loaded with what ought to be low-level concerns. Actually, I'd prefer the language to afford a set of powerful primitives, and the capabilities (i.e. more powerful abstractions) could just as well live in libraries as be embedded in the language runtime/platform. As long as I don't have to seek out and configure all of those myself (re: JS Fatigue), but could import a curated set of sane, conventional defaults.

The End

One or more of these requirements might be conflicting / mutually exclusive. Maybe. But maybe not?

One can always dream.

This is a list of my preferences. Some would probably be quite controversial. Like my aversion to certain features which a lot of other people like (e.g. meta-programming). I might just not be familiar enough with them to have developed an appreciation for them.

I will try to keep this list updated if and when I change my mind on any point, which I am open to doing. I have already changed my mind from negative to positive on generics and pattern-matching.

What features would your dream programming language have (or don't have)?