/End-user programming

End-user programming

In the classic software development model, engineers perform a save/build/run cycle that can take seconds, minutes, or longer. But end-user programmers need something with more immediacy. That means shortening the save/build/run loop, ideally until it collapses to nothing.

Spreadsheets show their pioneering approach here. A spreadsheet author “runs” the sheet by entering a formula into a cell and pressing enter. The entire sheet is recalculated without any additional steps, and for most sheets on modern computers, this happens instantly.

This liveness is visible in another end-user programming success story: SQL, a language for interacting with databases. Business analysts, product managers, and other non-engineering but motivated power-user roles in organizations often teach themselves a bit of SQL and use it to interact with their company’s production database (or a copy of it).

In spreadsheets and SQL, the user gets their immediate results after pressing enter. Taking this idea further, there is a whole academic community around “live coding.”

Adding live coding to programming tools is a huge technical and design challenge. Microsoft Research has explored this space in depth including creating several ground-up programming languages and IDEs that support live coding.

But the “living systems” quality of end-user programming is broader than the fast feedback loops of live coding. It also includes the ability for the system to change itself from within, giving the end-user programmer a feeling of open-ended possibility and complete ownership over their tools.

A small example here is is the developer tools console available in most desktop web browsers. Here the user can grab any element on the page and change it: its color, its size, whether it appears at all.

The lineage of Smalltalk, Squeak, and Pharo takes this idea to its furthest: if a system is fully written in itself (sometimes called “self-hosted”), the user can change absolutely anything about the system. Unlike a traditional operating system, where deeper changes require a reboot or even a separate development device, self-hosted systems can inspect and change anything on the fly. They are living systems.

A third quality that we think is necessary for end-user programming is an in-place toolchain. The user should be able to edit their programs without installing additional tools or programs. Further, they should be able to use an interface and set of abstractions that is as close as possible to the ones they use for their regular daily work.

On the iOS platform, for example, creating an app requires not only a separate toolchain, but an entirely different platform (macOS) and a $99/year developer program fee. This creates the widest possible chasm between end users and software developers.

In-place toolchains are not just about pre-installed tools. The truly hard part of this is allowing the user to apply the concepts and interface paradigms that they have already learned in their daily use of the application.

Compare to Unix, for example. A user of a Unix system learns to type commands, edit text files, and copy/move/delete files and directories as part of their regular use. And when they are ready to write a program, they can continue using the same concepts and interface, because a Unix shell script is just a series of commands saved into a text file. That text file can be copied, moved around, edited, and deleted just like any other file.

“A humane-interface principle is that the system itself should be built out of the same kind of pieces with which you are familiar from your everyday use of the system.” — The Humane Interface, chapter 5

Zapier and IFTTT offer tantalizing glimpses of end-user accessible automation for the web and cloud APIs. But by the full measure of the in-place toolchain idea, these fail because, for example, the act of automating your smarthome components with IFTTT requires a completely different interface and set of concepts from using those components day-to-day.

Automating a WeMo smart plug to turn on a light each night with IFTTT.

A similar problem of “tool in its own standalone universe” exists in academic examples of end-user programming. Eve, Sketch-n-Sketch, and Pharo are technical design marvels. But they require the user to enter an entire new world of tooling and concepts, leaving behind everything they already know about using computers. They have to be motivated to want to program to begin with, rather than having it there waiting for them in an environment they are already using, such as their word processor, their web browser, or their photo editor.

Coda gets high marks for creating a document editor that can be enhanced with calculations and automation. In theory, a user might want to use this as an alternative to Google Docs, Microsoft Word, or Dropbox Paper to author a document without programming and then add those capabilities later.
Airtable offers the user a conventional spreadsheet for storing tabular data with the ability to move to programming via advanced filtering and grouping.

In-place toolchains are one of the greatest obstacles to widely-used end-user programming.

With the context of those three qualities and what has come before, our research lab set out to do some end-user programming experiments. Our testbed was a tablet thinking tool called Capstone.

Capstone is a prototype by Ink & Switch which offers mixed-media cards and freeform inking on a shared canvas. It’s intended as a place for creative professionals to develop their ideas through research collection, sketching, and mood boards.

If we assume that a creative professional is using Capstone on a regular basis for their notes, clippings, and research, what programming or automation capabilities can we offer that would give them more power, flexibility, and customization within that context?

We ran five experiments to try applying embodiment, living systems, and in-place toolchains to Capstone.

Edit card source

The Capstone user interface consists of cards on a canvas. As a simple starting place, we built an editor inside the Capstone system that allows the user to edit the underlying code of a given card. Think of this as similar to the web browser’s “View Source” option, but with write capability.

Text card on the left, slide-out Javascript/React editor on the right. Here, the user makes small customizations to the color of the card.

On the surface, this seems to fulfill the in-place toolchain goal: the user needs nothing additional to start editing the code of their chosen card. It’s a live coding environment where every keystroke re-renders the card so that users see the results of their work right away.

However, we found the actual use of this approach uninspiring. It felt obvious, not a bold new direction that hasn’t been tried before. From the user’s standpoint, it felt editable but not inviting to the user. The jump from cards-on-a-canvas in a touch/stylus interface to a cryptic code editor with Javascript and React code was incongruous.

Data pipelines

For our next experiment, we wanted to embrace the cards+canvas model and touch interface of Capstone, while also borrowing some ideas from Unix pipes.

What would this look like for a more visual environment like that of Capstone’s cards-on-canvas? Our approach was to allow each card within Capstone to take input and provide output, allowing the user to chain them together with the visual equivalent of a pipe operator.

The now-defunct Yahoo Pipes allowed users to pull data from websites and APIs to produce outputs in a visual environment.

We added fields to each card: uses and exposes. Like the names suggest, cards could now wait for some input, and expose some output. Those inputs and outputs were strongly typed to allow for exchange of richer datatypes like arrays and objects.

In the examples below, we’ve borrowed a use case from this Emacs literate programming example. Here, a teacher begins with a table (spreadsheet or CSV) of student grades and wishes to create an ad-hoc dashboard showing pass/fail for each student.

A card contains a CSV that exposes raw text. Another card consumes that text and turns it into a table. Colors give cues as to which cards can be connected or are currently connected.

By dragging the “exposes” label from one card onto another card’s “uses” label, the user creates connections between cards. Connected cards are color-coded, inspired by spreadsheet cell-and-range color coding. Based on user feedback, we found that users wanted to keep cards in a pipeline physically near each other.

A “grep” card type filters input.

By combining multiple simple cards users could build data-processing pipelines. Further enhancements included multiple inputs/outputs and renaming inputs or outputs.

A multi-step pipeline including multi-input cards and named inputs/outputs.

Many attempts at more accessible programming languages are weakly typed, under the hypothesis that types are unforgiving to newcomers. Our team’s instinct is the opposite: strong typing, with the right interface, can be friendlier for newcomers by making program components “snap” together like building blocks. If the blocks fit, the program will probably work. See the previously mentioned Scratch; Elm’s strong typing for eliminating runtime errors; and Hazel’s “typed holes” live programming environment.

Strong typing prevents the user from connecting cards that don’t fit.

This standalone prototype is available here.

Unix pipelines continue to be the reigning champion for composability — something not yet replicated in GUI environments. We feel that this CSV pipeline experiment produced positive findings supporting the value of strong typing (here as uses/exposes), and showed a potential interface for in-place toolchains that don’t break out of the touchscreen interface.

On the other hand, visual embodiment of the data pipeline created some problems. It adds visual clutter (a problem with many/most visual programming systems). Furthermore we found a tradeoff between grouping cards together in a way that makes the program flow clear (e.g. pipeline goes top-to-bottom or left-to-right) versus grouping cards in a way that reflects how the user wants to think about their content more generally.

REPL

For our third experiment, we decided to relax the constraint of an in-place toolchain in exchange for better results on other dimensions. In particular, we wanted to see if the REPL (read-evaluate-print loop) used in many programming systems would be of value.

The built-in REPL for Ruby.

REPLs are traditionally built on wire protocols. That is, the user’s console sends commands to the runtime system over the network. But Capstone uses a synchronizing data model which keeps all of the visible elements stored in a live document. This allowed us to build the REPL by writing directly into the document from a session on another device.

The data layer for Capstone is based on a peer-to-peer CRDT system called Hypermerge.

Modifying canvas background color and moving cards via commands in the REPL.

We absolutely loved the resulting feel of this experiment.

Being able to interact with a live system felt magical. Much like browser Dev Tools, the user can change appearance of anything via CSS. They can also interact with the cards data model to move cards around on the screen, move cards between boards, or absolutely anything else within the system. But unlike Dev Tools, all of these changes persist. The user has modified their workspace, customized it to their taste, via a fully programmable interface.

Our team was energized by this result and we instantly wanted this capability for all of our existing systems such as the desktop computers we use in our regular work lives. But we also quickly ran into what would likely be the biggest chunk of work in making a system like this real: API design.

Typically software systems are built with an internal API used only by the professional engineers building that system. These functions are often minimally documented and have obscure names that may reflect history or even internal in-jokes, whereas a public-facing API is designed separately, well-documented, versioned, and kept more stable.

Our finding here is that for a living system to work, the internal and external APIs need to be mostly the same.

We also noted that living systems produce a tension between hackabilty and the danger of user breakage. For example, the user can change a card’s background color just as easily as executing a command that would discard every card onscreen or even put the system into a crashed state or infinite loop. What to allow, how to surface errors, and how to recover are deep and challenging questions we did not explore in the course of this experiment.

Hooks

Our next move was to extend the REPL with hooks for system events, such as the user dragging a card around the canvas.

With a hook for card dragging, we could then build a “window manager” within the Capstone environment:

Setting a hook on card movement to snap to a grid boundary.

Programmable window managers are an inspiring source of prior art on this. See xmonad, Phoenix, and HHTWM.

Hooks bring a significant downside: the computation is no longer visible. The user’s code could do things hidden behind the curtain, which is the opposite of embodiment.

A parallel here is triggers and stored procedures in databases like PostgreSQL. They also have the downside of no embodiment. Hence, while SQL is an end-user programming success story, triggers are typically reserved for professional database engineers rather than SQL console dabblers.

Bots

For our final experiment we wanted to explore how the user could create long-running programs (or daemons) inside of Capstone that solved the embodiment problem of hooks.

Typically when we say embodiment we mean a visual element onscreen. But we took inspiration from the world of chat bots: what if computation was embodied as something with a bit of personality, a sense of being an actor or a collaborator in the system along with human collaborators?

An autonomous bot uses a tiling algorithm to keep a canvas tidy as the user moves cards.

Like the REPL experiment, Capstone bots still suffer from no in-place toolchain. The end-user programmer must write the script in Javascript and then issue a command to add the card (or update an existing one) in their Capstone workspace. Technical details are available in the pull request.

A subtle but important piece of the bot interface design is that a bot subscribes to all changes in the document (similar to reactive programming). This is instead of subscribing to specific event types (similar to event-emitter) such as as moving or deleting a card.

Another variation on this experiment was to allow bots to expose a small UI. Since the bot’s card already has screen real estate, why not allow programming direct interactivity?

A bot’s card offers buttons to trigger actions: here, creating a timestamp for a journal entry.

Although creating bot code does not satisfy the in-place toolchain goal, embodying them as cards has its own set of in-place toolchain benefits. In the same way a Unix user can manipulate a script file like any other file, a Capstone user can manipulate a bot like any other file.

A user’s Bot Bin, with favorite bots saved. The user uses the mirror command to duplicate the bot and then bring it to the desired location.

We noticed user behavior like creating a board full of their favorite bots and mirroring those cards onto the shelf to take them to the board where they wanted it. Deleting a bot is done by throwing the card off the screen like any other card. All operations that worked on (for example) an image card worked on a bot card as well. Computation follows the same rules as data (it can be cloned, shared, stored, etc). This feels important.

The potential for end-user programming remains largely a dream in today’s computing devices. The huge amount of work done by academia and industry indicates this is a very hard problem indeed. But in working on this problem via these experiments, our team feels that it is achievable with enough concentrated effort from our industry, and worth doing so.

Our positive findings from these experiments include the strength of combining strong typing (uses/exposes) with visual program flows; the magic feeling of interacting with a living system; and embodiment of long-running computation as bots, visually represented as cards with properties the user already knows.

Our negative findings included that our best experiment (REPL) still required an external/separate toolchain; that visual arrangement of computation cards can get messy quickly; and that API design and documentation will be a huge challenge for a powerful end-user system.

At Ink & Switch we continue to believe that the end-user programming utopia is reachable. Are you working on this problem, or have thoughts on what we’ve written here? Get in touch: @inkandswitch or [email protected]


Appendix A: Data layer as an interface

Three of the five Capstone end-user programming experiments used a different device for the programming interface. In both cases, we were able to build on the CRDT / Hypermerge data storage layer rather than use a traditional network connection such such as ssh or nREPL.

It feels like this difference is significant.

We could speculate that this is the difference between imperative vs declarative code. Imperative programs say “run this function on the host.” Declarative says “update a portion of the document and let all subscribers to the document choose how to render the new state.”

Even in the case of a REPL, which is by nature imperative, simply storing the history of commands within the document gives us a scrubbable history of changes made to the system that is inspectable by anyone with access to the document.

We are already exploring this topic in our next Ink & Switch project. We’re experimenting with live documents as a basis for a new programming model; realtime version control that combines the best parts of Git/Github and Google Docs; inspectable change history as a way to surface the power of CRDTs to end users; and what happens with end-user programmable environments when everyone is connected to a shared document.

Appendix B: The web stack for sandboxing and hackability

We chose Chrome OS as our platform for Capstone over more mature options like the iPad or Surface largely because web technologies offer vastly more possibilities for end-user programming. On other platforms, we would have needed to embed a scripting language and runtime such as Lua.

Some examples of apps that make great use of the web’s native extensibility:

Embedding the Javascript toolchain for building and running Javascript programs in place (such as the editable cards shown in our first experiment) is straightforward. In this case we bundled the Babel compiler via @babel/standalone for building JSX files. As an added bonus, it bundles most popular presets and plugins.

One open question is how to use external libraries. A potential solution is using a service like unpkg to provide users with a fetch and cache mechanism.

Overall, the web is the only full-featured platform ever created that allows instant download and execution of a program written by a stranger (by visiting a website URL). But the perfect sandboxing of the Javascript runtime means that this action is almost completely safe. This is a truly stunning technical achievement and means that the web is a promising place for end-user programming capabilities.