The Data-First movement

We should talk.

Feb 05, 2025

This is an article aimed at techies! But if you’re not a techie, then why not have a peek just below at the projects I’ve listed that have similar goals and approaches to my Reality Computer project? Maybe I’m not a total nutcase after all!

This is a live document that I’ll keep updating, so do keep hitting refresh!

There’s a group of other researchers who seem to be thinking along similar lines to me.

Here’s a list of such people and their projects (the links keep you within this document and quickly get you down to their detailed entry):

Bret Victor’s Dynamicland
Omar Rizwan and Andrés Cuervo’s Folk Computer
Jonathan Edwards’ Subtext and other work
Ink & Switch’s Embark and other work
Alexander Obenauer’s Itemized OS
Michel Beaudouin-Lafon and Wendy Mackay’s Substrates
Clemens Klokmose’s Webstrates
Chris Shank and Orion Reed’s Folk Canvas
Gordon Brander’s Subconscious and Noosphere
Chris Gebhardt’s InfoCentral
Federico Pereiro’s Cell and TODIS
Rahul Gupta’s Syntropize
Guyren Howe’s FREST
Duncan Cragg’s Object or Reality Network

All of these projects have similar goals and much in common.

Indeed, I’d go so far as to say they represent something of a new wave of innovation in computing.

I’m going to call that wave the Data-First movement!

The following projects are “affiliates” of the movement, as I’ll explain later:

Obsidian and other “Personal Knowledge Management” systems
HyperCard and its various derivatives to the present day

Not the same as these…

There are other movements and manifestos related to this, notably Malleable, Local-First and Zero Data.

The Data-First movement has many similar beliefs to each of these, especially around empowerment and freedom for end-users over their devices and data.

Indeed, Data-First could be seen as included under the umbrella of Malleable, whilst in turn embracing many of the principles of Local-First and Zero Data.

Except… without apps!

And that’s a key aspect that these projects have in common that makes them “Data-First”, as distinct from Malleable, Local-First or Zero Data.

Let’s take a quick dip into this major driver behind these projects…

App-free computing

Another name for the core driver behind this movement could be “app-free computing”: operating systems or environments without apps or application programs.

I wrote my thoughts on that here:

Apps considered harmful

Duncan Cragg

December 18, 2023


In summary: apps take over and trap our data, splitting it all up into islands. The same interface elements and tools are re-invented in slightly different ways. Our access to our data is controlled by far away techies and Big Tech corporations. This is dis-empowering for all of us and can lead to us being constantly watched and manipulated. We don’t truly own our own devices, data or identities.

So a key characteristic of a Data-First project is to be exploring ideas that escape this “app trap”, even if that result isn’t explicitly stated.

Data-First is more positive

But: I find the whole “app-trap” angle to be a bit too negative for an ongoing movement.

I think we need to describe what we want, not what we don’t want!

Of course as soon as you dispose of apps, you immediately have no choice but to focus on the data instead. The data has now been freed from the app trap, and we now have to decide what we want to do with it!

So another way of characterising a Data-First project is that it focuses directly on the presentation and management of end-users’ data, instead of making them run individual application programs to access it.

Data-First in five bullet points

Now, no movement is complete without a pompous manifesto. But for now we’ll settle for just a short list of Data-First Principles:

End-user computing: easy as a spreadsheet
App-free: an OS or a browser only
Data-focused: viewing, editing, structuring, laying out and navigating
Live data: the system or someone else may change it while you're looking
Local-First: no silos; generic data and update servers connecting local data

I thought I’d ask a free online AI for a logo, but it wasn’t all that good.

Local-First, Collaborative Multimedia Spaces

Clearly, a Data-First project should easily allow the creation of live, interactive, collaborative, multi- and mixed-media or hyper-media spaces, from simple (well, nestable!) pinboards and whiteboards up to full 3D environments.

The ability to quickly set up a simple chat system, for exchanging text and image messages along with user “presence”, should drop naturally out of its design.

And, we’ll want that to be Local-First, not our Data-First-sucked-into-a-silo!

Data-First needn’t be an operating system, or 3D

Even though I personally prefer Data-First as an app-free operating system, a browser-like application can still qualify for the Data-First list; a number of the projects here take that approach. As long as it’s driven by a desire to put a user’s data right up front rather than behind an app-like interface, it’s good.

I also believe that it should work just as well in 3D or XR. I don’t expect any other project in the list to take that view, but a sense of creating an interactive space or a 2D gaming world would be nice. Dynamicland is getting there, given that it all happens within a 3D physical space!

The Data-First Chat Challenge

I thought it would be interesting to pose a challenge for all these projects to help bring out their different features and approaches.

And the smallest useful slice of “Collaborative Multimedia Spaces” is chat.

So - do show us how you’d implement this in your Data-First system:

Chat functionality:

two users at two devices (let’s say they can already see each other over WiFi)
chat room or message sequence, hosted and owned on one of those devices
both users can create and add messages to the end of this sequence
each can see the sequence update instantaneously with every new message

If your project has an end-user programming language, then obviously just show some code, but otherwise describe how a user could achieve this using a simple interface, maybe using some event-action, pub-sub or syncing mechanism in your Data-First system.
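To make the challenge concrete, here is a minimal Python sketch of the event-action or pub-sub flavour of an answer. It isn’t taken from any of the projects below: the ChatRoom class and its subscribe and append methods are hypothetical names of my own, and both “devices” are simulated inside one process, with no real networking.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (author, text)


@dataclass
class ChatRoom:
    """Append-only message sequence, owned by the hosting device (hypothetical)."""
    owner: str
    messages: List[Message] = field(default_factory=list)
    subscribers: List[Callable[[Message], None]] = field(default_factory=list)

    def subscribe(self, callback: Callable[[Message], None]) -> None:
        # Replay existing messages so a late joiner sees the whole sequence.
        for message in self.messages:
            callback(message)
        self.subscribers.append(callback)

    def append(self, author: str, text: str) -> None:
        # Messages can only be added to the end of the sequence.
        message = (author, text)
        self.messages.append(message)
        # Push the new message to every subscribed view straight away.
        for notify in self.subscribers:
            notify(message)


room = ChatRoom(owner="alice")       # hosted and owned on Alice's device
alice_view: List[Message] = []
bob_view: List[Message] = []
room.subscribe(alice_view.append)
room.append("alice", "Hello Bob!")
room.subscribe(bob_view.append)      # Bob joins late and gets the replay
room.append("bob", "Hi Alice!")
```

After this runs, both views hold the same two messages in the same order. A real answer would swap the in-process callbacks for whatever sync or notification mechanism the system offers.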

Gathering Data-First projects

We’re looking more for open source and academic works than commercial ones. If Data-First hits the headlines or the marketplace, then all hell breaks loose and nothing means anything any more! We want research-like projects, driven by some vision of a future of everyday human interaction with our computing devices.

Below, I will briefly describe each of the projects in the list, in a rough order of “project fame and Data-First-ness”, with some grouping for similarity. I also have a section to discuss how the Chat Challenge above may be attempted.

Dynamicland

Summary: Dynamicland’s philosophy is “be present on-site” and “physical” computing: you interact with physical objects in collaboration with others also physically present. Those physical objects are effectively enhanced with computation abilities. There’s a focus on numerical and graphical data types.

Lead: Bret Victor

Chat Challenge:

One of the goals of the Challenge is to bring out the key differences or unique characteristics of each project. And as Omar Rizwan pointed out to me, (a) Dynamicland and Folk Computer are more about non-textual data, and (b) chat isn’t even something you’d implement in these systems because you’re standing right next to people in the physical room! So the only other aspect that may be interesting is transcribing a conversation onto paper.

Folk Computer

Summary: Everything in Dynamicland’s write-up above essentially applies to Folk Computer as it’s a very similar project.

Leads: Omar Rizwan and Andrés Cuervo

Chat Challenge:

See the above Dynamicland notes!

Subtext, etc

Summary: A research odyssey on the good ship Subtext, an evolving programming language. The goal is to find the way for normal folk to use computers that’s in the gap between spreadsheets and “The Stack”.

Lead: Jonathan Edwards

Chat Challenge:

I’d be interested in Jonathan’s response to this.

Embark, etc

Summary: Embark at Ink & Switch has a spreadsheet-like feel and a dual interface, with direct access to data on the left and a data view on the right for rendering maps, calendars, etc. Ink & Switch have many other excellent Data-First projects, including Pushpin, Pixelpusher, Jacquard, Patchwork and Capstone.

Leads: Geoffrey Litt, Paul Sonnentag, Peter van Hardenberg, Martin Kleppmann, Josh Horowitz, Alexander Obenauer, et al.

Chat Challenge:

I&S have created a back-end sync tool, Automerge, which is the basis for collaboration in many of their projects, thus readily enabling chat.

Itemized OS

Summary: in The Itemized OS, everything is an Item; Items can be arbitrarily mashed up.

Lead: Alexander Obenauer

Chat Challenge:

I’d be interested in Alexander’s response to this, but I imagine this may work through the Pub-Sub mechanism he mentioned, and the ability to make your own lists of text Items. There’s a lot of text on Alexander’s site, but I found some apparently relevant pages: here, here, here and here!

Substrates

Summary: Substrates are layers of data that can depend on each other. At the top is the lowest level of abstraction, such as pixels; at the bottom are semantically-rich objects. There are re-usable tools or “Instruments” for acting on substrates. (Incoming events or actions from tools are passed down the levels of abstraction to the substrate that can handle them.)
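The event routing described in that last parenthesis can be sketched in a few lines of Python. This is only my reading of the idea, not code from the Substrates prototype: the Substrate class, its handles set and the dispatch function are all made-up names.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Substrate:
    name: str
    handles: Set[str]  # event kinds this layer can act on (hypothetical)


def dispatch(stack: List[Substrate], event: str) -> Optional[str]:
    """Pass an event down the stack; return the name of the first layer that handles it."""
    for substrate in stack:  # ordered top (pixels) to bottom (semantic objects)
        if event in substrate.handles:
            return substrate.name
    return None  # no layer could handle the event


stack = [
    Substrate("pixels", {"paint"}),
    Substrate("shapes", {"move", "resize"}),
    Substrate("calendar-events", {"reschedule"}),
]
```

Here dispatch(stack, "reschedule") falls through the pixel and shape layers and lands on the calendar layer, while an event that no layer knows about returns None.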

Leads: Michel Beaudouin-Lafon and Wendy Mackay

Chat Challenge:

Substrates’ prototype (Stratify) does apparently include a backing database, ShareDB, which can be used to sync instances, which would make our chat challenge relatively easy to build.

Webstrates

Summary: Webstrates is a way to mirror a DOM across browsers via a syncing database backend. It allows transclusion and sharing tools to work on page elements.

Leads: Clemens Klokmose (also Substrates’ Michel Beaudouin-Lafon)

Chat Challenge:

Chat should obviously be trivial to implement in Webstrates.

Folk Canvas

Summary: Folk Canvas has in-browser declarative programming; spreadsheets and spreadsheet-like interactions with page elements including text and graphics; transclusion, cross-linking within the page, etc. It allows “wires” between page elements to have functional expressions on them declaring some dependency. They also have some ideas around networking.

Leads: Chris Shank and Orion Reed

Video: https://www.youtube.com/watch?v=mz1F55UoVQs&t=2182s

Chat Challenge:

I’d be interested to hear how Chris and Orion see this being implemented.

The following projects venture beyond research, and are more network- or back-end-oriented, in contrast to the more front-end-oriented projects above.

Subconscious and Noosphere

Summary: The Subconscious project, embracing the Noosphere network, was a way to extend PKM into a decentralised network.

Lead: Gordon Brander

Chat Challenge:

I’d be interested in Gordon’s response to this.

InfoCentral

Summary: Somewhat Linked Data-adjacent, InfoCentral specifically states “app-free” as a goal. It alludes to providing the missing decentralised, semantic back-end to novel front-end approaches such as those listed above. “Information should never be trapped at service or application boundaries”. The solution is a decentralised immutable “Global Information Graph”. Short presentation linked here.

Lead: Chris Gebhardt

Chat Challenge:

Chris explained to me how this would be implemented in his system. In brief, new messages are immutably logged in the internet Graph, linked together with hash references. Clients subscribe to change events and new items added to the Graph over a second query or event protocol.
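The “linked together with hash references” part is easy to illustrate. The Python below is a generic sketch of a hash-linked message log, not InfoCentral’s actual data format: the choice of SHA-256, the JSON encoding and the make_entry and verify_chain helpers are all my own assumptions.

```python
import hashlib
import json
from typing import List, Optional


def entry_hash(body: dict) -> str:
    # Hash a canonical JSON encoding so the same content always gives the same ID.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def make_entry(author: str, text: str, prev_hash: Optional[str]) -> dict:
    """Build a chat entry that references its predecessor by hash."""
    body = {"author": author, "text": text, "prev": prev_hash}
    return {**body, "hash": entry_hash(body)}


def verify_chain(entries: List[dict]) -> bool:
    """Check that every entry's hash is intact and links to the entry before it."""
    prev_hash = None
    for entry in entries:
        body = {key: entry[key] for key in ("author", "text", "prev")}
        if entry["hash"] != entry_hash(body) or entry["prev"] != prev_hash:
            return False
        prev_hash = entry["hash"]
    return True


first = make_entry("alice", "Hello Bob!", None)
second = make_entry("bob", "Hi Alice!", first["hash"])
log = [first, second]
```

Because each message’s identity is derived from its content plus the hash of the previous message, editing an earlier message after the fact makes verify_chain fail, which is what gives the log its immutable character.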

Cell and TODIS

Summary: A notation for data and functions. “Single representation of data; Single dataspace; Call and response; Logic is what happens between call and response; Interface is call and response”. Implemented in a language called Cell.

Lead: Federico Pereiro

Chat Challenge:

I’d be interested in Federico’s response to this.

Syntropize

Summary: Syntropize has workspaces for accessing live linked data. Seemingly related to SOLID, the original HyperData environment which deconstructs the already app-free WWW architecture (REST), adding further decentralisation in the form of user data Pods.

Lead: Rahul Gupta

Chat Challenge:

There is a SOLID chat that uses the Pub-Sub protocol linked here. Plus TBL has this article linked. Rahul’s suggested approach is basically the same as this (Rahul was involved in the protocol’s design).

FREST

Summary: “FREST is a set of protocols that define network-first, relation-first, GUI-first data manipulation and programming environments.” Gives users direct access to data in relational form. Relational data and function invocation by table trigger on arrival of args, which can be distributed across many hosts, pushing through state dependencies asynchronously. All network-addressable data elements also have default renders or views.

Lead: Guyren Howe

Chat Challenge:

Guyren has responded below in the comments!
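As I read Guyren’s comment, the key move is to hand each user a partially-applied view of the chat table with their own id already filled in. Here is a rough Python analogy of that idea. It has nothing to do with FREST’s real protocols: chat_table, insert_entry and view_for are hypothetical names, and a plain list stands in for the relation.

```python
from functools import partial
from typing import Callable, List, Tuple

# Rows of (user_id, message); a stand-in for the chat relation.
chat_table: List[Tuple[str, str]] = []


def insert_entry(user_id: str, message: str) -> None:
    """Assert a new row into the chat relation."""
    chat_table.append((user_id, message))


def view_for(user_id: str) -> Callable[[str], None]:
    """Return an insert function with the user's id already filled in."""
    return partial(insert_entry, user_id)


alice_post = view_for("alice")
bob_post = view_for("bob")
alice_post("Hello Bob!")
bob_post("Hi Alice!")
```

Each user only ever holds their own view, so anything they add to the chat is automatically recorded under their own id.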

I put my own project last (but not least!)

The Object Network / The Reality Network

Summary: The Object Network or Reality Computer is built by an app-free, desktop-free operating system presenting instead a single global 3D Virtual World with all of our live digital property pinned together and around us. Described as “Live HyperData”:

The Inversion (simply)

Feb 5


Lead: Duncan Cragg

Chat Challenge:

Chat in the Object Network can be trivially implemented without code by simply having a list object of text objects configured to be append-only (which many users can be given visibility and editability for), or as a more advanced configuration with two types of object and three simple rules as described at the end of the article linked here.

The following projects and products have a Data-First feel to them, but I’m not going to officially label them as that, as they don’t (yet) have a back-end server or networking story, so they don’t meet our minimal Local-First requirement (in other words, “Local-only” isn’t the same as Local-First!).

PKM: Airtable, Obsidian, Notion

Summary: End-User-focused data management (or “Personal Knowledge Management”) beyond spreadsheets: Airtable, Obsidian, Notion, etc. This also overlaps with “no/low-code” offerings, mind mapping tools, infinite canvases, programmable notebooks and so on. These offer normal users direct access to their data, and often some degree of either event-action scripting or spreadsheet-style formulae to bring it to life. These products could be considered part of a “Data-First movement” (see what the CEO of Obsidian has to say about “File over App”), but most still have an app-centric philosophy and are mostly commercial, which seldom puts end-user needs before the need to monetise them. Most, for example, rely on lock-in to their dedicated back-end services.

Chat Challenge:

It’s hard to imagine these systems being easily used to build a chat. I’ve discussed that a little in this article linked. But in any case, most fail on Local-First.

HyperCard and followers

Summary: the original low-code database and end-user application builder! HyperCard (and related projects Boxer, Decker and LiveCode) was based on a database manifest as cards in a stack. Buttons could be used to jump around in a hypertext fashion. A set of cards could have a common background layer. Scripting was through event-action programs written in a fairly natural language. LiveCode is a current commercial manifestation of HyperCard.

Chat Challenge:

You can’t do this “in the model”, as there was no simple native networking in HyperCard. It was, however, possible to write low-level programs to directly access TCP.

Plan

My plan is to work with the leads of each project to discuss bringing up some kind of forum and a webpage and/or a wiki.

And I’ll update this page as things evolve...

Can you suggest a project for the Data-First movement list? Is that a good name for it? Drop your thoughts into the comments below.


Guyren Howe

Feb 25

Frest https://frest.substack.com/ is relational at its core.

To build a chat app, you would define the type that represents a chat entry, and users would insert their messages into it. You might give each user a partially-applied view of the table that had already filled in their id. Knowing only that address, they can only assert values into the chat through that.

If you wanted more of an “app” than just the standard FileMaker-like interface, you could override the UI for the chat type, generating custom chrome values containing the functions you wanted available in the UI.


Federico Pereiro

Feb 16

Hi Duncan! Federico here, from TODIS/cell. Happy to have been included in this list!

I responded to the C to F converter in the Future of Coding Slack :), given the lengthy response I won't repeat it here. But I think we can agree that some sort of reactive system that on the update of one temperature, updates the other, is necessary at any rate.

Concerning your other questions:

> Does the “single dataspace” (presumably spanning internet domains) have globally-unique IDs pointing to instances of your data structures, or do you have paths that start with domain names like HTTP? How does the “single dataspace” cross domains, in other words?

I would leverage as much as possible what already exists: HTTP and DNS. I would make it as easy as possible for each person to own one node (one server, really) and send and receive HTTP traffic with serialized calls.

HTTP calls, within cell, could be "polling" or auto-updated through a websockets signal, so that if they read data from another node, they get updated.

> And does this work in a decentralised way?

I think my goal here is to make it as easy as possible for everyone to have their own node. Either on a VM or dedicated server in the cloud that they own/rent, their own computer (with the drawback of intermittent connectivity) or provided as a service by myself (which would run on dedicated servers). I'm interested more in an open protocol and way of doing things, and tackling the distributed ownership of machines later.

I'm also thinking (thanks to your stimulus with these questions) about having a server that acts only as a relay between two local computer nodes (say, two phones). The server is just a relay and only stores data until the receiving node receives it. The communication is encrypted end to end through public keys. That would keep the convenience of a server (minus the backup) but with much more decentralization.

> And will it propagate data updates?

Yes, ideally, as long as the references to external nodes are kept being polled or there's a push mechanism to notify of changes. What excites me is that the model of call and response works for any source of data (internal or external, under your control or not).

> Could the shape of data above the basic lists, hashes, etc. be standardised in a global schema for common types such as calendar event, contact, etc.?

I surely hope so. It's just a matter of structuring it decently and having a number of people using it. And even if they don't, if they're explicit enough, you might be able to convert it to your own structure on the fly, with an LLM. My hope is that by setting these four base data types, and an unambiguous and readable way to serialize them as text, it will be easy to build new "types" (or "schemas") on top of that.

Concerning your approach, it is great! What I don't resonate with is the lack of distinction between "what you control" and "what you don't". I can see everything as interconnected, but personal (or organizational) control over a part of the dataspace is important. I'm concerned about "deep linking" to someone else's dataspace, it reminds me of NAS (network attached storage) and gives me an unreliable vibe. I could be completely off, though.

I do however strongly resonate with the need to break the app paradigm into something more decentralized, while keeping its practical aspects. Also, the combination of the web and spreadsheets is killer and I'm betting on that full on.
