0003 Splitting logics into builder declaration and interpretation
This RFC is marked as non-actionable due to its overly wide scope.
1 Summary
Logics within CatColab specify an underlying double theory, a collection of front-end widgets that users can use to build models, which analyses are available to models in that logic, and maybe some other odds and ends besides. It seems possible that a better separation of these aspects would make the creation and modification of logics simpler for developers, and also provide a clearer foundation for how we might think about migrations between logics.
Here I will sketch out why viewing logics as functors from instances of a schema to models of a theory could be a good idea, and what an implementation might look like. Much of this comes from conversation with Jason Brown and Evan Patterson.
The motto that I might try to summarise this with is something like the following:
The core should know about mathematical abstractions such as double theories, and the front-end must therefore have some understanding of this. But, as much as possible, the front-end should be a builder for instances of schemas with the extra knowledge of how these instances are to be interpreted in the core.
This RFC touches on two suggestions:
- Rethinking the front-end editor as an editor for builder declarations (Section 3.1)
- Allowing a single front-end logic to have multiple interpretations (Section 3.2).
It is reasonable that each of these suggestions merit their own RFC with more detail; this current document exists more to give context to those future conversations.
2 Motivation
Consider stock-flow diagrams. These are realised in CatColab as models of a tabulator double theory, and for good reason: the compositional structure of tabulators is useful. However, for many applications (i.e. for many choices of analyses), we could view stock-flow diagrams as Petri nets, where flows become transitions and links simply become another input arc; there exists an ODE semantics on Petri nets that, under this interpretation, precisely recovers mass-action semantics on stock-flow diagrams. Alternatively, we could consider stock-flow diagrams as causal loop diagrams (CLDs), where flows become spans and links become arrows to the apex of a span; there also exists an ODE semantics on (\mathbb{N}-graded) CLDs that, under this interpretation, precisely recovers mass-action semantics on stock-diagrams, but, furthermore, motif finding will show that flows do not compose.
In short, there are many double theories in which we might wish to interpret stock-flow diagrams, but the way in which users can build stock-flow diagrams (i.e. the fact that they can models as colimits of elementary stock-flow diagrams, namely a single stock, a single flow between two stocks, and a single link into a flow) remains unchanged.
This suggests that we could separate out the actual data structure that is being built in a notebook from the way in which it is being computed with by the analyses. That is, we could define the logic of stock-flow diagrams to consist of
- a single front-end builder declaration, namely the ability to create stocks, flows, and links; and
- a collection of interpretations of models thus created in various double theories.
3 Mathematical specification
The mathematics here is somewhat informal, since I have yet to actually be careful with proofs. This should not, however, affect the reading of this RFC, whose purpose is to figure out whether or not this seems like a sensible path to follow in the first place.
Definition 1 A (CatColab) logic \mathtt{L} consists of
- a small (possibly necessarily finite) category \mathsf{B}, called the builder declaration; and
- a collection of (1-)functors I_i\colon[\mathsf{B},\mathsf{Set}]\to[\mathbb{T}_i,\mathbb{S}\mathsf{et}] for some double theories \mathbb{T}_i, where each I_i is called an interpretation (of \mathtt{L}) in \mathbb{T}_i.
3.1 Builder declarations
It is reasonable to interpret the builder declaration \mathsf{B} as a schema, used for building instances \mathsf{B}\to\mathsf{Set}. However, this is not quite the attitude that I intend. Rather, we should consider \mathsf{B} as specifying a system of dependent types.
I am not a type theorist, at all, so I will give an example rather than use more words (and the words I do use will be softened by scare quotes). Consider the category \mathsf{B} = (E \rightrightarrows V) which we recognise as the schema for graphs, since to give a functor G\colon\mathsf{B}\to\mathsf{Set} is precisely to give sets G(E) and G(V) along with source and target functions s,t\colon G(E)\to G(V). One can think of constructing the object G(E) \rightrightarrows G(V) by gluing together basic blocks: we can posit the existence of some e\in G(E) (a “walking edge”) and this “automatically” implies the existence of two vertices, namely s(e) and t(e). We can then posit the existence of, say, x,y\in G(V), and make the identifications s(e)=x and t(e)=y.
Let’s compare this to what happens in practice in e.g. CatColab, where a user is building a graph as a model of the theory of graphs. The initial state of a new notebook has G(E)=G(V)=\varnothing. The user can posit the existence of an edge e\in G(E) by creating a morphism and naming it e. But this leaves the notebook in an incomplete (“non-compiling”) state: the arrow e needs to have a source and target set; CatColab does not freely create two objects for the source and target.
This difference in behaviour (or attitude) can be seen as an artefact of working with the 1-categorical schema E\rightrightarrows V rather than the corresponding (under some correspondence that we briefly describe in Section 3.2) double-categorical “schema” V\overset{E}{\mathrel{\mkern 3mu\vcenter{\hbox{$\shortmid$}}\mkern-10mu{\to}}}V.
But we can also change the way that we read our 1-categorical schema. Using the fact that our \mathsf{B} is a direct category, we can start from the locally initial elements (i.e. ones with no incoming morphisms) and assign a degree to each object in \mathsf{B} such that \deg(x)<\deg(y)\implies\operatorname{Hom}_\mathsf{B}(x,y)=\varnothing (and we can do this, to be imprecise, in a “best possible way”, making each object have the minimal possible degree). In our running example of \mathsf{B}=(E\rightrightarrows V) we can take \deg(E)=0 and \deg(V)=1. Then we can build up the data of type dependencies by running over objects of increasing degree: objects of degree 0 correspond to types with no dependencies; objects of degree 1 to types with dependencies on those of degree 0; and so on. To return to our example of graphs, we see that we get two types: the non-dependent type V\colon \mathcal{U}, and the dependent type E:V\times V\to\mathcal{U}. This means that to declare a term of type E we need to give two terms of type V at the moment of declaration.
Rather than rambling on about type theory (since, again, I do not know about type theory), I will simply repeat the main point:
Builder declarations can be understood as describing a system of dependent types, and notebooks are precisely constructions of terms of these types.
3.2 Interpretations
There are different adjectives that we can use to describe interpretations, corresponding to factorisations. For example, the co-continuous interpretations are those that arise from functors \mathsf{B}\to[\mathbb{T}_i,\mathbb{S}\mathsf{et}]. There is a more restrictive adjective that we might also be interested in, for which we will give a “definition” that makes sense whenever it makes sense, and doesn’t when it doesn’t.
Definition 2 Let \mathbb{T} be a double category with no non-trivial cells. Define the span replacement \mathsf{SpanRepl}(\mathbb{T}) of \mathbb{T} to be the 1-category given by replacing every loose arrow in \mathbb{T} by a span (thus freely adding a new object for each loose arrow). If \mathbb{T} has tabulators, then \mathrm{Tab}(f) is precisely the apex of the span corresponding to f.
For example, consider the theory \mathbb{T}_\text{cat} for categories, given by a single object V with the loose identity arrow E\colon V\mathrel{\mkern 3mu\vcenter{\hbox{$\shortmid$}}\mkern-10mu{\to}}V. Then \mathsf{SpanRepl}(\mathbb{T}_\text{cat}) is the schema for graphs E\rightrightarrows V. As another example, consider the tabulator theory \mathbb{T}_\text{tab} for categories with links. Then \mathsf{SpanRepl}(\mathbb{T}_\text{tab}) is the schema for stock-flow diagrams with links (shown in Section 4).
Definition 3 An interpretation (i.e. functor) I\colon[\mathsf{B},\mathsf{Set}]\to[\mathbb{T},\mathbb{S}\mathsf{et}] is ccf (co-continuous free) if it arises from a profunctor P\colon\mathsf{B}^\mathrm{op}\mathrel{\mkern 3mu\vcenter{\hbox{$\shortmid$}}\mkern-10mu{\to}}\mathsf{SpanRepl}(\mathbb{T}). It is furthermore truly ccf if P is a functor, and identity ccf if P is the identity functor.
For example, if \mathsf{B} is the schema for graphs, then the profunctor \mathsf{B}^\mathrm{op}\to\mathsf{SpanRepl}(\mathbb{T}_\text{cat})=\mathsf{B} that sends vertices to vertices and edges to spans, gives a ccf interpretation. As another example, if \mathsf{C} is the schema for stock-flow diagrams with links, then the identity functor \mathsf{C}^\mathrm{op}\to\mathsf{SpanRepl}(\mathbb{T}_\text{tab})=\mathsf{C} gives a ccf interpretation (in fact, it gives something much stronger, since the profunctor is actually a functor, and the functor is actually the identity).
I don’t yet know how exactly to classify this hierarchy of adjectives, coming from the increasingly general sequence of types of migration:
| adjective | form |
|---|---|
| identity ccf | \mathsf{B}^\mathrm{op}\xrightarrow{\operatorname{id}}\mathsf{SpanRepl}(\mathbb{T}) |
| truly ccf | \mathsf{B}^\mathrm{op}\to\mathsf{SpanRepl}(\mathbb{T}) |
| ccf | \mathsf{B}^\mathrm{op}\mathrel{\mkern 3mu\vcenter{\hbox{$\shortmid$}}\mkern-10mu{\to}}\mathsf{SpanRepl}(\mathbb{T}) |
| co-continuous | \mathsf{B}^\mathrm{op}\to[\mathbb{T},\mathbb{S}\mathsf{et}] |
| \mathsf{B}^\mathrm{op}\mathrel{\mkern 3mu\vcenter{\hbox{$\shortmid$}}\mkern-10mu{\to}}[\mathbb{T},\mathbb{S}\mathsf{et}] | |
| [\mathsf{B},\mathsf{Set}]\to[\mathbb{T},\mathbb{S}\mathsf{et}] |
but it seems like (a) this would be good to understand, as we do for e.g. duc-queries vs arbitrary queries, and (b) quite a few key examples seem to be at least ccf.
Rather than a front-end logic prioritising a single double theory in catlog, it could correspond to multiple double theories (and perhaps even other things besides).
4 Examples
We will take the logic for stock-flow diagrams as our running example, spelling out the mathematical story and then giving pseudo-code for implementation.
4.1 Mathematical example
The logic \mathtt{L}=(\mathsf{B},\{I_\text{tab},I_\text{sgn},I_\text{sym}\}) of stock-flow diagrams is given by the following:
\mathsf{B} is the schema for stock-flow diagrams with links
I_\text{tab}\colon[\mathsf{B},\mathsf{Set}]\to[\mathbb{T}_\text{tab},\mathbb{S}\mathsf{et}], where \mathbb{T}_\text{tab} is the tabulator theory of categories with links, and I_\text{tab} is induced by the “obvious” functor \begin{aligned} \mathsf{B}^\mathrm{op} &\longrightarrow [\mathbb{T}_\text{tab},\mathbb{S}\mathsf{et}] \\\mathsf{Stock} &\longmapsto x \\\mathsf{Flow} &\longmapsto x\to y \\\mathsf{Link} &\longmapsto x\overset{\overset{z}{\downarrow}}{\to}y \end{aligned} (note that this is identity ccf).
I_\text{sgn}\colon[\mathsf{B},\mathsf{Set}]\to[\mathbb{T}_\text{sgn},\mathbb{S}\mathsf{et}], where \mathbb{T}_\text{sgn} is the theory of signed categories, and I_\text{sgn} is induced by the functor \begin{aligned} \mathsf{B}^\mathrm{op} &\longrightarrow [\mathbb{T}_\text{sgn},\mathbb{S}\mathsf{et}] \\\mathsf{Stock} &\longmapsto x \\\mathsf{Flow} &\longmapsto x\xleftarrow{-}f\xrightarrow{+}y \\\mathsf{Link} &\longmapsto x\xleftarrow{-}\overset{\overset{z}{\downarrow}}{f}\xrightarrow{+}y \end{aligned} (note that this is ccf but not truly ccf).
I_\text{sym}\colon[\mathsf{B},\mathsf{Set}]\to[\mathbb{T}_\text{sym},\mathbb{S}\mathsf{et}], where \mathbb{T}_\text{sym} is the (modal) theory of symmetric monoidal categories (written as Petri nets), and I_\text{sym} is induced by the functor \begin{aligned} \mathsf{B}^\mathrm{op} &\longrightarrow [\mathbb{T}_\text{sym},\mathbb{S}\mathsf{et}] \\\mathsf{Stock} &\longmapsto X \\\mathsf{Flow} &\longmapsto X\to\boxed{F}\to Y \\\mathsf{Link} &\longmapsto X\to\overset{\overset{Z}{\downarrow}}{\boxed{F}}\to Y \end{aligned} (note that this is at least co-continuous, but I have no idea how \mathsf{SpanRepl}(-) behaves with respect to modalities, so am not too sure beyond this).
4.1.1 Analyses
If a logic admits interpretations into multiple theories, then we need to consider what should happen with analyses. As an obvious example, visualisation will be different for each interpretation. More serious, however, is an example like motif finding, where the results can be in clear disagreement: e.g. composite flows exist when we interpret a stock-flow diagram with I_\text{tab} or I_\text{sym}, but not with I_\text{sgn}. However, I also think that this is a feature, not a bug.
Analyses could come with drop-down menus for switching between interpretations, and it could be possible that having a global (i.e. per analysis notebook) option for picking one single interpretation proves user-friendly.
4.1.2 Migrations
We can specify a migration between logics in different ways, again in some order of increasing generality:
- Specify a functor of builder schemes
- Specify a migration of instances of builder schemes
- Specify a migration of models of interpretation theories.
5 Rationale and alternatives
We could take this idea and either considerably restrict its scope or lean into it even more. The former would look like still associating a single theory to each logic, but including information about non-trivial migrations within the logic definition itself. The latter would look like actually specifying the builder declaration as a schema even more explicitly, building it up as a colimit of \{ \mathsf{Object} \} \qquad\text{and}\qquad \{ \mathsf{Source}\leftarrow\mathsf{Arrow}\to\mathsf{Target} \}. For example, we might define the builder declaration for stock-flow diagrams as
interface Object {
name: string;
}
interface Arrow {
name: string;
source: Object[];
target: Object[];
}
let stock: Object = {
name: "Stock",
}
let flow: Arrow = {
name: "Flow",
source: [stock],
target: [stock],
}
let link: Arrow = {
name: "Link",
source: [stock],
target: [flow],
}Another option would be to actually allow for the interpretation functors to be defined in the core, and then merely called in the front-end.
6 Prior art
It seems possible that compositional notebooks can help with defining interpretation functors that send objects to motifs.