Today's work is particularly fun that it inspired me to write this appreciation piece for the past 2023-2024. 2023 is when I joined Actyx where I learned stuff relatively more exotic than my previous job.
The Other Meaning Of Systems
Let's disambiguate system software. There is system application engineering and there IS "system system" engineering. The first system of "system system" means things with many components. The second means software that is not applications--not used by end users. The two are quite different in disciplines, but having the same name means an engineer for one will eventually learn the other.
Let's talk about the first meaning of system--the many components. These components eventually get further and further, and they grew agencies and agendas--think replicas. With differing agency and agenda, the need for coordination arises, which people now know as distributed computing which coincides with my formal title which is Distributed System Engineering.
In Actyx is Actyx (or now called ax because of the similar naming shenanigans), a replicated event database that can coordinate many nodes without a central broker.
In contrast to my previous work which is a system application, Actyx is a "distributed system system".
The goal of system software is the same as any other software, which is to compute correctly. Distributed system adds CAP theorem into the mix of the problem, while a non-application system switches the problem from user-experience design to API design while reordering the importance of properties such as flexibility and performance to be higher.
Pure P2P
Speaking of coordinating without central broker, I realize that pure P2P is really a beast. Imagine discovery, identity, security, and all are done without central authority. Every responsibility of the central now belongs to each of the "peer" in the system. In a more abstract sentence: there is no single source of truth; everyone built their own truth for themselves.
Interesting feats can be achieved by imposing limit to oneselves. Actyx achieves events order consensus by using lamport clock. In an inverted perception, this is a form of limitation from all possible combinations of operation sequence: event creation, publishing, and reception must increment the clock and include the clock information. These are also known as protocols and the system relies on everyone running the right protocol.
P2P systems work by having pre-agreed protocols between the peers. This is an interesting and poetic property because it is similar to how DNA in nature works. The symbiosis of a flower and a bee is driven by each of their natural protocol; this protocol is pre-defined long before the flower's and the bee's conception.
Toying with Types
This week's work is inspiring because finally, I have a chance to design a type system.
When you program long enough, you'll learn to coexist with type which basically tell the computer what the shape of the thing you need to the computer to compute and what can the computer do with it. You can see on the internet that software devs have a love-hate relationship with type systems. JavaScript devs who adopt TypeScript is the perfect example of the split because some see it as additional obligation and some see it as a savior from the last-minute customer bug reports.
Some type systems are really fun because not only do they help me from making innocent mistakes, but can be very expressive. In my favorite languages, Rust and TypeScript, there's a way of saying to the computer that: This object I'm about to give you can be either this type or that type. You'll know later. It makes it very easy to handle dynamic data without giving up and then resorting to dynamically typed languages that don't scale in complexity. In Rust, it also solves the billion-dollar problem, null, by redefining nullable variable as a union called Option.
So today, instead of using the types, I'm building the types. Surprisingly it involves a lot of philosophical questions, such as: What is both a number and a 3? The answer is a 3. Next: what is both a number and a boolean? There shall never be a number that is simultaneously a boolean, so the answer is never; never is a useful type apparently because it is valid in the code; the caveat is that it is imaginary; it is impossible to construct an instance of never.
That's just the surface. Now let's answer this other class of philosophical question: is never | number equals to number? Obviously! But to computers it is not obvious because of an ontological ambiguity, they both have different bytes. This is why we have to crunch numbers!
So a type, as everything in a computer is just numbers with different properties, but instead of +,-,/,x we instead | and &, which I collectively call the collapsing when calculated.
One interesting revelation about types is that it is discrete, continuous, and a set at the same time.
Another is that intersection (|) of unions (&) are distributive just like + and * where (a|b) & (x|y) = a&x | a&y | b&x | b&y; no need to delve deep into the details; that thing just reminds my old self stressing out on a test paper during my elementary school days.
The Ledger
Every now and then I see a debate on X (formerly Twitter) about some arithmetics where one person shouts: "No, that's dumb! Remember PEMDAS". This thing happens too in distributed systems because peers receive the same signals at different times.
Imagine a system where there is a peer that publishes a sequence of arithmetics operations;
in another part of the world other peers must receive the operations and compute them.
The data passes through the internet which might reroute the packets and this shuffles the order of operations.
One peer can get 0 +1 *4 -3 +2 = 3 and the other gets 0 +2 -3 *4 +1 = -3.
One peer gets 3 and the other gets -3, which will cause disagreement over two different part of the world and potentially cause a fight on a phone.
If we treat these numbers as events and put Actyx in the middle of the peers like so peer1 -> Actyx1 -> Actyx2 -> peer2 then we have gotten rid of this inconsistency problem.
Solved!
So we thought (actually, my boss thought): let's build a state machine with these events!
This becomes machine-runner.
In practice, these state machine becomes a ledger that everyone agrees upon without peers asking "did you get that right?" back and forth. It is cheap and easy to use. Of course, with distributed systems there is no escaping the multiverses but that's for another day.
Building this library is intricate because of the user experience. The user will design their own states events events; then when the state machine runs, they must be able to see the correct type for the runtime states. These are done so that the user can't make mistakes because they have a lot more to deal with since we're talking distributed computing. So there are complex type operations and transformations involved between the function calls. This was long before the type calculation work and what initially piqued my interest in complex types and their transformation in the first place.
Open Source
We made Actyx (or ax) open source, which was quite a work.
Take a look, just in case you need a database solution and your use case is what it does best.
At the moment, it works better with TypeScript app because we have a mature SDK for it.
Nevertheless, it can work standalone and you can interact with it via a GUI or HTTP/WebSocket connection just like the other less-exotic databases out there.
Musicmaking
I rediscovered again my interest in music-making. But I'm often fall in the rabbit hole of creating instruments by chaining synths and effects. The other problem that music comes to me at the most inconvenient time, shower time, chore time, laundry time, and when I got in front of the laptop it is all gone. Probably it is a good idea to bring a recorder everywhere now to record my air guitaring and drumming.
-fin-