valand.dev

Extending String into a Multidimensional Entity

Imagine you have a SaaS. It has a lot of users. Your product provides “boxes” for users to store data. It is like a web forum, but instead of threads and posts, it has boxes and data. For administrative purposes, you will need to list your users' data regardless of the box it is in. That’s the whole business, boxes and data.

To solve that, you’ve decided to label all data with the box’s name. Data in box “a” is given the label “box”: “a”, and data in box “b” is labeled “box”: “b”. Here “box” is the attribute name and “a” is the attribute value. Why does an attribute need both a name and a value? Long story short, a name-value pair is one of the most common ways to describe something.
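As a rough illustration, the labeling could look like the Python sketch below. The `Datum` record, its `owner` and `content` fields, and the `list_user_data` helper are assumptions made up for this example; only the “box” label comes from the story.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datum:
    owner: str    # hypothetical user ID
    box: str      # the label: attribute name "box", attribute value such as "a"
    content: str  # hypothetical payload


def list_user_data(data: list[Datum], owner: str) -> list[Datum]:
    """Return everything a user stored, regardless of the box it is in."""
    return [d for d in data if d.owner == owner]


data = [
    Datum(owner="alice", box="a", content="first note"),
    Datum(owner="bob", box="a", content="a reply"),
    Datum(owner="alice", box="b", content="second note"),
]

alice_data = list_user_data(data, "alice")
print([d.box for d in alice_data])  # ['a', 'b']
```

Because the box is just a label on each piece of data, the administrative listing never has to open boxes one by one.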

The attribute value “a” is a string. A string, in computer science, is basically a sequence of characters. “Hello, world”, “k”, and “We will rock you” are strings. String is one of the most common data types in modern programming languages. In programming, a data type is something you use to tell the computer how you intend to use the data. Other examples of data types are number, which you can add, subtract, multiply, and divide, and boolean, which simply expresses true or false. Depending on the programming language, there are other data types.

Strings are one-dimensional. They look like numbers on a ruler. If you take all possible unique strings, sort them, and spread them on a line, you get a one-dimensional space.
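The ruler picture can be tried out in a few lines of Python. The names below are made up for illustration; the point is only that any two strings can be compared, so a set of them always lines up in a single order.

```python
# Hypothetical box names, used only for illustration.
names = ["b", "a", "ab", "c"]

# Python compares strings character by character, so any two strings can be
# ordered, and sorting puts every name at exactly one position on a line.
ordered = sorted(names)

# A single number (the index) is enough to locate a name: one dimension.
position = {name: index for index, name in enumerate(ordered)}

print(ordered)         # ['a', 'ab', 'b', 'c']
print(position["ab"])  # 1
```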

img

String’s pseudo-infinite one-dimensionality compared to numbers on an axis of a Cartesian coordinate system.

Other than the box name, you don't have anything to differentiate one box from another. Because of that, a box is considered a one-dimensional entity.

img

The boxes are one-dimensional because they are described by one attribute with a one-dimensional attribute value.

Jury-rigging String

One day one of your users asks, “Could you give us a box where we could put everybody’s things together? Also, the only people who can see inside the box should be those who have already put something inside it.”

That request is gaining momentum in the community; there are hundreds of thousands of upvotes. It is good for traction. You can't ignore that. After long thought, you give in to their wants.

You name this special box the “Main” box. Scattered around your product are pieces of logic that say “if it is the ‘Main’ box, do this; if not, do that”. Those lines of code support the visibility feature. Any programmer with a year of experience can tell you that the same code for the same logic, scattered around your program, is not good. A more experienced programmer will notice that this logic is an important piece of knowledge about your product's business. That’s one piece of tech debt you should address some time.
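To picture the scattering, here is a minimal Python sketch. The functions `can_open` and `visible_items` and the `contributors` set are hypothetical; they stand in for whatever parts of the product need the visibility rule.

```python
def can_open(box_name: str, user: str, contributors: set[str]) -> bool:
    # Special case number one: only contributors may look inside "Main".
    if box_name == "Main":
        return user in contributors
    return True


def visible_items(box_name: str, user: str, contributors: set[str],
                  items: list[str]) -> list[str]:
    # Special case number two: the same rule, written again somewhere else.
    if box_name == "Main" and user not in contributors:
        return []
    return items


contributors = {"bob"}
print(can_open("Main", "alice", contributors))                # False
print(visible_items("Main", "alice", contributors, ["x"]))    # []
print(visible_items("a", "alice", contributors, ["x"]))       # ['x']
```

Both functions compare against the literal name “Main”. If the rule changes, every copy has to be found and changed with it.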

img

See, there is something different in this picture. Main is on a different level from A, B, C, D, and E. As you may have noticed, that is not one-dimensional anymore.

“That is not quite one-dimensional, right?”, you ask yourself.

The new "if the box is Main" rule introduced a new dimension. The boxes are now two-dimensional. One dimension, pictured as the horizontal axis, is the name of the box. The other, the vertical axis, denotes whether a box is “Main” or not “Main”.

With the new rule, something is describing the box that is not part of the box. “Main” box by itself is nothing special, but when you put it into a certain environment, you can see that it is treated differently. That is called an implicit behavior. An implicit behavior is bad because most of the time it is unpredictable.

Encountering implicit behavior is like rolling a die that sometimes gives you a prize and sometimes throws you into jail.

“Another dimension would be a pain in the donkey”

Another user asked if you could make a box that is exclusive to one user. Other users must not be able to see what is inside the box without getting permission from the box’s owner.

img

The boxes with 3 dimensions.

You can’t get more dimensions from a string. The last dimension was not even from the string.

“What if we make the box something other than a string?”, you asked.

And then one of your programmers said, “We’ve used strings everywhere to describe boxes, we can’t just change that! It’ll take a lot of time to…”

“I know, I know”, you interjected. You considered the amount of time your programmers would need to fix the whole thing if the box were changed into something other than a string. The hypothetical situation you were about to get into, had your programmer not stopped you, is called a breaking change. It’s one of those moments programmers are not really fond of.

“Let me think for a bit. What if a box is not a string, but can still be described with a string?”

You tinker with the concept of the box. You try treating the box as something that can be anything and can have more than one attribute. From now on, the box is not a string. That is your first step in detaching the box from the string and its one-dimensionality.

The boxes you first came up with, you will call Ordinary boxes. You need to keep them because you now have a lot of them. You don’t want to have to change all of them for your program to continue working.

As for the box with the special treatment, the one named “Main”, let’s call it the Main box. The lack of quotation marks (“ ”) is to differentiate the “Main” box name from the Main box type. It is similar to the concept of string being a data type. Ordinary box and Main box are data types.

For the sake of fulfilling the users’ wants, you add the User box. The User box is the exclusive one people are asking for.

With all three types of boxes, you now realize what a box actually is. A box is either an Ordinary box, a Main box, or a User box. If that is confusing, a similar concept would be the boolean. A boolean is either a true or a false.
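In Python, the “either this, or this, or that” idea might be sketched as below. The class names, the fields `name` and `user_id`, and the `describe` function are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class OrdinaryBox:
    name: str  # the original one-dimensional name, such as "a"


@dataclass(frozen=True)
class MainBox:
    pass  # there is only one Main box, so it carries no data


@dataclass(frozen=True)
class UserBox:
    user_id: str  # the owner of the box


# A box is exactly one of the three kinds, like a boolean is true or false.
Box = Union[OrdinaryBox, MainBox, UserBox]


def describe(box: Box) -> str:
    if isinstance(box, MainBox):
        return "the Main box"
    if isinstance(box, UserBox):
        return f"the User box of {box.user_id}"
    return f"the Ordinary box {box.name}"


print(describe(MainBox()))          # the Main box
print(describe(UserBox("valand")))  # the User box of valand
print(describe(OrdinaryBox("a")))   # the Ordinary box a
```

Each kind carries only the attributes it needs: a name for the Ordinary box, an owner for the User box, and nothing at all for the Main box.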

img

What happened here? You freed yourself from dimensionality. It is magic! Ascension to a greater dimension! The box is now one of three different kinds of boxes. In the future, there can be more than three; it is up to you. You now have one dimension, the kind of box, which gives you the choice to create another concept with any dimensionality you want. The Main box is a zero-dimensional box: it’s either there or not there. The User box and the Ordinary box are one-dimensional boxes. So how many dimensions does the concept of box have right now? That’s right, 1 + whatever number you want it to be!

That one additional dimension is one way to create polymorphism. You might have heard of it if you’ve read an OOP book.
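A minimal Python sketch of that polymorphism, applied to the visibility rules from the story, follows. The `Access` record and the `can_view` method are hypothetical names, and the rule that Ordinary boxes are open to everyone is an assumption, since the story never states it outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Hypothetical facts looked up elsewhere, for example from a database."""
    contributors: frozenset[str] = frozenset()  # users who put something in Main
    granted: frozenset[str] = frozenset()       # users permitted by a box owner


@dataclass(frozen=True)
class OrdinaryBox:
    name: str

    def can_view(self, user: str, access: Access) -> bool:
        return True  # assumption: Ordinary boxes stay open to everyone


@dataclass(frozen=True)
class MainBox:
    def can_view(self, user: str, access: Access) -> bool:
        return user in access.contributors


@dataclass(frozen=True)
class UserBox:
    user_id: str

    def can_view(self, user: str, access: Access) -> bool:
        return user == self.user_id or user in access.granted


access = Access(contributors=frozenset({"bob"}), granted=frozenset({"carol"}))
boxes = [OrdinaryBox("a"), MainBox(), UserBox("valand")]

# The caller never asks "is this the Main box?"; each kind answers for itself.
print([box.can_view("carol", access) for box in boxes])  # [True, False, True]
```

The rule for each kind of box now lives in one place, next to the kind it belongs to, instead of in string comparisons spread over the program.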

Next, you will be dealing with the string form of a box. The boxes’ names are everywhere and they are strings. You decide to make a library to translate those raw string names into knowledge of a box and vice versa. That library will be everyone’s go-to code. Every future translation implementation must go into that library.

To avoid breaking changes, you choose to adhere to the currently existing specification. The most specific rule in the existing specification is the Main box. The Main box must be represented with the name “Main”. You add a comment on why the Main box is represented with the name “Main”, explaining that it is legacy code.

You need a string representation for the User box that differs from the Ordinary box's. You see that your users have never used the colon character (:) in any box name. You decide to use that character to reserve a portion of all the possible Ordinary box names. Every box whose raw name starts with “:user:” is a User box. Example: a box with the raw name “:user:valand” is a User box belonging to the user with ID “valand”.

Next, for the Ordinary box: any box whose raw name does not start with a colon (:) and is not “Main” is an Ordinary box.

Last, any raw string that does not conform to the rules above will result in an error when translated to a box.
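Put together, the translation rules above could be sketched in Python as follows. The function names `parse_box` and `format_box`, the class names, and the choice of `ValueError` are assumptions for illustration. Rejecting “:user:” with nothing after it is also an assumption, since the rules do not cover that case.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class OrdinaryBox:
    name: str


@dataclass(frozen=True)
class MainBox:
    pass


@dataclass(frozen=True)
class UserBox:
    user_id: str


Box = Union[OrdinaryBox, MainBox, UserBox]

MAIN_NAME = "Main"      # legacy: the Main box has always been stored as "Main"
USER_PREFIX = ":user:"  # the colon reserves names no Ordinary box has used


def parse_box(raw: str) -> Box:
    """Translate a raw string name into a box, or raise ValueError."""
    if raw == MAIN_NAME:
        return MainBox()
    if raw.startswith(USER_PREFIX):
        user_id = raw[len(USER_PREFIX):]
        if not user_id:
            # Assumption: a User box without an owner is not a valid box.
            raise ValueError(f"missing user ID in box name: {raw!r}")
        return UserBox(user_id)
    if not raw.startswith(":"):
        return OrdinaryBox(raw)
    raise ValueError(f"not a valid box name: {raw!r}")


def format_box(box: Box) -> str:
    """Translate a box back into its raw string name."""
    if isinstance(box, MainBox):
        return MAIN_NAME
    if isinstance(box, UserBox):
        return USER_PREFIX + box.user_id
    return box.name


print(parse_box(":user:valand"))            # UserBox(user_id='valand')
print(format_box(parse_box("Main")))        # Main
```

Existing raw names keep working unchanged, which is what avoids the breaking change: “a” still means the same box as before, and so does “Main”.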

That concludes the box problem and its solution. Defining a new type of box together with its string representation is a very flexible way to make a new box subtype. The colon (:) rule feels very effective for introducing a new string representation, as you can put anything in between the colons (e.g. :group:, :admin:, etc.).
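One way to picture the room the colon rule leaves is a small helper that splits any reserved name into the part between the colons and the rest. The helper `split_prefixed` and the sample name “:group:staff” are hypothetical; this is a sketch of the idea, not part of the rules above.

```python
def split_prefixed(raw: str) -> tuple[str, str]:
    """Split ':kind:rest' into ('kind', 'rest'), or raise ValueError."""
    if not raw.startswith(":"):
        raise ValueError(f"not a reserved name: {raw!r}")
    # Drop the leading colon, then cut at the next colon.
    kind, separator, rest = raw[1:].partition(":")
    if not kind or not separator:
        raise ValueError(f"malformed reserved name: {raw!r}")
    return kind, rest


print(split_prefixed(":user:valand"))  # ('user', 'valand')
print(split_prefixed(":group:staff"))  # ('group', 'staff')
```

A future box subtype would then only need a new word between the colons and a rule for what the rest of the name means.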

Takeaways

The box is a thought experiment about a technique to overcome a problem with subtypes. I have been dealing with a similar problem for years, but it hasn't been solved due to its enormous scale. The solution above is one of the solutions that come to mind.

I wrote up this particular solution because what happened in the process of finding it was mind-opening. After a couple of years working professionally as a software developer, and a couple of years as an information technology college student before that, I finally understand what abstraction is, what can be abstracted, and what must not be. It is the kind of magic the digital wizards of old used to conjure integers, strings, and floats from a mere array of bits. The dimensionality ascension feels like waking up from the Matrix: pure creation, surreal. For a programmer, who spends more time than most interacting with an entity of pure logic and determinism, talking about pure creation and feeling creative feels good.