2020-09-12

From Rust to TypeScript

Edits:

HN discussion here: https://news.ycombinator.com/item?id=24453007#24454277. Thank you HN folks for the corrections, kind responses, and insightful discussion.
*fixes in Getting Rid of Exceptions snippet 1.

I was introduced to Rust in 2018 and have been enamored ever since. Rust is a systems programming language, much like C++. Unlike C++ though, being relatively new, its language design is more modern and sophisticated. Writing with it can feel more like writing TypeScript or Haskell. That is not surprising: despite being a language with a minimal runtime and no GC, it borrows many principles from functional programming, such as immutability, type inference, higher-order functions, and pattern matching. Over the course of tinkering with Rust, I realized that writing Rust code makes me a better coder in other languages. This article describes my attempt to push the best of Rust's language design into TypeScript, my main language, without altering the language itself.

Getting Rid of Exceptions

The first thing that struck me when I learned Rust is that there is no straightforward way to write exception-like code in it. To someone used to exceptions from languages like C++, Java, JavaScript, and TypeScript, this makes the language seem incomplete. But Rust's lack of a straightforward exception mechanism is a deliberate design choice.

Exceptions feel neat the first time you understand them. A thrown exception makes execution skip ahead to the catch block. That lets you ignore the rest of the code in the function, which represents the positive case and may seem irrelevant once an error happens. Here's a piece of code:

function foo(someNumber: number) {
  if (someNumber === 0) {
    throw new Error("Error: foo");
  }
  return someNumber + 1;
}
function bar(someNumber: number) {
  return foo(someNumber) + 1;
}
function baz(someNumber: number) {
  return bar(someNumber) + 1;
}
baz(0);

When baz is called it will throw an uncaught exception. By reading the code you know that foo throws an Error. Debugging this snippet is a piece of cake, right?

Now let's see another snippet. In the snippet below you are importing a function that can throw an exception.

import { callMe } from "somePackage";

try {
  // This line may throw an error
  callMe();
} catch (exception) {
  // This line may unexpectedly throw again
  // if the thrown value turns out to be nullish:
  // Uncaught TypeError: Cannot read property 'message' of undefined
  console.error(exception.message);
}

Compared to the first snippet, this one is harder to debug, because 1.) you don't know whether callMe will throw an error or not, and 2.) if it does throw, you don't know the type of the exception. The more modular your code is, the harder it is to debug exceptions.

Why is that? In the first snippet, you know which function throws what because the throw and the caller are in one file. You don't need to switch files or even scroll far to see what's wrong. Furthermore, unlike in some other languages such as Java, TypeScript exceptions are not typed. Because of that, unlike with return values, the compiler cannot deduce whether a function will throw an exception or not, and that's bad. On top of that, an exception propagates up until it is caught, so its origin can be n levels deep. The deeper the origin, the harder it is to debug.

If you're familiar with statically-typed languages, you know that deferring to run-time what is supposed to be done at compile-time is bad. It hands problems over from the compiler to the user.

There are best practices for writing try/throw/catch, such as throwing Error objects instead of nullish values. But pushing everyone to apply best practices is not enough. You cannot assert control over what is written in external dependencies. There is not much point in introducing laws when enforcers are nowhere to be found.

Only when you have dealt with a complex codebase will you understand why we should not use exceptions for flow control.
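
Since the value caught in a catch block can be anything, one defensive option is to normalize it before touching its properties. The sketch below is only an illustration of that idea; toError and describeFailure are hypothetical names, not part of any library.

```typescript
// Hypothetical helper: turn whatever was thrown into a real Error object.
function toError(thrown: unknown): Error {
  if (thrown instanceof Error) return thrown;
  // Covers `throw undefined`, `throw "some string"`, `throw 42`, and so on.
  return new Error("Non-Error value thrown: " + String(thrown));
}

// Hypothetical wrapper showing how the helper would be used in a catch block.
function describeFailure(riskyCall: () => void): string {
  try {
    riskyCall();
    return "no error";
  } catch (exception) {
    // Safe even when `exception` is undefined or null.
    return toError(exception).message;
  }
}
```

This only protects the catch site. It still does not tell the caller which functions may throw, or what they throw.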

So how does Rust do this? Rust encourages us to return errors instead of throwing them by providing a data type called Result. It is a tagged union which can be either Ok(IdealData) or Err(NonIdealData). With this, a function can communicate the outcome of a call to its caller. For example:

fn main () {
    let number: Result<u8, _> = "1".parse();
    let non_number: Result<u8, _> = "a".parse();
    println!("{:?}", &number);      // Ok(1)
    println!("{:?}", &non_number);  // Err(ParseIntError { kind: InvalidDigit })
}

The function .parse() converts a string into a u8. There are scenarios where parsing a string into a u8 can fail; one is when the parsed string is not numeric. Because .parse() has a negative case, it returns Result<u8, _>.

In Rust, Err does not jump like exceptions do. Regardless of whether the result is Ok or Err, the program simply executes the next line. To handle the error, the programmer has to check whether the return value is Ok or Err. Early return is encouraged instead of throw. Rust has the ? operator, which makes early returns more concise to write and read.

To adopt this feature in TypeScript, these are the important things:

Errors are returned instead of thrown
Error types are known

To simulate this behavior I am going to use a tagged union too. I found this helpful library, fp-ts. It covers a great deal of the functional programming experience by leveraging TypeScript's type system, but I will only use a small portion of it.

fp-ts has the Either<Left, Right> data type. It is like Rust's Result but more flexible. Either is a tagged union value container: a value is either a Left or a Right. Usually Right represents the ideal case and Left represents the non-ideal case, but both can hold any type. Below is a simplified type definition of Either.

type Either<L, R> = Left<L> | Right<R>;
type Left<L> = { _tag: "Left"; left: L };
type Right<R> = { _tag: "Right"; right: R };

What's cool about Either is that TypeScript supports narrowing a tagged union by its tag. For example, inside the block if (someEither._tag === "Left") { } TypeScript knows that someEither is a Left. If there's a return in that block, TypeScript deduces that for the rest of the function, someEither is a Right and no longer an Either.
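
The narrowing described above can be sketched as follows. The Either types are redeclared locally so the snippet stands on its own (they are stand-ins, not imported from fp-ts), and summarize is a hypothetical function used only for illustration.

```typescript
// Local stand-ins mirroring the Either definition; not imported from fp-ts.
type Left<L> = { _tag: "Left"; left: L };
type Right<R> = { _tag: "Right"; right: R };
type Either<L, R> = Left<L> | Right<R>;

// Hypothetical function used only to demonstrate narrowing.
function summarize(someEither: Either<Error, string>): string {
  if (someEither._tag === "Left") {
    // Here someEither is narrowed to Left<Error>, so .left is available.
    return "failed: " + someEither.left.message;
  }
  // After the early return, someEither is narrowed to Right<string>.
  return "length: " + someEither.right.length;
}
```

Accessing someEither.right inside the first block, or someEither.left after it, would be rejected at compile time.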

Let's make a function called tryCatch. It wraps the result of a function call in an Either. fp-ts actually provides this function, but let's write it so that we know what's under the hood.

import { Either, left, right } from "fp-ts/Either";

const tryCatch = <T, E>(
  fn: () => T,
  onError: (error: unknown) => E
): Either<E, T> => {
  try {
    // right() wraps value as Right
    return right(fn());
  } catch (error) {
    // left() wraps value as Left
    return left(onError(error));
  }
};

With our newly created tryCatch, creating a URL from a string, for example, can be done like this.

export class InvalidURLError extends Error {}
const makeURLFromString = (str: string) =>
  tryCatch(
    () => new URL(str),
    () => new InvalidURLError()
  );

Calling the function will look like this. The following snippet receives a string called maybeUrl. If maybeUrl is not a valid URL, it returns an error. If the URL does not point to "example.com", it returns an error. Otherwise, it returns a promise from the fetch call.

export class NotExampleDotComError extends Error {}
const callThisMaybeUrlIfItIncludesExampleDotCom = (
  maybeUrl: string
): Either<InvalidURLError | NotExampleDotComError, Promise<Response>> => {
  const urlResult = makeURLFromString(maybeUrl);
  // Early return the error (the Left itself, so the return type still matches)
  if (isLeft(urlResult)) return urlResult;

  const url = urlResult.right;
  // hostname excludes the scheme; origin would be e.g. "https://example.com"
  if (url.hostname !== "example.com") return left(new NotExampleDotComError());

  return right(fetch(url.toString()));
};

There are a few more lines of code here, but it is great that the TypeScript compiler can now deduce the error type, and you have more control over the flow. You can even check whether the error is an instanceof NotExampleDotComError or InvalidURLError, for cases like printing different messages.
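
A minimal sketch of printing different messages per error type. The Either type is a local stand-in, explain is a hypothetical helper, and the reason field is an assumption added here: empty subclasses are structurally identical to TypeScript, so a distinguishing field helps instanceof narrowing behave as expected.

```typescript
// Local stand-in; in the article this type comes from fp-ts.
type Either<L, R> = { _tag: "Left"; left: L } | { _tag: "Right"; right: R };

// Assumption: a distinguishing field keeps the two classes structurally different.
class InvalidURLError extends Error {
  readonly reason = "invalid-url";
}
class NotExampleDotComError extends Error {
  readonly reason = "not-example-dot-com";
}

// Hypothetical helper that picks a message per error type.
function explain(
  result: Either<InvalidURLError | NotExampleDotComError, string>
): string {
  if (result._tag === "Right") return "ok: " + result.right;
  const error = result.left;
  if (error instanceof InvalidURLError) return "That is not a valid URL.";
  // Only NotExampleDotComError remains at this point.
  return "Only example.com is allowed (" + error.reason + ").";
}
```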

Safe TypeScript

If you have played early 2000s FPS games and tried to cheat, you have probably found the noclip command. It makes your avatar fly through walls and floors. Sometimes, though, you take a shortcut and trigger the wrong script. When that happens, you can't finish the level and must restart to make it work again.

That's how I see Rust's unsafe and TypeScript's any.

These two features solve a similar problem. In Rust's case, it enables developers to temporarily unlock the power of handling raw pointers when needed. In TypeScript's case, someone might need to escape into the untyped JavaScript world. In both, it is the programmer's responsibility to make sure everything is fine before the program goes back to the safe system.

Going from unsafe to safe Rust is like going from the any type to typed TypeScript. Unlike Rust's, TypeScript's types only exist until you compile to JavaScript. Rust has the TryInto/TryFrom traits to convert raw data into a data type. But in TypeScript's case, if at run-time you inject a value that does not match the type annotation, TypeScript can't do anything about it. This comes from the TypeScript team's decision to make it only a thin layer on top of JavaScript. They thought it best for people to write TypeScript like JavaScript and for the compilation result to look almost the same as the source code.

My approach is to secure all the external holes of my application, which are:

Web APIs, which consist of the DOM, deserializable user inputs, the fetch API, etc.
External dependencies which return any

When I was trying this approach, I was lucky to find io-ts, created by the same author as fp-ts. This library helps me create a runtime type checker, or "codec" as the library's author calls it. Instead of writing:

type User = { username: string; userId: string };

I wrote the codec for User:

import * as ioTs from "io-ts";

const UserCodec = ioTs.type({ username: ioTs.string, userId: ioTs.string });
type User = ioTs.TypeOf<typeof UserCodec>;

Again, this is longer, but this is very useful if we want to secure those holes.
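
To make the idea of a runtime type check concrete, here is a hand-written guard for the same User shape. This is only a sketch of the concept, not how io-ts is implemented; isUser, fromOutside, and safeUsername are hypothetical names.

```typescript
type User = { username: string; userId: string };

// Hypothetical hand-written guard, roughly the kind of check a codec automates.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["username"] === "string" &&
    typeof candidate["userId"] === "string"
  );
}

// Data arriving from outside the type system, e.g. a parsed JSON body.
const fromOutside: unknown = JSON.parse('{"username":"alice","userId":"42"}');
// Inside the true branch, fromOutside is narrowed to User.
const safeUsername = isUser(fromOutside) ? fromOutside.username : null;
```

Writing such guards by hand for every type quickly becomes tedious and easy to get out of sync with the type, which is the gap a codec library fills.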

export class InvalidTypeError extends Error {}
export class Fetch4xxError extends Error {}
export class Fetch5xxError extends Error {}
export class FetchUnknownError extends Error {}
export class FetchNetworkError extends Error {}

const fetchCurrentUser = (): Promise<
  Either<
    | InvalidTypeError
    | Fetch4xxError
    | Fetch5xxError
    | FetchUnknownError
    | FetchNetworkError,
    User
  >
> =>
  fetch("https://some/url/to/fetch/current/user")
    .then(async (res) => {
      if (res.status >= 200 && res.status <= 299) {
        const maybeJSON = await tryCatchAsync(
          () => res.json(),
          () => null
        );
        if (isLeft(maybeJSON)) return left(new InvalidTypeError()); // Not JSON, Left<InvalidTypeError>
        const maybeUser = maybeJSON.right;
        if (!UserCodec.is(maybeUser)) return left(new InvalidTypeError()); // Not User, Left<InvalidTypeError>
        return right(maybeUser); // Right<User>
      }
      if (res.status >= 400 && res.status <= 499)
        return left(new Fetch4xxError()); // Left<Fetch4xxError>
      if (res.status >= 500 && res.status <= 599)
        return left(new Fetch5xxError()); // Left<Fetch5xxError>
      return left(new FetchUnknownError()); // Left<FetchUnknownError>
    })
    .catch(() => left(new FetchNetworkError())); // Left<FetchNetworkError>

The code above already includes checks for 4xx, 5xx, non-JSON bodies, network errors, any other status we are not specifying, non-User payloads, and the positive case. In fewer than 30 lines of code, this function covers all of these cases exhaustively. You don't need to worry about undefined behavior.

Furthermore, you can use this abundant information to show the user many things. For example, in React we can write it like this:
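
The snippet above calls tryCatchAsync without defining it. A minimal sketch, assuming it mirrors the synchronous tryCatch but awaits a promise; Either, left, and right are redeclared locally as stand-ins for the fp-ts versions.

```typescript
// Local stand-ins for fp-ts's Either, left and right.
type Either<L, R> = { _tag: "Left"; left: L } | { _tag: "Right"; right: R };
function left<L>(value: L): Either<L, never> {
  return { _tag: "Left", left: value };
}
function right<R>(value: R): Either<never, R> {
  return { _tag: "Right", right: value };
}

// Assumed shape of tryCatchAsync: await the promise and map a rejection
// (or a synchronous throw from fn) to a Left.
async function tryCatchAsync<T, E>(
  fn: () => Promise<T>,
  onError: (error: unknown) => E
): Promise<Either<E, T>> {
  try {
    return right(await fn());
  } catch (error) {
    return left(onError(error));
  }
}
```

The returned promise itself never rejects; failure is always reported through the Left branch.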

render() {
  const { userResult } = this.state;
  return (
    <>
      {isRight(userResult) && <UserDisplay user={userResult.right} />}
      {isLeft(userResult) && userResult.left instanceof Fetch5xxError && (
        <span>There's a mistake on our side. Contact our administrator.</span>
      )}
      {isLeft(userResult) && userResult.left instanceof InvalidTypeError && (
        <span>Cannot find user. Did you click the right link?</span>
      )}
      {/* and so on */}
    </>
  );
}

There's another benefit gained from adopting both Result and type safety. If you notice, the fetchCurrentUser function looks almost declarative and/or functional. Writing this way adds semantic value: a concise but readable function that returns more information.

Types as Business Objects

Types are considered hard to manage by those coming from a dynamically-typed language like JS. Adopting TypeScript has its hurdles, and most of the time types are the cause. Compilers can be pretty hard to deal with, spewing errors on compilation. But if done correctly, types can be a great ally in safeguarding business objects and logic, letting you leverage the compiler to do logic checks instead of being burdened by it.

This method, "Types as business objects", is inspired by Rust's enums with struct-like variants and pattern matching. Enums and pattern matching provide an easy way to write a polymorphic object and methods for it. Both come from FP influence.

Below is a snippet of a polymorphic enum, Shape, which can be a Rectangle, Triangle, or Circle.

enum Shape {
    Rectangle { width: f64, height: f64 },
    Triangle { base: f64, height: f64 },
    Circle { radius: f64 },
}

A function to find the area of a shape can be written as

impl Shape {
    fn get_area(&self) -> f64 {
        match self {
            Shape::Rectangle { width, height } => width * height,
            Shape::Triangle { base, height } => base * height / 2.0,
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        }
    }
}

As you can expect from FP features, enums and pattern matching can make calculations look declarative.

As explained above, in TypeScript we can use a tagged union to simulate Rust's enum. It turns out that tagged unions are really good at describing an application's state. I will use an example that uses React.

An example: an interactive login page. It has 3 states: IDLE, LOGGING_IN, and LOGIN_SUCCESSFUL. A common practice would be to use a TypeScript enum to describe the states. I utilize the type system a little bit more.
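
As a rough TypeScript counterpart of the Rust Shape example, a tagged union plus a switch gives pattern-matching-like code. The kind tag name and the getArea function are choices made for this sketch.

```typescript
type Shape =
  | { kind: "Rectangle"; width: number; height: number }
  | { kind: "Triangle"; base: number; height: number }
  | { kind: "Circle"; radius: number };

function getArea(shape: Shape): number {
  switch (shape.kind) {
    case "Rectangle":
      return shape.width * shape.height;
    case "Triangle":
      return (shape.base * shape.height) / 2;
    case "Circle":
      return Math.PI * shape.radius * shape.radius;
    default: {
      // If a new variant is added to Shape and not handled above,
      // this assignment stops compiling.
      const unhandled: never = shape;
      return unhandled;
    }
  }
}
```

The never assignment in the default branch is what approximates the exhaustiveness check that Rust's match performs.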

// Instead of this
enum StateEnum {
  IDLE,
  LOGGING_IN,
  LOGIN_SUCCESSFUL,
}
type PageState = {
  state: StateEnum;
  lastError: Error;
  username: string;
  password: string;
};

// I write it like this
type PageInput = { username: string; password: string };
type PageState =
  | ({ kind: "IDLE"; error?: Error } & PageInput)
  | ({ kind: "LOGGING_IN" } & PageInput)
  | { kind: "LOGIN_SUCCESSFUL" };

With states typed correctly, you can write a pattern-matching-like render() function.

class LoginPage extends Component {
  // rest of the code
  render() {
    const state = this.state.pageState;
    return (
      <>
        {state.kind === "IDLE" && state.error && <ErrorComp error={state.error} />}
        {state.kind === "IDLE" && <SubmitButton />}
        {state.kind === "LOGGING_IN" && <SubmitButton disabled={true} />}
        {state.kind === "LOGIN_SUCCESSFUL" && <SuccessNotice />}
      </>
    );
  }
}

Having states with the correct types also lets your compiler remind you not to stray from your business logic. For example, you are not allowed to accidentally write...

{ state.kind === "LOGGING_IN" && <ErrorComp error={state.error} /> }

...because the compiler knows that when the state is LOGGING_IN, the login page cannot have an error.

Also, it is easier to write a critical section of the page with partial pattern-matching-like code.

class LoginPage extends Component {
  // the rest of the code

  attemptLogin = async () => {
    const pageState = this.state.pageState;
    // Only an IDLE page may start a login attempt
    if (pageState.kind !== "IDLE") return;
    // Calls the module-level attemptLogin function, not this method
    const res = await attemptLogin(pageState.username, pageState.password);
    if (isRight(res)) return this.setState({ pageState: { kind: "LOGIN_SUCCESSFUL" } });
    return this.setState({ pageState: { ...pageState, error: res.left } });
  };

  // the rest of the code
}

This pattern effectively turns compilers, editors, IDEs, and anything else that scans ASTs into guards for your business logic.
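
The same guard can also live outside React as a pure state transition. This is a sketch under assumptions: startLogin is a hypothetical function, and error is declared optional here so that an IDLE page can exist without an error.

```typescript
type PageInput = { username: string; password: string };
type PageState =
  | ({ kind: "IDLE"; error?: Error } & PageInput)
  | ({ kind: "LOGGING_IN" } & PageInput)
  | { kind: "LOGIN_SUCCESSFUL" };

// Hypothetical pure transition: a login attempt may only start from IDLE.
function startLogin(state: PageState): PageState {
  if (state.kind !== "IDLE") return state;
  // state is narrowed to the IDLE variant, so username/password exist here.
  return {
    kind: "LOGGING_IN",
    username: state.username,
    password: state.password,
  };
}
```

Returning an object that mixes variants, such as a LOGGING_IN state without a username, would fail to type-check.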

Block as a Story

In Rust, blocks have a special relationship with memory allocation. In other languages, a block only serves as a place for sequential code to exist. In Rust, variables that are allocated in a block are dropped when the block ends.

{
    let some_variable = get_some_value();
    // ... do stuff

    // here some_variable will be dropped/deleted
}

Moving data out of the current block cannot be done casually. You have to tell Rust to either 1.) let the variable be a reference in the first place, or 2.) move the value out and render the variable unusable in the lines below. For people coming from TypeScript/JavaScript, in which every non-primitive value is actually a reference, this behavior is very restrictive. But it is actually a good pattern to use in TypeScript and JavaScript, because you have the whole story in one block and you know that once the block ends, the variables inside can be GC'ed, as long as no reference to them has escaped.

Before I was conditioned into that pattern, this is how I wrote a callback in a React app:
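
A small TypeScript sketch of keeping temporaries inside a block. The longestLine function is a made-up example; the point is only that sorted cannot be referenced after the block closes.

```typescript
function longestLine(lines: string[]): string {
  let longest = "";
  {
    // Temporaries for this step live only inside the block.
    const sorted = lines.slice().sort((a, b) => b.length - a.length);
    const first = sorted[0];
    if (first !== undefined) longest = first;
  }
  // `sorted` is out of scope here; once nothing else references the array,
  // the garbage collector is free to reclaim it.
  return longest;
}
```

Unlike Rust, nothing is freed deterministically at the closing brace; the block only guarantees that no later line can keep using the temporary by name.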

class SomeComponent extends Component {

  validate(){
      // do some validation
      this.setState({ validation });
  }

  submit(){
      const { data } = this.state;
      fetch(someUrl, {
          body: somehowConvertToFormData(data)
      })
  }

  // Arrow function so that `this` stays bound when passed as a handler
  onHandleSubmit = () => {
      this.validate();
      this.submit();
  }

  // render
  render() {
    return (
      <form onClick={this.onHandleSubmit}>
        {/* ... */}
      </form>
    );
  }
}

After I was conditioned into the pattern, this is how I write a callback in a React app:

class SomeComponent extends Component {

  async submitImpl() {
      // data starts here
      const { data } = this.state;
      // data should be passed as parameter by validate()
      // validate doesn't need access to this
      // therefore it does not need to be class method
      const validation = validate(data);
      if(validation.isValid()){
          await fetch(someUrl, { body: data });
      }
      this.setState({ validation })
      // data ends here
  }

  async submit() {
      if (this.state.isSubmitting) return;
      // Critical Section
      this.setState({ isSubmitting: true });
      // Wait for the submission to finish before leaving the critical section
      await this.submitImpl();
      this.setState({ isSubmitting: false });
  }

  // render
  render() {
    return (
      <form onClick={() => this.submit()}>
        {/* ... */}
      </form>
    );
  }
}

You can see the whole submission scenario in submitImpl. There is a single source of data, at the beginning of submitImpl. The validation function receives data and produces a validation object instead of creating a side effect. Developers reading this can read it like a sequential narration. They don't have to jump between functions to know what it is doing.
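
A validation function of that kind might look like the sketch below. The SubmitData fields and the specific rules are invented for illustration; only the shape (data in, validation object with isValid() out, no side effects) follows the snippet above.

```typescript
// Hypothetical form shape and validation result; the article does not define them.
type SubmitData = { email: string; age: number };
type Validation = { errors: string[]; isValid: () => boolean };

// Pure function: takes data in, returns a validation object, touches no state.
function validate(data: SubmitData): Validation {
  const errors: string[] = [];
  if (data.email.indexOf("@") === -1) errors.push("email must contain @");
  if (data.age < 0) errors.push("age must not be negative");
  return { errors, isValid: () => errors.length === 0 };
}
```

Because validate does not touch this, it can sit outside the component and be tested without rendering anything.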

Others

TryFrom and TryInto, which I mentioned earlier, are very convenient for serializing and deserializing business objects. I use that pattern all the time when making TS-based CLIs, scripts, app logic, etc.
I use types and functions to replace classes. A class is good for scoping variables and allowing methods to use them. But other than that, for example if you only need a simple data structure, use a type. A type is intuitive to write. The type declaration { username: string, password: string } looks almost the same as the instantiation { username: "someusername", password: "somepassword" }. Writing a class takes a lot of effort. You have to write the constructor and default values, and then instantiate it. It takes quite an effort to make the instantiation look like the declaration. You can imagine how the constructor looks if you instantiate it like this: new User({ username: "someusername", password: "somepassword" }).
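
For comparison, here is what the two styles could look like side by side. UserData and UserClass are names chosen for this sketch.

```typescript
// With a type, declaration and instantiation look alike.
type UserData = { username: string; password: string };
const typedUser: UserData = {
  username: "someusername",
  password: "somepassword",
};

// The class version needs a constructor to get the same call shape.
class UserClass {
  username: string;
  password: string;
  constructor(init: { username: string; password: string }) {
    this.username = init.username;
    this.password = init.password;
  }
}
const classUser = new UserClass({
  username: "someusername",
  password: "somepassword",
});
```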

Adoption Is Not Perfect

Beauty does not come without a price. There are always things to be sacrificed. Most of the snippets above require more code to be written; worse, in the case of codecs and tryCatch, they compile to bigger code. Changes in paradigm need time, and software developers' time is precious. This is because those beauties are inherent traits of Rust: it is obvious that writing Rust code the Rust way is a lot cheaper than writing TypeScript the Rust way. In spite of that, I choose to pay the price, especially in big projects. Big projects tend to scale fast, and it's better to pay the price early and reap the benefits later. That's a cheap price to pay to turn your compiler into your logic reviewer and to communicate business objects to your peers.

And then there are Rust's macros, one of its best features. There's no such thing as adopting macros in TypeScript. Whenever I write codecs and then have to write the types for them, I'm frustrated that TypeScript doesn't have macros. Whenever I see JS code that creates functions dynamically at initialization time, macros come to my mind again. I have experimented with making a code generator for TypeScript, but it's not the same.

I am both happy and sad that TypeScript's team chose to write TypeScript as a thin layer on top of JavaScript. It makes it easy for JavaScript developers to adopt TypeScript. That way TypeScript sets out a path for JS developers to understand how static types are more useful than they are restricting. However, I wish TypeScript were a 99%-sound language, without non-essential parts of JS like class or near-misleading parts like exceptions. Instead, I wish runtime type checks, the Either data type, and macros were part of the language.