26 June 2018

At the end of last year, I wrote an energy market simulator modeling the Finnish electricity market as part of my thesis work. Although I moved on after finishing that work, I’ve been meaning to return to the project to fix a few nagging TODO items.

While looking into that, I noticed that copy-pasting URLs from the simulator no longer worked. Ouch!

But why is this a problem? The simulator has a few interesting implementation details:

  • It runs completely in the browser — the Monte Carlo simulation runs as a web worker in the browser.
  • It is written in Scala (not JavaScript). More precisely, it uses Scala.js, which generates JavaScript from Scala source code (while remaining mostly cross-compilable to the JVM too).
  • There is no backend and thus no state stored in any backend.
  • All of the simulation world state that the user can manipulate is encoded in the URL.

The last one is intended to make two things possible:

  • If you modify the world state and bookmark the page, loading the bookmark gives you the modified world and not the default one.
  • You can share URLs, since opening a shared URL gives the same world state you had.
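As a rough illustration of the idea, the sketch below packs a JSON string into a URL-safe base64 fragment and reads it back. The `UrlState` object and the choice of the `#` fragment are assumptions made for this example, not the simulator's actual code, and it uses `java.util.Base64` as available on the JVM.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Base64

// Hypothetical helper, not taken from the simulator.
object UrlState {
  // URL-safe alphabet without padding, so the result needs no percent-escaping.
  private val encoder = Base64.getUrlEncoder.withoutPadding
  private val decoder = Base64.getUrlDecoder

  // Assumption: the state lives in the URL fragment (after '#').
  def toFragment(json: String): String =
    "#" + encoder.encodeToString(json.getBytes(UTF_8))

  def fromFragment(fragment: String): String =
    new String(decoder.decode(fragment.stripPrefix("#")), UTF_8)
}
```

Base64 grows the payload by about a third, so the size of the JSON underneath matters a lot for the final URL length.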

Something broke

I have tested this exclusively on Chrome, and I make no claims about whether the application works in any other browser.

Late last year the URL copying worked. When I tested it a few weeks back, it did not. Something had changed in Chrome. Or OS X. (I checked Chrome changelogs from last December but could not find anything immediately obvious.)

Regardless of the cause, I wanted to make URL copying work again.

Solutions, so many solutions to choose from!

This was a problem I had considered before, and I already knew the solution: encode only the changes from the default world state. So, what to use? Since the original (“version 1”) data encoding scheme dumped the whole world state as base64-encoded JSON, a reasonable step might have been to use JSON diffs. But no, I could not find a reasonable Scala.js-compatible implementation. Also, many “JSON diff” formats looked quite verbose and might not have solved the problem at all.

Maybe if I encoded the world state as binary JSON (BSON) instead? Alas, I did not find libraries with sufficient Scala.js support.

No automated luck this time. Let’s roll our own then!

Since the UI only allows users to change the enabled/disabled state and the capacity of sources and lines, and these have unique ids, it is possible to take a shortcut and encode only the changes from the default value on an identifier-by-identifier basis. So I wrote a JSON encoder/decoder wrapper for a class that encapsulates such changes.

So now the default world state URL is small, since only some metadata is encoded (there are no changes to encode). Then, toggle everything and change the capacities (using the global toggles and sliders) and … the URL is too long. Can’t copy and paste. Damn.
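A minimal sketch of that shortcut, assuming each source or line boils down to an enabled flag and a capacity. The `Item`, `Change` and `WorldDiff` names are hypothetical and chosen for this example only:

```scala
// Hypothetical model types; the names are illustrative, not taken from the simulator.
final case class Item(enabled: Boolean, capacity: Int)
final case class Change(id: String, enabled: Boolean, capacity: Int)

object WorldDiff {
  // Keep only the entries whose state differs from the default world.
  def diff(defaults: Map[String, Item], current: Map[String, Item]): Seq[Change] =
    current.toSeq
      .collect {
        case (id, item) if !defaults.get(id).contains(item) =>
          Change(id, item.enabled, item.capacity)
      }
      .sortBy(_.id)

  // Rebuild the full world by overlaying the changes on the defaults.
  def patch(defaults: Map[String, Item], changes: Seq[Change]): Map[String, Item] =
    changes.foldLeft(defaults) { (world, c) =>
      world.updated(c.id, Item(c.enabled, c.capacity))
    }
}
```

With an untouched world `diff` returns an empty sequence, which is why the default URL stays small. The worst case is a world where every item has changed, and then the diff is as large as the full state.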

JSON ends up too verbose in this case. This is partly due to the encoder/decoder logic, which maps a case class Change(name: String, version: Int, changes: Seq[Change]) into {"name": ..., "version": ..., "changes": [...]}, where each change repeats the name, version and changes strings verbatim. I could have encoded the changes as arrays ([id,enabled,capacity]) instead, but… decided not to.

I decided to go for a binary encoding directly. BSON was not an option, so what else? MsgPack would have been nice, since it at least has a specification and some cross-platform support, but again, I did not find a Scala.js-compatible implementation that I was happy with.

There are quite a few binary encoders supporting Scala.js. Out of those, I settled on BooPickle. With that I got the worst-case data encoded as (I’ve broken it into lines of 80 characters; in reality this is a single unbroken string):
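To get a feel for the overhead, here is a small hand-rolled comparison of the object form and the array form for a single change. The `SourceChange` type and the example id are made up for illustration, and the sketch assumes ids contain no characters that need JSON escaping:

```scala
// Hypothetical change record with the three fields named in the array form.
final case class SourceChange(id: String, enabled: Boolean, capacity: Int)

object JsonShapes {
  // Object form: every change repeats the key names.
  def asObject(c: SourceChange): String =
    s"""{"id":"${c.id}","enabled":${c.enabled},"capacity":${c.capacity}}"""

  // Array form: position carries the meaning, so the keys disappear.
  def asArray(c: SourceChange): String =
    s"""["${c.id}",${c.enabled},${c.capacity}]"""
}
```

The repeated keys cost a fixed number of characters per change, so the gap grows linearly with the number of changed items, and base64 encoding inflates it further.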


That’s 1950 characters. Not bad! That we can actually copy and paste. (This is also “version 2” of the data format.)

Yet it is possible to do much, much better. Here is the same URL encoded in “version 3” format:


Only 591 characters! Yet it encodes exactly the same information. How is that possible?

The world data is named and versioned with the assumption that any structural change will result in a new version number. This means that all source and line identifiers in the model are static, and sorting the identifiers results in a sequence where a particular identifier always stays at the same index! The version 3 data format uses this fact to turn identifiers into integers. This helps a lot, since the identifiers are actually pretty long (descriptive) strings.
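The identifier-to-integer trick can be sketched roughly like this. The `IdIndex` object and its method names are assumptions for the example, not the simulator's actual code:

```scala
// Hypothetical helper illustrating the sorted-identifier index.
object IdIndex {
  // Sorting gives every identifier a stable position for a given world version.
  def build(ids: Iterable[String]): Vector[String] = ids.toVector.distinct.sorted

  // Encoder side: replace a long identifier with its position.
  def encode(index: Vector[String], id: String): Option[Int] = {
    val i = index.indexOf(id)
    if (i >= 0) Some(i) else None
  }

  // Decoder side: rebuild the same index from the default world and look it up.
  def decode(index: Vector[String], i: Int): Option[String] = index.lift(i)
}
```

Both sides derive the index from the same named and versioned default world, so the index itself never has to travel in the URL. That only holds as long as a structural change always bumps the version number.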

I kept support for the older formats in the code, so if you have a version 1 encoded URL and can get your browser to open it, it should still work. Similarly, you can try to open this URL. If you manipulate the model in any way (try toggling a checkbox), it will convert the URL into version 3 format (like this).

If you are interested in the code, you can find it here (I linked a commit version since I might refactor the code later).

P.S. If you are bothered by the inconsistent indentation, it is caused by me sometimes editing the code in IntelliJ IDEA and sometimes in Emacs (with ENSIME). I strongly refrain from re-indenting source files on a whim, as it breaks a lot of version history tracking, even on my own source code. As a professional programmer, I learned a long time ago to check my own ego (regarding indentation and code style) at the door and instead adjust to the style of the codebase currently being worked on.

So if you are a junior: Don’t be an ass — don’t arbitrarily re-style existing code to your own tastes. Touch only the code you actually work on.
