quick note on optimizing scala.js compilation and linking

If you use scala.js, you probably use mill or sbt to handle the compilation and linking phases.

sbt’s fastOptJS task runs the scala.js compiler as a plugin inside the scala compiler, then runs the linker to create the final .js file.

Both the scala.js compiler and linker are provided as standalone CLIs at https://github.com/scala-js/scala-js-cli, so you can download and run them by hand. However, you need to add the scala.js-compiled jars of any external libraries to the classpath yourself, which makes the compiler and linker inconvenient to use by hand if you have a lot of dependencies. Also, the compiler and linker are JVM programs, so they have slow startup times.

Can we do better?

You could use the CLIs individually without needing to run mill or sbt. This would somewhat shrink the continuous memory overhead at the expense of compile time. You could also run bloop standalone, which reduces build tool overhead. Of these phases, running the scala.js linker takes the longest.

Let’s say you are developing a web app and need the generated .js file loaded into the browser, say with webpack or parcel. In an intensive edit-compile-debug loop, bloop’s incremental compilation is fast.

Here are some other things to try. Note that to build a native image you must ensure you have graal’s native-image on your path. If you use sdkman, try sdk use java 20.0.0.r11-grl.
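Before trying the steps below, it is worth confirming the prerequisite is actually in place. A minimal sketch of such a check (the sdkman identifier in the message is the one suggested above):

```shell
# Minimal check that graal's native-image is on the PATH before building.
if command -v native-image >/dev/null 2>&1; then
  msg="native-image found at $(command -v native-image)"
else
  msg="native-image missing; try: sdk install java 20.0.0.r11-grl"
fi
echo "$msg"
```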

  • Create a bloop configuration from your favorite build tool. sbt, for example, has a bloop plugin. Once you have set up your scala.js project with the bloop plugin, run sbt bloopInstall to create the .bloop files. If you start up metals on your scala.js project, it uses bloop and will automatically produce the bloop config for you.

  • When you need to focus on your edit-compile-debug cycle, stop running sbt. Run bloop via bloop compile -w -p <project name>. This runs both a JVM and a python interpreter. The -w flag uses a watcher that efficiently recompiles your project when the sources change. It would not be easy to native-image the bloop server.

  • In another folder, install scala-js-cli.

    • git clone https://github.com/scala-js/scala-js-cli.git
    • cd scala-js-cli
    • ./scripts/assemble-cli.sh
  • Add the sbt-native-packager as a plugin to the default project in build.sbt.

    • Modify project/plugins.sbt:
      • addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.6.1")
    • Modify build.sbt, the main project:
      • .enablePlugins(GraalVMNativeImagePlugin)
    • Add a main class to .settings:
      • mainClass in Compile := Some("org.scalajs.cli.Scalajsld")
    • Add config for the graal generation task:
      • graalVMNativeImageOptions in Compile := Seq("--report-unsupported-elements-at-runtime")
  • Generate the native executable using sbt graalvm-native-image:packageBin
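Taken together, the sbt changes above amount to roughly the following. This is a sketch: the project name `cli` is illustrative, and you should adapt it to the actual project layout in scala-js-cli.

```scala
// project/plugins.sbt
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.6.1")

// build.sbt: enable the plugin on the project that holds Scalajsld
lazy val cli = (project in file("."))
  .enablePlugins(GraalVMNativeImagePlugin)
  .settings(
    mainClass in Compile := Some("org.scalajs.cli.Scalajsld"),
    graalVMNativeImageOptions in Compile :=
      Seq("--report-unsupported-elements-at-runtime")
  )
```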

With a .bloop config generated, we can use jq to pull out the compile classpath that scalajsld needs to generate the .js file; you will see jq used below. It’s easy to imagine a small driver program that reads the bloop config generated by sbt and runs the linker using scala-js-cli as a library. You could native-image that driver program for convenience.
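As a sketch of that driver idea, here is a minimal, hypothetical wrapper that reads a bloop-style config and assembles the scalajsld command line. python3 stands in for jq so the sketch is portable, and the file and jar paths are made up:

```shell
# Hypothetical bloop config; real ones live under .bloop/<project>.json.
cat > /tmp/yourproject.json <<'EOF'
{ "project": { "classpath": ["/deps/scalajs-library_2.13-1.0.1.jar", "/deps/app.jar"] } }
EOF

# Pull out the classpath; equivalent to: jq -r '.project.classpath|join(" ")'
CLASSPATH=$(python3 -c 'import json,sys; print(" ".join(json.load(open(sys.argv[1]))["project"]["classpath"]))' /tmp/yourproject.json)

# Dry run: print the linker invocation instead of executing it.
echo "scalajsld -s -d -f -k ESModule -o test.js $CLASSPATH"
```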

You now have a native image of scalajsld and a way to run it.

Does it help?

Here’s the time for the CLI linker, which cold-starts a JVM:

$ cd <your scala.js project>
# Use scalajsld from the standard install of scala-js-cli
$ time /path/to/scala-js-cli/pack/scalajs_2.13-1.0.1/bin/scalajsld -s -d -f -k ESModule -o test.js \
   .bloop/yourproject/bloop-bsp-clients-classes/bloop-cli \
   `jq -r '.project.classpath|join(" ")' .bloop/yourproject.json`
 ...
real	0m16.854s
user	0m26.288s
sys	0m2.340s

The native executable:

time /path/to/scala-js-cli/target/graalvm-native-image/scalajs-cli -s -d -f -k ESModule -o test.js \
   .bloop/yourproject/bloop-bsp-clients-classes/bloop-cli `jq -r '.project.classpath|join(" ")' \
   .bloop/yourproject.json`
...
real	0m6.454s
user	0m13.046s
sys	0m0.713s

With sbt standalone (and no bloop), first we compile to remove the compilation stage from the comparison, then run a fresh link. We also run a second link. Notice that with sbt the time drops from 10s to 2s:

sbt:root> myproject/compile
...
sbt:root> myproject/fastOptJS
[info] Fast optimizing /me/app/myproject/target/scala-2.13/myproject-fastopt.js
[success] Total time: 10 s, completed Mar 10, 2020, 11:25:36 AM
... alter a file and relink...
[success] Total time: 2 s, completed Mar 10, 2020, 11:51:49 AM

The CLI linker is not competitive on second links since it never gets a chance to cache the way sbt does. On the first link, the time is about the same between sbt and the native-image scalajsld. On a much larger project that I also ran for benchmarking, the native image was 2x faster on the first link; on subsequent links, however, sbt was faster. The data points are a bit noisy and this is not an exhaustive test. sbt is fast, and both mill and sbt are great tools. Looking at the scala-js-cli code, there is already a facility for adding caching more robustly across runs. The improvement on incremental links is most likely due to the scala.js linker having caching designed into it, but the CLI cannot benefit from that caching when you only run it once.

We can easily add caching to the scalajsld CLI. I made a small change to scala-js-cli so it stays running and runs a link (via a -w flag) when something on the classpath changes. The native image link time is consistently faster (~500-800ms) and uses less memory on incremental links. The “stay hot” code also works on a standard JVM even if you do not use native-image.
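The watch behavior can be approximated with a very small polling loop. This is a naive sketch, not the actual -w implementation: it uses a checksum of the file listing as a change fingerprint, where a real watcher would use file-system events, and it only echoes where the linker would run.

```shell
# Naive polling watcher: re-run a command whenever the watched tree changes.
WATCH_DIR=$(mktemp -d)
last=""
runs=0
for i in 1 2 3; do
  touch "$WATCH_DIR/file$i.scala"       # simulate an edit between polls
  snap=$(ls "$WATCH_DIR" | cksum)       # cheap fingerprint of the tree
  if [ "$snap" != "$last" ]; then
    last="$snap"
    runs=$((runs + 1))
    echo "change detected; a real watcher would re-run the linker here"
  fi
done
echo "runs=$runs"
```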

On a large, client project, the enhanced scalajsld linker is < 1s:

// scalajsld build with graal using sbt-native-packager and scala 2.13.1 and scala.js 1.0.1
scalajsld -w  -s -r -d -f -k ESModule -o blah.js <classpath entries>
Elapsed time: 12966249385ns / 12s
Watching for changes...
Elapsed time: 568780790ns / 0s

Sticking with your build tool is a good default: it has everything baked in, as long as it works for you. If you have to cold-start the link many times, a native-image scalajsld is fairly easy to employ; native-image gives you options. If you use metals, it will use bloop, so if your sbt is not also using bloop you may be running more JVMs than you realize: sbt’s JVM (if running without bloop) plus bloop’s python+JVM, ouch!

A few thoughts on changes:

  • If we want to run a CLI link and have good incremental link performance, the linker needs to be modified to run in “watch” mode.
  • The compiler, even when using bloop directly, still runs a JVM and a python program. That’s a lot of overhead. If the scala-js CLI compiler (not the linker) could be enhanced to run under native-image, with a simple watcher placed on top of it, this could be a very efficient model: no build tool overhead and roughly native speed.
  • Even if we altered scalajsld to add a watch mode like the above, which we should, you would still be running a full JVM for compilation, whether via the CLI, bloop, or sbt with or without bloop. So we need both the compiler and the linker running fast as native images.
  • Running bloop in watch mode and a linker in watch mode (the scala-js-cli scalajsld with native-image and the enhancements mentioned above), is a good combination for a tight cycle and minimizing resource consumption especially if you are using vscode metals. Yeah, it’s fast!

For now it probably makes sense for most people to stay with a build tool. However, it’s within reach that some enhancements to the basic CLI tools, and perhaps a wrapper that sits on top of both, could create a CLI that is efficient like other js tooling, e.g. the typescript, babel or reasonml compilers. Typescript is slow if you only run it once, but much faster with the -w flag due to caching. Essentially, you are pulling parts of a build tool into the compiler and linker. This could make sense for some scala.js targets, such as web applications that rely on other js-based tooling to assemble your program rather than a scala build tool acting as the main task runner.

With some programming effort, it is quite possible to run the scala.js compiler and linker independent of a build tool (after a config is generated) and use native-image to make them faster. The tricky part for the compiler is native-imaging the underlying scala 2.13 compiler, although 2.12 has been native-imaged (https://stackoverflow.com/questions/20322543/how-to-invoke-the-scala-compiler-programmatically). You could then retain a full build tool for other aspects of your build while optimizing the edit-compile-debug cycle. It would be great if the bloop backend server could be native-imaged, but it’s less of a priority since bloop can be used by multiple tools simultaneously.

As always, more testing and fiddling is needed to make sure everything works as it should.

Note: graal appears to be gaining traction and libraries are embedding native-image metadata into their jars to help with native-image generation.
