In serverless environments, applications should scale to zero when not in use and then create new instances immediately to handle increased load. Open Liberty can take an application from zero to first response in just 1 second. While this can be considered fast, the startup time of the application framework should ideally be less than 200 milliseconds to avoid high latency for application users in a serverless environment.
One approach to making Java start faster has been to compile to native code. This has proven to greatly reduce the startup time of many Java cloud applications, to less than 100 milliseconds. However, native compilation of Java, while powerful when it works, also has many challenges that may prevent some applications from taking advantage of the quick startup without undergoing significant changes. Compiling to native code also often has a negative effect on application throughput.
In this talk I will discuss a different approach to achieving "Instant On" for Java applications using Checkpoint/Restore in Userspace (CRIU). CRIU is a feature available on Linux that enables a snapshot of a running application to be taken. The application can then be restored very quickly from the point at which the snapshot was taken and resume serving its users. One advantage of this approach is that once the application is restored, it is business as usual for the Java application: all the functionality of a normal Java environment is available to it. No additional changes should be necessary for the application to take advantage of the "Instant On" functionality provided by CRIU.
There are several things to consider when using CRIU to ensure the restored process behaves properly. For example, a snapshot must not contain secrets and keys that should not be shared with every restored instance of the application, and random number generators must continue to give each application instance its own unique random sequence. I will discuss how we are working with the Eclipse OpenJ9 and Open Liberty projects to achieve "Instant On" while addressing these kinds of issues for Java applications. I will also demonstrate an example of this approach with OpenJ9 and Open Liberty in a Docker container image.
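
To make the two concerns above concrete, here is a minimal sketch in plain Java of one way a runtime might handle them: scrub secrets before the snapshot is taken, then reseed the random number generator and reload secrets after restore. Everything in it is an assumption made for illustration. `CheckpointHooks`, `InstanceState`, and the `simulateCheckpoint`/`simulateRestore` methods are hypothetical names, not the OpenJ9 or Open Liberty API, and no real CRIU checkpoint is taken; the two `simulate` calls only stand in for the moments just before the snapshot and just after a restore.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Illustrative sketch only: hypothetical names, no real CRIU checkpoint is taken. */
public class RestoreHooksSketch {

    /** Hypothetical hook registry; a real runtime would run these around the snapshot. */
    static final class CheckpointHooks {
        private final List<Runnable> preCheckpoint = new ArrayList<>();
        private final List<Runnable> postRestore = new ArrayList<>();

        void onPreCheckpoint(Runnable hook) { preCheckpoint.add(hook); }
        void onPostRestore(Runnable hook) { postRestore.add(hook); }

        // Stand-ins for "just before the snapshot" and "just after restore".
        void simulateCheckpoint() { preCheckpoint.forEach(Runnable::run); }
        void simulateRestore() { postRestore.forEach(Runnable::run); }
    }

    /** State that must stay out of the snapshot or must differ per restored instance. */
    static final class InstanceState {
        private final Supplier<char[]> secretSource; // hypothetical secret provider
        private SecureRandom random = new SecureRandom();
        private char[] secret;

        InstanceState(Supplier<char[]> secretSource, CheckpointHooks hooks) {
            this.secretSource = secretSource;
            hooks.onPreCheckpoint(this::scrubSecret);
            hooks.onPostRestore(this::reinitialize);
        }

        void loadSecret() { secret = secretSource.get(); }

        // Zero and drop the secret so it is not captured in the snapshot.
        private void scrubSecret() {
            if (secret != null) {
                Arrays.fill(secret, '\0');
                secret = null;
            }
        }

        // Replace the generator so each restored instance gets a freshly seeded one,
        // and fetch the secret again from its source.
        private void reinitialize() {
            random = new SecureRandom();
            secret = secretSource.get();
        }

        SecureRandom random() { return random; }
        char[] secret() { return secret; }
    }

    public static void main(String[] args) {
        CheckpointHooks hooks = new CheckpointHooks();
        InstanceState state = new InstanceState(() -> "example-only".toCharArray(), hooks);
        state.loadSecret();

        hooks.simulateCheckpoint();
        System.out.println("secret present at snapshot: " + (state.secret() != null));

        hooks.simulateRestore();
        System.out.println("secret present after restore: " + (state.secret() != null));
    }
}
```

In this sketch the secret is absent at the simulated snapshot and present again after the simulated restore. The point of routing both concerns through hooks is that application code keeps using `random()` and `secret()` as usual, while the framework decides what must be discarded before the snapshot and rebuilt afterwards.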