Enterprise Java and JavaScript workloads yield a number of interesting and useful stories, ranging from common memory-tuning issues that affect traffic efficiency to extremely complex multi-threaded race conditions that lead to application crashes. These stories provide vital insights into the platform's interactions with common workloads. This session presents insightful production anomalies, covering how each problem manifested, how we tracked it down, and what lessons were learned.
Objective of the presentation:
The objective of this talk is to illustrate deeper issues that surface in the runtime when large-scale deployments run in production, and to share knowledge gathered through long and effortful exercises with end users, so that these insights can be reused without spending similar effort.
Each story in this talk follows the same format: problem symptom, business impact, diagnostic data, debugging experiments, root cause, and key learnings and best practices. Each story is representative of a symptom class:
- Unexpectedly large memory retention [ Memory ]
- CPU register corruption [ Crash ]
- libuv tight loop / high CPU with UDS unplugged [ High CPU ]
- Orphaned lock [ Hang ]
- Crash on exit (aka exit race) [ Crash ]
Attendee pre-requisites:
Knowledge of Java