dynimize: to dynamically optimize the in-memory machine code of a running process
modern English: from a combination of dynamic and optimize
Dynimizer: a software agent that dynamically optimizes the in-memory machine code
of a running process
modern English: from a combination of dynamic and optimizer
Dynimizer is a user mode program that quickly optimizes the in-memory machine
code of target processes based on profiling information it gathers at run-time,
improving performance for some workloads. Dynimizer typically runs as a background
system process and does not require the source code of target programs that it optimizes.
2. Installation & Quick Start
To install Dynimizer with default options, run the following commands in a Linux terminal:
wget https://dynimize.com/install -O install
wget https://dynimize.com/install.sha256 -O install.sha256
if sha256sum -c install.sha256 | grep OK; then sudo bash ./install -default; fi
Dynimizer is now installed and can be further configured by editing /etc/dyni.conf.
To optimize any CPU intensive target process whose executable is listed in the [exeList]
section of /etc/dyni.conf, run:
$ sudo dyni -start
The command dyni -status shows target processes progressing through the "profiling",
"dynimizing", and "dynimized" states. A process is fully optimized once it reaches
the "dynimized" state:
$ sudo dyni -status
Dynimizer is running
mysqld, pid: 8375, dynimizing
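For scripting, the state of a single process can be pulled out of this output. The following is a minimal sketch, assuming every status line keeps the "name, pid: N, state" shape shown above. The function name dyni_state_for_pid is hypothetical and not part of Dynimizer.

```shell
# Hypothetical helper (not part of Dynimizer): print the state reported for
# one pid. Status text is read from stdin and is assumed to use the
# "name, pid: N, state" line format shown above.
dyni_state_for_pid() {
    awk -F', ' -v want="pid: $1" '$2 == want { print $3 }'
}

# Usage with the sample output above (prints "dynimizing"):
printf '%s\n' 'Dynimizer is running' 'mysqld, pid: 8375, dynimizing' |
    dyni_state_for_pid 8375
```

In practice the input would come from the real command, for example: sudo dyni -status | dyni_state_for_pid 8375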
• 4 GB of virtual memory (swap + RAM) for each target process being
dynimized in parallel. Dynimizer needs this memory only while a target is in the
"dynimizing" phase and releases it afterwards.
• The above virtual memory requirement increases to 6 GB (swap + RAM)
per target process being dynimized when the target is the MySQL 8 mysqld process,
because the mysqld executable for MySQL 8 is twice the size of the one for MySQL 5.7.
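The memory requirement above is simple arithmetic and can be checked before starting Dynimizer. This is a rough sketch, not an official tool: the function names are hypothetical, and it assumes that the MemAvailable and SwapFree fields of /proc/meminfo are a reasonable estimate of the free virtual memory (swap + RAM), with 1 GB treated as 1024 * 1024 kB.

```shell
# Hypothetical helpers (not part of Dynimizer).
# required_gb OTHER_TARGETS MYSQL8_TARGETS: GB of virtual memory needed to
# dynimize that many processes in parallel (4 GB each, 6 GB for MySQL 8 mysqld).
required_gb() {
    echo $(( $1 * 4 + $2 * 6 ))
}

# available_gb [MEMINFO_FILE]: free RAM plus free swap in whole GB, summed
# from the MemAvailable and SwapFree fields (both reported in kB).
available_gb() {
    awk '/^(MemAvailable|SwapFree):/ { kb += $2 }
         END { print int(kb / 1048576) }' "${1:-/proc/meminfo}"
}

need=$(required_gb 1 1)   # one MySQL 5.7 target plus one MySQL 8 target
have=$(available_gb)
[ "$have" -ge "$need" ] ||
    echo "warning: ${have} GB available, ${need} GB needed" >&2
```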
3.1 VM specific requirements
There are no specific requirements for running inside virtual machines under the KVM,
Xen, or ESXi (VMware) hypervisors. When virtual PMUs are not enabled, Dynimizer can
still use perf_event_open() with the system task clock event.
Dynimizer can therefore be run in most cloud environments.
3.2 Docker specific requirements
Dynimizer must be run in the same container as the target process being optimized.
When run, the dyni process needs to use the ptrace()
and perf_event_open() system calls to profile and interface with
the target process being optimized. This requires the dyni process to be run with the
CAP_SYS_ADMIN and CAP_SYS_PTRACE
capabilities. At the time of this writing (version 18), Docker disables these system calls
and capabilities by default. The simplest way to enable them is to start the Docker container
with the --privileged flag. For example:
$ docker run -it --privileged my_image
Alternatively, see Docker's documentation on seccomp profiles for enabling just
these system calls.
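As a narrower alternative to --privileged, the two capabilities can be granted individually. The wrapper below is a sketch only and has not been validated with Dynimizer. It assumes that replacing Docker's default seccomp profile with seccomp=unconfined is acceptable in your environment; a custom seccomp profile that allows only ptrace() and perf_event_open() would be stricter. The function name run_dyni_container is hypothetical.

```shell
# Sketch: grant only the two capabilities named above instead of --privileged.
# Assumption: seccomp filtering is switched off ("unconfined") so that ptrace()
# and perf_event_open() are not blocked by the default profile.
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo for a dry run

run_dyni_container() {
    "$DOCKER" run -it \
        --cap-add=SYS_ADMIN --cap-add=SYS_PTRACE \
        --security-opt seccomp=unconfined \
        "$@"
}

# Usage: run_dyni_container my_image
```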
4. Command Line Options
Print the help message
• -start
Start Dynimizer as a background daemon process
• -stop
Stop the Dynimizer daemon process
• -status
Print the state of Dynimizer and its optimized processes.
States will be listed in the following format:
target process executable name, pid, state
Print the Dynimizer version
• -log <path>
Specify a path to the output log
• -optimizeOnce
Do not re-dynimize a process after it has been dynimized and its workload
has dramatically changed. Dynimizer will exit after dynimizing a process
when used in conjunction with -pid <number>
• -fastCompile
Shorten the amount of ramp-up time it takes to dynimize a process, at the
expense of final steady state performance after ramp-up
• -lowOverhead
Reduce profiling overhead for target processes, while incurring a slight
profiling overhead across the entire system while dynimizing those processes.
May also prolong the dynimizing phase and marginally reduce the final
speedup of dynimized processes
• -exe <name>
Only processes run from this executable can be dynimized
• -secureCodeCache
Specify that the JIT code caches generated by Dynimizer not be both readable
and writable simultaneously. May incur a marginal performance penalty.
Usually necessary when SELinux is being enforced
• -pid <number>
Only dynimize the process specified by this pid number. This will cause Dynimizer
to exit after dynimizing a process when used in conjunction with -optimizeOnce
5. Dynimizer Usage Examples
The sudo command should be prepended to all dyni commands
if not executed as superuser. This has been left out for readability.
Start the Dynimizer daemon process with a new log path "/tmp/log":
$ dyni -start -log /tmp/log
Stop the Dynimizer daemon process:
$ dyni -stop
Read Dynimizer's status:
$ dyni -status
Dynimize a process at pid 9618 and exit once dynimized:
$ dyni -start -pid 9618 -optimizeOnce:y
Launch Dynimizer as a foreground process, only dynimizing process 9618
and then exiting. Note that the -start option is excluded here so that
Dynimizer is not launched in the background, which can be useful for
running Dynimizer from shell scripts:
$ dyni -pid 9618 -optimizeOnce:y
Only dynimize processes run from the "myprog" executable:
$ dyni -start -exe myprog
Dynimize processes and shorten the ramp-up time it takes to dynimize them:
$ dyni -start -fastCompile:y
Dynimize processes, specifying that the JIT code caches are not both readable
and writable at the same time, which is usually necessary when SELinux is being enforced:
$ dyni -start -secureCodeCache:y
* A JIT code cache is a memory region that is automatically
loaded into a target process by Dynimizer. Dynimizer primarily uses it to store the
new optimized machine code sequences it generates, amongst other things.
6. Configuration File
On startup, Dynimizer is configured based on the settings in /etc/dyni.conf.
The following is a description of these settings. Many of these settings can be
overridden by command line options. Note that starting a line with the # character
will cause that line to be skipped.
All options follow the line in the .conf file containing the string
[options]. If any of the following options are missing,
their default settings are used. Note that the < | >
characters should not actually be placed in the .conf file:
Same as the command line option -log
The max total size of both log files at <path> and <path>.old combined.
Size is in bytes unless specified by MB, M, GB, G, KB, K
Same as the command line option ‑optimizeOnce
Same as the command line option ‑fastCompile
Same as the command line option ‑lowOverhead
Set to y for Dynimizer to be launched on system startup
Initially set to y if ‑selinux:y
was selected when running ./INSTALL.sh.
Same as the command line option ‑secureCodeCache
The lines following [exeList]
denote the whitelist of possible optimization targets. These are the names of
the executables used to launch the processes that can be targeted by
Dynimizer, one name per line. For example, the MySQL server process would
be listed as mysqld. A process must have been started by an executable on
this list to be dynimized. Note that if a symbolic or hard link is used to call
an executable, the real executable file name must be specified on this list.
The lines following [users] list the possible user
names of the owners of processes that can be targeted by Dynimizer, one name per line.
If present, a process must match both the [exeList] and
[users] list in order to be dynimized.
Default: no user name requirement.
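To see which entries Dynimizer would read from the [exeList] or [users] sections, the file can be inspected with a small script. This is a sketch under stated assumptions, not Dynimizer's own parser: it assumes a section runs from its "[name]" line to the next line starting with "[", and that blank lines and lines starting with # are skipped. The function name conf_section is hypothetical.

```shell
# Hypothetical helper (not part of Dynimizer): list the entries of one section
# of a dyni.conf-style file.
# Usage: conf_section FILE SECTION_NAME
conf_section() {
    awk -v header="[$2]" '
        /^\[/      { in_section = ($0 == header); next }  # any section header
        /^#/       { next }                               # comment line
        /^[ \t]*$/ { next }                               # blank line
        in_section { print }
    ' "$1"
}

# Example: conf_section /etc/dyni.conf exeList
```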
The output log specified by log: records all errors and Dynimizer status updates
along with their timestamps. On installation the log path is set to /var/log/dyni.log
in /etc/dyni.conf. Once the log file reaches half the maxLogSize, it is moved to the
same path name with .old appended, and a new log is started under the original log file name.
Logging is disabled if the log option is removed from dyni.conf.
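The relationship between maxLogSize and the rotation point can be worked out as follows. This is an illustrative sketch only: it assumes the unit suffix directly follows the number and that K, M, and G are powers of 1024, neither of which is confirmed here. Both function names are hypothetical.

```shell
# Hypothetical helper: convert a maxLogSize-style value to bytes.
# Assumptions: suffix directly follows the digits; K/M/G are powers of 1024.
size_to_bytes() {
    num=${1%%[!0-9]*}     # leading digits
    unit=${1#"$num"}      # whatever follows them
    case $unit in
        '')   mult=1 ;;
        K|KB) mult=1024 ;;
        M|MB) mult=$((1024 * 1024)) ;;
        G|GB) mult=$((1024 * 1024 * 1024)) ;;
        *)    echo "unknown unit: $unit" >&2; return 1 ;;
    esac
    echo $((num * mult))
}

# The live log is moved to <path>.old once it reaches half of maxLogSize,
# so the two files together stay within the configured maximum.
rotation_threshold() {
    bytes=$(size_to_bytes "$1") || return 1
    echo $((bytes / 2))
}

# Example: rotation_threshold 10MB
```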
7. Workload Requirements
To obtain benefit from the current version of Dynimizer,
all of the following workload conditions must be met:
A small number of CPU intensive processes
On a given OS host where the workload is running, the workload must be comprised of
one or a few CPU intensive processes. Optimizing a large number of processes at once is
Long running programs
The processes being optimized must have long lifetimes, and their workloads must be long
running, so that the warm-up time associated with optimization is amortized.
Optimized processes must be 64-bit, derived from x86-64 executables and shared
libraries, which must comply with the x86-64 ABI and ELF-64 formats.
Most statically compiled applications on Linux meet this requirement.
Target processes must be dynamically linked to their shared libraries. Statically
linked processes are not yet supported. Most Linux programs are dynamically linked.
No self modifying code
The target application must not be running its own Just-In-Time compiler such as
those found in Java virtual machines. This therefore excludes Java Applications.
Front-end CPU stalls
The workload wastes a lot of time in CPU instruction cache misses, instruction TLB misses,
and to a lesser extent branch mispredictions.
User mode execution
Much of that wasted time is spent in user mode execution (as opposed to kernel mode),
as Dynimizer only optimizes user mode machine code.
Because of these requirements, Dynimizer takes a whitelist approach when determining
whether programs are allowed to be optimized, with MySQL and its variants being the currently
supported optimization targets on that list for this early beta release. Other programs are
not currently supported, and while they can be used with Dynimizer, they should be very
thoroughly tested by the user or system administrator before being deployed in a
production environment.
Future versions of Dynimizer may eliminate many of these workload requirements,
broadening the variety of applicable scenarios as well as further increasing the performance
delivered in previously beneficial cases.
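The 64-bit x86-64 ELF requirement above can be checked from a script before adding an executable to the exeList. The sketch below inspects only the ELF header and is not how Dynimizer itself validates targets. The function name is_x86_64_elf is hypothetical.

```shell
# Hypothetical helper (not part of Dynimizer): succeed if FILE starts with a
# 64-bit x86-64 ELF header. Checks the ELF magic bytes, EI_CLASS == 2
# (ELFCLASS64) and e_machine == 0x3e (EM_X86_64, little-endian at offset 18).
is_x86_64_elf() {
    # First 20 bytes of the file as space-separated hex pairs.
    hdr=$(od -An -tx1 -N20 "$1" 2>/dev/null | tr -s ' \n' ' ')
    set -- $hdr
    [ "$#" -ge 20 ] || return 1
    [ "$1$2$3$4" = "7f454c46" ] && [ "$5" = "02" ] && [ "${19}${20}" = "3e00" ]
}

# Example: is_x86_64_elf /usr/sbin/mysqld && echo "64-bit x86-64 ELF"
```

The path /usr/sbin/mysqld in the example is an assumption about where the executable is installed. The dynamic linking requirement is not covered by this check; one common way to inspect it is to look for an INTERP entry in the output of readelf -l.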
8. Miscellaneous Notes
Dynimizer 1.0 Beta Not fit for production outside of MySQL,
MariaDB, and Percona Server
Dynimizer has not been extensively stress tested with non-MySQL targets.
It is only suitable for demonstration purposes there.
Sequential CPU speedup
Dynimizer usage may result in the speedup of the target application and/or a
reduction in CPU resources consumed by that application. Speedup typically occurs
when sequential CPU performance is a bottleneck. When sequential CPU performance is
not a bottleneck, a reduction in CPU resources used by the optimized process is
experienced which can be observed by an increase in CPU idling. All improvements
are generally limited to time spent in user mode execution.
Dynimizer performs work in response to CPU usage
When a process whose executable is listed in the exeList consumes large amounts of
CPU resources, Dynimizer will automatically begin to optimize the in-memory
instructions of that process. Dynimizer only performs work in response to CPU
resources consumed by a target process. The more CPU resources a target process consumes,
the more intensely Dynimizer will work to dynimize it and the more quickly the process
will become dynimized. A target process that consumes few CPU resources will therefore
take a long time to become fully dynimized.
Responds to changes in workload
Once all target applications are sufficiently optimized, Dynimizer will enter idle mode.
It will then be prompted to perform more work in response to a significant change in the
CPU workload of an already dynimized process, in which case it will re-dynimize it.
It will also be prompted to do more work when a new undynimized process begins,
where its executable is on the exeList and the process is CPU intensive.
Target applications do NOT need to be restarted in order to be dynimized.
Once Dynimizer is started it will automatically detect and begin dynimizing them immediately.
One Dynimizer instance per OS host
Dynimizer is not designed to be run as multiple instances on the same host OS.
If a second instance is launched in parallel, it will detect an already running
instance of Dynimizer and exit.
No handoffs between Dynimizer instances
A new Dynimizer process will not dynimize a target process that has already been
dynimized by a previous Dynimizer process.
No stale shared libraries
Dynimizer will not dynimize a running process if its shared libraries on disk have changed
after they were loaded into that running process. dyni -status will report "stale shared libraries"
for that target process and it must be restarted for Dynimizer to target it.
Dynimizer is currently single threaded and will only consume the resources of at
most one CPU core while running.
9. Example Lifecycle of Dynimizer and a Target Application
1. Dynimizer is launched and begins in idle mode, monitoring system processes.
This state consumes virtually no CPU resources:
$ dyni -start
2. A new target application process such as mysqld is then launched and becomes
CPU intensive, or an already running target application becomes CPU intensive.
3. Dynimizer detects this and begins to dynimize the target application. Incremental,
atomic updates to the target application's machine code are made. If the target
application remains CPU intensive then this optimization typically takes around 60
seconds to complete:
$ dyni -status
Dynimizer is running
mysqld, pid: 8375, dynimizing
4. Dynimizer has completed its current batch of optimization work and enters idle mode:
$ dyni -status
Dynimizer is running
mysqld, pid: 8375, dynimized
5. A new target application is launched or the workload of the previously dynimized
process has drastically changed. In either case, Dynimizer returns to step 3.
Although Dynimizer is single threaded, it can apply these steps to multiple
target processes at the same time.
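A benchmark or deployment script may want to wait for the lifecycle above to reach the "dynimized" state before measuring performance. The loop below is a sketch, assuming the status line format shown in the examples above. The function name, the STATUS_CMD variable, and the default timeout and polling interval are all hypothetical choices, not Dynimizer settings.

```shell
# Hypothetical helper (not part of Dynimizer): poll the status command until
# the given pid is reported as "dynimized", or give up after TIMEOUT seconds.
# Usage: wait_until_dynimized PID [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
STATUS_CMD="${STATUS_CMD:-dyni -status}"

wait_until_dynimized() {
    pid=$1 timeout=${2:-300} interval=${3:-5} waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # Match e.g. "mysqld, pid: 8375, dynimized" but not "dynimizing".
        if $STATUS_CMD 2>/dev/null | grep -q "pid: $pid, dynimized\$"; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 1
}

# Example: wait_until_dynimized 8375 600 && echo "mysqld is dynimized"
```

Because optimization only progresses while the target is CPU intensive, the timeout should be generous for lightly loaded processes.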
10. Dynimizer Overheads
Dynimizer introduces both CPU and memory overheads when ramping up performance during the
dynimizing phase.* The following section addresses these overheads.
10.1 Dynimizer CPU overhead
CPU overhead exists during the warm-up phase while a process is being dynimized, and it has
two components. The most obvious is the CPU time that the dyni process itself consumes. After
an initial spike to 100% utilization of a single core lasting less than a second, dyni typically
fluctuates around 20% utilization of a single core for the remainder of the warm-up phase.
Because it is rare for all hardware threads on a large multicore system to be fully utilized,
the single threaded dyni process is unlikely to have much impact in this respect. The second
component is application profiling, which is brief but typically far more drastic.
Both overheads are offset by the gradual machine code optimizations as they take effect, and
are completely eliminated once the process reaches the dynimized phase. An initial warm-up
period should therefore be set aside for workloads using Dynimizer.
10.2 Dynimizer Memory overhead
Memory overhead also exists, where the dyni process typically requires around 4 GB of
virtual memory for each target process being dynimized, while it is dynimizing them.
This large amount of virtual memory is used for bookkeeping purposes when dynimizing.
However, the dynamic range of memory accesses is quite limited and in our experience does
little to trigger additional page faults in memory constrained workloads.
That said, at least 4GB of free swap space is required for each dynimized process during
the dynimizing phase, and even more may be allocated to provide for a margin of safety.
Once a process is dynimized, most of that memory is freed. The dyni process then typically
consumes 50-150 MB of virtual memory per dynimized process. This is used for bookkeeping
purposes in order to react to drastic workload changes in dynimized processes. This memory
also undergoes a very small range of dynamic accesses, and should therefore have negligible
impact on system paging. That steady state memory overhead can be eliminated altogether
by using the -optimizeOnce:y option with Dynimizer. Without that option, if a drastic
workload change occurs and the target processes are re-dynimized, they will return to the
dynimizing phase, in which case Dynimizer will briefly incur the 4 GB virtual memory
overhead per target process again. An additional use of memory is the 35 MB code cache that
gets loaded into the target process when dynimized. However, the access patterns of
machine code instruction memory with the code cache in place are more constrained
than those of the original undynimized process, so the resident memory pages used there
should be fewer in most cases.
Overall, because of the limited range of dynamic memory accesses, these memory overheads
should not affect performance in memory constrained environments. Therefore more RAM is
typically not required. However one should make sure that the swap space can accommodate
the increased virtual memory used during the dynimizing phase.
* Reductions in both the CPU and memory overheads incurred by Dynimizer
are planned for future releases.
11. Using Dynimizer on Applications Other Than MySQL
Many have asked us why MySQL is the main target for the initial release of Dynimizer.
Database workloads are often IO bound and do not benefit from improved CPU performance to
the same extent as other workloads. Extreme levels of instruction cache misses are common
in many enterprise software workloads, which leaves many places where Dynimizer could be put
to productive use in the future. However, many of these other applications, such as Redis or NGINX,
spend most of their CPU time in system mode execution (in the Linux kernel). The machine code
underlying this system mode execution cannot be optimized by Dynimizer in the current release,
and therefore these applications benefit to a lesser extent than MySQL. For example, we've measured
speedups of around 10% with NGINX in a CPU bound setup. Other applications are often
configured in a multiprocess configuration where there may be too many processes for Dynimizer
to handle effectively, or where the processes have short lifetimes. For these reasons,
we've spent little time examining other use cases for now.
Future versions of Dynimizer will be able to optimize workloads with these characteristics too.
At the moment, MySQL and its variants have been the main focus of the current release, with other
single process, multithreaded relational databases on Linux being the most likely candidates to
benefit from this release in our opinion. Because this is such new technology, we've decided to
limit the scope of this release to focus on delivering a stable user experience when improving
the performance of a single, highly popular family of programs: MySQL, MariaDB and Percona Server.
Stay tuned for expanded scope in future releases.
If you find Dynimizer useful in other workloads outside of MySQL, we'd love to hear about it.
Let us know at email@example.com.