Profiling Dune Builds

Javier Chávarri
Ahrefs
Published in
5 min readFeb 5, 2024

--

Co-authored with Louis Roché.

There are multiple ways to know what’s going on in Dune (build system for OCaml). The tool itself includes a few ways that are mentioned at the end of the guide, but they’re not sufficient as of the current date (Feb ‘24). So the most relevant information at this moment is obtained through external tools.

Note that the process explained below with perf and memtrace is applicable to any OCaml program, not just Dune itself.

Performance trace with perf

perf is a tool coming with the Linux kernel. Essentially, it will run any application and take measurements every few milliseconds to get a snapshot of the callstack.

The main subcommand to record a run is perf record.

perf reports created with perf record can be opened with perf report. But if you prefer more visual ways, Firefox has a profiler that can be used as well.

perf record -F 1000 --call-graph dwarf dune build @all # or any other alias / target
perf script -F +pid > test.perf # converts perf.data to test.perf so it can be opened

Then upload the resulting test.perf to the online tool in https://profiler.firefox.com/.

Call tree view in Firefox profiling tool
Call tree view in Firefox profiling tool
Flame graph view on Firefox profiling tool
Flame graph view in Firefox profiling tool

Custom version of dune

If you want to profile some custom version of Dune you’re working on locally, remember to replace the path to the dune binary when recording.

First, build the custom version of dune (run the command from wherever the Dune fork is located in your machine):

dune build bin/main.exe

Then, update the path in perf record (run the command from the folder where the project build you are trying to profile is located):

perf record -F 1000 --call-graph dwarf ../dune/_build/default/bin/main.exe build @all

Source lines

perf has an option to include the source files linked to each function call:

perf report -F +srcline

Unfortunately, there does not seem to be a way to view the source lines in Firefox reports.

In case the visualization of source files is needed, we can use Speedscope instead of Firefox profiling tool.

After getting the perf report using perf record as explained above, process it with the following commands:

perf script -F +srcline > test.perf
./perf-addlines.pl test.perf > test-speedscope.perf

You can obtain perf-addlines.pl here.

Now, open test-speedscope.perf from https://www.speedscope.app/. Remember to select main.exe thread in top selector:

Flame chart view in Speedscope profiling tool
Flame chart view in Speedscope profiling tool

Memory profiling with memtrace

Sometimes you might encounter a lot of work done in garbage collection functions in the traces, like caml_call_gc. In these cases, you might want to know more about how memory is being used in Dune.

To do this we can use memtrace. To add it to your custom dune build:

From the Dune folder

  • Install memtrace in the opam switch of your local Dune copy: opam install memtrace
  • Add memtrace lib to the main exe here
  • Add Memtrace.trace_if_requested (); in the main entry point here

From your project folder

  • Run your custom version of dune with the flag to activate memtrace: MEMTRACE=dune-build-all.ctf ../dune/_build/default/bin/main.exe build @all

Then you can open the generated report using memtrace-viewer:

opam install memtrace_viewer
memtrace-viewer js_of_ocaml-leaky.ctf
Main view of a report obtained with OCaml memtrace
Main view of a report obtained with OCaml memtrace

For a more complete guide to memtrace, refer to this link.

Garbage collection stats

As Dune is an OCaml program, we can request the runtime to print information about the garbage collector by setting the v value in the OCAMLRUNPARAM environment variable. To get a summary of the garbage collection work for an entire build, set v to 1024.

OCAMLRUNPARAM=v=1024 dune build @all

The output looks like this:

allocated_words: 5027692323
minor_words: 4856526503
promoted_words: 869694473
major_words: 1040860293
minor_collections: 18873
major_collections: 14
heap_words: 890589184
heap_chunks: 28
top_heap_words: 890589184
compactions: 0
forced_major_collections: 0

There are many different messages which can be displayed. All the options are described in the OCaml manual.

(verbose) What GC messages to print to stderr. This is a sum of values selected from the following:

1 (= 0x001)
Start and end of major GC cycle.
2 (= 0x002)
Minor collection and major GC slice.
4 (= 0x004)
Growing and shrinking of the heap.
8 (= 0x008)
Resizing of stacks and memory manager tables.
16 (= 0x010)
Heap compaction.
32 (= 0x020)
Change of GC parameters.
64 (= 0x040)
Computation of major GC slice size.
128 (= 0x080)
Calling of finalization functions
256 (= 0x100)
Startup messages (loading the bytecode executable file, resolving shared libraries).
512 (= 0x200)
Computation of compaction-triggering condition.
1024 (= 0x400)
Output GC statistics at program exit.
2048 (= 0x800)
GC debugging messages.
4096 (= 0x1000)
Address space reservation changes.

Dune internal tracing tools

Dune provides at least three ways to inspect what it’s doing behind the scenes.

--trace-file

This flag will generate a json file that can be loaded in Firefox profiling tool or Speedscope, for example:

dune build @all --trace-file dune-build-all.json

Unfortunately, these traces don’t include any of the internal work Dune is doing, only calls to external commands like ocamldep, ocamlc or melc. So it’s only useful if you want to investigate slowness in those parts. If you want to keep up with any updates on that functionality, you can follow ocaml/dune#3862.

--print-metrics

This flag shows some internal measurements taken during the build, both for Memo metrics and also other code. At the moment the latter only includes a single function generic_digest. Example usage:

$ dune build @all --print-metrics
Memo graph: 2/1/0 nodes/edges/blocked (restore), 531541/2985230/599695
nodes/edges/blocked (compute)
Memo cycle detection graph: 328685/898708/599695 nodes/edges/paths
(21.26s total, 271.9M heap words)
Timers:
generic_digest - time spent = 0.38s, count = 195711

--dump-gc-stats

This flag will print some GC metrics to a given file, e.g.

dune build @all --dump-gc-stats all-gc-stats.txt

The contents in the generated txt file look like:

((minor_words 2753427060.)
(promoted_words 384483892.)
(major_words 398169012.)
(minor_collections 10718)
(major_collections 19)
(heap_words 271897088)
(heap_chunks 28)
(live_words 200276449)
(live_blocks 43138894)
(free_words 71620633)
(free_blocks 16)
(largest_free 8766464)
(fragments 6)
(compactions 1)
(top_heap_words 271897088)
(stack_size 255))

--

--

Frontend at Ahrefs • Passionate about The Web, JavaScript, and now OCaml and ReScript