@Benitoite
Last active September 15, 2024 00:57
RawTherapee Linux zsh performance timer
git clone https://github.com/Benitoite/raw-test ./raw-test ; num=$(nproc) ; sockets=$(lscpu | grep 'Socket(s):' | awk '{print $2}') ; total_threads=$((num * sockets)) ; name=$(lscpu | grep 'Model name:' | sed 's/Model name: *//') ; mhz=$(lscpu | grep 'CPU max MHz:' | awk '{print $4}') ; proc=$(cat ./AboutThisBuild.txt | grep 'Processor') ; echo "\`\`\`" ; echo "================================" ; echo "Available threads = $total_threads / CPU = $name / $mhz MHz / Target = $proc" ; for (( threads = 2; threads <= total_threads; threads *= 2 )); do export OMP_NUM_THREADS=$threads ; total_time=0.0 ; n=5 ; for file in typewriter.CR2 naturalbridges.CR2 beachcabin.ARW; do time_sum=0.0 ; for (( i = 0; i < n; i++ )); do start=$(( $(date +%s%N)/1000000 )) ; rawtherapee-cli -j -s -Y -c ./raw-test/$file &> /dev/null ; finished=$(( $(date +%s%N) / 1000000 )) ; time_sum=$(( "$time_sum + $finished - $start" )) ; done ; avg_time=$(( "$time_sum / $n" )) ; total_time=$(( "$total_time + $avg_time" )) ; done ; echo "$(printf '%.0f\n' "$total_time") total milliseconds elapsed (average of $n runs) using OMP_NUM_THREADS = $threads" ; done ; echo "================================" ; echo "\`\`\`"
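For readability, the measurement at the core of the one-liner can be unrolled roughly as follows. This is a sketch, not a replacement: `avg_ms` is a hypothetical helper name, it uses integer milliseconds where the one-liner uses zsh floating-point arithmetic, and it assumes GNU `date` for the `%N` (nanoseconds) format.

```shell
# avg_ms N CMD [ARGS...]: run CMD N times and print the mean wall-clock
# time in whole milliseconds. Hypothetical helper; requires GNU date (%N).
avg_ms() {
    runs=$1
    shift
    sum=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Same timestamp arithmetic as the one-liner: nanoseconds -> ms.
        start=$(( $(date +%s%N) / 1000000 ))
        # Output is discarded and failures are ignored, as in the one-liner.
        "$@" >/dev/null 2>&1 || true
        finished=$(( $(date +%s%N) / 1000000 ))
        sum=$(( sum + finished - start ))
        i=$(( i + 1 ))
    done
    # Integer division: the result is truncated to whole milliseconds.
    echo $(( sum / runs ))
}
```

After `export OMP_NUM_THREADS=4`, a call such as `avg_ms 5 rawtherapee-cli -j -s -Y -c ./raw-test/typewriter.CR2` would correspond to one inner loop of the one-liner for a single file.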
@Benitoite

Benitoite commented Aug 29, 2024

Instructions:

  1. Test with RawTherapee 5.11.
  2. Make sure rawtherapee-cli is in your PATH.
  3. Open a zsh session.
  4. To see the build target in the result, change to the build directory (the one containing AboutThisBuild.txt) first.
  5. Run the one-liner.
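The prerequisites above can be checked before starting a long run. A minimal sketch, assuming nothing beyond what the instructions state; `preflight` is a hypothetical helper name, and it does not verify the RawTherapee version.

```shell
# Hypothetical preflight check for the instructions above.
# Returns 0 when the hard requirements are met, 1 otherwise.
preflight() {
    preflight_status=0
    # rawtherapee-cli must be found via PATH.
    if ! command -v rawtherapee-cli >/dev/null 2>&1; then
        echo "missing: rawtherapee-cli is not in PATH" >&2
        preflight_status=1
    fi
    # The one-liner uses floating-point $(( )) arithmetic, which zsh
    # supports and bash does not.
    if [ -z "${ZSH_VERSION:-}" ]; then
        echo "missing: not running under zsh" >&2
        preflight_status=1
    fi
    # AboutThisBuild.txt only feeds the "Target =" field, so its absence
    # is reported but not treated as an error.
    if [ ! -f ./AboutThisBuild.txt ]; then
        echo "note: ./AboutThisBuild.txt not found; Target will be empty" >&2
    fi
    return "$preflight_status"
}
```

Running `preflight && echo ready` in the shell you plan to benchmark in prints `ready` only when both hard requirements hold.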
================================
Available threads = 16 / CPU = AMD Ryzen 9 5900HX with Radeon Graphics / 4890.0000 MHz / Target = Processor: x86_64 (native)
32491 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
20964 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
15747 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 8
15347 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 16
================================
================================
Available threads = 16 / CPU = AMD Ryzen 9 5900HX with Radeon Graphics / 4890.0000 MHz / Target = Processor: sandybridge-ivybridge
33771 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
21651 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
18099 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 8
21596 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 16
================================
================================
Available threads = 16 / CPU = AMD Ryzen 9 5900HX with Radeon Graphics / 4890.0000 MHz / Target = Processor: generic x86
34148 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
21845 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
18260 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 8
21701 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 16
================================

-flto=`nproc` -fgraphite-identity -ftree-loop-distribution -floop-nest-optimize -O3

cf. https://www.reddit.com/r/AMDHelp/comments/7ttszv/comment/dtjtq5m/

================================
Available threads = 16 / CPU = AMD Ryzen 9 5900HX with Radeon Graphics / 4890.0000 MHz / Target = Processor: x86_64 (native)
27268 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
18234 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
16349 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 8
20162 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 16
================================
================================
Available threads = 16 / CPU = AMD Ryzen 9 5900HX with Radeon Graphics / 4890.0000 MHz / Target = Processor: sandybridge-ivybridge
27808 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
18515 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
16549 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 8
20393 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 16
================================
================================
Available threads = 16 / CPU = AMD Ryzen 9 5900HX with Radeon Graphics / 4890.0000 MHz / Target = Processor: generic x86
28136 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
18556 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
16621 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 8
20155 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 16
================================
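To compare scaling across these blocks, the output can be piped through a small filter that prints each run's speedup relative to the first run of its block. This is a sketch; `speedup_table` is a hypothetical helper name, and it assumes the exact line format printed by the one-liner.

```shell
# Hypothetical helper: reads benchmark output on stdin and prints, for each
# result line, the thread count, the elapsed time, and the speedup relative
# to the first result line of the same block.
speedup_table() {
    awk '
        /^Available threads/ { base = 0 }   # new block: reset the baseline
        /total milliseconds elapsed/ {
            ms = $1                          # first field: milliseconds
            threads = $NF                    # last field: OMP_NUM_THREADS
            if (base == 0) base = ms
            printf "%2d threads  %6d ms  %.2fx\n", threads, ms, base / ms
        }
    '
}

# Usage with the first result block from this thread:
speedup_table <<'EOF'
Available threads = 16 / CPU = AMD Ryzen 9 5900HX with Radeon Graphics / 4890.0000 MHz / Target = Processor: x86_64 (native)
32491 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
20964 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
15747 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 8
15347 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 16
EOF
```

For that block, going from 2 to 16 threads is 32491 / 15347, or about a 2.12x speedup, with most of the gain already reached at 8 threads.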

@Lawrence37

With a native build:

================================
Available threads = 2 / CPU = Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz / 3100.0000 MHz / Target = 
58148 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 2
================================

Somehow it detected only 2 threads instead of 4, so I ran it once more with `num` manually set to 4 (skipping the 2-thread test, since I already had those results).
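The cause of the low count is not established here. One thing that may be worth ruling out, offered as an assumption rather than a diagnosis: GNU `nproc` without options takes `OMP_NUM_THREADS` into account when it is set, and the one-liner exports that variable, so a value left over in the shell can change what `nproc` reports. `nproc --all` ignores it. The hypothetical helper below prints the values side by side.

```shell
# Hypothetical helper: show the thread counts reported by different sources.
report_threads() {
    echo "OMP_NUM_THREADS = ${OMP_NUM_THREADS:-<unset>}"
    # Plain nproc (GNU coreutils) is influenced by OMP_NUM_THREADS when set.
    echo "nproc           = $(nproc)"
    # --all reports the installed processing units regardless of that variable.
    echo "nproc --all     = $(nproc --all)"
}

report_threads
```

If the first two values match and differ from the third, running `unset OMP_NUM_THREADS` before the one-liner should let it see the full count.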

================================
Available threads = 4 / CPU = Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz / 3100.0000 MHz / Target = 
47676 total milliseconds elapsed (average of 5 runs) using OMP_NUM_THREADS = 4
================================
