How to debug/profile scheduler utilization

When our app is running, scheduler utilization is stuck at 100%, but total CPU usage is only about 50%. How can we figure out where the problem is and optimize it? Here is what we collected:

[{total,0.5281588460041117,"52.8%"},
 {weighted,1.0563176920082233,"105.6%"},
 {normal,1,1.0,"100.0%"},
 {normal,2,1.0,"100.0%"},
 {normal,3,1.0,"100.0%"},
 {normal,4,1.0,"100.0%"},
 {cpu,5,0.08379955305185231,"8.4%"},
 {cpu,6,0.01765390297141619,"1.8%"},
 {cpu,7,0.082705427673925,"8.3%"},
 {cpu,8,0.04070323424296014,"4.1%"}]
                       lock   id  #tries  #collisions  collisions [%]  time [us]  duration [%]
                      -----  --- ------- ------------ --------------- ---------- -------------
                  run_queue    6 5557965        17786          0.3200     265722        0.0750
             alcu_allocator   10  138892         1264          0.9101     253522        0.0715
                  proc_main 2402 1553115         1480          0.0953      90911        0.0256
                  port_lock  289  198267          269          0.1357      23203        0.0065
               drv_ev_state  128  510171         2246          0.4402      20033        0.0057
                       mseg    1   73798           82          0.1111      13695        0.0039
    erl_db_catree_base_node    1   94973          128          0.1348       8790        0.0025
                  proc_msgq 2402 2324178          765          0.0329       4670        0.0013
                proc_status 2402 1398041           56          0.0040       4003        0.0011
                crypto_stat   41   42574           19          0.0446       3305        0.0009
              pid_ref_table    5  580838           61          0.0105       2688        0.0008
               db_hash_slot  448  738619           16          0.0022        138        0.0000
                     db_tab   64 1433299            5          0.0003         38        0.0000
 dirty_run_queue_sleep_list    2   50657            8          0.0158          5        0.0000
Average thread real-time    : 1037288 us
Accumulated system run-time : 4465310 us
Average scheduler run-time  : 1037413 us

        Thread    alloc      aux      bif busy_wait check_io emulator      ets       gc  gc_full      nif    other     port     send    sleep   timers

Stats per thread:
     async( 0)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 1)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 2)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 3)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 4)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 5)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 6)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 7)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 8)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
     async( 9)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
       aux( 1)    0.05%    0.42%    0.00%    0.00%    0.12%    0.00%    0.00%    0.00%    0.00%    0.00%    1.42%    0.00%    0.00%   98.00%    0.00%
dirty_cpu_( 1)    1.25%    0.00%    0.00%    1.53%    0.00%    0.04%    0.00%    7.16%    4.27%    0.00%    0.02%    0.00%    0.00%   85.72%    0.00%
dirty_cpu_( 2)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_cpu_( 3)    0.24%    0.00%    0.00%    0.35%    0.00%    0.01%    0.00%    1.72%    0.99%    0.00%    0.00%    0.00%    0.00%   96.68%    0.00%
dirty_cpu_( 4)    0.72%    0.00%    0.00%    1.36%    0.00%    0.04%    0.00%    5.74%    2.91%    0.00%    0.01%    0.00%    0.00%   89.22%    0.00%
dirty_io_s( 1)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s( 2)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s( 3)    0.00%    0.00%    0.00%    0.02%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%   99.98%    0.00%
dirty_io_s( 4)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s( 5)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s( 6)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s( 7)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s( 8)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s( 9)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
dirty_io_s(10)    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
      poll( 0)    0.00%    0.00%    0.00%    0.00%    0.03%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%   99.97%    0.00%
 scheduler( 1)   16.98%    0.26%    7.91%    0.00%    0.23%   62.26%    0.42%    2.98%    0.05%    7.55%    0.31%    0.86%    0.08%    0.00%    0.10%
 scheduler( 2)   18.75%    0.11%    1.55%    0.00%    0.13%   75.10%    0.13%    0.72%    0.06%    2.39%    0.24%    0.75%    0.06%    0.00%    0.02%
 scheduler( 3)   15.70%    0.09%    2.75%    0.00%    0.13%   79.12%    0.11%    0.27%    0.01%    0.02%    0.21%    1.35%    0.22%    0.00%    0.02%
 scheduler( 4)   16.46%    0.07%    1.02%    0.00%    0.09%   74.49%    0.03%    1.66%    0.06%    5.23%    0.20%    0.67%    0.02%    0.00%    0.01%

Stats per type:
         async    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
           aux    0.05%    0.42%    0.00%    0.00%    0.12%    0.00%    0.00%    0.00%    0.00%    0.00%    1.42%    0.00%    0.00%   98.00%    0.00%
dirty_cpu_sche    0.55%    0.00%    0.00%    0.81%    0.00%    0.02%    0.00%    3.66%    2.04%    0.00%    0.01%    0.00%    0.00%   92.91%    0.00%
dirty_io_sched    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%  100.00%    0.00%
          poll    0.00%    0.00%    0.00%    0.00%    0.03%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%    0.00%   99.97%    0.00%
     scheduler   16.97%    0.13%    3.31%    0.00%    0.15%   72.75%    0.17%    1.41%    0.04%    3.79%    0.24%    0.91%    0.09%    0.00%    0.04%

There are no visible signs of lock contention in your paste, so the problem might be outside the BEAM.

  1. How many CPU threads does this machine have? In your paste, the BEAM uses only 4, which is relatively rare nowadays unless the server is a VM with CPU limits. (You can check what the BEAM itself sees with the calls sketched below.)
  2. Are there any CPU limiters (e.g. cgroups v2 with the CPU controller) in place?
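
For reference, you can ask the BEAM itself what it sees via standard erlang:system_info/1 queries; on a box like the one in your paste you'd expect something like:

    %% Logical processors the emulator detected vs. what it may use:
    1> erlang:system_info(logical_processors).
    4
    2> erlang:system_info(logical_processors_available).
    4
    %% Normal schedulers configured and currently online:
    3> erlang:system_info(schedulers).
    4
    4> erlang:system_info(schedulers_online).
    4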

Hello @Polexy, how can I get these tables, please?


@Abdelghani

First element: scheduler:utilization/1.
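
For example (the scheduler module ships with runtime_tools since OTP 21; the 10-second window here is an arbitrary choice):

    %% Measure utilization over a fixed 10-second window:
    1> scheduler:utilization(10).
    %% -> [{total,_,_},{weighted,_,_},{normal,1,_,_},...] as in the first paste above

    %% Or take a sample, run the workload, then measure since that sample:
    2> Sample = scheduler:sample().
    3> scheduler:utilization(Sample).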

Second element: lock profiler.
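
Assuming "lock profiler" means lcnt (the second table above matches lcnt:conflicts/0 output), the usual flow is: start a lock-counting emulator (erl -emu_type lcnt on OTP 21+, or a VM built with --enable-lock-counter), then:

    1> lcnt:start().         % start the lcnt server
    %% ... generate load on the system ...
    2> lcnt:collect().       % snapshot lock statistics from the running emulator
    3> lcnt:conflicts().     % print the contention table shown above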

Third element: msacc.
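
And for msacc (note that the extra columns in the paste, e.g. alloc/bif/busy_wait, only appear on an emulator built with --with-microstate-accounting=extra):

    1> msacc:start(1000).    % reset counters and collect for 1000 ms
    2> msacc:print().        % print the per-thread and per-type tables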


@paulo-f-oliveira Thank you so much, bro, that will be very helpful.

Do you only have 4 cores, or is that just all you are using? How many cores does the machine have?


Yes, it's only 4 CPUs, without any additional rules/configuration.
The issue reproduced when working with big JSON payloads; using eprof, I found a problem in hackney.
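
For anyone landing here later, a minimal eprof session looks roughly like this; my_app:handle_big_json/1 and Payload are hypothetical stand-ins for the real workload:

    1> eprof:start().
    2> eprof:profile(fun() -> my_app:handle_big_json(Payload) end).
    3> eprof:analyze(total). % per-function call counts and accumulated time
    4> eprof:stop().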
