A shared FIFO queue is implemented using ETS

Using atomics instead of ets:update_counter

 utTc:tm(1000,100,shq_atomics, out, [T]).
=====================
execute Args:[{#Ref<0.1270903068.2801664003.78349>,
               #Ref<0.1270903068.2801664003.78350>}]
execute Fun :out
execute Mod :shq_atomics
execute LoopTime:100
execute ProcCnts:1000
PMaxTime:    9401316(ns)   0.009401(s)
PMinTime:     185371(ns)   0.000185(s)
PSumTime: 3823492648(ns)   3.823493(s)
PAvgTime: 3823492.64(ns)   0.003823(s)
FAvgTime: 38234.9264(ns)   0.000038(s)
PGrar   :        561(cn)       0.56(%)
PLess   :        439(cn)       0.44(%)
=====================
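
Schematically, the counter part looks roughly like this (a simplified sketch, not the exact module: the handle is assumed to be the {Table, CountersRef} pair visible in the Args above, with both counters in one two-slot atomics array):

-module(shq_atomics_sketch).
-export([init/1, in/2, out/1]).

%% Sketch only: two counters in a single atomics array, values in an ETS table.
init(Name) ->
    Tab = ets:new(Name, [ordered_set, public, {write_concurrency, true}]),
    Counters = atomics:new(2, [{signed, true}]),   %% slot 1 = front, slot 2 = rear
    {ok, {Tab, Counters}}.

in({Tab, Counters}, Value) ->
    Rear = atomics:add_get(Counters, 2, 1),        %% claim the next rear position
    true = ets:insert(Tab, {Rear, Value}),
    ok.

out({Tab, Counters}) ->
    Front = atomics:add_get(Counters, 1, 1),       %% claim the next front position
    case ets:take(Tab, Front) of
        [{_, Value}] -> {ok, Value};
        []           -> empty   %% this branch is where the discussion below comes in
    end.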
3 Likes

Not 100% sure as I'm on a train, so reading this on a tiny phone screen and can't try it myself, but… Try init, out, in, out, in that order :wink:

2 Likes

Am I reading that right? It got 100 times faster using atomics?

edit: I was not reading it right :laughing:

2 Likes

Merged results for easier comparison.
C: prefix is ETS counters
A: prefix is atomics

=====================
C: PMaxTime:   12487019(ns)   0.012487(s)
A: PMaxTime:    9401316(ns)   0.009401(s)
C: PMinTime:     328385(ns)   0.000328(s)
A: PMinTime:     185371(ns)   0.000185(s)
C: PSumTime: 3748337374(ns)   3.748337(s)
A: PSumTime: 3823492648(ns)   3.823493(s)
C: PAvgTime: 3748337.37(ns)   0.003748(s)
A: PAvgTime: 3823492.64(ns)   0.003823(s)
C: FAvgTime: 37483.3737(ns)   0.000037(s)
A: FAvgTime: 38234.9264(ns)   0.000038(s)
C: PGrar   :        397(cn)       0.40(%)
A: PGrar   :        561(cn)       0.56(%)
C: PLess   :        603(cn)       0.60(%)
A: PLess   :        439(cn)       0.44(%)
=====================
3 Likes

Did you mean init, in, out, in?

Initialization of the queue is not really part of the bench, but I'm putting a small test for in + out together now.

Eshell V11.1.8  (abort with ^G)
1> c(shq).
2> {ok, T0} = shq:init(test0).
{ok,#Ref<0.1099474642.1732640769.159934>}
3> utTc:tm(1000,100,shq, test_in_out, [T0]).
=====================
execute Args:[#Ref<0.1099474642.1732640769.159934>]
execute Fun :test_in_out
execute Mod :shq
execute LoopTime:100
execute ProcCnts:1000
PMaxTime:   33086708(ns)   0.033087(s)
PMinTime:     460905(ns)   0.000461(s)
PSumTime: 1398331599(ns)  13.983316(s)
PAvgTime: 13983315.9(ns)   0.013983(s)
FAvgTime: 139833.159(ns)    0.00014(s)
PGrar   :        545(cn)       0.55(%)
PLess   :        455(cn)       0.46(%)
=====================
ok
4> c(shq_atomics).
{ok,shq_atomics}
5> {ok, T1} = shq_atomics:init(test1).
{ok,{#Ref<0.1099474642.1732640769.160023>,
     #Ref<0.1099474642.1732640769.160024>}}
6> utTc:tm(1000,100,shq_atomics, test_in_out, [T1]).
=====================
execute Args:[{#Ref<0.1099474642.1732640769.160023>,
               #Ref<0.1099474642.1732640769.160024>}]
execute Fun :test_in_out
execute Mod :shq_atomics
execute LoopTime:100
execute ProcCnts:1000
PMaxTime:   20723251(ns)   0.020723(s)
PMinTime:     376104(ns)   0.000376(s)
PSumTime: 7618716033(ns)   7.618716(s)
PAvgTime: 7618716.03(ns)   0.007619(s)
FAvgTime: 76187.1603(ns)   0.000076(s)
PGrar   :        577(cn)       0.58(%)
PLess   :        423(cn)       0.42(%)
=====================
ok

Looks like using atomics is approximately twice as fast when combining both in/out operations.
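
For context, the helper being timed is essentially one insert immediately followed by one take per loop iteration, roughly:

%% rough shape of the timed helper; Q is the handle returned by init/1
test_in_out(Q) ->
    in(Q, some_item),   %% one in ...
    out(Q).             %% ... immediately followed by one out, per iteration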

2 Likes

I ran some tests.


This is the test result for a single process:

DsName        V_Num   insert/per     read/per     update/per      for/per   delete/per     termSize
=====================================================================================
SQueueDs          8      28.74ns      78.92ns     notSupport   notSupport   notSupport          160
SQueueDs         16      21.64ns     123.83ns     notSupport   notSupport   notSupport          288
SQueueDs         32      18.06ns      97.73ns     notSupport   notSupport   notSupport          544
SQueueDs         64      27.19ns      76.90ns     notSupport   notSupport   notSupport         1056
SQueueDs        128      29.78ns      67.17ns     notSupport   notSupport   notSupport         2080
SQueueDs        256      48.13ns      58.80ns     notSupport   notSupport   notSupport         4128
SQueueDs        516      29.59ns      55.86ns     notSupport   notSupport   notSupport         8288
SQueueDs       1024      29.85ns      55.37ns     notSupport   notSupport   notSupport        16416
SQueueDs       2048      28.38ns      52.88ns     notSupport   notSupport   notSupport        32800
SQueueDs       4096      24.70ns      66.00ns     notSupport   notSupport   notSupport        65568
SQueueDs       8192      26.03ns      60.38ns     notSupport   notSupport   notSupport       131104
SQueueDs      16384      27.95ns      57.94ns     notSupport   notSupport   notSupport       262176

DsName        V_Num   insert/per     read/per     update/per      for/per   delete/per     termSize
=====================================================================================
SUtFifoDs         8     314.88ns     401.36ns     notSupport   notSupport   notSupport          117
SUtFifoDs        16     280.98ns     382.25ns     notSupport   notSupport   notSupport          117
SUtFifoDs        32     275.33ns     411.32ns     notSupport   notSupport   notSupport          117
SUtFifoDs        64     269.73ns     401.47ns     notSupport   notSupport   notSupport          117
SUtFifoDs       128     279.53ns     389.12ns     notSupport   notSupport   notSupport          117
SUtFifoDs       256     313.52ns     366.07ns     notSupport   notSupport   notSupport          117
SUtFifoDs       516     290.24ns     360.48ns     notSupport   notSupport   notSupport          117
SUtFifoDs      1024     297.09ns     343.08ns     notSupport   notSupport   notSupport          117
SUtFifoDs      2048     284.66ns     343.89ns     notSupport   notSupport   notSupport          117
SUtFifoDs      4096     280.31ns     346.25ns     notSupport   notSupport   notSupport          117
SUtFifoDs      8192     278.99ns     354.70ns     notSupport   notSupport   notSupport          117
SUtFifoDs     16384     284.85ns     359.10ns     notSupport   notSupport   notSupport          117

DsName        V_Num   insert/per     read/per     update/per      for/per   delete/per     termSize
=====================================================================================
SUtLifoDs         8     313.07ns     385.18ns     notSupport   notSupport   notSupport          117
SUtLifoDs        16     286.83ns     370.18ns     notSupport   notSupport   notSupport          117
SUtLifoDs        32     275.22ns     400.88ns     notSupport   notSupport   notSupport          117
SUtLifoDs        64     271.34ns     389.83ns     notSupport   notSupport   notSupport          117
SUtLifoDs       128     276.57ns     384.26ns     notSupport   notSupport   notSupport          117
SUtLifoDs       256     276.53ns     379.04ns     notSupport   notSupport   notSupport          117
SUtLifoDs       516     288.33ns     360.05ns     notSupport   notSupport   notSupport          117
SUtLifoDs      1024     280.32ns     356.36ns     notSupport   notSupport   notSupport          117
SUtLifoDs      2048     279.17ns     343.29ns     notSupport   notSupport   notSupport          117
SUtLifoDs      4096     275.01ns     347.54ns     notSupport   notSupport   notSupport          117
SUtLifoDs      8192     274.39ns     353.88ns     notSupport   notSupport   notSupport          117
SUtLifoDs     16384     274.45ns     357.87ns     notSupport   notSupport   notSupport          117

DsName        V_Num   insert/per     read/per     update/per      for/per   delete/per     termSize
=====================================================================================
SFwQueueDs        8     296.46ns     288.02ns     notSupport   notSupport   notSupport          117
SFwQueueDs       16     273.41ns     257.70ns     notSupport   notSupport   notSupport          117
SFwQueueDs       32     271.46ns     301.00ns     notSupport   notSupport   notSupport          117
SFwQueueDs       64     278.71ns     299.60ns     notSupport   notSupport   notSupport          117
SFwQueueDs      128     281.00ns     280.90ns     notSupport   notSupport   notSupport          117
SFwQueueDs      256     282.93ns     264.43ns     notSupport   notSupport   notSupport          117
SFwQueueDs      516     297.66ns     243.62ns     notSupport   notSupport   notSupport          117
SFwQueueDs     1024     287.28ns     243.58ns     notSupport   notSupport   notSupport          117
SFwQueueDs     2048     286.14ns     242.52ns     notSupport   notSupport   notSupport          117
SFwQueueDs     4096     278.60ns     243.89ns     notSupport   notSupport   notSupport          117
SFwQueueDs     8192     276.70ns     250.73ns     notSupport   notSupport   notSupport          117
SFwQueueDs    16384     279.50ns     254.94ns     notSupport   notSupport   notSupport          117

DsName        V_Num   insert/per     read/per     update/per      for/per   delete/per     termSize
=====================================================================================
SShqDs            8     135.25ns    2170.19ns     notSupport   notSupport   notSupport       noSize
SShqDs           16     117.56ns    1985.69ns     notSupport   notSupport   notSupport       noSize
SShqDs           32     146.95ns    1822.43ns     notSupport   notSupport   notSupport       noSize
SShqDs           64     147.38ns    1819.74ns     notSupport   notSupport   notSupport       noSize
SShqDs          128     158.59ns    1837.62ns     notSupport   notSupport   notSupport       noSize
SShqDs          256     255.53ns    1820.30ns     notSupport   notSupport   notSupport       noSize
SShqDs          516     398.12ns    1551.47ns     notSupport   notSupport   notSupport       noSize
SShqDs         1024     426.73ns    1545.10ns     notSupport   notSupport   notSupport       noSize
SShqDs         2048     402.67ns    1553.08ns     notSupport   notSupport   notSupport       noSize
SShqDs         4096     424.71ns    1391.29ns     notSupport   notSupport   notSupport       noSize
SShqDs         8192     415.96ns    1411.83ns     notSupport   notSupport   notSupport       noSize
SShqDs        16384     431.48ns    1335.03ns     notSupport   notSupport   notSupport       noSize

DsName        V_Num   insert/per     read/per     update/per      for/per   delete/per     termSize
=====================================================================================
SShq2Ds           8     311.04ns    1725.08ns     notSupport   notSupport   notSupport       noSize
SShq2Ds          16     117.28ns    1792.48ns     notSupport   notSupport   notSupport       noSize
SShq2Ds          32     141.41ns    1637.89ns     notSupport   notSupport   notSupport       noSize
SShq2Ds          64     151.33ns    1628.06ns     notSupport   notSupport   notSupport       noSize
SShq2Ds         128     168.60ns    1564.77ns     notSupport   notSupport   notSupport       noSize
SShq2Ds         256     205.66ns    1468.19ns     notSupport   notSupport   notSupport       noSize
SShq2Ds         516     330.46ns    1257.55ns     notSupport   notSupport   notSupport       noSize
SShq2Ds        1024     326.78ns    1187.75ns     notSupport   notSupport   notSupport       noSize
SShq2Ds        2048     280.50ns    1188.43ns     notSupport   notSupport   notSupport       noSize
SShq2Ds        4096     271.93ns    1190.28ns     notSupport   notSupport   notSupport       noSize
SShq2Ds        8192     267.30ns    1188.58ns     notSupport   notSupport   notSupport       noSize
SShq2Ds       16384     279.21ns    1185.12ns     notSupport   notSupport   notSupport       noSize
This is the result of a multi-process concurrent test:
Eshell V12.2  (abort with ^G)
1> testQueue:initFq(tttt).
tttt
2> utTc:tm(100, 10000, testQueue, insertFq, [tttt]).
=====================
execute Args:[tttt]
execute Fun :insertFq
execute Mod :testQueue
execute LoopTime:10000
execute ProcCnts:100
PMaxTime:  378821242(ns)   0.378821(s)
PMinTime:   15096784(ns)   0.015097(s)
PSumTime: 2173840209(ns)  21.738402(s)
PAvgTime: 217384020.(ns)   0.217384(s)
FAvgTime: 21738.4020(ns)   0.000022(s)
PGrar   :         56(cn)       0.56(%)
PLess   :         44(cn)       0.44(%)
=====================
ok
3> utTc:tm(100, 10000, testQueue, readFq, [tttt]).
=====================
execute Args:[tttt]
execute Fun :readFq
execute Mod :testQueue
execute LoopTime:10000
execute ProcCnts:100
PMaxTime:  444666180(ns)   0.444666(s)
PMinTime:   19017927(ns)   0.019018(s)
PSumTime: 2048784021(ns)   20.48784(s)
PAvgTime: 204878402.(ns)   0.204878(s)
FAvgTime: 20487.8402(ns)    0.00002(s)
PGrar   :         55(cn)       0.55(%)
PLess   :         45(cn)       0.45(%)
=====================
ok
4> Pid = shq:start().
{ok,<0.1190.0>}
5> {ok, P}  = shq:start().
{ok,<0.1192.0>}
6> utTc:tm(100, 10000, testQueue, insertShq, [P]).
=====================
execute Args:[<0.1192.0>]
execute Fun :insertShq
execute Mod :testQueue
execute LoopTime:10000
execute ProcCnts:100
PMaxTime:  206973130(ns)   0.206973(s)
PMinTime:  159462971(ns)   0.159463(s)
PSumTime: 1876425642(ns)  18.764256(s)
PAvgTime: 187642564.(ns)   0.187643(s)
FAvgTime: 18764.2564(ns)   0.000019(s)
PGrar   :         51(cn)       0.51(%)
PLess   :         49(cn)       0.49(%)
=====================
ok
7> utTc:tm(100, 10000, testQueue, readShq, [P]).
=====================
execute Args:[<0.1192.0>]
execute Fun :readShq
execute Mod :testQueue
execute LoopTime:10000
execute ProcCnts:100
PMaxTime: 1184566642(ns)   1.184567(s)
PMinTime: 1177180841(ns)   1.177181(s)
PSumTime: 1182314248(ns) 118.231425(s)
PAvgTime: 1182314248(ns)   1.182314(s)
FAvgTime: 118231.424(ns)   0.000118(s)
PGrar   :         60(cn)       0.60(%)
PLess   :         40(cn)       0.40(%)
=====================
ok
8> shq2:start().
{ok,<0.1396.0>}
9> {ok, P2} = shq2:start().
{ok,<0.1398.0>}
10> utTc:tm(100, 10000, testQueue, insertShq, [P2]).
=====================
execute Args:[<0.1398.0>]
execute Fun :insertShq
execute Mod :testQueue
execute LoopTime:10000
execute ProcCnts:100
PMaxTime:  199369662(ns)    0.19937(s)
PMinTime:    5489423(ns)   0.005489(s)
PSumTime: 1276080648(ns)  12.760806(s)
PAvgTime: 127608064.(ns)   0.127608(s)
FAvgTime: 12760.8064(ns)   0.000013(s)
PGrar   :         60(cn)       0.60(%)
PLess   :         40(cn)       0.40(%)
=====================
ok
11> utTc:tm(100, 10000, testQueue, readShq, [P2]).
=====================
execute Args:[<0.1398.0>]
execute Fun :readShq
execute Mod :testQueue
execute LoopTime:10000
execute ProcCnts:100
PMaxTime: 1040311848(ns)   1.040312(s)
PMinTime: 1028508840(ns)   1.028509(s)
PSumTime: 1036916348(ns) 103.691635(s)
PAvgTime: 1036916348(ns)   1.036916(s)
FAvgTime: 103691.634(ns)   0.000104(s)
PGrar   :         57(cn)       0.57(%)
PLess   :         43(cn)       0.43(%)
=====================
2 Likes

No :smile: init, out, in, out, just as she said. Just call it in the shell in that order; I bet the final out will tell you the queue is empty even though you did an in right before: the front counter overtakes the rear one.
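
Spelled out (assuming an in/2 alongside the out/1 shown earlier; the exact return values depend on the implementation, but the last call should wrongly claim the queue is empty):

{ok, Q} = shq_atomics:init(demo),
shq_atomics:out(Q),          %% out on the empty queue: the front counter moves past rear
shq_atomics:in(Q, <<"x">>),  %% rear catches up, but the value lands in a slot ...
shq_atomics:out(Q).          %% ... the front counter has already skipped, so: still "empty"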

2 Likes

Yeah, I see that :sweat_smile:. The examples should be corrected now, with the counter increment being reversed when the table turns out to be empty.
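
In rough terms, out now does something like this (simplified, with made-up names):

out(Tab) ->
    Front = ets:update_counter(Tab, front_counter, 1),
    case ets:take(Tab, Front) of
        [{_, Value}] ->
            {ok, Value};
        [] ->
            %% table was empty: reverse the increment so the front counter
            %% does not overtake the rear one
            ets:update_counter(Tab, front_counter, -1),
            empty
    end.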

2 Likes

Better idea (IMO), but it works only with ETS counters: put both counters in the same key (so you can update/read (via Incr=0) them both in one op), and change the semantics of the front counter to mean "distance to rear". Update the front counter with a threshold of 0.
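
Roughly what I have in mind, untested (key layout and names are just for illustration):

-module(shq_onekey_sketch).
-export([new/0, in/2, out/1]).

new() ->
    Tab = ets:new(shq_onekey, [set, public, {write_concurrency, true}]),
    true = ets:insert(Tab, {ctrs, 0, 0}),   %% {Key, Rear, DistanceToRear}
    Tab.

in(Tab, Value) ->
    %% bump the rear index and the distance together, in one atomic call
    [Rear, _Dist] = ets:update_counter(Tab, ctrs, [{2, 1}, {3, 1}]),
    true = ets:insert(Tab, {Rear, Value}),
    ok.

out(Tab) ->
    %% decrement the distance but clamp it at 0 via the threshold spec,
    %% and read the rear index via Incr = 0 -- still a single atomic call
    [Dist, Rear] = ets:update_counter(Tab, ctrs, [{3, -1, 0, 0}, {2, 0}]),
    case ets:take(Tab, Rear - Dist) of
        [{_, Value}] -> {ok, Value};
        []           -> empty
    end.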

Sorry for the abbreviated message, I'm still on phone only :sweat_smile:

2 Likes

However… IMO this entire approach has "Race condition" written all over it :sweat_smile::sweat_smile::sweat_smile:

2 Likes

Since ETS counters and atomics are both atomic, I'm curious to understand where you see a potential race condition?

This is all really just a thought exercise, but I used separate counters for ETS to reduce any potential blocking behavior/conflict between in/out operations. A combined counter, like I used in the atomics version, would work as well.

2 Likes

There are two operations on shared resources involved here, counter update and insert/take, with the latter depending on the former.

1 Like

One scenario I can sketch up off the top of my head: imagine concurrent calls to out and in_r. Now if…

  • out increments+reads the front counter
  • in_r decrements+reads the front counter
  • in_r inserts the value (as out hasn't taken yet, this is a replacement, ie a value is lost)
  • out takes the value (and now the front counter points to a position that does not exist)
3 Likes

I think it would work if the *_r variants were dropped. Then all keys would be used only once, right? :slightly_smiling_face:

3 Likes

There would still be a possibility for inconsistencies when a process exits between the counter update and the insert/take.
If that happened in an in operation, the result would be a "hole"; in an out operation, an orphan (ie, a memory leak).

2 Likes

Hi :slight_smile:

Could you briefly explain to me what your results show, and which is for which queue implementation? I have to admit that I have trouble understanding the test setup, partly because I don't know the tiniest bit of Chinese (that is Chinese, right? :sweat_smile:).

2 Likes

We have removed the gen_server overhead and implemented the shared queue process as a special process in the streamlining branch. Not sure we got everything 100% correct (handling of system calls and name registration), but it looks like it basically works, so there :wink:
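
For anyone who hasn't written one before, a special process is roughly this shape (a generic sketch, not the actual code in the streamlining branch):

-module(special_proc_sketch).
-export([start_link/0, init/1]).
-export([system_continue/3, system_terminate/4, system_code_change/4]).

%% Started via proc_lib so it is OTP-compliant, and handling system
%% messages itself instead of going through gen_server.
start_link() ->
    proc_lib:start_link(?MODULE, init, [self()]).

init(Parent) ->
    Debug = sys:debug_options([]),
    proc_lib:init_ack(Parent, {ok, self()}),
    loop(queue:new(), Parent, Debug).

loop(Q, Parent, Debug) ->
    receive
        {in, Value} ->
            loop(queue:in(Value, Q), Parent, Debug);
        {out, From} ->
            {Reply, Q1} = case queue:out(Q) of
                              {{value, V}, Q2} -> {{ok, V}, Q2};
                              {empty, Q2}      -> {empty, Q2}
                          end,
            From ! {self(), Reply},
            loop(Q1, Parent, Debug);
        {system, From, Request} ->
            sys:handle_system_msg(Request, From, Parent, ?MODULE, Debug, Q)
    end.

%% sys callbacks
system_continue(Parent, Debug, Q) ->
    loop(Q, Parent, Debug).

system_terminate(Reason, _Parent, _Debug, _Q) ->
    exit(Reason).

system_code_change(Q, _Module, _OldVsn, _Extra) ->
    {ok, Q}.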

Anybody here who wants to take it for a spin? :smiley: ā†’ :upside_down_face: ā†’ :grimacing: ā†’ :upside_down_face: ā†’ :woozy_face: ā†’ :upside_down_face: ā†’ :flushed: ā†’ :upside_down_face: ā†’ :face_vomiting:

5 Likes

I mainly tested the following kinds of encapsulated FIFO:
utFifo.erl is based on ets:first and ets:take (a rough sketch of this flavour follows below)
fwQueue.erl is based on ets:first_take
shq.erl is your shq
shq2.erl is based on your shq, but the data is stored in ETS instead of in the process dictionary
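
A rough sketch of that first flavour (illustrative only; the real utFifo.erl is in the repo):

-module(ut_fifo_sketch).
-export([new/0, in/2, out/1]).

new() ->
    %% ordered_set so ets:first/1 returns the oldest key
    ets:new(?MODULE, [ordered_set, public, {write_concurrency, true}]).

in(Tab, Value) ->
    true = ets:insert(Tab, {erlang:unique_integer([monotonic]), Value}),
    ok.

out(Tab) ->
    case ets:first(Tab) of
        '$end_of_table' ->
            empty;
        Key ->
            case ets:take(Tab, Key) of
                [{_, Value}] -> {ok, Value};
                []           -> out(Tab)   %% lost the race for this key, try the next one
            end
    end.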

The in/out functions of fwQueue, shq and shq2 are wrapped in testQueue and then tested using utTc.

I suggest you simply take a look at the code and then run the test code, which is more intuitive.

Simple conclusions:
For in, fwQueue and shq take about the same time.
For out, fwQueue is faster than shq.
Additionally, fwQueue is more memory-friendly than shq.

2 Likes

It's a great job!

2 Likes

I have a question: why do you need a process to manage this data?

2 Likes