[wip] `BM_TCPEchoServerLatencyNQDRSubprocess` benchmark by jiridanek · Pull Request #326 · skupperproject/skupper-router

jiridanek · 2022-04-12T19:40:58Z

The first few benchmarks are already in main; the new one is the BM_TCPEchoServerLatencyNQDRSubprocess benchmark.

This shows what adding a router to a long chain does to latency when a small TCP message is sent through. C is a client that measures timing, S is an echo server.

C <-> R1 <> R2 <> R3 <> ... <> RN <-> S

(use arguments such as `--benchmark_filter=.*BM_TCPEchoServerLatencyN.*` to run only chosen benchmarks, or `--benchmark_repetitions=<n>` to run multiple times and compute stats)

Latency percentiles or distributions would be more interesting. They are not readily available now, but the benchmark can be updated to report them, of course.
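As a rough illustration of the percentile reporting mentioned above, here is a minimal Python sketch. It assumes the per-iteration latencies have already been collected into a list; the function name `latency_percentiles` and the sample values are hypothetical and are not taken from the benchmark.

```python
import statistics


def latency_percentiles(samples_ms, points=(50, 90, 99)):
    """Return {percentile: latency} for a list of per-iteration latencies."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    # method="inclusive" treats the samples as the whole population, so
    # every cut point stays within the observed min..max range.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in points}


# Made-up values for illustration only; these are NOT measured data.
example = [0.10, 0.11, 0.10, 0.12, 0.10, 0.35, 0.11, 0.10, 0.11, 0.10]
result = latency_percentiles(example)
for point, value in result.items():
    print(f"p{point}: {value:.3f} ms")
```

A single slow outlier like the one in the made-up list barely moves the median but dominates the high percentiles, which is the kind of information an average hides.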

It looks like adding routers to the chain increases average latency linearly (yes, I am ashamed of using the average). Hopefully this can be used to measure where the latency comes from, and to track improvements if improvements are called for.

/home/jdanek/repos/skupper-router/cmake-build-relwithdebinfo/tests/c_benchmarks/c-benchmarks
2022-04-12T21:21:09+02:00
Running /home/jdanek/repos/skupper-router/cmake-build-relwithdebinfo/tests/c_benchmarks/c-benchmarks
Run on (12 X 4300.03 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB (x1)
Load Average: 0.89, 1.32, 1.66
----------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations
----------------------------------------------------------------------------------
BM_RouterInitializeMinimalConfig              58.4 ms        0.049 ms          100
BM_AddRemoveSinglePattern                     1.24 us         1.23 us       562571
BM_AddRemoveMultiplePatterns/1                1.27 us         1.27 us       559319
BM_AddRemoveMultiplePatterns/3                3.06 us         3.05 us       221506
BM_AddRemoveMultiplePatterns/10               9.32 us         9.30 us        76641
BM_AddRemoveMultiplePatterns/30               27.6 us         27.5 us        25368
BM_AddRemoveMultiplePatterns/100              92.8 us         92.6 us         7702
BM_AddRemoveMultiplePatterns/1000             1074 us         1071 us          662
BM_AddRemoveMultiplePatterns/100000         350917 us       349669 us            2
BM_AddRemoveMultiplePatterns_BigO           211.27 NlgN     210.52 NlgN 
BM_AddRemoveMultiplePatterns_RMS                 1 %             1 %    
BM_TCPEchoServerLatencyWithoutQDR            0.014 ms        0.006 ms       120267
BM_TCPEchoServerLatency1QDRThread            0.103 ms        0.008 ms        86610
BM_TCPEchoServerLatency1QDRSubprocess        0.101 ms        0.008 ms        87909
BM_TCPEchoServerLatency2QDRSubprocess        0.164 ms        0.008 ms        92487
BM_TCPEchoServerLatencyNQDRSubprocess/2      0.165 ms        0.008 ms        92226
BM_TCPEchoServerLatencyNQDRSubprocess/3      0.264 ms        0.009 ms        89734
BM_TCPEchoServerLatencyNQDRSubprocess/4      0.308 ms        0.008 ms        10000
BM_TCPEchoServerLatencyNQDRSubprocess/5      0.382 ms        0.008 ms        10000
BM_TCPEchoServerLatencyNQDRSubprocess/6      0.466 ms        0.009 ms        10000
BM_TCPEchoServerLatencyNQDRSubprocess/7      0.534 ms        0.009 ms        10000
BM_TCPEchoServerLatencyNQDRSubprocess/8      0.612 ms        0.009 ms        10000
BM_TCPEchoServerLatencyNQDRSubprocess/9      0.689 ms        0.009 ms        10000

Process finished with exit code 0

jiridanek · 2022-04-13T18:49:27Z

Looking at this, it seems to me that adding a router should, in the ideal case, add 0.014 ms of latency. That is the time a round trip to the echo server takes without any routers in between. Adding a router to the chain adds two hops to the packet's path (one in each direction), which should amount to +0.014 ms of latency.

The actual latency added is about 0.07 ms per router, on average. That means there is about 0.056 ms of overhead caused by each router. Is that a little or a lot? Where is this time spent? Is it spent usefully?
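The per-router figure can be checked against the table with a short Python sketch. It takes the Time column of the `BM_TCPEchoServerLatencyNQDRSubprocess/N` rows (N = 2 to 9) and the no-router baseline from the output above, and fits a least-squares line; the variable names are only for illustration.

```python
# Wall-clock times (ms) copied from the BM_TCPEchoServerLatencyNQDRSubprocess/N rows.
latency_ms = {
    2: 0.165, 3: 0.264, 4: 0.308, 5: 0.382,
    6: 0.466, 7: 0.534, 8: 0.612, 9: 0.689,
}
# BM_TCPEchoServerLatencyWithoutQDR: round trip with no router in between.
baseline_ms = 0.014

n = len(latency_ms)
mean_x = sum(latency_ms) / n              # iterating a dict yields its keys
mean_y = sum(latency_ms.values()) / n

# Ordinary least-squares slope: latency added per extra router in the chain.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in latency_ms.items())
s_xx = sum((x - mean_x) ** 2 for x in latency_ms)
slope_ms = s_xy / s_xx

overhead_ms = slope_ms - baseline_ms
print(f"added per router:   {slope_ms:.3f} ms")
print(f"beyond ideal 2 hops: {overhead_ms:.3f} ms")
```

The fitted slope comes out slightly above 0.07 ms per router, in line with the rounded figure quoted above, so the overhead beyond the ideal two hops lands near 0.06 ms.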

jiridanek · 2022-04-13T18:55:56Z

In these latency tests, there is only ever a single TCP send in flight at a time, so the routers are as lightly loaded as they can possibly be. The latency measured should therefore be the lowest achievable.

edit: there should be TLS in this

jiridanek · 2022-04-14T11:20:38Z

On the whole, there is absolutely no reason to orchestrate the router subprocesses from a C++ test. It is much nicer to do this in Python and to use existing tooling (an echo server, some TCP ping utilities, iperf3), as a normal perf test would. The results are much more trustworthy that way, as well. Once the thing stops being a microbenchmark, there is no point in trying to treat it as one.
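To give an idea of what the Python side could look like, here is a minimal sketch of the client and echo server only, talking directly over loopback with one send in flight at a time. It is an assumption about the shape of such a test, not code from this PR: the helper names are hypothetical, and starting the chain of routers between client and server is deliberately left out.

```python
import socket
import threading
import time


def run_echo_server(listener):
    """Accept one connection and echo back whatever arrives until EOF."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            conn.sendall(data)


def measure_round_trips(host, port, count=1000, payload=b"ping"):
    """Send `payload` `count` times, one at a time; return round trips in ms."""
    rtts_ms = []
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle's algorithm so small writes are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendall(payload)
            received = b""
            while len(received) < len(payload):
                chunk = sock.recv(len(payload) - len(received))
                if not chunk:
                    raise ConnectionError("echo server closed the connection")
                received += chunk
            rtts_ms.append((time.perf_counter() - start) * 1000.0)
    return rtts_ms


# Port 0 lets the OS pick a free port; in a real test the client would
# connect to the first router's TCP listener instead of the server itself.
listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]
server = threading.Thread(target=run_echo_server, args=(listener,), daemon=True)
server.start()

rtts = measure_round_trips("127.0.0.1", port, count=200)
server.join(timeout=5)
listener.close()
print(f"mean round trip: {sum(rtts) / len(rtts):.3f} ms over {len(rtts)} sends")
```

Keeping the raw list of round-trip times, rather than only a running average, leaves room for computing percentiles afterwards.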

jiridanek added 5 commits April 12, 2022 19:02

Fix helpers.hpp function REQUIRE: it must always be run

cf2bd76

Typofixes

686c786

Few improvements to the Socket classes

7b5a2ba

todo echo server that accepts multiple; may not be needed, actually

57e37c6

TODO line of routers latency tcp test

eb6c08d

jiridanek marked this pull request as draft April 18, 2022 16:30

jiridanek wants to merge 5 commits into skupperproject:main from jiridanek:jd_2022_04_12_benchmark
