-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
common: benchmarking utility and workflow #6069
Conversation
1e6b179
to
3aec78f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 2 of 2 files at r4, 2 of 2 files at r5, 4 of 4 files at r6, 2 of 2 files at r7, 2 of 2 files at r8, all commit messages.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @janekmi and @osalyk)
.github/actions/pmem_benchmark_run/action.yml
line 7 at r5 (raw file):
description: The root directory of the repository designated as runtime required: true reference:
Lets make argument names more informative
Suggestion:
reference_path
.github/actions/pmem_benchmark_run/action.yml
line 10 at r5 (raw file):
description: LD_LIBRARY_PATH where the first version of PMDK is built required: true rival:
.
Both options are acceptable ;)
Suggestion:
rival_LD_LIBRARY_PATH
.github/actions/pmem_benchmark_run/action.yml
line 31 at r5 (raw file):
DEV=$(mount | grep ${{ inputs.pmem_path }} | cut -d' ' -f1 | cut -d'/' -f 3) NUMA=$(ndctl list -v | jq ".[] | select(.blockdev==\"$DEV\" and .mode==\"fsdax\") | .numa_node") echo "node=$NUMA" >> $GITHUB_OUTPUT
Shall this be a part of ./utils/benchmarks/run_and_combine.py
?
That helps a lot to use this workflow locally.
.github/workflows/pmem_benchmark.yml
line 12 at r5 (raw file):
type: string default: stable-2.0 push: # XXX
Can we ALSO run this regularly (daily) and store artifacts with a short retention period?
retention-days: 2
We can have a fresh result on a daily basis without any additional effort.
Code quote:
workflow_dispatch:
inputs:
reference_ref:
type: string
default: master
rival_ref:
type: string
default: stable-2.0
push: # XXX
.github/workflows/pmem_benchmark.yml
line 15 at r5 (raw file):
# XXX missing in Ansible files # sudo dnf install numactl python3-pip -y
Please fix this in ansible scripts opensuse-setup.yml
and rocky-setup.yml
There is already a place for that:
- name: Install benchmarks dependencies (optional)
and for pip
see https://docs.ansible.com/ansible/latest/collections/ansible/builtin/pip_module.html
Code quote:
# sudo dnf install numactl python3-pip -y
.github/workflows/pmem_benchmark.yml
line 26 at r5 (raw file):
include: - ROLE: reference REF: master # XXX ${{ inputs.reference_ref }}
github_ ref or checkout_ref
Suggestion:
GITHUB_REF
.github/workflows/pmem_benchmark.yml
line 28 at r5 (raw file):
REF: master # XXX ${{ inputs.reference_ref }} - ROLE: rival REF: stable-2.0 # XXX ${{ inputs.rival_ref }}
Suggestion:
- ROLE: reference
REF: stable-2.0# XXX ${{ inputs.reference_ref }}
- ROLE: rival
REF: master # XXX ${{ inputs.rival_ref }}
.github/workflows/pmem_benchmark.yml
line 30 at r5 (raw file):
REF: stable-2.0 # XXX ${{ inputs.rival_ref }} - ROLE: runtime REF: ''
Combining reference
and rival
with runtime
decreases the clarity of the solution.
I would suggest to have a separate step for runtime
Code quote:
matrix:
include:
- ROLE: reference
REF: master # XXX ${{ inputs.reference_ref }}
- ROLE: rival
REF: stable-2.0 # XXX ${{ inputs.rival_ref }}
- ROLE: runtime
REF: ''
.github/workflows/pmem_benchmark.yml
line 65 at r5 (raw file):
rival: ../rival/src/nondebug config: perf scenario: ${{ matrix.SCENARIO }}
Shall we just checkout runtime
as a root folder and only have separate subfolders for reference
and rival
?
Suggestion:
steps:
- name: Benchmark
uses: ./.github/actions/pmem_benchmark_run
with:
runtime_dir: .
reference: ./reference/src/nondebug
rival: ./rival/src/nondebug
config: perf
scenario: ${{ matrix.SCENARIO }}
src/benchmarks/obj_pmalloc.cpp
line 239 at r6 (raw file):
int ret = pmalloc(ob->pop, &ob->offs[i], ob->sizes[i], 0, 0); if (ret) { fprintf(stderr, "pmalloc ret: %d\n", ret);
perror()
is already used in other places.
Suggestion:
perror("pmalloc");
utils/benchmarks/run_and_combine.py
line 24 at r1 (raw file):
PARSER.add_argument('-p', '--pmem_path', required=True, help='PMEM-mounted directory to use') PARSER.add_argument('-n', '--numa_node', required=True, help='NUMA node to use')
Redundant argument.
NUMA can be identified based on the given pmem_path
.
See pmem_benchmark_run/action.yml
:
DEV=$(mount | grep ${{ inputs.pmem_path }} | cut -d' ' -f1 | cut -d'/' -f 3)
NUMA=$(ndctl list -v | jq ".[] | select(.blockdev==\"$DEV\" and .mode==\"fsdax\") | .numa_node")
echo "node=$NUMA" >> $GITHUB_OUTPUT
utils/benchmarks/run_and_combine.py
line 38 at r1 (raw file):
"""Run PMEMBENCH according to the provided parameters""" config = f'src/benchmarks/{args.config}.cfg' file = f'{args.pmem_path}/testfile.obj'
Every test creates its own folder on pmem, let's do the same with the pmembench.
Suggestion:
file = f'{args.pmem_path}/pmembench/testfile.obj'
utils/benchmarks/run_and_combine.py
line 58 at r1 (raw file):
COLUMNS_COMBINE = [
Shall we put this table before all functions?
Code quote:
COLUMNS_COMBINE
utils/benchmarks/run_and_combine.py
line 60 at r1 (raw file):
COLUMNS_COMBINE = [ 'total-avg[sec]', 'ops-per-second[1/sec]',
Can we start with one well-known metric?
Suggestion:
'ops-per-second[1/sec]',
'total-avg[sec]',
utils/benchmarks/run_and_combine.py
line 75 at r1 (raw file):
COLUMNS_COPY = [
.
Code quote:
COLUMNS_COPY
utils/benchmarks/run_and_combine.py
line 89 at r1 (raw file):
'operation', 'lib' 'min-size',
min_size
two times.
Suggestion:
'min-size'
'type-number',
'operation',
'lib'
'min-rsize',
utils/benchmarks/run_and_combine.py
line 109 at r1 (raw file):
Output data files: - combined - contains data from both ref and riv, and a normalized
Suggestion:
contains data from ref and riv together with a normalized
3c21371
to
38a0d6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r1, 2 of 2 files at r7, 2 of 2 files at r8, 1 of 5 files at r9, 2 of 2 files at r10, 2 of 2 files at r11, 2 of 2 files at r12, 1 of 7 files at r13, 2 of 2 files at r14, 2 of 2 files at r15, 2 of 2 files at r16, all commit messages.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @grom72 and @osalyk)
.github/actions/pmem_benchmark_run/action.yml
line 7 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Lets make argument names more informative
Done.
.github/actions/pmem_benchmark_run/action.yml
line 10 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
.
Both options are acceptable ;)
Done.
.github/actions/pmem_benchmark_run/action.yml
line 31 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Shall this be a part of
./utils/benchmarks/run_and_combine.py
?
That helps a lot to use this workflow locally.
The thing is getting to know the NUMA node of a PMEM device is easy for a human operator but requires some sophistication from software as the example above showed. We can consider doing it in the second round if we can afford it.
.github/workflows/pmem_benchmark.yml
line 12 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Can we ALSO run this regularly (daily) and store artifacts with a short retention period?
retention-days: 2
We can have a fresh result on a daily basis without any additional effort.
Yes, we can. But IMHO it is pointless. Without an automatic validation of the result, it will just burn down hardware. And I can bet nobody will view the results every other day anyway.
.github/workflows/pmem_benchmark.yml
line 15 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Please fix this in ansible scripts
opensuse-setup.yml
androcky-setup.yml
There is already a place for that:- name: Install benchmarks dependencies (optional)
and for
pip
see https://docs.ansible.com/ansible/latest/collections/ansible/builtin/pip_module.html
Done.
.github/workflows/pmem_benchmark.yml
line 26 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
github_ ref or checkout_ref
Done.
.github/workflows/pmem_benchmark.yml
line 28 at r5 (raw file):
REF: master # XXX ${{ inputs.reference_ref }} - ROLE: rival REF: stable-2.0 # XXX ${{ inputs.rival_ref }}
These values will be provided from inputs. The current values are only for testing till it will land on the master branch.
.github/workflows/pmem_benchmark.yml
line 30 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Combining
reference
andrival
withruntime
decreases the clarity of the solution.
I would suggest to have a separate step forruntime
Can you elaborate on where did you lost the track?
.github/workflows/pmem_benchmark.yml
line 65 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Shall we just checkout
runtime
as a root folder and only have separate subfolders forreference
andrival
?
It is messy. A directory inside a repository is a part of it.
src/benchmarks/obj_pmalloc.cpp
line 239 at r6 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
perror()
is already used in other places.
Done.
utils/benchmarks/run_and_combine.py
line 24 at r1 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Redundant argument.
NUMA can be identified based on the givenpmem_path
.
Seepmem_benchmark_run/action.yml
:DEV=$(mount | grep ${{ inputs.pmem_path }} | cut -d' ' -f1 | cut -d'/' -f 3) NUMA=$(ndctl list -v | jq ".[] | select(.blockdev==\"$DEV\" and .mode==\"fsdax\") | .numa_node") echo "node=$NUMA" >> $GITHUB_OUTPUT
Doing this thing:
- Human operator (easy)
- Bash (medium)
- Python (nasty complex)
utils/benchmarks/run_and_combine.py
line 38 at r1 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Every test creates its own folder on pmem, let's do the same with the pmembench.
I don't like this idea. Now the test file is created and deleted by pmembench
. When this directory is created by this script it is just a liability.
- Has to create it.
- Has to handle error cases on creation.
- Has to delete it.
- Has to handle error cases on deletion.
Where this script is not about creating directories and files.
At the same time, the provided directory is hand-picked by the user so he can create a suitable directory structure considering all of these.
IMHO unnecessary complexity. Nay.
utils/benchmarks/run_and_combine.py
line 58 at r1 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Shall we put this table before all functions?
Done.
utils/benchmarks/run_and_combine.py
line 60 at r1 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Can we start with one well-known metric?
Done.
utils/benchmarks/run_and_combine.py
line 75 at r1 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
.
Done.
utils/benchmarks/run_and_combine.py
line 89 at r1 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
min_size
two times.
Done.
utils/benchmarks/run_and_combine.py
line 109 at r1 (raw file):
Output data files: - combined - contains data from both ref and riv, and a normalized
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r18, all commit messages.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @grom72 and @osalyk)
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #6069 +/- ##
=======================================
Coverage 70.12% 70.12%
=======================================
Files 133 133
Lines 19568 19568
Branches 3261 3261
=======================================
+ Hits 13722 13723 +1
+ Misses 5846 5845 -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 7 files at r13, 1 of 2 files at r18.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @janekmi and @osalyk)
.github/actions/pmem_benchmark_run/action.yml
line 31 at r5 (raw file):
Previously, janekmi (Jan Michalski) wrote…
The thing is getting to know the NUMA node of a PMEM device is easy for a human operator but requires some sophistication from software as the example above showed. We can consider doing it in the second round if we can afford it.
What about having a simple bash script that does it and returns just a number?
utils/benchmarks/run_and_combine.py
line 24 at r1 (raw file):
Previously, janekmi (Jan Michalski) wrote…
Doing this thing:
- Human operator (easy)
- Bash (medium)
- Python (nasty complex)
What about having a simple bash script that does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @janekmi and @osalyk)
a discussion (no related file):
git clone https://github.com/pmem/pmdk.git reference
cd rival
git checkout stable-2.0
make -j
cd ..
works perfectly and is ready for comparison between stable and actual working branches.
No jams no errors, etc.
utils/benchmarks/run_and_combine.py
line 17 at r18 (raw file):
help='LD_LIBRARY_PATH where the first version of PMDK is built') PARSER.add_argument('--rival', metavar='LD_LIBRARY_PATH', required=True, help='LD_LIBRARY_PATH where the second version of PMDK is built')
Can we make it optional with the default value ./src/nondebug
or something similar?
Code quote:
PARSER.add_argument('--rival', metavar='LD_LIBRARY_PATH', required=True,
help='LD_LIBRARY_PATH where the second version of PMDK is built')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 7 files at r19, 2 of 2 files at r20, 2 of 2 files at r21, 2 of 2 files at r22, all commit messages.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @grom72 and @osalyk)
.github/actions/pmem_benchmark_run/action.yml
line 31 at r5 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
What about having a simple bash script that does it and returns just a number?
Done.
utils/benchmarks/run_and_combine.py
line 24 at r1 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
What about having a simple bash script that does it?
Done.
utils/benchmarks/run_and_combine.py
line 17 at r18 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Can we make it optional with the default value
./src/nondebug
or something similar?
I would prefer not to. Note this path is not preserved in the output so you would have no idea what these results are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r24, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @grom72 and @osalyk)
a discussion (no related file):
Previously, grom72 (Tomasz Gromadzki) wrote…
git clone https://github.com/pmem/pmdk.git reference cd rival git checkout stable-2.0 make -j cd ..
works perfectly and is ready for comparison between stable and actual working branches.
No jams no errors, etc.
Done.
Signed-off-by: Jan Michalski <jan.michalski@intel.com>
Signed-off-by: Jan Michalski <jan.michalski@intel.com>
- numactl - pandas Signed-off-by: Jan Michalski <jan.michalski@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r19, 2 of 2 files at r20, 2 of 2 files at r21, 2 of 2 files at r23, 2 of 2 files at r24, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @janekmi and @osalyk)
a discussion (no related file):
There is one small issue with the "rival - reference" concept.
There is no information in the final results on what was compared.
Can we have a file that contains the branches' names as one of the output artefacts?
.github/workflows/pmem_benchmark.yml
line 40 at r24 (raw file):
GITHUB_REF: master # XXX ${{ inputs.reference_ref }} - ROLE: rival GITHUB_REF: stable-2.0 # XXX ${{ inputs.rival_ref }}
Who is the rival and who is the reference?
Suggestion:
- ROLE: reference
GITHUB_REF: stable-2.0 # XXX ${{ inputs.reference_ref }}
- ROLE: rival
GITHUB_REF: master # XXX ${{ inputs.rival_ref }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 2 files at r26, 1 of 1 files at r27, 1 of 1 files at r28, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @grom72 and @osalyk)
a discussion (no related file):
Previously, grom72 (Tomasz Gromadzki) wrote…
There is one small issue with the "rival - reference" concept.
There is no information in the final results on what was compared.
Can we have a file that contains the branches' names as one of the output artefacts?
Done.
.github/workflows/pmem_benchmark.yml
line 40 at r24 (raw file):
Previously, grom72 (Tomasz Gromadzki) wrote…
Who is the rival and who is the reference?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @grom72 and @osalyk)
a discussion (no related file):
I need to use this PR to tune the scenarios which seem unstable because of the short execution time. Bear with me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r25, 1 of 2 files at r26, 1 of 1 files at r27, 1 of 1 files at r28, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @osalyk)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @osalyk)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @osalyk)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @janekmi and @osalyk)
a discussion (no related file):
Previously, janekmi (Jan Michalski) wrote…
I need to use this PR to tune the scenarios which seem unstable because of the short execution time. Bear with me.
Shall we close this and tune in a separate one?
60567c4
to
264bd8c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r29, 2 of 2 files at r30, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @osalyk)
Signed-off-by: Jan Michalski <jan.michalski@intel.com>
Preview: https://github.com/pmem/pmdk/actions/runs/8451896391
This change is