PMEM RAS #896

Workflow file for this run

.github/workflows/pmem_ras.yml at 59f5696

	# Run RAS test: Unsafe Shutdown Local.
	#
	# This workflow is run on 'self-hosted' runners.
	#
	# RAS tests require a different approach compared to the standard tests - they need to
	# reboot the runner during the test. Normally, rebooting and continuing the job on GHA
	# is not possible, due to losing connection with the runner. To work around this issue,
	# an additional runner (not connected to GitHub) runs the tests instead.
	#
	# The general idea of the solution is:
	# - First platform [self-hosted runner] functions as the controller [ras_controller],
	# - Second platform functions as the test runner [ras_runner],
	# - The workflow launches its steps on the controller,
	# - The controller will then run an ansible playbook on the second platform [ras_runner],
	# with options provided by the workflow,
	# - The test runner follows the steps given by the controller,
	# running the tests in the process and providing results as output,
	# - The controller gathers this output and prints it in GHA job.
	#
	# The only drawback of this approach is that the workflow would always finish successfully.
	# To address this, an additional step at the end of the workflow parses the output
	# and fails the job if a failure string is found.
	#
	# More detailed information about the ansible playbook and tests themselves can be found in:
	# utils/gha-runners/run-ras-linux.yml
	name: PMEM RAS

	on:
	  workflow_dispatch:
	  schedule:
	    # run this job every 8 hours
	    - cron: '0 */8 * * *'

	jobs:
	  linux:
	    name: PMEM_RAS
	    if: github.repository == 'pmem/pmdk'
	    runs-on: ${{ matrix.os }}
	    strategy:
	      fail-fast: false
	      matrix:
	        os: [[self-hosted, ras_controller]]
	    env:
	      WORKDIR: utils/gha-runners

	    steps:
	      - name: Clone the git repo
	        uses: actions/checkout@v3

	      # Variables such as $ras_runner are hidden on the controller platform as environment variables.
	      # The 'sed' command masks IP addresses in the ansible output; they show up as 'ras_runner' instead.
	      # The 'tee' command saves the overall output to a file. This file is needed for the next step.
	      - name: Prepare and run RAS Linux tests via ansible-playbook
	        run: |
	          cd $WORKDIR
	          ansible-playbook -i $ras_runner, run-ras-linux.yml -e "host=all ansible_user=$ras_user" | sed 's/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/ras_runner/' | tee playbook_output.txt

	      # This step searches the output for specific failure strings.
	      # If any of them is found in the file, the workflow fails.
	      - name: Fail the workflow if the playbook finished with a failure
	        run: |
	          cd $WORKDIR
	          if grep -E 'fatal: \[ras_runner\]: FAILED!|failed: \[ras_runner\]' "playbook_output.txt"; then
	            exit 1
	          else
	            exit 0
	          fi
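
The last two steps (mask the runner address, then search the saved output for failure strings) can be tried locally without a real runner. The following is a minimal sketch, not part of the workflow: the helper names `mask_ips` and `playbook_failed` are hypothetical, the sample lines are made up to resemble ansible-playbook output, and the trailing `g` in the sed expression is an addition that masks every address on a line rather than only the first.

```shell
#!/usr/bin/env bash
# Hypothetical helpers mirroring the two workflow steps; names are illustrative only.

# Replace every IPv4-looking token with the placeholder 'ras_runner'.
# The trailing 'g' (an addition in this sketch) masks all matches on a line.
mask_ips() {
	sed 's/[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}/ras_runner/g'
}

# Return 0 (true) if the saved output contains one of the failure strings.
playbook_failed() {
	grep -qE 'fatal: \[ras_runner\]: FAILED!|failed: \[ras_runner\]' "$1"
}

# Made-up sample lines standing in for real ansible-playbook output.
sample_output=$(mktemp)
printf '%s\n' \
	'ok: [192.168.0.10]' \
	'fatal: [192.168.0.10]: FAILED! => {"changed": false}' \
	| mask_ips > "$sample_output"

if playbook_failed "$sample_output"; then
	echo "failure marker found"
else
	echo "no failure marker"
fi
```

Because the search runs on the masked text, the pattern only needs to know the placeholder `ras_runner`, never the real address. In bash, `set -o pipefail` would be another way to surface the exit status of `ansible-playbook` through the `sed | tee` pipeline; the workflow as written relies on parsing the output instead.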
