This repository has been archived by the owner on Mar 12, 2024. It is now read-only.

Commit

Release 1.5.0 (#13)
* Extend terminateProcess.sh with other victims

* Add 5s sleep into all shell experiments

* Fix version file generation

* Fix the tag version of the Maven image used in CICD

* Automatic update of copyright in docs

* Fix YEAR property definition

* Version update to 1.5.0

* Add additional details to list of configuration parameters

* Add memorystore module

* Add gcp common module

* GCP credentials provider service moved to gcp-common module

* Add assembly plugin

* Renamed GcpService to GcpCredentialsProviderService

* POC of a failover experiment

* Always expose Anchore scan results

* Increment Dockerfile to build from OpenJDK 11.0.6

* Cleaning unintentionally committed pom changes

* Fix version of parent module

* Fix version of parent module

* Add new gcp modules into coverage module dependencies

* Add second failover flavour

* Move GcpConstants to gcp-common module

* Use wildcard instead of user defined locations

* Redesign roster generation

* Set DataDogIdentifier

* Platform health check added

* PlatformHealth bean implementation

* Load projectID from credentials metadata

* Update compute unit test with GcpCredentialsMetadata bean

* Memorystore error codes

* Version update to 1.5.0

* Cleaning unintentionally committed pom changes

* Add instance filtering

* GcpMemorystoreHealth unit test

* Fix incorrect package name

* Remove instance state property

* compareUniqueIdentifierInner implementation

* GcpMemorystoreInstanceContainer unit tests

* Throw an exception when recycle attempt is done

* All static fields must be final

* Remove redundant null check

* GcpMemorystorePlatform unit tests

* Remove IOException from method signature

* Remove commented methods

* Update GCP compute configuration variables

* Update credentials warning

* Remove redundant dependency

* Include filter is not mandatory

* Add Memorystore module to README.md

* Memorystore module documentation

* Fix python path

* Add `Error Codes Numbering Convention` to developer documentation

* Add `Error Codes Numbering Convention` to developer documentation

* Add memorystore module

* Add gcp common module

* GCP credentials provider service moved to gcp-common module

* Add assembly plugin

* Renamed GcpService to GcpCredentialsProviderService

* POC of a failover experiment

* Fix version of parent module

* Fix version of parent module

* Add new gcp modules into coverage module dependencies

* Add second failover flavour

* Move GcpConstants to gcp-common module

* Use wildcard instead of user defined locations

* Redesign roster generation

* Set DataDogIdentifier

* Platform health check added

* PlatformHealth bean implementation

* Load projectID from credentials metadata

* Update compute unit test with GcpCredentialsMetadata bean

* Memorystore error codes

* Add instance filtering

* GcpMemorystoreHealth unit test

* Fix incorrect package name

* Remove instance state property

* compareUniqueIdentifierInner implementation

* GcpMemorystoreInstanceContainer unit tests

* Throw an exception when recycle attempt is done

* All static fields must be final

* Remove redundant null check

* GcpMemorystorePlatform unit tests

* Remove IOException from method signature

* Remove commented methods

* Update GCP compute configuration variables

* Update credentials warning

* Remove redundant dependency

* Include filter is not mandatory

* Add Memorystore module to README.md

* Memorystore module documentation

* GCP Error codes refactoring

* Revert back to original design and load GCP project id from configuration variables

* Add GCP configuration examples

* Cover partially tested conditions reported by Sonar

* Remove real project name

* Change platform level

* Do not pass Redis Instance to Container

* The FailoverInstanceRequest class should not be imported into the Container class

* Replace Items with Map.Entry

* Remove redundant email field

* Fix formatting

* Allow stubbing of final classes

* Test experiment runnables

* Test instance queries

* Make CloudRedisClient a global mock

* Reference detailed documentation on protection mode

* Docker build job need version info

* Remove Anchore from GitLab CI

It is catching false positives preventing merges, and we are moving to GitHub with Snyk container image scanning.

* Add a CODEOWNERS file in the .github directory for use in GitHub

* Skip self healing when no method is set

* Update to latest version of k8s client in order to allow experiments on EKS clusters based on k8s 1.17.x

* Update list of prefixes

* Correct spring profile selection

* Describe how to test SA role binding

* Swagger UI

* AWS SDK

* GCP SDK

* Correct sequence of steps to create an SA

* Update parent openjdk image to address SSL issues while communicating with K8S platforms

* Add a note regarding DD startup (#8)

* Update GCP SDK

* Patch NPE while generating compute container from an instance with no access to Internet

* Upgrade Spring Boot

* Mockito collision

* GCP Compute platform health implementation

* Platform Health unit test

* Correct platform health logic

* GCP Compute instance states in constants

* Update docs

* Sabil chaos engine (#11)

Experimenting on the AWS handbook

* Update k8s client

* Update k8s docs

* Avoid NPE when pod has no owner

* Allow experiments on multiple namespaces

* Update existing unit tests

* Add test

* Remove redundant assertion

* Add parent worker node information into k8s container metadata

* Platform status must be failed when it is not possible to list pods in given namespace

* Follow naming conventions

* Add test

* Update docs

Co-authored-by: Andrew Mantha <andrew.mantha@gemalto.com>
Co-authored-by: Andrew Mantha <andrew.mantha@thalesgroup.com>
Co-authored-by: Andrew Mantha <andrew.mantha@cplcloud.com>
Co-authored-by: a-mantha <45573663+a-mantha@users.noreply.github.com>
Co-authored-by: sabil05 <59994038+sabil05@users.noreply.github.com>
6 people authored Nov 27, 2020
1 parent 1e8b7a2 commit 6df8394
Showing 95 changed files with 2,430 additions and 314 deletions.
2 changes: 2 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,2 @@
# Thales Group Chaos Engine Admins are default owners
* @ThalesGroup/chaos-engine-admins
4 changes: 2 additions & 2 deletions .gitlab-ci.yml
@@ -1,4 +1,4 @@
image: "maven"
image: "maven:3.6-jdk-11-slim"

stages:
- version
@@ -28,4 +28,4 @@ include:
- local: /ci/build/.gitlab-ci.yml
- local: /ci/package/.gitlab-ci.yml
- local: /ci/version/.gitlab-ci.yml
- local: /ci/docs/.gitlab-ci.yml
- local: /ci/docs/.gitlab-ci.yml
18 changes: 18 additions & 0 deletions .idea/codeStyles/Project.xml

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions README.md
@@ -29,6 +29,7 @@ Running chaos experiments in a non-resilient system can result in significant fa
| AWS EC2 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| AWS RDS | :heavy_multiplication_x: | :heavy_check_mark: | :heavy_multiplication_x: |
| GCP Compute | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| GCP Memorystore | :heavy_multiplication_x: | :heavy_check_mark: | :heavy_multiplication_x: |

#### Legend
:heavy_check_mark: - Fully Supported | :white_check_mark: - Some Support
2 changes: 1 addition & 1 deletion chaosengine-core/pom.xml
@@ -3,7 +3,7 @@
<parent>
<artifactId>chaosengine</artifactId>
<groupId>com.thales.chaos</groupId>
-<version>1.4.0-SNAPSHOT</version>
+<version>1.5.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>

@@ -72,8 +72,7 @@ public abstract class Experiment {
private ScriptManager scriptManager;
@Autowired
private HolidayManager holidayManager;
-private Runnable selfHealingMethod = () -> {
-};
+private Runnable selfHealingMethod;
private Callable<ContainerHealth> checkContainerHealth;
private Runnable finalizeMethod;
private Instant startTime = Instant.now();
@@ -306,24 +305,31 @@ public ExperimentType getExperimentType () {
}

void callSelfHealing () {
int selfHealingAttempts = getSelfHealingCounter().get();
try {
if (canRunSelfHealing()) {
selfHealingAttempts = getSelfHealingCounter().incrementAndGet();
log.info("Running self healing for the {} time", selfHealingAttempts);
sendNotification(NotificationLevel.WARN, "Running self healing for the " + selfHealingAttempts + " time.");
lastSelfHealingTime = Instant.now();
selfHealingMethod.run();
}
} catch (Exception e) {
sendNotification(NotificationLevel.ERROR, AN_EXCEPTION_OCCURRED_WHILE_RUNNING_SELF_HEALING);
log.error("An error occurred while calling self healing method", e);
} finally {
evaluateRunningExperiment();
if (selfHealingAttempts >= DEFAULT_MAXIMUM_SELF_HEALING_RETRIES && getExperimentState().equals(ExperimentState.SELF_HEALING)) {
sendNotification(NotificationLevel.ERROR, MAXIMUM_SELF_HEALING_RETRIES_REACHED);
setExperimentState(ExperimentState.FAILED);
if (selfHealingMethod != null) {
int selfHealingAttempts = getSelfHealingCounter().get();
try {
if (canRunSelfHealing()) {
selfHealingAttempts = getSelfHealingCounter().incrementAndGet();
log.info("Running self healing for the {} time", selfHealingAttempts);
sendNotification(NotificationLevel.WARN,
"Running self healing for the " + selfHealingAttempts + " time.");
lastSelfHealingTime = Instant.now();
selfHealingMethod.run();
}
} catch (Exception e) {
sendNotification(NotificationLevel.ERROR, AN_EXCEPTION_OCCURRED_WHILE_RUNNING_SELF_HEALING);
log.error("An error occurred while calling self healing method", e);
} finally {
evaluateRunningExperiment();
if (selfHealingAttempts >= DEFAULT_MAXIMUM_SELF_HEALING_RETRIES && getExperimentState().equals(
ExperimentState.SELF_HEALING)) {
sendNotification(NotificationLevel.ERROR, MAXIMUM_SELF_HEALING_RETRIES_REACHED);
setExperimentState(ExperimentState.FAILED);
}
}
} else {
log.info("Experiment has no self healing method, finalizing experiment.");
setExperimentState(ExperimentState.FAILED);
}
}
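
The guard introduced in this hunk can be illustrated in isolation. The following is a minimal sketch, not the project's `Experiment` class: `SelfHealingSketch`, its `State` enum, and the `MAX_RETRIES` value are hypothetical, and the real method's `canRunSelfHealing()` check, notifications, and `evaluateRunningExperiment()` call are deliberately left out.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical, simplified stand-in for the experiment class; for illustration only. */
class SelfHealingSketch {
    enum State { SELF_HEALING, FAILED }

    // Assumed retry cap; the real constant's value is not shown in the diff.
    static final int MAX_RETRIES = 3;

    private final AtomicInteger attempts = new AtomicInteger();
    private Runnable selfHealingMethod; // may be null when no healer is configured
    private State state = State.SELF_HEALING;

    void setSelfHealingMethod(Runnable method) { this.selfHealingMethod = method; }
    State getState() { return state; }
    int getAttempts() { return attempts.get(); }

    void callSelfHealing() {
        if (selfHealingMethod == null) {
            // No healer available: fail right away instead of counting retries.
            state = State.FAILED;
            return;
        }
        int current = attempts.incrementAndGet();
        try {
            selfHealingMethod.run();
        } catch (RuntimeException e) {
            // A throwing healer still counts as an attempt; the cap below applies.
        } finally {
            if (current >= MAX_RETRIES && state == State.SELF_HEALING) {
                state = State.FAILED;
            }
        }
    }
}
```

With no method set, the first call moves the sketch straight to `FAILED` without incrementing the counter, which mirrors what the new `callSelfHealingNoMethodSet` test asserts.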

@@ -112,4 +112,8 @@ errorCode.45002.message=Transfer of file via Kubernetes shell failed
errorCode.51000.name=Google Compute Error
errorCode.51000.message=Generic Google Compute Error
errorCode.55101.name=Google Compute SSH Key Creation Error
errorCode.55101.message=Error creating SSH Key to use in Google Compute experiments
errorCode.55101.message=Error creating SSH Key to use in Google Compute experiments
errorCode.51200.name=Google Memorystore Error
errorCode.51200.message=Generic Google Memorystore Error
errorCode.51201.name=Memorystore Does Not Support Recycling
errorCode.51201.message=Attempted to recycle Memorystore Container, but this is not supported.
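
The new entries follow the `errorCode.<number>.name` / `errorCode.<number>.message` key pattern used by the existing ones. As a rough sketch of how such a pair could be resolved, assuming plain `java.util.Properties` parsing (the diff does not show how the engine actually loads these keys, and `ErrorCodeLookupSketch` and its fallback strings are hypothetical):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper; not part of the commit. */
final class ErrorCodeLookupSketch {
    private ErrorCodeLookupSketch() {
    }

    /** Parses properties text and formats the name/message pair for one code. */
    static String describe(String propertiesText, int code) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        String prefix = "errorCode." + code;
        // Fallback texts are assumptions made for this sketch.
        String name = props.getProperty(prefix + ".name", "Unknown Error");
        String message = props.getProperty(prefix + ".message", "No message defined");
        return code + " " + name + ": " + message;
    }
}
```

Keeping the name and the message under one numeric prefix means a new error, such as 51201 here, only needs two lines in the properties file.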
@@ -1,5 +1,5 @@
#!/bin/sh
-# Dependencies: dd, df, grep, awk
+# Dependencies: dd, df, grep, awk, sleep

TMP_FILESYSTEM=$(df /tmp -T | grep '/$' | awk '{print $2}')
FILE_NAME=chaos-burn-io-experiment
@@ -10,6 +10,9 @@ else
FILE_PATH=$HOME/$FILE_NAME
fi

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

while true ; do
dd if=/dev/zero of=$FILE_PATH bs=1M count=1024
done
@@ -1,10 +1,13 @@
#!/bin/sh
# Description: Simulates high CPU usage on all available processing units
-# Dependencies: wc, yes, grep
+# Dependencies: wc, yes, grep, sleep

limit=$(grep proc /proc/cpuinfo | wc -l)
counter=1

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

while [ $counter -le $limit ]; do
yes "is it chaos ?" | grep "wow really?" &
counter=$(($counter+1))
@@ -2,4 +2,7 @@
# Description: Removes all DNS servers from the system configuration and makes sure that other system utilities do not override the new settings
# Dependencies: echo, sleep

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

while true; do echo ""> /etc/resolv.conf; sleep 10; done
@@ -1,6 +1,6 @@
#!/bin/sh
# Description: Gradually allocates remaining free space in system root partition
-# Dependencies: dd, df, grep, awk
+# Dependencies: dd, df, grep, awk, sleep

TMP_FILESYSTEM=$(df /tmp -T | grep '/$' | awk '{print $2}')
FILE_NAME=chaos-disk-fill-experiment
@@ -11,6 +11,9 @@ else
FILE_PATH=$HOME/$FILE_NAME
fi

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

FREE_SPACE=$(($(df -amPk | grep '/$' | awk '{print $4}') * 1024))
fallocate -l $(($FREE_SPACE * 99 / 100)) $FILE_PATH

@@ -1,4 +1,8 @@
#!/bin/sh
# Description: Fork Bomb experiment to consume CPU
-# Dependencies:
+# Dependencies: sleep

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

bomb() { bomb | bomb & }; bomb;
@@ -1,6 +1,6 @@
#!/bin/sh
# Description: Tie up memory in other processes for some time, allowing
-# Dependencies: cat, dd, sleep, grep, awk
+# Dependencies: cat, dd, sleep, grep, awk, sleep

MEM_TOTAL=$(grep MemTotal /proc/meminfo | awk ' { print $2 } ')
MEM_TOTAL=$((${MEM_TOTAL}*1024))
@@ -20,4 +20,7 @@ else
MEM_FREE=${MEM_AVAILABLE}
fi

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

dd if=/dev/zero of=/dev/stdout bs=${MEM_FREE} count=1 | sleep 99999
@@ -1,4 +1,7 @@
#!/bin/sh
-# Dependencies: ip
+# Dependencies: ip, sleep

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

ip route add blackhole 10.0.0.0/8
@@ -1,5 +1,8 @@
#!/bin/sh
-# Dependencies: dd
+# Dependencies: dd, sleep

+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5

while true ; do
dd if=/dev/random of=/dev/null bs=1 count=1024;
@@ -2,22 +2,28 @@
# Description: Terminate random processes
# Dependencies: for, kill, sleep, ps, grep, awk, readlink, set

-if readlink -f /proc/1/exe | grep -q -e systemd -e init ; then
-ps -A -o pid,comm | grep -i \
--e docker \
--e java \
--e python \
--e mysql \
--e cassandra \
--e node \
--e etcd \
--e mongod \
-| awk ' { print $1 } ' \
-| xargs kill -9
+# This extra wait time is added in order to synchronize startup of parallel experiments
+sleep 5
+
+if readlink -f /proc/1/exe | grep -q -e systemd -e init; then
+ps -A -o pid,comm | grep -i \
+-e docker \
+-e java \
+-e python \
+-e mysql \
+-e cassandra \
+-e node \
+-e etcd \
+-e mongod \
+-e nginx \
+-e httpd \
+-e postgres |
+awk ' { print $1 } ' |
+xargs kill -9

else
-for _ in 1 2 3 4 5; do
-kill -2 1
-sleep 2s
-done
+for _ in 1 2 3 4 5; do
+kill -2 1
+sleep 2s
+done
fi
@@ -57,6 +57,7 @@
import static com.thales.chaos.experiment.enums.ExperimentType.*;
import static org.awaitility.Awaitility.await;
import static org.hamcrest.CoreMatchers.not;
import static org.hamcrest.Matchers.is;
import static org.hamcrest.Matchers.isEmptyString;
import static org.junit.Assert.*;
import static org.mockito.Mockito.*;
@@ -150,6 +151,16 @@ public void callSelfHealing () {
assertTrue(selfHealingMethodCalled.get());
}

@Test
public void callSelfHealingNoMethodSet () {
final AtomicBoolean selfHealingMethodCalled = new AtomicBoolean(false);
experiment.setExperimentState(SELF_HEALING);
experiment.setSelfHealingMethod(null);
experiment.callSelfHealing();
verify(experiment, never()).evaluateRunningExperiment();
assertThat(experiment.getExperimentState(), is(FAILED));
}

@Test
public void callSelfHealingDisabled () {
final AtomicBoolean selfHealingMethodCalled = new AtomicBoolean(false);
14 changes: 13 additions & 1 deletion chaosengine-coverage/pom.xml
@@ -5,7 +5,7 @@
<parent>
<artifactId>chaosengine</artifactId>
<groupId>com.thales.chaos</groupId>
-<version>1.4.0-SNAPSHOT</version>
+<version>1.5.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>

@@ -94,12 +94,24 @@
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.thales.chaos</groupId>
<artifactId>chaosengine-gcp-common</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.thales.chaos</groupId>
<artifactId>chaosengine-gcp-compute</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.thales.chaos</groupId>
<artifactId>chaosengine-gcp-memorystore</artifactId>
<version>${project.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.thales.chaos</groupId>
<artifactId>chaosengine-kubernetes</artifactId>
2 changes: 1 addition & 1 deletion chaosengine-experiments/chaosengine-aws-ec2/pom.xml
@@ -3,7 +3,7 @@
<parent>
<artifactId>chaosengine-experiments</artifactId>
<groupId>com.thales.chaos</groupId>
-<version>1.4.0-SNAPSHOT</version>
+<version>1.5.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>

2 changes: 1 addition & 1 deletion chaosengine-experiments/chaosengine-aws-rds/pom.xml
@@ -3,7 +3,7 @@
<parent>
<artifactId>chaosengine-experiments</artifactId>
<groupId>com.thales.chaos</groupId>
-<version>1.4.0-SNAPSHOT</version>
+<version>1.5.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>

22 changes: 22 additions & 0 deletions chaosengine-experiments/chaosengine-gcp-common/pom.xml
@@ -0,0 +1,22 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://maven.apache.org/POM/4.0.0"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<artifactId>chaosengine-experiments</artifactId>
<groupId>com.thales.chaos</groupId>
<version>1.5.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>

<artifactId>chaosengine-gcp-common</artifactId>
<version>1.5.0-SNAPSHOT</version>

<dependencies>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-compute</artifactId>
</dependency>
</dependencies>

</project>
@@ -19,4 +19,8 @@

public class GcpConstants {
public static final String CREATED_BY_METADATA_KEY = "created-by";
public static final String MEMORYSTORE_LOCATION_WILDCARD = "-";

private GcpConstants () {
}
}
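
The commit message mentions using a wildcard instead of user-defined locations, and `MEMORYSTORE_LOCATION_WILDCARD` is the constant that carries it. The sketch below shows one way such a constant might be applied when building a Memorystore parent resource name. It is an assumption for illustration: `MemorystoreParentSketch` and `parent` are hypothetical, and the `projects/<project>/locations/<location>` shape with `-` standing for all locations reflects the Google Cloud resource-name convention, not code shown in this diff.

```java
/** Hypothetical utility class; mirrors the constants-holder pattern used by GcpConstants. */
final class MemorystoreParentSketch {
    // Same value as the constant added in this commit.
    static final String MEMORYSTORE_LOCATION_WILDCARD = "-";

    // Private constructor prevents instantiation of a constants/utility holder.
    private MemorystoreParentSketch() {
    }

    /** Builds a parent resource name, falling back to the wildcard when no location is given. */
    static String parent(String projectId, String location) {
        String effectiveLocation = (location == null || location.isEmpty())
                ? MEMORYSTORE_LOCATION_WILDCARD
                : location;
        return "projects/" + projectId + "/locations/" + effectiveLocation;
    }
}
```

Under these assumptions, calling `parent` without a location yields a name ending in `/locations/-`, so a single query could cover every region rather than a configured list.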