From cb6033d171100690d214e217e9c6e992a401e323 Mon Sep 17 00:00:00 2001
From: Kiel Friedt
Date: Tue, 17 Dec 2024 14:48:18 -0800
Subject: [PATCH] Adding mongodb8 details. Modified how it is install and configured

---
 .../mongodb/benchmark_mongodb-7.0.md | 128 ++++++++++++++++++
 .../mongodb/benchmark_mongodb-8.0.md | 124 +++++++++++++++++
 .../mongodb/mongodb_configuration.md | 78 +++++++++++
 .../mongodb/run_mongodb.md | 14 +-
 4 files changed, 337 insertions(+), 7 deletions(-)
 create mode 100644 content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-7.0.md
 create mode 100644 content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-8.0.md
 create mode 100644 content/learning-paths/servers-and-cloud-computing/mongodb/mongodb_configuration.md

diff --git a/content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-7.0.md b/content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-7.0.md
new file mode 100644
index 000000000..75db6fe83
--- /dev/null
+++ b/content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-7.0.md
@@ -0,0 +1,128 @@
+---
+# User change
+title: "Benchmark MongoDB 7.0 on Arm with Yahoo Cloud Serving Benchmark (YCSB)"
+
+weight: 4 # (intro is 1), 2 is first, 3 is second, etc.
+
+# Do not modify these elements
+layout: "learningpathall"
+---
+To further measure the performance of MongoDB, you will run the [Yahoo Cloud Serving Benchmark](http://github.com/brianfrankcooper/YCSB).
+
+YCSB is an open source project that provides a framework and a common set of workloads to evaluate the performance of different "key-value" and "cloud" serving stores. Here are the steps to run YCSB to evaluate the performance of MongoDB running on a 64-bit Arm machine.
+
+## Additional software packages
+
+To run YCSB, additional software packages are required: [Apache Maven](https://maven.apache.org/) and [Python](https://www.python.org) 2.7.
+
+Installing Apache Maven:
+
+```bash
+    cd ~
+    wget https://archive.apache.org/dist/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.tar.gz
+    sudo tar xzf apache-maven-*-bin.tar.gz -C /usr/local
+    cd /usr/local
+    sudo ln -s apache-maven-* maven
+    cd ~/
+    export M2_HOME=/usr/local/maven
+    export PATH="$M2_HOME/bin:$PATH"
+```
+
+Installing Python 2.7:
+
+{{< tabpane code=true >}}
+  {{< tab header="Ubuntu" >}}
+sudo apt-get update
+sudo apt install python -y
+{{< /tab >}}
+{{< tab header="RHEL/Amazon" >}}
+sudo yum check-update
+sudo yum install python2
+{{< /tab >}}
+{{< /tabpane >}}
+{{% notice Python Note%}}
+
+For Ubuntu 22.04 the `python` package may not be found. You can install Python 2.7 using:
+```console
+sudo apt install python2 -y
+sudo update-alternatives --install /usr/bin/python python /usr/bin/python2 1
+```
+
+For Red Hat you can configure `python2` to be the default `python` using:
+```console
+sudo alternatives --set python /usr/bin/python2
+```
+{{% /notice %}}
+
+## Setup YCSB
+
+Download the latest released YCSB archive and extract it.
+
+```bash { pre_cmd="sudo apt install -y python" }
+mkdir ycsb && cd ycsb
+curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-0.17.0.tar.gz
+tar xfvz ycsb-0.17.0.tar.gz
+```
+Now `cd` into the project folder and run the executable to print a description of how to use the benchmark.
+
+```bash { env="M2_HOME=/usr/local/maven,PATH=/usr/local/maven/bin:$PATH",cwd="./ycsb",ret_code="2" }
+cd ycsb-0.17.0
+./bin/ycsb
+```
+## Load/Insert Test on MongoDB
+
+To test the performance of loading data (INSERT) into the default database `ycsb` at `localhost:27017`, where MongoDB is running, using the synchronous driver, run the following command:
+
+```console
+./bin/ycsb load mongodb -s -P workloads/workloada -p mongodb.url=mongodb://localhost:27017/ycsb?w=0 -threads 10
+```
+The "-P" parameter is used to load property files.
In this example, you used it to load the workloada parameter file, which sets the recordcount to 1000 in addition to other parameters. The "-threads" parameter indicates the number of threads and is set to 1 by default.
+
+## Update/Read/Read Modify Write Test on MongoDB
+
+To test the performance of executing a workload that includes UPDATE, Read Modify Write (RMW) and/or READ operations on the data, using 10 threads for example, use the following command:
+
+```console
+./bin/ycsb run mongodb -s -P workloads/workloada -p mongodb.url=mongodb://localhost:27017/ycsb?w=0 -threads 10
+```
+
+The workloads/workloada file in this example sets `readproportion=0.5` and `updateproportion=0.5`, which means there is an even split between the number of READ and UPDATE operations performed. You can change the type of operations and the splits by providing your own workload parameter file.
+
+For more detailed information on all the parameters for running a workload, refer to [this section](https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload).
+
+## View the results
+
+At the end of each test, statistics are printed to the console.
Shown below is the output from the end of the Load/Insert test:
+
+```output
+2022-07-06 15:50:18:917 1 sec: 1000 operations; 542.01 current ops/sec; [CLEANUP: Count=10, Max=12951, Min=0, Avg=1295.2, 90=4, 99=12951, 99.9=12951, 99.99=12951] [INSERT: Count=1000, Max=134655, Min=561, Avg=8506.37, 90=10287, 99=39903, 99.9=134015, 99.99=134655]
+[OVERALL], RunTime(ms), 1849
+[OVERALL], Throughput(ops/sec), 540.8328826392644
+[TOTAL_GCS_Copy], Count, 5
+[TOTAL_GC_TIME_Copy], Time(ms), 23
+[TOTAL_GC_TIME_%_Copy], Time(%), 1.2439156300703083
+[TOTAL_GCS_MarkSweepCompact], Count, 0
+[TOTAL_GC_TIME_MarkSweepCompact], Time(ms), 0
+[TOTAL_GC_TIME_%_MarkSweepCompact], Time(%), 0.0
+[TOTAL_GCs], Count, 5
+[TOTAL_GC_TIME], Time(ms), 23
+[TOTAL_GC_TIME_%], Time(%), 1.2439156300703083
+[CLEANUP], Operations, 10
+[CLEANUP], AverageLatency(us), 1295.2
+[CLEANUP], MinLatency(us), 0
+[CLEANUP], MaxLatency(us), 12951
+[CLEANUP], 95thPercentileLatency(us), 12951
+[CLEANUP], 99thPercentileLatency(us), 12951
+[INSERT], Operations, 1000
+[INSERT], AverageLatency(us), 8506.367
+[INSERT], MinLatency(us), 561
+[INSERT], MaxLatency(us), 134655
+[INSERT], 95thPercentileLatency(us), 11871
+[INSERT], 99thPercentileLatency(us), 39903
+[INSERT], Return=OK, 1000
+...
+```
+## Other tests
+
+For instructions on running any other tests or more details on the metrics reported, refer to the [GitHub project for the YCSB](https://github.com/brianfrankcooper/YCSB/wiki/).
+
diff --git a/content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-8.0.md b/content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-8.0.md
new file mode 100644
index 000000000..716fd50c8
--- /dev/null
+++ b/content/learning-paths/servers-and-cloud-computing/mongodb/benchmark_mongodb-8.0.md
@@ -0,0 +1,124 @@
+---
+# User change
+title: "Benchmark MongoDB 8.0 on Arm with Yahoo Cloud Serving Benchmark (YCSB)"
+
+weight: 4 # (intro is 1), 2 is first, 3 is second, etc.
+
+# Do not modify these elements
+layout: "learningpathall"
+---
+To further measure the performance of MongoDB, you will run the [Yahoo Cloud Serving Benchmark](http://github.com/brianfrankcooper/YCSB).
+
+YCSB is an open source project that provides a framework and a common set of workloads to evaluate the performance of different "key-value" and "cloud" serving stores. Here are the steps to run YCSB to evaluate the performance of MongoDB running on a 64-bit Arm machine.
+
+## Additional software packages
+
+To run YCSB, additional software packages are required: default-jdk, default-jre, maven, make and Python.
+
+
+Install all other packages:
+
+{{< tabpane code=true >}}
+  {{< tab header="Ubuntu" >}}
+sudo apt-get update
+sudo apt install -y default-jre default-jdk maven make gcc
+{{< /tab >}}
+{{< /tabpane >}}
+{{% notice Python Note%}}
+
+For Ubuntu 22.04 and 24.04 the `python` package may not be found. You can build and install Python 2.7 from source using:
+```console
+wget https://www.python.org/ftp/python/2.7.18/Python-2.7.18.tgz
+tar xzf Python-2.7.18.tgz && cd Python-2.7.18
+./configure --enable-optimizations
+sudo make altinstall
+sudo ln -s /usr/local/bin/python2.7 /usr/bin/python
+```
+{{% /notice %}}
+
+## Setup YCSB
+
+Download the latest released YCSB archive and extract it.
+
+```bash
+mkdir ycsb && cd ycsb
+curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.17.0/ycsb-0.17.0.tar.gz
+tar xfvz ycsb-0.17.0.tar.gz
+
+```
+Now `cd` into the project folder and run the executable to print a description of how to use the benchmark.
+
+```bash
+cd ycsb-0.17.0
+./bin/ycsb
+```
+
+## Most Common MongoDB Test Setup
+
+The recommended test setup is a replica set that contains three nodes of equal size. The primary is the node you send the YCSB traffic to.
+
+## Recommended Tests on MongoDB
+
+The most common real-world test to run is a 95/5 test: 95% read and 5% update. 100/0 and 90/10 are also popular. Run the following commands for about 5 minutes before collecting data.
+
+Load the dataset
+```console
+./bin/ycsb load mongodb -s -P workloads/workloadb -p mongodb.url=mongodb://localhost:27017 -p compressibility=2 -p fieldlengthdistribution=zipfian -p minfieldlength=50 -threads 64 -p recordcount=20000000
+```
+
+95/5
+```console
+./bin/ycsb run mongodb -s -P workloads/workloadb -p mongodb.url=mongodb://localhost:27017 -p minfieldlength=50 -p compressibility=2 -p maxexecutiontime=120 -threads 64 -p operationcount=40000000 -p recordcount=20000000 -p requestdistribution=zipfian -p readproportion=0.95 -p updateproportion=0.05
+
+```
+
+100/0
+```console
+./bin/ycsb run mongodb -s -P workloads/workloadc -p mongodb.url=mongodb://localhost:27017 -p minfieldlength=50 -p compressibility=2 -p maxexecutiontime=120 -threads 64 -p operationcount=40000000 -p recordcount=20000000 -p requestdistribution=zipfian -p readproportion=1.0 -p updateproportion=0.0
+
+```
+
+90/10
+```console
+./bin/ycsb run mongodb -s -P workloads/workloadb -p mongodb.url=mongodb://localhost:27017 -p minfieldlength=50 -p compressibility=2 -p maxexecutiontime=120 -threads 64 -p operationcount=40000000 -p recordcount=20000000 -p requestdistribution=zipfian -p readproportion=0.90 -p updateproportion=0.10
+
+```
+
+For more detailed information on all the parameters for running a workload, refer to [this section](https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload).
+
+## View the results
+
+At the end of each test, statistics are printed to the console.
Shown below is the output from the end of the Load/Insert test:
+
+```output
+2022-07-06 15:50:18:917 1 sec: 1000 operations; 542.01 current ops/sec; [CLEANUP: Count=10, Max=12951, Min=0, Avg=1295.2, 90=4, 99=12951, 99.9=12951, 99.99=12951] [INSERT: Count=1000, Max=134655, Min=561, Avg=8506.37, 90=10287, 99=39903, 99.9=134015, 99.99=134655]
+[OVERALL], RunTime(ms), 1849
+[OVERALL], Throughput(ops/sec), 540.8328826392644
+[TOTAL_GCS_Copy], Count, 5
+[TOTAL_GC_TIME_Copy], Time(ms), 23
+[TOTAL_GC_TIME_%_Copy], Time(%), 1.2439156300703083
+[TOTAL_GCS_MarkSweepCompact], Count, 0
+[TOTAL_GC_TIME_MarkSweepCompact], Time(ms), 0
+[TOTAL_GC_TIME_%_MarkSweepCompact], Time(%), 0.0
+[TOTAL_GCs], Count, 5
+[TOTAL_GC_TIME], Time(ms), 23
+[TOTAL_GC_TIME_%], Time(%), 1.2439156300703083
+[CLEANUP], Operations, 10
+[CLEANUP], AverageLatency(us), 1295.2
+[CLEANUP], MinLatency(us), 0
+[CLEANUP], MaxLatency(us), 12951
+[CLEANUP], 95thPercentileLatency(us), 12951
+[CLEANUP], 99thPercentileLatency(us), 12951
+[INSERT], Operations, 1000
+[INSERT], AverageLatency(us), 8506.367
+[INSERT], MinLatency(us), 561
+[INSERT], MaxLatency(us), 134655
+[INSERT], 95thPercentileLatency(us), 11871
+[INSERT], 99thPercentileLatency(us), 39903
+[INSERT], Return=OK, 1000
+...
+```
+## Other tests
+
+For instructions on running any other tests or more details on the metrics reported, refer to the [GitHub project for the YCSB](https://github.com/brianfrankcooper/YCSB/wiki/).
+
diff --git a/content/learning-paths/servers-and-cloud-computing/mongodb/mongodb_configuration.md b/content/learning-paths/servers-and-cloud-computing/mongodb/mongodb_configuration.md
new file mode 100644
index 000000000..708034b4f
--- /dev/null
+++ b/content/learning-paths/servers-and-cloud-computing/mongodb/mongodb_configuration.md
@@ -0,0 +1,78 @@
+---
+# User change
+title: "MongoDB test configuration and setup"
+
+weight: 3 # 1 is first, 2 is second, etc.
+
+# Do not modify these elements
+layout: "learningpathall"
+---
+The most popular test setup for real-world testing is a replica set. A replica set of three equal-sized nodes is created and initiated.
+
+## What is a replica set?
+A replica set is a group of instances that maintain the same data set. A replica set can contain many nodes; this test uses three. Of the three nodes, exactly one is the primary node, while the other nodes are secondary nodes.
+
+## What node size should I use?
+The most common size for testing MongoDB is an 8 vCPU instance. You are welcome to test with any size of machine, but for ideal testing conditions 8 vCPUs is more than enough. 32 GB of RAM is recommended for testing.
+
+## How should I run this test?
+It is recommended to avoid disk access and keep the complete data set in memory. The recommended configuration is shown below and explained in detail.
+
+## Mongod.conf
+
+```console
+# Configuration Options: https://docs.mongodb.org/manual/reference/configuration-options/
+# Log Messages/Components: https://docs.mongodb.com/manual/reference/log-messages/index.html
+
+systemLog:
+  destination: file
+  logAppend: true
+  path: /var/log/mongodb/mongodb.log
+
+storage:
+  dbPath: /mnt/mongodb # Mounting point selected
+  engine: wiredTiger
+  wiredTiger:
+    engineConfig:
+      configString: "cache_size=16484MB" # 50% of your RAM is recommended. Adding more helps depending on dataset.
+
+replication:
+  replSetName: "rs0" # Name of your replica set
+  oplogSizeMB: 5000
+
+# network interfaces
+net:
+  port: 27017
+  bindIp: 0.0.0.0
+  maxIncomingConnections: 16000
+setParameter:
+  diagnosticDataCollectionDirectorySizeMB: 400
+  honorSystemUmask: false
+  lockCodeSegmentsInMemory: true
+  reportOpWriteConcernCountersInServerStatus: true
+  suppressNoTLSPeerCertificateWarning: true
+  tlsWithholdClientCertificate: true
+```
+**systemLog:** Specifies where and how log output is written.
+- **path:** Location of the log file.
+
+**storage:** It is recommended to run the test within memory to achieve the best performance. This section contains details on the storage engine used and the location of storage.
+- **engine:** WiredTiger is used in this case. Using a disk will add latency.
+- **cache_size:** With the recommended instance size, the minimum is 50% of 32 GB (16 GB). In testing, using 18 GB produced better results.
+
+**replication:** This is used for the replica set setup.
+- **replSetName:** This is the name of the replica set.
+- **oplogSizeMB:** 5% of the disk size is recommended.
+
+**net:** Contains details of networking on the node.
+- **port:** 27017, the default `mongod` port, is used by the replica set members.
+- **maxIncomingConnections:** The maximum number of incoming connections supported by MongoDB.
+
+**setParameter:** Additional options
+- **diagnosticDataCollectionDirectorySizeMB:** 400 is based on the docs.
+- **honorSystemUmask:** When `false`, new files are readable and writable only by their owner.
+- **lockCodeSegmentsInMemory:** Locks code into memory and prevents it from being swapped.
+- **suppressNoTLSPeerCertificateWarning:** Suppresses the warning logged when clients connect without a certificate. (Only for testing purposes)
+- **tlsWithholdClientCertificate:** The node will not send its certificate during outbound communication. (Only for testing purposes)
+
+If you would like to use encryption, you will need to add the `security` section and a `keyFile` to your configuration, and change some of the parameters in mongod.conf.
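
The sizing rules of thumb described in the configuration notes (WiredTiger cache at 50% of RAM, oplog at 5% of disk) can be sketched as a small helper. This is a minimal Python sketch and not part of the patch itself: the function names `wiredtiger_cache_mb`, `oplog_size_mb` and `cache_config_string` are hypothetical, and the conversion of 1 GB to 1024 MB is an assumption, so the results will not match the literal values in the sample mongod.conf exactly.

```python
def wiredtiger_cache_mb(ram_gb, percent=50):
    """Return a WiredTiger cache size in MB as a percentage of total RAM.

    Assumes 1 GB = 1024 MB; integer arithmetic avoids float rounding.
    """
    if ram_gb <= 0 or not 0 < percent <= 100:
        raise ValueError("ram_gb must be positive and percent in (0, 100]")
    return ram_gb * 1024 * percent // 100


def oplog_size_mb(disk_gb, percent=5):
    """Return an oplog size in MB as a percentage of the disk size."""
    if disk_gb <= 0 or not 0 < percent <= 100:
        raise ValueError("disk_gb must be positive and percent in (0, 100]")
    return disk_gb * 1024 * percent // 100


def cache_config_string(ram_gb, percent=50):
    """Build the value used for storage.wiredTiger.engineConfig.configString."""
    return "cache_size={}MB".format(wiredtiger_cache_mb(ram_gb, percent))


if __name__ == "__main__":
    # Hypothetical node: 32 GB of RAM and a 100 GB data disk.
    print(cache_config_string(32))
    print(oplog_size_mb(100))
```

Passing a larger `percent` to `cache_config_string` is one way to explore the observation that a cache somewhat above 50% of RAM gave better results in testing.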
\ No newline at end of file
diff --git a/content/learning-paths/servers-and-cloud-computing/mongodb/run_mongodb.md b/content/learning-paths/servers-and-cloud-computing/mongodb/run_mongodb.md
index e41ecb708..2688b373e 100644
--- a/content/learning-paths/servers-and-cloud-computing/mongodb/run_mongodb.md
+++ b/content/learning-paths/servers-and-cloud-computing/mongodb/run_mongodb.md
@@ -9,17 +9,17 @@ layout: "learningpathall"
 ---
 [MongoDB](https://www.mongodb.com/) is a popular free-to-use document database.

-The latest released version of MongoDB Community Edition (7.0) is supported on the following Linux distributions:
+The latest released version of MongoDB Community Edition (8.0) is supported on the following Linux distributions:

-* Ubuntu Versions - 20.04, 22.04
+* Ubuntu Versions - 20.04, 22.04, 24.04
 * RHEL/CentOS 8, 9
-* Amazon Linux 2, 2023
+* Amazon Linux 2023

 Refer to this [page](https://www.mongodb.com/docs/manual/administration/production-notes/#platform-support-matrix) for the complete platform support matrix.

 ## Install and Run MongoDB on Ubuntu

-Launch an [Arm based instance](/learning-paths/servers-and-cloud-computing/csp/) running either Ubuntu versions 20.04 or 22.04.
+Launch an [Arm based instance](/learning-paths/servers-and-cloud-computing/csp/) running either Ubuntu versions 20.04, 22.04 or 24.04.

 Follow [Install MongoDB Community Edition on Ubuntu](https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-ubuntu/) to install and run MongoDB on your instance.

@@ -29,8 +29,8 @@ Launch an [Arm based instance](/learning-paths/servers-and-cloud-computing/csp/)

 Follow [Install MongoDB Community Edition on Red Hat or CentOS](https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-red-hat/) to install and run MongoDB on your instance.

-## Install and Run MongoDB on Amazon Linux 2
+## Install and Run MongoDB on Amazon Linux 2023

-Launch an [Arm based instance](/learning-paths/servers-and-cloud-computing/csp/) running Amazon Linux 2 or Amazon Linux 2023.
+Launch an [Arm based instance](/learning-paths/servers-and-cloud-computing/csp/) running Amazon Linux 2023.

-Follow [Install MongoDB Community Edition on Amazon Linux](https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-amazon/) to install and run MongoDB on your instance.
+Follow [Install MongoDB Community Edition on Amazon Linux 2023](https://www.mongodb.com/docs/manual/tutorial/install-mongodb-on-amazon/) to install and run MongoDB on your instance.