simple-spark

1. System architecture

This project implements a mini version of Spark. The overall class diagram consists of the following classes:

DFSClient: The client of the Distributed File System (DFS); it handles only read and write operations.
NameNode: Coordinates data reads and writes.
DataNode: Performs the actual data reading, writing, and computing.
RDD: Records the information for a Resilient Distributed Dataset (RDD), such as its operations, child nodes, and id. It also defines operations such as textFile, map, and take as the interface provided to users.
Operation: An abstract class whose subclasses include the Transformation and Action classes. These in turn have concrete operation subclasses such as TextFileOp, MapOp, and TakeOp, which define a unified interface (such as __call__) that RDD invokes to perform the actual computation.
SparkContext: The context of this Spark implementation. It is a special RDD that serves as the root of the computation graph.
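
The relationship between RDD, Operation, and SparkContext described above can be illustrated with a minimal single-process sketch. It is not the repository's code: the constructor signatures, the `parent` and `children` attributes, the `from_lines` method, and the `LinesOp` class (an in-memory stand-in for TextFileOp, so that no DFS is needed) are all assumptions made for illustration. Only the class names RDD, Operation, Transformation, Action, MapOp, TakeOp, and SparkContext, and the use of `__call__`, come from the description above.

```python
from abc import ABC, abstractmethod
from itertools import count, islice


class Operation(ABC):
    """Unified interface: an RDD invokes every operation through __call__."""

    @abstractmethod
    def __call__(self, records):
        ...


class Transformation(Operation):
    """Lazy operation: maps an iterable of records to another iterable."""


class Action(Operation):
    """Eager operation: consumes records and returns a concrete result."""


class LinesOp(Transformation):
    # Hypothetical stand-in for TextFileOp: yields in-memory lines
    # instead of reading a file from the DFS.
    def __init__(self, lines):
        self.lines = lines

    def __call__(self, records):
        return iter(self.lines)


class MapOp(Transformation):
    def __init__(self, func):
        self.func = func

    def __call__(self, records):
        return map(self.func, records)  # lazy: nothing is computed yet


class TakeOp(Action):
    def __init__(self, n):
        self.n = n

    def __call__(self, records):
        return list(islice(records, self.n))


class RDD:
    _ids = count()  # assumed id scheme: a simple process-wide counter

    def __init__(self, op=None, parent=None):
        self.id = next(RDD._ids)
        self.op = op
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def from_lines(self, lines):
        return RDD(LinesOp(lines), parent=self)

    def map(self, func):
        return RDD(MapOp(func), parent=self)

    def _compute(self):
        # Walk back to the root, then apply each transformation in order.
        if self.parent is None:
            return iter(())
        return self.op(self.parent._compute())

    def take(self, n):
        # An action triggers evaluation of the whole chain.
        return TakeOp(n)(self._compute())


class SparkContext(RDD):
    """A special RDD with no operation: the root of the computation graph."""


sc = SparkContext()
upper = sc.from_lines(["a b", "c d"]).map(str.upper)
print(upper.take(1))  # prints ['A B']
```

In this sketch, transformations only extend the graph, and work happens when an action such as take is called. Whether the real project evaluates lazily in the same way is not stated in this README.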

2. Installation and use

# Clone the repository to your local machine
git clone https://github.com/controny/simple-spark
cd simple-spark
# Install the required python packages
python3 scripts/install_all.py
# Set up the configuration
cp common_template.py common.py
vim common.py
# One-click deployment of all nodes
python3 scripts/start_all.py
# One-click termination of all nodes
python3 scripts/stop_all.py

