MalDocA - Malicious Document Analyzer

MalDocA is a library to parse and extract features from Microsoft Office documents. It supports both OLE and OOXML documents.

The project's goal is to analyze potentially malicious documents to improve user safety and security.

REQUIREMENTS

Bazel (recommended version: 4 or 5)
Clang (recommended version: 11 or 12)
OS: Linux or Windows

GENERAL

Some testdata files contain malicious code! Hence, we use a xor-encoding for some testdata files as a safety measure (key = 0x42). Additionally, they are prefixed by "MALICIOUS_" and postfixed by "_xor_0x42_encoded". In general, be very careful when opening / processing test files!

For convenience, we provide a python script ("testdata_encode.py") to encode / decode those files. The script's output is stored in the same path, having "_xored" as file name appendix. Keep in mind that encoding a file twice decodes it again, i.e. restores the original file.

Example usage: python testdata_encode.py maldoca/service/testdata/c98661bcd5bd2e5df06d3432890e7a2e8d6a3edcb5f89f6aaa2e5c79d4619f3d.docx

WINDOWS

Bazel has some Windows related problems, e.g. maximum path length limitations. Make sure to read the best-practices to avoid them.
Enable symlink support (how-to) as it is required by Bazel.

CHECKOUT

git clone --recurse-submodules https://github.com/google/maldoca.git

cd maldoca

BUILD

Linux: bazel build --config=linux //maldoca/...

Windows: bazel build --config=windows //maldoca/...

TEST

Linux: bazel test --config=linux //maldoca/...

Windows: bazel test --config=windows //maldoca/...

DOCKER

We provide a docker file in "docker/Dockerfile". This is the reference platform we use for continuous integration and optionally (arguably recommended) for development as well. Please check the documentation in "docker/Dockerfile" on how to build and use for development.

CONTACT

maldoca-community@google.com

DISCLAIMER

This is not an official Google product.

Name		Name	Last commit message	Last commit date
Latest commit History 100 Commits
.github/workflows		.github/workflows
bazel		bazel
ci/kokoro		ci/kokoro
docker		docker
maldoca		maldoca
third_party		third_party
.bazelrc		.bazelrc
.clang-format		.clang-format
.gitignore		.gitignore
.gitmodules		.gitmodules
BUILD.bazel		BUILD.bazel
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
WORKSPACE		WORKSPACE
clang-format.sh		clang-format.sh
testdata_encode.py		testdata_encode.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MalDocA - Malicious Document Analyzer

REQUIREMENTS

GENERAL

WINDOWS

CHECKOUT

BUILD

TEST

DOCKER

CONTACT

DISCLAIMER

About

Contributors 4

Languages

License

google/maldoca

Folders and files

Latest commit

History

Repository files navigation

MalDocA - Malicious Document Analyzer

REQUIREMENTS

GENERAL

WINDOWS

CHECKOUT

BUILD

TEST

DOCKER

CONTACT

DISCLAIMER

About

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Contributors 4

Languages