IBMocha

A Caffeinated Solution To Privacy

Modern Problems Require Modern Solutions

In the race toward big data solutions and data-driven analytics, it is important to preserve the privacy of the information source as data spreads across the Internet.

IBMocha is a hack built on IBM Watson NLU tools. It uses cloud machine learning infrastructure to find and redact sensitive information on the Internet.

If you are a web admin, you can use this code to look for potential exposure of private data on your pages. This can help you screen your website for possible GDPR violations.

IBMocha is also modelled to target the recent leaks of Aadhaar card data, in which exposed records were picked up by search engine crawlers.

Exposures Identified

Individual Names
Location
Email Addresses
Phone numbers
Aadhaar Numbers (primitive) (XXXX-XXXX-XXXX format)
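
The Aadhaar check is described as primitive, so a plain pattern match is a reasonable way to picture it. The sketch below is an illustration only and is not taken from the IBMocha source: the names AADHAAR_PATTERN and findAadhaarNumbers are hypothetical, and the pattern only checks the XXXX-XXXX-XXXX shape, not whether the number is valid.

```javascript
// Hypothetical helper, not part of the IBMocha codebase.
// Matches 12 digits written as three hyphen-separated groups of four.
// The \b anchors stop the pattern from matching inside a longer digit run.
const AADHAAR_PATTERN = /\b\d{4}-\d{4}-\d{4}\b/g;

function findAadhaarNumbers(text) {
  // match() with a global regex returns every match, or null if none.
  return text.match(AADHAAR_PATTERN) || [];
}
```

Calling findAadhaarNumbers on a page's text returns an array of every substring in that format, or an empty array when nothing matches.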

Get API credentials

Go to IBM Cloud Console -> Login/Register -> Visit Dashboard
Visit Catalog -> AI -> Natural Language Understanding, or open the Natural Language Understanding page directly
Create a Watson NLU Service
Go to Dashboard
Select your newly created Natural Language Understanding service
Go to Service Credentials tab
Create new credentials if none are listed
Click view credentials
Create config.json in the root directory of the repo
Paste the credentials in JSON format into config.json
Add config.json to .gitignore to avoid misuse
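
Once config.json exists, it helps to fail early if the file is malformed. The following is a minimal sketch of such a check, not the way IBMocha itself reads the file. The field names apikey and url are assumptions about what the credentials contain; compare them with the JSON you actually copied from the Service Credentials tab. The function names are hypothetical.

```javascript
const fs = require("fs");

// Assumed field names; adjust to match the credentials you pasted.
const REQUIRED_FIELDS = ["apikey", "url"];

// Parse the credentials JSON and check that the assumed fields are present.
function parseCredentials(jsonText) {
  const credentials = JSON.parse(jsonText); // throws SyntaxError on invalid JSON
  const missing = REQUIRED_FIELDS.filter((key) => !credentials[key]);
  if (missing.length > 0) {
    throw new Error("config.json is missing: " + missing.join(", "));
  }
  return credentials;
}

// Read config.json from disk (relative to the current working directory).
function loadCredentials(path = "config.json") {
  return parseCredentials(fs.readFileSync(path, "utf8"));
}
```

Running loadCredentials() from the repo root either returns the parsed object or throws with a message naming the missing fields.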

Development

Clone repo

git clone https://github.com/ajwad-shaikh/IBMocha.git
cd IBMocha

Install dependencies

npm install
npm install nodemon -g
npm run serve (use npm run win-serve on a Windows machine)

Usage

Open localhost:8008 in your browser.
There are two modes of input: Text and URL.
URL Mode - Enter a URL and click on submit to analyse the website for personal information exposure using IBM Watson NLU Service.

Text Mode - Enter text and click on submit to analyse the text for personal information exposure using IBM Watson NLU Service.
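
The two modes differ only in what is handed to the NLU service: raw text or a URL to fetch. As a rough sketch, assuming the entities feature is what gets requested, the request parameters for each mode could be built as below. buildAnalyzeParams is a hypothetical helper and does not come from nluService.js.

```javascript
// Hypothetical helper: build the parameters for an NLU analyze request.
// The analyze request takes either a `text` or a `url` field plus `features`.
function buildAnalyzeParams(mode, input) {
  if (mode !== "text" && mode !== "url") {
    throw new Error("mode must be 'text' or 'url'");
  }
  const value = String(input).trim();
  if (value.length === 0) {
    throw new Error("input must not be empty");
  }
  // Assumption: entity extraction is the feature used to spot personal data.
  return { [mode]: value, features: { entities: {} } };
}
```

The resulting object would then be passed to whichever NLU client the server uses.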

Text Mode also renders a redacted preview that masks personal information.
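
A redacted preview can be pictured as replacing every detected value with a mask of the same length. The sketch below assumes the detected strings are already available as an array; the function name redact and the default mask character are illustrative choices, not IBMocha's actual implementation.

```javascript
// Hypothetical helper: mask every occurrence of each sensitive value.
function redact(text, sensitiveValues, maskChar = "*") {
  // Mask longer values first so a short value cannot break up a longer one.
  const ordered = [...sensitiveValues].sort((a, b) => b.length - a.length);
  let result = text;
  for (const value of ordered) {
    if (value.length === 0) continue; // splitting on "" would split every character
    // split/join replaces all occurrences without needing regex escaping.
    result = result.split(value).join(maskChar.repeat(value.length));
  }
  return result;
}
```

Keeping the mask the same length as the original value preserves the layout of the text, which makes the preview easier to compare against the input. Matching is case-sensitive, so "Jane" does not mask "jane".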

Further Development

Include PDF file input.
Include redacted website preview.
Include PDF output.
