Code for Labeling Cloud Data and Clustering Model Validation

Ruby Werman

Getting Started

cloud_labeling.ipynb
- Jupyter Notebook file to label and output patch data from a given date range
visualize_patches.ipynb
- Jupyter Notebook file to cluster labeled patch data, create visualizations, and remove poorly labeled patches
80k_with_31_patches_clustered.ipynb
- Jupyter Notebook file to cluster labeled patch data with the existing 80k patches

Necessary Modules:

matplotlib
os
sys
glob
numpy
pyhdf.SD
TensorFlow 1.12.0 for CPU
pandas
seaborn
math
sklearn
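
Before opening the notebooks, it can help to confirm that the modules above are importable. The snippet below is a minimal sketch of such a check and is not part of the repository: `REQUIRED_MODULES` and `find_missing_modules` are hypothetical names, and the list uses import names (`pyhdf` for pyhdf.SD, `tensorflow` for TensorFlow). It does not check the TensorFlow version.

```python
import importlib.util

# Import names for the modules listed above (hypothetical helper list).
REQUIRED_MODULES = [
    "matplotlib", "os", "sys", "glob", "numpy", "pyhdf",
    "tensorflow", "pandas", "seaborn", "math", "sklearn",
]


def find_missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any name printed as missing needs to be installed before the notebooks will run.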

How to Label Data

Necessary elements:

lib_hdfs directory
.txt file of dates (see clouds/src_analysis/dates for examples)
MOD02, MOD35 data from the NASA LAADS website (see here for download instructions)

Run cloud_labeling.ipynb
After you run the notebook and finish labeling, it writes the file needed for clustering and validation: a list of labeled patch instances with the information the clustering model and analysis require.

patches_DDMMYYYY.npy, where DDMMYYYY is the date the patches were labeled
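
The naming convention and a save/load round trip can be sketched as follows. This is an illustration under assumptions, not the notebook's own code: `patches_filename` and `load_patches` are hypothetical helpers, and the record layout (dictionaries with `label` and `data` keys) is a stand-in, since the real patch structure is defined in cloud_labeling.ipynb.

```python
import os
import tempfile
from datetime import date

import numpy as np


def patches_filename(labeled_on):
    """Build the patches_DDMMYYYY.npy name for the date the patches were labeled."""
    return "patches_" + labeled_on.strftime("%d%m%Y") + ".npy"


def load_patches(path):
    """Load a saved patch list; allow_pickle is required if it holds Python objects."""
    return np.load(path, allow_pickle=True)


# Stand-in records; the real fields are whatever the labeling notebook stores.
example_patches = np.array(
    [{"label": "open", "data": np.zeros((2, 2))},
     {"label": "closed", "data": np.ones((2, 2))}],
    dtype=object,
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, patches_filename(date(2019, 7, 15)))
    np.save(path, example_patches)
    reloaded = load_patches(path)
    print(os.path.basename(path), len(reloaded))
```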

How to Visualize Data

Necessary elements:

lib_hdfs directory
encoder directory (see "load model" section of visualize_patches.ipynb)
patches_DDMMYYYY.npy (my 31 labeled patches can be found here)

Run visualize_patches.ipynb

Edit num_clusters to change the number of clusters for agglomerative clustering
Remove ambiguous/mislabeled patches from patch list if necessary
Save plot images if desired
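
The effect of num_clusters and of removing patches can be sketched as below. This is a minimal illustration, assuming the notebook uses scikit-learn's `AgglomerativeClustering`; `cluster_patches` and `drop_patches` are hypothetical helpers, and the random feature vectors stand in for the encoder output that the real notebook computes.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

num_clusters = 3  # change this to alter the number of clusters


def cluster_patches(features, n_clusters):
    """Assign each row of `features` (one vector per patch) to a cluster."""
    model = AgglomerativeClustering(n_clusters=n_clusters)
    return model.fit_predict(features)


def drop_patches(patches, bad_indices):
    """Return the patch list without the entries at `bad_indices`."""
    bad = set(bad_indices)
    return [patch for i, patch in enumerate(patches) if i not in bad]


# Stand-in features: three well-separated groups of five 4-dimensional vectors.
rng = np.random.default_rng(0)
features = np.vstack(
    [rng.normal(loc=center, scale=0.1, size=(5, 4)) for center in (0.0, 10.0, 20.0)]
)
labels = cluster_patches(features, num_clusters)
print(labels)

# Remove patches judged ambiguous or mislabeled, identified here by list index.
kept = drop_patches(["p0", "p1", "p2", "p3"], bad_indices=[1, 3])
print(kept)  # ['p0', 'p2']
```

Re-run the clustering after dropping patches so that the labels and plots reflect the cleaned list.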

How to Visualize Data Within the Existing 80k Patch Dataset

Necessary elements:

npy file containing the labels from clustering ALL data together (use the bash script located here)

Run 80k_with_31_patches_clustered.ipynb
