Commit: Add files via upload

xs018 authored Aug 14, 2023
1 parent 23e0d29 commit 8c2220d

Showing 62 changed files with 42,149 additions and 0 deletions.

137 changes: 137 additions & 0 deletions CNN_ImageRegression/HW6_Job.e5271556
@@ -0,0 +1,137 @@
Unloading the anaconda2 module (if loaded)...
Loaded dependency: anaconda3/2020.11
AI/Version: anaconda3.2020.11
-----------------------------

Description
-----------
The modulefile AI/anaconda3.2020.11 provides a unified, rich, anaconda3-based
environment for Artificial Intelligence (AI)/Machine Learning (ML)/Big Data (BD)
on top of Python 3.

Module contents
---------------
Several popular AI/ML/BD packages are included in this environment, such as:
tensorflow-gpu, theano, keras-gpu, pytorch, opencv, pandas, scikit-learn,
scikit-image etc.

To check the full list of available packages in this environment, first
activate it and then run the command
conda list

Main packages included in this module:
* astropy 4.2
* blas 1.0
* bokeh 2.2.3
* cudatoolkit 10.0.130
* cudnn 7.6.5
* h5py 2.8.0
* hdf5 1.10.2
* ipython 7.19.0
* jupyter 1.0.0
* jupyterlab 2.2.6
* keras-gpu 2.3.1
* matplotlib 3.3.2
* mkl 2019.4
* nccl 1.3.5
* networkx 2.5
* ninja 1.10.2
* nltk 3.5
* notebook 6.1.6
* numba 0.51.2
* numpy 1.17.0
* opencv 3.4.2
* pandas 1.2.0
* pillow 8.1.0
* pip 20.3.3
* python 3.7.9
* pytorch 1.5.0
* scikit-learn 0.23.2
* scipy 1.5.2
* seaborn 0.11.1
* tensorboard 2.0.0
* tensorflow-gpu 2.0.0
* theano 1.0.4

If you need to further customize this environment (e.g., install additional
packages, or upgrade a particular package),
you should first clone this environment as follows:
conda create --prefix <path to a dir in which you can write> --clone $AI_ENV
Then activate the newly spawned environment and proceed with your
customization.

To get further help with conda usage, you can try one of the following:
conda -h
conda <command> -h

Activate the module
-------------------
New!: You are NOT required to manually activate this module anymore.

It should get activated automatically after running the module load command
since the following commands are being run on your behalf:
# module load AI/anaconda3.2020.11
# source /opt/packages/anaconda3/2020.11/etc/profile.d/conda.sh  # conda init
# conda activate $AI_ENV
# (/opt/packages/AI/anaconda3-tf2.2020.11) USERNAME@BRIDGES-2:~ $  # << Your prompt should have changed to something similar.
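
A quick sanity check after the prompt changes (a sketch; any of the bundled packages listed above would do) is to import one of them from Python and print its version:

    # run inside the activated environment
    import tensorflow as tf
    print(tf.__version__)   # the package list above includes tensorflow-gpu 2.0.0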


2021-11-28 23:15:17.433686: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-11-28 23:15:17.506159: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:b2:00.0
2021-11-28 23:15:17.513304: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-11-28 23:15:17.597368: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-11-28 23:15:17.676721: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2021-11-28 23:15:17.686875: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2021-11-28 23:15:17.811941: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2021-11-28 23:15:17.906975: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2021-11-28 23:15:18.097197: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-11-28 23:15:18.099906: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2021-11-28 23:15:18.100186: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-11-28 23:15:18.130843: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2500000000 Hz
2021-11-28 23:15:18.131443: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5567843d1d00 executing computations on platform Host. Devices:
2021-11-28 23:15:18.131483: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
2021-11-28 23:15:18.254270: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556786b0e270 executing computations on platform CUDA. Devices:
2021-11-28 23:15:18.254327: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Tesla V100-SXM2-32GB, Compute Capability 7.0
2021-11-28 23:15:18.257384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: Tesla V100-SXM2-32GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0000:b2:00.0
2021-11-28 23:15:18.257462: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-11-28 23:15:18.257481: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-11-28 23:15:18.257496: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2021-11-28 23:15:18.257511: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2021-11-28 23:15:18.257525: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2021-11-28 23:15:18.257539: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2021-11-28 23:15:18.257554: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-11-28 23:15:18.260005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2021-11-28 23:15:18.260072: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2021-11-28 23:15:18.264775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-11-28 23:15:18.264797: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0
2021-11-28 23:15:18.264807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N
2021-11-28 23:15:18.267404: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 30593 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-32GB, pci bus id: 0000:b2:00.0, compute capability: 7.0)
2021-11-28 23:15:23.984448: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2021-11-28 23:15:24.232371: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-11-28 23:15:25.373557: W tensorflow/stream_executor/cuda/redzone_allocator.cc:312] Not found: ./bin/ptxas not found
Relying on driver to perform ptx compilation. This message will be only logged once.
Traceback (most recent call last):
  File "hw6.py", line 76, in <module>
    history = model.fit(X_train, y_train, batch_size=32, validation_split=0.2, shuffle=True, callbacks=[monitor, model_checkpoint_callback], verbose=2, epochs=100)
  File "/jet/home/xs018/envs/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 728, in fit
    use_multiprocessing=use_multiprocessing)
  File "/jet/home/xs018/envs/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 372, in fit
    prefix='val_')
  File "/jet/home/xs018/envs/lib/python3.7/contextlib.py", line 119, in __exit__
    next(self.gen)
  File "/jet/home/xs018/envs/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 686, in on_epoch
    self.progbar.on_epoch_end(epoch, epoch_logs)
  File "/jet/home/xs018/envs/lib/python3.7/site-packages/tensorflow_core/python/keras/callbacks.py", line 768, in on_epoch_end
    self.progbar.update(self.seen, self.log_values)
  File "/jet/home/xs018/envs/lib/python3.7/site-packages/tensorflow_core/python/keras/utils/generic_utils.py", line 473, in update
    sys.stdout.flush()
OSError: [Errno 122] Disk quota exceeded
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
OSError: [Errno 122] Disk quota exceeded
114 changes: 114 additions & 0 deletions CNN_ImageRegression/HW6_Job.o5157211
@@ -0,0 +1,114 @@
Train on 3599 samples, validate on 900 samples
Epoch 1/100
3599/3599 - 5s - loss: 255.4924 - mse: 255.4924 - val_loss: 0.0781 - val_mse: 0.0781
Epoch 2/100
3599/3599 - 3s - loss: 0.1621 - mse: 0.1621 - val_loss: 0.0732 - val_mse: 0.0732
Epoch 3/100
3599/3599 - 3s - loss: 0.1489 - mse: 0.1489 - val_loss: 0.0706 - val_mse: 0.0706
Epoch 4/100
3599/3599 - 3s - loss: 0.1414 - mse: 0.1414 - val_loss: 0.0701 - val_mse: 0.0701
Epoch 5/100
3599/3599 - 3s - loss: 0.1392 - mse: 0.1392 - val_loss: 0.0685 - val_mse: 0.0685
Epoch 6/100
3599/3599 - 3s - loss: 0.1394 - mse: 0.1394 - val_loss: 0.0670 - val_mse: 0.0670
Epoch 7/100
3599/3599 - 3s - loss: 0.1343 - mse: 0.1343 - val_loss: 0.0658 - val_mse: 0.0658
Epoch 8/100
3599/3599 - 3s - loss: 0.1309 - mse: 0.1309 - val_loss: 0.0639 - val_mse: 0.0639
Epoch 9/100
3599/3599 - 3s - loss: 0.1265 - mse: 0.1265 - val_loss: 0.0623 - val_mse: 0.0623
Epoch 10/100
3599/3599 - 3s - loss: 0.1241 - mse: 0.1241 - val_loss: 0.0602 - val_mse: 0.0602
Epoch 11/100
3599/3599 - 3s - loss: 0.1201 - mse: 0.1201 - val_loss: 0.0582 - val_mse: 0.0582
Epoch 12/100
3599/3599 - 3s - loss: 0.1144 - mse: 0.1144 - val_loss: 0.0568 - val_mse: 0.0568
Epoch 13/100
3599/3599 - 3s - loss: 0.1121 - mse: 0.1121 - val_loss: 0.0546 - val_mse: 0.0546
Epoch 14/100
3599/3599 - 3s - loss: 0.1071 - mse: 0.1071 - val_loss: 0.0525 - val_mse: 0.0525
Epoch 15/100
3599/3599 - 3s - loss: 0.1038 - mse: 0.1038 - val_loss: 0.0497 - val_mse: 0.0497
Epoch 16/100
3599/3599 - 3s - loss: 0.0969 - mse: 0.0969 - val_loss: 0.0478 - val_mse: 0.0478
Epoch 17/100
3599/3599 - 3s - loss: 0.0943 - mse: 0.0943 - val_loss: 0.0457 - val_mse: 0.0457
Epoch 18/100
3599/3599 - 3s - loss: 0.0905 - mse: 0.0905 - val_loss: 0.0429 - val_mse: 0.0429
Epoch 19/100
3599/3599 - 3s - loss: 0.0847 - mse: 0.0847 - val_loss: 0.0410 - val_mse: 0.0410
Epoch 20/100
3599/3599 - 3s - loss: 0.0769 - mse: 0.0769 - val_loss: 0.0384 - val_mse: 0.0384
Epoch 21/100
3599/3599 - 3s - loss: 0.0751 - mse: 0.0751 - val_loss: 0.0366 - val_mse: 0.0366
Epoch 22/100
3599/3599 - 3s - loss: 0.0724 - mse: 0.0724 - val_loss: 0.0336 - val_mse: 0.0336
Epoch 23/100
3599/3599 - 3s - loss: 0.0645 - mse: 0.0645 - val_loss: 0.0315 - val_mse: 0.0315
Epoch 24/100
3599/3599 - 3s - loss: 0.0613 - mse: 0.0613 - val_loss: 0.0294 - val_mse: 0.0294
Epoch 25/100
3599/3599 - 3s - loss: 0.0553 - mse: 0.0553 - val_loss: 0.0274 - val_mse: 0.0274
Epoch 26/100
3599/3599 - 3s - loss: 0.0520 - mse: 0.0520 - val_loss: 0.0251 - val_mse: 0.0251
Epoch 27/100
3599/3599 - 3s - loss: 0.0485 - mse: 0.0485 - val_loss: 0.0229 - val_mse: 0.0229
Epoch 28/100
3599/3599 - 3s - loss: 0.0438 - mse: 0.0438 - val_loss: 0.0209 - val_mse: 0.0209
Epoch 29/100
3599/3599 - 3s - loss: 0.0394 - mse: 0.0394 - val_loss: 0.0190 - val_mse: 0.0190
Epoch 30/100
3599/3599 - 3s - loss: 0.0361 - mse: 0.0361 - val_loss: 0.0169 - val_mse: 0.0169
Epoch 31/100
3599/3599 - 3s - loss: 0.0330 - mse: 0.0330 - val_loss: 0.0152 - val_mse: 0.0152
Epoch 32/100
3599/3599 - 3s - loss: 0.0287 - mse: 0.0287 - val_loss: 0.0135 - val_mse: 0.0135
Epoch 33/100
3599/3599 - 3s - loss: 0.0257 - mse: 0.0257 - val_loss: 0.0118 - val_mse: 0.0118
Epoch 34/100
3599/3599 - 3s - loss: 0.0230 - mse: 0.0230 - val_loss: 0.0106 - val_mse: 0.0106
Epoch 35/100
3599/3599 - 3s - loss: 0.0196 - mse: 0.0196 - val_loss: 0.0092 - val_mse: 0.0092
Epoch 36/100
3599/3599 - 3s - loss: 0.0173 - mse: 0.0173 - val_loss: 0.0080 - val_mse: 0.0080
Epoch 37/100
3599/3599 - 3s - loss: 0.0148 - mse: 0.0148 - val_loss: 0.0070 - val_mse: 0.0070
Epoch 38/100
3599/3599 - 3s - loss: 0.0125 - mse: 0.0125 - val_loss: 0.0058 - val_mse: 0.0058
Epoch 39/100
3599/3599 - 3s - loss: 0.0113 - mse: 0.0113 - val_loss: 0.0050 - val_mse: 0.0050
Epoch 40/100
3599/3599 - 3s - loss: 0.0091 - mse: 0.0091 - val_loss: 0.0044 - val_mse: 0.0044
Epoch 41/100
3599/3599 - 3s - loss: 0.0079 - mse: 0.0079 - val_loss: 0.0037 - val_mse: 0.0037
Epoch 42/100
3599/3599 - 3s - loss: 0.0065 - mse: 0.0065 - val_loss: 0.0031 - val_mse: 0.0031
Epoch 43/100
3599/3599 - 3s - loss: 0.0057 - mse: 0.0057 - val_loss: 0.0027 - val_mse: 0.0027
Epoch 44/100
3599/3599 - 3s - loss: 0.0047 - mse: 0.0047 - val_loss: 0.0023 - val_mse: 0.0023
Epoch 45/100
3599/3599 - 3s - loss: 0.0040 - mse: 0.0040 - val_loss: 0.0020 - val_mse: 0.0020
Epoch 46/100
3599/3599 - 3s - loss: 0.0034 - mse: 0.0034 - val_loss: 0.0017 - val_mse: 0.0017
Epoch 47/100
3599/3599 - 3s - loss: 0.0032 - mse: 0.0032 - val_loss: 0.0016 - val_mse: 0.0016
Epoch 48/100
3599/3599 - 3s - loss: 0.0027 - mse: 0.0027 - val_loss: 0.0015 - val_mse: 0.0015
Epoch 49/100
3599/3599 - 3s - loss: 0.0025 - mse: 0.0025 - val_loss: 0.0014 - val_mse: 0.0014
Epoch 50/100
3599/3599 - 3s - loss: 0.0024 - mse: 0.0024 - val_loss: 0.0014 - val_mse: 0.0014
Epoch 51/100
3599/3599 - 3s - loss: 0.0021 - mse: 0.0021 - val_loss: 0.0014 - val_mse: 0.0014
Epoch 52/100
3599/3599 - 3s - loss: 0.0020 - mse: 0.0020 - val_loss: 0.0014 - val_mse: 0.0014
Epoch 53/100
3599/3599 - 3s - loss: 0.0019 - mse: 0.0019 - val_loss: 0.0014 - val_mse: 0.0014
Epoch 54/100
3599/3599 - 3s - loss: 0.0019 - mse: 0.0019 - val_loss: 0.0014 - val_mse: 0.0014
Epoch 55/100
Restoring model weights from the end of the best epoch.
3599/3599 - 3s - loss: 0.0020 - mse: 0.0020 - val_loss: 0.0014 - val_mse: 0.0014
Epoch 00055: early stopping
test mean square error: 0.0013
Binary file added CNN_ImageRegression/Xiaotong Sun-HW6.pdf
90 changes: 90 additions & 0 deletions CNN_ImageRegression/hw6.py
@@ -0,0 +1,90 @@
import os
from glob import glob
import numpy as np
import pandas as pd
from PIL import Image
from skimage.transform import resize
from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import EarlyStopping

import matplotlib.pyplot as plt

base_path = "/ocean/projects/mch210006p/shared/HW5"
save_path = "/ocean/projects/mch210006p/xs018"
image_size = (240, 240)

#num_samples = 4999

#labels = pd.read_csv(os.path.join(base_path, "DS-1_36W_vapor_fraction.txt"), sep = "\t", usecols=[1])
#labels = labels.values

#imgs = []
#for idx in range(1, num_samples+1):
# img_dir = os.path.join(base_path, "DS-1_36W_images/DS-1_36W.{:05d}.TIFF".format(idx))
# img = Image.open(img_dir)
# img = np.float32(np.array(img)) / 255.
# img = resize(img, image_size, anti_aliasing=True)
# imgs.append(img[..., np.newaxis])

#imgs = np.array(imgs)
#with open(os.path.join(save_path, "train.npy"), "wb") as f:
# np.save(f, imgs)
# np.save(f, labels)

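# CNN regression model: two Conv2D/MaxPooling2D blocks, dropout, a dense hidden
# layer, and a single linear output unit, compiled with Adam and MSE loss.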
def create_model(img_size=(240, 240)):
    input_shape = (img_size[0], img_size[1], 1)
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss='mean_squared_error',
        metrics=['mse'])

    return model

with open(os.path.join(save_path, "train.npy"), "rb") as f:
    imgs = np.load(f)
    labels = np.load(f)

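# Standardize pixel intensities with the global mean and std of the loaded images.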
mean = imgs.mean()
std = imgs.std()
data = (imgs - mean) / std

X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.1, random_state=42)

checkpoint_filepath = f'{save_path}/best_model.hdf5'
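# Stop early once validation loss stalls and keep a weights-only copy of the best epoch.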
monitor = EarlyStopping(monitor='val_loss', patience=5, verbose=1, mode='auto', restore_best_weights=True)
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_filepath,
                                                               save_weights_only=True,
                                                               monitor='val_loss',
                                                               mode='min',
                                                               save_best_only=True)
model = create_model()

history = model.fit(X_train, y_train, batch_size=32, validation_split=0.2, shuffle=True, callbacks=[monitor, model_checkpoint_callback], verbose=2, epochs=100)
print(model.summary())

plt.figure()
plt.plot(history.history['loss'], label="training loss")
plt.plot(history.history['val_loss'], label="validation loss")
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()
plt.savefig("res/hw6/loss.png")

y_pred = model.predict(X_test)
test_mse = ((y_test.flatten()-y_pred.flatten()) ** 2).mean()
print(f"test mean square error: {test_mse:.4f}")

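A minimal sketch of how the best checkpoint written by ModelCheckpoint could be reloaded later for evaluation, assuming the same create_model() definition, checkpoint_filepath, and X_test/y_test split as in hw6.py above:

    # Rebuild the architecture and restore the weights-only HDF5 checkpoint.
    model = create_model()
    model.load_weights(checkpoint_filepath)

    # Re-evaluate on the held-out split.
    y_pred = model.predict(X_test)
    test_mse = ((y_test.flatten() - y_pred.flatten()) ** 2).mean()
    print(f"test mean square error (reloaded weights): {test_mse:.4f}")
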
Binary file added CNN_ImageRegression/loss.png