Add multi hot sparse categorical crossentropy #163

Merged
roywei merged 6 commits into awslabs:dev on Sep 20, 2018

Conversation

@roywei commented on Aug 30, 2018

Summary

This PR adds a new feature to calculate categorical cross-entropy on multi-hot sparse labels.
Inputs are softmax predictions and true labels.
It should return the same loss as categorical cross-entropy.

Example:

Input:
Input data are random images of size (32, 32) in channels-first data format.
Labels:
Traditional labels have the shape (num_samples, num_classes), for example:
labels = [[0, 1, 1, ..., 0],
          [1, 1, 0, ..., 0],
          ...
          [0, 0, 0, ..., 1]]
where len(labels) = num_samples and len(labels[0]) = num_classes.
However, when num_classes is very large and the labels are very sparse,
we can represent them differently, for example:
There are 1000 classes in total, so label indices range from 0 to 999.
Each image can belong to at most 5 labels at the same time.
labels = [[1, 2],
          [0, 1],
          ...
          [999]]
where labels is a list of lists.
Special Note:
To deal with the varying lengths of the sparse labels, we pad them with negative values
so we can differentiate padding values from normal labels. They become:
padded_labels = pad_sequences(labels, value=-1)
padded_labels = [[-1, -1, -1, 1, 2],
                 [-1, -1, -1, 0, 1],
                 ...
                 [-1, -1, -1, -1, 999]]
The padded labels have shape (num_samples, 5), which still saves space compared to dense labels.
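
A small sketch of this padding step, assuming keras.preprocessing.sequence.pad_sequences and toy label values (the real example pads to length 5):

```python
from keras.preprocessing.sequence import pad_sequences

# Variable-length sparse labels: each row lists the classes present in one sample.
labels = [[1, 2],
          [0, 1],
          [999]]

# Pad with -1 so padding can never be confused with a real (non-negative) class index.
padded_labels = pad_sequences(labels, value=-1)
print(padded_labels)
# [[  1   2]
#  [  0   1]
#  [ -1 999]]
```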

Changes

  • Add implementation for loss and metrics
  • Add examples at examples/multi_hot_sparse_categorical_crossentropy.py
  • Add unit tests for loss and metric
  • Performance:
    ~3x faster when calculating the loss for 3000-class multi-label sparse labels (at most 5 labels per sample):
('categorical crossentropy loss time per epoch:', 0.7335219383239746)
('multi hot sparse categorical crossentropy loss time per epoch:', 0.23781204223632812)

PR Overview

  • This PR requires new unit tests [y/n] (make sure tests are included)
  • This PR requires to update the documentation [y/n] (make sure the docs are up-to-date)
  • This PR is backwards compatible [y/n]
  • This PR changes the current API [y/n]

@roywei changed the title from "add multi hot sparse categorical crossentropy" to "[WIP]add multi hot sparse categorical crossentropy" on Aug 30, 2018

@sandeep-krishnamurthy left a comment

Awesome work!

keepdims=True))
mx_output = mx.sym.clip(mx_output, a_min=epsilon(), a_max=1.0 - epsilon())
mx_output = mx.sym.concat(mx.sym.full((target.shape[0],1), 0.5), mx_output)
from mxnet.symbol.contrib import foreach

nit: move it.

mx_output = mx.sym.broadcast_div(mx_output, mx.sym.sum(mx_output,
axis=axis,
keepdims=True))
mx_output = mx.sym.clip(mx_output, a_min=epsilon(), a_max=1.0 - epsilon())

Nice work. Please document these steps so they will be easier to maintain later.
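
For reference, a hedged numpy analogue of the two quoted steps (eps stands in for Keras' epsilon(); the values are illustrative):

```python
import numpy as np

# Renormalize the scores along the class axis, then clip so a downstream log never
# sees exact 0 or 1. This mirrors the broadcast_div(output, sum(output)) + clip above.
eps = 1e-7
output = np.array([[2.0, 1.0, 1.0],
                   [1.0, 0.0, 3.0]])
output = output / output.sum(axis=-1, keepdims=True)
output = np.clip(output, eps, 1.0 - eps)
```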

@roywei changed the title from "[WIP]add multi hot sparse categorical crossentropy" to "Add multi hot sparse categorical crossentropy" on Sep 13, 2018

@kalyc left a comment

Looks good to me, added a few comments inline

break
sparse_label_j = np.where(dense_labels[i] == 1)[0]
sparse_labels.append(sparse_label_j)

[minor] remove blank line

@roywei (Author)

removed
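
For context, a hedged sketch of the dense-to-sparse conversion the quoted lines perform (toy data; the variable names are illustrative):

```python
import numpy as np

# Toy dense multi-hot labels: one row per sample, 1 marks a class that is present.
dense_labels = np.array([[0, 1, 1],
                         [1, 0, 0]])

# Collect the indices of the set classes per row, as in the quoted example script.
sparse_labels = [np.where(row == 1)[0] for row in dense_labels]
# -> [array([1, 2]), array([0])]
```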

metrics=['accuracy'])

# pad sparse labels to a fixed length with value -1 to differentiate padding from normal labels
y_train_pad = pad_sequences(sparse_labels, value=-1)

Why not add this padding logic inside the multi_hot_sparse_categorical_crossentropy calculation method?

@roywei (Author)

It has to be done at the numpy array level, before feeding to the model. Keras can't take an input (y_true) in the form of a list of lists where the list lengths vary (e.g. [[1,2],[0],[0,1,2]]).
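
To illustrate the point, a hedged sketch with toy values: the ragged lists are padded into a rectangular array before they are ever handed to model.fit:

```python
from keras.preprocessing.sequence import pad_sequences

# Ragged labels of varying length cannot be stacked into a single dense y_true tensor...
y_true = [[1, 2], [0], [0, 1, 2]]

# ...so they are padded to a rectangular numpy array first, with -1 marking padding.
y_true_pad = pad_sequences(y_true, value=-1)
# array([[-1,  1,  2],
#        [-1, -1,  0],
#        [ 0,  1,  2]], dtype=int32)
```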

# using control flow ops to iterate output and take target (true label)
_step = lambda data, _: (mx.sym.take(data[0], data[1]), [])
data = [mx_output, target.symbol]
outputs, _ = mx.symbol.contrib.foreach(_step, data, [])

This is an MXNet contrib API. Do we have an issue to track that this needs to be updated once a stable version of foreach is released?

@roywei (Author)

Currently we have CI to test for MXNet API changes (deprecations or breaking changes). It's the same for contrib ops as for other normal ops: if foreach is moved to mx.sym, the contrib version will give a deprecation warning, so we should be able to catch it. Created #173.
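
For reference, a rough NDArray-level sketch of what the contrib foreach in the quoted code does (assumes MXNet >= 1.3; the toy values are made up):

```python
import mxnet as mx

# foreach slices its inputs along axis 0 and applies the body to each slice, so for
# every sample we gather the predicted probabilities at that sample's label indices.
preds = mx.nd.array([[0.1, 0.7, 0.2],
                     [0.5, 0.3, 0.2]])   # (num_samples, num_classes) softmax outputs
labels = mx.nd.array([[1, 2],
                      [0, 1]])           # sparse labels, (num_samples, max_labels)

step = lambda data, states: (mx.nd.take(data[0], data[1]), states)
picked, _ = mx.nd.contrib.foreach(step, [preds, labels], [])
print(picked.asnumpy())                  # probabilities of each sample's true classes
```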

@@ -109,6 +109,22 @@ def test_sparse_categorical_crossentropy_4d():
assert np.isclose(expected_loss, np.mean(loss))


def test_multi_hot_sparse_categorical_crossentropy():

please add test for multi_hot_sparse_categorical_accuracy

@roywei (Author)

added, see changes at tests/keras/metrics_test.py

@sandeep-krishnamurthy left a comment

Thanks for your contributions. LGTM

@kalyc left a comment

LGTM, thanks for your contribution @roywei

@@ -3017,6 +3017,11 @@ def multi_hot_sparse_categorical_crossentropy(target, output, from_logits=False,
>>> y_true_np2 = np.array([[1, 2], [0, 2], [0]])
```
"""
# TODO: remove version check after mxnet 1.3.1 stable release

please create a tracking issue to remove this check

@roywei (Author)

tracked at #175
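
For context, a hypothetical sketch of the kind of version gate the TODO and #175 refer to (the exact check and message in the PR may differ):

```python
from distutils.version import LooseVersion
import mxnet as mx

# Hypothetical gate: only allow the contrib control-flow path on MXNet builds
# that ship it; the threshold here is an assumption based on the TODO.
if LooseVersion(mx.__version__) < LooseVersion('1.3.1'):
    raise NotImplementedError(
        'multi_hot_sparse_categorical_crossentropy requires MXNet >= 1.3.1')
```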

@roywei merged commit ae719f1 into awslabs:dev on Sep 20, 2018