unify how to freeze some parameters for coca pre-training (#526)
Summary:

1. We already support freezing the vision encoder; as experiments progress, we want to freeze other parts of CoCa as well, e.g., the text decoder. This diff provides a unified way to freeze/unfreeze modules, the same way we do for linear probe or finetune (see the sketch after this list).
2. Add a configuration option to use an MLP instead of the attention pooler as the vision adapter.
3. For the output projection in the text decoder, change bias=False to bias=True. Many other places, e.g., the LP head, Ember's output module, and LLaVA, use bias=True (the default in nn.Linear).
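
For illustration only, a minimal sketch of the name-prefix freezing pattern described in point 1; the helper name freeze_modules and the prefix strings are hypothetical, not the API introduced by this diff:

```python
import torch.nn as nn
from typing import Iterable


def freeze_modules(model: nn.Module, prefixes: Iterable[str]) -> None:
    """Disable gradients for every parameter under the given submodule name prefixes."""
    prefix_tuple = tuple(prefixes)
    for name, param in model.named_parameters():
        if name.startswith(prefix_tuple):
            param.requires_grad_(False)


# e.g., freeze the vision encoder and the text decoder of a CoCa-style model:
# freeze_modules(coca_model, ["vision_encoder", "text_decoder"])
```

The same helper shape works for linear probe (freeze everything except the head) and partial finetuning, which is the unification point 1 refers to.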

Differential Revision: D54559503

Privacy Context Container: 303860477774201
zhangtemplar authored and facebook-github-bot committed Mar 21, 2024
1 parent dbeed97 commit 88933e9
Showing 1 changed file with 11 additions and 1 deletion.
tests/test_utils.py: 11 additions & 1 deletion
@@ -192,8 +192,18 @@ def assert_expected_namedtuple(
 
 
 def init_weights_with_constant(model: nn.Module, constant: float = 1.0) -> None:
-    for p in model.parameters():
+    for n, p in model.named_parameters():
         nn.init.constant_(p, constant)
+        # reduce the change to the tests
+        for k in {
+            "text_projection.bias",
+            "pooled_projection.bias",
+            "output_projection.bias",
+            "vision_proj.bias",
+        }:
+            if n.endswith(k):
+                nn.init.constant_(p, 0.0)
+                break
 
 
 def tensor_hash(x: torch.tensor, scaling=0.05, buckets=1000) -> torch.tensor:
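For context, a short usage sketch of the updated helper; it is not part of the commit, and the import path and toy module are assumed. Every parameter is set to the constant, while biases whose names match the listed projections are zeroed so the tests' existing expected values stay unchanged:

```python
import torch
import torch.nn as nn

from tests.test_utils import init_weights_with_constant  # import path assumed


class ToyHead(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        # mirrors the text decoder's output projection, which now uses bias=True
        self.output_projection = nn.Linear(4, 4, bias=True)


model = ToyHead()
init_weights_with_constant(model, constant=1.0)
assert torch.all(model.output_projection.weight == 1.0)
# "output_projection.bias" is in the zeroed set, so the new bias does not shift expected outputs
assert torch.all(model.output_projection.bias == 0.0)
```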
