Problems converting keypoint RCNN from Detectron2 to TensorRT #2678

Closed
GEngels opened this issue Feb 13, 2023 · 11 comments

@GEngels

GEngels commented Feb 13, 2023

I have been trying to convert the Keypoint Mask-RCNN architecture from Detectron2 to a .trt file. With the suggestions from this issue (#2546) and the README (https://github.com/NVIDIA/TensorRT/tree/main/samples/python/detectron/README.md) I have been able to successfully convert the instance segmentation version of the network.
I am trying to use a similar approach for the keypoint model but I am running into a problem.

The current status:

I made some changes to create_onnx.py to make it suitable for the keypoint model. The changes are only in the second part of the roi_heads function, where some names have been changed (mask_pooler -> keypoint_pooler). The final node I am grabbing the output from is the Resize node, which does the resizing after the upsampling. Here is the code with the changes:
```python
        mask_pooler_output = self.ROIAlign(nms_outputs[1], p2, p3, p4, p5, self.second_ROIAlign_pooled_size, \
                                           self.second_ROIAlign_sampling_ratio, self.second_ROIAlign_type, self.second_NMS_max_proposals, 'keypoint_pooler')

        # Reshape mask pooler output.
        mask_pooler_shape = np.asarray([self.second_NMS_max_proposals*self.batch_size, self.fpn_out_channels, self.second_ROIAlign_pooled_size, self.second_ROIAlign_pooled_size], dtype=np.int64)
        mask_pooler_reshape_node = self.graph.op_with_const("Reshape", "keypoint_pooler/reshape", mask_pooler_output, mask_pooler_shape)

        # Get first Conv op in mask head and connect ROIAlign's squeezed output to it.
        mask_head_conv = self.graph.find_node_by_op_name("Conv", "/roi_heads/keypoint_head/conv_fcn1/Conv")
        mask_head_conv.inputs[0] = mask_pooler_reshape_node[0]

        # Reshape node that is preparing 2nd NMS class outputs for Add node that comes next.
        classes_reshape_shape = np.asarray([self.second_NMS_max_proposals * self.batch_size], dtype=np.int64)
        classes_reshape_node = self.graph.op_with_const("Reshape", "box_outputs/reshape_classes", nms_outputs[3], classes_reshape_shape)

        # This loop will generate an array used in Add node, which eventually will help Gather node to pick the single
        # class of interest per bounding box, instead of creating 80 masks for every single bounding box.
        add_array = []
        for i in range(self.second_NMS_max_proposals * self.batch_size):
            if i == 0:
                start_pos = 0
            else:
                start_pos = i * self.num_classes
            add_array.append(start_pos)

        # This Add node is one of the Gather node inputs, Gather node performs gather on 0th axis of data tensor
        # and requires indices that set tensors to be within bounds, this Add node provides the bounds for Gather.
        add_array = np.asarray(add_array, dtype=np.int32)
        classes_add_node = self.graph.op_with_const("Add", "box_outputs/add", classes_reshape_node[0], add_array)

        # Get the last Resize op in keypoint head; its output is what the Gather node below selects from.
        last_resize = self.graph.find_node_by_op_name("Resize", "/roi_heads/keypoint_head/Resize")

        # Gather node that selects only masks belonging to detected class, 79 other masks are discarded.
        final_gather = self.graph.gather("/keypoint_head/gathering", last_resize.outputs[0], classes_add_node[0])
        final_gather[0].dtype = np.float32

        return nms_outputs, final_gather[0]
```
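
To make the index arithmetic in the comments above concrete, here is a minimal sketch of what the Add node feeds into Gather. It assumes the gathered tensor is laid out along axis 0 as `num_proposals * num_classes` entries, as the comments describe for the mask version. The function name `gather_indices` and the example values are hypothetical and are not part of create_onnx.py.

```python
def gather_indices(detected_classes, num_classes):
    """Return one flat axis-0 index per proposal: i * num_classes + class_i.

    detected_classes[i] is the class id predicted for proposal i.
    The offsets i * num_classes play the role of `add_array` in the snippet above.
    """
    offsets = [i * num_classes for i in range(len(detected_classes))]
    return [offset + cls for offset, cls in zip(offsets, detected_classes)]


# Hypothetical example: 3 proposals, 4 classes, detected classes 2, 0 and 3.
print(gather_indices([2, 0, 3], num_classes=4))  # [2, 4, 11]
```

Each proposal owns a contiguous block of `num_classes` entries, so adding the block offset to the detected class id picks exactly one entry per proposal.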

With this I can create an ONNX file that can be converted to a .trt file, but one component is missing: the actual keypoints. In the ONNX file I get from detectron2's export_model.py there is a node named "ConstantOfShape_2057" that outputs "xy_preds", which are the keypoints I need. I have tried to take the output from this node in several ways, but the conversion to .trt always ends with the same error:

[02/13/2023-16:17:11] [E] [TRT] ModelImporter.cpp:728: input: "/proposal_generator/Flatten_3_output_0"
input: "/proposal_generator/Reshape_44_output_0"
output: "/proposal_generator/TopK_3_output_0"
output: "/proposal_generator/TopK_3_output_1"
name: "/proposal_generator/TopK_3"
op_type: "TopK"
attribute {
name: "axis"
i: 1
type: INT
}
attribute {
name: "largest"
i: 1
type: INT
}
attribute {
name: "sorted"
i: 1
type: INT
}

[02/13/2023-16:17:11] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[02/13/2023-16:17:11] [E] [TRT] ModelImporter.cpp:732: ERROR: ModelImporter.cpp:168 In function parseGraph:
[6] Invalid Node - /proposal_generator/TopK_3
This version of TensorRT only supports input K as an initializer. Try applying constant folding on the model using Polygraphy: https://github.com/NVIDIA/TensorRT/tree/master/tools/Polygraphy/examples/cli/surgeon/02_folding_constants

I have tried constant folding and hard-coding the number of keypoints in detectron2's "heatmaps_to_keypoints" function, which seems to be where the problem lies, but without success. I saw that this received some attention quite recently here: facebookresearch/detectron2#4315.

I would like to add the keypoints to the output somehow, but I lack the knowledge to get it to work. I have been using this config (https://github.com/facebookresearch/detectron2/blob/main/configs/COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml) with the corresponding weights from the model zoo.

@zerollzeng
Collaborator

Does it work if you fix the input shape and then do the constant folding?

@zerollzeng zerollzeng self-assigned this Feb 15, 2023
@zerollzeng zerollzeng added the triaged Issue has been triaged by maintainers label Feb 15, 2023
@zerollzeng
Collaborator

The key point here is how to make K an initializer. I think this can be confirmed by checking the ONNX model.
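
One way to do that check, as a rough sketch: walk the graph and report, for every TopK node, whether its second input (K) is stored as an initializer or is produced by another node at runtime. The helper name `describe_topk_k_inputs` is hypothetical. The stand-in graph below only mirrors the tensor names from the error log above, and the assumption that a Reshape node produces K is inferred from the tensor name, not confirmed from the actual model.

```python
from types import SimpleNamespace


def describe_topk_k_inputs(graph):
    """Return (node_name, k_input_name, source) for every TopK node.

    `graph` is expected to look like an onnx GraphProto: `.node` entries have
    op_type, name, input and output; `.initializer` entries have a name.
    `source` is "initializer" when K is stored in the graph, otherwise the
    op_type of the node that produces K (or a fallback string if none is found).
    """
    initializer_names = {init.name for init in graph.initializer}
    producer_op = {out: node.op_type for node in graph.node for out in node.output}

    report = []
    for node in graph.node:
        if node.op_type != "TopK" or len(node.input) < 2:
            continue
        k_name = node.input[1]
        if k_name in initializer_names:
            source = "initializer"
        else:
            source = producer_op.get(k_name, "graph input or unknown")
        report.append((node.name, k_name, source))
    return report


# Stand-in graph (assumption: K comes from a Reshape node, as the name suggests).
demo_graph = SimpleNamespace(
    initializer=[],
    node=[
        SimpleNamespace(op_type="Reshape", name="/proposal_generator/Reshape_44",
                        input=["k_raw", "k_shape"],
                        output=["/proposal_generator/Reshape_44_output_0"]),
        SimpleNamespace(op_type="TopK", name="/proposal_generator/TopK_3",
                        input=["/proposal_generator/Flatten_3_output_0",
                               "/proposal_generator/Reshape_44_output_0"],
                        output=["/proposal_generator/TopK_3_output_0",
                                "/proposal_generator/TopK_3_output_1"]),
    ],
)

for name, k_name, source in describe_topk_k_inputs(demo_graph):
    print(f"{name}: K = {k_name!r} comes from {source}")
```

With a real model, the same function can be called on `onnx.load(path).graph`, where `path` points at the exported file. Any TopK whose source is not "initializer" is a candidate for the parser error shown above.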

@GEngels
Author

GEngels commented Feb 15, 2023

The input shape is fixed to [1344, 1344] as described in the README. I have tried constant folding but then it says that 0 nodes are folded.

@zerollzeng
Collaborator

@nvpohanh do we have a plan to support K as a tensor for ITopKLayer?

@nvpohanh
Collaborator

@rajeevsrao @kevinch-nv to comment on this.

@niqbal996

I have the same issue converting detector networks like FCOS and RetinaNet from Detectron2 -> ONNX -> TensorRT. I have exported the ONNX models with different opsets up to opset 17, but when parsing the ONNX model with TensorRT I get the same error as above:
ERROR:ModelHelper:Failed to load ONNX file: /opt/git/fcos_opset17_simp_python.onnx
ERROR:ModelHelper:In node 547 (parseGraph): INVALID_NODE: Invalid Node - /model/TopK_2
This version of TensorRT only supports input K as an initializer. Try applying constant folding on the model using Polygraphy: https://github.com/NVIDIA/TensorRT/tree/master/tools/Polygraphy/examples/cli/surgeon/02_folding_constants
I also used trtexec and got the same error:

[03/01/2023-12:25:04] [E] [TRT] ModelImporter.cpp:726: While parsing node number 547 [TopK -> "/model/TopK_2_output_0"]:
[03/01/2023-12:25:04] [E] [TRT] ModelImporter.cpp:727: --- Begin node ---
[03/01/2023-12:25:04] [E] [TRT] ModelImporter.cpp:728: input: "/model/GatherND_2_output_0"
input: "/model/Reshape_50_output_0"
output: "/model/TopK_2_output_0"
output: "/model/TopK_2_output_1"
name: "/model/TopK_2"
op_type: "TopK"
attribute {
  name: "axis"
  i: -1
  type: INT
}
attribute {
  name: "largest"
  i: 1
  type: INT
}
attribute {
  name: "sorted"
  i: 1
  type: INT
}

[03/01/2023-12:25:04] [E] [TRT] ModelImporter.cpp:729: --- End node ---
[03/01/2023-12:25:04] [E] [TRT] ModelImporter.cpp:731: ERROR: ModelImporter.cpp:168 In function parseGraph:
[6] Invalid Node - /model/TopK_2
This version of TensorRT only supports input K as an initializer. Try applying constant folding on the model using Polygraphy: https://github.com/NVIDIA/TensorRT/tree/master/tools/Polygraphy/examples/cli/surgeon/02_folding_constants
[03/01/2023-12:25:04] [E] Failed to parse onnx file
[03/01/2023-12:25:04] [I] Finish parsing network model
[03/01/2023-12:25:04] [E] Parsing model failed
[03/01/2023-12:25:04] [E] Failed to create engine from model or file.
[03/01/2023-12:25:04] [E] Engine set up failed

I have already applied constant folding iteratively until no more nodes can be simplified. What can I do in this case? Any help is appreciated. Thank you.

@zerollzeng
Collaborator

zerollzeng commented Mar 6, 2023

TRT 8.6 will support a dynamic K input for TopK; it should be released soon (EA).

@ttyio
Collaborator

ttyio commented Apr 4, 2023

Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!

@ttyio ttyio closed this as completed Apr 4, 2023
@niqbal996

I usually download the corresponding Docker container from NGC, but TRT 8.6 is not available there yet. Will I have to build it myself, or did I miss something? Please advise.

@FilipDrapejkowskiGL

@GEngels have you been able to successfully complete the conversion?

@fettahyildizz

@GEngels @niqbal996 @FilipDrapejkowskiGL Hello, how can I convert this model without TensorRT 8.6, since the latest version that Jetson Xavier supports is TensorRT 8.5?
