
Noise in depth gt #4

Open
BlueNine9 opened this issue Oct 10, 2023 · 18 comments
@BlueNine9 commented Oct 10, 2023

Thank you for your excellent work!

I've downloaded the test set from the website and converted the depth image into a point cloud using the following code:

import os
import numpy as np

# Must be set before importing cv2 so OpenCV will read EXR files.
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'

import cv2

# Depth is stored in the first channel of the EXR file.
image = cv2.imread('0000.exr', cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]

pixel_size = image.shape[0] * image.shape[1]

depth_image = image.reshape(pixel_size)

# Flattened pixel coordinates: u along the width, v along the height.
u_coord = np.tile(np.arange(image.shape[1]), (image.shape[0], 1)).reshape(pixel_size)
v_coord = np.tile(np.arange(image.shape[0])[:, None], (1, image.shape[1])).reshape(pixel_size)

fx = 499.9999781443055
fy = 499.9999781443055
cx = image.shape[1] / 2
cy = image.shape[0] / 2

# Camera intrinsic matrix.
k = np.identity(3)
k[0, 2] = cx
k[1, 2] = cy
k[0, 0] = fx
k[1, 1] = fy

# Back-project every pixel into 3D camera coordinates.
p2d = np.array([u_coord, v_coord, np.ones_like(u_coord)])
p3d = np.transpose(np.dot(np.linalg.inv(k), p2d) * depth_image)

# Reorder the axes: camera (x, y, z) -> (z, x, -y).
point_cloud = np.stack([p3d[:, 2], p3d[:, 0], -p3d[:, 1]], axis=1)

from plyfile import PlyData, PlyElement

vertices = np.array([tuple(p) for p in point_cloud],
                    dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
plydata = PlyData([PlyElement.describe(vertices, 'vertex')], text=False)
plydata.write('point_cloud.ply')

The shape of the point cloud looks right; however, there is a noticeable amount of noise in it.

Is this normal, or am I converting the exr file in a wrong way?
I'm opening 0000.exr from small_city_road_horizon_test_depth.tar.

[Screenshots: the converted point cloud, with visible noise]

@yixuanli98 (Contributor)

Sorry for the late reply. Have you tried the depth from other views? Did you observe a similar phenomenon?

@yixuanli98 (Contributor)

Thank you very much for your question! The way you read the depth is right. We have checked our depth map and found the cause. For the best rendering quality, we render with anti-aliasing enabled, which has a smoothing effect on the image. Depth changes dramatically at object edges, and anti-aliasing smooths that change, which produces the noisy point cloud. Below is a zoomed-in, pixel-level comparison of the depth map without and with anti-aliasing.

[Screenshot: pixel-level comparison of depth maps without and with anti-aliasing]

We think depth maps rendered with anti-aliasing align better with the RGB images (which are also rendered with anti-aliasing) and support most use cases, such as depth guidance for reconstruction. But when generating a point cloud from a depth map, anti-aliasing introduces a lot of noise, as you found. We have tried generating a point cloud from depth rendered without anti-aliasing, which solves the problem. We will release a new depth map version without anti-aliasing soon.

[Screenshot: point cloud generated from depth rendered without anti-aliasing]
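
Until that version is available, one possible workaround (a heuristic sketch, not an official fix; the function name and threshold are illustrative) is to drop pixels near large depth discontinuities, since those edge pixels are exactly where anti-aliasing blends foreground and background depths:

import numpy as np

def discontinuity_mask(depth, thresh=50.0):
    # Flag pixels whose local depth gradient is small, i.e. away from object
    # edges where anti-aliasing blends foreground and background depths.
    # `thresh` is a hypothetical tolerance in the same unit as `depth`.
    gy, gx = np.gradient(depth.astype(np.float64))
    return np.hypot(gx, gy) < thresh

# Usage with the variables from the first comment's code:
# mask = discontinuity_mask(image)              # 2-D depth map
# clean_points = point_cloud[mask.reshape(-1)]  # keep only smooth regions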

@ChaoyiZh

Hi guys, can I ask why the depth values from the .exr files can be in the thousands? Is there any scaling applied?

[Screenshot: raw depth values in the thousands]

@yixuanli98 (Contributor)

The unit is cm.

@yixuanli98 (Contributor)

The unit in our pose files (street/pose and aerial/pose) is 10000 cm. If you use the pose files to train NeRF, scale the depth by dividing by 10000.
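
For example, a minimal sketch of the unit handling ('0000.exr' is just an illustrative file name):

import os
os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'  # must be set before importing cv2
import cv2

# Raw EXR depth is in centimetres; the pose files use a unit of 10000 cm,
# so divide by 10000 to match the poses (or by 100 for metres).
depth_cm = cv2.imread('0000.exr', cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
depth_pose_units = depth_cm / 10000.0
depth_metres = depth_cm / 100.0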

@ChaoyiZh commented Nov 15, 2023

I am trying to fuse several RGB-depth pairs to generate point clouds. Here is my core code; please feel free to point out any problems. Following Yixuan's instructions in the previous posts, I divided the depth by 10000 and did the same coordinate conversion, but the result seems weird.

[Screenshot: misaligned fused point cloud]

os.environ['OPENCV_IO_ENABLE_OPENEXR'] = '1'

def depth_to_pc(depth_path, intrinsic, transform_matrix):
    depth_map = cv2.imread(depth_path, cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)[..., 0]
    pixel_size = depth_map.shape[0] * depth_map.shape[1]
    # Depth is in cm; divide by 10000 to match the pose-file unit.
    depth_image = depth_map.reshape(pixel_size) / 10000
    u_coord = np.tile(np.arange(depth_map.shape[1]), (depth_map.shape[0], 1)).reshape(pixel_size)
    v_coord = np.tile(np.arange(depth_map.shape[0]), (depth_map.shape[1], 1)).T.reshape(pixel_size)
    # Camera intrinsic matrix.
    k = np.identity(3)
    k[0, 2] = intrinsic["cx"]
    k[1, 2] = intrinsic["cy"]
    k[0, 0] = intrinsic["fl_x"]
    k[1, 1] = intrinsic["fl_x"]
    # Back-project every pixel into 3D camera coordinates.
    p2d = np.array([u_coord, v_coord, np.ones_like(u_coord)])
    p3d = np.transpose(np.dot(np.linalg.inv(k), p2d) * depth_image).reshape(-1, 3)

    # Left coord to right: flip Y only.
    point_cloud = p3d
    point_cloud[:, 1] = -point_cloud[:, 1]
    ones = np.ones((point_cloud.shape[0], 1))
    homogenous_coordinate = np.hstack((point_cloud, ones))
    transformed_points = (transform_matrix @ homogenous_coordinate.T).T
    return transformed_points[:, :3]

frames = data["frames"]

for frame in frames[:10]:
    rel_img_path = frame["file_path"]
    transform_matrix = np.array(frame["transform_matrix"])

    abs_img_path = os.path.abspath(os.path.join(pose_path, rel_img_path)).replace("/pose", "")
    abs_depth_path = abs_img_path.replace("small_city", "small_city_depth", 1).replace("small_city_road_down", "small_city_road_down_depth").replace("png", "exr")
    if not os.path.exists(abs_img_path) or not os.path.exists(abs_depth_path):
        continue
    color_image = cv2.imread(abs_img_path)
    color_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2RGB)
    point_cloud = depth_to_pc(abs_depth_path, camera_intrinsics, transform_matrix)

    all_points.append(point_cloud)
    all_colors.append(color_image.reshape(-1, 3) / 255.0)

merged_points = np.vstack(all_points)
merged_colors = np.vstack(all_colors)

final_pcd = o3d.geometry.PointCloud()
final_pcd.points = o3d.utility.Vector3dVector(merged_points)
final_pcd.colors = o3d.utility.Vector3dVector(merged_colors)
o3d.io.write_point_cloud("final_merged_point_cloud_2.pcd", final_pcd)

@Goochaozheng

Same problem as @ChaoyiZh. Is there any solution?

@yixuanli98 (Contributor)

(In reply to @ChaoyiZh's code above.)

Have you successfully generated a reasonable point cloud from a single RGB-depth pair?

@yixuanli98 (Contributor)

Besides, could you tell me which json file you use?

@ChaoyiZh

Besides, could you tell me which json file you use?

Yes. I understand we are using the NeRF (OpenGL) convention, which is a right-handed coordinate system, and I can generate a point cloud from a single image-depth pair and its corresponding pose. Here is the file I am using:
small_city/street/pose/block_A/transforms_train.json
The result I posted before was the fusion of the first ten images.

@yixuanli98 (Contributor)

Thanks! I will try to fix this bug.

@Snosixtyboo

(In reply to @ChaoyiZh's code above.)


I think you wanna do

point_cloud[:, 2] = -point_cloud[:, 2]

For me, that makes the points align as expected.

@yzslab commented Jan 9, 2024

(In reply to @ChaoyiZh above.)

Hi, have you solved this problem? I am trying to build a point cloud too, but I face the same problem as you.

@yzslab commented Jan 9, 2024

(In reply to @Snosixtyboo above.)


Hi, would you mind sharing more details about your implementation? I tried point_cloud[:, 2] = -point_cloud[:, 2] but it doesn't work.

@yzslab commented Jan 16, 2024

(In reply to @Snosixtyboo above.)


Thanks! With the operations below, the points align successfully:

point_cloud[:, 1] = -point_cloud[:, 1]
point_cloud[:, 2] = -point_cloud[:, 2]
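
Putting it together, a minimal sketch of the corrected conversion (assuming p3d and transform_matrix as in depth_to_pc above; the helper name is illustrative):

import numpy as np

def camera_to_world(p3d, transform_matrix):
    # OpenCV camera frame (x right, y down, z forward) -> OpenGL/NeRF frame
    # (x right, y up, z backward): negate both Y and Z, then apply the
    # camera-to-world transform from the pose json.
    pts = p3d.copy()
    pts[:, 1] = -pts[:, 1]  # flip Y: down -> up
    pts[:, 2] = -pts[:, 2]  # flip Z: forward -> backward
    ones = np.ones((pts.shape[0], 1))
    return (transform_matrix @ np.hstack((pts, ones)).T).T[:, :3]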

@guangkaixu

(In reply to @yixuanli98's anti-aliasing explanation above.)


Hi, is there any update on the ground-truth depth maps? It seems that the update date of "small_city_depth" is still 3 months ago. Also, has the anti-aliasing problem been solved in "big_city_depth"?

@yixuanli98 (Contributor)

(In reply to @guangkaixu above.)


Currently, we do not provide depth maps without anti-aliasing. If you want to use depth to guide reconstruction, I think the anti-aliased depth maps are more suitable because they align with the RGB images. If you want to use depth to obtain geometry, you can refer to https://github.com/city-super/MatrixCity/blob/main/MatrixCityPlugin/docs/Render-Data.md#depth to render depth maps without anti-aliasing; we already provide the camera sequences, render config, plugin, and detailed documentation.

@yixuanli98 (Contributor)

We have added a script for extracting point clouds from multiple RGB-depth pairs: https://github.com/city-super/MatrixCity/blob/main/scripts/rgbd2pc.py @ChaoyiZh @yzslab @Goochaozheng @Snosixtyboo
