
Generating Point Cloud #7

Open
vpwang opened this issue Dec 19, 2024 · 2 comments

Comments

@vpwang

vpwang commented Dec 19, 2024

Hi,
Thank you for sharing the code associated with your paper. I am currently working with the code and would like some clarification on generating a point cloud:

  1. Is there a method or script in your codebase to generate the point cloud from the depth and color maps?
  2. Are any preprocessing steps or external tools required for this process?

I appreciate your assistance in resolving this! Thank you for your time and for sharing your excellent work.
@yahskapar
Owner

Hi @vpwang,

When using the test code in this repo (e.g., test_ppsnet.py), you should find that .pfm files are generated for both the ground truth and estimated depth. You can load these .pfm files and then use them in a script such as the one below, which I put together a while back:

import os, cv2, re, sys
import numpy as np
import open3d as o3d
from PIL import Image
from plyfile import PlyData, PlyElement
import plyfile

# Path to the folder containing the data
data_folder = "/playpen-nas-ssd/akshay/3D_medical_vision/PS-SLAM/MVPS/MVPSNet/experiments/cas_mvps_psfeat16_depthweight10_nenet4_negone_ldir_psnlweight5/2023-10-21-00-09-09/eval/2023-10-25-18-03-32/cecum_t1_a_under_review"
specify_depth = "depth_gt"

# Path to the depth folder
depth_folder = os.path.join(data_folder, specify_depth)

# Get a list of all .pfm files in the depth folder
depth_files = [f for f in os.listdir(depth_folder) if f.endswith(".pfm")]

# The number of frames is equal to the number of .pfm files
num_frames = len(depth_files)

# project the reference point cloud into the source view, then project back
def reproject_with_depth(depth_ref, intrinsics_ref, extrinsics_ref, depth_src, intrinsics_src, extrinsics_src):
    width, height = depth_ref.shape[1], depth_ref.shape[0]
    ## step1. project reference pixels to the source view
    # reference view x, y
    x_ref, y_ref = np.meshgrid(np.arange(0, width), np.arange(0, height))
    x_ref, y_ref = x_ref.reshape([-1]), y_ref.reshape([-1])
    # reference 3D space
    xyz_ref = np.matmul(np.linalg.inv(intrinsics_ref),
                        np.vstack((x_ref, y_ref, np.ones_like(x_ref))) * depth_ref.reshape([-1]))
    # source 3D space
    xyz_src = np.matmul(np.matmul(extrinsics_src, np.linalg.inv(extrinsics_ref)),
                        np.vstack((xyz_ref, np.ones_like(x_ref))))[:3]
    # source view x, y
    K_xyz_src = np.matmul(intrinsics_src, xyz_src)
    xy_src = K_xyz_src[:2] / K_xyz_src[2:3]

    ## step2. reproject the source view points with source view depth estimation
    # find the depth estimation of the source view
    x_src = xy_src[0].reshape([height, width]).astype(np.float32)
    y_src = xy_src[1].reshape([height, width]).astype(np.float32)
    sampled_depth_src = cv2.remap(depth_src, x_src, y_src, interpolation=cv2.INTER_LINEAR)
    # mask = sampled_depth_src > 0

    # source 3D space
    # NOTE that we should use sampled source-view depth_here to project back
    xyz_src = np.matmul(np.linalg.inv(intrinsics_src),
                        np.vstack((xy_src, np.ones_like(x_ref))) * sampled_depth_src.reshape([-1]))
    # reference 3D space
    xyz_reprojected = np.matmul(np.matmul(extrinsics_ref, np.linalg.inv(extrinsics_src)),
                                np.vstack((xyz_src, np.ones_like(x_ref))))[:3]
    # source view x, y, depth
    depth_reprojected = xyz_reprojected[2].reshape([height, width]).astype(np.float32)
    K_xyz_reprojected = np.matmul(intrinsics_ref, xyz_reprojected)
    xy_reprojected = K_xyz_reprojected[:2] / K_xyz_reprojected[2:3]
    x_reprojected = xy_reprojected[0].reshape([height, width]).astype(np.float32)
    y_reprojected = xy_reprojected[1].reshape([height, width]).astype(np.float32)

    return depth_reprojected, x_reprojected, y_reprojected, x_src, y_src


def check_geometric_consistency(depth_ref, intrinsics_ref, extrinsics_ref, depth_src, intrinsics_src, extrinsics_src):
    width, height = depth_ref.shape[1], depth_ref.shape[0]
    x_ref, y_ref = np.meshgrid(np.arange(0, width), np.arange(0, height))
    depth_reprojected, x2d_reprojected, y2d_reprojected, x2d_src, y2d_src = reproject_with_depth(depth_ref, intrinsics_ref, extrinsics_ref,
                                                     depth_src, intrinsics_src, extrinsics_src)
    # check |p_reproj-p_1| < 1
    dist = np.sqrt((x2d_reprojected - x_ref) ** 2 + (y2d_reprojected - y_ref) ** 2)

    # check |d_reproj-d_1| / d_1 < 0.01
    depth_diff = np.abs(depth_reprojected - depth_ref)
    relative_depth_diff = depth_diff / depth_ref

    mask = np.logical_and(dist < 1, relative_depth_diff < 0.01)
    depth_reprojected[~mask] = 0

    return mask, depth_reprojected, x2d_src, y2d_src

def read_pfm(filename):
    file = open(filename, 'rb')
    color = None
    width = None
    height = None
    scale = None
    endian = None

    header = file.readline().decode('utf-8').rstrip()
    if header == 'PF':
        color = True
    elif header == 'Pf':
        color = False
    else:
        raise Exception('Not a PFM file.')

    dim_match = re.match(r'^(\d+)\s(\d+)\s$', file.readline().decode('utf-8'))
    if dim_match:
        width, height = map(int, dim_match.groups())
    else:
        raise Exception('Malformed PFM header.')

    scale = float(file.readline().rstrip())
    if scale < 0:  # little-endian
        endian = '<'
        scale = -scale
    else:
        endian = '>'  # big-endian

    data = np.fromfile(file, endian + 'f')
    shape = (height, width, 3) if color else (height, width)

    data = np.reshape(data, shape)
    data = np.flipud(data)
    file.close()
    return data, scale

# read an image
def read_img(filename):
    img = Image.open(filename)
    # scale 0~255 to 0~1
    np_img = np.array(img, dtype=np.float32) / 255.
    return np_img

# read intrinsics and extrinsics
def read_camera_parameters(filename):
    with open(filename) as f:
        lines = f.readlines()
        lines = [line.rstrip() for line in lines]
    # extrinsics: line [1,5), 4x4 matrix
    extrinsics = np.fromstring(' '.join(lines[1:5]), dtype=np.float32, sep=' ').reshape((4, 4))
    # intrinsics: line [7-10), 3x3 matrix
    intrinsics = np.fromstring(' '.join(lines[7:10]), dtype=np.float32, sep=' ').reshape((3, 3))
    return intrinsics, extrinsics

# Function to create a depth image from the raw depth data
def create_depth_image(depth_data, depth_scale):
    # Scale the depth values to match Open3D's expectations
    scaled_depth_data = depth_data / depth_scale

    # Create an Open3D geometry::Image
    depth_image = o3d.geometry.Image(scaled_depth_data)

    return depth_image

# Function to load depth, confidence, and RGB images
def load_data(frame):
    depth = read_pfm(os.path.join(data_folder, specify_depth, f"{frame:08d}.pfm"))[0]
    confidence = read_pfm(os.path.join(data_folder, "confidence", f"{frame:08d}.pfm"))[0]
    color = read_img(os.path.join(data_folder, "images", f"{frame:08d}.png"))
    return depth, confidence, color


# build the pair file
samples = []
scan_name = "cecum_t1_a_under_review"
for target in range(num_frames):
    refs = []
    if target == 0:
        refs.extend([1,2])
    elif target == (num_frames - 1):
        refs.extend([num_frames - 2, num_frames - 3])
    else:
        refs.extend([target-1, target+1])
    sample = {
        'scene': scan_name,
        'target': target,
        'refs': refs
    }
    # print(sample)
    samples.append(sample)

# for the final point cloud
vertexs = []
vertex_colors = []

for sample in samples:
    scene = sample['scene']
    target = sample['target']
    refs = sample['refs']
    
    ref_intrinsics, ref_extrinsics = read_camera_parameters(
            os.path.join(data_folder, 'cams/{:0>8}_cam.txt'.format(target)))
    ref_img = read_img(os.path.join(data_folder, 'images/{:0>8}.png'.format(target)))
    ref_depth_est = read_pfm(os.path.join(data_folder, 'depth_gt/{:0>8}.pfm'.format(target)))[0]
    confidence = read_pfm(os.path.join(data_folder, 'confidence/{:0>8}.pfm'.format(target)))[0]
    photo_mask = confidence > 0.1

    # print(np.shape(ref_img))
    # print(np.shape(ref_depth_est))
    # print(ref_intrinsics)
    # print(ref_extrinsics)
    # print(np.min(ref_depth_est), np.max(ref_depth_est))

    all_srcview_depth_ests = []
    all_srcview_x = []
    all_srcview_y = []
    all_srcview_geomask = []

    # compute the geometric mask
    geo_mask_sum = 0

    for ref in refs:
        src_intrinsics, src_extrinsics = read_camera_parameters(
                os.path.join(data_folder, 'cams/{:0>8}_cam.txt'.format(ref)))
        src_depth_est = read_pfm(os.path.join(data_folder, 'depth_gt/{:0>8}.pfm'.format(ref)))[0]
        geo_mask, depth_reprojected, x2d_src, y2d_src = check_geometric_consistency(ref_depth_est, ref_intrinsics, ref_extrinsics,
                                                                      src_depth_est,
                                                                      src_intrinsics, src_extrinsics)
        geo_mask_sum += geo_mask.astype(np.int32)
        all_srcview_depth_ests.append(depth_reprojected)
        all_srcview_x.append(x2d_src)
        all_srcview_y.append(y2d_src)
        all_srcview_geomask.append(geo_mask)
        
    depth_est_averaged = (sum(all_srcview_depth_ests) + ref_depth_est) / (geo_mask_sum + 1)
    # require at least 2 source views to pass the geometric consistency check
    geo_mask = geo_mask_sum >= 2
    final_mask = np.logical_and(photo_mask, geo_mask)

    height, width = depth_est_averaged.shape[:2]
    x, y = np.meshgrid(np.arange(0, width), np.arange(0, height))
    # valid_points = np.logical_and(final_mask, ~used_mask[ref_view])
    valid_points = final_mask
    print("valid_points", valid_points.mean())
    x, y, depth = x[valid_points], y[valid_points], depth_est_averaged[valid_points]
    #color = ref_img[1:-16:4, 1::4, :][valid_points]  # hardcoded for DTU dataset

    num_stage = 3

    if num_stage == 1:
        color = ref_img[1::4, 1::4, :][valid_points]
    elif num_stage == 2:
        color = ref_img[1::2, 1::2, :][valid_points]
    elif num_stage == 3:
        color = ref_img[valid_points]

    xyz_ref = np.matmul(np.linalg.inv(ref_intrinsics),
                        np.vstack((x, y, np.ones_like(x))) * depth)
    xyz_world = np.matmul(np.linalg.inv(ref_extrinsics),
                            np.vstack((xyz_ref, np.ones_like(x))))[:3]
    vertexs.append(xyz_world.transpose((1, 0)))
    vertex_colors.append((color * 255).astype(np.uint8))

vertexs = np.concatenate(vertexs, axis=0)
vertex_colors = np.concatenate(vertex_colors, axis=0)
vertexs = np.array([tuple(v) for v in vertexs], dtype=[('x', 'f4'), ('y', 'f4'), ('z', 'f4')])
vertex_colors = np.array([tuple(v) for v in vertex_colors], dtype=[('red', 'u1'), ('green', 'u1'), ('blue', 'u1')])

vertex_all = np.empty(len(vertexs), vertexs.dtype.descr + vertex_colors.dtype.descr)
for prop in vertexs.dtype.names:
    vertex_all[prop] = vertexs[prop]
for prop in vertex_colors.dtype.names:
    vertex_all[prop] = vertex_colors[prop]

plyfilename = "output_ply.ply"
el = PlyElement.describe(vertex_all, 'vertex')
PlyData([el]).write(plyfilename)
print("saving the final model to", plyfilename)

# Replace 'your_ply_file.ply' with the path to your PLY file
ply_file_path = plyfilename

try:
    # Try to read the PLY file
    with open(ply_file_path, 'rb') as f:
        plydata = plyfile.PlyData.read(f)

    # Check if a 'vertex' element exists (compare by element name)
    if any(el.name == 'vertex' for el in plydata.elements):
        vertices = plydata['vertex']
        if vertices is not None and len(vertices) > 0:
            print("Vertices exist in the PLY file.")
        else:
            print("No vertices found in the PLY file.")
    else:
        print("No 'vertex' element found in the PLY file.")
except FileNotFoundError:
    print(f"File '{ply_file_path}' not found.")
except Exception as e:
    print(f"An error occurred while processing the PLY file: {e}")

This was just some messy code (sorry for that) I put together almost a year ago at this point. Let me know if you have any particular questions about it, and if you happen to have time to clean it up and make a generalized version, it would be great if you could contribute it back to this repo or an open-source repo that you might be building. You should be able to get started with this script by pointing data_folder to your colon data results folder path (e.g., cecum_t1_a_under_review) and then specifying with specify_depth the exact sub-folder (e.g., depth_est, or whatever it may be named) containing the .pfm files you want to process.
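If it helps to see the core geometry of the script in isolation: the fusion loop lifts valid pixels to camera space with the inverse intrinsics, then to world space with the inverse extrinsics. Below is a minimal sketch of that back-projection step on made-up toy values (the intrinsics, extrinsics, and depth here are illustrative, not from any dataset):

```python
import numpy as np

# Toy pinhole intrinsics (fx = fy = 500, principal point at (320, 240))
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
# Identity extrinsics: camera frame coincides with the world frame
E = np.eye(4)

# A single pixel at the principal point with depth 2.0 should land
# on the optical axis, 2 units in front of the camera.
x, y, depth = np.array([320.0]), np.array([240.0]), np.array([2.0])

# Lift to camera space: K^-1 @ ([u, v, 1]^T * depth)
xyz_cam = np.linalg.inv(K) @ (np.vstack((x, y, np.ones_like(x))) * depth)
# Transform to world space: E^-1 @ [X, Y, Z, 1]^T, keep the first 3 rows
xyz_world = (np.linalg.inv(E) @ np.vstack((xyz_cam, np.ones_like(x))))[:3]
print(xyz_world.ravel())  # -> [0. 0. 2.]
```

This is exactly the `xyz_ref` / `xyz_world` computation near the end of the script, just reduced to one pixel so the expected output is easy to verify by hand.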

Thanks for your interest in this work and hope this helps,

Akshay

@yahskapar
Owner

By the way, generally speaking, once I generate the point cloud (or, if I further process it, a mesh), I just view the .ply or .obj file in MeshLab. It's a pretty nifty tool, and one of its numerous abilities is viewing and further processing point clouds or meshes. It's free to download, and there should be some simple tutorials online for using it.
