Fast approximate nearest neighbor counting on depth images

Overview

This post describes how the Ouster python SDK and numpy can be used to efficiently process Ouster structured LidarScans with 2D convolution-style algorithms.

Skip to the end of this post to see the code which does the following:

  1. approximate nearest neighbor counting
  2. data blurring
  3. a tutorial on visualizing multiple data streams at once with SimpleViz using features introduced in the v0.13 SDK release.

Approximate Neighbor Counting
The approach in this post is similar in behavior to the query_ball_point method from Scipy or the radiusSearch method in the Organized Neighbor class in PCL. The idea is to count the number of neighboring points within a defined radius from each point in the point cloud.

There has been interest in fast nearest neighbor searching and counting with Ouster lidars online, which this post may help address:

Ouster lidars are unique among spinning lidars in that they are designed to output structured/organized point clouds which is similar to the 2D range images output by depth cameras like the Kinect or Realsense. Taking advantage of the structured 2D nature of the data allows use of faster versions of familiar algorithms, like PCL’s organized OrganizedNeighbor class versus a normal KDTree, or fast approximate methods, like in this post, or any number of other algorithms intended for structured depth cameras or even traditional cameras - for instance running YOLO on LidarScans.

approximate_neighbor_count()

approximate_neighbor_count() runs a brute force search on an MxN kernel of neighbor points surrounding each point in the 2D range image and returns the number of points (In this case between 0 and 8 for a 3x3 kernel.) that are within a threshold radius of the seed point. It uses true euclidian distance by first calculating an MxNx3 array of XYZ points to run the calculations on. The method runs in under 10ms on an M1 Mac with 128x2048 resolution LidarScans.

Below is an example of the result overlayed on the point cloud. The top grayscale image is the near infrared data, and the bottom grayscale image and the point cloud’s coloring are the neighbor counts for each pixel.

You can see that the number of neighbors drops off as the distance from the sensor increases since the points are naturally farther apart as the beams diverge. Walls and other objects normal to the sensor have lots of neighbors (red), while trees and other objects with less structure have fewer or no neighbors (blue).

A familiar challenge when working with raw lidar data is the noisy points in the near range that stem from interactions with the vehicle body, from rain or fog near the sensor window, or from clutter like cabling. The approximate_neighbor_count() method can provide useful information to filter these points. Below is an example where points within 10 meters that have fewer than 3 neighbors are highlighted in white showing the strong preference for the unstructured trees:

Bonus: 2D Blurring
The approach taken in ‘approximate_neighbor_count’ is useful for other 2D convolutions like blurring, which is implemented in the convolve_blur() function in the provided script and can be seen in the image below - multiple data channels have been blurred, including the range data, which affects the XYZ point cloud. Any 2D kernel-based algorithm developed for cameras or for structured depth cameras can be applied to Ouster data in a similar manner.

The code

from functools import partial
from typing import List
import numpy as np

from ouster.sdk.client import LidarScan, XYZLut, ChanField, SensorInfo, destagger
from ouster.sdk.viz import SimpleViz
from ouster.sdk.open_source import open_source


def update(scan: LidarScan, xyzlut: XYZLut, info: SensorInfo) -> LidarScan:
    # 1. First demonstrate the nearest neighbor search on the scan and save the results
    valid = np.uint8(scan.field(ChanField.RANGE) != 0)
    xyz = xyzlut(scan.field(ChanField.RANGE))
    neighbor_count = approximate_neighbor_count(xyz, valid, info)
    scan.add_field("NEIGHBOR-COUNT", neighbor_count)
    # An example of the types of layered criteria that can be applied to generate a high quality noisy point remover
    scan.add_field("EXAMPLE-CULL", np.uint8(valid & (neighbor_count <= 1) & (scan.field(ChanField.RANGE) < 10_000)))
    return scan


def add_blurry_scan(original_scan: LidarScan, info: SensorInfo) -> List[LidarScan]:
    # 2. Second, let's create a copy of the LidarScan and make the copy blurry to demonstrate
    # the similarities of the processing to 2d convolution.
    blurry_scan = LidarScan(original_scan)  # Make a copy of the existing scan
    valid = np.uint8(blurry_scan.field(ChanField.RANGE) != 0)  # Don't blur the range data into pixels that had no return
    blurry_scan.field(ChanField.RANGE)[:] = convolve_blur(blurry_scan.field(ChanField.RANGE), valid, info)
    # Let's blur some other data channels and overwrite the original data so we can compare to the original_scan
    blurry_scan.field(ChanField.REFLECTIVITY)[:] = convolve_blur(blurry_scan.field(ChanField.REFLECTIVITY), np.ones_like(valid), info)
    blurry_scan.field(ChanField.NEAR_IR)[:] = convolve_blur(blurry_scan.field(ChanField.NEAR_IR), np.ones_like(valid), info)
    # Key here is to return both the original and blurry scan in list form. This matches with the scans.metadata list
    # input in SimpleViz
    return [original_scan, blurry_scan]


def approximate_neighbor_count(xyz: np.ndarray, valid: np.ndarray, info: SensorInfo) -> np.ndarray:
    """
    Calculates the number of pixels that are within the euclidian distance of a
    NEIGHBOR_DISTANCE_M threshold. Runs an exhaustive search, but only considers pixels that are within
    the K_M x K_N 2D kernel - hence the approximate method. The operation performed is similar to a
    2D image convolution and is intended to run much faster than realtime.

    This method does not consider second returns.

    Args:
        xyz: Staggered (M, N, 3) xyz data in units of meters
        valid: Staggered (M, N) array indicating which pixels contain valid XYZ data
        info: A SensorInfo object required to destagger and restagger data

    Returns:
        A staggered 2D image of the approximate neighbor count
    """
    assert(xyz.dtype in (np.float64, np.float32, np.float16))
    assert (valid.dtype == np.uint8)

    NEIGHBOR_DISTANCE_M = 0.1
    NEIGHBOR_DISTANCE_SQUARED = NEIGHBOR_DISTANCE_M ** 2  # Avoid a sqrt later on by squaring the threshold

    K_M, K_N = 3, 3  # The shape of the 2D kernel on which to exhaustively search for neighbors.
    M, N = valid.shape

    # Destagger the data into a human-viewable 2D image so that data is spatially adjacent
    xyz = destagger(info, xyz)
    valid = destagger(info, valid)

    # This step defines which pixels in the 2D kernel we process. We only need to process half the pixels in the kernel
    # because the kernel and calculations performed are symmetric ->> The calculations for neighbor pixels A and B
    # apply equally to both pixel A and B regardless of which is the seed.

    kernel_flag = np.ones((K_M, K_N), np.uint8)
    kernel_flag[K_M // 2+1:, :] = 0  # Don't process the symmetric kernel positions. We copy them in the loop
    kernel_flag[K_M // 2, K_N // 2:] = 0  # Don't process the symmetric kernel positions. We copy them in the loop
    idx_m, idx_n = np.nonzero(kernel_flag)  # Indices of the kernel to process

    # Create padded versions of the inputs to make indexing easier
    orig_slice = slice(K_M//2, K_M//2+M), slice(K_N//2, K_N//2+N)
    padding = ((K_M // 2, K_M // 2), (K_N // 2, K_N // 2), (0, 0))
    xyz_pad = np.pad(xyz, pad_width=padding, mode='constant', constant_values=(0,))
    valid_pad = np.pad(valid, pad_width=padding[0:2], mode='constant', constant_values=(0,))
    neighbor_count_pad = np.zeros(valid_pad.shape, dtype=np.int32)

    for m, n in zip(idx_m, idx_n):
        # Select shifted versions of the inputs that correspond to specific kernel positions
        xyz_pad_shift = xyz_pad[m:m + M, n:n + N, :]
        valid_pad_shift = valid_pad[m:m + M, n:n + N]

        # identify the neighbors by calculating the euclidian distance between the seed xyz and the shifted xyz
        sub = xyz - xyz_pad_shift
        # Writing it out is faster than np.einsum('mnk,mnk->mn', sub, sub) or np.sum(sub**2, axis=-1)
        distance_squared = sub[..., 0] ** 2 + sub[..., 1] ** 2 + sub[..., 2] ** 2
        neighbors = np.int32(valid_pad_shift & valid & (distance_squared <= NEIGHBOR_DISTANCE_SQUARED))

        neighbor_count_pad[orig_slice] += neighbors
        # We can skip half of the processing by taking advantage of the fact that the computation is symmetric.
        # The neighbor count should apply to each pixel in the neighbor pair. Hence we sum the neigbors to the pairs.
        neighbor_count_pad[m:m + M, n:n + N] += neighbors

    # Remove the padding
    neighbor_count = neighbor_count_pad[orig_slice]

    # Re-stagger the data back to its original format that streamed from the sensor
    return destagger(info, neighbor_count, inverse=True)


def convolve_blur(data: np.ndarray, valid: np.ndarray, info: SensorInfo) -> np.ndarray:
    """
   This method applies an edge-aware 2D box convolution using similar fast indexing approach
   to the approximate_neighbor_count() method. The function is edge-aware in the sense that it only takes
   the mean of the pixels that are 'valid' within the kernel

    Args:
        data: Staggered (M, N) data in arbitrary units
        valid: Staggered (M, N) array indicating which pixels contain valid data
        info: A SensorInfo object required to destagger and restagger data

    Returns:
        A staggered 2D image of the blurred data
    """
    assert (valid.dtype == np.uint8)
    K_M, K_N = 5, 5  # The size of the blurring kernel
    M, N = valid.shape

    # Destagger the data into a human viewable 2D image so that data is spatially adjacent
    data = destagger(info, data)
    valid = destagger(info, valid)

    # Process all of the pixels in the kernel
    kernel_flag = np.ones((K_M, K_N), np.uint8)
    idx_m, idx_n = np.nonzero(kernel_flag)

    # Create padded versions of the inputs to make indexing easier
    orig_slice = slice(K_M//2, K_M//2+M), slice(K_N//2, K_N//2+N)
    padding = ((K_M//2, K_M//2), (K_N//2, K_N//2))
    data_pad = np.pad(data, pad_width=padding, mode='constant', constant_values=(0,))
    valid_pad = np.pad(valid, pad_width=padding, mode='constant', constant_values=(0,))

    denominator_pad = np.zeros(valid_pad.shape, dtype=np.float32)
    numerator_pad = np.zeros(valid_pad.shape, dtype=np.float32)

    # accumulate the result by iteratively shifting the image across the window in both axes. This is a convolution
    for m, n in zip(idx_m, idx_n):
        # shift the input range spatially to determine neighbors
        data_pad_shift = data_pad[m:m + M, n:n + N]
        valid_pad_shift = valid_pad[m:m + M, n:n + N]

        # This technically is an edge-aware blur because we are only blurring pixels where
        # we have valid range data on both pixels in the pair.
        denominator_pad[orig_slice] += valid_pad_shift & valid
        numerator_pad[orig_slice] += data_pad_shift * (valid_pad_shift & valid)

    # Calculate the final result as the mean of the values, remove the padding, and convert to the original dtype
    data_blurred = (numerator_pad/denominator_pad)[orig_slice].astype(data.dtype)

    # Re-stagger the data back to its original column format streaming from the sensor
    return destagger(info, data_blurred, inverse=True)


if __name__ == '__main__':
    """
    Download sample data here (PCAP + JSON, or OSF): 
        OS1-128: https://studio.ouster.com/share/3LYMNICNJ840A6MJ
        OS0-128: https://studio.ouster.com/share/B4IZ0LSQE6OMAYXZ
        Even more: https://static.ouster.dev/sensor-docs/index.html#sample-data
    """
    filename = ''  # Add your PCAP, OSF, or live sensor hostname here.
    scans = open_source(filename, cycle=True)
    xyzlut = XYZLut(scans.metadata)  # Create the xyz look up table just once

    # Apply the update function to each LidarScan emitted from the scans ScanSource object
    # This example doesn't use an class to manage the update function
    scans_updated = map(partial(update, xyzlut=xyzlut, info=scans.metadata), scans)

    # Pass the scans iterator to SimpleViz for viewing
    viz = SimpleViz(scans.metadata, rate=0)
    viz.run(scans_updated)

    # CTRL+C the first viz and a new viz will pop up that takes advantage of multi-sensor functionality.

    # Here we are telling the viz there are two scan sources because we've put two scans.metadata into a list.
    viz = SimpleViz([scans.metadata, scans.metadata], rate=0)

    # We map the add_blurry_scan() function which emits two corresponding LidarScans into a list.
    # With these changes we can view both ScanSources as part of a synchronized multi-sensor setup
    two_scan_sources = map(partial(add_blurry_scan, info=scans.metadata), scans_updated)
    viz.run(two_scan_sources)

3 Likes