@Samahu, I ran YOLO inference on reflectivity and near-IR images with an initial resolution of 2048x128, resizing them to 1024x128 for better segmentation. To make sure the results are accurate and to resolve a few open issues, I would appreciate clarification and guidance on the following:
- Distortion Correction: I applied a distortion correction assuming a 45-degree vertical field of view (FOV) for the spherical-to-rectilinear conversion. Is this approach valid, or should I consider alternative methods?
- 2D to 3D Projection: I want to project 2D inferences (e.g., bounding boxes of detected objects) onto the 3D point cloud to estimate the poses of moving objects such as cars. How can I accurately align the resized images with the 3D data? (My current understanding is sketched in the code right after this list.)
- Metadata Usage: To maintain the correspondence between 2D inferences and 3D points, I am considering resizing the images using metadata, such as azimuth and elevation angles, for better spatial alignment. Would that be an appropriate approach?
- Required Information for Projection:
  - What additional information is critical for projecting 2D data onto the 3D point cloud (e.g., range data, calibration parameters)?
  - How can I accurately map inferred 2D pixel ranges to their corresponding 3D spatial positions in the point cloud?
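For the 2D-to-3D part, this is a minimal sketch of the pixel-to-point mapping I have in mind. It assumes the Ouster SDK's `XYZLut` and `destagger` helpers (the exact import path may differ between SDK versions) and that my resizing only drops every second column, so a column in the 1024-wide image maps back to column `2 * col` of the original 2048-wide grid. The helper name `detection_pixel_to_xyz` is just mine for illustration:

```python
import numpy as np
from ouster.sdk import client  # or `from ouster import client` on older SDK versions


def detection_pixel_to_xyz(scan, metadata, row, col_resized, resize_factor=2):
    """Map one pixel of the resized, destaggered 2D image to a 3D point (sketch)."""
    # Lookup table from range measurements to Cartesian points; in practice this
    # should be built once per sensor rather than per call
    xyzlut = client.XYZLut(metadata)
    # Destagger so the XYZ array has the same row/column layout as the 2D image
    xyz = client.destagger(metadata, xyzlut(scan))  # shape (H, W, 3)
    col_original = int(col_resized * resize_factor)  # undo the horizontal down-sampling
    return xyz[row, col_original]  # (x, y, z) in the sensor frame; zeros mean no return
```

Is that the intended correspondence, or does the destaggering/resizing break it in a way I am not seeing?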
```python
import cv2
import matplotlib.pyplot as plt
import numpy as np


def correct_lidar_image_distortion(input_image, vertical_fov=45, horizontal_fov=360, interpolation=cv2.INTER_LINEAR):
    if len(input_image.shape) == 3:  # Convert color images to grayscale
        input_image = cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY)
    h, w = input_image.shape

    # Initialize the remap grids
    map_x = np.zeros((h, w), dtype=np.float32)
    map_y = np.zeros((h, w), dtype=np.float32)

    # Calculate angular resolution
    vertical_angle_per_pixel = vertical_fov / h      # ~0.3515° per pixel
    horizontal_angle_per_pixel = horizontal_fov / w  # ~0.176° per pixel
    new_vertical_angle_per_pixel = horizontal_angle_per_pixel  # ~0.176° per pixel (currently unused)

    # Populate remapping grids based on angular distribution
    for y in range(h):
        for x in range(w):
            # Calculate the original angles for each pixel
            theta = (x * horizontal_angle_per_pixel) - (horizontal_fov / 2)
            phi = (y * vertical_angle_per_pixel) - (vertical_fov / 2)
            # Map to corrected positions (assuming cylindrical-to-rectilinear correction if needed)
            corrected_x = ((theta + (horizontal_fov / 2)) / horizontal_fov) * w
            corrected_y = ((phi + (vertical_fov / 2)) / vertical_fov) * h
            # Fill the remapping arrays
            map_x[y, x] = corrected_x
            map_y[y, x] = corrected_y

    # Apply the remapping to correct the distortion
    corrected_image = cv2.remap(input_image, map_x, map_y, interpolation)
    return corrected_image


def resize_to_half_width(input_image):
    """
    Reduces the width of the image by dropping every 2nd column (down-sample horizontally by 2).
    :param input_image: Input image to be resized
    :return: Image with half the original width and the same height
    """
    # Select every alternate column to reduce width by half
    resized_image = input_image[:, ::2]
    return resized_image
```
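On the metadata question: instead of assuming a uniform 45° vertical FOV as above, I was also thinking of building the vertical remap directly from the per-beam elevation angles in the sensor metadata. A rough sketch of that idea, assuming `beam_altitude_angles` from the Ouster `SensorInfo` ordered top-to-bottom like the destaggered rows (both the ordering and the helper name are my assumptions):

```python
import cv2
import numpy as np


def remap_rows_by_beam_altitude(input_image, beam_altitude_angles):
    """Resample image rows onto a uniform elevation grid using per-beam angles (sketch)."""
    h, w = input_image.shape[:2]
    angles = np.asarray(beam_altitude_angles, dtype=np.float32)  # one elevation angle per row
    # Target elevations: h rows evenly spaced between the highest and lowest beam
    target_angles = np.linspace(angles.max(), angles.min(), h)
    # For each target elevation, find the (fractional) source row at that elevation;
    # np.interp needs ascending x values, so sort the beams by angle first
    order = np.argsort(angles)
    src_rows = np.interp(target_angles, angles[order], order.astype(np.float32))
    # Rows move, columns stay; cv2.remap expects float32 maps of shape (h, w)
    map_x = np.tile(np.arange(w, dtype=np.float32), (h, 1))
    map_y = np.tile(src_rows.astype(np.float32)[:, None], (1, w))
    return cv2.remap(input_image, map_x, map_y, cv2.INTER_LINEAR)
```

Would working on such an elevation-uniform image make the later 2D-to-3D mapping cleaner, or is it unnecessary for the segmentation itself?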
Then I ran inference with the following code (for reference):
```python
from functools import partial

import cv2
import numpy as np
import torch
from ultralytics import YOLO
from ultralytics.engine.results import Results
# ChanField, LidarScan, ScanSource, destagger, AutoExposure and
# BeamUniformityCorrector come from the Ouster SDK (exact import paths
# depend on the installed ouster-sdk version).


class ScanIterator:
    if torch.cuda.is_available():
        DEVICE = "cuda"
    elif torch.backends.mps.is_available():
        DEVICE = "mps"
    else:
        DEVICE = "cpu"

    def __init__(self, scans: ScanSource):
        self._metadata = scans.metadata
        # Load YOLO pretrained models
        self.model_yolo_nir = YOLO("yolov9c-seg.pt").to(device=self.DEVICE)
        self.model_yolo_ref = YOLO("yolov9c-seg.pt").to(device=self.DEVICE)
        # Define classes to output results for
        self.name_to_class = {value: key for key, value in self.model_yolo_ref.names.items()}
        self.classes_to_detect = [
            self.name_to_class['person'],
            self.name_to_class['car'],
            self.name_to_class['traffic light'],
            self.name_to_class['bus']
        ]
        # Channels to post-process (NEAR_IR and REFLECTIVITY), each with its own
        # exposure correction, beam uniformity correction and YOLO model
        self.paired_list = [
            [ChanField.NEAR_IR, AutoExposure(), BeamUniformityCorrector(), self.model_yolo_nir],
            [ChanField.REFLECTIVITY, AutoExposure(), BeamUniformityCorrector(), self.model_yolo_ref]
        ]
        # Map the self._update function onto the scans iterator
        self._scans = map(partial(self._update), scans)

    # Return the scans iterator when iterating over the class
    def __iter__(self):
        return self._scans
    def _update(self, scan: LidarScan) -> LidarScan:
        resized_width = 1024
        resized_height = 128
        # One stacked canvas holding the annotated NIR and reflectivity images
        stacked_result_rgb = np.empty((resized_height * len(self.paired_list), resized_width, 3), np.uint8)
        for i, (field, ae, buc, model) in enumerate(self.paired_list):
            # Destagger the data to get a normal-looking 2D image
            img = destagger(self._metadata, scan.field(field)).astype(np.float32)
            img = correct_lidar_image_distortion(img)
            img = resize_to_half_width(img)
            img = cv2.resize(img, (resized_width, resized_height), interpolation=cv2.INTER_LINEAR)
            # Make the image more uniform and better exposed
            ae(img)
            buc(img, update_state=True)
            # Convert to 3-channel uint8 for YOLO inference
            img_rgb = np.repeat(np.uint8(np.clip(np.rint(img * 255), 0, 255))[..., np.newaxis], 3, axis=-1)
            # Run YOLO inference with tracking enabled
            results: Results = next(
                model.track(
                    [img_rgb],
                    stream=True,   # Reduce memory requirements for streaming
                    persist=True,  # Maintain tracks across sequential frames
                    conf=0.1,
                    imgsz=[img.shape[0], img.shape[1]],
                    classes=self.classes_to_detect
                )
            ).cpu()
            # Plot results with bounding boxes and masks
            img_rgb = results.plot(boxes=True, masks=True, line_width=1, font_size=3)
            # Save stacked RGB images for OpenCV viewing
            stacked_result_rgb[i * resized_height:(i + 1) * resized_height, ...] = img_rgb
        # Display in OpenCV
        cv2.imshow("YOLO Results", stacked_result_rgb)
        cv2.waitKey(1)
        return scan
```
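Finally, this is roughly how I picture closing the loop from a YOLO detection back to the point cloud, e.g. to get a rough 3D position for a tracked car. Again only a sketch: `xyz_destaggered` would come from the `XYZLut` approach sketched earlier, the box coordinates are in the resized 1024x128 image, and `estimate_box_positions` is a name I made up:

```python
import numpy as np


def estimate_box_positions(results, xyz_destaggered, resize_factor=2):
    """Rough 3D centroid per detected box, from the destaggered (H, 2048, 3) XYZ array."""
    positions = []
    for box in results.boxes:
        x1, y1, x2, y2 = box.xyxy[0].int().tolist()  # box corners in the resized image
        # Undo the horizontal down-sampling to index the full-width point cloud
        points = xyz_destaggered[y1:y2, x1 * resize_factor:x2 * resize_factor].reshape(-1, 3)
        points = points[np.linalg.norm(points, axis=1) > 0]  # drop pixels with no return
        positions.append(points.mean(axis=0) if len(points) else None)
    return positions
```

Is averaging the in-box points a reasonable first approximation, or should I be using the segmentation masks and the range data more directly?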