Reducing the triangle count in complex 3D models is a basic geometry preprocessing step in graphics pipelines such as efficient rendering and interactive editing. However, most existing mesh simplification methods exhibit a few issues. Firstly, they often lead to self-intersections during decimation, a major issue for applications such as 3D printing and soft-body simulation. Second, to perform simplification on a mesh in the wild, one would first need to perform re-meshing, which often suffers from surface shifts and losses of sharp features. Finally, existing re-meshing and simplification methods can take minutes when processing large-scale meshes, limiting their applications in practice. To address the challenges, we introduce a novel GPU-based mesh optimization approach containing three key components: (1) a parallel re-meshing algorithm to turn meshes in the wild into watertight, manifold, and intersection-free ones, and reduce the prevalence of poorly shaped triangles; (2) a robust parallel simplification algorithm with intersection-free guarantees; (3) an optimization-based safe projection algorithm to realign the simplified mesh with the input, eliminating the surface shift introduced by re-meshing and recovering the original sharp features. The algorithm demonstrates remarkable efficiency, simplifying a 2-million-face mesh to 20k triangles in 3 seconds on RTX4090. We evaluated the approach on the Thingi10K dataset and showcased its exceptional performance in geometry preservation and speed.
Our first stage converts arbitrary inputs into watertight, manifold, and intersection-free meshes for downstream processing.
We begin by computing a narrow-band unsigned distance field (UDF) around the input mesh using a voxel grid hierarchy.
This step removes unnecessary voxel–triangle pairs and enables efficient GPU computation.
We also provide a standalone CUDA implementation for this stage.
This stage reduces the number of triangles efficiently while preserving manifoldness and preventing self-intersections. We propose a parallel simplification method based on an iterative edge-collapse algorithm. The edge cost combines QEM with edge-length and skinny-triangle penalties, balancing output quality with the need to avoid poorly shaped elements. To parallelize the process, the mesh is divided into independent regions. In each iteration, all regions perform edge collapses simultaneously on the GPU, since a single collapse only affects its local region.
After each iteration, a parallel self-intersection detector checks for intersections. Any invalid collapses are reverted, ensuring that the mesh remains manifold and intersection-free throughout the process.
In the final stage, we safely project the simplified mesh back toward the original input, strictly avoiding self-intersections while recovering sharp features and preserving mesh quality. We optimize an energy function that balances fidelity to the input surface (Chamfer/Hausdorff distance) with elastic and bending terms that preserve triangle quality and shape. Inspired by IPC, we employ a barrier-based, collision-aware solver with continuous collision detection to guarantee intersection-free projection. This process restores sharp features, reduces surface error, and maintains high mesh quality.
@misc{oh2025pamo,
title={PaMO: Parallel Mesh Optimization for Intersection-Free Low-Poly Modeling on the GPU},
author={Seonghun Oh and Xiaodi Yuan and Xinyue Wei and Ruoxi Shi and Fanbo Xiang and Minghua Liu and Hao Su},
year={2025},
eprint={2509.05595},
archivePrefix={arXiv},
primaryClass={cs.GR},
url={https://arxiv.org/abs/2509.05595},
}