TripoSR: Fast, transformer-based 3D reconstruction model from single RGB images.
TripoSR is a transformer-based model designed for rapid 3D object reconstruction from a single RGB image, capable of generating high-quality 3D meshes in under 0.5 seconds on an NVIDIA A100 GPU.
TripoSR targets applications in entertainment, gaming, industrial design, and architecture, where rapid 3D visualization from 2D images is crucial.
As an image-to-3D model, TripoSR takes no text input, so it is language-agnostic.
TripoSR's architecture combines transformer-based components optimized for 3D reconstruction.
The model was trained on a curated subset of the Objaverse dataset, focusing on realistic and high-quality 3D models.
TripoSR outperforms other open-source alternatives in both quantitative and qualitative evaluations, with better Chamfer Distance (lower is better) and F-score (higher is better) results across diverse datasets.
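For readers unfamiliar with the two metrics, the following is a minimal sketch of how Chamfer Distance and F-score are commonly computed between a predicted and a ground-truth point set. It is an illustration, not the TripoSR evaluation code: the function names are hypothetical, and published evaluations may differ in details such as how points are sampled from the meshes, whether distances are squared, how shapes are normalized, and which distance threshold is used.

```python
import math


def nearest_distances(source, target):
    """For each point in source, the Euclidean distance to its nearest point in target."""
    return [min(math.dist(p, q) for q in target) for p in source]


def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance: mean nearest-neighbour distance, summed over both directions."""
    pred_to_gt = nearest_distances(pred, gt)
    gt_to_pred = nearest_distances(gt, pred)
    return sum(pred_to_gt) / len(pred) + sum(gt_to_pred) / len(gt)


def f_score(pred, gt, threshold):
    """Harmonic mean of precision and recall at a given distance threshold."""
    # Precision: share of predicted points that lie close to the ground truth.
    precision = sum(d <= threshold for d in nearest_distances(pred, gt)) / len(pred)
    # Recall: share of ground-truth points that are covered by the prediction.
    recall = sum(d <= threshold for d in nearest_distances(gt, pred)) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy point sets; real evaluations sample many points from each mesh surface.
    predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]
    print(chamfer_distance(predicted, ground_truth))
    print(f_score(predicted, ground_truth, threshold=0.1))
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is fine for a sketch but too slow for dense point clouds; a spatial index such as a k-d tree is the usual replacement.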
import requests


def main():
    # Submit the image for reconstruction.
    # Replace API_KEY and Image_URL with real values before running.
    response = requests.post(
        "https://api.aimlapi.com/v1/images/generations",
        headers={
            "Authorization": "Bearer API_KEY",
            "Content-Type": "application/json",
        },
        json={
            "model": "triposr",
            "image_url": "Image_URL",
            "do_remove_background": True,
            "foreground_ratio": 0.9,
            "mc_resolution": 256,
        },
    )
    response.raise_for_status()
    data = response.json()

    # The response carries a download URL and a file name for the generated mesh.
    url = data["model_mesh"]["url"]
    file_name = data["model_mesh"]["file_name"]

    # Stream the mesh to disk in chunks instead of loading it all into memory.
    mesh_response = requests.get(url, stream=True)
    mesh_response.raise_for_status()
    with open(file_name, "wb") as file:
        for chunk in mesh_response.iter_content(chunk_size=8192):
            file.write(chunk)


if __name__ == "__main__":
    main()
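The example above hardcodes the API key and request parameters. As a hedged sketch of one way to separate them out, the helper below builds the headers and JSON payload for the same endpoint. The function name `build_request` and the environment variable `AIMLAPI_API_KEY` are hypothetical choices for this illustration, and the range checks are assumed sanity checks rather than documented API limits.

```python
import os


def build_request(image_url, api_key=None, remove_background=True,
                  foreground_ratio=0.9, mc_resolution=256):
    """Return (headers, payload) for the TripoSR generation request shown above."""
    # AIMLAPI_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = api_key or os.environ.get("AIMLAPI_API_KEY")
    if not api_key:
        raise ValueError("Pass api_key or set the AIMLAPI_API_KEY environment variable")

    # Assumed sanity checks, not documented API limits.
    if not 0.0 < foreground_ratio <= 1.0:
        raise ValueError("foreground_ratio is expected to be in (0, 1]")
    if mc_resolution <= 0:
        raise ValueError("mc_resolution is expected to be a positive integer")

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": "triposr",
        "image_url": image_url,
        "do_remove_background": remove_background,
        "foreground_ratio": foreground_ratio,
        "mc_resolution": mc_resolution,
    }
    return headers, payload
```

The returned pair can be passed straight to `requests.post(..., headers=headers, json=payload)`, which keeps the secret out of the source file and makes the parameters easy to vary per call.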
TripoSR is released under the MIT license, promoting open-source development and responsible use in AI, computer vision, and computer graphics applications.
License Type: MIT License, permitting commercial, personal, and research use.
With TripoSR, developers can build 3D reconstruction applications that turn a single 2D image into a mesh quickly and accurately, opening new possibilities in domains that require rapid 2D-to-3D conversion.