3D Generation

Stable TripoSR 3D

The TripoSR API generates high-quality 3D meshes from a single image in under 0.5 seconds (measured on an NVIDIA A100 GPU), using a transformer architecture for efficient reconstruction.

TripoSR: Fast, transformer-based 3D reconstruction model from single RGB images.

Model Overview Card for TripoSR

Basic Information

  • Model Name: TripoSR
  • Developer/Creator: Stability AI and Tripo AI
  • Release Date: March 4, 2024
  • Version: 1.0
  • Model Type: Image-to-3D reconstruction

Description

Overview:

TripoSR is a transformer-based model designed for rapid 3D object reconstruction from a single RGB image, capable of generating high-quality 3D meshes in under 0.5 seconds on an NVIDIA A100 GPU.

Key Features:

  • Fast feed-forward 3D generation
  • Transformer architecture for efficient processing
  • High-quality 3D mesh output
  • Single image input requirement
  • State-of-the-art performance in Chamfer Distance and F-score metrics

Intended Use:

TripoSR is designed for applications in entertainment, gaming, industrial design, and architecture, where rapid 3D visualization from 2D images is crucial.

Language Support:

As an image-to-3D model, TripoSR is language-agnostic.

Technical Details

Model Architecture

TripoSR's architecture has three main components:

  1. Image Encoder:
    • Uses a pre-trained DINOv1 vision transformer
    • Converts the RGB image into latent vectors that encode global and local features
  2. Image-to-Triplane Decoder:
    • Transformer-based decoder
    • Converts the latent vectors into a triplane-NeRF representation
    • Uses attention to learn relationships between the triplane components
  3. Triplane-based Neural Radiance Field (NeRF):
    • Produces the final 3D representation from the triplane features
    • Well suited to objects with complex shapes and textures
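
To make the triplane idea concrete, the sketch below shows how a single 3D point could be turned into a feature vector: the point is projected onto three axis-aligned planes (XY, XZ, YZ), each plane's feature grid is sampled with bilinear interpolation, and the three samples are concatenated. In a triplane NeRF, a small network would then map such a feature to density and color. This is an illustrative pure-Python sketch, not TripoSR's implementation: the function names, the tiny 2x2 grids, the [-1, 1] coordinate range, and concatenation as the combining step are all assumptions.

```python
# Illustrative sketch only: names, grid sizes and the [-1, 1] range are assumptions.

def bilinear_sample(plane, u, v):
    """Sample a 2D grid of feature vectors at continuous coords u, v in [-1, 1].

    plane[row][col] is a list of floats (one feature vector per grid cell).
    u selects the column, v selects the row. Inputs are assumed to be in range.
    """
    rows, cols = len(plane), len(plane[0])
    # Map [-1, 1] to continuous grid indices [0, size - 1].
    x = (u + 1.0) / 2.0 * (cols - 1)
    y = (v + 1.0) / 2.0 * (rows - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    tx, ty = x - x0, y - y0
    channels = len(plane[0][0])
    return [
        (1 - ty) * ((1 - tx) * plane[y0][x0][c] + tx * plane[y0][x1][c])
        + ty * ((1 - tx) * plane[y1][x0][c] + tx * plane[y1][x1][c])
        for c in range(channels)
    ]


def query_triplane(planes, point):
    """Concatenate the features sampled from the XY, XZ and YZ planes."""
    x, y, z = point
    return (
        bilinear_sample(planes["xy"], x, y)
        + bilinear_sample(planes["xz"], x, z)
        + bilinear_sample(planes["yz"], y, z)
    )


# Tiny 2x2 planes with one channel each, just to exercise the lookup.
planes = {
    "xy": [[[0.0], [1.0]], [[2.0], [3.0]]],
    "xz": [[[10.0], [10.0]], [[10.0], [10.0]]],
    "yz": [[[0.0], [0.0]], [[4.0], [4.0]]],
}
feature = query_triplane(planes, (0.0, 0.0, 0.0))
print(feature)
```

Storing three 2D grids instead of one dense 3D grid keeps memory roughly quadratic rather than cubic in resolution, which is one reason triplane representations are popular for fast reconstruction.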

Training Data:

The model was trained on a curated subset of the Objaverse dataset, focusing on realistic and high-quality 3D models.

Performance Metrics:

According to its developers, TripoSR outperforms other open-source alternatives in both quantitative and qualitative evaluations across diverse datasets, with the strongest gains in Chamfer Distance (lower is better) and F-score (higher is better).
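
For readers unfamiliar with these two metrics, here is a minimal sketch of how they are commonly computed between a predicted and a ground-truth point cloud. Chamfer Distance averages nearest-neighbor distances in both directions; F-score is the harmonic mean of precision and recall at a distance threshold. Definitions vary between papers (squared vs. unsquared distances, sum vs. mean), so the exact variant, the threshold value, and the toy data below are assumptions and not TripoSR's evaluation protocol.

```python
import math


def nearest_distances(source, target):
    """For each point in source, the distance to its nearest neighbor in target."""
    return [min(math.dist(p, q) for q in target) for p in source]


def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance: mean nearest-neighbor distance, both directions."""
    d_pred = nearest_distances(pred, gt)
    d_gt = nearest_distances(gt, pred)
    return sum(d_pred) / len(d_pred) + sum(d_gt) / len(d_gt)


def f_score(pred, gt, threshold):
    """Harmonic mean of precision and recall at a distance threshold."""
    precision = sum(d <= threshold for d in nearest_distances(pred, gt)) / len(pred)
    recall = sum(d <= threshold for d in nearest_distances(gt, pred)) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy point clouds standing in for samples from a predicted and a reference mesh.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]

print(chamfer_distance(pred, gt))
print(f_score(pred, gt, threshold=0.5))
```

The brute-force nearest-neighbor search is quadratic in the number of points; for real meshes with many sampled points, a spatial index such as a k-d tree would be the usual choice.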

Comparison to Other Models:

  • Accuracy: Superior performance in 3D reconstruction quality compared to open-source alternatives.
  • Speed: Generates 3D meshes in under 0.5 seconds on an NVIDIA A100 GPU.
  • Robustness: Demonstrates adaptability to diverse imaging conditions by inferring camera parameters rather than relying on explicit conditioning.

Usage

Code Samples:

import requests


def main():
    # Request a 3D mesh for a single input image.
    response = requests.post(
        "https://api.aimlapi.com/v1/images/generations",
        headers={
            "Authorization": "Bearer API_KEY",  # replace API_KEY with your key
            "Content-Type": "application/json",
        },
        json={
            "model": "triposr",
            "image_url": "Image_URL",  # replace with the URL of your input image
            "do_remove_background": True,
            "foreground_ratio": 0.9,  # foreground size relative to the image
            "mc_resolution": 256,  # marching cubes grid resolution
        },
    )
    response.raise_for_status()
    data = response.json()

    # The response points to the generated mesh file.
    url = data["model_mesh"]["url"]
    file_name = data["model_mesh"]["file_name"]

    # Stream the mesh to disk in 8 KiB chunks.
    mesh_response = requests.get(url, stream=True)
    mesh_response.raise_for_status()
    with open(file_name, "wb") as file:
        for chunk in mesh_response.iter_content(chunk_size=8192):
            file.write(chunk)


if __name__ == "__main__":
    main()
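
Because the file name in the response is supplied by the server, it can be worth validating the response before writing anything to disk. The sketch below assumes only the response shape used in the sample above (a `model_mesh` object with `url` and `file_name`). The helper name `extract_mesh_info` is hypothetical, and the example payload, including its URL and `.obj` extension, is made up for illustration.

```python
import os


def extract_mesh_info(data):
    """Return (url, safe_file_name) from a response dict, or raise ValueError."""
    mesh = data.get("model_mesh") if isinstance(data, dict) else None
    if not isinstance(mesh, dict):
        raise ValueError("response has no 'model_mesh' object")
    url = mesh.get("url")
    file_name = mesh.get("file_name")
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("'model_mesh.url' is missing or not an HTTP(S) URL")
    if not isinstance(file_name, str):
        raise ValueError("'model_mesh.file_name' is missing")
    # Drop directory components so the file lands in the current directory.
    safe_name = os.path.basename(file_name.replace("\\", "/"))
    if safe_name in ("", ".", ".."):
        raise ValueError("'model_mesh.file_name' is not a usable file name")
    return url, safe_name


# Made-up payload for illustration; a real response may carry more fields.
example = {
    "model_mesh": {
        "url": "https://example.com/out/mesh.obj",
        "file_name": "../mesh.obj",
    }
}
print(extract_mesh_info(example))
```

In the sample above, the two direct dictionary lookups could be replaced by `url, file_name = extract_mesh_info(data)`.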

Ethical Guidelines:

TripoSR is released under the MIT license, promoting open-source development and responsible use in AI, computer vision, and computer graphics applications.

Licensing

License Type: MIT License, permitting commercial, personal, and research use.

With TripoSR, developers can build 3D reconstruction applications that turn a single 2D image into a 3D mesh in a fraction of a second, which suits any domain that needs rapid 2D-to-3D conversion.
