TEDRASIM is a research demonstrator from a larger project exploring automated analysis of technical drawings.
Traditionally, technical drawings are created first in order to manufacture or model a three-dimensional object. TEDRASIM investigates the reverse direction: starting from a 3D scan or a photo of an object, the system automatically derives the corresponding technical drawing.
The overall TEDRASIM project explores two complementary approaches:
- Mesh-based pipeline – extracting projections and feature edges directly from 3D scan data
- Vision–Language Model (VLM) approach – generating 3D models from textual descriptions using fine-tuned multimodal models
This repository contains the machine learning–based VLM approach.
The geometry-based pipeline is implemented in a separate repository:
Note: This repository contains a simplified public version of the project; parts of the original pipeline and the core technical drawing similarity search system are not included.
The goal of this module is to generate technical drawings from images using Vision–Language Models (VLMs) fine-tuned on a custom dataset.
The project currently includes:
- Synthetic dataset generation
- Fine-tuning of a VLM
- A pipeline that converts images into 3D models
- TODO: Pipeline that converts 3D models into technical drawings
You can find the dataset used for the fine-tuning here.
The approach relies on the deliberately constrained nature of the dataset, which was designed for a hands-on experimental setup in a lab environment. All objects are constructed from a finite set of geometric primitives.
These primitives can be represented as a structured JSON scene graph that describes object types and their spatial relationships.
This JSON representation acts as the intermediate representation between the VLM and the geometric solver.
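As an illustration, a scene graph for a cube with a cylinder stacked on top might look like the snippet below. The field names (`objects`, `relations`, `on_top_of`, etc.) are hypothetical and chosen only to convey the idea; the actual schema used by the project may differ:

```python
import json

# Hypothetical scene graph: field names are illustrative,
# not the project's actual schema.
scene = {
    "objects": [
        {"id": "cube_1", "type": "cube", "size": [2.0, 2.0, 2.0]},
        {"id": "cylinder_1", "type": "cylinder", "radius": 0.5, "height": 1.0},
    ],
    "relations": [
        {"type": "on_top_of", "subject": "cylinder_1", "object": "cube_1"},
    ],
}

print(json.dumps(scene, indent=2))
```

Because the object vocabulary is closed (a finite set of primitives), such a representation is both easy for a VLM to emit as text and unambiguous for a downstream solver to consume.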
Target pipeline:
Photo of an object
→ VLM
→ structured JSON scene
→ solver
→ 3D model
→ technical drawing
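To make the solver step concrete, here is a toy sketch that stacks primitives along the z-axis from `on_top_of` relations. All names and the stacking rule are illustrative; the real solver resolves full 3D spatial relationships, not just vertical stacking:

```python
# Toy solver sketch: derives a base elevation for each primitive
# from "on_top_of" relations. Illustrative only.
def solve(scene: dict) -> dict:
    # Height of each primitive: size[2] for boxes, "height" for cylinders.
    heights = {
        o["id"]: o.get("size", [0.0, 0.0, o.get("height", 0.0)])[2]
        for o in scene["objects"]
    }
    z = {o["id"]: 0.0 for o in scene["objects"]}  # everything starts on the ground
    for rel in scene.get("relations", []):
        if rel["type"] == "on_top_of":
            below = rel["object"]
            z[rel["subject"]] = z[below] + heights[below]
    return z

scene = {
    "objects": [
        {"id": "cube_1", "type": "cube", "size": [2.0, 2.0, 2.0]},
        {"id": "cylinder_1", "type": "cylinder", "radius": 0.5, "height": 1.0},
    ],
    "relations": [
        {"type": "on_top_of", "subject": "cylinder_1", "object": "cube_1"},
    ],
}
print(solve(scene))  # cylinder sits at z = 2.0, on top of the 2.0-high cube
```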
Please see the notebooks for an example.
tedrasim_vlm/
├── notebooks/ # pipeline demos and fine-tuning experiments
├── data_example/ # example dataset structure
│ ├── raw_data/
│ └── training/
│ ├── real_dataset/
│ └── synthetic_dataset/
│
├── src/tedrasim/
│ ├── json_to_3dmodel_pipeline/ # JSON → 3D model solver
│ ├── synth_dataset_generation/ # scene generation + rendering
│ ├── gui_apps/ # annotation / evaluation tools
│ └── viz/ # visualization utilities
│
├── assets/prompts/ # prompt templates for VLM training/inference
├── research/ # experimental notes and references
├── pyproject.toml
├── uv.lock
└── README.md
The base model is InternVL_3.5_8B. It was fine-tuned on our custom dataset (TODO: Link to Huggingface) using the notebook notebooks/finetune_internvl_3_5.ipynb, applying LoRA to the LLM part of the model.
Fine-tuning was based on the tutorial:
https://github.com/Arseny5/InternVL-3.5-QLoRA-Fine-tune
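For background, LoRA avoids training the full weight matrix W: instead, two small rank-r factors A and B are trained, and the effective weight becomes W + (alpha / r) · B·A. The actual fine-tuning happens in the notebook; the plain-Python toy below only illustrates why this is cheap in trainable parameters (the dimensions and values are made up):

```python
# Toy LoRA illustration: a dense update to a d_out x d_in matrix is
# replaced by factors B (d_out x r) and A (r x d_in), scaled by alpha / r.
d_out, d_in, r, alpha = 8, 8, 2, 4

full_params = d_out * d_in          # parameters of a dense weight update
lora_params = d_out * r + r * d_in  # parameters of the two LoRA factors

def matmul(X, Y):
    """Naive matrix product, enough for this toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

# All-ones factors just to show the shape of the update.
B = [[1.0] * r for _ in range(d_out)]
A = [[1.0] * d_in for _ in range(r)]
delta = [[(alpha / r) * v for v in row] for row in matmul(B, A)]

print(full_params, lora_params)  # 64 vs 32 trainable parameters
print(delta[0][0])               # 4.0: each entry is (alpha / r) * r here
```

With realistic dimensions (thousands per side) and small r, the saving is far more dramatic than this 2x toy case.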
gui_apps/ contains small applications for experimenting with the pipeline.
They allow:
- sending requests to different models
- visualizing predicted JSON scenes
- interactively inspecting generated 3D models
The project uses uv for dependency management.
Create a virtual environment:
uv venv
source .venv/bin/activate

Install dependencies:

uv sync

Requirements:
- Python 3.10+
- PyTorch
- HuggingFace Transformers
Optional:
- Blender is required for synthetic dataset generation (scene rendering)