
Automated CAD Modeling Sequence Generation from Text Descriptions via Transformer-Based Large Language Models

1Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen 518000, China
2i4AI Ltd, London WC1N 3AX, United Kingdom
ACL 2025

*Equal first authors
Corresponding authors: Shui.Yu@i4AI.org, Yun.Li@ieee.org

Abstract

Designing complex computer-aided design (CAD) models is often time-consuming due to challenges such as computational inefficiency and the difficulty of generating precise models. We propose a novel language-guided framework for industrial design automation to address these issues, integrating large language models (LLMs) with computer-automated design (CAutoD). Through this framework, CAD models are automatically generated from parameters and appearance descriptions, supporting the automation of design tasks during the detailed CAD design phase.

Our approach introduces three key innovations: (1) a semi-automated data annotation pipeline that leverages LLMs and vision-language large models (VLLMs) to generate high-quality parameters and appearance descriptions; (2) a Transformer-based CAD generator (TCADGen) that predicts modeling sequences via dual-channel feature aggregation; (3) an enhanced CAD modeling generation model, called CADLLM, that is designed to refine the generated sequences by incorporating the confidence scores from TCADGen.

Experimental results demonstrate that the proposed approach outperforms traditional methods in both accuracy and efficiency, providing a powerful tool for automating industrial workflows and generating complex CAD models from textual prompts.

Key Innovations

Semi-Automated Annotation Pipeline

Leverages LLMs and VLLMs to generate high-quality CAD descriptions, with automated validation and verification achieving a 98.4% automatic pass rate.

TCADGen Architecture

Transformer-based dual-channel architecture that effectively fuses parameter and appearance features for accurate CAD command sequence generation.

CADLLM Enhancement

Fine-tuned large language model that refines generated sequences using confidence scores, achieving significant performance improvements.

Methodology

Framework Overview

Figure 1: Overall framework of automated CAD modeling from text descriptions, leveraging Transformer-based sequence generation and LLM-driven refinement.

Pipeline Overview

1. Semi-Automated Data Annotation: Utilizes multiple LLMs and VLLMs to generate detailed appearance and parameter descriptions, validated through automated consistency checks and reflection optimization.

2. TCADGen Sequence Generation: A Transformer-based dual-channel architecture processes the parameter and appearance descriptions to generate initial CAD command sequences with confidence scores.

3. CADLLM Enhancement: A fine-tuned large language model refines the generated sequences using the confidence information to produce accurate final CAD commands.
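The confidence-guided hand-off from step 2 to step 3 can be sketched as prompt construction. The template, flag names, and threshold below are all hypothetical; the paper states only that TCADGen's confidence scores are passed to CADLLM, not the exact format:

```python
def build_refinement_prompt(commands, confidences, threshold=0.8):
    """Build a prompt asking the LLM to refine low-confidence commands.

    Hypothetical prompt format: the threshold and LOW-CONFIDENCE flag
    are illustrative, not the paper's actual template.
    """
    lines = []
    for cmd, conf in zip(commands, confidences):
        flag = "LOW-CONFIDENCE" if conf < threshold else "ok"
        lines.append(f"{cmd}  [confidence={conf:.2f}, {flag}]")
    return ("Refine the following CAD command sequence. "
            "Correct commands marked LOW-CONFIDENCE, keep the rest:\n"
            + "\n".join(lines))

prompt = build_refinement_prompt(
    ["<Line>: x=144, y=112", "<Arc>: x=223, y=128, α=64, f=1"],
    [0.97, 0.41])
print("LOW-CONFIDENCE" in prompt)  # True
```

Only the uncertain commands are flagged for correction, which keeps the LLM from rewriting parts of the sequence TCADGen already predicts reliably.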

TCADGen Architecture

Figure 2: TCADGen architecture with dual-channel feature aggregator and CAD command sequence decoder.
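The dual-channel aggregation in Figure 2 can be sketched roughly as follows. This is a simplification under stated assumptions: the real aggregator uses Transformer encoders, while here each channel is merely projected to a shared hidden size and summed, with all shapes illustrative:

```python
import numpy as np

def aggregate_dual_channel(param_feats, appear_feats, w_param, w_appear):
    """Fuse parameter-channel and appearance-channel token features.

    Simplified stand-in for TCADGen's dual-channel feature aggregator:
    project each channel to a shared hidden size, then sum.
    """
    h_param = np.tanh(param_feats @ w_param)      # (seq_len, hidden)
    h_appear = np.tanh(appear_feats @ w_appear)   # (seq_len, hidden)
    return h_param + h_appear                     # fused input for the decoder

rng = np.random.default_rng(0)
param_feats = rng.normal(size=(8, 16))    # 8 tokens of parameter-description features
appear_feats = rng.normal(size=(8, 32))   # 8 tokens of appearance-description features
fused = aggregate_dual_channel(param_feats, appear_feats,
                               rng.normal(size=(16, 64)),
                               rng.normal(size=(32, 64)))
print(fused.shape)  # (8, 64)
```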

CAD Command Sequence (CCS) Representation

Example CCS:
<SOL>
<Line>: x=144, y=112
<Arc>: x=223, y=128, α=64, f=1
<Line>: x=223, y=204
<Extrude>: θ=192, φ=64, γ=192, px=105, py=121, pz=40, s=46, e1=148, e2=128, b=NewBodyFeatureOperation, u=OneSideFeatureExtentType
<EOS>

The CCS consists of 2D sketch commands (Line, Arc, Circle) followed by 3D extrusion operations, enabling precise parametric CAD model generation.
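A minimal parser for this token format might look like the following; it is an illustrative sketch, and the paper's actual tokenizer and parameter encoding may differ:

```python
import re

def parse_ccs(text):
    """Parse a CAD command sequence (CCS) string into (command, params) pairs.

    Illustrative parser for the token format shown above; numeric
    values are quantized integers, others (e.g. b=, u=) stay strings.
    """
    commands = []
    for line in text.strip().splitlines():
        line = line.strip()
        if line in ("<SOL>", "<EOS>"):
            commands.append((line.strip("<>"), {}))
            continue
        m = re.match(r"<(\w+)>:\s*(.*)", line)
        name, args = m.group(1), m.group(2)
        params = {}
        for pair in args.split(","):
            k, v = (s.strip() for s in pair.split("="))
            params[k] = int(v) if v.isdigit() else v
        commands.append((name, params))
    return commands

ccs = """<SOL>
<Line>: x=144, y=112
<Arc>: x=223, y=128, α=64, f=1
<EOS>"""
seq = parse_ccs(ccs)
print(seq[1])  # ('Line', {'x': 144, 'y': 112})
```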

Experimental Results

  • Command Accuracy: 96.6%
  • Average LCS Ratio: 0.983
  • Training Samples: 155,503
  • Auto Annotation Rate: 98.4%

Performance Comparison

Model | Avg Command ACC | F1 | AUC | Line AUC | Arc AUC | Circle AUC | Extrude AUC
DeepCAD | 0.571 | 0.606 | 0.747 | 0.648 | 0.540 | 0.587 | 0.616
Text2CAD | 0.840 | 0.722 | 0.819 | 0.763 | 0.584 | 0.751 | 0.772
TCADGen | 0.890 | 0.771 | 0.854 | 0.808 | 0.682 | 0.837 | 0.781
TCADGen+CADLLM (Ours) | 0.966 | 0.947 | 0.962 | 0.957 | 0.925 | 0.959 | 0.942

Our TCADGen+CADLLM achieves significant improvements across all metrics, demonstrating the effectiveness of the dual-stage approach.

Quality Metrics Comparison

Category | Model | CD ↓ | MMD ↓ | JSD ↓
Transformer-based | DeepCAD | 169.93 | 31.91 | 45.03
Transformer-based | Text2CAD | 142.83 | 28.98 | 40.23
Transformer-based | TCADGen | 120.99 | 21.36 | 35.25
Transformer-based | CAD Translator | - | 2.94 | 10.92
LLM-based | CADFusion | 45.67 | 3.49 | 17.11
LLM-based | DeepCAD+CADLLM | 4.25 | 3.13 | 8.58
LLM-based | Text2CAD+CADLLM | 4.31 | 3.12 | 8.42
LLM-based | TCADGen+CADLLM (Ours) | 3.12 | 2.78 | 8.38
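CD above denotes Chamfer distance between sampled point clouds of the generated and reference models. A minimal implementation of the standard symmetric form is shown below; the paper's exact sampling, normalization, and scaling may differ:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N,3) and b (M,3).

    Standard definition: mean squared nearest-neighbour distance in
    both directions, summed. Brute-force O(N*M) pairwise distances.
    """
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
print(chamfer_distance(a, b))  # 1.0
```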

Application: Motorcycle Frame Design

Real-World Industrial Application

We demonstrate the practical applicability of our framework through motorcycle frame design automation. The framework generates individual components based on textual descriptions, which are then assembled to create the complete frame structure.

  • Rapid Prototyping: Accelerated design iteration from concept to CAD model
  • Precision Engineering: Maintains geometric accuracy required for manufacturing
  • Automated Workflow: Reduces manual CAD modeling time by 80%
  • Scalable Design: Applicable to various mechanical components

Semi-Automated Annotation Pipeline

Data Annotation Pipeline

Our annotation pipeline combines automated LLM-based description generation with quality control mechanisms. The system achieves a 98.4% automatic annotation success rate through multi-modal validation and reflection optimization.
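The generate-validate-reflect loop can be sketched as follows. Here `generate` and `validate` are hypothetical stand-ins for the LLM/VLLM description generator and the automated consistency check; the stubs below only demonstrate the control flow:

```python
def annotate_with_reflection(cad_model, generate, validate, max_rounds=3):
    """Reflection-loop sketch for the semi-automated annotation pipeline.

    Retries generation with validator feedback until the description
    passes automatic checks, else flags it for manual review.
    """
    feedback = None
    for _ in range(max_rounds):
        description = generate(cad_model, feedback)
        ok, feedback = validate(cad_model, description)
        if ok:
            return description, True   # passed automatic checks
    return description, False          # escalate to manual review

# Stub round-trip: the second attempt passes validation.
calls = {"n": 0}
def generate(model, feedback):
    calls["n"] += 1
    return f"draft {calls['n']}"
def validate(model, desc):
    return (desc == "draft 2", "mention the extrusion depth")

desc, passed = annotate_with_reflection("cube.step", generate, validate)
print(desc, passed)  # draft 2 True
```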

Research Questions & Key Findings

RQ1: Annotation Framework Effectiveness

Finding: Our LLM-based semi-automated annotation significantly improves quality over baselines, with command accuracy improving from 80.4% to 89.0% (+8.6 percentage points).

LCS ratio improvement: 0.675 → 1.000 through reflection optimization
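The LCS ratio compares predicted and reference command sequences via their longest common subsequence. The implementation below normalizes by the longer sequence, which is one common convention; the paper may normalize differently:

```python
def lcs_ratio(pred, ref):
    """Longest-common-subsequence length, normalized by the longer sequence.

    Standard O(n*m) dynamic-programming LCS over command tokens.
    """
    n, m = len(pred), len(ref)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if pred[i] == ref[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[n][m] / max(n, m, 1)

print(lcs_ratio(["Line", "Arc", "Line"], ["Line", "Line"]))
```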

RQ2: TCADGen Performance

Finding: TCADGen significantly outperforms existing methods, achieving a 31.9 percentage point improvement over DeepCAD and a 5.0 percentage point improvement over Text2CAD.

Accuracy: DeepCAD (57.1%) → TCADGen (89.0%)

RQ3: LLM Enhancement Effectiveness

Finding: TCADGen+CADLLM achieves 86.4% accuracy and a 0.983 LCS ratio, demonstrating superior performance over direct LLM generation approaches.

Direct LLM: 32.8% → CADLLM+TCADGen: 86.4%

Ablation Studies

Finding: Both CAD-specific BERT fine-tuning and dual-channel architecture contribute significantly to performance, with removal causing 4+ percentage point drops.

Dual-channel architecture essential for complex commands

Limitations & Future Work

Current Limitations

  • Resource Intensive: Semi-automatic annotation requires significant LLM calls
  • Command Distribution: Imbalanced training data affects robustness for rare operations
  • Geometric Constraints: Limited explicit incorporation of structural reasoning
  • Design Phase Focus: Primarily supports detailed design, not conceptual design
  • Scalability: Manual verification still needed for quality control

Future Research Directions

  • Geometric Priors: Integrate constraint-aware learning mechanisms
  • Conceptual Design: Extend framework to early design phases
  • Multi-Modal Input: Support sketch and image-based design inputs
  • Real-Time Generation: Optimize for interactive design workflows
  • Domain Extension: Adapt to other CAD domains (architecture, electronics)
  • Collaborative AI: Human-AI collaborative design interfaces

Citation

@inproceedings{liao2025automated,
  title={Automated CAD Modeling Sequence Generation from Text Descriptions via Transformer-Based Large Language Models},
  author={Liao, Jianxing and Xu, Junyan and Sun, Yatao and Tang, Maowen and He, Sicheng and Liao, Jingxian and Yu, Shui and Li, Yun and Guan, Xiaohong},
  booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year={2025},
  organization={Association for Computational Linguistics}
}

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 52405253), by the Ministry of Science and Technology of China (Grant No. H20240917), and by the Sichuan Province Science and Technology Innovation Seedling Project (Grant No. MZGC20240134).