
Automated CAD Modeling Sequence Generation from Text Descriptions via Transformer-Based Large Language Models

1Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen 518000, China
2i4AI Ltd, London WC1N 3AX, United Kingdom
ACL 2025

*Equal first authors
Corresponding authors: Shui.Yu@i4AI.org, Yun.Li@ieee.org

Abstract

Designing complex computer-aided design (CAD) models is often time-consuming due to challenges such as computational inefficiency and the difficulty of generating precise models. We propose a novel language-guided framework for industrial design automation to address these issues, integrating large language models (LLMs) with computer-automated design (CAutoD). Through this framework, CAD models are automatically generated from parameters and appearance descriptions, supporting the automation of design tasks during the detailed CAD design phase.

Our approach introduces three key innovations: (1) a semi-automated data annotation pipeline that leverages LLMs and vision-language large models (VLLMs) to generate high-quality parameters and appearance descriptions; (2) a Transformer-based CAD generator (TCADGen) that predicts modeling sequences via dual-channel feature aggregation; (3) an enhanced CAD modeling generation model, called CADLLM, that is designed to refine the generated sequences by incorporating the confidence scores from TCADGen.

Experimental results demonstrate that the proposed approach outperforms traditional methods in both accuracy and efficiency, providing a powerful tool for automating industrial workflows and generating complex CAD models from textual prompts.

Key Innovations

Semi-Automated Annotation Pipeline

Leverages LLMs and VLLMs to generate high-quality CAD descriptions, with automated validation and verification achieving a 98.4% automatic pass rate.

TCADGen Architecture

Transformer-based dual-channel architecture that effectively fuses parameter and appearance features for accurate CAD command sequence generation.

CADLLM Enhancement

Fine-tuned large language model that refines generated sequences using confidence scores, achieving significant performance improvements.

Methodology

Framework Overview

Figure 1: Overall framework of automated CAD modeling from text descriptions, leveraging Transformer-based sequence generation and LLM-driven refinement.

Pipeline Overview

1. Semi-Automated Data Annotation: Utilizes multiple LLMs and VLLMs to generate detailed appearance and parameter descriptions, validated through automated consistency checks and reflection optimization.

2. TCADGen Sequence Generation: A Transformer-based dual-channel architecture processes the parameter and appearance descriptions to generate initial CAD command sequences with confidence scores.

3. CADLLM Enhancement: A fine-tuned large language model refines the generated sequences using the confidence information to produce accurate final CAD commands.
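The confidence-guided hand-off from step 2 to step 3 can be sketched as prompt construction. The template, flag names, and threshold below are all hypothetical; the paper states only that TCADGen's confidence scores are passed to CADLLM, not the exact format:

```python
def build_refinement_prompt(commands, confidences, threshold=0.8):
    """Build a prompt asking the LLM to refine low-confidence commands.

    Hypothetical prompt format: the threshold and LOW-CONFIDENCE flag
    are illustrative, not the paper's actual template.
    """
    lines = []
    for cmd, conf in zip(commands, confidences):
        flag = "LOW-CONFIDENCE" if conf < threshold else "ok"
        lines.append(f"{cmd}  [confidence={conf:.2f}, {flag}]")
    return ("Refine the following CAD command sequence. "
            "Correct commands marked LOW-CONFIDENCE, keep the rest:\n"
            + "\n".join(lines))

prompt = build_refinement_prompt(
    ["<Line>: x=144, y=112", "<Arc>: x=223, y=128, α=64, f=1"],
    [0.97, 0.41])
print("LOW-CONFIDENCE" in prompt)  # True
```

Only the uncertain commands are flagged for correction, which keeps the LLM from rewriting parts of the sequence TCADGen already predicts reliably.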

TCADGen Architecture

Figure 2: TCADGen architecture with dual-channel feature aggregator and CAD command sequence decoder.
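The dual-channel aggregation in Figure 2 can be sketched roughly as follows. This is a simplification under stated assumptions: the real aggregator uses Transformer encoders, while here each channel is merely projected to a shared hidden size and summed, with all shapes illustrative:

```python
import numpy as np

def aggregate_dual_channel(param_feats, appear_feats, w_param, w_appear):
    """Fuse parameter-channel and appearance-channel token features.

    Simplified stand-in for TCADGen's dual-channel feature aggregator:
    project each channel to a shared hidden size, then sum.
    """
    h_param = np.tanh(param_feats @ w_param)      # (seq_len, hidden)
    h_appear = np.tanh(appear_feats @ w_appear)   # (seq_len, hidden)
    return h_param + h_appear                     # fused input for the decoder

rng = np.random.default_rng(0)
param_feats = rng.normal(size=(8, 16))    # 8 tokens of parameter-description features
appear_feats = rng.normal(size=(8, 32))   # 8 tokens of appearance-description features
fused = aggregate_dual_channel(param_feats, appear_feats,
                               rng.normal(size=(16, 64)),
                               rng.normal(size=(32, 64)))
print(fused.shape)  # (8, 64)
```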

CAD Command Sequence (CCS) Representation

Example CCS:
<SOL>
<Line>: x=144, y=112
<Arc>: x=223, y=128, α=64, f=1
<Line>: x=223, y=204
<Extrude>: θ=192, φ=64, γ=192, px=105, py=121, pz=40, s=46, e1=148, e2=128, b=NewBodyFeatureOperation, u=OneSideFeatureExtentType
<EOS>

The CCS consists of 2D sketch commands (Line, Arc, Circle) followed by 3D extrusion operations, enabling precise parametric CAD model generation.
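A minimal parser for this token format might look like the following; it is an illustrative sketch, and the paper's actual tokenizer and parameter encoding may differ:

```python
import re

def parse_ccs(text):
    """Parse a CAD command sequence (CCS) string into (command, params) pairs.

    Illustrative parser for the token format shown above; numeric
    values are quantized integers, others (e.g. b=, u=) stay strings.
    """
    commands = []
    for line in text.strip().splitlines():
        line = line.strip()
        if line in ("<SOL>", "<EOS>"):
            commands.append((line.strip("<>"), {}))
            continue
        m = re.match(r"<(\w+)>:\s*(.*)", line)
        name, args = m.group(1), m.group(2)
        params = {}
        for pair in args.split(","):
            k, v = (s.strip() for s in pair.split("="))
            params[k] = int(v) if v.isdigit() else v
        commands.append((name, params))
    return commands

ccs = """<SOL>
<Line>: x=144, y=112
<Arc>: x=223, y=128, α=64, f=1
<EOS>"""
seq = parse_ccs(ccs)
print(seq[1])  # ('Line', {'x': 144, 'y': 112})
```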

Experimental Results

  • Command Accuracy: 96.6%
  • Average LCS Ratio: 0.983
  • Training Samples: 155,503
  • Auto Annotation Rate: 98.4%

Performance Comparison

Model | Avg Command ACC | F1 | AUC | Line AUC | Arc AUC | Circle AUC | Extrude AUC
DeepCAD | 0.571 | 0.606 | 0.747 | 0.648 | 0.540 | 0.587 | 0.616
Text2CAD | 0.840 | 0.722 | 0.819 | 0.763 | 0.584 | 0.751 | 0.772
TCADGen | 0.890 | 0.771 | 0.854 | 0.808 | 0.682 | 0.837 | 0.781
TCADGen+CADLLM (Ours) | 0.966 | 0.947 | 0.962 | 0.957 | 0.925 | 0.959 | 0.942

Our TCADGen+CADLLM achieves significant improvements across all metrics, demonstrating the effectiveness of the dual-stage approach.

Quality Metrics Comparison

Category | Model | CD ↓ | MMD ↓ | JSD ↓
Transformer-based | DeepCAD | 169.93 | 31.91 | 45.03
Transformer-based | Text2CAD | 142.83 | 28.98 | 40.23
Transformer-based | TCADGen | 120.99 | 21.36 | 35.25
Transformer-based | CAD Translator | - | 2.94 | 10.92
LLM-based | CADFusion | 45.67 | 3.49 | 17.11
LLM-based | DeepCAD+CADLLM | 4.25 | 3.13 | 8.58
LLM-based | Text2CAD+CADLLM | 4.31 | 3.12 | 8.42
LLM-based | TCADGen+CADLLM (Ours) | 3.12 | 2.78 | 8.38
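CD above denotes Chamfer distance between sampled point clouds of the generated and reference models. A minimal implementation of the standard symmetric form is shown below; the paper's exact sampling, normalization, and scaling may differ:

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N,3) and b (M,3).

    Standard definition: mean squared nearest-neighbour distance in
    both directions, summed. Brute-force O(N*M) pairwise distances.
    """
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 0.0]])
print(chamfer_distance(a, b))  # 1.0
```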

Application: Motorcycle Frame Design

Real-World Industrial Application

We demonstrate the practical applicability of our framework through motorcycle frame design automation. The framework generates individual components based on textual descriptions, which are then assembled to create the complete frame structure.

  • Rapid Prototyping: Accelerated design iteration from concept to CAD model
  • Precision Engineering: Maintains geometric accuracy required for manufacturing
  • Automated Workflow: Reduces manual CAD modeling time by 80%
  • Scalable Design: Applicable to various mechanical components

Semi-Automated Annotation Pipeline

Data Annotation Pipeline

Our annotation pipeline combines automated LLM-based description generation with quality control mechanisms. The system achieves a 98.4% automatic annotation success rate through multi-modal validation and reflection optimization.
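The generate-validate-reflect loop can be sketched as follows. Here `generate` and `validate` are hypothetical stand-ins for the LLM/VLLM description generator and the automated consistency check; the stubs below only demonstrate the control flow:

```python
def annotate_with_reflection(cad_model, generate, validate, max_rounds=3):
    """Reflection-loop sketch for the semi-automated annotation pipeline.

    Retries generation with validator feedback until the description
    passes automatic checks, else flags it for manual review.
    """
    feedback = None
    for _ in range(max_rounds):
        description = generate(cad_model, feedback)
        ok, feedback = validate(cad_model, description)
        if ok:
            return description, True   # passed automatic checks
    return description, False          # escalate to manual review

# Stub round-trip: the second attempt passes validation.
calls = {"n": 0}
def generate(model, feedback):
    calls["n"] += 1
    return f"draft {calls['n']}"
def validate(model, desc):
    return (desc == "draft 2", "mention the extrusion depth")

desc, passed = annotate_with_reflection("cube.step", generate, validate)
print(desc, passed)  # draft 2 True
```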

Research Questions & Key Findings

RQ1: Annotation Framework Effectiveness

Finding: Our LLM-based semi-automated annotation significantly improves quality over baselines, with command accuracy improving from 80.4% to 89.0% (+8.6 percentage points).

LCS ratio improvement: 0.675 → 1.000 through reflection optimization
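The LCS ratio compares predicted and reference command sequences via their longest common subsequence. The implementation below normalizes by the longer sequence, which is one common convention; the paper may normalize differently:

```python
def lcs_ratio(pred, ref):
    """Longest-common-subsequence length, normalized by the longer sequence.

    Standard O(n*m) dynamic-programming LCS over command tokens.
    """
    n, m = len(pred), len(ref)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(m):
            dp[i + 1][j + 1] = (dp[i][j] + 1 if pred[i] == ref[j]
                                else max(dp[i][j + 1], dp[i + 1][j]))
    return dp[n][m] / max(n, m, 1)

print(lcs_ratio(["Line", "Arc", "Line"], ["Line", "Line"]))
```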

RQ2: TCADGen Performance

Finding: TCADGen significantly outperforms existing methods, achieving a 31.9 percentage point improvement over DeepCAD and a 5.0 percentage point improvement over Text2CAD.

Accuracy: DeepCAD (57.1%) → TCADGen (89.0%)

RQ3: LLM Enhancement Effectiveness

Finding: TCADGen+CADLLM achieves 86.4% accuracy and a 0.983 LCS ratio, demonstrating superior performance over direct LLM generation approaches.

Direct LLM: 32.8% → CADLLM+TCADGen: 86.4%

Ablation Studies

Finding: Both CAD-specific BERT fine-tuning and dual-channel architecture contribute significantly to performance, with removal causing 4+ percentage point drops.

Dual-channel architecture essential for complex commands

Limitations & Future Work

Current Limitations

  • Resource Intensive: Semi-automatic annotation requires significant LLM calls
  • Command Distribution: Imbalanced training data affects robustness for rare operations
  • Geometric Constraints: Limited explicit incorporation of structural reasoning
  • Design Phase Focus: Primarily supports detailed design, not conceptual design
  • Scalability: Manual verification still needed for quality control

Future Research Directions

  • Geometric Priors: Integrate constraint-aware learning mechanisms
  • Conceptual Design: Extend framework to early design phases
  • Multi-Modal Input: Support sketch and image-based design inputs
  • Real-Time Generation: Optimize for interactive design workflows
  • Domain Extension: Adapt to other CAD domains (architecture, electronics)
  • Collaborative AI: Human-AI collaborative design interfaces

Citation

@inproceedings{liao2025automated,
  title={Automated CAD Modeling Sequence Generation from Text Descriptions via Transformer-Based Large Language Models},
  author={Liao, Jianxing and Xu, Junyan and Sun, Yatao and Tang, Maowen and He, Sicheng and Liao, Jingxian and Yu, Shui and Li, Yun and Guan, Xiaohong},
  booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)},
  year={2025},
  organization={Association for Computational Linguistics}
}

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 52405253), by the Ministry of Science and Technology of China (Grant No. H20240917), and by the Sichuan Province Science and Technology Innovation Seedling Project (Grant No. MZGC20240134).