Research Outcomes
Details
Paper
Enhancing Building Semantics Preservation in AI Model Training with Large Language Model Encodings
Year
5th
Category
Component Technology 2
Research institution
Yonsei University (연세대학교)
Type 2
Conference presentation
Paper title
Enhancing Building Semantics Preservation in AI Model Training with Large Language Model Encodings
Journal name
42nd International Symposium on Automation and Robotics in Construction (ISARC 2025)
Publication date
2025.07.28
Pages
1004-1011
First author
Suhyung Jang (장수형)
Corresponding author
Ghang Lee (이강)
Co-author
Jaekun Lee (이재근)
Abstract
Accurate representation of building semantics, encompassing both generic object types and specific subtypes, is essential for effective AI model training in the architecture, engineering, construction, and operation (AECO) industry. Conventional encoding methods (e.g., one-hot) often fail to convey the nuanced relationships among closely related subtypes, limiting AI’s semantic comprehension. To address this limitation, this study proposes a novel training approach that employs large language model (LLM) embeddings (e.g., OpenAI GPT and Meta LLaMA) as encodings to preserve finer distinctions in building semantics. We evaluated the proposed method by training GraphSAGE models to classify 42 building object subtypes across five high-rise residential building information models (BIMs). Various embedding dimensions were tested, including original high-dimensional LLM embeddings (1,536, 3,072, or 4,096) and 1,024-dimensional compacted embeddings generated via the Matryoshka representation model. Experimental results demonstrated that LLM encodings outperformed the conventional one-hot baseline, with the “llama-3 (compacted)” embedding achieving a weighted average F1-score of 0.8766, compared to 0.8475 for one-hot encoding. The results underscore the promise of leveraging LLM-based encodings to enhance AI’s ability to interpret complex, domain-specific building semantics. As the capabilities of LLMs and dimensionality reduction techniques continue to evolve, this approach holds considerable potential for broad application in semantic elaboration tasks throughout the AECO industry.
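The contrast the abstract draws between one-hot and embedding-based encodings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three subtype labels, the hand-written 4-dimensional toy vectors (real LLM embeddings have 1,536 to 4,096 dimensions and come from an embedding model), and the helper functions. The compaction step assumes that shortening a Matryoshka-style embedding amounts to keeping a prefix of the vector and re-normalizing it; the procedure used in the paper may differ.

```python
import math


def one_hot(index, num_classes):
    """Return a one-hot vector with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale to unit length
    (assumed form of Matryoshka-style compaction)."""
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix))
    return [x / norm for x in prefix]


# Hypothetical subtype labels, not taken from the paper's 42 classes.
labels = ["interior door", "fire door", "curtain wall"]

# One-hot codes: every pair of distinct labels is orthogonal.
one_hot_codes = {name: one_hot(i, len(labels)) for i, name in enumerate(labels)}

# Hand-written toy vectors standing in for LLM embeddings of the label text.
toy_embeddings = {
    "interior door": [0.9, 0.1, 0.3, 0.1],
    "fire door": [0.8, 0.2, 0.4, 0.2],
    "curtain wall": [0.1, 0.9, 0.2, 0.3],
}

# Compact each toy embedding from 4 to 2 dimensions.
compacted = {name: truncate_and_normalize(v, 2) for name, v in toy_embeddings.items()}

# One-hot treats the two door subtypes as unrelated.
print(cosine(one_hot_codes["interior door"], one_hot_codes["fire door"]))
# The dense vectors place the two door subtypes closer to each other
# than to the curtain wall, before and after compaction.
print(cosine(toy_embeddings["interior door"], toy_embeddings["fire door"]))
print(cosine(toy_embeddings["interior door"], toy_embeddings["curtain wall"]))
print(cosine(compacted["interior door"], compacted["fire door"]))
print(cosine(compacted["interior door"], compacted["curtain wall"]))
```

In a setup like the one described, such vectors would serve as the class encodings used when training the GraphSAGE classifier; the sketch only shows why a dense encoding can carry relationships among subtypes that a one-hot code cannot.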