Institute of General Mechanical Engineering, Faculty of Computer Science and Engineering Science, TH Köln University of Applied Science, Germany
Submission date: 2026-02-05
Final revision date: 2026-02-27
Acceptance date: 2026-03-02
Online publication date: 2026-04-16
Corresponding author
Tom Keller
Institute of General Mechanical Engineering, Faculty of Computer Science and Engineering Science, TH Köln University of Applied Science, Steinmüllerallee 1, 51643, Gummersbach, Germany
Retrieval-Augmented Generation (RAG) is widely used for manufacturing assistance, but its effectiveness depends on selecting retrievable text units. We test whether humans or Large Language Models (LLMs) can judge which case descriptions are better suited as RAG inputs. We constructed 100 synthetic manufacturing service cases, each paired with a realistic query and two comparable problem–solution variants differing in contextual completeness, granularity, and quality. Five engineers and five LLMs chose the variant expected to be more retrievable and useful. As a reference, both variants were indexed in a minimal retrieval setup with one chunk per case and evaluated with MRR@3, treating the case-matching chunk as the only relevant item among distractors. LLMs showed much higher within-group agreement than humans, yet neither cohort consistently matched retrieval-derived winners. Ties were frequent; on non-tied cases, majority decisions fell below chance and were significantly worse than random guessing in one embedding setting, while no individual rater achieved above-chance performance. Overall, the findings indicate that perceived RAG-fitness is not a reliable proxy for retrieval performance and should be grounded in retrieval-based evaluation under the target deployment setup.
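As an illustration of the retrieval reference described above, the following minimal Python sketch computes a reciprocal rank cut off at 3, averages it into MRR@3, and derives a per-case winner between two variants. It assumes each query yields a ranked list of chunk identifiers with exactly one relevant chunk. The function names, the identifiers, and the rule for declaring a winner or a tie are hypothetical and are not taken from the study.

```python
from typing import Sequence


def reciprocal_rank_at_k(ranked_ids: Sequence[str], relevant_id: str, k: int = 3) -> float:
    """Return 1/rank if the relevant chunk is within the top k results, else 0.0."""
    for rank, chunk_id in enumerate(ranked_ids[:k], start=1):
        if chunk_id == relevant_id:
            return 1.0 / rank
    return 0.0


def mrr_at_k(rankings: Sequence[Sequence[str]], relevant_ids: Sequence[str], k: int = 3) -> float:
    """Average the reciprocal rank at k over all queries."""
    scores = [reciprocal_rank_at_k(r, rel, k) for r, rel in zip(rankings, relevant_ids)]
    return sum(scores) / len(scores)


def retrieval_winner(rr_a: float, rr_b: float) -> str:
    """Assumed decision rule: the variant with the higher score wins, equal scores tie."""
    if rr_a > rr_b:
        return "A"
    if rr_b > rr_a:
        return "B"
    return "tie"


# Hypothetical rankings for one query, with variant A or B indexed among distractors.
ranking_with_a = ["distractor_7", "case_12", "distractor_3"]  # case chunk at rank 2
ranking_with_b = ["case_12", "distractor_7", "distractor_3"]  # case chunk at rank 1

rr_a = reciprocal_rank_at_k(ranking_with_a, "case_12")
rr_b = reciprocal_rank_at_k(ranking_with_b, "case_12")
print(rr_a, rr_b, retrieval_winner(rr_a, rr_b))
```

Because the score at a cutoff of 3 can only take the values 1, 1/2, 1/3, or 0, two variants often receive identical scores, which is consistent with the frequent ties reported above.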