Efficient and reliable AI-driven molecular simulation
Computational tools such as molecular dynamics (MD) have revolutionized the way we study biomolecules; however, they are severely limited by the computational cost of running simulations on biological time- and length-scales. Various coarse-grained (CG) models have been developed that rely on simpler representations of molecular systems than atomistic MD. While these models are difficult to configure using physical intuition, we have shown that state-of-the-art machine learning methods make it possible to design accurate and efficient CG models that correctly reproduce protein dynamics. By enhancing both our training dataset and network architecture, we hope to produce a “universal” CG model to study biological systems.
High-level electronic-structure calculations of novel materials with the all-electron code exciting
Converging calculations is a common need in the ab initio materials-science community. This tedious and resource-intensive process can be largely avoided if well-validated recommendations are available. In order to create a recommender system to assist users, benchmark data are required. This project addresses this need. It evaluates the convergence behavior of electronic properties for a dataset of 10 materials that are promising for optoelectronic applications.
OpenGPT-X - Evaluating the Performance of Large Language Models
OpenGPT-X aims to create and train open large language models (LLMs) for European languages. Existing language models focus primarily on English and hence perform poorly when used for any of the other commonly spoken European languages.
From large-scale benchmarking of multilingual LLMs to introducing the Teuken-7B models, our research uncovers how tokenization and balanced datasets enhance cross-lingual performance. Join us in exploring transparent and reproducible innovations shaping the future of multilingual AI.
Semi-Automatic Subject Classification with Basisklassifikation
The goal of this project is to use algorithms to predict classes of the library classification system “Basisklassifikation” (which can be translated as “basic classification”). A library classification system is a taxonomy of predefined classes that represent disciplines, subdisciplines, themes or types of publications. Subject librarians assign one or more of these classes to each publication, allowing both end users and retrieval systems to use these annotations to find publications. The input data consist mainly of bibliographic data, such as the title, the name of the publisher, the year of publication and the language of the publication. The algorithms should suggest several classes, which are then analyzed by