gien.app

Hi, I'm Lukas! 👋

Lukas Gienapp

I am a Researcher at the ScaDS.AI Center for Scalable Data Science and Artificial Intelligence of Leipzig University. I am passionate about all things Text Mining, Data Science, and Information Retrieval. I work on generative models for search, and search for generative models.

Professional Experience

  • Researcher

    Research on Generative Models for Search and Search for Generative Models.

    Deep Semantic Learning Group, Kassel University

  • Researcher

    Research on Generative Models for Search and Search for Generative Models.

    ScaDS.AI Centre for Scalable Data Science & Artificial Intelligence, Leipzig

  • Researcher

    Research on Web Search, Crowdsourcing & Evaluation, and Plagiarism Detection

    Text Mining & Retrieval Group, Leipzig University

  • Student Assistant

    Research Infrastructure, Technical Support, Experiment Assistance

    Institute for Sociology, Leipzig University

  • Student Assistant

    Programming, Typesetting, Research Assistance

    Institute for Translatology, Leipzig University

Teaching Experience

I have given seminars and lab sessions at both bachelor's and master's level, covering topics in ML, NLP, and IR.

Education

  • M.Sc. Data Science

    Leipzig University

  • M.Sc. Digital Humanities

    Leipzig University

  • B.Sc. Digital Humanities

    Leipzig University

  • B.A. Linguistics

    Leipzig University

  • High School

    Gymnasium Carolinum Bernburg

Publications

Publications are tagged by research type: Data, Methods, Evaluation, Analyses, Teaching.

  • The German Commons – 154 Billion Tokens of Openly Licensed Text for German Language Models. CoRR, abs/2510.13996. [Data]
  • Topic-Specific Classifiers are Better Relevance Judges than Prompted LLMs. CoRR, abs/2510.04633. [Methods, Evaluation]
  • Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins. 275–285. [Methods]
  • The Viability of Crowdsourcing for RAG Evaluation. ACM SIGIR 2025. [Data, Evaluation]
  • ml4xcube: Machine Learning Toolkits for Earth System Data Cubes. 28302–28311. [Methods]
  • Resources for Combining Teaching and Research in Information Retrieval Courses. ACM. [Teaching]
  • Evaluating Generative Ad Hoc Information Retrieval. ACM. [Evaluation]
  • Shared Tasks as Tutorials: A Methodical Approach. AAAI Press. [Teaching]
  • The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives. ACM. [Data]
  • SMAuC - The Scientific Multi-Authorship Corpus. IEEE. [Data]
  • Bootstrapped nDCG Estimation in the Presence of Unjudged Documents. Springer. [Evaluation]
  • A large dataset of scientific text reuse in Open-Access publications. Scientific Data, 10(1). [Data]
  • Webis at TREC 2022: Deep Learning and Health Misinformation. National Institute of Standards and Technology (NIST). [Methods]
  • Sparse Pairwise Re-ranking with Pre-trained Transformers. ACM. [Methods]
  • Tracking Discourse Influence in Darknet Forums. CoRR, abs/2202.02081. [Analyses]
  • The Impact of Main Content Extraction on Near-Duplicate Detection. International Open Search Symposium. [Analyses]
  • Overview of Touché 2021: Argument Retrieval. Springer. [Methods]
  • CopyCat: Near-Duplicates Within and Between the ClueWeb and the Common Crawl. ACM. [Data]
  • Estimating Topic Difficulty Using Normalized Discounted Cumulated Gain. ACM CIKM 2020. [Evaluation]
  • The Impact of Negative Relevance Judgments on NDCG. ACM. [Evaluation]
  • Overview of Touché 2020: Argument Retrieval. Springer. [Methods]
  • Efficient Pairwise Annotation of Argument Quality. Association for Computational Linguistics. [Evaluation, Data, Methods]
  • Argument Search: Assessing Argument Relevance. ACM SIGIR 2019. [Evaluation]

Awards & Grants

  • Best Paper Honourable Mention Award at the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2025) for the paper The Viability of Crowdsourcing for RAG Evaluation.
  • SIGIR Student Travel Grant at the 29th ACM International Conference on Information and Knowledge Management (CIKM 2020) for the paper Estimating Topic Difficulty Using Normalized Discounted Cumulated Gain.
  • SIGIR Student Travel Grant at the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) for the paper Argument Search: Assessing Argument Relevance.