Hi, I'm Lukas! 👋
I am a Researcher at the ScaDS.AI Center for Scalable Data Science and Artificial Intelligence of Leipzig University. I am passionate about all things in Text Mining, Data Science, and Information Retrieval. I work on generative models for search, and search for generative models.
Professional Experience
- Researcher
Research on Generative Models for Search and Search for Generative Models.
ScaDS.AI
- Researcher
Research on Web Search, Crowdsourcing & Evaluation, and Plagiarism Detection
TEMIR Group, Leipzig University
- Student Assistant
Research Infrastructure, Technical Support, Experiment Assistance
Institute for Sociology, Leipzig University
- Student Assistant
Programming, Typesetting, Research Assistance
Institute for Translatology, Leipzig University
Teaching Experience
I have given seminars and lab sessions at both bachelor's and master's level, covering topics in ML, NLP, and IR:
- Foundations of Machine Learning
- Big Data & Language Technologies
- Advanced Information Retrieval
- Information Retrieval
Education
- M.Sc. Data Science
Leipzig University
- M.Sc. Digital Humanities
Leipzig University
- B.Sc. Digital Humanities
Leipzig University
- B.A. Linguistics
Leipzig University
- High School
Gymnasium Carolinum Bernburg
Publications
- Gienapp, L., Deckers, N., Potthast, M. & Scells, H. (2024). Learning Effective Representations for Retrieval Using Self-Distillation with Adaptive Relevance Margins. CoRR, abs/2407.21515. Methods
- Fröbe, M., Scells, H., Elstner, T., Akiki, C., Gienapp, L., Reimer, J., MacAvaney, S., Stein, B., Hagen, M. & Potthast, M. (2024). Resources for Combining Teaching and Research in Information Retrieval Courses. ACM. Teaching
- Gienapp, L., Scells, H., Deckers, N., Bevendorff, J., Wang, S., Kiesel, J., Syed, S., Fröbe, M., Zuccon, G., Stein, B., Hagen, M. & Potthast, M. (2024). Evaluating Generative Ad Hoc Information Retrieval. ACM. Evaluation
- Elstner, T., Loebe, F., Ajjour, Y., Akiki, C., Bondarenko, A., Fröbe, M., Gienapp, L., Kolyada, N., Mohr, J., Sandfuchs, S., Wiegmann, M., Frochte, J., Ferro, N., Hofmann, S., Stein, B., Hagen, M. & Potthast, M. (2023). Shared Tasks as Tutorials: A Methodical Approach. EAAI. Teaching
- Reimer, J., Schmidt, S., Fröbe, M., Gienapp, L., Scells, H., Stein, B., Hagen, M. & Potthast, M. (2023). The Archive Query Log: Mining Millions of Search Result Pages of Hundreds of Search Engines from 25 Years of Web Archives. ACM. Data
- Bevendorff, J., Sauer, P., Gienapp, L., Kircheis, W., Körner, E., Stein, B. & Potthast, M. (2023). SMAuC – The Scientific Multi-Authorship Corpus. IEEE. Data
- Fröbe, M., Gienapp, L., Potthast, M. & Hagen, M. (2023). Bootstrapped nDCG Estimation in the Presence of Unjudged Documents. Springer. Evaluation
- Gienapp, L., Kircheis, W., Sievers, B., Stein, B. & Potthast, M. (2023). A large dataset of scientific text reuse in Open-Access publications. Scientific Data, 10(1). Data
- Bondarenko, A., Fröbe, M., Gienapp, L., Pugachev, A., Reimer, J., Schlatt, F., Artemova, E., Potthast, M., Stein, B., Braslavski, P. & Hagen, M. (2022). Webis at TREC 2022: Deep Learning and Health Misinformation. National Institute of Standards and Technology (NIST). Methods
- Gienapp, L., Fröbe, M., Hagen, M. & Potthast, M. (2022). Sparse Pairwise Re-ranking with Pre-trained Transformers. ACM. Methods
- Akiki, C., Gienapp, L. & Potthast, M. (2022). Tracking Discourse Influence in Darknet Forums. CoRR, abs/2202.02081. Analyses
- Fröbe, M., Hagen, M., Bevendorff, J., Völske, M., Stein, B., Schröder, C., Wagner, R., Gienapp, L. & Potthast, M. (2021). The Impact of Main Content Extraction on Near-Duplicate Detection. International Open Search Symposium. Analyses
- Bondarenko, A., Gienapp, L., Fröbe, M., Beloucif, M., Ajjour, Y., Panchenko, A., Biemann, C., Stein, B., Wachsmuth, H., Potthast, M. & Hagen, M. (2021). Overview of Touché 2021: Argument Retrieval. Springer. Methods
- Fröbe, M., Bevendorff, J., Gienapp, L., Völske, M., Stein, B., Potthast, M. & Hagen, M. (2021). CopyCat: Near-Duplicates within and between the ClueWeb and the Common Crawl. ACM. Data
- Gienapp, L., Stein, B., Hagen, M. & Potthast, M. (2020). Estimating Topic Difficulty Using Normalized Discounted Cumulated Gain. ACM. Evaluation
- Gienapp, L., Fröbe, M., Hagen, M. & Potthast, M. (2020). The Impact of Negative Relevance Judgments on NDCG. ACM. Evaluation
- Bondarenko, A., Fröbe, M., Beloucif, M., Gienapp, L., Ajjour, Y., Panchenko, A., Biemann, C., Stein, B., Wachsmuth, H., Potthast, M. & Hagen, M. (2020). Overview of Touché 2020: Argument Retrieval. Springer. Methods
- Gienapp, L., Stein, B., Hagen, M. & Potthast, M. (2020). Efficient Pairwise Annotation of Argument Quality. Association for Computational Linguistics. Evaluation, Data, Methods
- Potthast, M., Gienapp, L., Euchner, F., Heilenkötter, N., Weidmann, N., Wachsmuth, H., Stein, B. & Hagen, M. (2019). Argument Search: Assessing Argument Relevance. ACM. Evaluation
Awards & Grants
- SIGIR Student Travel Grant of the Special Interest Group on Information Retrieval (SIGIR) for the 29th ACM International Conference on Information and Knowledge Management (CIKM 2020) for the paper Estimating Topic Difficulty Using Normalized Discounted Cumulated Gain (Gienapp, Stein et al., 2020).
- SIGIR Student Travel Grant of the Special Interest Group on Information Retrieval (SIGIR) for the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) for the paper Argument Search: Assessing Argument Relevance (Potthast, Gienapp et al., 2019).