Course Introduction

This graduate-level seminar course covers advanced topics at the intersection of database systems and large language models (LLMs). We will mainly discuss 1) novel database technologies for supporting modern LLMs (e.g., vector databases for retrieval-augmented generation) and 2) the integration of LLMs into databases for analyzing structured and unstructured data, such as PDF documents and text.

Students will read, present, and discuss recent research papers on these topics. In addition, students will form small groups to conduct a research project related to the course topics.

This course assumes that students have completed undergraduate-level courses in data structures, algorithms, databases, and operating systems, or possess equivalent knowledge. Familiarity with at least one general-purpose programming language is required for conducting the research project.


Instructors