The Graduate School
Wednesday, December 11, 2024 at 2:00 PM until 3:00 PM, US Eastern Standard Time, UTC -05:00
247 Hesburgh Library, Navari Family Center for Digital Scholars, Hesburgh Libraries, IN, United States
Divide and conquer a corpus of texts in order to better understand it as a whole.
Topic modeling is a process of dividing and conquering a collection of texts in order to better understand the collection as a whole. Given a corpus of documents (books, articles, Web pages, etc.), topic modeling divides the corpus into sub-corpora and identifies each sub-corpus with a theme. This process is sometimes useful for identifying genres, authors, and/or subjects in a body of literature.
This hands-on workshop will demonstrate and facilitate the use of a free Java-based program called Topic Modeling Tool. Participants are expected to bring their own computer with Java already installed, which most computers probably have.