Debanjan Mahata, Bloomberg
Keyphrases in Building NLP Driven Financial Products
Automatic identification of keyphrases from text documents is an extreme summarization problem that lies at the intersection of natural language processing (NLP) and information retrieval (IR). Keyphrases capture the most salient topics of the input text and are useful in many downstream tasks, such as classification, clustering, summarization, document recommendation, query expansion, interactive document retrieval, and both semantic and faceted search. This presentation will focus on identifying keyphrases in the context of NLP within the financial domain. We will see how keyphrase identification serves as a crucial step in document-level enrichment, search, discovery and analytics, as well as in information extraction pipelines built on top of financial documents. We will also explore some of the key challenges in this area and some recent research and work done by Bloomberg engineers in this domain.
Bio
Dr. Debanjan Mahata is a Research Engineer at Bloomberg, working at the intersection of natural language processing (NLP), information retrieval, machine learning, information extraction and software engineering. He is responsible for researching challenging problems in these areas and building real-world solutions around them. The resulting applications ship as products in the Bloomberg Terminal, enabling our clients around the globe to make smarter, more informed decisions about their business and financial strategies. He also serves as Adjunct Faculty at IIIT-Delhi, India. His current focus is on keyphrase extraction and generation, and on computational social science problems solved using NLP. He regularly publishes research papers in top-tier venues such as ACL, NAACL-HLT, EMNLP and AAAI. He has served as a program committee member and senior program committee member for major ML and NLP conferences.