Partha Talukdar: Google DeepMind

I lead the Languages Group at Google DeepMind India, and also serve as a lead in Gemini Multilinguality. Our goal is to make sure Gemini and AI benefit all of humanity while ensuring language and cultural differences are not barriers anymore.

My group works on all aspects of LLM development (e.g., pretraining, posttraining, and evaluations), and our work has impacted Gemini models (2.0+). We have also enhanced various Google products such as Gemini App, Assistant, Search, etc. My group also made foundational contributions to Google Translate's largest expansion of 110 languages representing more than 600 million speakers.

We believe in advancing the field by publishing our research as well as by contributing datasets (e.g., Vaani (vaani.iisc.ac.in) in collaboration with IISc) and models (e.g., MuRIL) to the ecosystem.

My recent research has focused on multilingual and multimodal learning, learning in low-resource settings, and multicultural learning -- all in the context of LLMs.

Previously, I founded KENOME to help enterprises make sense of dark unstructured data using Knowledge Graphs. I was also a faculty member at IISc Bangalore working on NLP, ML and Knowledge Graphs.

News & Activities

Infrequently updated ...
Nov 1, 2024: Elected Fellow of Indian National Academy of Engineering (INAE)
Feb 11, 2023: Awarded the ACM India Early Career 2022 Award
June 1, 2020: Action Editor, Transactions of the ACL (TACL)
Dec 21, 2019: Promoted to Associate Professor.
Nov 4, 2019: EMNLP 2019 Tutorial on Graph Neural Networks for NLP [Tutorial Page, Slides]
Jul 31, 2019: Our paper received Outstanding Paper Award at ACL 2019. [1, 2]
Jul 2, 2019: Received Tenure at IISc, Bangalore.
June 10, 2019: Presenting an invited tutorial at ICML 2019
May 28, 2019: Speaking at Goldman Sachs Research, New York
May 20-22, 2019: Serving as Workshop Co-chair for AKBC 2019
May 10, 2019: Speaking at Ericsson
Oct 2018 - April 2019: Talks at LG, Goldman Sachs, Accenture, Infosys, Rakuten, NVIDIA, Intuit, Dubai Police Symposium, ITech Law, and Mindtree.
October 11, 2018: Participating in the launch of World Economic Forum Centre for the Fourth Industrial Revolution India
Sep 7, 2018: Keynote at MLSLP 2018
Aug 19, 2018: Presenting a tutorial at KDD 2018
Feb 5, 2018: Attending the Nobel Prize Series, India 2018 at the Rashtrapati Bhawan, New Delhi [Press]
Oct 8, 2017: Speaking at TEDxNITW
Jul 27-28, 2017: Participating in Google NLU Workshop
Jan 24, 2017: Talk at the University of Melbourne, Australia
Dec 2016: Thanks to IBM for an IBM Faculty Award
Dec 2016: Thanks to Google for a Focused Research Award
Oct 2016: Serving as PC co-chair for CODS 2017
July 2, 2016: Talk at the ML Workshop at IIT Kanpur.
June 29, 2016: Talk at the Allen Institute for Artificial Intelligence (AI2), Seattle, USA.
Jun 13, 2016: Talk at the Distinguished Lecture Series, IBM Research India.
May 16-17, 2016: Participating in Google workshops on NLU and NLP.
Mar 31, 2016: Speaking at the IISc Big Data Initiative
Oct 26, 2015: Serving as Senior PC Member in IJCAI 2016
Oct 8, 2015: Talk at Intel India Academic Forum (IAF 2015)
Sep 28, 2015: Serving as an Area co-chair for ACL 2016.
Sep 20, 2015: Thanks to Google for a Focused Faculty Research Award.
Aug 21, 2015: Talk at Indian Statistical Institute (ISI), Kolkata
Jul 6, 2015: Received an Accenture Open Innovation grant [Press Release]
Jun 19, 2015: Talk at Google Research, Mountain View
May 14, 2015: Talk at Flipkart, India.
Feb 28 - Mar 4, 2015: Speaking and attending the MLCN 2015 workshop at IIT, Kharagpur.
Feb 18, 2015: Speaking at Ericsson Research, India.
Jan 16, 2015: Speaking at the Alumni Research Talk, BITS Pilani.
Dec 04, 2014: Speaking at the Center for Neuroscience, IISc Seminar.
Nov 21, 2014: Talk at CSE, IIT Madras.
Nov 05, 2014: Talk at Google, Bangalore.
Oct 29, 2014: Talk at TextGraphs 2014
Oct 15, 2014: Talk at Xerox Research, India
Oct 09, 2014: Talk at IBM I-CARE 2014 Big Data Analytics and Cognitive Computing Winter School
Sep 19, 2014: Talk at the Machine Learning Group, Amazon Bangalore.
Aug 18-22, 2014: Co-teaching at ESSLLI 2014
Aug 01, 2014: Talk at IBM Research, Bangalore.
Jul 14, 2014: Joined Indian Institute of Science (IISc), Bangalore as an Assistant Professor.
Jun 27, 2014: Talk at the SIGMOD 2014 workshop WACCK 2014.
May 29, 2014: Received a Google Focused Research Award to work on Goal-directed Ontology Extension for Natural Language Understanding.
Apr 16, 2014: Attending SWANK 2014
Apr 15, 2014: Talk at Google Research, Mountain View
Mar 28, 2014: Participating in the NYAS 2014 Annual ML Symposium.
Nov 14-15, 2013: Participating in the Language Understanding and Knowledge Discovery workshop organized by Google.
Oct 27-31, 2013: Co-organizing AKBC 2013 and presenting our paper at CIKM 2013.
Oct 11, 2013: Speaking on Never Ending Language Learning (NELL) at the Mid-Atlantic Student Colloquium on Speech, Language, and Learning.
May 28, 2013: Giving a talk at the NLP Lab at Simon Fraser University
May 27, 2013: Amar Subramanya and I are presenting a tutorial on Graph-based Semi-Supervised Learning Algorithms for Speech Processing at ICASSP 2013.
July 25, 2012: Speaking on graph-based semi-supervised learning at the Lisbon Machine Learning School 2012.
July 08, 2012: Amar Subramanya and I are presenting a tutorial on Graph-based Semi-Supervised Learning Algorithms for NLP at ACL 2012.
May 16, 2012: Serving as a grand awards judge at the ISEF 2012
I am serving as Information Extraction (IE) area co-chair for EMNLP-CoNLL 2012.
I'm co-organizing a NAACL-HLT 2012 joint workshop on Automatic Knowledge Extraction from Text. Please consider submitting and attending!
Junto label propagation toolkit is now out (Nov 21, 2011). It has an API and Hadoop implementations of three graph-based semi-supervised learning algorithms.
Nov 17-18, 2011: Participating in the Knowledge Discovery Workshop at Google.
Since May 2011, I have been a Postdoctoral Fellow working with Tom Mitchell in the Machine Learning Department at Carnegie Mellon University
I spent an exciting year at the Microsoft Research Search Labs during the period March 2010 - March 2011.
I defended my thesis on Feb 24, 2010. I was fortunate to have been advised by Fernando Pereira, Zack Ives, and Mark Liberman.

Research

My recent research has focused on multilingual and multimodal learning, learning in low-resource settings, and multicultural learning -- all in the context of LLMs.

My former research group at IISc: Machine And Language Learning (MALL) Lab

To learn more about AI@IISc, visit here. Former research groups at CMU: Read the Web (CMU), CMU Brain Research Group

Past Research Groups: Search Labs (Microsoft Research), Structured Learning at Penn, Penn Research in Machine Learning (PRIML), Penn Natural Language Processing, Penn BioIE Group.

Tutorials

Graph Neural Networks for Natural Language Processing (EMNLP 2019 and CODS-COMAD 2020)

Never-Ending Learning [Videos: [1][2]] (Invited tutorial at ICML 2019, with Tom Mitchell (CMU))

Knowledge Extraction and Inference from Text (KDD 2018)

Knowledge Extraction and Inference from Text (CIKM 2017, with Soumen Chakrabarti (IIT Bombay))

Graph-based Semi-supervised Learning (ACL 2012 and ICASSP 2013, with Amarnag Subramanya (Google))

Publications

[Google Scholar] [Arxiv]

Book

Graph-based Semi-Supervised Learning. Amarnag Subramanya, Partha Pratim Talukdar. Morgan & Claypool Publishers. [Amazon]

I don't actively maintain the publications list; please see the Google Scholar listing for a more up-to-date version.

2024

LLM Augmented LLMs: Expanding Capabilities through Composition
Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar
ICLR 2024

IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages [Dataset]
Harman Singh, Nitish Gupta, Shikhar Bharadwaj, Dinesh Tewari, Partha Talukdar
ACL 2024

UGIF: UI Grounded Instruction Following [UGIF Dataset]
Sagar Gubbi Venkatesh, Partha Talukdar, Srini Narayanan
NAACL 2024 Findings

Multimodal Modeling For Spoken Language Identification
Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa
ICASSP 2024

HEAL: Unlocking the Potential of Learning on Hypergraphs Enriched with Attributes and Layers
Naganand Yadati, Tarun Kumar, Deepak Maurya, Balaraman Ravindran, Partha Talukdar
Learning on Graphs Conference 2024

2023

Self-influence Guided Data Reweighing for Language Model Pre-training
Megh Thakkar, Sriram Ganapathy, Shikhar Vashishth, Tolga Bolukbasi, Sarath Chandar, Partha Talukdar
EMNLP 2023

XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Sebastian Ruder, Jonathan H. Clark, Alexander Gutkin, Mihir Kale, Min Ma, Massimo Nicosia, Shruti Rijhwani, Parker Riley, Jean-Michel A. Sarr, Xinyi Wang, John Wieting, Nitish Gupta, Anna Katanova, Christo Kirov, Dana L. Dickinson, Brian Roark, Bidisha Samanta, Connie Tao, David I. Adelani, Vera Axelrod, Isaac Caswell, Colin Cherry, Dan Garrette, Reeve Ingle, Melvin Johnson, Dmitry Panteleev, Partha Talukdar
EMNLP 2023 Findings

Label Aware Speech Representation Learning For Language Identification
Shikhar Vashishth, Shikhar Bharadwaj, Sriram Ganapathy, Ankur Bapna, Min Ma, Wei Han, Vera Axelrod, Partha Talukdar
Interspeech 2023

Parameter-Efficient Finetuning for Robust Continual Multilingual Learning
Kartikeya Badola, Shachi Dave, Partha Talukdar
ACL 2023 Findings

Bootstrapping Multilingual Semantic Parsers using Large Language Models
Abhijeet Awasthi, Nitish Gupta, Bidisha Samanta, Shachi Dave, Sunita Sarawagi, Partha Talukdar
EACL 2023

TwiRGCN: Temporally Weighted Graph Convolution for Question Answering over Temporal Knowledge Graphs
Aditya Sharma, Apoorv Saxena, Chitrank Gupta, Seyed Mehran Kazemi, Partha Talukdar, Soumen Chakrabarti
EACL 2023

Evaluating the Diversity, Equity, and Inclusion of NLP Technology: A Case Study for Indian Languages
Simran Khanuja, Sebastian Ruder, Partha Talukdar
Findings of EACL 2023

Salient Span Masking for Temporal Understanding
Jeremy Cole, Aditi Chaudhary, Bhuwan Dhingra and Partha Talukdar
EACL 2023

2022

Re-contextualizing Fairness in NLP: The Case of India
Shaily Bhatt, Sunipa Dev, Partha Talukdar, Shachi Dave, Vinodkumar Prabhakaran
AACL-IJCNLP 2022

When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer
Ameet Deshpande, Partha Talukdar, Karthik Narasimhan
NAACL 2022

Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages
Vaidehi Patil, Partha Talukdar, Sunita Sarawagi
ACL 2022

Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings
Kalpesh Krishna, Deepak Nathani, Xavier Garcia, Bidisha Samanta, Partha Talukdar
ACL 2022

GPU-accelerated connectome discovery at scale
Varsha Sreenivasan, Sawan Kumar, Franco Pestilli, Partha Talukdar, Devarajan Sridharan
Nature Computational Science, volume 2, pages 298–306 (2022)

Walking with PACE - Personalized and Automated Coaching Engine
Madhurima Vardhan, Narayan Hegde, Srujana Merugu, Shantanu Prabhat, Deepak Nathani, Martin Seneviratne, Nur Muhammad, Pranay Reddy, Sriram Lakshminarasimhan, Rahul Singh, Karina Lorenzana, Eshan Motwani, Partha Talukdar, Aravindan Raghuveer
30th ACM UMAP 2022 (Best Paper Award)

2021

Question Answering over Temporal Knowledge Graphs [Code]
Apoorv Saxena, Soumen Chakrabarti and Partha Talukdar
ACL 2021

MergeDistill: Merging Language Models using Pre-trained Distillation
Simran Khanuja, Melvin Johnson and Partha Talukdar
Findings of ACL 2021

Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study [Code]
Yash Khemchandani, Sarvesh Mehtani, Vaidehi Patil, Abhijeet Awasthi, Partha Talukdar and Sunita Sarawagi
ACL 2021

Reordering Examples Helps during Priming-based Few-shot Learning [Code]
Sawan Kumar, Partha Talukdar
Findings of ACL 2021

OKGIT: Open Knowledge Graph Link Prediction with Implicit Types [Code]
Chandrahas, Partha Talukdar
Findings of ACL 2021

Spatial Reasoning from Natural Language Instructions for Robot Manipulation
Sagar Gubbi Venkatesh, Anirban Biswas, Raviteja Upadrashta, Vikram Srinivasan, Partha Talukdar, Bharadwaj Amrutur
ICRA 2021

Graph Neural Networks for Soft Semi-Supervised Learning on Hypergraphs [Code]
Naganand Yadati, Tingran Gao, Shahab Asoodeh, Partha Talukdar, Anand Louis
PAKDD 2021

MuRIL: Multilingual Representations for Indian Languages
Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave, Shruti Gupta, Subhash Chandra Bose Gali, Vish Subramanian, Partha Talukdar
Preprint: arXiv:2103.10730

2020

NHP: Neural Hypergraph Link Prediction
Naganand Yadati, Vikram Nitin, Madhav Nimishakavi, Prateek Yadav, Anand Louis and Partha Talukdar
CIKM 2020

Natural Language Inference with Faithful Natural Language Explanations [Code]
Sawan Kumar and Partha Talukdar
ACL 2020

Improving Multi-hop Question Answering over Knowledge Graphs using Knowledge Base Embeddings [Code]
Apoorv Saxena, Aditay Tripathi and Partha Talukdar
ACL 2020

A Re-evaluation of Knowledge Graph Completion Methods [Code]
Zhiqing Sun, Shikhar Vashishth, Soumya Sanyal, Partha Talukdar and Yiming Yang
ACL 2020 [Short Paper]

Syntax-guided Controlled Generation of Paraphrases [Code]
Ashutosh Kumar, Kabir Ahuja, Raghuram Vadapalli, Partha Talukdar
Transactions of the ACL (TACL) 2020 [To be presented at ACL 2020]

Composition-based Multi-Relational Graph Convolutional Networks [Code]
Shikhar Vashishth*, Soumya Sanyal*, Vikram Nitin, and Partha Talukdar
ICLR 2020

ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations [Code]
Ekagra Ranjan, Soumya Sanyal, Partha Talukdar
AAAI 2020

InteractE: Improving Convolution-based Knowledge Graph Embeddings by Increasing Feature Interactions [Code]
Shikhar Vashishth*, Soumya Sanyal*, Vikram Nitin, Nilesh Agrawal, Partha Talukdar
AAAI 2020

P-SIF: Document Embeddings using Partition Averaging
Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai, Partha Talukdar
AAAI 2020

Improving Document Classification with Multi-Sense Embeddings
Vivek Gupta, Ankit Kumar Saw, Pegah Nokhiz, Harshit Gupta, and Partha Talukdar
ECAI 2020

2019

HyperGCN: A New Method of Training Graph Convolutional Networks on Hypergraphs [Code]
Naganand Yadati, Madhav Nimishakavi, Prateek Yadav, Vikram Nitin, Anand Louis, Partha Talukdar
NeurIPS 2019, Canada

CaRe: Open Knowledge Graph Embeddings [Code]
Swapnil Gupta, Sreyash Kenkre and Partha Talukdar
EMNLP 2019, Hong Kong

Zero-shot Word Sense Disambiguation using Sense Definition Embeddings [Code]
Sawan Kumar, Sharmistha Jat, Karan Saxena and Partha Talukdar
ACL 2019, Italy [Recipient of Outstanding Paper Award]

Relating Simple Sentence Representations in Deep Neural Networks and Brain
Sharmistha Jat, Hao Tang, Partha Talukdar and Tom Mitchell
ACL 2019, Italy [Press]

Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks [Code]
Shikhar Vashishth, Manik Bhandari, Prateek Yadav, Piyush Rai, Chiranjib Bhattacharyya and Partha Talukdar
ACL 2019, Italy

Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation [Code]
Ashutosh Kumar*, Satwik Bhattamishra*, Manik Bhandari and Partha Talukdar
NAACL 2019, USA.

Confidence-based Graph Convolutional Networks for Semi-Supervised Learning [Code]
Shikhar Vashishth*, Prateek Yadav*, Manik Bhandari*, Partha Talukdar
AISTATS 2019, Japan

Lovasz Convolutional Networks [Code]
Prateek Yadav, Madhav Nimishakavi, Naganand Yadati, Shikhar Vashishth, Arun Rajkumar and Partha Talukdar
AISTATS 2019, Japan

KVQA: Knowledge-aware Visual Question Answering [Project and Data Page]
Sanket Shah*, Anand Mishra*, Naganand Yadati and Partha Talukdar
Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), USA

ReAl-LiFE: Accelerating the Discovery of Individualized Brain Connectomes on GPUs
Sawan Kumar*, Varsha Sreenivasan*, Partha Talukdar, Franco Pestilli, Devarajan Sridharan
Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), USA

(*: Equal contributions)

2018

HyTE: Hyperplane-based Temporally aware Knowledge Graph Embedding [Code]
Shib Sankar Dasgupta, Swayambhu Nath Ray and Partha Talukdar
International Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Belgium

AD3: Attentive Deep Document Dater [Code]
Swayambhu Nath Ray, Shib Sankar Dasgupta and Partha Talukdar
International Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Belgium

RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information [Code]
Shikhar Vashishth, Rishabh Joshi, Sai Suman Prayaga, Chiranjib Bhattacharyya and Partha Talukdar
International Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), Belgium

MT-CGCNN: Integrating Crystal Graph Convolutional Neural Network with Multitask Learning for Material Property Prediction
S Sanyal, J Balachandran, N Yadati, A Kumar, P Rajagopalan, S Sanyal, P Talukdar
NIPS 2018 Workshop on Machine Learning for Molecules and Materials, Canada.

Inductive Framework for Multi-Aspect Streaming Tensor Completion with Side Information [Code]
Madhav Nimishakavi, Bamdev Mishra, Manish Gupta and Partha Talukdar
27th International Conference on Information and Knowledge Management (CIKM 2018), Italy.
[Acceptance rate: 17%]

Dating Documents using Graph Convolution Networks [Code]
Shikhar Vashishth, Swayambhu Nath Ray, Shib Sankar Dasgupta and Partha Talukdar
56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia

Towards Understanding the Geometry of Knowledge Graph Embeddings [Code]
Chandrahas Dewangan, Aditya Sharma and Partha Talukdar
56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia

Higher-order Relation Schema Induction using Tensor Factorization with Back-off and Aggregation [Code]
Madhav Nimishakavi, Manish Gupta and Partha Talukdar
56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), Melbourne, Australia

CESI: Canonicalizing Open Knowledge Bases using Embeddings and Side Information [Code]
Shikhar Vashishth, Prince Jain and Partha Talukdar
The Web Conference 2018 (WWW 2018), Lyon, France. [acceptance rate: 14.8%]

ELDEN: Improved Entity Linking using Densified Knowledge Graphs [Code]
Priya Radhakrishnan, Partha Talukdar and Vasudeva Varma
NAACL 2018, New Orleans, USA

Never-Ending Learning
Tom M. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, B. Yang, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling
Communications of the ACM, 61(5), pp. 103-115, May 2018.

Efficient and Distributed Generalized Canonical Correlations Analysis for Big Multiview Data
X. Fu, K. Huang, E.E. Papalexakis, H. Song, P. Talukdar, N. D. Sidiropoulos, C. Faloutsos, and T. Mitchell
IEEE Transactions on Knowledge and Data Engineering, Sep. 2018.

2017

KGEval: Accuracy Estimation of Automatically Constructed Knowledge Graphs
Ojha P, Partha Talukdar
International Conference on Empirical Methods in NLP (EMNLP 2017), Copenhagen, Denmark

Speeding up Reinforcement Learning-based Information Extraction Training using Asynchronous Methods
Aditya Sharma, Zarana Parekh, Partha Talukdar
International Conference on Empirical Methods in NLP (EMNLP 2017), Copenhagen, Denmark [Short Paper]

Revisiting Simple Neural Networks for Learning Representations of Knowledge Graphs
Srinivas Ravishankar, Chandrahas Dewangan and Partha Talukdar
6th Workshop on Automated Knowledge Base Construction (AKBC) 2017

Improving Distantly Supervised Relation Extraction using Word and Entity Based Attention
Sharmistha Jat, Siddhesh Khandelwal and Partha Talukdar
6th Workshop on Automated Knowledge Base Construction (AKBC) 2017

BRAINZOOM: High Resolution Reconstruction from Multi-modal Brain Signals
Xiao Fu, Kejun Huang, Otilia Stretcu, Hyun Ah Song, Evangelos Papalexakis, Partha Talukdar, Tom Mitchell, Nicholas Sidiropoulos, Christos Faloutsos, Barnabas Poczos
SIAM International Conference on Data Mining (SDM 2017), Houston, USA

Facets: Adaptive Local Exploration of Large Graphs
Robert Pienta, Minsuk (Brian) Kahng, Zhiyuan Lin, Jilles Vreeken, Partha Talukdar, James Abello, Ganesh Parameswaran, Duen Horng (Polo) Chau
SIAM International Conference on Data Mining (SDM 2017), Houston, USA

2016

Relation Schema Induction using Tensor Factorization with Side Information [Code]
Madhav Nimishakavi, Uday Singh Saini, Partha Talukdar
International Conference on Empirical Methods in NLP (EMNLP 2016), Austin, USA

Quality Estimation of Workers in Collaborative Crowdsourcing using Group Testing
Ojha P, Partha Talukdar
AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2016), Austin, USA.

Discovering Response-Eliciting Factors in Social Question-Answering: A Reddit Inspired Study
Danish, Yogesh Dahiya and Partha Talukdar
10th International AAAI Conference on Web and Social Media (ICWSM-16) [acceptance rate: 17%]

ClaimEval: Integrated and Flexible Framework for Claim Evaluation Using Credibility of Sources
Mehdi Samadi, Partha Talukdar, Manuela Veloso, Manuel Blum
30th AAAI Conference on Artificial Intelligence (AAAI-16), Phoenix, USA

Efficient and Distributed Algorithms for Large-Scale Generalized Canonical Correlations Analysis
Xiao Fu, Kejun Huang, Evangelos Papalexakis, Hyun-Ah Song, Partha Talukdar, Nicholas Sidiropoulos, Christos Faloutsos, Tom Mitchell
International Conference on Data Mining (ICDM 2016), Barcelona, Spain [Short Paper, acceptance: 11.1%]

Turbo-SMT: Parallel Coupled sparse Matrix-Tensor Factorizations and applications
Evangelos Papalexakis, Tom Mitchell, Nicholas Sidiropoulos, Christos Faloutsos, Partha Talukdar, Brian Murphy
Statistical Analysis and Data Mining: The ASA Data Science Journal

2015

An Entity-centric Approach for Overcoming Knowledge Graph Sparsity
Manjunath Hegde, Partha Talukdar
Empirical Methods in NLP (EMNLP 2015), Portugal (Short Paper)

Knowledge Base Inference using Bridging Entities
Bhushan Kotnis, Pradeep Bansal, Partha Talukdar
Empirical Methods in NLP (EMNLP 2015), Portugal (Short Paper)

Translation Invariant Word Embeddings
Matt Gardner, Kejun Huang, Evangelos Papalexakis, Xiao Fu, Partha Talukdar, Christos Faloutsos, Nicholas Sidiropoulos, Tom Mitchell
Empirical Methods in NLP (EMNLP 2015), Portugal (Short Paper)

AskWorld: Budget-Sensitive Query Evaluation for Knowledge-on-Demand
Mehdi Samadi, Partha Talukdar, Manuela Veloso, Tom Mitchell
International Joint Conference on Artificial Intelligence (IJCAI 2015), Buenos Aires, Argentina.

Never-Ending Learning
Tom Mitchell, William Cohen, Estevam Hruschka, Partha Talukdar, Justin Betteridge, Andrew Carlson, Bhavana Dalvi, Matt Gardner, Bryan Kisiel, Jayant Krishnamurthy, Ni Lao, Kathryn Mazaitis, Tahir Mohammad, Ndapa Nakashole, Emmanouil Antonios Platanios, Alan Ritter, Mehdi Samadi, Burr Settles, Richard Wang, Derry Wijaya, Abhinav Gupta, Xinlei Chen, Abulhair Saparov, Malcolm Greaves and Joel Welling
Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI 2015), Austin, USA.

Automatic Gloss Finding for a Knowledge Base using Ontological Constraints
Bhavana Dalvi, Einat Minkov, Partha Talukdar and William Cohen
International Conference on Web Search and Data Mining (WSDM 2015), Shanghai, China

A Compositional and Interpretable Semantic Space
Alona Fyshe, Leila Wehbe, Partha P. Talukdar, Brian Murphy and Tom M. Mitchell
Conference of the North American Chapter of the ACL (NAACL 2015), Denver, USA

Principled Neuro-Functional Connectivity Discovery
K. Huang, N. Sidiropoulos, C. Faloutsos, E. Papalexakis, P. Talukdar, T. Mitchell
SIAM International Conference on Data Mining (SDM 2015), Vancouver, Canada

Active Learning in Keyword Search-based Data Integration
Zhepeng Yan, Nan Zheng, Zachary Ives, Partha Pratim Talukdar, Cong Yu
The VLDB Journal Special Issue on Best Papers of VLDB 2013

Combining Vector Space Embeddings with Symbolic Logical Inference over Open-Domain Text
Matt Gardner, Partha Talukdar and Tom Mitchell
AAAI Spring Symposium on Knowledge Representation and Reasoning, Stanford, USA

2014

Scaling Graph-based Semi Supervised Learning to Large Number of Labels Using Count-Min Sketch
Partha Pratim Talukdar, William Cohen
17th International Conference on Artificial Intelligence and Statistics (AISTATS 2014), Reykjavik, Iceland.
[pre-print presented at NIPS 2013 Workshop on Randomized Methods for Machine Learning]

Incorporating Vector Space Similarity in Random Walk Inference over Knowledge Bases
Matt Gardner, Partha Talukdar, Jayant Krishnamurthy, and Tom Mitchell
International Conference on Empirical Methods in NLP (EMNLP 2014), Doha, Qatar.

Simultaneously Uncovering the Patterns of Brain Regions Involved in Different Story Reading Subprocesses
Leila Wehbe, Brian Murphy, Partha Talukdar, Alona Fyshe, Aaditya Ramdas, Tom Mitchell
PLoS ONE 9(11): e112575. doi:10.1371/journal.pone.0112575

Press Coverage: CMU Press Release, Scientific American (blog), AP, TIME

Interpretable Semantic Vectors from a Joint Model of Brain- and Text-based Meaning
Alona Fyshe, Partha Talukdar, Brian Murphy, Tom Mitchell
52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, USA.

Good-Enough Brain Model: Challenges, Algorithms and Discoveries in Multi-Subject Experiments
Evangelos Papalexakis, Alona Fyshe, Nicholas Sidiropoulos, Partha Talukdar, Tom Mitchell, Christos Faloutsos
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2014), New York City, USA.

Turbo-SMT: Accelerating Coupled Sparse Matrix-Tensor Factorizations by 200x [Supplementary] [Code]
E. Papalexakis, T. Mitchell, N. Sidiropoulos, C. Faloutsos, P. Talukdar, B. Murphy
SIAM International Conference on Data Mining (SDM 2014), Philadelphia, USA.

Invited to the Statistical Analysis and Data Mining (SAM) Special Issue of "Best of SDM 2014"

FlexiFaCT: Scalable Flexible Factorization of Coupled Tensors on Hadoop
Alex Beutel, Abhimanu Kumar, Evangelos Papalexakis, Partha Talukdar, Christos Faloutsos, Eric Xing
SIAM International Conference on Data Mining (SDM 2014), Philadelphia, USA.

2013

Improving Learning and Inference in a Large Knowledge-base using Latent Syntactic Cues [Details]
Matt Gardner, Partha Talukdar, Bryan Kisiel, Tom Mitchell
International Conference on Empirical Methods in NLP (EMNLP 2013), Seattle, USA. [Short Paper]

PIDGIN: Ontology Alignment using Web Text as Interlingua [Details] [Slides]
Derry Wijaya, Partha Pratim Talukdar, Tom Mitchell
International Conference on Information and Knowledge Management (CIKM 2013), San Francisco, USA.

Documents and Dependencies: an Exploration of Vector Space Models for Semantic Composition
Alona Fyshe, Partha Talukdar, Brian Murphy, and Tom Mitchell
International Conference on Computational Natural Language Learning (CoNLL 2013), Sofia, Bulgaria.

Actively Soliciting Feedback for Query Answers in Keyword Search-Based Data Integration
Zhepeng Yan, Nan Zheng, Zack Ives, Partha Talukdar, Cong Yu
International Conference on Very Large Databases (VLDB 2013), Trento, Italy.

Invited to the special issue of the VLDB Journal with the "Best Papers of VLDB 2013"

Advances in Automated Knowledge Base Construction
Fabian M. Suchanek, James Fan, Raphael Hoffmann, Sebastian Riedel, Partha Talukdar
ACM SIGMOD Record [To Appear]

2012

Acquiring Temporal Constraints between Relations
Partha Pratim Talukdar, Derry Wijaya, Tom Mitchell
International Conference on Information and Knowledge Management (CIKM 2012), Hawaii, USA.

Coupled Temporal Scoping of Relational Facts
Partha Pratim Talukdar, Derry Wijaya, Tom Mitchell
International Conference on Web Search and Data Mining (WSDM 2012), Seattle, USA.

Learning Effective and Interpretable Semantic Models using Non-Negative Sparse Embedding
Brian Murphy, Partha Talukdar, Tom Mitchell
International Conference on Computational Linguistics (COLING 2012), Mumbai, India.
[ Slides ] [ Data ]

Selecting Corpus-Semantic Models for Neurolinguistic Decoding
Brian Murphy, Partha Talukdar, Tom Mitchell
Joint Conference on Lexical and Computational Semantics (StarSem) 2012, Montreal, Canada.

Metric Learning for Graph-based Domain Adaptation
Paramveer Dhillon, Partha Pratim Talukdar, Koby Crammer
International Conference on Computational Linguistics (COLING 2012) [Short Paper], Mumbai, India.

Associating Structured Records To Text Documents
Rakesh Agrawal, Ariel Fuxman, Anitha Kannan, John Shafer, Partha Pratim Talukdar
International World Wide Web Conference (WWW 2012) [Poster], Lyon, France.

Crowdsourced Comprehension: Predicting Prerequisite Structure in Wikipedia
Partha Pratim Talukdar, William Cohen
HLT-NAACL 2012 Workshop on Innovative Use of NLP for Building Educational Applications (BEA7)

Tracking Story Reading in the Brain
Leila Wehbe, Partha Talukdar, Brian Murphy, Alona Fyshe, Gustavo Sudre, and Tom Mitchell
NIPS 2012 Workshop on Machine Learning and Interpretation in NeuroImaging, Lake Tahoe, USA.

2011

SCAD: Collective Discovery of Attribute Values
Anton Bakalov, Ariel Fuxman, Partha Pratim Talukdar, Soumen Chakrabarti
International World Wide Web Conference (WWW 2011), Hyderabad, India.

Improving Product Classification Using Images
Anitha Kannan, Partha Pratim Talukdar, Nikhil Rasiwasia, Qifa Ke
International Conference on Data Mining (ICDM 2011), Vancouver, Canada.

2010

Graph-Based Weakly-Supervised Methods for Information Extraction & Integration
Partha Pratim Talukdar
PhD Thesis, CIS Department, University of Pennsylvania, May 2010.

Experiments in Graph-based Semi-Supervised Learning Methods for Class-Instance Acquisition [ Slides ] [ Data ]
Partha Pratim Talukdar, Fernando Pereira
ACL 2010, Uppsala, Sweden.

Learning Better Data Representation using Inference-Driven Metric Learning [ Poster ]
Paramveer Dhillon, Partha Pratim Talukdar, Koby Crammer
ACL 2010 (Short Paper), Uppsala, Sweden.

Automatically Incorporating New Sources in Keyword Search-Based Data Integration [ Slides ]
Partha Pratim Talukdar, Zack Ives, Fernando Pereira
2010 ACM SIGMOD Conference, Indianapolis, USA.

Inference-Driven Metric Learning (IDML) for Graph Construction
Paramveer Dhillon, Partha Pratim Talukdar, Koby Crammer
UPenn CIS Technical Report MS-CIS-10-18

2009

New Regularized Algorithms for Transductive Learning [ Slides ] [ Video ]
Partha Pratim Talukdar, Koby Crammer
European Conference on Machine Learning (ECML-PKDD) 2009, Bled, Slovenia.

Sequence Learning from Data with Multiple Labels [ Slides ]
Mark Dredze, Partha Pratim Talukdar, Koby Crammer
ECML-PKDD 2009 workshop on Learning from Multi-Label Data (MLD 09), Bled, Slovenia.

Interactive Data Integration through Smart Copy and Paste
Zack Ives, Craig Knoblock, Steve Minton, Marie Jacob, Partha Talukdar, Rattapoom Tuchinda, Jose Luis Ambite, Maria Muslea, Cenk Gazen.
Conference on Innovative Data Systems Research (CIDR) 2009, Asilomar, California.

Regularized Learning with Networks of Features.
Ted Sandler, John Blitzer, Partha Pratim Talukdar, Lyle H. Ungar.
Advances in Neural Information Processing Systems (NIPS) 2009.

Topics in Graph Construction for Semi-Supervised Learning
Partha Pratim Talukdar
UPenn CIS Technical Report MS-CIS-09-13

2008

Weakly Supervised Acquisition of Labeled Class Instances using Graph Random Walks [ Slides ]
Partha Pratim Talukdar, Joseph Reisinger, Marius Pasca, Deepak Ravichandran, Rahul Bhagat, Fernando Pereira.
EMNLP 2008, Honolulu, Hawaii.

The Orchestra Collaborative Data Sharing System.
Todd J. Green, Grigoris Karvounarakis, Nicholas E. Taylor, Val Tannen, Partha Pratim Talukdar, Marie Jacob, Fernando Pereira.
ACM SIGMOD Record, September 2008.

Learning to Create Data-Integrating Queries [ Slides ]
Partha Pratim Talukdar, Marie Jacob, Mohammad Salman Mehmood, Koby Crammer, Zack Ives, Fernando Pereira, Sudipto Guha.
34th International Conference on Very Large Databases (VLDB 2008), Auckland, New Zealand.

A Rate-Distortion One-Class Model and its Applications to Clustering. [ Slides ] [ Video ]
Koby Crammer, Partha Pratim Talukdar, Fernando Pereira.
International Conference on Machine Learning (ICML) 2008, Helsinki, Finland.

DRASO: Declaratively Regularized Alternating Structural Optimization. [ Slides ] [ Video ]
Partha Pratim Talukdar, John Blitzer, Ted Sandler, Mark Dredze, Koby Crammer, Fernando Pereira.
ICML 2008 Workshop on Prior Knowledge for Text and Language Processing, Helsinki, Finland.

2007

Lightly-Supervised Attribute Extraction.
Kedar Bellare, Partha Pratim Talukdar, Giridhar Kumaran, Fernando Pereira, Mark Liberman, Andrew McCallum and Mark Dredze.
NIPS 2007 Workshop on Machine Learning for Web Search.

Frustratingly Hard Domain Adaptation for Dependency Parsing.
Mark Dredze, John Blitzer, Partha Pratim Talukdar, Kuzman Ganchev, Joao Graca, and Fernando Pereira.
CoNLL Shared Task Session of EMNLP-CoNLL 2007, Prague.

Automatic Code Assignment to Medical Text.
Koby Crammer, Mark Dredze, Kuzman Ganchev, Partha Pratim Talukdar and Steve Caroll.
BioNLP 2007, Prague.

2006

A Context Pattern Induction Method for Named Entity Extraction [ Slides ]
Partha Pratim Talukdar, Thorsten Brants, Mark Liberman and Fernando Pereira
Tenth Conference on Computational Natural Language Learning (CoNLL-X), New York City, June 8-9, 2006.

2004

Hindi Text Normalization.
K. Panchapagesan, Partha Pratim Talukdar, N. Sridhar Krishna, Kalika Bali, A.G. Ramakrishnan.
Fifth International Conference on Knowledge Based Computer Systems (KBCS), 19-22 December 2004, Hyderabad, India.

Phonetic Distance Based Cross-lingual Search.
Sriram S., Partha Pratim Talukdar, Sameer Badaskar, Kalika Bali, A.G. Ramakrishnan.
International Conference on Natural Language Processing, 19-22 December 2004, Hyderabad, India.

Optimal Creation of Speech Databases for Indian Language Speech Technology
Satinder Singh, Partha Talukdar, Sridhar Krishna, Sandeep Manocha, Kalika Bali, Sitaram R.N.V.
International Conference on Speech and Language Technology / O-COCOSDA, 17-19 November 2004, New Delhi, India.

Tools for the Development of a Hindi Speech Synthesis System
Kalika Bali, A.G. Ramakrishnan, Partha Pratim Talukdar, N. Sridhar Krishna.
5th ISCA Speech Synthesis Workshop, 14th-16th June 2004, Carnegie Mellon University, USA.

Duration Modeling for Hindi Text-to-Speech Synthesis.
N. Sridhar Krishna, Partha Pratim Talukdar, Kalika Bali, A.G. Ramakrishnan.
8th International Conference on Spoken Language Processing (ICSLP), 4th-8th October 2004, Jeju Island, Korea.

Automatic Generation of Compound Word Lexicon for Hindi Speech Synthesis.
Deepa S.R., A.G. Ramakrishnan, Kalika Bali, Partha Pratim Talukdar.
Language Resources and Evaluation Conference (LREC) 2004, Portugal, 26-28 May 2004.

Teaching

Jan-May 2019: E1 246: Natural Language Understanding
Aug-Dec 2018: DS 222: Machine Learning with Large Datasets
Jan-May 2018: E1 246: Natural Language Understanding
Aug-Dec 2017: DS 222: Machine Learning with Large Datasets
Jan-May 2016: SE 256: Scalable Systems for Data Science
Jan-May 2016: SE 294: Data Analysis and Visualization
Aug-Dec 2015: E1 246: Natural Language Understanding
Jan-May 2015: SE 294: Data Analysis and Visualization
Aug-Dec 2014: SE 305: Web-scale Knowledge Harvesting

Software

Junto Label Propagation Toolkit: This toolkit consists of implementations of various graph-based semi-supervised learning (SSL) algorithms.

OCRD: One Class Algorithm based on Rate-Distortion theory (download)
An algorithm to choose a coherent subset of points from a large set. Please see A Rate-Distortion One-Class Model and its Applications to Clustering for details.

About Me

I was born and raised in Guwahati, Assam. In Guwahati, I attended Haripriya Vidyapeeth, Don Bosco High School, and Cotton College.