Automatic Unbiased Learning to Rank from Unbiased Propensity Estimation

posted Apr 17, 2018, 2:10 PM by Qingyao Ai   [ updated Apr 26, 2018, 4:45 PM ]

Learning to rank with biased click data is a well-known challenge. A variety of methods have been explored to debias click data for learning to rank, such as click models, result interleaving, and, more recently, the unbiased learning-to-rank framework based on inverse propensity weighting.
Despite their differences, most existing studies separate the estimation of click bias (namely, the propensity model) from the learning of ranking algorithms. To estimate click propensities, they either conduct online result randomization, which can negatively affect the user experience, or offline parameter estimation, which has special requirements on the click data and is optimized for objectives (e.g., click likelihood) that are not directly related to the ranking performance of the system. In this work, we propose a Dual Learning Algorithm (DLA) that jointly learns an unbiased ranker and an unbiased propensity model. DLA is an automatic unbiased learning-to-rank framework, as it directly learns unbiased ranking models from biased click data without any preprocessing. It can adapt to changes in the bias distribution and is applicable to online learning. [SIGIR'18] [code]
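To give a rough flavor of the dual, inverse-propensity-weighted objectives behind DLA, the sketch below computes two symmetric losses: the ranker's loss reweighted by inverse examination propensities, and the propensity model's loss reweighted by inverse relevance estimates. This is a simplified illustration, not the paper's exact objective; the softmax parameterization and variable names are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def ipw_losses(rank_scores, prop_scores, clicks):
    """Sketch of the two inverse-propensity-weighted losses in a
    dual learning setup (simplified; not the paper's exact form)."""
    rank_p = softmax(rank_scores)  # ranker's estimated relevance distribution
    prop_p = softmax(prop_scores)  # estimated examination propensity per position
    # Ranker loss: clicks reweighted by inverse propensity (removes position bias).
    ranker_loss = -np.sum(clicks / prop_p * np.log(rank_p))
    # Propensity loss: clicks reweighted by inverse relevance (the dual problem).
    propensity_loss = -np.sum(clicks / rank_p * np.log(prop_p))
    return ranker_loss, propensity_loss
```

In the full algorithm, both models are updated jointly from these losses, which is what lets DLA learn the bias and the ranker from the same raw click stream.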

Learning to Rank with Deep Local Context Models

posted Apr 17, 2018, 2:02 PM by Qingyao Ai   [ updated Apr 26, 2018, 4:45 PM ]

Learning to rank has been intensively studied and widely applied in information retrieval. However, the majority of existing learning-to-rank algorithms model relativity only at the loss level, by constructing pairwise or listwise loss functions; their scoring remains pointwise, i.e., the relevance score of a document is computed from the document itself, regardless of the other documents in the list. In this project, we argue that the relative relevance of a document in a ranked list should depend on the other top-ranked documents, which we refer to as the local ranking context. Thus, we propose to use the inherent feature distributions of the top results to capture the listwise context for learning-to-rank systems. [SIGIR'18] [code]
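One simple way a score can depend on the local ranking context is sketched below: each document's features are normalized by statistics of the current top-k results before re-scoring. This is illustrative only; the paper learns the listwise context with a deep model rather than fixed statistics, and the function name and combination rule here are assumptions.

```python
import numpy as np

def rerank_with_local_context(features, base_scores, top_k=5):
    """Re-score documents using feature statistics of the top-k
    results as the 'local ranking context' (an illustrative
    normalization scheme, not the paper's architecture)."""
    order = np.argsort(-base_scores)          # current ranking, best first
    top = features[order[:top_k]]             # features of the local context
    mu, sigma = top.mean(axis=0), top.std(axis=0) + 1e-8
    ctx_features = (features - mu) / sigma    # each doc relative to the context
    # Context-aware score: base score adjusted by the contextualized features.
    return base_scores + ctx_features.mean(axis=1)
```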

Joint Representation Learning Model for Top-N Recommendation

posted Mar 4, 2018, 5:50 PM by Qingyao Ai

The Web has accumulated a rich source of information, such as text, images, and ratings, which represent different aspects of user preferences. In this work, we propose a Joint Representation Learning (JRL) framework in which each type of information source (review text, product image, numerical rating, etc.) is adopted to learn the corresponding user and item representations based on available (deep) representation learning architectures. [CIKM'17] [code]
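A minimal sketch of the fusion step, assuming each information source has already produced a fixed-length embedding; the real JRL framework learns these representations jointly with deep architectures, and the weighted-concatenation rule here is a simplifying assumption:

```python
import numpy as np

def joint_representation(source_embeddings, weights=None):
    """Fuse per-source user/item embeddings (text, image, rating, ...)
    into one joint vector by weighted concatenation (a simplified
    stand-in for the learned fusion in JRL)."""
    if weights is None:
        weights = [1.0] * len(source_embeddings)
    parts = [w * np.asarray(e, dtype=float)
             for w, e in zip(weights, source_embeddings)]
    return np.concatenate(parts)
```

A user vector and an item vector built this way can then be compared (e.g., by inner product) to produce top-N recommendations.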

Hierarchical Embedding Model for Personalized Product Search

posted Mar 4, 2018, 5:43 PM by Qingyao Ai   [ updated Apr 17, 2018, 2:11 PM ]

The unique characteristics of product search make search personalization essential for both customers and e-shopping companies. In this work, we propose a hierarchical embedding model to learn semantic representations for entities (i.e., words, products, users, and queries) from different levels with their associated language data. [SIGIR'17] [code] [data]
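The retrieval step of such a hierarchical embedding model can be sketched as follows, assuming the entity embeddings have already been learned. Composing the search intent as a convex combination of query and user vectors mirrors the query-user composition in the model, while the dot-product scoring and parameter names are simplifications:

```python
import numpy as np

def personalized_query_embedding(query_vec, user_vec, alpha=0.5):
    """Personalized search intent as a convex combination of the
    query and user embeddings; alpha controls how much weight the
    raw query gets versus the user's preferences."""
    return alpha * np.asarray(query_vec) + (1 - alpha) * np.asarray(user_vec)

def rank_products(product_vecs, intent_vec):
    """Rank products by similarity to the personalized intent vector,
    returning indices from best to worst match."""
    scores = product_vecs @ intent_vec
    return np.argsort(-scores)
```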

Characterizing Email Search using Large-scale Behavioral Logs and Surveys

posted Mar 4, 2018, 5:38 PM by Qingyao Ai   [ updated Mar 4, 2018, 5:39 PM ]

As the number of email users and messages continues to grow, search is becoming more important for finding information in personal archives. In spite of its importance, email search is much less studied than web search, particularly using large-scale behavioral log analysis. 
In this project, we conduct a large-scale log analysis of email search and complement it with a survey to better understand email search intent and success. [CHIIR'17] [WWW'17]

*This is an internship project in Microsoft Research.

Enhanced Paragraph Vector Model for Ad-hoc Retrieval

posted Feb 2, 2018, 8:41 AM by Qingyao Ai   [ updated Mar 4, 2018, 2:28 PM ]

Incorporating topic-level estimation into language models has been shown to benefit information retrieval (IR) models such as cluster-based retrieval and LDA-based document representation. Neural embedding models, such as paragraph vector (PV) models, on the other hand, have shown their effectiveness and efficiency in learning semantic representations of documents and words in multiple natural language processing (NLP) tasks. In this work, we study how to effectively use the PV model to improve ad-hoc retrieval. [SIGIR'16] [ICTIR'16] [code]
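One common way to put topic-level estimates into a retrieval model, sketched below under simplifying assumptions (precomputed word probabilities over a shared vocabulary, a fixed interpolation weight lam), is to smooth a document's language model with the word distribution implied by its paragraph vector:

```python
import numpy as np

def pv_smoothed_score(doc_lm_probs, pv_word_probs, query_idx, lam=0.5):
    """Query log-likelihood under a document language model linearly
    interpolated with PV-based word probabilities (a sketch of using
    paragraph-vector topic estimates for smoothing, not the paper's
    exact model)."""
    p = lam * doc_lm_probs + (1 - lam) * pv_word_probs  # smoothed word probs
    return float(np.sum(np.log(p[query_idx])))          # sum over query terms
```

Documents are then ranked by this score; lam trades off the exact term statistics against the topic-level PV estimate.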
