Online marketplace eBay has added buying signals such as ‘Add to Watchlist’, ‘Make Offer’ and ‘Add to Cart’ to the machine learning model behind its recommended ad listings, improving relevance based on the items a user first searches for. Chen Xue goes into the details in a recent article†
eBay’s Promoted Listings Standard (PLS) is a paid option for sellers. One variant, PLSIM, has eBay’s recommendation engines suggest sponsored items that resemble a listing a potential buyer just clicked on. PLSIM is paid on a CPA model (the seller only pays eBay when a sale is made), which strongly motivates eBay to build the most effective model for promoting the best offers. This usually works well for sellers, buyers, and eBay.
The PLSIM journey looks like this:
- The user searches for an item.
- The user clicks on a search result and lands on the View Item (VI) page for a listing (eBay calls this the seed item).
- The user scrolls down the VI page and sees the recommended items in the PLSIM.
- The user clicks on one of the PLSIM items and takes an action (view, add to cart, buy now, etc.) or checks out another set of recommended items.
The PLSIM journey from a machine learning perspective:
- Generates a candidate set of promoted listings (the “recall set”) most relevant to the seed item
- Applies a trained machine learning ranker to order the recall set by likelihood of purchase
- Re-ranks the listings based on ad rate, balancing the seller’s velocity from promotion against recommendation relevance
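The three steps above can be sketched as a toy pipeline. Everything here is an illustrative assumption: the similarity function, the stand-in ranker score, and the linear blend of relevance and ad rate are not eBay's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Listing:
    item_id: str
    price: float
    ad_rate: float  # share of the sale the seller pays when the ad converts (assumed field)

def similarity(seed: Listing, item: Listing) -> float:
    # Toy similarity: listings with closer prices look more alike.
    return 1.0 / (1.0 + abs(seed.price - item.price))

def purchase_score(seed: Listing, item: Listing) -> float:
    # Stand-in for the trained ranker's purchase-likelihood score.
    return similarity(seed, item)

def plsim_slate(seed: Listing, inventory: list, k: int = 3, alpha: float = 0.9) -> list:
    # 1) Recall: the k promoted listings most relevant to the seed item.
    recall = sorted(inventory, key=lambda it: similarity(seed, it), reverse=True)[:k]
    # 2) Rank the recall set by (stand-in) purchase likelihood.
    ranked = sorted(recall, key=lambda it: purchase_score(seed, it), reverse=True)
    # 3) Re-rank: blend relevance with the seller's ad rate (toy linear blend).
    return sorted(ranked,
                  key=lambda it: alpha * purchase_score(seed, it) + (1 - alpha) * it.ad_rate,
                  reverse=True)
```

With `alpha` close to 1, relevance dominates and ad rate only breaks near-ties, which mirrors the balance the article describes.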
The ranking model is based on the following historical data:
- Recommended item data
- Similarity between the recommended item and the seed item
- Context (country, product category)
- User personalization features
eBay uses a gradient-boosted tree which, for a given seed item, ranks candidate items by their relative purchase probabilities.
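Pairwise rankers of this kind are commonly trained on score differences; a RankNet-style sigmoid is one standard way to turn two ranker scores into a relative purchase probability. This is an illustrative assumption, not eBay's stated objective.

```python
import math

def relative_purchase_probability(score_i: float, score_j: float) -> float:
    """RankNet-style pairwise probability that item i is purchased over item j,
    given the ranker's scores for each. The sigmoid form is an assumption."""
    return 1.0 / (1.0 + math.exp(score_j - score_i))
```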
From binary feedback to multi-relevance feedback
In the past, purchase likelihood was modeled on binary purchase data: a recommended item was ‘relevant’ if it was bought alongside the seed item and ‘irrelevant’ if not. This approach was simple, but it had significant shortcomings that left important room for optimization.
- False negatives: Users generally buy only one item from a list of recommendations, so good recommendations can be labeled bad simply because no purchase was made, producing false negatives.
- Purchase sparsity: Purchases are rare compared to other user events, making it challenging to train a model with enough volume and diversity of positives to be predictive.
- Missing data: User actions short of purchase, from clicks to add-to-cart, carry a large amount of information about likely outcomes that binary labels discard.
Given all this, eBay’s engineers considered the following user actions, in addition to the initial click, and how to add them to the ranking model.
- Buy It Now (applies only to Buy It Now (BIN) listings)
- Add to Cart (applies only to BIN listings)
- Make Offer (applies only to Best Offer listings)
- Place Bid (applies only to Auction listings)
- Add to Watchlist (applies to BIN, Best Offer, and Auction listings)
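The action-to-format constraints above can be encoded in a small lookup. The dictionary encoding and names are illustrative assumptions; only the mapping itself comes from the list.

```python
# Which feedback actions apply to which listing formats, per the list above.
ACTION_FORMATS = {
    "buy_it_now":       {"BIN"},
    "add_to_cart":      {"BIN"},
    "make_offer":       {"BEST_OFFER"},
    "place_bid":        {"AUCTION"},
    "add_to_watchlist": {"BIN", "BEST_OFFER", "AUCTION"},
}

def valid_actions(listing_format: str) -> set:
    """Return the feedback actions that can occur on a given listing format."""
    return {a for a, fmts in ACTION_FORMATS.items() if listing_format in fmts}
```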
Relevance levels of multi-relevance feedback
eBay now knew that purchases are the most relevant signal and that other actions needed to be added, but a new question arose: where do these actions fall on the scale of relevance?
The journey always starts with a click on a recommended item before any of the newly tracked actions can occur. This led eBay to rank “no selection” as the least relevant action and “selecting a recommendation” (clicking on it) as the second-least relevant action on the path to a purchase.
The charts below illustrate how eBay ranks the remaining actions: “Place Bid”, “Buy It Now”, “Add to Watchlist” and “Add to Cart”.
In the historical training data, each potential recommended item for a seed item was labeled with its relevance level on this scale.
As a result of this labeling, during training the ranker penalized a mis-ranked purchase more severely than a mis-ranked “Buy It Now”, and so on down the scale.
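A graded label scheme like this can be sketched in a few lines. The text only fixes the endpoints (no selection lowest, click second-lowest, purchase highest); the ordering of the intermediate actions below is an illustrative assumption, as is the label-gap penalty.

```python
# Graded relevance labels. Only the endpoints are given in the text;
# the ordering of the middle actions is an assumption for illustration.
RELEVANCE = {
    "no_selection": 0,
    "click": 1,
    "add_to_watchlist": 2,
    "add_to_cart": 3,
    "make_offer": 4,
    "buy_it_now": 5,
    "purchase": 6,
}

def pairwise_penalty(higher_action: str, lower_action: str) -> int:
    """Penalty when an item carrying the higher-relevance action is ranked
    below one carrying the lower-relevance action. Larger label gaps cost
    more, so a mis-ranked purchase hurts more than a mis-ranked Buy It Now."""
    return max(0, RELEVANCE[higher_action] - RELEVANCE[lower_action])
```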
Sample weights for multi-relevance feedback
But it’s not that simple: a click is not exactly one relevance point below an “Add to Watchlist”, nor is an “Add to Cart” exactly two points more likely to lead to a purchase than a click. The gradient-boosted tree supported multiple labels to capture a range of relevance levels, but offered no direct way to express the magnitude of the differences between them.
eBay had to test iteratively until it found numbers that made the model work. The researchers introduced additional weights (referred to as “sample weights”) into the pairwise loss function, framed the search as a hyperparameter tuning task, and ran more than 25 iterations before arriving at the best sample weights: “Add to Watchlist” (6), “Add to Cart” (15), “Make Offer” (38), “Buy Now” (8) and “Purchase” (15). Without the sample weights, the new model performed worse than the binary model; with them, it outperformed it.
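The sample weights plug into the pairwise loss roughly like this. The weights are the ones reported in the article; the logistic form of the base loss is an illustrative assumption.

```python
import math

# Tuned sample weights reported in the article.
SAMPLE_WEIGHT = {
    "add_to_watchlist": 6,
    "add_to_cart": 15,
    "make_offer": 38,
    "buy_it_now": 8,
    "purchase": 15,
}

def weighted_pairwise_loss(pos_score: float, neg_score: float, pos_action: str) -> float:
    """Pairwise logistic loss for a (positive, negative) item pair, scaled by
    the positive action's sample weight. Only the weights come from the article;
    the logistic base loss is an assumed stand-in."""
    base = math.log(1.0 + math.exp(neg_score - pos_score))
    return SAMPLE_WEIGHT[pos_action] * base
```

At identical scores, mis-ranking a “Make Offer” pair (weight 38) costs more than six times as much as mis-ranking an “Add to Watchlist” pair (weight 6), which is how the tuned weights steer the ranker.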
They also experimented with adding the click alone as additional relevance feedback, applying the tuned “Purchase” sample weight of 150. The offline results are shown below, where “BOWC” stands for the “Buy It Now”, “Make Offer”, “Add to Watchlist” and “Add to Cart” actions, and “Purchase Rank” is the average rank of the purchased item (smaller is better).
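The Purchase Rank metric is straightforward to compute; this sketch assumes each evaluation session records the ranked slate and the item that was eventually purchased.

```python
def average_purchase_rank(sessions: list) -> float:
    """Mean 1-based rank of the purchased item across sessions; lower is better.
    Each session is a (ranked_item_ids, purchased_item_id) pair."""
    ranks = [items.index(purchased) + 1 for items, purchased in sessions]
    return sum(ranks) / len(ranks)
```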
In total, more than 2,000 model variants were trained. The A/B tests ran in two phases. The first phase included only the additional selection (click) labels and showed a 2.97% increase in purchases and a 2.66% increase in ad revenue on the eBay mobile app, which was deemed successful enough to take the model to production globally.
The second phase added the “Add to Watchlist”, “Add to Cart”, “Make Offer” and “Buy Now” actions to the model, and the A/B test showed even better engagement (e.g., more clicks and BOWCs).
Feature image by Igor Schubin from Pixabay