Reply by Travis Bell
on April 1, 2014 at 10:11 AM
Hi vshwnth2,
You can't sort search results but you can sort the discover pages. Here's an example of movies that are rated PG-13 or less sorted by popularity:
You can read more here: http://docs.themoviedb.apiary.io/#discover
Cheers.
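The example link from this reply was not preserved. As a rough sketch of what such a discover request might look like, the snippet below builds the URL only. The base URL and the parameter names (certification_country, certification.lte, sort_by) are my understanding of the v3 discover endpoint and should be checked against the docs linked above; YOUR_API_KEY is a placeholder and build_discover_url is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed base URL for the v3 discover endpoint; verify against the API docs.
BASE_URL = "https://api.themoviedb.org/3/discover/movie"


def build_discover_url(api_key: str, max_certification: str = "PG-13", country: str = "US") -> str:
    """Build a discover URL for movies rated at or below a certification, sorted by popularity."""
    params = {
        "api_key": api_key,
        "certification_country": country,        # certifications are country-specific
        "certification.lte": max_certification,  # "rated PG-13 or less"
        "sort_by": "popularity.desc",            # most popular first
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_discover_url("YOUR_API_KEY")
print(url)
```

This only constructs the request; fetching it and paging through the results is left out.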
Reply by joshjordan
on February 16, 2015 at 10:12 AM
I'm also interested in this use case. For instance, when using search/multi for "Star Wars", we get 77 results split across four pages. However, 80%+ of the time, the user is looking for "Star Wars: Episode IV - A New Hope" (id 11), and that is pretty easy to pick out as a top result when considering the popularity and vote_count values. Unfortunately, it is not the top result in search and I don't want to have to pull all search result pages in order to sort. What should I do here?
Reply by Travis Bell
on February 16, 2015 at 10:36 AM
Hi joshjordan,
Search will never support sorting as it's designed to be sorted 100% by relevance while being boosted by popularity to help narrow results. In this case, "Clone Wars" is above "A New Hope" because the word "Wars" is in the title twice so it's believed to be a much more relevant result.
Search goes incredibly sideways when you don't sort by the calculated Lucene score in Elasticsearch.
As always, the more you type the better the results:
Cheers.
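To illustrate why a title that repeats a query word can outrank one that doesn't, here is a toy term-frequency scorer. It is not Lucene's actual scoring, which also weighs things like term rarity and field length, and term_frequency_score is a hypothetical helper. It only shows the "Wars appears twice" effect described above.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequency_score(query: str, title: str) -> int:
    """Toy relevance: total occurrences in the title of each distinct query term."""
    counts = Counter(tokenize(title))
    return sum(counts[term] for term in set(tokenize(query)))


titles = [
    "Star Wars: Episode IV - A New Hope",
    "Star Wars: The Clone Wars",
]
scores = {title: term_frequency_score("Star Wars", title) for title in titles}
for title, score in scores.items():
    print(score, title)
```

Under this toy measure the Clone Wars title scores 3 and A New Hope scores 2, purely because "Wars" occurs twice.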
Reply by joshjordan
on February 16, 2015 at 11:01 AM
Wow - thanks for your quick reply. Appreciate it, Travis!
I understand what you're saying with regard to relevancy, but real-world relevancy means more than just string matching against the movie title(s). In the real-world use cases we've tested, Clone Wars is much less relevant than A New Hope for the query "Star Wars", even though "Wars" appears twice - that's not something the user cares about. We've advised our users to use more specific search terms many times and in many ways, but many people still think "Star Wars" is the right thing to search for, so they just aren't going to do that. I can type "Star Wars A New Hope" no problem, but I'm not concerned with my own use case, I'm concerned with end-user behavior. Lucene and Elasticsearch have powerful weighting functionality to balance string-matched queries with other relevancy parameters, such as popularity.
I think I agree with your goal here: sorting 100% by relevance, but I am skeptical that the definition of relevance is just focused on search terms, from the perspective of the end-user. Does that make sense?
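As a concrete sketch of what I mean by balancing string matching with popularity, the snippet below multiplies a text-match score by a log-dampened popularity boost. This is similar in spirit to what a function-score style query can do in Elasticsearch, but it is not a claim about how TMDb's search is configured. The blended_score function, the weight parameter, and all of the text scores and popularity values are made up for illustration.

```python
import math


def blended_score(text_score: float, popularity: float, weight: float = 1.0) -> float:
    """Scale a text-match score by a log-dampened popularity boost.

    With weight=0 the result is the plain text score, so ranking stays
    purely string-based; larger weights let popularity matter more.
    """
    return text_score * (1.0 + weight * math.log1p(popularity))


# Hypothetical candidates: the scores and popularity figures are invented.
candidates = [
    {"title": "Star Wars: The Clone Wars", "text_score": 3.0, "popularity": 5.0},
    {"title": "Star Wars: Episode IV - A New Hope", "text_score": 2.0, "popularity": 50.0},
]

ranked = sorted(
    candidates,
    key=lambda c: blended_score(c["text_score"], c["popularity"]),
    reverse=True,
)
for c in ranked:
    print(round(blended_score(c["text_score"], c["popularity"]), 2), c["title"])
```

With these invented numbers, A New Hope ranks first even though its text score is lower. The logarithm keeps a very popular title from swamping the text match entirely.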