If I use the movie search API to search for "Hard Boiled", I get six results. The first two are 172230 (German overview text and no other data) and 11782 (English text and lots of other data).
"page" : 1,
"results" : [
  {
    "adult" : false,
    "backdropPath" : null,
    "id" : 172230,
    "originalTitle" : "Hard Boiled",
    "popularity" : 0.23,
    "posterPath" : "/vzkMzRMAvwUJkf5K0aMLbk1N2jZ.jpg",
    "releaseDate" : null,
    "title" : "Hard Boiled",
    "voteAverage" : 0.0,
    "voteCount" : 0
  },
  {
    "adult" : false,
    "backdropPath" : "/kWBOQ9G5INFsfPvrDfXUYj06kF0.jpg",
    "id" : 11782,
    "originalTitle" : "辣手神探",
    "popularity" : 2.645050811847,
    "posterPath" : "/pH1w6HQIxVqvx40dcUJIvFJb0D2.jpg",
    "releaseDate" : 703378800000,
    "title" : "Hard Boiled",
    "voteAverage" : 7.1,
    "voteCount" : 10
    ...
I am curious as to why 172230 is being returned first, since ordinarily TMDb seems to do a great job of returning the "right" (I accept that "right" is subjective!) movie first. In fact, out of about 250 title searches I do, I reckon I get back a correct first match on 240 or more of them, even when multiple search results are returned (and that's without specifying a year).
Even if I request language=en, I get that German movie first.
I suppose the short version of my question is: how are search results ranked, and what ordering should I expect?
Reply by caprica6
on January 23, 2014 at 3:17 PM
To add insult to injury, that German movie has a poster that I would say should not be returned when I exclude adult results from the search.
Reply by Travis Bell
on January 23, 2014 at 5:59 PM
Search results are ordered with exact matches first, then by Solr's relevance score, boosted by things like popularity and release date. Searching for titles has nothing to do with languages or the translations that exist. It's a string-to-string check, meaning whatever you enter as a query is searched against everything we index. If there's a match, we show it.
The fields that are checked are the original title, translated titles and alternative titles. The most weight is given to a match on the original title, then to translated titles, and the least weight to alternative titles.
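To make that ordering concrete, here is a rough Python sketch of one way such a ranking could behave. It is not TMDb's or Solr's actual scoring: the numeric weights, the partial-match factor, the popularity boost formula and the `Movie` class are all assumptions made up for illustration, and the release date boost is left out. It also assumes the English title of 11782 is stored as a translated title.

```python
import math
from dataclasses import dataclass, field

# Hypothetical weights. The thread only states the order:
# original title > translated titles > alternative titles.
FIELD_WEIGHTS = {"original": 3.0, "translated": 2.0, "alternative": 1.0}


@dataclass
class Movie:
    id: int
    original_title: str
    translated_titles: list = field(default_factory=list)
    alternative_titles: list = field(default_factory=list)
    popularity: float = 0.0


def match_score(movie, query):
    """Return (is_exact_match, relevance) for one movie against the query."""
    q = query.casefold()
    candidates = (
        [("original", movie.original_title)]
        + [("translated", t) for t in movie.translated_titles]
        + [("alternative", t) for t in movie.alternative_titles]
    )
    exact, best = False, 0.0
    for field_name, title in candidates:
        t = title.casefold()
        if t == q:
            exact = True
            best = max(best, FIELD_WEIGHTS[field_name])
        elif q in t:
            # Assumed: a partial match counts for half the field weight.
            best = max(best, 0.5 * FIELD_WEIGHTS[field_name])
    # Assumed boost shape: a small logarithmic bump for popularity.
    boost = 1.0 + 0.1 * math.log1p(movie.popularity)
    return exact, best * boost


def rank(movies, query):
    """Exact matches first, then by descending boosted relevance."""
    scored = [(m, *match_score(m, query)) for m in movies]
    matched = [s for s in scored if s[2] > 0]
    matched.sort(key=lambda s: (not s[1], -s[2]))
    return [m for m, _, _ in matched]


# The two results from the question, using the values in the response above.
german = Movie(172230, "Hard Boiled", popularity=0.23)
hong_kong = Movie(11782, "辣手神探", translated_titles=["Hard Boiled"],
                  popularity=2.645050811847)

ranked_ids = [m.id for m in rank([hong_kong, german], "Hard Boiled")]
print(ranked_ids)  # [172230, 11782]
```

Under these assumed weights, both movies are exact matches, but 172230 matches on its original title while 11782 matches only on a translated title, so 172230 sorts first despite its lower popularity. That is consistent with the behaviour reported in the question, although the real scoring may differ.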
With regard to the image, there's no image filtering that takes place (we can't even tag the images). Only movies themselves are filtered, and if the movie isn't an adult movie then whatever image is set on it will show.
Reply by David Flannery
on January 28, 2014 at 11:27 AM
The movie search fails to find the following titles (in the first 20 results anyway):
Dune
Devil
Dave
Big
Searches using longer movie titles succeed.
Let's take "Big" as an example. The url used by my app was:
http://api.themoviedb.org/3/search/movie?api_key=&query="Big"
The 20 results all had "Big" in the title but they were all longer than just "Big".
Yet a website search using "Big" returns the "Big" movie as the first result.
I find that if I add "&search_type=ngram" to the url, the searches succeed! And yet I've seen posts that ngram search is not used for the web site searches. This seems to be inconsistent. Can this be explained -- or what am I doing wrong?
Reply by Travis Bell
on January 28, 2014 at 11:34 AM
I'm not sure I understand… if I use the following query:
The first result is Dune.
Same for Devil, Dave and Big, so… I'm not sure what you're doing but everything is returning like I expect here...
Reply by David Flannery
on January 28, 2014 at 1:17 PM
I guess it was a simple error. I was enclosing the query string in double-quotes in the url. When I removed the double-quotes all seems to work super well, both for short single-word queries and for longer, and partial, titles. (The multi-word query strings are escaped so a space becomes %20, etc.)
Strangely, before, when the multi-word queries were enclosed in double-quotes the searches seemed to work OK, but not for single-word queries.
Reply by Travis Bell
on January 28, 2014 at 1:24 PM
Hey David,
Wrapping your query in quotes would indeed cause different results. I believe Solr treats a quoted query differently, but I don't know the exact specifics.
You should indeed be escaping your queries though. Characters like ampersands, apostrophes, etc. will only be handled properly if they're escaped.
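For anyone hitting the same problem, a minimal Python sketch of building the search URL with the query escaped and without surrounding double quotes might look like this. The `build_search_url` helper and the `YOUR_API_KEY` placeholder are made up for illustration; the endpoint is the one from the URL earlier in the thread.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL quoted earlier in this thread.
BASE_URL = "http://api.themoviedb.org/3/search/movie"


def build_search_url(query, api_key):
    """Build a search URL with the query percent-encoded and unquoted."""
    params = {"api_key": api_key, "query": query}
    # quote_via=quote encodes a space as %20 (the default would use '+').
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


# "YOUR_API_KEY" is a placeholder, not a real key.
print(build_search_url("Big", "YOUR_API_KEY"))
print(build_search_url("Hard Boiled", "YOUR_API_KEY"))
print(build_search_url("Fast & Furious", "YOUR_API_KEY"))
```

The title is passed as-is, with no added double quotes, and the ampersand in the last example is encoded as %26 so it is not mistaken for a parameter separator.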
Cheers.