Reply by Travis Bell
August 19, 2016 at 11:49 AM
Hi John,
I've never given anyone anything other than the API. All I can assume is that they imported the data that way. I don't think they cached that much, just enough for their demo. We have many companies that cache data locally this way.
Reply by john1jacob
August 19, 2016 at 1:46 PM
Hi Travis, thanks for the quick reply. But it doesn't seem like they had a small subset. They have movies from 2018 as well; they have the whole DB of movies and actors, I'm pretty sure of that. Ah well, but the API is so slow. There are at least 200,000 people in your DB, and with a 6-second break every 39 queries it's such a big deal. My script has been running for over 2 days now and has only managed to scrape around 30,000 people :'( At this rate I think it will take a month to finish. And I'm not even sure what the last person ID is, because this person's ID https://www.themoviedb.org/person/1267329-lupita-nyong-o is around 1.2 million (gulp... :"()
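A minimal Python sketch of the scraping loop being described here, assuming the pacing stated above (39 requests, then a 6-second break) and assuming a /3/person/{id} API endpoint that mirrors the /3/movie/{id} URLs quoted later in the thread; the XXX key is a placeholder as in those URLs:

    import time
    import requests

    API_KEY = "XXX"  # placeholder, as in the URLs quoted below

    def scrape_people(start_id, end_id, out_path):
        # Fetch person records one by one, pausing to respect the rate
        # limit and skipping IDs that return 404 (dead IDs).
        with open(out_path, "a", encoding="utf-8") as out:
            for n, person_id in enumerate(range(start_id, end_id + 1), 1):
                r = requests.get(
                    "http://api.themoviedb.org/3/person/%d" % person_id,
                    params={"api_key": API_KEY},
                )
                if r.status_code == 200:
                    out.write(r.text + "\n")
                if n % 39 == 0:  # 39 queries, then a 6-second break
                    time.sleep(6)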
Reply by Travis Bell
August 20, 2016 at 12:07 PM
Hi John,
I can't speak to anything Algolia is or isn't doing. They have the same limits as everyone else. Keep in mind, we limit by IP (not API key), so perhaps they've spun up a few extra servers to have some jobs running in parallel.
I do have plans to release some ID files at some point, to let you know in advance which IDs exist so you can skip all of the dead ones. I'm not sure when I'll get to that, but it is on my radar.
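A hedged sketch of how such an ID file might be consumed once released; the one-numeric-ID-per-line format and the file name are purely assumptions, since no format had been published at this point:

    def load_live_ids(path):
        # Hypothetical format: one numeric person ID per line.
        with open(path, encoding="utf-8") as f:
            return [int(line) for line in f if line.strip()]

    # The scraping loop above could then iterate over
    # load_live_ids("person_ids.txt") instead of a blind
    # range(start_id, end_id), skipping dead IDs entirely.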
Reply by john1jacob
August 20, 2016 at 11:57 PM
Ahh! That would be perfect, because without knowing the missing resources it's a very long task. That's why I'm first extracting the existing IDs and dumping them into a local file. But still, it is slow. Anyway, thanks for this amazing API! :) One more quick question: do the IDs get reused by any chance? For instance, is it possible that an ID with no resource now might have a value in the future, or do you insert new names into new IDs ONLY?
PS: Why doesn't your API include the cast list when fetching a movie by ID? :'( Because of that I have to make 2 calls: one to http://api.themoviedb.org/3/movie/17?api_key=XXX and another to http://api.themoviedb.org/3/movie/17/credits?api_key=XXXXX for the people in this movie. Is there any simple way to get them both in a single request, by any chance?
Reply by Travis Bell
August 21, 2016 at 1:22 AM
IDs do not get reused, no.
You should be using append_to_response to do all media queries in a single request. Here's an example calling credits, images and videos in a single request:

http://api.themoviedb.org/3/movie/17?api_key=XXX&append_to_response=credits,images,videos
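The same request as a Python sketch, with the movie ID and placeholder key taken from the URLs in the question above:

    import requests

    # One request instead of two: credits (and here images and videos
    # too) are appended to the main movie response.
    resp = requests.get(
        "http://api.themoviedb.org/3/movie/17",
        params={"api_key": "XXX", "append_to_response": "credits,images,videos"},
    )
    movie = resp.json()
    cast = movie["credits"]["cast"]  # cast list, no second call needed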
Reply by john1jacob
August 21, 2016 at 3:18 AM
Amazing! Thanks, mate. :) Also, if there's a donation option I would love to donate, because I'm assuming themoviedb is more of a non-profit thingy?
Reply by Travis Bell
August 21, 2016 at 10:37 AM
Thanks!
I do not take donations, don't worry about that. Just attribute TMDb as the source of your data and/or images, and help out by contributing missing data if you come across it ;)
Reply by john1jacob
August 22, 2016 at 8:46 AM
Oh, actually I wanted to write a script to dump the translations of movie titles from IMDB to TMDb using Python :P Would that be a legal thing to do? Since, you know... it's anonymous data that you're receiving... haha xD
Reply by Travis Bell
August 24, 2016 at 9:59 PM
As long as the data is copyright-less or something like Creative Commons, it's usually ok. It's more about the source content than anything else.