Maximum 2000 titles #40

robertox85 · 2019-09-15T10:17:01Z

I'm trying to get the full list of movies by paging through the results, but the response returns a maximum of 2000 titles. I am using the search_for_item method with providers, page, and page_size as parameters. Has anyone else noticed this? Here's my code:

`
from justwatch import JustWatch
import json

just_watch = JustWatch(country='IT')

i = 1
while True:
    # Request one page of results for the selected providers.
    get_providers = just_watch.search_for_item(
        providers=[
            'nfx', 'prv', 'ntv', 'tvi', 'inf', 'skg',
            'rai', 'itu', 'ply', 'msf', 'pls', 'chi',
            'wki', 'ytr', 'mbi', 'gdc', 'uci', 'cru'
        ],
        page=i,
        page_size=300)

    # Stop as soon as a page comes back empty.
    if not get_providers['items']:
        break

    # Write each page to its own JSON file.
    with open("output-" + str(i) + ".json", "a") as f:
        print(json.dumps(get_providers['items']), file=f)
    i = i + 1
`


siftnoorsingh · 2020-11-16T05:38:27Z

I'm having the same issue. Is there a way to fix this?

AdriaPadilla · 2022-03-08T12:55:30Z

Same issue. The maximum number of titles returned by search_for_item() is limited to 2000, even when using a loop with the page=n and page_size=n parameters.

from justwatch import JustWatch
import json
import time

provider = "nfx"
country = "ES"
page_size = 100
page = 1

def extractor(page, provider, country):
    just_watch = JustWatch(country=country)
    results = just_watch.search_for_item(providers=[provider], page=page, page_size=page_size)
    # Round-trip through JSON to get plain dicts and lists.
    data = json.loads(json.dumps(results))
    # total_pages is reported in the response header.
    pages = data["total_pages"]

    ## Here is code to dump response to .json files.. :)

    if page < pages:
        time.sleep(1)  # pause between requests
        extractor(page + 1, provider, country)

extractor(page, provider, country)

JSON response header:

{
    "page": 1,
    "page_size": 100,
    "total_pages": 20,
    "total_results": 5031,
    "items": [
        {

Although total_results is 5031, only 2000 titles can be reached through pagination.
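
The cap is visible in the response header itself: 20 pages of 100 items cover 2000 titles, well short of the reported total. A minimal sketch of that arithmetic, using only the fields shown in the header (the helper name `reachable_titles` is hypothetical and not part of the justwatch package):

```python
def reachable_titles(header):
    """Return (reachable, missing) title counts implied by a response header."""
    # The most that paging can ever return is total_pages * page_size.
    reachable = min(header["total_pages"] * header["page_size"],
                    header["total_results"])
    missing = header["total_results"] - reachable
    return reachable, missing


# Values copied from the response header above.
header = {"page": 1, "page_size": 100, "total_pages": 20, "total_results": 5031}
print(reachable_titles(header))  # (2000, 3031)
```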

By the way... great job and many thanks for working on this.

Ideas?

AltFreq07 · 2022-05-23T11:32:07Z

Still an issue, anyone got a fix yet?

AdriaPadilla · 2022-05-23T11:36:13Z

> Still an issue, anyone got a fix yet?

I don't think there's a solution for this... This thread has been open since 2019, so I don't have high expectations. But it would be nice to know whether the limitation is imposed by the platform or by the code. If it's the latter, maybe we can work together to find a solution.

AltFreq07 · 2022-05-23T11:47:05Z

Looks like they may have moved away from that API to GraphQL, which I know nothing about.

This is pretty much how the site works now. I can't quite get the numbers right when approaching the 2,000 mark, but it does seem to go higher using "afterCursor" instead of page.

const data = await getAPI("POPULAR", 40, "", "SHOW", "nfx")

async function getAPI(sortBy, numberOfItems, afterCursor, type, service) {
    const fetchPost = await fetch("https://apis.justwatch.com/graphql", {
        "headers": {
            "content-type": "application/json"
        },
        "referrer": "https://www.justwatch.com/",
        "referrerPolicy": "strict-origin-when-cross-origin",
        "body": `{\"operationName\":\"GetPopularTitles\",
        \"variables\":{\"popularTitlesSortBy\":\"${sortBy}\",\"first\":${numberOfItems},\"platform\":\"WEB\",\"sortRandomSeed\":0,\"popularAfterCursor\":\"${afterCursor}\",\"popularTitlesFilter\":{\"ageCertifications\":[],\"excludeGenres\":[],\"excludeProductionCountries\":[],\"genres\":[],\"objectTypes\":[\"${type}\"],\"productionCountries\":[],\"packages\":[\"${service}\"],\"excludeIrrelevantTitles\":false,\"presentationTypes\":[],\"monetizationTypes\":[]},\"watchNowFilter\":{\"packages\":[\"${service}\"],\"monetizationTypes\":[]},\"language\":\"en\",\"country\":\"AU\"},
        \"query\":\"query GetPopularTitles($country: Country!, $popularTitlesFilter: TitleFilter, $watchNowFilter: WatchNowOfferFilter!, $popularAfterCursor: String, $popularTitlesSortBy: PopularTitlesSorting! = ${sortBy}, $first: Int! = ${numberOfItems}, $language: Language!, $platform: Platform! = WEB, $sortRandomSeed: Int! = 0, $profile: PosterProfile, $backdropProfile: BackdropProfile, $format: ImageFormat) {\\n  popularTitles(\\n    country: $country\\n    filter: $popularTitlesFilter\\n    after: $popularAfterCursor\\n    sortBy: $popularTitlesSortBy\\n    first: $first\\n    sortRandomSeed: $sortRandomSeed\\n  ) {\\n    totalCount\\n    pageInfo {\\n      startCursor\\n      endCursor\\n      hasPreviousPage\\n      hasNextPage\\n      __typename\\n    }\\n    edges {\\n      ...PopularTitleGraphql\\n      __typename\\n    }\\n    __typename\\n  }\\n}\\n\\nfragment PopularTitleGraphql on PopularTitlesEdge {\\n  cursor\\n  node {\\n    id\\n    objectId\\n    objectType\\n    content(country: $country, language: $language) {\\n      title\\n      fullPath\\n      scoring {\\n        imdbScore\\n        __typename\\n      }\\n      posterUrl(profile: $profile, format: $format)\\n      ... 
on ShowContent {\\n        backdrops(profile: $backdropProfile, format: $format) {\\n          backdropUrl\\n          __typename\\n        }\\n        __typename\\n      }\\n      __typename\\n    }\\n    likelistEntry {\\n      createdAt\\n      __typename\\n    }\\n    dislikelistEntry {\\n      createdAt\\n      __typename\\n    }\\n    watchlistEntry {\\n      createdAt\\n      __typename\\n    }\\n    watchNowOffer(country: $country, platform: $platform, filter: $watchNowFilter) {\\n      id\\n      standardWebURL\\n      package {\\n        packageId\\n        clearName\\n        __typename\\n      }\\n      retailPrice(language: $language)\\n      retailPriceValue\\n      lastChangeRetailPriceValue\\n      currency\\n      presentationType\\n      monetizationType\\n      __typename\\n    }\\n    ... on Movie {\\n      seenlistEntry {\\n        createdAt\\n        __typename\\n      }\\n      __typename\\n    }\\n    ... on Show {\\n      seenState(country: $country) {\\n        seenEpisodeCount\\n        progress\\n        __typename\\n      }\\n      __typename\\n    }\\n    __typename\\n  }\\n  __typename\\n}\\n\"}`,
        "method": "POST",
        "mode": "cors",
        "credentials": "omit"
    })
    return await fetchPost.json()
}
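
The query above returns `pageInfo.endCursor` and `pageInfo.hasNextPage`, so paging with "afterCursor" amounts to feeding each response's end cursor into the next request. A minimal Python sketch of that loop follows. It assumes a caller-supplied `fetch_page(first, after)` function (hypothetical) that performs the request and returns the connection object holding `edges` and `pageInfo`; `fake_fetch` is a stand-in with made-up data so the loop can run without network access, and real cursors are opaque strings rather than numbers.

```python
def iter_edges(fetch_page, page_size=40, max_pages=1000):
    """Yield edges from a cursor-paginated connection, one page at a time."""
    cursor = ""  # an empty cursor requests the first page
    for _ in range(max_pages):  # hard stop in case hasNextPage never turns False
        connection = fetch_page(page_size, cursor)
        yield from connection["edges"]
        info = connection["pageInfo"]
        if not info["hasNextPage"] or not connection["edges"]:
            break
        cursor = info["endCursor"]


def fake_fetch(first, after):
    """Hypothetical stand-in for the real request: serves 95 fake edges."""
    start = int(after) if after else 0
    end = min(start + first, 95)
    return {
        "edges": [{"cursor": str(i + 1), "node": {"id": i}} for i in range(start, end)],
        "pageInfo": {"endCursor": str(end), "hasNextPage": end < 95},
    }


edges = list(iter_edges(fake_fetch))
print(len(edges))  # 95
```

Keeping the request behind `fetch_page` separates the paging logic from the HTTP details, which may change since the endpoint is undocumented.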

AdriaPadilla · 2022-05-23T12:05:21Z

Did you find any documentation or implementation details for this "new" API?

It's worth mentioning that the JustWatch project is financed (I don't know whether fully or partially) by public funds from the European Union. It would be interesting to know whether there is an open data policy, and clear documentation on how to access the complete catalogue.

Thanks!

AltFreq07 · 2022-05-23T12:34:31Z

Nothing other than how their site interacts with the API. I might have a look through the frontend source and see if they have any easily readable query builders for it. I will keep you posted.


AltFreq07 · 2022-05-24T02:41:10Z

Unfortunately even their website has the same limitations

robertox85 · 2022-05-24T04:11:40Z

> Unfortunately even their website has the same limitations

What a shame. I've been looking for other open databases but haven't found anything interesting. Instead of using the API, couldn't you scrape the data from the front end?

AltFreq07 · 2022-05-24T04:46:32Z

Their website also only displays up to 2000 entries when you scroll down :( I'd say it has to do with the popular listings we are querying, which are likely limited to roughly 2k entries.


AdriaPadilla · 2022-05-24T10:32:25Z

> > Unfortunately even their website has the same limitations
>
> What a shame. I've been looking for other open databases but haven't found anything interesting. Instead of using the API, couldn't you scrape the data from the front end?

True, what a shame. I can understand that they need to make money from this, so it's normal that it's not as accessible as we would like.

There aren't many alternatives. IMDB has never had an API. It is now integrating with Amazon Prime, and if you want to access its data you have to pay for the service. Web scraping on IMDB has also become very complicated, because they have obfuscated all the code with JavaScript. Other alternatives like FilmAffinity have a less extensive catalog...

Did you try TheMovieDB? https://www.themoviedb.org/documentation/api?language=en
They say: "Our API is free to use as long as you attribute TMDB as the source of the data and/or images".

AltFreq07 · 2022-05-27T03:46:21Z

You might be onto something, Adria. They do have a "where to watch" feature, which is why I was using JustWatch.
