max 5000 artworks scraped per artist #30

neenkah · 2023-11-13T19:48:06Z

Only the first 5000 artwork urls get saved to works.txt for artists with many artworks (photographers) such as Gordon Parks, Alfred Eisenstaedt or Carl Mydans.

Scraping all of these would be incredibly time demanding, so it might be nice to provide an adjustable limit.
I also noticed that when scrolling through the artworks index to gather the urls, the page becomes very slow because it keeps loading all the artworks. It might be nice to sort by date and scroll through every two years (with the url ending in: &date=1954).
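
To make the date idea concrete, here is a rough sketch of how the per-window index urls could be generated. It only assumes the `&date=<year>` suffix mentioned above; the function name `dated_urls`, the example index url, and the year range are hypothetical, and the exact way the site interprets `date` has not been verified here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def dated_urls(index_url, first_year, last_year, step=2):
    """Yield one index url per `step`-year window, each ending in &date=<year>.

    `index_url` is the artist's artworks index page (hypothetical input);
    any existing `date` parameter is replaced rather than duplicated.
    """
    parts = urlsplit(index_url)
    # Keep every existing query parameter except a previous `date`.
    base_query = [(k, v) for k, v in parse_qsl(parts.query) if k != "date"]
    for year in range(first_year, last_year + 1, step):
        query = urlencode(base_query + [("date", str(year))])
        yield urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    # Placeholder url, not a real artist page.
    for url in dated_urls("https://example.org/entity/x?categoryId=artist", 1950, 1955):
        print(url)
```

Each generated url could then be scrolled separately, so a single page never has to hold the artist's whole catalogue.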

modhurita · 2023-11-23T12:31:49Z

Hi @neenkah, thanks for bringing this to our attention.

> Only the first 5000 artwork urls get saved to works.txt

Do you know what exactly happens when you hit the 5000 artworks limit while scraping? Does the right arrow disappear, is it no longer clickable, or something else entirely?

> Scraping all of these would be incredibly time demanding, so it might be nice to provide an adjustable limit.

What exactly do you mean? Do you mean that we should scroll through to (much) fewer than 5000 artworks at a time?

> It might be nice to sort by date and scroll through every two years (with the url ending in: &date=1954).

Collecting the artworks by year is a good suggestion. However, an artist like Alfred Eisenstaedt with ~200,000 artworks might well have more than 5000 artworks per year in their most prolific years.
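
One way to handle that caveat would be to merge the per-year results and flag any year that reaches the cap, since such a year was probably truncated and would need a narrower window. The sketch below is only an illustration: `merge_by_year`, the `urls_by_year` mapping, and the assumption that the 5000 cap applies per date-filtered view are all hypothetical.

```python
def merge_by_year(urls_by_year, cap=5000):
    """Merge per-year url lists and report years that hit the cap.

    urls_by_year: dict mapping a year to the list of artwork urls collected
    for that year (hypothetical structure).
    Returns (merged, saturated): the de-duplicated urls in first-seen order,
    and the years whose list reached `cap` and may therefore be incomplete.
    """
    seen = set()
    merged = []
    saturated = []
    for year in sorted(urls_by_year):
        urls = urls_by_year[year]
        if len(urls) >= cap:
            saturated.append(year)
        for url in urls:
            # Neighbouring windows may overlap, so skip urls already collected.
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged, saturated


if __name__ == "__main__":
    merged, saturated = merge_by_year({1954: ["a", "b"], 1952: ["b", "c", "d"]}, cap=3)
    print(merged)     # urls from 1952 first, then the new ones from 1954
    print(saturated)  # years that reached the cap
```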

neenkah · 2023-11-27T22:49:14Z

Hi @modhurita,

I have never watched the scraper at the moment it reaches 5000 artworks, so I cannot provide any details about that.

About the adjustable limit: I did indeed mean fewer artworks, for the case where someone using the scraper has a limited timeframe and prefers coverage of all artists over coverage of all artworks. I think it could be a nice feature to have, but it might not be feasible to check this whilst scraping...
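
For what it's worth, if the urls are available as an iterable before they are written to works.txt, a limit could be as simple as the sketch below. This is not based on the scraper's actual code: `save_artwork_urls` and `max_artworks` are made-up names, and it assumes one url per line in the output file.

```python
from itertools import islice


def save_artwork_urls(urls, path, max_artworks=None):
    """Write at most `max_artworks` urls to `path`, one per line.

    urls: any iterable of artwork urls (hypothetical input).
    max_artworks: None keeps everything; an integer stops early, so a lazy
    source is not consumed beyond the limit.
    Returns the number of urls written.
    """
    kept = list(islice(urls, max_artworks))
    with open(path, "w", encoding="utf-8") as f:
        f.writelines(url + "\n" for url in kept)
    return len(kept)


if __name__ == "__main__":
    # Keep only the first 2 of 3 placeholder urls.
    print(save_artwork_urls(["u1", "u2", "u3"], "works.txt", max_artworks=2))
```

Whether the scrolling itself can stop early once the limit is reached is a separate question that this does not answer.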
