Zach Leatherman

Speedy Screenshots, or How I Improved the Robustness of the Screenshot Service

February 18, 2022

This is a reply to my own blog post: Building an Automated Screenshot Service on Netlify in ~140 Lines of Code.

There is a limitation with the Screenshot Service: when the page you’re taking a screenshot of is slow and/or very large, the request times out. Quoth myself, from ~7 months ago:

What happens if a site is super slow or is currently down?

Netlify Functions have a 10 second execution limit. If the site doesn’t render in 10 seconds, we show a fallback image by default. Currently this is a low-contrast 11ty logo using the same image size as the requested screenshot (via SVG width and height attributes).

While this fallback behavior is okay, I was starting to see it more often than I’d like. Why, you might ask? Why would it take more than 10 seconds to fetch a screenshot?

Here’s a sample OpenGraph image declaration from the <head> of one of my blog posts (the <details-utils> one):

<meta name="og:image" content="https://v1.screenshot.11ty.dev/https%3A%2F%2Fwww.zachleat.com%2Fopengraph%2Fweb%2Fdetails-utils%2F/opengraph/">

When requesting this image, the api-screenshot service loads and renders https://www.zachleat.com/opengraph/web/details-utils/ using Puppeteer to return a 1200×630 screenshot jpeg image.
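
As a minimal sketch of how a URL like the one in the `<meta>` tag above is put together: the target page is percent-encoded so it fits in a single path segment, followed by the size keyword. The helper name `screenshotUrl` and the constant are hypothetical, used only for illustration; the origin and the `opengraph` segment are taken from the example above.

```javascript
// Hypothetical helper: builds a screenshot URL shaped like the og:image value above.
const SCREENSHOT_ORIGIN = "https://v1.screenshot.11ty.dev";

function screenshotUrl(pageUrl, size = "opengraph") {
  // encodeURIComponent escapes ":" and "/" so the target URL survives as one path segment.
  return `${SCREENSHOT_ORIGIN}/${encodeURIComponent(pageUrl)}/${size}/`;
}

console.log(screenshotUrl("https://www.zachleat.com/opengraph/web/details-utils/"));
// https://v1.screenshot.11ty.dev/https%3A%2F%2Fwww.zachleat.com%2Fopengraph%2Fweb%2Fdetails-utils%2F/opengraph/
```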

However, on that dedicated /opengraph/web/details-utils/ page waits a big ol’ chunk of kryptonite. Specifically, this page makes another screenshot service request 😅 to use the referenced blog post as a background image (in this case /web/details-utils/).

background-image: url('https://v1.screenshot.11ty.dev/https%3A%2F%2Fwww.zachleat.com%2Fweb%2Fdetails-utils%2F/opengraph/');

Okay, fine. Let’s admit what happened here. I flew too close to the sun. I chained too many screenshots together. This was causing timeouts for larger/weightier blog posts and pages (showing the low-contrast SVG of the default 11ty logo).

Have I overengineered it? Yes. But if we engineer it more—it will modulo back around to normal levels of engineering. Maybe even underengineering. Right? That’s how this works? I’m not willing to admit the answer to this yet.

But I do know that we can fix it. And we can fix it without removing any of the links in the chain of prized and celebrated screenshots.

Adding a new timeout

I started by adding a new timeout to the screenshot service:

  1. New: At 7 seconds (by default, 1.5 seconds before the timeout option), we attempt to inject a clientside JavaScript window.stop() on the page to cancel page load. The logic here is that a partially rendered page is better than the fallback 11ty logo.
    • via MDN: This method cannot interrupt its parent document's loading, but it will stop its images, new windows, and other still-loading objects.
  2. At 8.5 seconds (by default), we use Puppeteer’s timeout property on the goto method to stop early. You can now customize this in the Screenshot API url. We handle this error by showing the aforementioned fallback 11ty logo.
  3. At 10 seconds, the serverless function times out and shows a gnarly HTTP 502 with a text/plain error message. You shouldn’t see this.
    • e.g. {"errorMessage":"2022-02-20T02:00:14.320Z […truncated] Task timed out after 10.00 seconds"}
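
The three tiers above could be sketched roughly as follows. This is not the service's actual code: `getTimings` and `gotoWithEarlyStop` are hypothetical names, and `page` is assumed to be a Puppeteer `Page` passed in by the caller. The only Puppeteer calls used are `page.goto()` with its `timeout` option and `page.evaluate()`.

```javascript
// Defaults mirror the numbers in the list above; all names here are hypothetical.
const FUNCTION_LIMIT_MS = 10000; // tier 3: serverless execution limit
const STOP_LEAD_MS = 1500; // window.stop() fires this long before the goto timeout

function getTimings(timeoutMs = 8500) {
  return {
    stopAt: Math.max(timeoutMs - STOP_LEAD_MS, 0), // tier 1
    gotoTimeout: timeoutMs, // tier 2
    functionLimit: FUNCTION_LIMIT_MS,
  };
}

// Resolves to "loaded", "stopped" (partial render), or "timeout" (show the fallback).
async function gotoWithEarlyStop(page, url, timeoutMs = 8500) {
  const { stopAt, gotoTimeout } = getTimings(timeoutMs);
  let timer;
  const earlyStop = new Promise((resolve) => {
    timer = setTimeout(async () => {
      try {
        // Runs in the browser: cancels images and other still-loading resources.
        await page.evaluate(() => window.stop());
      } catch (e) {
        // Ignore: the page may have navigated or closed in the meantime.
      }
      resolve("stopped");
    }, stopAt);
  });
  try {
    return await Promise.race([
      page.goto(url, { timeout: gotoTimeout }).then(() => "loaded"),
      earlyStop,
    ]);
  } catch (e) {
    return "timeout"; // goto rejected, for example with Puppeteer's TimeoutError
  } finally {
    clearTimeout(timer);
  }
}

console.log(getTimings()); // { stopAt: 7000, gotoTimeout: 8500, functionLimit: 10000 }
```

With the default 8.5 second timeout, the early stop lands at 7 seconds, leaving 1.5 seconds to take the screenshot before Puppeteer gives up and another 1.5 seconds before the function itself is killed.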

For the second rendered screenshot in my Open Graph image chain (the one of the real blog post), I’ve manually lowered the timeout option (and with it the clientside timeout) so that the second screenshot gives up before the first screenshot hits its own timeout.
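
The budgeting for a chained screenshot can be sketched as simple arithmetic. The function name and the 2 second overhead figure are assumptions for illustration (the post does not state the exact values used): the inner screenshot has to respond before the outer page's `window.stop()` fires, minus whatever time the inner request spends outside of page load.

```javascript
// Hypothetical helper: picks a timeout for a screenshot nested inside another screenshot.
const STOP_LEAD_MS = 1500; // the outer window.stop() fires this long before the outer timeout

function nestedTimeoutMs(outerTimeoutMs, overheadMs) {
  // The moment the outer page stops loading its resources (including the inner screenshot).
  const outerStopAt = outerTimeoutMs - STOP_LEAD_MS;
  // Leave room for the inner request's own overhead (browser launch, encoding, network).
  return Math.max(outerStopAt - overheadMs, 0);
}

// Assumed overhead of 2000ms, purely illustrative.
console.log(nestedTimeoutMs(8500, 2000)); // 5000
```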

You can see it in action on this 12 second blocking external CSS file demo. Note that when the page loads successfully (after 12 seconds), it has a green background.

Now check out this screenshot of the 12 second demo with a 3 second screenshot timeout:

Screenshot showing a white background, the CSS file was not loaded

Previously, running the screenshot service against this page would have shown the fallback 11ty logo.

Use ttl for fallback images

When an image times out and the 11ty logo fallback image is shown, we were forced to use an HTTP 200 status code for that condition, or some browsers (Firefox) wouldn’t show the fallback image at all. This was a bit of a problem because it meant that screenshots that timed out wouldn’t be retried until a new build was triggered (which could be a long time for a dedicated screenshots service).

Fortunately, Netlify has added a time to live (ttl) option to On-demand Builders that allows you to specify a fixed amount of time (minimum 60 seconds) before a cached response is invalidated and a new one is generated. We can now add the ttl specifically for requests that hit this timeout without invalidating any of the previously successful ones!
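
A rough sketch of the idea, assuming a handler that already knows whether the screenshot timed out. The names `toBuilderResponse` and `fallbackSvg`, the empty placeholder SVG, and the one hour ttl are illustrative assumptions, not the service's actual code. In a real function, the handler returning this object would be wrapped with `builder()` from `@netlify/functions`, and `ttl` is given in seconds.

```javascript
// Assumed value for illustration; Netlify requires at least 60 seconds.
const FALLBACK_TTL_SECONDS = 3600;

// Placeholder SVG sized like the requested screenshot (the real fallback is the 11ty logo).
function fallbackSvg(width, height) {
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}" viewBox="0 0 ${width} ${height}"></svg>`;
}

function toBuilderResponse({ timedOut, imageBase64, width, height }) {
  if (timedOut) {
    return {
      statusCode: 200, // must be 200 or Firefox won't render the fallback image
      headers: { "content-type": "image/svg+xml" },
      body: fallbackSvg(width, height),
      ttl: FALLBACK_TTL_SECONDS, // only the fallback expires, so it is retried later
    };
  }
  return {
    statusCode: 200,
    headers: { "content-type": "image/jpeg" },
    body: imageBase64,
    isBase64Encoded: true, // no ttl: successful screenshots stay cached until the next build
  };
}
```

Keeping the ttl off the success path is the point of the design: only the responses that hit the timeout are regenerated on a timer.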

Other Puppeteer improvements

Next, I added a grab bag of small performance tweaks to Puppeteer:

  • goto->waitUntil: Added wait option to control when a specific screenshot considers the page to be “finished.”
  • screenshot->captureBeyondViewport: Used captureBeyondViewport: false (default was true), which cuts the screenshot to the viewport size. I’m not sure how this is different, but I also enabled the screenshot->clip option in Puppeteer.
  • I also attempted the GitHub approach to speed up Puppeteer (in the Some Performance Gotchas section) but the approach only worked with foreground images.
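
The first two tweaks above map onto option objects for Puppeteer's `page.goto()` and `page.screenshot()`. Here is a minimal sketch; the helper names `gotoOptions` and `screenshotOptions` are hypothetical, while `waitUntil`, `timeout`, `type`, `captureBeyondViewport`, and `clip` are existing Puppeteer options.

```javascript
// Lifecycle events Puppeteer accepts for waitUntil.
const WAIT_EVENTS = ["load", "domcontentloaded", "networkidle0", "networkidle2"];

function gotoOptions(wait = ["load"], timeoutMs = 8500) {
  const unknown = wait.filter((name) => !WAIT_EVENTS.includes(name));
  if (unknown.length > 0) {
    throw new Error(`Unknown waitUntil value(s): ${unknown.join(", ")}`);
  }
  return { waitUntil: wait, timeout: timeoutMs };
}

function screenshotOptions(width, height) {
  return {
    type: "jpeg",
    captureBeyondViewport: false, // do not capture content outside the viewport
    clip: { x: 0, y: 0, width, height }, // crop to exactly the requested size
  };
}

// Usage with a Puppeteer page (not executed here):
//   await page.goto(url, gotoOptions(["domcontentloaded"]));
//   const image = await page.screenshot(screenshotOptions(1200, 630));
```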

Things I didn’t do but should (?)

I could have removed the api-opengraph references from my site altogether. It’s my site. I have full knowledge of where the OpenGraph images are. I don’t need an external service to tell me that.

I could make requests directly to the screenshots API. However, api-screenshot doesn’t currently support image resizing with the opengraph viewport size—but api-opengraph does resize/optimize images.

I kept api-opengraph to get image optimization for free to avoid those weighty default 1200×630 images clogging up my page. Ultimately this means I’m chaining 3 different serverless functions together under the same 10 second limit, which feels a little risky but seems okay in practice (maybe because my site renders pretty fast as-is?).

I will likely improve the screenshot service to support at least one smaller OpenGraph image size (probably 600×315) and additional image formats (png and webp) at some point. Feels like v2.screenshot.11ty.dev may be in our future.

Good enough for now

With these changes in place, I haven’t seen any fallback 11ty logo screenshots on my site in quite some time!



Zach Leatherman is a builder for the web at CloudCannon. He is the creator and maintainer of Eleventy (11ty), an award-winning open source site generator. At one point he became entirely too fixated on web fonts. He has given 79 talks in nine different countries at events like Beyond Tellerrand, Smashing Conference, Jamstack Conf, CSSConf, and The White House. Formerly part of Netlify, Filament Group, NEJS CONF, and NebraskaJS.
