How the smart content scan works

Engineering · 22 June 2025 · 4 min read

The problem it solves

When you generate a mockup, you want three pieces of information from the target site: a headline to overlay on the image, a subheadline for context, and a brand color to use as the backdrop. You could type all of this in manually - but the whole point of GenMockups is to eliminate that friction.

The smart content scan fetches all three automatically, with no input from you beyond the URL. Here is how each piece works.


Step 1: Puppeteer takes the screenshot

The scan starts with a Puppeteer-controlled Chromium instance. Puppeteer is a Node.js library that drives a headless (no-display) version of Chrome, allowing the server to load any web page exactly as a real browser would.

When you submit a URL:

  1. Puppeteer navigates to the URL with a 1440 × 900 px viewport
  2. It waits for the networkidle2 condition - no more than two network connections open for at least 500 ms, meaning the page has largely settled
  3. It captures a JPEG screenshot of the above-the-fold area
  4. The screenshot buffer is held in memory and sent to the client session

JPEG (rather than PNG) is used for the raw capture because the file is significantly smaller, which makes it faster to transfer between the server and the client. The final export is PNG.
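As a rough sketch of the first three steps above, assuming a Puppeteer page has already been opened: the `PageLike` interface, the `captureAboveTheFold` name, the JPEG quality of 80, and the 30-second timeout are illustrative assumptions, not values taken from GenMockups. The method names and options are modeled on Puppeteer's `setViewport`, `goto`, and `screenshot`.

```typescript
// Narrow subset of the Puppeteer Page API used by this sketch.
// Declared locally so the example is self-contained (an assumption for
// illustration; real code would use the Page type from "puppeteer").
interface PageLike {
  setViewport(viewport: { width: number; height: number }): Promise<void>;
  goto(
    url: string,
    options: { waitUntil: "networkidle2"; timeout: number }
  ): Promise<unknown>;
  screenshot(options: { type: "jpeg"; quality: number }): Promise<Uint8Array>;
}

// Hypothetical helper: name, quality, and timeout are assumptions.
async function captureAboveTheFold(
  page: PageLike,
  url: string
): Promise<Uint8Array> {
  // 1. Desktop-sized viewport, as described above.
  await page.setViewport({ width: 1440, height: 900 });
  // 2. Navigate and wait until the network is nearly idle.
  await page.goto(url, { waitUntil: "networkidle2", timeout: 30000 });
  // 3. No fullPage option, so only the visible viewport is captured.
  return page.screenshot({ type: "jpeg", quality: 80 });
}
```

Because the function only depends on three methods, it can be exercised with a fake page object in tests, without launching Chromium.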


Step 2: Cheerio parses the HTML for metadata

In parallel with the screenshot, a separate lightweight HTTP request fetches the raw HTML of the same URL. This request uses a standard fetch call rather than a headless browser - it is faster and does not need JavaScript execution, since we only care about what is in the HTML source.

The HTML is parsed with Cheerio, a server-side library with a jQuery-style API. Cheerio loads the HTML into a tree structure that you can query with CSS selectors, much as you would with jQuery in a browser.

The scanner queries for the following metadata, in priority order:

For the headline:

  1. meta[property="og:title"] - Open Graph title
  2. meta[name="twitter:title"] - Twitter card title
  3. title - the HTML <title> element

For the subheadline:

  1. meta[property="og:description"] - Open Graph description
  2. meta[name="twitter:description"] - Twitter card description
  3. meta[name="description"] - standard meta description

For the brand color:

  1. meta[name="theme-color"] - the theme-color meta tag from the HTML standard, used by browsers to color the mobile status bar and tab strip

If a value is found, it is returned as-is; otherwise the scanner falls through to the next selector in the list. The headline and subheadline fallbacks are usually good enough. The brand color has no second meta tag to fall back on, so it requires a different technique.
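The priority lists above can be sketched as a small fall-through loop. This is a minimal illustration, not the scanner's actual code: `Lookup`, `firstMatch`, and `scanMetadata` are hypothetical names, and treating whitespace-only values as missing is an assumption. With Cheerio, the lookup would typically read `$(selector).attr("content")` for meta tags and `$("title").text()` for the title.

```typescript
// A lookup returns the value found for a selector, or undefined if absent.
// In a real scanner this would wrap a parsed HTML document (assumption).
type Lookup = (selector: string) => string | undefined;

const HEADLINE_SELECTORS = [
  'meta[property="og:title"]',
  'meta[name="twitter:title"]',
  "title",
];
const SUBHEADLINE_SELECTORS = [
  'meta[property="og:description"]',
  'meta[name="twitter:description"]',
  'meta[name="description"]',
];
const COLOR_SELECTORS = ['meta[name="theme-color"]'];

// Return the first non-empty value, in priority order, unchanged.
function firstMatch(lookup: Lookup, selectors: string[]): string | undefined {
  for (const selector of selectors) {
    const value = lookup(selector);
    if (value !== undefined && value.trim() !== "") return value;
  }
  return undefined;
}

function scanMetadata(lookup: Lookup) {
  return {
    headline: firstMatch(lookup, HEADLINE_SELECTORS),
    subheadline: firstMatch(lookup, SUBHEADLINE_SELECTORS),
    themeColor: firstMatch(lookup, COLOR_SELECTORS),
  };
}
```

For a page that has only a Twitter title, a `<title>`, and a standard description, `headline` resolves to the Twitter title, `subheadline` to the description, and `themeColor` stays undefined.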


Step 3: node-vibrant extracts the dominant color

If no theme-color meta tag exists - which is the case for a large portion of websites, since many developers never set it - the scanner falls back to color extraction from the screenshot itself.

node-vibrant is a port of the Android Palette API to Node.js. It analyzes an image and identifies a palette of visually prominent swatches: Vibrant, Muted, Dark Vibrant, Dark Muted, Light Vibrant, and Light Muted.

The scanner picks the Vibrant swatch first. Vibrant represents the most colorful, saturated hue in the image - which tends to correspond to the brand color or call-to-action button color in most modern web designs.

If the Vibrant swatch is too light (luminance above 0.85), the scanner falls back to Dark Vibrant. If it is too dark (luminance below 0.1), the scanner falls back to Muted. Either way, the goal is an extracted color that produces a usable background.

The result is a hex color string that is passed to the editor as the suggested background. You can override it at any time by clicking the color picker.
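The selection rule can be sketched as follows. The post does not say how luminance is computed, so this sketch assumes the WCAG relative-luminance formula. The `Palette` shape, the function names, and the behavior when a swatch is missing are also assumptions; the palette objects returned by node-vibrant carry more than the plain RGB tuples used here.

```typescript
type Rgb = [number, number, number]; // 0-255 per channel

// Simplified palette: only the three swatches the rule needs (assumption).
interface Palette {
  vibrant?: Rgb;
  darkVibrant?: Rgb;
  muted?: Rgb;
}

// WCAG relative luminance in [0, 1] (assumed definition of "luminance").
function relativeLuminance(rgb: Rgb): number {
  const linear = (channel: number): number => {
    const s = channel / 255;
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linear(rgb[0]) + 0.7152 * linear(rgb[1]) + 0.0722 * linear(rgb[2]);
}

function toHex(rgb: Rgb): string {
  const r = Math.round(rgb[0]);
  const g = Math.round(rgb[1]);
  const b = Math.round(rgb[2]);
  // The leading 1 keeps zero padding; slice(1) drops it again.
  return "#" + ((1 << 24) | (r << 16) | (g << 8) | b).toString(16).slice(1);
}

function pickBackground(palette: Palette): string | undefined {
  const { vibrant, darkVibrant, muted } = palette;
  // Assumption: with no Vibrant swatch at all, use whatever else exists.
  if (!vibrant) {
    const other = darkVibrant ?? muted;
    return other ? toHex(other) : undefined;
  }
  const luminance = relativeLuminance(vibrant);
  if (luminance > 0.85) return toHex(darkVibrant ?? vibrant); // too light
  if (luminance < 0.1) return toHex(muted ?? vibrant); // too dark
  return toHex(vibrant);
}
```

Keeping the thresholds in one function makes them easy to tune if the extracted backgrounds turn out too pale or too dark in practice.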


Why parallel requests

The screenshot and the metadata fetch run as parallel operations. The screenshot (Puppeteer) is the slower operation - it takes 5–10 seconds depending on the page. The metadata fetch (Cheerio) typically takes under one second.

Running them in parallel means the overall time is bounded by the slower operation (the screenshot), not the sum of both. Run sequentially, the two steps would take roughly 6–11 seconds; in parallel, the total stays at 5–10 seconds.
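In Node.js, this pattern is commonly written with `Promise.all`. The sketch below assumes the two steps are exposed as async functions; `runScan`, `takeScreenshot`, and `fetchMetadata` are hypothetical names used for illustration.

```typescript
// Start both tasks before awaiting either, so they overlap in time.
async function runScan<S, M>(
  takeScreenshot: () => Promise<S>,
  fetchMetadata: () => Promise<M>
): Promise<{ screenshot: S; metadata: M }> {
  const [screenshot, metadata] = await Promise.all([
    takeScreenshot(),
    fetchMetadata(),
  ]);
  return { screenshot, metadata };
}
```

Note that `Promise.all` rejects as soon as either task fails. A scanner that should still return the screenshot when the metadata fetch is blocked would need to catch that error inside `fetchMetadata` and return empty fields instead.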


Limitations and edge cases

JavaScript-rendered metadata. Some single-page applications render their <title> and meta tags via JavaScript rather than including them in the initial HTML. Since the metadata fetch uses a plain HTTP request (no JavaScript execution), these tags will not be found. In this case, the headline and subheadline fields will be empty and you will need to fill them in manually.

Theme-color as a CSS variable. Some sites set theme-color to a value like var(--brand-primary) rather than a resolved hex or RGB value. The scanner cannot resolve CSS variables from the meta tag alone, so this will be treated as a missing theme-color and the color extraction fallback will run.
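One way to implement this check is to accept only values that are already concrete colors. The sketch below is an assumption about how such a filter might look, not the scanner's actual rule: `resolveThemeColor` is a hypothetical name, and it deliberately recognizes only hex and `rgb()` notation, so named colors such as `red` would also be treated as missing.

```typescript
// Return the theme-color value if it is a resolved hex or rgb() color,
// otherwise undefined so the caller can fall back to color extraction.
function resolveThemeColor(raw: string | undefined): string | undefined {
  if (raw === undefined) return undefined;
  const value = raw.trim();
  const isHex = /^#(?:[0-9a-f]{3}|[0-9a-f]{6})$/i.test(value);
  const isRgb = /^rgb\(\s*\d{1,3}\s*,\s*\d{1,3}\s*,\s*\d{1,3}\s*\)$/i.test(value);
  return isHex || isRgb ? value : undefined;
}
```

With this filter, `#1a73e8` and `rgb(26, 115, 232)` pass through unchanged, while `var(--brand-primary)` yields undefined and triggers the screenshot-based fallback.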

Very dark or nearly monochrome sites. Sites with predominantly white, black, or gray designs may not yield a useful Vibrant swatch. The scanner returns the best approximation it can find, but the result may need manual adjustment.

Rate limiting and bot blocking. The metadata fetch sends a User-Agent header that identifies it as a conventional browser, but sites with aggressive bot detection may still block it. In that case, the headline and description fields will be empty.
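A metadata fetch that degrades gracefully might look like the sketch below. Everything named here is an assumption for illustration: the `FetchLike` type, the `fetchHtmlOrNull` helper, and the specific User-Agent string are not taken from GenMockups. The fetch implementation is passed in as a parameter so the function can be tested without network access; in Node 18 or later, the global `fetch` should fit this shape.

```typescript
// Minimal shape of the fetch function this sketch needs (assumption).
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; text(): Promise<string> }>;

// Example browser-style User-Agent; the exact string is illustrative.
const BROWSER_USER_AGENT =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36";

// Returns the page HTML, or null when the request is blocked or fails,
// so the caller can leave the headline and description fields empty.
async function fetchHtmlOrNull(
  url: string,
  fetchImpl: FetchLike
): Promise<string | null> {
  try {
    const response = await fetchImpl(url, {
      headers: { "User-Agent": BROWSER_USER_AGENT },
    });
    return response.ok ? await response.text() : null;
  } catch {
    // Network errors are treated the same as a blocked request.
    return null;
  }
}
```

Returning null rather than throwing keeps a blocked metadata request from aborting the screenshot that runs alongside it.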


The result

In the best case - a well-structured site with Open Graph tags and a theme-color meta tag - the metadata scan completes in under a second, while the Puppeteer screenshot is still being captured, and the editor is pre-filled with a relevant headline, a useful subheadline, and an accurate brand color.

In the typical case - Open Graph tags present but no theme-color - the headline and subheadline are filled in automatically and the color is extracted from the screenshot via node-vibrant. The result is usually close to the brand color, though you may want to tweak the hue.

In either case, every value is editable. The scan is a shortcut, not a constraint.

Written by
SevenLabs