
Image Processing Pipeline

A user uploads a 3 MB photo. All four resolution variants must be ready within 10 seconds, or the photo cannot appear in feeds.
The upload API saves the original to S3 and enqueues an event. The image processing pipeline then spawns four independent resize jobs in parallel: a 150px thumbnail, 320px small, 640px medium, and 1080px full.
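The fan-out can be sketched as follows. This is a minimal illustration using a thread pool and a hypothetical `resize_variant` worker; in production each job would run on a separate queue consumer, not in the API process.

```python
from concurrent.futures import ThreadPoolExecutor

# Target widths for the four variants described above.
VARIANTS = {"thumbnail": 150, "small": 320, "medium": 640, "full": 1080}

def resize_variant(original_key: str, name: str, width: int) -> str:
    # Stand-in for the real libvips resize job; returns the S3 key
    # the finished variant would be written to.
    return f"{original_key}/{name}_{width}.jpg"

def fan_out(original_key: str) -> dict:
    # Run the four resize jobs in parallel and collect their output keys.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {
            name: pool.submit(resize_variant, original_key, name, width)
            for name, width in VARIANTS.items()
        }
        return {name: f.result() for name, f in futures.items()}
```

Because the four jobs are independent, the total latency is bounded by the slowest variant (the 1080px full size), not the sum of all four.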
We chose an asynchronous pipeline (not synchronous work inside the upload request) because blocking the upload API for 10 seconds of resize work would time out most mobile clients.
We use libvips (not ImageMagick) because libvips processes images in streaming mode with roughly 10x less memory. Each job applies JPEG compression at 75-85% quality and strips EXIF metadata, except the GPS coordinates needed for location tagging.
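The selective metadata strip amounts to keeping one tag family and dropping the rest. A toy sketch, using illustrative string tag names (real code would work with the numeric EXIF tag IDs exposed by the image library):

```python
def strip_exif(tags: dict) -> dict:
    # Keep only the GPS block needed for location tagging; drop
    # everything else (camera model, timestamps, orientation, etc.).
    return {k: v for k, v in tags.items() if k.startswith("GPS")}
```

For example, `strip_exif({"GPSLatitude": 37.77, "Make": "Apple"})` keeps only the GPS entry.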
The pipeline also extracts a dominant color to generate a placeholder gradient shown while the photo loads, similar to how Pinterest uses BlurHash placeholders. Why does this matter for perceived performance?
Without it, users see a white flash for 200-500ms while the image loads. Trade-off: async processing means a 2-5 second delay before the photo is visible.
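The dominant-color extraction behind the placeholder can be as cheap as averaging the already-tiny thumbnail's pixels. A sketch, assuming the image library has decoded the thumbnail into a list of `(r, g, b)` tuples:

```python
def dominant_color(pixels: list) -> str:
    # Average all pixels of the 150px thumbnail and emit a hex color
    # the client can render as a gradient while the real image loads.
    n = len(pixels)
    r = sum(p[0] for p in pixels) // n
    g = sum(p[1] for p in pixels) // n
    b = sum(p[2] for p in pixels) // n
    return f"#{r:02x}{g:02x}{b:02x}"
```

A mean color is the simplest choice; BlurHash-style placeholders encode more structure but cost more CPU per upload.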
We accept this because users tolerate a short upload delay but not a laggy feed. If any variant fails, the job retries with exponential backoff.
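The retry loop can be sketched like this. The attempt count, 1s base delay, and 30s cap are illustrative defaults, not values from the source; the `sleep` parameter is injectable so the schedule can be tested without waiting.

```python
import time

def retry_with_backoff(job, attempts=5, base=1.0, cap=30.0, sleep=time.sleep):
    # Retry a failing resize job, doubling the delay each attempt
    # (1s, 2s, 4s, ...) up to `cap`; re-raise after the last attempt.
    for i in range(attempts):
        try:
            return job()
        except Exception:
            if i == attempts - 1:
                raise
            sleep(min(cap, base * 2 ** i))
```

Production retries usually also add random jitter to the delay so that failing jobs do not retry in lockstep against the same overloaded dependency.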
Only after all 4 variants are confirmed in S3 does the photo become visible and trigger fanout.
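The visibility gate is a simple all-of-four check; a sketch with hypothetical names:

```python
# The photo becomes visible (and triggers feed fanout) only once all
# four variants are confirmed in S3.
REQUIRED_VARIANTS = {"thumbnail", "small", "medium", "full"}

def ready_to_publish(confirmed_variants: set) -> bool:
    return REQUIRED_VARIANTS.issubset(confirmed_variants)
```

Gating on the full set avoids publishing a photo whose feed renders a thumbnail but 404s on the full-size tap-through.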
Why it matters in interviews
Describing the parallel resize pipeline with specific tool choices (libvips over ImageMagick) and compression ratios shows you think about real constraints. Mentioning the dominant color placeholder demonstrates awareness of perceived performance, a detail that separates candidates who build for users from those who build for machines.