Sunday, July 20, 2025

Reusing the same Next.js Docker image with runtime CDN assetPrefix

Recently, while investigating a production issue, I needed to run the production Docker image locally for debugging. However, I ran into a significant architectural constraint, one that exposed a fundamental deployment challenge spanning both architecture and DevOps.

The Challenge: CDN Configuration

Next.js provides assetPrefix configuration to specify CDN or custom domains for static asset delivery, enabling faster content delivery through geographically distributed networks. However, this configuration presents a critical architectural constraint: it's a build-time setting that gets permanently embedded into static pages and CSS assets.

This limitation creates substantial operational challenges when reusing a Docker image:

  • Multiple Environments: Different environments like QA, stage, and production use different CDN domains.
  • Regional CDNs: Production environments could use different regional CDN domains.

The traditional approach would require building separate Docker images for each environment, violating the fundamental DevOps principle of "build once, deploy anywhere."

Architectural Solution: Runtime Asset Prefix Injection

Through careful analysis of Next.js's asset handling mechanisms and strategic use of text replacement techniques, I developed a solution that maintains build-time optimization while enabling runtime CDN configuration.

Configuration Architecture

The solution begins with environment-aware configuration that handles both basePath and assetPrefix dynamically:
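As a sketch, assuming NEXT_PUBLIC_BASE_PATH and NEXT_PUBLIC_CDN_URL are our own variable names (not Next.js built-ins), the configuration might look like:

```js
// next.config.js (sketch; the env variable names are assumptions)
// At build time NEXT_PUBLIC_CDN_URL holds a recognizable placeholder;
// the entrypoint script rewrites it to the real CDN domain at startup.
const basePath = process.env.NEXT_PUBLIC_BASE_PATH || '';
const assetPrefix = process.env.NEXT_PUBLIC_CDN_URL || '';

module.exports = {
  basePath,
  assetPrefix,
};
```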

Docker Build Strategy

During the Docker image creation process, we set placeholder values that will be replaced at runtime, along with a custom entrypoint script:
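Something along these lines (the placeholder domain, script name, base image, and install steps are assumptions and will differ per project):

```dockerfile
# Dockerfile (sketch)
FROM node:20-alpine AS builder
WORKDIR /app
COPY . .
RUN npm ci
# Bake a sentinel value into the static output; any string unlikely to
# appear naturally in the build works.
ENV NEXT_PUBLIC_CDN_URL=https://cdn-placeholder.example.com
RUN npm run build

FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app ./
COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN chmod +x /docker-entrypoint.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
```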

Runtime Replacement Logic

The entrypoint script performs intelligent text replacement, substituting placeholder values with environment-specific configurations at container startup:
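The core of that script can be sketched as a sed pass over the build output. Everything below (the placeholder value, the demo/.next directory) is an assumption chosen to make the sketch self-contained; a real entrypoint would operate on the actual .next output and finish with exec node server.js:

```shell
# Replace the baked-in placeholder with the runtime CDN domain.
PLACEHOLDER="https://cdn-placeholder.example.com"
CDN_URL="${CDN_URL:-https://cdn-eu.example.com}"

# Stand-in for the generated build output.
mkdir -p demo/.next
printf '<link href="%s/_next/static/app.css">\n' "$PLACEHOLDER" > demo/.next/index.html

# Rewrite every generated HTML/CSS/JS file in place.
find demo/.next -type f \( -name '*.html' -o -name '*.css' -o -name '*.js' \) \
  -exec sed -i "s|$PLACEHOLDER|$CDN_URL|g" {} +

cat demo/.next/index.html
# -> <link href="https://cdn-eu.example.com/_next/static/app.css">
```

Using | as the sed delimiter avoids having to escape the slashes in the URLs.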

Addressing Image Security Constraints

During implementation, I encountered an additional architectural challenge: Next.js image optimization returns cryptic 404 errors when serving images from CDN domains. Investigation into the Next.js source code revealed this is a security feature that only allows pre-approved domains for image serving.

The error message "url parameter is not allowed" provides insufficient context for troubleshooting, but the root cause is Next.js's domain whitelist mechanism. This requires configuring images.remotePatterns to explicitly allow CDN domains.
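In next.config.js this takes the form of one images.remotePatterns entry per CDN domain (the hostname below is an example):

```js
// next.config.js (fragment; example hostname)
module.exports = {
  images: {
    remotePatterns: [
      {
        protocol: 'https',
        hostname: 'cdn-eu.example.com',
      },
    ],
  },
};
```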

Advanced Runtime Configuration

The most sophisticated aspect of this solution involves dynamically updating the remotePatterns configuration at runtime. Since we're performing text replacements on configuration files, I leveraged Node.js's command-line execution capabilities to add intelligent remotePatterns generation for CDN domain whitelisting.

This approach ensures that:

  • Security policies remain enforced through domain whitelisting
  • Runtime flexibility is maintained for multi-environment deployments
  • Performance optimization continues through proper CDN utilization
  • Operational efficiency is achieved through single-image deployment

Key Architectural Benefits

This solution delivers several critical advantages:

  • Single Image Deployment: One Docker image serves all environments, reducing build complexity and storage requirements
  • Runtime Flexibility: CDN configurations adapt to deployment context without rebuild cycles
  • Performance Preservation: Static page optimization remains intact while enabling dynamic asset serving
  • Security Compliance: Domain whitelisting ensures controlled image processing
  • Operational Simplicity: Environment-specific configurations are managed through standard environment variables

Implementation Considerations

When implementing this approach, consider these architectural factors:

  • Text replacement scope: Ensure replacements target only intended configuration values
  • Environment variable validation: Implement proper fallbacks for missing or invalid CDN configurations
  • Security boundaries: Maintain strict domain whitelisting for image processing
  • Performance monitoring: Verify that runtime replacements don't impact application startup time

Conclusion

With some architectural creativity, we can resolve platform limitations while maintaining operational best practices. By combining Next.js's build-time optimizations with runtime configuration flexibility, we achieve the ideal balance of performance and deployment efficiency.

The approach enables teams to leverage CDN benefits across multiple environments while adhering to containerization principles, ultimately delivering both technical excellence and operational simplicity. For organizations deploying Next.js applications across diverse environments, this pattern provides a production-ready solution that scales with architectural complexity while maintaining deployment consistency.

Resolving Next.js Standalone Build Docker Reachability Failure on AWS ECS Fargate

While deploying a Next.js standalone build in a Docker container on AWS ECS Fargate, I encountered a subtle but critical issue that highlights the importance of understanding platform-specific runtime behaviors.

The Problem: Silent Health Check Failures

During the deployment of our Next.js application to AWS ECS Fargate, the service consistently failed health checks despite the application appearing to function correctly. The container would start successfully, but the Target Group couldn't establish connectivity, resulting in deployment failures.

Initial Investigation

Examining the container logs revealed the root cause:

- Local:   http://ip-10-0-5-61.us-west-2.compute.internal:3000
- Network: http://10.0.5.61:3000

The Next.js server was binding to the container's internal hostname rather than accepting connections from external networks. This prevented the ALB health checks from reaching the application endpoint.

Root Cause Analysis

Tracing through the Next.js standalone build revealed the hostname configuration logic in server.js:

const currentPort = parseInt(process.env.PORT, 10) || 3000
const hostname = process.env.HOSTNAME || '0.0.0.0'

The application defaults to 0.0.0.0 (accepting all connections) when no HOSTNAME environment variable is present. My initial approach was to explicitly set HOSTNAME=0.0.0.0 in the ECS task definition.

However, this approach failed due to a critical AWS Fargate behavior: Fargate automatically sets the HOSTNAME environment variable at runtime, overriding any pre-configured values.

Evaluating Solution Approaches

Approach 1: Runtime Environment Override

CMD HOSTNAME=0.0.0.0 node server.js

While functional, this approach embeds environment variable assignments within the CMD instruction, reducing container portability and maintainability.

Approach 2: Build-Time Source Modification

Using sed during the Docker build process to directly modify the hostname assignment:

RUN sed -i "s/const hostname = process.env.HOSTNAME || '0.0.0.0'/const hostname = '0.0.0.0'/g" server.js

Approach 3: Systematic Source Patching (Recommended as it provides consistent behavior locally and in the cloud)

The most architecturally sound solution leverages patch-package to modify the Next.js build process itself. The hostname assignment originates from node_modules/next/dist/build/utils.js in the writeFile function that generates server.js.

By creating a systematic patch that modifies the server generation logic, we achieve:

  • Consistency across environments (local development and production)
  • Maintainability through version-controlled patches
  • Architectural integrity by addressing the root cause rather than symptoms
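The workflow itself is short (the patch file name depends on the installed Next.js version):

```shell
# 1. Edit the server.js generation logic under
#    node_modules/next/dist/build/utils.js.
# 2. Capture the edit as a version-controlled patch:
npx patch-package next          # writes patches/next+<version>.patch
# 3. Re-apply it on every install by adding to package.json:
#    "scripts": { "postinstall": "patch-package" }
```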

Implementation Details

The patch modifies the server template generation in Next.js build to use more relevant environment variable names. This ensures consistent behavior across all deployment targets while maintaining clean separation of concerns.

Key Architectural Insights

This experience reinforces several important principles for cloud-native application deployment:

  • Platform Behavior Awareness: Cloud platforms often inject runtime configurations that can override application-level settings
  • Health Check Design: Container applications must be designed with load balancer connectivity patterns in mind
  • Source-Level Solutions: Sometimes the most maintainable solution requires modifying the build process rather than working around runtime constraints

Conclusion

While AWS Fargate's automatic hostname assignment serves legitimate infrastructure purposes, it can create unexpected challenges for containerized applications. By understanding the platform's behavior and implementing systematic source modifications, we can create robust deployment solutions that maintain architectural integrity while meeting operational requirements.

Saturday, July 19, 2025

Access logs in Next.js production build

I frequently encounter architectural decisions that seem counterintuitive from an operational perspective. Recently, while containerizing a Next.js application, I discovered one such puzzling design choice that required a creative engineering solution.

The Problem: Silent Production Builds

During the Docker image creation process for our Next.js application, I encountered an unexpected operational blind spot: Next.js production builds generate zero access logs. This absence of fundamental observability data immediately raised concerns about our ability to monitor application behavior in production environments.

The conventional wisdom suggests deploying an nginx reverse proxy to capture access logs. However, as an architect focused on operational efficiency, introducing an additional process layer solely for logging felt architecturally unsound, particularly within containerized environments where process minimalism is a core principle.

Exploring Conventional Solutions

My initial investigation led me to application-level logging libraries such as winston and pino. While these tools excel at application logging, they operate within the application boundary and don't provide the standardized access log format that operations teams expect from web applications.

Root Cause Analysis

After extensive research into similar reported issues, I discovered the underlying cause: Vercel has intentionally omitted access logging from Next.js production builds. This architectural decision, while perhaps suitable for Vercel's managed platform, creates operational challenges for self-hosted deployments.

Deep Dive

Taking a source-code-first approach, I downloaded the Next.js repository and traced the request handling flow to its core: the async function requestListener(req, res). By strategically placing console.log statements within the node_modules Next.js installation, I successfully exposed the access log data we needed.
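The essence of the change is one log line per request. formatAccessLog below is a hypothetical helper that shows the fields involved; in the actual modification an equivalent console.log sits inside requestListener:

```javascript
// Build an access-log line from the request and final status code.
// (Hypothetical helper; the real change inlines this in Next.js.)
function formatAccessLog(req, statusCode) {
  return `${new Date().toISOString()} ${req.method} ${req.url} ${statusCode}`;
}

// What a patched requestListener would print for one request:
console.log(formatAccessLog({ method: "GET", url: "/api/health" }, 200));
// prints e.g. "2025-07-19T12:00:00.000Z GET /api/health 200"
```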

However, this manual modification approach presented obvious maintainability challenges for automated deployment pipelines.

Production-Ready Implementation

While researching sustainable patching methodologies, I discovered an excellent resource by TomUps (https://www.tomups.com/posts/log-nextjs-request-response-as-json/) that introduced patch-package, a tool designed precisely for this type of systematic source modification.

Their approach provided the foundational technique, though it captured extensive request/response metadata including headers and body content. For our operational requirements, I needed a more focused solution that provided essential access log fields: timestamp, URL, and HTTP status code.

Architectural Solution

The final implementation leverages patch-package combined with pino-http-print to deliver clean, standardized access logs that integrate seamlessly with our existing observability stack. This approach:

  • Maintains container efficiency by avoiding additional processes
  • Provides operational visibility through standard access log formats
  • Ensures deployment consistency via automated patching during image builds
  • Preserves maintainability through version-controlled patch files

Key Takeaway

This experience reinforces a fundamental architectural principle: when platform decisions conflict with operational requirements, creative engineering solutions can bridge the gap while maintaining system integrity. The key is balancing pragmatic problem-solving with long-term maintainability, exactly what patch-package enables in this scenario.

Steps

You can follow the steps in TomUps post and use the following patch.
