Configuration with Environment Variables and Profiles
Configuration is not code (and that is good)
In production systems, configuration changes more often than code. Endpoints, credentials, timeouts, feature flags, and limits must be adjustable without rebuilding artifacts. Treat configuration as deploy-time data:
- the same artifact runs in dev, staging, and prod,
- only configuration differs,
- configuration is versioned and auditable (at least in deployment manifests).
The 12-factor baseline: environment-driven config
A practical baseline: use environment variables (or a secrets/config system that injects them) for production overrides, and keep application defaults minimal and safe. Because production values never live inside the artifact, it is much harder to ship a dev config into prod by accident.
Define a precedence model (operators need predictability)
Always define which source wins when the same setting appears in multiple places. A common precedence order:
1) Environment variables (highest priority, set by deployment)
2) Config files (checked into repo for defaults or local dev)
3) Built-in defaults (lowest priority, safe fallbacks)
The key is not which model you pick; it is that the model is documented and consistent.
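The precedence order above can be sketched as a small lookup that checks each source in turn. This is a minimal illustration, not a definitive implementation: the class name LayeredConfig is hypothetical, and it assumes all three sources use the same key names (real setups often map APP_PORT to something like app.port in files).

```java
import java.util.Map;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper: resolves one setting across three sources in a fixed order.
final class LayeredConfig {
    private final Map<String, String> env;      // highest priority
    private final Properties file;              // middle priority
    private final Map<String, String> defaults; // lowest priority

    LayeredConfig(Map<String, String> env, Properties file, Map<String, String> defaults) {
        this.env = env;
        this.file = file;
        this.defaults = defaults;
    }

    // Returns the first non-blank value found: env, then file, then defaults.
    Optional<String> get(String key) {
        String v = env.get(key);
        if (v == null || v.isBlank()) v = file.getProperty(key);
        if (v == null || v.isBlank()) v = defaults.get(key);
        return (v == null || v.isBlank()) ? Optional.empty() : Optional.of(v);
    }
}
```

In an application, the first argument would typically be System.getenv(). Passing the sources in as plain maps keeps the precedence logic easy to test without touching the real process environment.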
A simple, framework-agnostic config pattern
If you do not want to commit to a framework yet, you can implement a small config loader that reads env vars with defaults and validates required fields.
Example: read env vars with defaults
import java.time.Duration;

public final class AppConfig {
    public final String environment;
    public final int port;
    public final String dbUrl;
    public final String dbUser;
    public final String dbPassword; // do not log
    public final Duration httpTimeout;

    private AppConfig(String environment, int port, String dbUrl, String dbUser, String dbPassword, Duration httpTimeout) {
        this.environment = environment;
        this.port = port;
        this.dbUrl = dbUrl;
        this.dbUser = dbUser;
        this.dbPassword = dbPassword;
        this.httpTimeout = httpTimeout;
    }

    // Reads every setting from the process environment and validates it before returning.
    public static AppConfig load() {
        // Optional settings fall back to defaults; note that APP_ENV defaults to "dev" here.
        String env = getenvOrDefault("APP_ENV", "dev");
        int port = parseInt(getenvOrDefault("APP_PORT", "8080"), "APP_PORT");
        // Required settings have no default: startup fails if they are missing.
        String dbUrl = requireEnv("DB_URL");
        String dbUser = requireEnv("DB_USER");
        String dbPassword = requireEnv("DB_PASSWORD");
        Duration timeout = parseDurationMs(getenvOrDefault("HTTP_TIMEOUT_MS", "2000"), "HTTP_TIMEOUT_MS");
        AppConfig cfg = new AppConfig(env, port, dbUrl, dbUser, dbPassword, timeout);
        cfg.validate();
        return cfg;
    }

    private void validate() {
        if (port <= 0 || port > 65535) {
            throw new IllegalArgumentException("Invalid APP_PORT: " + port);
        }
        if (!environment.equals("dev") && !environment.equals("staging") && !environment.equals("prod")) {
            throw new IllegalArgumentException("Invalid APP_ENV: " + environment);
        }
        // Validate timeouts early:
        if (httpTimeout.isNegative() || httpTimeout.isZero()) {
            throw new IllegalArgumentException("HTTP timeout must be > 0");
        }
    }

    // Treats a blank value the same as an unset one.
    private static String getenvOrDefault(String key, String def) {
        String v = System.getenv(key);
        return (v == null || v.isBlank()) ? def : v;
    }

    private static String requireEnv(String key) {
        String v = System.getenv(key);
        if (v == null || v.isBlank()) {
            throw new IllegalStateException("Missing required env var: " + key);
        }
        return v;
    }

    private static int parseInt(String v, String key) {
        try {
            return Integer.parseInt(v);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid integer for " + key + ": " + v);
        }
    }

    private static Duration parseDurationMs(String v, String key) {
        try {
            return Duration.ofMillis(Long.parseLong(v));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid ms duration for " + key + ": " + v);
        }
    }
}
Fail fast at startup (why this saves incidents)
Misconfiguration is one of the most common causes of production outages. If a required DB URL is missing or a timeout is invalid, you want the service to refuse to start, not start and fail later under traffic.
Validate configuration boundaries, not just presence
Presence checks are not enough. Validate:
- ports are in valid ranges
- timeouts are > 0 and below an explicit upper bound
- URLs are parseable
- feature flags have allowed values
- environments are in an allowlist
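A minimal sketch of several of these boundary checks follows. It collects every problem instead of stopping at the first, so an operator sees the full list in one failed start. The class name ConfigChecks, the 60-second upper bound, and the serviceUrl parameter are assumptions made for illustration. The URL check targets HTTP-style endpoints; JDBC URLs do not parse into a host with java.net.URI and would need separate handling.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical validator: returns all problems found; an empty list means the config is usable.
final class ConfigChecks {
    static final Duration MAX_TIMEOUT = Duration.ofSeconds(60); // assumed upper bound
    static final Set<String> ENVIRONMENTS = Set.of("dev", "staging", "prod");

    static List<String> check(String environment, int port, String serviceUrl, Duration timeout) {
        List<String> errors = new ArrayList<>();
        if (!ENVIRONMENTS.contains(environment)) {
            errors.add("environment not in allowlist: " + environment);
        }
        if (port < 1 || port > 65535) {
            errors.add("port out of range: " + port);
        }
        if (timeout.isZero() || timeout.isNegative() || timeout.compareTo(MAX_TIMEOUT) > 0) {
            errors.add("timeout must be > 0 and <= " + MAX_TIMEOUT.toMillis() + "ms: " + timeout.toMillis() + "ms");
        }
        try {
            URI uri = new URI(serviceUrl);
            // A bare path such as "api.example.com/v1" parses but has no scheme or host.
            if (uri.getScheme() == null || uri.getHost() == null) {
                errors.add("URL needs a scheme and host: " + serviceUrl);
            }
        } catch (URISyntaxException e) {
            errors.add("URL is not parseable: " + serviceUrl);
        }
        return errors;
    }
}
```

At startup, a non-empty list would be turned into a single exception (or a non-zero exit) that prints every entry.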
Profiles: useful, but can create drift
Profiles (dev/staging/prod) are powerful, but they can create drift: the code behaves differently depending on the active profile, and engineers stop noticing. Limit profiles to:
- non-functional differences (log level, local mocks, dev-only tooling),
- default endpoints for local dev,
- safe toggles that are explicitly documented.
Profile anti-patterns
- Different business logic between dev and prod via profile branches.
- Hidden defaults: prod accidentally running with dev defaults because profile was not set.
- Permanent debug mode in prod because the wrong profile was activated.
Production rule for profiles
Make the production profile explicit and required. For example, require APP_ENV=prod in production deployments. If it is missing, fail fast.
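One way to enforce this rule is to drop the silent "dev" fallback, so that an unset APP_ENV stops the service instead of selecting a profile by accident. The sketch below is a stricter variant of the earlier loader, which defaults APP_ENV to "dev"; the class name EnvironmentResolver is hypothetical, and whether local development should also set the variable explicitly is a team decision.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical resolver: a missing or unknown APP_ENV is an error, never a default.
final class EnvironmentResolver {
    private static final Set<String> ALLOWED = Set.of("dev", "staging", "prod");

    static String resolve(Map<String, String> env) {
        String value = env.get("APP_ENV");
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("APP_ENV must be set explicitly (dev, staging, or prod)");
        }
        String normalized = value.trim();
        if (!ALLOWED.contains(normalized)) {
            throw new IllegalStateException("Invalid APP_ENV: " + value);
        }
        return normalized;
    }
}
```

Calling EnvironmentResolver.resolve(System.getenv()) as the first step of startup makes the "wrong environment" failure visible at deploy time rather than in request logs.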
Secrets: treat them as a separate class of config
Secrets are configuration, but they have different handling rules:
- never commit secrets to Git
- never bake secrets into container images
- never log secrets (including partial tokens)
- prefer injecting secrets at runtime (env vars, mounted files, secret managers)
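The handling rules above can be partly enforced in code. The sketch below wraps a secret in a type whose toString() is redacted, and loads it either from a mounted file or from an env var. The Secret class and the NAME_FILE naming (a variable holding the path to a file that contains the secret) are assumptions for illustration; the NAME_FILE pattern is a convention used by some container images, not a standard.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical wrapper: holds a secret and never reveals it through toString().
final class Secret {
    private final String value;

    private Secret(String value) { this.value = value; }

    // Call only at the point of use (for example, when opening a connection).
    String reveal() { return value; }

    @Override
    public String toString() { return "Secret[REDACTED]"; }

    // Prefers NAME_FILE (path to a mounted file) over NAME (plain env var).
    static Secret fromEnv(String name, Map<String, String> env) {
        String path = env.get(name + "_FILE");
        if (path != null && !path.isBlank()) {
            try {
                // strip() removes the trailing newline most editors and tools add.
                return new Secret(Files.readString(Path.of(path)).strip());
            } catch (IOException e) {
                // Report the variable name only; never the contents.
                throw new UncheckedIOException("Cannot read secret file for " + name, e);
            }
        }
        String direct = env.get(name);
        if (direct == null || direct.isBlank()) {
            throw new IllegalStateException("Missing required secret: " + name);
        }
        return new Secret(direct);
    }
}
```

Because string concatenation and logging frameworks call toString(), an accidental log.info("cfg={}", secret) prints the redacted marker instead of the value.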
Example: redacted config logging
It is useful to log configuration at startup, but you must redact secrets. Log only safe fields:
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class StartupBanner {
    private static final Logger log = LoggerFactory.getLogger(StartupBanner.class);

    // Logs only non-secret fields; dbUser and dbPassword are deliberately omitted.
    public static void logConfig(AppConfig cfg) {
        log.info("startup env={} port={} dbUrl={} httpTimeoutMs={}",
            cfg.environment,
            cfg.port,
            sanitizeJdbcUrl(cfg.dbUrl),
            cfg.httpTimeout.toMillis()
        );
    }

    private static String sanitizeJdbcUrl(String url) {
        // Keep it simple: drop the whole query string, where user/password parameters usually appear.
        // This does not catch credentials embedded elsewhere in the URL (for example user:pass@host);
        // a real implementation should parse the URL for the specific driver in use.
        int q = url.indexOf('?');
        return (q >= 0) ? url.substring(0, q) : url;
    }
}
Configuration for external dependencies: timeouts are config
Many production failures are dependency-related: slow upstreams, partial outages, DNS issues. Your dependency clients must have explicit timeouts and retry policies, and those should be configurable. A practical baseline:
- connect timeout: small (e.g., 200-500ms) depending on network
- request timeout: aligned with your SLO (e.g., 1-3s for typical APIs)
- retry: only for idempotent operations; controlled maximum attempts
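The baseline above can be sketched with the JDK's built-in java.net.http client (Java 11+). The class name UpstreamClient and its fields are hypothetical, and the retry loop is deliberately minimal: it has no backoff or jitter and retries on any exception, which a production client would narrow down.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical settings holder; in practice the three values would come from config.
final class UpstreamClient {
    final HttpClient client;
    final Duration requestTimeout;
    final int maxAttempts;

    UpstreamClient(Duration connectTimeout, Duration requestTimeout, int maxAttempts) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        // Connect timeout is set once on the client.
        this.client = HttpClient.newBuilder().connectTimeout(connectTimeout).build();
        this.requestTimeout = requestTimeout;
        this.maxAttempts = maxAttempts;
    }

    // Request timeout is set per request. GET is idempotent, so it may go through withRetry.
    HttpRequest get(String url) {
        return HttpRequest.newBuilder(URI.create(url)).timeout(requestTimeout).GET().build();
    }

    // Runs the call up to maxAttempts times and rethrows the last failure.
    <T> T withRetry(Callable<T> idempotentCall) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return idempotentCall.call();
            } catch (Exception e) {
                last = e;
            }
        }
        throw last;
    }
}
```

A call site would look like u.withRetry(() -> u.client.send(u.get(url), HttpResponse.BodyHandlers.ofString())). Non-idempotent requests such as a plain POST should bypass withRetry.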
Runtime overrides and safe rollouts
When changing config in production, think like a release:
- roll out gradually (canary or small batch)
- monitor error rate and latency
- be able to roll back quickly (previous config)
Common production misconfigurations (real-world)
- Wrong environment: APP_ENV not set → defaults to dev → debug logging, wrong endpoints.
- Timeouts missing: client waits forever → thread pool exhaustion → cascading failure.
- DB URL wrong: service starts but fails every request with connection errors.
- Secrets logged: token leaked into logs and exported to third-party log systems.
Checklist
- Define precedence: env vars override files override defaults.
- Fail fast: validate required config and ranges at startup.
- Make production environment explicit (APP_ENV=prod).
- Separate secrets from normal config; never log them.
- Configure timeouts and dependency endpoints as first-class config.
- Roll out config changes gradually and monitor.