Docker and Containers (6): Debugging and Logging — When Things Go Wrong Inside a Box

Containers hide their internals by design. When something breaks, you need specific tools and techniques to see inside the box without breaking the isolation that makes containers useful.

A container that works is invisible. A container that doesn’t work is a black box. The entire point of containerization is isolation — but that same isolation makes debugging harder. You can’t just ssh into a container or browse its filesystem from the host. Docker provides a specific set of tools for inspecting, diagnosing, and understanding what happens inside running (and crashed) containers.


Reading Container Logs#

Logs are your first line of investigation. Docker captures everything a container writes to stdout and stderr.

Logging drivers

docker logs#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
# View all logs from a container
docker logs my-container

# Follow logs in real time (like tail -f)
docker logs -f my-container

# Show the last 100 lines
docker logs --tail 100 my-container

# Show logs since a specific time
docker logs --since 2023-09-30T10:00:00 my-container

# Show logs from the last 30 minutes
docker logs --since 30m my-container

# Show timestamps with each log line
docker logs -t my-container

Example output with timestamps:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
2023-09-30T10:15:23.456789012Z [INFO] Server starting on port 8080
2023-09-30T10:15:23.567890123Z [INFO] Connected to database at postgres:5432
2023-09-30T10:15:24.678901234Z [INFO] Loading configuration from /app/config.yaml
2023-09-30T10:15:24.789012345Z [WARNING] Cache directory /tmp/cache does not exist, creating
2023-09-30T10:15:25.890123456Z [INFO] Ready to accept connections
2023-09-30T10:16:01.234567890Z [ERROR] Failed to process request: connection refused
2023-09-30T10:16:01.345678901Z Traceback (most recent call last):
2023-09-30T10:16:01.345678901Z   File "/app/handler.py", line 45, in process_request
2023-09-30T10:16:01.345678901Z     response = requests.get(upstream_url, timeout=5)
2023-09-30T10:16:01.345678901Z   File "/usr/local/lib/python3.11/site-packages/requests/api.py", line 73
2023-09-30T10:16:01.345678901Z     return session.request(method=method, url=url, **kwargs)
2023-09-30T10:16:01.345678901Z ConnectionError: ('Connection aborted.', ConnectionRefusedError(111, 'Connection refused'))

Logs from stopped containers#

This is critical: when a container crashes, it stops, but its logs are preserved until the container is removed.

1
2
# List stopped containers
docker ps -a --filter "status=exited"
1
2
CONTAINER ID   IMAGE       COMMAND            CREATED          STATUS                     NAMES
a1b2c3d4e5f6   myapp:v2    "python app.py"    10 minutes ago   Exited (1) 8 minutes ago   crashed-app
1
2
# View the crash logs
docker logs crashed-app
1
2
3
4
5
6
7
8
Traceback (most recent call last):
  File "/app/app.py", line 12, in <module>
    db = psycopg2.connect(os.environ['DATABASE_URL'])
  File "/usr/local/lib/python3.11/site-packages/psycopg2/__init__.py", line 122
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: could not connect to server: Connection refused
    Is the server running on host "postgres" (172.18.0.2) and accepting
    TCP/IP connections on port 5432?

The container exited with code 1 (error). The logs show it couldn’t connect to PostgreSQL. This could be because the database wasn’t ready when the app started (a dependency ordering issue) or the hostname was wrong.

Exit codes#

1
2
3
# Check exit code
docker inspect crashed-app --format '{{.State.ExitCode}}'
# Output: 1

Common exit codes:

Exit CodeMeaningCommon Cause
0SuccessNormal shutdown
1General errorApplication error, exception
2Shell builtin misuseBad script syntax
126Command not executablePermission issue on entrypoint
127Command not foundWrong CMD/ENTRYPOINT, missing binary
137SIGKILL (128+9)OOM killer, docker kill, timeout
139SIGSEGV (128+11)Segmentation fault
143SIGTERM (128+15)docker stop (graceful shutdown)

Exit code 137 deserves special attention; it usually means the container was killed by the OOM (Out of Memory) killer for exceeding its memory limit.

1
2
3
# Check if a container was OOM-killed
docker inspect crashed-app --format '{{.State.OOMKilled}}'
# Output: true

Interactive Debugging with docker exec#

docker exec runs a command inside a running container. It’s the primary way to interactively debug:

Debugging workflow

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
# Open a shell inside a running container
docker exec -it my-container bash

# If bash isn't available (Alpine, distroless)
docker exec -it my-container sh

# Run a specific command
docker exec my-container cat /app/config.yaml

# Run as a different user
docker exec -u root my-container apt-get update

# Set environment variables for the command
docker exec -e DEBUG=true my-container python check.py

Once inside a container, you can investigate.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
# Check environment variables
env | sort

# Check network connectivity
curl -v http://postgres:5432
ping redis

# Check DNS resolution
nslookup postgres
cat /etc/resolv.conf

# Check running processes
ps aux

# Check file permissions
ls -la /app/
stat /app/config.yaml

# Check disk usage
df -h

# Check memory
free -m
cat /proc/meminfo

# Check open files and connections
# (if the tool is available)
ss -tlnp
netstat -tlnp

When bash isn’t available#

Minimal images (Alpine, distroless) might not have bash or common tools.

1
2
3
4
5
# Alpine uses sh, not bash
docker exec -it alpine-container sh

# Install debugging tools temporarily
docker exec -it alpine-container apk add --no-cache curl bind-tools

For distroless images, you can’t exec into them at all (no shell). Use a debug sidecar instead.

1
2
3
4
5
# Run a debug container that shares the network namespace
docker run -it --rm \
    --network container:my-distroless-container \
    nicolaka/netshoot \
    bash

The nicolaka/netshoot image contains every network debugging tool you could want (curl, nslookup, tcpdump, iptables, etc.), and --network container:my-distroless-container makes it share the target container’s network namespace.

docker inspect — The Complete Picture#

Container logging pipeline data streams flowing from contain

docker inspect returns detailed JSON metadata about a container. It’s verbose but contains everything:

exec vs attach

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
# Full output (very long)
docker inspect my-container

# Specific fields with Go template formatting
docker inspect my-container --format '{{.State.Status}}'
# Output: running

docker inspect my-container --format '{{.State.StartedAt}}'
# Output: 2023-09-30T10:15:23.123456789Z

docker inspect my-container --format '{{.Config.Image}}'
# Output: myapp:v2

# Network information
docker inspect my-container --format '{{range .NetworkSettings.Networks}}IP: {{.IPAddress}} Gateway: {{.Gateway}}{{end}}'
# Output: IP: 172.18.0.3 Gateway: 172.18.0.1

# Environment variables
docker inspect my-container --format '{{range .Config.Env}}{{println .}}{{end}}'
1
2
3
4
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
DATABASE_URL=postgresql://postgres:secret@postgres:5432/myapp
REDIS_URL=redis://redis:6379
NODE_ENV=production
1
2
# Mount points
docker inspect my-container --format '{{json .Mounts}}' | python3 -m json.tool
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
[
    {
        "Type": "volume",
        "Name": "app-data",
        "Source": "/var/lib/docker/volumes/app-data/_data",
        "Destination": "/data",
        "Driver": "local",
        "Mode": "",
        "RW": true,
        "Propagation": ""
    },
    {
        "Type": "bind",
        "Source": "/home/user/config",
        "Destination": "/app/config",
        "Mode": "ro",
        "RW": false,
        "Propagation": "rprivate"
    }
]
1
2
# Port bindings
docker inspect my-container --format '{{json .NetworkSettings.Ports}}' | python3 -m json.tool
1
2
3
4
5
6
7
8
{
    "8080/tcp": [
        {
            "HostIp": "0.0.0.0",
            "HostPort": "8080"
        }
    ]
}

Useful inspect one-liners#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
# Get container's IP address
docker inspect -f '{{range.NetworkSettings.Networks}}{{.IPAddress}}{{end}}' my-container

# Get container's MAC address
docker inspect -f '{{range.NetworkSettings.Networks}}{{.MacAddress}}{{end}}' my-container

# Get container's restart count
docker inspect -f '{{.RestartCount}}' my-container

# Get the image used
docker inspect -f '{{.Config.Image}}' my-container

# Get the command being run
docker inspect -f '{{json .Config.Cmd}}' my-container

# Check if container is running
docker inspect -f '{{.State.Running}}' my-container

docker stats — Real-Time Resource Monitoring#

docker stats provides a live view of resource consumption:

Resource monitoring

1
docker stats
1
2
3
4
5
CONTAINER ID   NAME              CPU %   MEM USAGE / LIMIT     MEM %   NET I/O           BLOCK I/O         PIDS
a1b2c3d4e5f6   myapp-api-1       2.34%   125.4MiB / 512MiB     24.49%  15.2kB / 8.9kB    4.1MB / 0B        12
b2c3d4e5f6a7   myapp-postgres-1  0.45%   89.2MiB / 1GiB        8.71%   12.1kB / 9.8kB    28.5MB / 12.3MB   15
c3d4e5f6a7b8   myapp-redis-1     0.12%   12.5MiB / 256MiB      4.88%   8.4kB / 6.2kB     0B / 524kB        5
d4e5f6a7b8c9   myapp-worker-1    15.67%  234.5MiB / 512MiB     45.80%  5.1kB / 3.2kB     0B / 0B           8

Key columns:

ColumnWhat It ShowsWatch For
CPU %CPU usage relative to hostSustained high CPU = bottleneck
MEM USAGE / LIMITCurrent memory / configured limitApproaching limit = OOM risk
MEM %Percentage of limit used> 80% is concerning
NET I/ONetwork bytes in/outUnexpected traffic patterns
BLOCK I/ODisk read/writeHigh writes = possible log flooding
PIDSNumber of processesGrowing = possible leak
1
2
3
4
5
# Monitor a specific container
docker stats my-container --no-stream

# Format output
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

docker top — Process Listing#

1
docker top my-container

Troubleshooting decision tree

1
2
3
4
5
6
UID                 PID                 PPID                C                   STIME               TTY                 TIME                CMD
appuser             12345               12330               0                   10:15               ?                   00:00:05            gunicorn: master [app:app]
appuser             12350               12345               2                   10:15               ?                   00:01:30            gunicorn: worker [app:app]
appuser             12351               12345               2                   10:15               ?                   00:01:28            gunicorn: worker [app:app]
appuser             12352               12345               1                   10:15               ?                   00:01:25            gunicorn: worker [app:app]
appuser             12353               12345               2                   10:15               ?                   00:01:32            gunicorn: worker [app:app]

Notice the PID and PPID columns show host PIDs. Inside the container, the master process is PID 1, but on the host, it’s PID 12345. This is namespace mapping.

1
2
# Use ps-style formatting
docker top my-container -o pid,ppid,user,%cpu,%mem,comm

docker diff — Filesystem Changes#

docker diff shows what files have been added, changed, or deleted in the container’s writable layer compared to the image:

1
docker diff my-container
1
2
3
4
5
6
7
8
C /tmp
A /tmp/cache
A /tmp/cache/session_abc123
C /var/log
A /var/log/app.log
C /app
C /app/config.yaml
A /app/data/uploads/image001.png
PrefixMeaning
AAdded — file doesn’t exist in the image
CChanged — file was modified
DDeleted — file existed in the image but was removed

This is useful for:

  • Finding where an application writes data (should it be a volume instead?)
  • Detecting unexpected file modifications
  • Understanding what state accumulates in a running container

Debugging Crashed Containers#

When a container exits immediately, you can’t docker exec into it. Here are strategies:

Strategy 1: Check the logs#

1
docker logs crashed-container

This works for any stopped container (until you docker rm it).

Strategy 2: Override the command#

If the container crashes on startup, override the command to keep it running.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
# Instead of the normal command, just sleep
docker run -it --name debug-container myapp:v2 bash

# Or override entrypoint entirely
docker run -it --entrypoint bash myapp:v2

# Now you're inside the container and can investigate
ls /app/
cat /app/config.yaml
python -c "import psycopg2; print('module found')"

Strategy 3: Copy files out of a stopped container#

1
2
3
4
5
6
7
8
9
# Create the container without starting it
docker create --name debug-container myapp:v2

# Copy files out
docker cp debug-container:/app/config.yaml ./debug-config.yaml
docker cp debug-container:/var/log/ ./debug-logs/

# Clean up
docker rm debug-container

Strategy 4: Commit a stopped container as an image#

1
2
3
4
5
6
7
8
# The container crashed but still exists
docker ps -a | grep crashed

# Save its state as a new image
docker commit crashed-container debug-image:latest

# Now you can explore it
docker run -it debug-image:latest bash

Ephemeral Debug Containers#

Debugging inside a container detective with magnifying glass

Sometimes you need networking tools, strace, or other debugging utilities that aren’t in your application image. Run a separate debug container that shares the target’s network.

1
2
3
4
5
# Debug container sharing target's network namespace
docker run -it --rm \
    --network container:my-app-container \
    nicolaka/netshoot \
    bash

Inside netshoot, you have access to the target container’s network:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
# DNS resolution (from netshoot, using my-app-container's network)
nslookup postgres

# TCP connection test
nc -zv postgres 5432

# HTTP request
curl -v http://localhost:8080/health

# Packet capture
tcpdump -i eth0 -n port 5432

# Trace route
traceroute postgres

You can also share other namespaces.

1
2
3
4
# Share PID namespace (see the target container's processes)
docker run -it --rm \
    --pid container:my-app-container \
    ubuntu bash -c "ps aux"

Log Drivers#

By default, Docker stores logs as JSON files on disk. You can configure different log drivers.

1
2
3
# Check current log driver
docker info --format '{{.LoggingDriver}}'
# Output: json-file

Available log drivers#

DriverDestinationdocker logs SupportUse Case
json-fileLocal JSON filesYesDefault, development
localOptimized local storageYesProduction single-host
syslogSyslog daemonNoTraditional Linux logging
journaldsystemd journalYessystemd-based hosts
fluentdFluentd collectorNoCentralized logging (ELK/EFK)
awslogsCloudWatch LogsNoAWS deployments
gcplogsGoogle Cloud LoggingNoGCP deployments
splunkSplunk HTTP Event CollectorNoEnterprise environments
noneDiscard all logsNoHigh-throughput, logs handled internally

Configuring log drivers per container#

1
2
3
4
5
6
7
# Use a specific log driver for one container
docker run -d \
    --log-driver json-file \
    --log-opt max-size=10m \
    --log-opt max-file=3 \
    --name my-container \
    myapp

Configuring log rotation (critical for production)#

Without log rotation, json-file logs grow unbounded and will eventually fill your disk:

1
2
# Docker daemon configuration (/etc/docker/daemon.json)
cat /etc/docker/daemon.json
1
2
3
4
5
6
7
8
{
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "10m",
        "max-file": "5",
        "compress": "true"
    }
}
OptionEffectRecommended
max-sizeMaximum size per log file10m - 50m
max-fileNumber of rotated log files to keep3 - 5
compressCompress rotated filestrue

Without these settings, a busy container can generate gigabytes of logs.

In Docker Compose#

1
2
3
4
5
6
7
8
services:
  api:
    image: myapp
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

Common Failure Patterns#

Here’s a diagnostic table for the most frequent container issues:

SymptomLikely CauseDiagnostic CommandFix
Container exits immediatelyApplication crash on startupdocker logs containerCheck config, dependencies
Exit code 137Out of memory (OOM killed)docker inspect -f '{{.State.OOMKilled}}'Increase --memory or fix memory leak
Exit code 127Command not founddocker inspect -f '{{json .Config.Cmd}}'Check CMD/ENTRYPOINT spelling, verify binary exists
Exit code 126Permission denied on commanddocker exec container ls -la /entrypoint.shchmod +x the entrypoint, check USER
“Address already in use”Port conflictdocker ps (check port mappings)Use a different host port
“Connection refused” to serviceService not ready or wrong hostnamedocker exec container curl http://service:portCheck depends_on, healthchecks, network
DNS resolution failureNot on custom networkdocker network inspect bridgeCreate custom network with docker network create
File permission errorsUID mismatch between host and containerdocker exec container id + ls -la /pathMatch UIDs or use named volumes
Slow performance on macOSBind mount I/O overheaddocker statsUse named volumes for dependencies, bind mount only source
Container restarts in loopCrashLoopBackOff (crash → restart → crash)docker logs --tail 50 containerFix the underlying crash, set restart: on-failure

Debugging Workflow Checklist#

When a container isn’t working, follow this sequence:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# 1. Is the container running?
docker ps -a | grep my-container

# 2. What do the logs say?
docker logs --tail 100 my-container

# 3. What's the exit code?
docker inspect my-container --format '{{.State.ExitCode}} OOM:{{.State.OOMKilled}}'

# 4. What's the resource usage?
docker stats my-container --no-stream

# 5. Can I get inside?
docker exec -it my-container bash  # or sh

# 6. What processes are running?
docker top my-container

# 7. What changed in the filesystem?
docker diff my-container

# 8. What's the full configuration?
docker inspect my-container

# 9. Can the container reach its dependencies?
docker exec my-container ping postgres
docker exec my-container curl -v http://api:8000/health

# 10. Is the network configured correctly?
docker network inspect $(docker inspect my-container --format '{{range $k, $v := .NetworkSettings.Networks}}{{$k}}{{end}}')

Docker Events — System-Wide Activity#

docker events streams real-time events from the Docker daemon:

1
docker events
1
2
3
4
2023-09-30T10:30:15.123456789Z container start a1b2c3d4e5f6 (image=myapp:v2, name=api)
2023-09-30T10:30:45.234567890Z container die a1b2c3d4e5f6 (exitCode=1, image=myapp:v2, name=api)
2023-09-30T10:30:46.345678901Z container start a1b2c3d4e5f6 (image=myapp:v2, name=api)
2023-09-30T10:31:16.456789012Z container die a1b2c3d4e5f6 (exitCode=1, image=myapp:v2, name=api)

This shows the container is in a restart loop (die → start → die). Filter events to reduce noise:

1
2
3
4
5
6
7
8
# Only container events
docker events --filter type=container

# Only events for a specific container
docker events --filter container=my-container

# Only specific event types
docker events --filter event=die --filter event=oom

What’s Next#

Now you can find out what’s wrong when containers misbehave. But debugging is reactive — ideally, you prevent problems before they happen. The next article covers security: running containers as non-root, dropping capabilities, scanning for vulnerabilities, and following best practices that prevent the most common security mistakes in containerized applications.

In this series

Docker and Containers 8 parts

  1. 01 Docker and Containers (1): Why Containers — The Problem VMs Didn't Solve
  2. 02 Docker and Containers (2): Images and Layers — What docker pull Actually Downloads
  3. 03 Docker and Containers (3): Dockerfile Patterns — From Naive to Production
  4. 04 Docker and Containers (4): Networking and Volumes — How Containers Talk and Persist
  5. 05 Docker and Containers (5): Docker Compose — Multi-Container Applications
  6. 06 Docker and Containers (6): Debugging and Logging — When Things Go Wrong Inside a Box you are here
  7. 07 Docker and Containers (7): Security — Running Containers Without Giving Away the Keys
  8. 08 Docker and Containers (8): Beyond Docker — Kubernetes, Swarm, and What Comes Next

Liked this piece?

Follow on GitHub for the next one — usually one a week.

GitHub