Troubleshooting
This guide helps diagnose and resolve common issues with crashupload.
Check list:
- Verify dumps exist:
# For broadband/extender
ls -lh /minidumps/*.dmp
# For video devices
ls -lh /opt/minidumps/*.dmp
ls -lh /var/lib/systemd/coredump/core.*
- Check service status:
# Systemd services
systemctl status coredump-upload.service
systemctl status minidump-on-bootup-upload.service
# Check if running
ps aux | grep -E 'crashupload|uploadDumps'
- Verify network connectivity:
# Ping upload server (extract from config)
UPLOAD_SERVER=$(grep POTOMAC_SVR /etc/device.properties | cut -d= -f2 | sed 's|https\?://||' | cut -d/ -f1)
ping -c 3 "$UPLOAD_SERVER"
# Test DNS resolution
nslookup "$UPLOAD_SERVER"
# Test HTTPS connection
curl -I "https://$UPLOAD_SERVER"
- Check logs:
# Main log file
tail -100 /var/log/core_log.txt # or /rdklogs/logs/core_log.txt
# Systemd journal
journalctl -u coredump-upload.service -n 50
journalctl -u minidump-on-bootup-upload.service -n 50
# Look for errors
grep -i error /var/log/core_log.txt | tail -20
Cause: Insufficient permissions on dump directories or lock files
Solution:
# Fix dump directory permissions
chmod 755 /minidumps /opt/minidumps /var/lib/systemd/coredump
chown root:root /minidumps /opt/minidumps
# Fix lock file permissions
chmod 666 /tmp/.minidump_upload.lock
chmod 666 /tmp/.coredump_upload.lock
# Fix log file permissions
chmod 666 /var/log/core_log.txt
Cause: User has opted out of telemetry via RFC
Check:
# Check opt-out status
tr181Set Device.DeviceInfo.X_RDKCENTRAL-COM_RFC.Feature.TelemetryOptOut.Enable
# Check if feature is enabled
grep -r "TelemetryOptOut" /opt/
Solution:
- This is expected behavior if user opted out
- To override for testing:
export TELEMETRY_OPTOUT=0
- For production, respect user preference
| HTTP Code | Meaning | Solution |
|---|---|---|
| 400 | Bad Request | Check dump file format, verify metadata |
| 401 | Unauthorized | Verify mTLS certificates if enabled |
| 403 | Forbidden | Check server-side IP whitelist/firewall |
| 413 | Payload Too Large | Dump exceeds server limit, check archive compression |
| 500 | Server Error | Server-side issue, retry later |
| 503 | Service Unavailable | Server down/overloaded, retry with backoff |
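The table above can be read as a retry policy: 500 and 503 are worth retrying, while the 4xx codes need a fix first. The following is a minimal sketch of that policy, not the logic crashupload itself uses. The helper names `classify_http_code` and `retry_with_backoff` are hypothetical, and treating `000` (what curl's `-w '%{http_code}'` prints when no HTTP response arrives) as retryable is an assumption.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of crashupload.

# Print "ok", "retry", or "fail" for an HTTP status code.
classify_http_code() {
    case "$1" in
        2??)          echo "ok" ;;     # upload accepted
        500|503|000)  echo "retry" ;;  # server-side or no response: may be transient
        *)            echo "fail" ;;   # 400/401/403/413 etc.: fix the cause first
    esac
}

# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY command [args...]
# The command must print an HTTP status code on stdout.
retry_with_backoff() {
    local max="$1" delay="$2" attempt=1 code
    shift 2
    while [ "$attempt" -le "$max" ]; do
        code=$("$@")
        case "$(classify_http_code "$code")" in
            ok)   return 0 ;;
            fail) echo "HTTP $code: not retrying" >&2; return 1 ;;
        esac
        # Stop after the last attempt without sleeping again.
        [ "$attempt" -lt "$max" ] || break
        echo "HTTP $code: retry $attempt/$max in $delay seconds" >&2
        sleep "$delay"
        delay=$((delay * 2))      # double the wait before the next attempt
        attempt=$((attempt + 1))
    done
    return 1
}
```

A call such as `retry_with_backoff 3 5 some_upload_command` would try up to three times, waiting 5 and then 10 seconds; `some_upload_command` stands for any command that prints the status code.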
Debug:
# Test upload manually with verbose output
export CURL_VERBOSE=1
export DEBUG=1
/lib/rdk/uploadDumps.sh /opt/minidumps/test.dmp 0
# Check HTTP response code
cat /tmp/httpcode
Cause: More than 10 uploads in 10 minutes
Check current state:
# View rate limit history
cat /tmp/.upload_ratelimit
# Count uploads in last 10 minutes
CURRENT=$(date +%s)
CUTOFF=$((CURRENT - 600))
awk -v cutoff="$CUTOFF" '$1 > cutoff' /tmp/.upload_ratelimit | wc -l
Solutions:
- Wait for window to expire:
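# Find the oldest upload still inside the 10-minute window
NOW=$(date +%s)
OLDEST=$(awk -v cutoff="$((NOW - 600))" '$1 > cutoff {print $1; exit}' /tmp/.upload_ratelimit)
EXPIRES=$(( ${OLDEST:-0} + 600 ))
WAIT=$((EXPIRES - NOW))
[ "$WAIT" -lt 0 ] && WAIT=0   # no uploads in the window
echo "Rate limit expires in $WAIT seconds"
- Clear rate limit (testing only):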
rm /tmp/.upload_ratelimit
- Increase limit (configuration):
# Add to /opt/coredump.properties (non-prod)
MAX_UPLOADS=20
RATE_WINDOW=600
- Bypass for recovery:
export NO_RATELIMIT=1
/lib/rdk/uploadDumps.sh /opt/minidumps/critical.dmp 0
Cause: Same process crashing repeatedly (5+ times in 5 minutes)
Check:
# View recent crashes
tail -50 /var/log/core_log.txt | grep "Crash loop"
# Identify looping process
awk '{print $NF}' /tmp/.upload_ratelimit | sort | uniq -c | sort -rn
Solutions:
- Fix the crashing application
- Temporarily disable rate limiting for debugging
- Check for infinite restart loops in systemd
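Both the rate limit (10 uploads in 10 minutes) and crash loop detection (5+ crashes in 5 minutes) are sliding-window counts. The sketch below illustrates the idea under an assumed file format: one event per line, with the epoch timestamp in the first field and the process name in the last, as the `awk` commands above imply. The function names are hypothetical, and whether the real check compares with `>=` or `>` is an assumption.

```shell
#!/bin/bash
# Sketch only: assumes lines of the form "<epoch_seconds> ... <process_name>".

# count_events FILE WINDOW_SECONDS NOW [PROCESS]
# Count entries newer than NOW - WINDOW, optionally for one process only.
count_events() {
    local file="$1" window="$2" now="$3" process="${4:-}"
    [ -f "$file" ] || { echo 0; return; }
    awk -v cutoff="$((now - window))" -v proc="$process" '
        $1 > cutoff && (proc == "" || $NF == proc) { n++ }
        END { print n + 0 }
    ' "$file"
}

# is_rate_limited FILE NOW: true once 10 uploads sit inside the last 600 seconds.
is_rate_limited() {
    [ "$(count_events "$1" 600 "$2")" -ge 10 ]
}

# is_crash_looping FILE NOW PROCESS: true at 5 or more entries in 300 seconds.
is_crash_looping() {
    [ "$(count_events "$1" 300 "$2" "$3")" -ge 5 ]
}
```

For example, `is_crash_looping /tmp/.upload_ratelimit "$(date +%s)" Receiver` would report whether `Receiver` accounts for five or more recent entries.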
Cause: Previous instance didn't clean up, or still running
Diagnose:
# Check if process is actually running
LOCK_FILE="/tmp/.minidump_upload.lock" # or .coredump_upload.lock
if [ -f "$LOCK_FILE" ]; then
PID=$(cat "$LOCK_FILE")
echo "Lock file contains PID: $PID"
if ps -p $PID > /dev/null 2>&1; then
echo "Process $PID is running"
ps -p $PID -o cmd=
else
echo "Process $PID is NOT running (stale lock)"
fi
fi
Solutions:
- If process is running - wait or kill:
# Wait for completion
while [ -f /tmp/.minidump_upload.lock ]; do
sleep 2
echo "Waiting for upload to complete..."
done
# Or kill if hung
PID=$(cat /tmp/.minidump_upload.lock)
kill "$PID"
sleep 2
# Force kill only if the process is still alive
kill -0 "$PID" 2>/dev/null && kill -9 "$PID"
- If stale lock - remove:
rm -f /tmp/.minidump_upload.lock
rm -f /tmp/.coredump_upload.lock
- Prevent future stale locks:
# Add trap to cleanup on exit
trap 'rm -f /tmp/.minidump_upload.lock' EXIT INT TERM
Cause: Script crashed before cleanup
Check:
# Find crashed instances
journalctl | grep -A5 "crashupload.*killed"
journalctl | grep -A5 "uploadDumps.*terminated"
# Check for signals
dmesg | grep -i "killed process"
Solution: Ensure SIGTERM handler is working:
# Script should have:
trap 'remove_lock "$LOCK_FILE"' EXIT INT TERM
Cause: Missing development dependencies
Solution:
# Debian/Ubuntu
sudo apt-get update
sudo apt-get install -y \
build-essential \
autoconf \
automake \
libtool \
libcurl4-openssl-dev \
libssl-dev \
pkg-config
# Yocto/RDK build
bitbake -c populate_sdk rdk-generic-image
Common errors and fixes:
| Error | Cause | Fix |
|---|---|---|
| undefined reference to 'curl_easy_init' | Missing -lcurl | Add to LDFLAGS in Makefile.am |
| error: 'CURLOPT_SSLVERSION' undeclared | Old libcurl | Update libcurl >= 7.58.0 |
| warning: implicit declaration of function | Missing header | Add #include |
| error: 'for' loop initial declarations | Old C standard | Use gcc with -std=c99 |
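To check the installed libcurl against the 7.58.0 minimum from the table before building, a version comparison along these lines may help. This is a sketch: `version_ge` and `check_libcurl` are hypothetical helpers, and `sort -V` is assumed to be available (GNU coreutils and recent BusyBox builds provide it).

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical.

# version_ge A B: succeed when version A is greater than or equal to version B.
version_ge() {
    # After a version sort, the smaller of the two comes first; A >= B means B is first.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the libcurl seen by pkg-config meets the minimum.
check_libcurl() {
    local have
    have=$(pkg-config --modversion libcurl 2>/dev/null) || {
        echo "libcurl not found by pkg-config"
        return 2
    }
    if version_ge "$have" 7.58.0; then
        echo "libcurl $have is new enough"
    else
        echo "libcurl $have is too old"
        return 1
    fi
}
```

Where only a yes/no answer is needed, `pkg-config --atleast-version=7.58.0 libcurl` gives the same result through its exit status.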
Debug:
# Verbose compilation
make V=1
# Check compiler version
gcc --version
# Check library versions
pkg-config --modversion libcurl
pkg-config --modversion openssl
Solutions:
# Regenerate autotools files
cd c_sourcecode
autoreconf --install --force
# Clean and reconfigure
make distclean
./configure
make
Diagnose:
cd unittest
# Run tests with verbose output
make check VERBOSE=1
# Run specific test
./network_utils_gtest --gtest_filter="*GetMACAddress*"
# Check test logs
cat test-suite.log
Common test failures:
- Mock failures:
Expected: mock_curl.easy_perform() called 3 times
Actual: called 2 times
Fix: Update test expectations or implementation
- Assertion failures:
Expected: result == 0
Actual: -1
Fix: Check error paths in implementation
- Memory leaks:
LEAK SUMMARY:
definitely lost: 64 bytes in 1 blocks
Fix: Free allocated memory
Run with valgrind:
valgrind --leak-check=full ./network_utils_gtest
Check:
# View L2 test logs
cat /tmp/l2_test_report/report.html
# Run specific scenario
cd test/functional-tests
python3 -m pytest tests/test_basic_upload.py -v -s
Common issues:
- Mock server not running
- Incorrect test environment setup
- Network dependencies
Cause: DNS resolution failure
Diagnose:
# Check DNS servers
cat /etc/resolv.conf
# Test DNS
nslookup crash.rdkcentral.com
dig crash.rdkcentral.com
# Check network interface
ifconfig
ip addr show
Solutions:
# Restart networking
/etc/init.d/networking restart
# Use alternate DNS temporarily
echo "nameserver 8.8.8.8" > /etc/resolv.conf
# Check if interface is up
ifconfig erouter0 up # or appropriate interface
Cause: Network unreachable or slow
Diagnose:
# Test connectivity
ping -c 5 crash.rdkcentral.com
# Trace route
traceroute crash.rdkcentral.com
# Test with longer timeout
curl -v --connect-timeout 30 --max-time 120 https://crash.rdkcentral.com
Solutions:
# Increase timeout
export CURL_UPLOAD_TIMEOUT=180
# Or in configuration
echo "CURL_UPLOAD_TIMEOUT=180" >> /opt/coredump.properties
Cause: Expired CA certificates or clock skew
Diagnose:
# Check system time
date
# Check certificate validity
openssl s_client -connect crash.rdkcentral.com:443 -showcerts
# Check CA bundle
ls -l /etc/ssl/certs/ca-certificates.crt
Solutions:
# Update CA certificates
update-ca-certificates
# Sync system time
ntpdate pool.ntp.org
# Temporary workaround (NOT for production)
export CURL_OPTIONS="--insecure"
Cause: TLS version mismatch or cipher incompatibility
Diagnose:
# Test TLS versions
curl -v --tlsv1.2 https://crash.rdkcentral.com
curl -v --tlsv1.3 https://crash.rdkcentral.com
# Check OpenSSL version
openssl version
Solution:
# Force TLS 1.2
export TLS="--tlsv1.2"
# Update OpenSSL if too old
opkg update && opkg upgrade openssl
Diagnose:
# Check available space
df -h /tmp
df -h /minidumps
# Check permissions
ls -ld /tmp
ls -ld /minidumps
# Test compression manually
tar czf /tmp/test.tar.gz /opt/minidumps/*.dmp
Solutions:
- No space:
# Clean /tmp
rm -rf /tmp/*.tar.gz /tmp/*.log
# Increase tmpfs size
mount -o remount,size=100M /tmp
- Permission denied:
chmod 1777 /tmp # Sticky bit + world writable
- Compression too slow/large:
# Use faster compression
export GZIP=-1 # Faster, less compression
# Or skip compression (testing only)
tar cf archive.tar file.dmp # No compression
Cause: breakpad-logmapper.conf not configured
Check:
# Verify logmapper file
cat /etc/breakpad-logmapper.conf
# Check if logs exist
ls -lh /opt/logs/
Solution:
# Create/update logmapper
cat > /etc/breakpad-logmapper.conf << 'EOF'
Receiver|/opt/logs/receiver.log
WPEFramework|/opt/logs/wpeframework.log
xre-receiver|/opt/logs/xre.log
EOF
Diagnose:
# Check device type
grep DEVICE_TYPE /etc/device.properties
# Verify expected paths
case "$(grep DEVICE_TYPE /etc/device.properties | cut -d= -f2)" in
broadband)
echo "Expected: /minidumps, /rdklogs/logs"
;;
video)
echo "Expected: /opt/minidumps, /var/log"
;;
extender)
echo "Expected: /minidumps, /var/log/messages"
;;
mediaclient)
echo "Expected: /opt/minidumps, /opt/logs"
;;
esac
Solution:
# Fix device.properties
vi /etc/device.properties
# Set correct DEVICE_TYPE
# Create missing directories
mkdir -p /minidumps /opt/minidumps /rdklogs/logs
Cause: Priority order - higher priority source overriding
Debug:
#!/bin/bash
# Print full configuration resolution
echo "=== Configuration Resolution ==="
# Start with defaults
VALUE="default"
echo "1. Default: $VALUE"
# Load device.properties
if [ -f /etc/device.properties ]; then
source /etc/device.properties
echo "2. device.properties: ${SOME_VAR:-$VALUE}"
fi
# Load overrides
if [ -f /opt/coredump.properties ]; then
source /opt/coredump.properties
echo "3. coredump.properties: ${SOME_VAR:-$VALUE}"
fi
# Check environment
if [ -n "$SOME_VAR" ]; then
echo "4. Environment: $SOME_VAR"
fi
echo "Final value: ${SOME_VAR:-$VALUE}"
Measure:
# Time an upload
time /lib/rdk/uploadDumps.sh /opt/minidumps/test.dmp 0
# Profile with timestamps
export DEBUG=1
/lib/rdk/uploadDumps.sh /opt/minidumps/test.dmp 0 2>&1 | while IFS= read -r line; do
echo "$(date +%T.%3N) $line"
done
Optimize:
- Large files: Use compiled binary instead of shell script
- Slow compression: Use faster compression level
- Network slow: Check bandwidth, use closer server
- Many files: Process in batches
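For the slow-compression case, one option is to pass the level to `gzip` directly rather than through the environment. The sketch below assumes `tar` and a `gzip` that accepts `-1` through `-9`; the function name `compress_dumps` is hypothetical and is not how crashupload builds its archives.

```shell
#!/bin/bash
# Sketch only: compress_dumps is a hypothetical helper.

# compress_dumps LEVEL ARCHIVE FILE...
# Archive the given files and compress at LEVEL (1 = fastest, 9 = smallest).
compress_dumps() {
    local level="$1" archive="$2"
    shift 2
    # tar writes the archive to stdout; gzip compresses it at the requested level.
    # Without "set -o pipefail", the exit status is gzip's, not tar's.
    tar cf - "$@" | gzip -"$level" > "$archive"
}
```

Running `time compress_dumps 1 fast.tar.gz app.dmp` and then the same with level 9 shows the speed and size trade-off on the target device; `app.dmp` is a placeholder file name.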
Check:
# Monitor memory during upload
while true; do
ps aux | grep -E 'crashupload|uploadDumps' | grep -v grep
free -h
sleep 1
done
Solutions:
- Use compiled binary (4-6MB vs shell 8-10MB)
- Process smaller batch sizes
- Increase available RAM if possible
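Processing in smaller batches can be sketched as a loop that pauses after every few dumps so memory and I/O can settle. The function name `process_in_batches` and its handler argument are hypothetical; the handler stands for whatever uploads a single file.

```shell
#!/bin/bash
# Sketch only: process_in_batches is a hypothetical helper.

# process_in_batches DIR BATCH_SIZE PAUSE_SECONDS HANDLER
# Call HANDLER once per .dmp file in DIR, pausing after every BATCH_SIZE files.
# Prints the number of files handled.
process_in_batches() {
    local dir="$1" batch="$2" pause="$3" handler="$4"
    local count=0 f
    for f in "$dir"/*.dmp; do
        [ -e "$f" ] || continue      # the glob matched nothing
        "$handler" "$f" || echo "handler failed for $f" >&2
        count=$((count + 1))
        if [ $((count % batch)) -eq 0 ]; then
            sleep "$pause"           # give the system a break between batches
        fi
    done
    echo "$count"
}
```

A handler could be a small function that wraps the `/lib/rdk/uploadDumps.sh "$1" 0` invocation used elsewhere in this guide and sends its output to a log file.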
#!/bin/bash
# crashupload_diag.sh - Complete diagnostic script
echo "=== Crashupload Diagnostics ==="
echo ""
echo "1. System Information:"
uname -a
grep -E "DEVICE_TYPE|MODEL|BUILD_TYPE" /etc/device.properties
echo ""
echo "2. Dump File Status:"
{ ls -lh /minidumps/*.dmp 2>/dev/null || echo " No dumps in /minidumps"; } | head -5
{ ls -lh /opt/minidumps/*.dmp 2>/dev/null || echo " No dumps in /opt/minidumps"; } | head -5
echo ""
echo "3. Service Status:"
systemctl is-active coredump-upload.service 2>/dev/null || echo " coredump-upload: unknown"
systemctl is-active minidump-on-bootup-upload.service 2>/dev/null || echo " minidump-upload: unknown"
echo ""
echo "4. Process Status:"
ps aux | grep -E 'crashupload|uploadDumps' | grep -v grep || echo " No upload process running"
echo ""
echo "5. Lock Files:"
ls -l /tmp/.*.lock 2>/dev/null || echo " No lock files"
echo ""
echo "6. Rate Limit Status:"
if [ -f /tmp/.upload_ratelimit ]; then
COUNT=$(wc -l < /tmp/.upload_ratelimit)
echo " $COUNT uploads in rate limit file"
else
echo " No rate limit file"
fi
echo ""
echo "7. Recent Logs:"
tail -20 /var/log/core_log.txt 2>/dev/null || tail -20 /rdklogs/logs/core_log.txt 2>/dev/null || echo " No logs found"
echo ""
echo "8. Network Connectivity:"
ping -c 2 8.8.8.8 > /dev/null 2>&1 && echo " Internet: ✅" || echo " Internet: ❌"
echo ""
echo "9. Disk Space:"
df -h | grep -E "Filesystem|/tmp|/opt|/minidumps"
echo ""
echo "=== End Diagnostics ==="
# Watch uploads in real-time
tail -f /var/log/core_log.txt | grep -E "CRASH_UPLOAD|Upload|Error"
# Monitor network traffic
tcpdump -i any -nn 'tcp port 443' -A
# Watch file changes
inotifywait -m -r /minidumps /opt/minidumps
Successful upload:
[CRASH_UPLOAD] Upload successful: test.dmp (HTTP 200)
[CRASH_UPLOAD] Deleted file: /opt/minidumps/test.dmp
Rate limit hit:
[CRASH_UPLOAD] Rate limit exceeded: 10 uploads in 10 minutes
[CRASH_UPLOAD] Skipping upload for: app.dmp
Network failure:
[CRASH_UPLOAD] curl error (6): Could not resolve host
[CRASH_UPLOAD] Retry 1/3 in 5 seconds...
Crashloop detection:
[CRASH_UPLOAD] Crash loop detected for process 'Receiver'
[CRASH_UPLOAD] 5 crashes in last 5 minutes
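The four patterns above can be matched mechanically to get a quick count of each event type. This is a rough sketch that assumes log lines contain the exact phrases shown in these examples; `classify_log_line` and `summarize_log` are hypothetical names.

```shell
#!/bin/bash
# Sketch only: matches the phrases shown in the example log lines above.

# classify_log_line LINE: print an event type for one log line.
classify_log_line() {
    case "$1" in
        *"Upload successful"*)    echo "success" ;;
        *"Rate limit exceeded"*)  echo "rate_limit" ;;
        *"curl error"*)           echo "network" ;;
        *"Crash loop detected"*)  echo "crash_loop" ;;
        *)                        echo "other" ;;
    esac
}

# summarize_log: read log lines on stdin, print "<count> <type>" per event type.
summarize_log() {
    local line
    while IFS= read -r line; do
        classify_log_line "$line"
    done | sort | uniq -c
}
```

For example, `summarize_log < /var/log/core_log.txt` prints one count per event type.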
#!/bin/bash
# analyze_logs.sh - Parse crashupload logs
LOG_FILE="/var/log/core_log.txt"
echo "=== Crashupload Log Analysis ==="
echo ""
echo "Upload Statistics:"
echo " Total attempts: $(grep -c "Starting upload" "$LOG_FILE")"
echo " Successful: $(grep -c "Upload successful" "$LOG_FILE")"
echo " Failed: $(grep -c "Upload failed" "$LOG_FILE")"
echo ""
echo "Top Error Types:"
grep -i error "$LOG_FILE" | awk '{print $NF}' | sort | uniq -c | sort -rn | head -5
echo ""
echo "Rate Limit Events:"
grep -c "Rate limit exceeded" "$LOG_FILE"
echo ""
echo "Recent Failures (last 10):"
grep "Upload failed" "$LOG_FILE" | tail -10
echo ""
# Set environment variables
export DEBUG=1
export LOG_LEVEL="DEBUG"
export CURL_VERBOSE=1
export KEEP_TEMP_FILES=1
# Run with debug
/lib/rdk/uploadDumps.sh /opt/minidumps/test.dmp 0
# Or for compiled binary
RDK_LOG_LEVEL=DEBUG /usr/bin/crashupload /opt/minidumps/test.dmp 0
# Shell tracing
bash -x /lib/rdk/uploadDumps.sh /opt/minidumps/test.dmp 0 2>&1 | tee /tmp/trace.log
# System call tracing
strace -o /tmp/strace.log -ff /usr/bin/crashupload /opt/minidumps/test.dmp 0
# Library call tracing
ltrace -o /tmp/ltrace.log /usr/bin/crashupload /opt/minidumps/test.dmp 0
If you've tried the solutions above and still have issues:
- Gather diagnostic information:
./crashupload_diag.sh > /tmp/diagnostic_report.txt
- Collect relevant logs:
tar czf /tmp/crashupload_logs.tar.gz \
/var/log/core_log.txt \
/rdklogs/logs/core_log.txt \
/tmp/.upload_ratelimit \
/etc/device.properties
- Report the issue:
- GitHub Issues: https://github.com/rdkcentral/crashupload/issues
- Include diagnostic report
- Describe expected vs actual behavior
- List troubleshooting steps already tried
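On most devices only some of the files in the log collection step exist (for example, only one of the two `core_log.txt` locations), and `tar` typically reports an error for each missing path. A minimal sketch that bundles only the files actually present follows; `bundle_logs` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: bundle_logs is a hypothetical helper.

# bundle_logs ARCHIVE FILE...
# Archive only the readable files from the list; fail if none exist.
bundle_logs() {
    local archive="$1" f
    local existing=()
    shift
    for f in "$@"; do
        if [ -r "$f" ]; then
            existing+=("$f")
        else
            echo "skipping missing file: $f" >&2
        fi
    done
    [ "${#existing[@]}" -gt 0 ] || { echo "nothing to bundle" >&2; return 1; }
    tar czf "$archive" "${existing[@]}"
}
```

It could be called with the same paths listed in the collection step, for example `bundle_logs /tmp/crashupload_logs.tar.gz /var/log/core_log.txt /rdklogs/logs/core_log.txt /etc/device.properties`.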
- Home - Project overview
- Configuration Guide - Configuration details
- Crashupload - Compiled Code - C implementation
- Crashupload - Script Method - Shell script implementation