Related Code Files:
- code-intelligence-toolkit/common_utils.py - The enhanced utilities module
- code-intelligence-toolkit/error_logger.py - Error logging integration
- code-intelligence-toolkit/*.py - All Python tools that can use these utilities
The enhanced common_utils.py provides comprehensive shared functionality for security, subprocess handling, file operations, and more. This module serves as the foundation for all Python tools in the code-intelligence-toolkit, ensuring consistent behavior and enterprise-grade safety across all operations.
Path Validation and Sanitization:
- validate_path() - Prevents path traversal attacks with null byte detection
- sanitize_command_arg() - Removes dangerous characters from command arguments
- safe_path_join() - Safely joins paths without allowing directory traversal
- is_safe_filename() - Validates filenames against a safe character set
# Example usage
from common_utils import validate_path, safe_path_join
# Validate a path
safe_path = validate_path("./user_input.txt", base_dir="/allowed/dir")
# Safely join paths
output_path = safe_path_join("output", "results", "report.txt")
Safe Process Execution:
- run_subprocess() - Execute commands with timeout, memory limits, and sanitization
- SubprocessTimeout - Custom exception for timeout handling
- run_with_timeout() - Execute Python functions with a timeout using threading
from common_utils import run_subprocess, SubprocessTimeout
try:
    result = run_subprocess(
        ['grep', '-r', 'pattern', '.'],
        timeout=30,
        max_memory_mb=256
    )
    print(result.stdout)
except SubprocessTimeout:
    print("Command timed out")
Atomic and Safe File Handling:
- atomic_write() - Context manager for atomic file writes
- safe_file_backup() - Create timestamped backups
- calculate_file_hash() - Compute file checksums
- safe_file_operation() - Perform operations with rollback capability
import json
from common_utils import atomic_write, safe_file_backup
# Atomic write ensures the file is either fully written or not written at all
with atomic_write('config.json') as f:
    json.dump(config_data, f)
# Create a backup before modifying
backup_path = safe_file_backup('important.txt')
Context-Aware Error Management:
- ErrorContext - Context manager for capturing errors with metadata
- Integration with error_logger.py for centralized error tracking
from common_utils import ErrorContext
with ErrorContext("database_operation", {"table": "users", "action": "update"}):
    # If this fails, the error is logged with context
    perform_database_update()
Operation Tracking for Reversibility:
- OperationManifest - Track file operations for potential rollback
- Supports undo functionality for move, copy, and delete operations
from common_utils import OperationManifest, safe_file_operation
manifest = OperationManifest()
# Perform operation with tracking
result = safe_file_operation('move', 'old.txt', 'new.txt')
op_id = manifest.add_operation(result)
# Later, rollback if needed
manifest.rollback_operation(op_id)
Existing Utilities (Preserved):
- Language detection and file classification
- Safe file content reading with encoding fallback
- Binary file detection
- File statistics and formatting
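The preserved utilities are listed here without code. As a rough illustration of what encoding fallback and binary detection can look like, here is a minimal sketch. The names `read_text_with_fallback` and `looks_binary`, the encoding order, and the NUL-byte heuristic are assumptions made for this example and are not taken from common_utils.py.

```python
from pathlib import Path

# Hypothetical helpers for illustration; not the actual common_utils.py API.


def looks_binary(path, sample_size=8192):
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with open(path, "rb") as f:
        return b"\x00" in f.read(sample_size)


def read_text_with_fallback(path, encodings=("utf-8", "cp1252")):
    """Try each encoding in order and return (text, encoding_used)."""
    data = Path(path).read_bytes()
    for enc in encodings:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort cannot fail.
    return data.decode("latin-1"), "latin-1"
```

Returning the encoding alongside the text lets a caller report which fallback was used instead of silently accepting a lossy decode.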
Platform-Aware Resource Limits:
- resource_limit() - Context manager for CPU and memory limits
- Automatically handles platform differences (Linux vs macOS)
from common_utils import resource_limit
# Limit CPU time and memory usage
with resource_limit(cpu_seconds=60, memory_mb=512):
    process_large_dataset()
Standardized Argument Handling:
- Pre-defined common arguments for consistency
- add_common_args() - Helper to add standard arguments to parsers
import argparse
from common_utils import add_common_args
parser = argparse.ArgumentParser()
add_common_args(parser, 'scope', 'timeout', 'verbose', 'dry_run')
The module is designed to work across different platforms:
- Linux: Full support for all features including memory limits
- macOS: Most features supported, memory limits gracefully degrade
- Windows: Core functionality works, Unix-specific features disabled
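One way to get this kind of graceful degradation is to probe for the Unix-only `resource` module at import time and fall back to a no-op when it is missing. The sketch below is an assumption about the general approach, not the module's actual code; `soft_cpu_limit` and `memory_limit_supported` are hypothetical names.

```python
import contextlib
import sys

try:
    import resource  # POSIX only; the import fails on Windows
except ImportError:
    resource = None


def memory_limit_supported():
    """Assumption for this sketch: only Linux is treated as enforcing RLIMIT_AS."""
    return resource is not None and sys.platform.startswith("linux")


@contextlib.contextmanager
def soft_cpu_limit(cpu_seconds):
    """Lower the soft CPU-time limit for the block, then restore it.

    Yields True if a limit was applied and False if it could not be,
    so callers can carry on without the limit. RLIMIT_CPU counts the
    CPU time of the whole process, not just this block.
    """
    if resource is None:
        yield False
        return
    old_soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    # The soft limit may never exceed the hard limit.
    new_soft = cpu_seconds if hard == resource.RLIM_INFINITY else min(cpu_seconds, hard)
    try:
        resource.setrlimit(resource.RLIMIT_CPU, (new_soft, hard))
    except (ValueError, OSError):
        yield False
        return
    try:
        yield True
    finally:
        resource.setrlimit(resource.RLIMIT_CPU, (old_soft, hard))
```

Yielding a boolean rather than raising keeps the calling code identical on every platform; a tool can log a warning when the limit was not applied.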
- Import what you need:
  from common_utils import (
      validate_path,
      run_subprocess,
      atomic_write,
      ErrorContext,
      add_common_args,
  )
- Use security functions for all user input:
  user_path = validate_path(user_input, base_dir=allowed_dir)
- Prefer atomic operations for file modifications:
  with atomic_write(output_file) as f:
      f.write(processed_data)
- Always use timeouts for external processes:
  result = run_subprocess(cmd, timeout=30)
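To show why a timeout matters, here is a minimal sketch of a timeout wrapper built on the standard library's `subprocess.run`. It is not the implementation of `run_subprocess`; `run_with_deadline` and `TimeoutSketch` are hypothetical names, and memory limits and argument sanitization are left out.

```python
import subprocess
import sys


class TimeoutSketch(Exception):
    """Hypothetical stand-in for a timeout exception such as SubprocessTimeout."""


def run_with_deadline(cmd, timeout):
    """Run cmd (an argument list, never a shell string) with a time limit."""
    try:
        return subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=timeout,  # subprocess.run kills the child when this expires
            check=False,
        )
    except subprocess.TimeoutExpired as exc:
        raise TimeoutSketch(f"{cmd[0]} exceeded {timeout}s") from exc


if __name__ == "__main__":
    # Use the current interpreter so the example does not depend on grep.
    result = run_with_deadline([sys.executable, "-c", "print('ok')"], timeout=30)
    print(result.stdout.strip())
```

Passing the command as a list avoids shell interpretation of user-supplied text, which complements the timeout.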
For existing tools using the original common_utils.py:
- No breaking changes - All original functions preserved
- New imports available - Add security and subprocess utilities as needed
- Consider upgrading file operations - Use atomic_write for critical files
- Add error context - Wrap operations with ErrorContext for better debugging
- Always validate paths from user input or external sources
- Use atomic operations for configuration files and critical data
- Set reasonable timeouts for all subprocess and long-running operations
- Create backups before destructive operations
- Track operations with OperationManifest when reversibility is needed
- Handle platform differences gracefully (test on target platforms)
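The advice about atomic operations rests on a common pattern: write to a temporary file in the same directory, then rename it over the target. The sketch below illustrates that pattern under the assumption that `atomic_write` works in a similar way; `atomic_write_sketch` is a hypothetical name and not the module's code.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def atomic_write_sketch(path, encoding="utf-8"):
    """Write to a temp file next to path, then replace path in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as f:
            yield f
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # replaces the target in a single rename
    except BaseException:
        # On any failure the original file is untouched; drop the temp file.
        with contextlib.suppress(FileNotFoundError):
            os.remove(tmp_path)
        raise
```

If the body raises, the target keeps its previous contents, which is the property that makes this suitable for configuration files.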
The module integrates with the error logging system:
- Errors are automatically logged to ~/.pytoolserrors/
- Context information is preserved for debugging
- Use analyze_errors.py to review logged errors
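To make the idea of preserved context concrete, here is a minimal sketch of a context manager that records an error together with caller-supplied metadata and then lets the exception propagate. The class name `ErrorContextSketch`, the JSON Lines format, and the `log_path` parameter are assumptions for illustration; the real log layout under ~/.pytoolserrors/ is not described in this document.

```python
import json
import time


class ErrorContextSketch:
    """Hypothetical illustration of logging an error with its context."""

    def __init__(self, operation, metadata=None, log_path="errors.jsonl"):
        self.operation = operation
        self.metadata = metadata or {}
        self.log_path = log_path

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            record = {
                "timestamp": time.time(),
                "operation": self.operation,
                "metadata": self.metadata,
                "error_type": exc_type.__name__,
                "message": str(exc),
            }
            # Append one JSON object per line so the log is easy to scan.
            with open(self.log_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(record) + "\n")
        return False  # never swallow the exception
```

Returning False from `__exit__` means the caller still sees the original exception; the context manager only adds a log record.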
- Resource limits may impact performance on Linux
- Atomic writes use temporary files (ensure adequate disk space)
- File hashing reads entire file (consider for large files)
- Subprocess sanitization has minimal overhead
- Path validation prevents directory traversal attacks
- Command sanitization removes shell metacharacters
- Atomic operations prevent partial file corruption
- Resource limits prevent runaway processes (Linux)
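The path traversal check can be illustrated with a short sketch: resolve the candidate path and confirm it still lies under the allowed base directory. This is an assumption about the general technique, not the code of `validate_path`; `validate_path_sketch` is a hypothetical name.

```python
import os


def validate_path_sketch(user_path, base_dir):
    """Return the resolved path if it stays inside base_dir, else raise ValueError."""
    if "\x00" in user_path:
        raise ValueError("null byte in path")
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the comparison.
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Comparing with `os.path.commonpath` rather than a string prefix avoids accepting a sibling such as `/allowed/dir-evil` when the base is `/allowed/dir`.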
This enhanced module provides a solid foundation for building secure, reliable Python tools with consistent behavior and enterprise-grade safety features.