Python CLI for harvesting files from Telegram channels, extracting archives, deduplicating results, and producing channel-scoped outputs. It supports both channels with known static passwords and funnel channels where the archive password is written in the Telegram post itself.
- Sequential channel download workflow using tdl
- Archive extraction with 7z
- Static passwords via password, password1, password2, and similar CSV columns
- Post-derived passwords via password_source=password_in_post
- Same-message password matching for funnel channels
- Cross-channel and per-channel deduplication with rdfind
- Text aggregation and sorted unique combo output
- Optional stealer-log processing into credentials.csv and autofills.csv
- Channel validation mode that can comment out inactive CSV rows
- Python 3.10+
- Optional: tqdm for progress bars
- External command-line tools: tdl, 7z, rdfind, sort
In normal processing mode all four are required. In --process-only mode, tdl is not required because the script only processes files already present in the downloads directory.
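The requirement rule above can be sketched as a small pre-flight check. This is a minimal illustration, not the script's actual code: the function name missing_tools and the injectable which parameter are hypothetical, and only the tool names and the --process-only exception come from this README.

```python
import shutil

# Tool names taken from the requirements list; everything else is illustrative.
ALL_TOOLS = ("tdl", "7z", "rdfind", "sort")


def missing_tools(process_only=False, which=shutil.which):
    """Return the required external tools that cannot be found on PATH.

    In process-only mode tdl is skipped, because nothing is downloaded.
    The `which` parameter exists only so the lookup can be replaced in tests.
    """
    required = [tool for tool in ALL_TOOLS if not (process_only and tool == "tdl")]
    return [tool for tool in required if which(tool) is None]
```

Calling missing_tools(process_only=True) before any work starts would let a run fail early with a clear list of what to install.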
python3 telegram_processor.py \
--input channels.csv \
--start 01-04-2026 \
--end 22-04-2026

Useful options:
- --output-dir: output directory for final results
- --download-dir: directory for downloaded and extracted channel files
- --settings: path to settings.json
- --verbose: log tdl and extraction command details
- --process-only: skip Telegram downloads and process files already in --download-dir
- --auto-clean: remove processed channel directories without prompting
python3 telegram_processor.py \
--input channels.csv \
--check-channels

To comment out inactive rows in place:
python3 telegram_processor.py \
--input channels.csv \
--check-channels \
--comment-missing

--check-channels does not require --start or --end.
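One way to express the rule that dates are only needed outside channel-check mode is sketched below with argparse. The flag names come from this README; the validation logic, the helper names, and the assumption that every other mode requires both dates are illustrative rather than taken from the script. The date format is read as DD-MM-YYYY, which the example value 22-04-2026 implies.

```python
import argparse
from datetime import datetime


def parse_date(value):
    # DD-MM-YYYY, matching the example invocation (22-04-2026).
    return datetime.strptime(value, "%d-%m-%Y").date()


def parse_args(argv=None):
    parser = argparse.ArgumentParser(prog="telegram_processor.py")
    parser.add_argument("--input", required=True)
    parser.add_argument("--start", type=parse_date)
    parser.add_argument("--end", type=parse_date)
    parser.add_argument("--check-channels", action="store_true")
    parser.add_argument("--comment-missing", action="store_true")
    parser.add_argument("--process-only", action="store_true")
    args = parser.parse_args(argv)
    # Assumption: every mode except --check-channels needs a date range.
    if not args.check_channels and (args.start is None or args.end is None):
        parser.error("--start and --end are required unless --check-channels is given")
    return args
```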
Inactive channels are classified from tdl output as:
- not_found: channel username or target no longer exists
- inaccessible: private, forbidden, or otherwise unreachable from the current account
- error: an unexpected failure while checking
When --comment-missing is used, inactive rows are rewritten like this:
# SomeChannel,@somechannel,secret # inaccessible

The CSV must start with:
- name
- channel
Supported optional columns:
- password_source
- password, password1, password2, and any additional password columns after channel
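A minimal sketch of reading this column layout follows. It assumes that password columns are exactly those whose header starts with "password" (other than password_source), that rows beginning with # are skipped, and that a missing password_source means static passwords. The function name load_channels and the "static" label are hypothetical.

```python
import csv
import io


def load_channels(csv_text):
    """Parse channel rows into dicts, skipping blank and commented-out lines."""
    lines = [
        line for line in csv_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    reader = csv.DictReader(io.StringIO("\n".join(lines)))
    channels = []
    for row in reader:
        # Collect non-empty values from password, password1, password2, ...
        passwords = [
            value for key, value in row.items()
            if key and key.startswith("password") and key != "password_source" and value
        ]
        channels.append({
            "name": row["name"],
            "channel": row["channel"],
            # "static" is an illustrative default, not a documented value.
            "password_source": row.get("password_source") or "static",
            "passwords": passwords,
        })
    return channels
```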
Static-password example:
name,channel,password
Channel1,@channel1,password123
Channel2,@channel2,

Funnel-channel example:
name,channel,password_source
OnlyLogsCloud,@OnlyLogsCloud,password_in_post

- The processor reads the exported Telegram message for each downloaded archive.
- It only accepts passwords found in the same message as the archive file.
- It looks for case-insensitive pass or password markers followed by a delimiter such as :, =, -, or whitespace.
- Supported examples include:
  - pass: 123
  - Password: @OnlyLogsCloud
  - Password FULL LOGS - @BurnCloudLogs
- Unlabeled promo links and reserve-channel links are ignored.
- If one message contains multiple archives and one matching password, that password is applied to each archive from that message.
- Static password columns are ignored when password_source=password_in_post.
- If no same-message password is found, or extraction still fails, that archive is skipped and processing continues.
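The matching rules above can be approximated with two regular expressions. This is one interpretation that reproduces the three listed examples, not the script's actual patterns; the function names are hypothetical. The first pattern allows filler words between the marker and a :, =, or - delimiter on the same line, and the second falls back to a plain whitespace delimiter.

```python
import re

# Marker, optional filler on the same line, then an explicit delimiter.
_DELIMITED = re.compile(r"\bpass(?:word)?\b[^\n:=\-]*[:=\-][ \t]*(\S+)", re.IGNORECASE)
# Fallback: marker followed only by spaces or tabs.
_SPACED = re.compile(r"\bpass(?:word)?\b[ \t]+(\S+)", re.IGNORECASE)


def extract_post_password(message_text):
    """Return the first labeled password in a message, or None."""
    for pattern in (_DELIMITED, _SPACED):
        match = pattern.search(message_text)
        if match:
            return match.group(1)
    return None


def passwords_for_message(message_text, archive_names):
    """Apply the single same-message password to every archive in that message."""
    password = extract_post_password(message_text)
    return {name: password for name in archive_names} if password else {}
```

Because a pass or password marker is required, a message that contains only a promo or reserve-channel link yields None and its archives are left without a password.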
Already-commented CSV rows are ignored by the channel checker.
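The commented-row format shown earlier, together with the rule that already-commented rows are left alone, can be sketched as follows. The sketch assumes the first line is the header, that the channel is the second comma-separated field, and that fields contain no quoted commas; the function name comment_inactive and the status_by_channel mapping are hypothetical.

```python
def comment_inactive(lines, status_by_channel):
    """Return CSV lines with inactive channel rows rewritten as comments."""
    result = []
    for index, line in enumerate(lines):
        text = line.rstrip("\n")
        fields = text.split(",")  # naive split: assumes no quoted commas
        channel = fields[1].strip() if len(fields) > 1 else ""
        status = status_by_channel.get(channel)
        if index == 0 or text.lstrip().startswith("#") or status is None:
            # Header, already-commented rows, and active channels pass through.
            result.append(text)
        else:
            result.append(f"# {text} # {status}")
    return result
```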
- Load channel definitions from the CSV.
- In normal mode, export channel messages with tdl chat export and download matching files.
- Deduplicate downloaded files across channels with rdfind.
- Extract archives with 7z.
- For passworded archives, try either:
  - configured static passwords, then no password
  - or the same-message post password when password_in_post is enabled
- Deduplicate extracted files inside each channel directory.
- Process stealer-log outputs when archives are present.
- Combine and sort unique text output into a *-combo.csv file.
- Move final result files into the output directory.
- Optionally clean channel working directories.
- Optionally clean channel working directories.
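The password step in the workflow above can be sketched as an ordered list of attempts. This is an illustration under stated assumptions: the function names are hypothetical, None stands for an attempt without a password, and try_extract is a caller-supplied callable (for example a wrapper around 7z) that returns True on success.

```python
def password_attempts(password_source, static_passwords, post_password):
    """Build the ordered list of passwords to try for one archive."""
    if password_source == "password_in_post":
        # Static columns are ignored; only the same-message password counts.
        return [post_password] if post_password else []
    # Static mode: configured passwords first, then no password at all.
    return [p for p in static_passwords if p] + [None]


def extract_first_success(archive, attempts, try_extract):
    """Try each password in order; stop at the first successful extraction."""
    for password in attempts:
        if try_extract(archive, password):
            return True
    return False  # the caller skips this archive and continues
```

Keeping the attempt order separate from the extraction call makes the fallback behavior easy to test without invoking an external tool.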
Depending on the channel contents, the script may emit:
- {channel_name}-{month-year}-combo.csv
- {channel_name}-{month-year}-credentials.csv
- {channel_name}-{month-year}-autofills.csv
All final outputs are written to the configured output directory.
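As a rough illustration of the naming pattern, the sketch below builds the three file names from a channel name and a date. The README does not state how {month-year} is rendered, so the MM-YYYY format used here is an assumption, as is the function name output_names.

```python
from datetime import date


def output_names(channel_name, period, kinds=("combo", "credentials", "autofills")):
    """Build channel-scoped output file names for one month."""
    month_year = period.strftime("%m-%Y")  # assumed rendering of {month-year}
    return [f"{channel_name}-{month_year}-{kind}.csv" for kind in kinds]
```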
settings.json controls download, extraction, sorting, and subprocess behavior. The tracked file in this repo is a sample configuration; update the stealer_log_processor.path for your machine if you use that integration.
Current sections:
- stealer_log_processor
- tdl
- sort
- archive
- processing
- logging
- subprocess
Notable keys:
- tdl.max_parallel_downloads
- tdl.export_channel_threads
- tdl.excluded_extensions
- archive.extract_patterns
- archive.supported_extensions
- archive.extract_timeout
- archive.max_parallel_extractions
- sort.temp_dir
- processing.max_workers
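The dotted key names above suggest nested JSON objects, one per section. A minimal sketch of looking up such a key with a fallback is shown below; the helper get_setting is hypothetical and the sample values are placeholders, not the repo's defaults.

```python
import json


def get_setting(settings, dotted_key, default=None):
    """Resolve a key such as 'tdl.max_parallel_downloads' in nested dicts."""
    node = settings
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Placeholder values for illustration only.
sample = json.loads('{"tdl": {"max_parallel_downloads": 4}, "archive": {"extract_timeout": 600}}')
```

For example, get_setting(sample, "sort.temp_dir", "/tmp") returns the fallback because the sample has no sort section.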
- channels.csv is intentionally gitignored and treated as local operator data.
- If tqdm is not installed, the script still works; it only falls back to plain logging.