A Python tool for analyzing self-citation patterns in Google Scholar profiles.
Scholar Citations helps researchers and evaluators analyze self-citation patterns in Google Scholar profiles. Self-citations (authors citing their own previous work) are a normal part of academic publishing, but excessive self-citation can skew metrics such as the h-index and citation counts.
This tool allows you to:
- Analyze any Google Scholar profile to identify self-citations
- Calculate self-citation percentages and metrics
- Generate detailed reports of self-citation patterns
- Estimate self-citation counts for highly-cited papers using sampling
doi: https://doi.org/10.1038/d41586-019-02479-7
"Vaidyanathan, a computer scientist at the Vel Tech R&D Institute of Technology, a privately run institute, is an extreme example: he has received 94% of his citations from himself or his co-authors up to 2017, according to a study in PLoS Biology this month."
https://github.com/rahvis/scholar_citations/blob/main/notes/vannoorden2019.pdf
pip install scholar-citations
- Python 3.7 or higher
- Google Chrome browser (for Selenium automation)
scholar-citations "https://scholar.google.com/citations?user=USER_ID"
Replace USER_ID with the ID from the Google Scholar profile URL you want to analyze.
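If you need to pull the ID out of a full profile URL programmatically, the minimal sketch below reads the `user` query parameter with Python's standard library. The function name `extract_user_id` is hypothetical and is not part of scholar-citations.

```python
from urllib.parse import urlparse, parse_qs


def extract_user_id(profile_url: str) -> str:
    """Return the value of the `user` query parameter from a profile URL."""
    # parse_qs maps each query parameter to a list of its values
    query = parse_qs(urlparse(profile_url).query)
    values = query.get("user")
    if not values:
        raise ValueError(f"No 'user' parameter found in: {profile_url}")
    return values[0]


# Extra parameters such as hl=en are ignored
print(extract_user_id("https://scholar.google.com/citations?user=USER_ID&hl=en"))
# prints: USER_ID
```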
# Analyze only the first 20 papers
scholar-citations "https://scholar.google.com/citations?user=USER_ID" --max-papers 20
# Check only 10 citations per paper (for faster analysis)
scholar-citations "https://scholar.google.com/citations?user=USER_ID" --max-citations 10
# Save detailed results to a JSON file
scholar-citations "https://scholar.google.com/citations?user=USER_ID" --output results.json
# Show the browser window (useful for solving CAPTCHAs)
scholar-citations "https://scholar.google.com/citations?user=USER_ID" --visible
# Enable debug logging
scholar-citations "https://scholar.google.com/citations?user=USER_ID" --debug
scholar-citations --help
usage: scholar-citations [-h] [--max-papers MAX_PAPERS] [--max-citations MAX_CITATIONS] [--output OUTPUT] [--visible] [--debug] url

Analyze self-citations on Google Scholar

positional arguments:
  url                   Google Scholar profile URL

optional arguments:
  -h, --help            show this help message and exit
  --max-papers MAX_PAPERS
                        Maximum number of papers to analyze
  --max-citations MAX_CITATIONS
                        Maximum number of citations to check per paper
  --output OUTPUT       Output file for detailed results (JSON)
  --visible             Show browser window during analysis
  --debug               Enable debug logging

- Anti-detection measures: Uses sophisticated browser fingerprinting techniques to avoid detection
- Robust author matching: Intelligently matches different formats of author names to detect self-citations
- Progress saving: Saves intermediate results to avoid losing progress if the process is interrupted
- Sampling: For papers with many citations, examines a representative sample and extrapolates results
- Detailed reporting: Provides both summary statistics and detailed paper-by-paper analysis
- CAPTCHA handling: When run with --visible, allows you to solve CAPTCHAs if they appear
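The sampling feature listed above can be illustrated with a short sketch. This is not the tool's actual implementation, and how scholar-citations selects its sample is not described here. The sketch assumes a uniform random sample and a hypothetical callback `is_self_citation(index)` that reports whether the citation at a given position is a self-citation.

```python
import random


def estimate_self_citations(total_citations, is_self_citation, sample_size, seed=0):
    """Estimate the number of self-citations by checking a random sample.

    `is_self_citation` is a hypothetical callable: citation index -> bool.
    """
    if total_citations <= sample_size:
        # Few enough citations to check every one; the result is exact
        return sum(1 for i in range(total_citations) if is_self_citation(i))

    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    sample = rng.sample(range(total_citations), sample_size)
    hits = sum(1 for i in sample if is_self_citation(i))

    # Scale the sampled self-citation rate up to the full citation count
    return round(hits / sample_size * total_citations)


# Toy data: every fourth citation is a self-citation (true count is 250)
estimate = estimate_self_citations(1000, lambda i: i % 4 == 0, sample_size=100)
print(f"Estimated self-citations: {estimate} of 1000")
```

A larger sample narrows the error of the estimate at the cost of more requests, which is the trade-off that --max-citations exposes on the command line.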
======= RESULTS =======
Author: Author XYZ
Papers analyzed: 101 of 101
Total citations: 129
Self-citations: 21
Self-citation percentage: 16.28%
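The percentage in the summary above is the self-citation count divided by the total citation count. A minimal sketch of that arithmetic follows; the function name `self_citation_percentage` is illustrative and is not taken from the package.

```python
def self_citation_percentage(self_citations: int, total_citations: int) -> float:
    """Return self-citations as a percentage of all citations, to 2 decimals."""
    if total_citations == 0:
        # A profile with no citations has no self-citation share to report
        return 0.0
    return round(100 * self_citations / total_citations, 2)


# Counts from the example output: 21 self-citations out of 129 citations
print(f"Self-citation percentage: {self_citation_percentage(21, 129):.2f}%")
# prints: Self-citation percentage: 16.28%
```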
- The tool visits the specified Google Scholar profile
- It extracts the list of publications by the author
- For each publication, it analyzes the "Cited by" list
- It compares author lists to identify overlaps (self-citations)
- It calculates statistics and generates a report
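The author-comparison step above can be sketched as follows. This is one plausible approach under stated assumptions, not the tool's actual matching logic: each name is reduced to a (last name, first initial) key so that "Jane Q. Doe", "J Doe", and "Doe, J." compare equal. The helpers `name_key` and `is_self_citation` are hypothetical.

```python
import unicodedata


def name_key(name: str) -> tuple:
    """Reduce a name to (last name, first initial), ignoring case and accents."""
    # Strip accents so that "García" and "Garcia" produce the same key
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    if "," in ascii_name:
        # "Last, First" format
        last, _, first = ascii_name.partition(",")
        first = first.replace(".", " ").strip()
    else:
        # "First Middle Last" or "F M Last" format
        parts = ascii_name.replace(".", " ").split()
        first = parts[0] if len(parts) > 1 else ""
        last = parts[-1] if parts else ""
    return (last.strip().lower(), first[:1].lower())


def is_self_citation(paper_authors, citing_authors) -> bool:
    """True if any author key appears in both author lists."""
    paper_keys = {name_key(a) for a in paper_authors}
    citing_keys = {name_key(a) for a in citing_authors}
    return bool(paper_keys & citing_keys)


print(is_self_citation(["Jane Q. Doe", "Ravi Kumar"], ["Doe, J.", "A Smith"]))
# prints: True
```

Matching on last name plus first initial is deliberately loose: it tolerates abbreviated names but can also produce false positives when two different authors share a surname and initial.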
git clone https://github.com/rahvis/scholar_citations.git
cd scholar_citations
pip install -e .
pip install pytest
pytest tests/
================================================================== test session starts ==================================================================
platform darwin -- Python 3.9.21, pytest-8.3.4, pluggy-1.5.0
rootdir: /Users/rahul/Downloads/scholar_citations
configfile: pyproject.toml
plugins: cov-6.0.0
collected 3 items
tests/test_analyzer.py ... [100%]
=================================================================== 3 passed in 0.03s ===================================================================
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this tool in your research, please cite it as:
Vishwakarma, R. (2025). Scholar Citations: A tool for analyzing self-citation patterns in Google Scholar profiles. [Software]. Available from https://pypi.org/project/scholar-citations/
This tool is meant for academic and research purposes only. Please use responsibly and respect Google Scholar's terms of service. The tool includes rate limiting and anti-detection features to minimize impact on Google's servers.