Skip to content

Add lint to prevent (most) URLs #104#106

Open
KoxyG wants to merge 9 commits intoethereum:masterfrom
KoxyG:add_lint
Open

Add lint to prevent (most) URLs #104#106
KoxyG wants to merge 9 commits intoethereum:masterfrom
KoxyG:add_lint

Conversation

@KoxyG
Copy link
Copy Markdown

@KoxyG KoxyG commented Sep 4, 2024

Pls review and let me know if there is anything that needs changes @SamWilsn

// Report the issue
let source = self.ctx.source_for_text(ast.sourcepos.start.line, text);
let offending_url = self.re.find(text).unwrap().as_str();
let suggestion_label = format!("Avoid using backticks in URLs: `{}`", offending_url);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't investigated the code yet, but this text seems to indicate it does the opposite of what we need.

The lint is supposed to make sure authors don't put URLs in backticks, not that they don't put backticks in URLs.

Comment on lines +15 to +16
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct PreventUrlsNoBackticks<S>(pub S);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably look something like:

Suggested change
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct PreventUrlsNoBackticks<S>(pub S);
#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct PreventUrlsNoBackticks<S> {
allowed_domains: Vec<S>,
}

Then the default configuration in lib.rs would look like:

        (
            "markdown-prevent-url",
            MarkdownPreventUrl {
                allowed_domains: vec![
                    r"(.+\.)?example.com",
                    r"(.+\.)?invalid",
                    // And so on...
                ]
            }
        ), 

header: value1
---

Check this URL: `http://example.com/page?query=foo`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should write tests based on the examples in #104. This URL is very close to one of the examples I gave that should not trigger the lint.

In other words, this URL should be allowed because its domain is example.com—one of the allowed example domains at the top of #104.

@KoxyG
Copy link
Copy Markdown
Author

KoxyG commented Sep 7, 2024

Hi @SamWilsn I just made changes. Kindly check and help review.

Copy link
Copy Markdown
Contributor

@SamWilsn SamWilsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick review. You'll need to resolve the compilation issues before we can go any further.


// Regex to match URLs not from allowed domains and containing backtick

let re = Regex::new(r"https?://(?:example\.com|example\.net|example\.org|example|invalid|test)(?:/[^`\s]*)?").unwrap();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't quite figured out what this regex is for yet, but you shouldn't hardcode the list of domains here. It could get out of sync with self.allowed_domains.

// If the URL is in backticks, report it
if in_backticks {
let source = self.ctx.source_for_text(ast.sourcepos.start.line, text);
let message = format!("URLs are not allowed in backticks: `{}`", url);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let message = format!("URLs are not allowed in backticks: `{}`", url);
let message = "Valid URLs are not allowed in backticks";

I think we can use a simpler error message here.

title: Some(Annotation {
annotation_type: self.ctx.annotation_type(),
id: Some(self.slug),
label: Some("not from allowed domain found"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
label: Some("not from allowed domain found"),
label: Some("non-reserved domain"),

How about that?

Comment on lines +110 to +133
// Report other disallowed URLs
let source = self.ctx.source_for_text(ast.sourcepos.start.line, text);
let suggestion_label = format!("This URL must be hyperlinked or from an allowed domain: `{}`", url);

self.ctx.report(Snippet {
opt: Default::default(),
title: Some(Annotation {
annotation_type: self.ctx.annotation_type(),
id: Some(self.slug),
label: Some("Disallowed URL in plain text or code fence"),
}),
slices: vec![Slice {
fold: false,
line_start: ast.sourcepos.start.line,
origin: self.ctx.origin(),
source: &source,
annotations: vec![],
}],
footer: vec![Annotation {
annotation_type: AnnotationType::Help,
id: None,
label: Some(&suggestion_label),
}],
})?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not immediately obvious why this section is separate from the above one. Could you explain?

Comment on lines +145 to +171
impl<'a, 'b, 'c> tree::Visitor for Visitor<'a, 'b, 'c> {
type Error = Error;

fn enter_link(&mut self, _ast: &Ast, _link: &str) -> Result<Next, Self::Error> {
Ok(Next::TraverseChildren)
}

fn enter_code(&mut self, ast: &Ast, code: &comrak::nodes::NodeCode) -> Result<Next, Self::Error> {
self.check(ast, &code.literal, false, false)
}

fn enter_text(&mut self, ast: &Ast, text: &str) -> Result<Next, Self::Error> {
self.check(ast, text, false, false)
}

fn enter_html_inline(&mut self, ast: &Ast, html: &str) -> Result<Next, Self::Error> {
// For inline HTML links like <https://link>
self.check(ast, html, true, false)
}

fn enter_code_block(&mut self, ast: &Ast, block: &comrak::nodes::NodeCodeBlock) -> Result<Next, Self::Error> {
let code_text = String::from_utf8_lossy(&block.literal);
self.check(ast, &code_text, false, false)
}


} No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're missing a few cases here I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants