HRINT-4790/4788/4789: SEO analysis, GenAI spam check, social media posting #138

Open
ayende wants to merge 17 commits into DOTNET8-MIGRATION from HRINT-4790-seo-analysis-rebased

ayende (Owner) commented May 15, 2026

Summary

  • HRINT-4790: AI-powered SEO analysis — GenAI task generates meta descriptions, keywords, and JSON-LD structured data for blog posts
  • HRINT-4788: Replace Akismet with GenAI-based pending comment moderation — comments start as Pending, evaluated by AI, with daily spam digest emails
  • HRINT-4789: Social media auto-publishing — GenAI generates platform-tailored text for Twitter/Reddit, posted via RavenDB subscription with @social tag convention
  • Fix post body heading styles so section headings sit below the post title

Key changes

SEO (HRINT-4790)

  • Post model: SeoMetaDescription, SeoKeywords, SeoLastAnalyzedAt
  • SeoHelper: JSON-LD BlogPosting structured data
  • Public view: AI meta description with fallback, keywords meta tag, JSON-LD
  • Admin views: read-only SEO panel in post details and edit form
  • GenAI seo-analysis task on Posts collection
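
As a rough illustration of the JSON-LD and fallback behaviour listed above, the following sketch builds a schema.org BlogPosting object and serializes it for embedding in a script tag. It is not the PR's SeoHelper: the function names, the argument shape (title, url, publishedAt, description, keywords), and the escaping strategy are all assumptions made for the example.

```javascript
// Sketch only: every name here is hypothetical, not taken from SeoHelper.
function buildBlogPostingJsonLd({ title, url, publishedAt, description, keywords }) {
  const jsonLd = {
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    headline: title,
    url: url,
    datePublished: publishedAt,
  };
  // Optional fields are emitted only when the AI analysis produced them.
  if (description) jsonLd.description = description;
  if (Array.isArray(keywords) && keywords.length > 0) {
    jsonLd.keywords = keywords.join(", ");
  }
  return jsonLd;
}

// Serialize for <script type="application/ld+json">. Escaping "<" keeps a
// "</script>" inside a title from closing the tag early.
function toScriptSafeJson(value) {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

// "AI meta description with fallback": use the AI text when it is non-empty,
// otherwise whatever default the caller supplies.
function pickMetaDescription(aiDescription, fallback) {
  const text = (aiDescription || "").trim();
  return text.length > 0 ? text : fallback;
}
```

The escaped output still parses back to the original strings, so consumers of the JSON-LD see the unescaped title.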

Spam check (HRINT-4788)

  • SpamCheckStatus enum (Pending, Valid, Spam) on PostComments.Comment
  • New comments start as Pending, visible only to poster or authenticated users
  • GenAI spam-filter task evaluates comments — Valid triggers email, Spam decrements count and accumulates in daily digest
  • Sets IsTrustedCommenter on valid comments (CAPTCHA bypass for returning commenters)
  • Remove Akismet, TaskExecutor, AddCommentTask, SendEmailTask infrastructure
  • New EmailSubscription worker via RavenDB data subscription
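
The Pending to Valid/Spam flow described above can be sketched as a small state transition. This is an in-memory illustration, not the GenAI UpdateScript: the `state` object merges the comments document and the post's CommentsCount into one hypothetical shape, `emails` and `digest` stand in for the EmailCommands and SpamDigest documents, and the IsTrustedCommenter update is left out.

```javascript
// Hypothetical stand-in for a PostComments document plus related counters.
function applyVerdict(state, commentId, verdict) {
  const index = state.Comments.findIndex((c) => c.Id === commentId);
  if (index === -1) return state;
  const comment = state.Comments[index];
  // Only Pending comments are evaluated; earlier decisions are left alone.
  if (comment.SpamCheckStatus !== "Pending") return state;

  if (verdict === "Valid") {
    comment.SpamCheckStatus = "Valid";
    state.emails.push({ Type: "NewComment", CommentId: comment.Id });
  } else if (verdict === "Spam") {
    comment.SpamCheckStatus = "Spam";
    state.Comments.splice(index, 1); // move out of the visible list
    state.Spam.push(comment);
    state.CommentsCount -= 1;
    state.digest.push({ CommentId: comment.Id, Author: comment.Author });
  }
  return state;
}
```

Treating anything other than Pending as already decided keeps the transition safe to re-run, which seems desirable if the task fires more than once for the same document.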

Social media (HRINT-4789)

  • SocialMedia model: TwitterText, RedditTitle, GeneratedAt, DisableAutoPublish
  • GenAI social-media task generates text on post updates, skips historical backlog
  • SocialPostingSubscription via RavenDB subscription (replaces FluentScheduler polling)
  • @refresh metadata for future posts triggers posting at publish time
  • Tag convention: @social (all), @social/reddit, @social/twitter, @social/disable
  • @-prefixed tags filtered from all public views
  • Reddit posting pending RedditSharp 2.0 migration; Twitter is a stub
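
A minimal sketch of the tag convention, assuming tags are compared by their literal names (the PR stores them as slugs, whose exact form is not shown here). The function names are hypothetical; the legacy `reddit` tag is the one the commit message says is still accepted.

```javascript
// Map a post's tags to the platforms it should be published on.
function resolvePlatforms(tags) {
  const set = new Set(tags);
  // @social/disable wins over everything else.
  if (set.has("@social/disable")) return [];
  const platforms = new Set();
  if (set.has("@social")) {
    platforms.add("reddit");
    platforms.add("twitter");
  }
  if (set.has("@social/reddit") || set.has("reddit")) platforms.add("reddit");
  if (set.has("@social/twitter")) platforms.add("twitter");
  return [...platforms].sort();
}

// Control tags start with "@" and are hidden from public views.
function publicTags(tags) {
  return tags.filter((tag) => !tag.startsWith("@"));
}
```

For example, `resolvePlatforms(["raven", "@social/twitter"])` yields only `twitter`, while `publicTags` of the same list yields only `raven`.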

Heading fix

  • CSS: demote h1-h6 inside .text-wrapper so body headings sit below post title

Manual setup required in RavenDB Studio

  • ai-chat AI connection string (chat-capable model)
  • email-worker data subscription (EmailCommands collection)
  • social-posting data subscription (Posts collection)

Migration patch for existing comments

Run in RavenDB Studio to set SpamCheckStatus on existing comments:

from PostComments
update {
    for (var i = 0; i < this.Comments.length; i++) {
        if (!this.Comments[i].SpamCheckStatus) {
            this.Comments[i].SpamCheckStatus = 'Valid';
        }
    }
    for (var i = 0; i < this.Spam.length; i++) {
        if (!this.Spam[i].SpamCheckStatus) {
            this.Spam[i].SpamCheckStatus = 'Spam';
            this.Spam[i].IsSpam = true;
        }
    }
}
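
To check the patch logic outside RavenDB, the same loops can be run against a plain object. This is a sketch for local reasoning only: `migrateSpamCheckStatus` is a hypothetical name, and the `|| []` guards are an addition for documents that lack a Comments or Spam array, which the patch above does not handle.

```javascript
// Same logic as the patch, applied to a plain object instead of `this`.
function migrateSpamCheckStatus(doc) {
  for (const comment of doc.Comments || []) {
    if (!comment.SpamCheckStatus) comment.SpamCheckStatus = "Valid";
  }
  for (const comment of doc.Spam || []) {
    if (!comment.SpamCheckStatus) {
      comment.SpamCheckStatus = "Spam";
      comment.IsSpam = true;
    }
  }
  return doc;
}
```

Because each branch only fires when SpamCheckStatus is missing, running the migration a second time changes nothing, and comments already marked Pending are left untouched.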

Model compatibility

All changes are additive (new fields only). Old code ignores new fields; RavenDB preserves unknown JSON fields on load/save. Safe for rolling deployment.

Test plan

  • Verify build passes (dotnet build)
  • Deploy and configure ai-chat connection string in RavenDB Studio
  • Create email-worker and social-posting data subscriptions
  • Run the SpamCheckStatus migration patch
  • Verify existing comments still display correctly
  • Submit a new comment — should appear as Pending, then be evaluated by GenAI
  • Verify SEO meta tags and JSON-LD appear on post detail pages
  • Add @social tag to a post and verify social text is generated
  • Verify @-prefixed tags are hidden from public views
  • Check post body headings are visually smaller than post title

ayende and others added 17 commits May 15, 2026 16:19
SEO analysis (HRINT-4790):
- Post model: SeoMetaDescription, SeoKeywords, SeoLastAnalyzedAt fields
- SeoHelper: JSON-LD BlogPosting structured data for public views
- Public view: AI meta description with fallback, keywords meta tag, JSON-LD
- Admin views: SEO analysis panel (read-only) in post details and edit
- GenAI task on Posts collection generates SEO metadata via "ai-chat" connection

Replace Akismet with GenAI spam check (HRINT-4788):
- SpamCheckStatus enum (Pending, Valid, Spam) on PostComments.Comment
- Comments start as Pending, visible only to poster (by cookie) or authenticated users
- GenAI task evaluates pending comments, marks Valid or Spam
- Valid: sets IsTrustedCommenter, triggers email notification
- Spam: moves to Spam list, decrements CommentsCount, accumulates in daily SpamDigest
- Remove Akismet dependency, AkismetService, TaskExecutor, AddCommentTask infrastructure
- EmailSubscription worker consumes EmailCommands via RavenDB subscription
- PostComments_CreationDate index filters out Pending comments
- Admin UI: PENDING/SPAM labels on comments, SpamCheckStatus in MarkSpam/MarkHam
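
The visibility rule for Pending comments can be sketched as a predicate. The `viewer` shape (isAuthenticated, commenterId read from the cookie) and the idea that the cookie identifies the poster through a CommenterId match are assumptions for illustration; the controller's actual comparison is not shown in this PR page.

```javascript
// Returns true when a comment must be hidden from this viewer because it is
// still Pending. Other statuses are outside the scope of this rule.
function hidePendingComment(comment, viewer) {
  if (comment.SpamCheckStatus !== "Pending") return false;
  if (viewer.isAuthenticated) return false; // authenticated users see pending comments
  const isPoster =
    viewer.commenterId != null && viewer.commenterId === comment.CommenterId;
  return !isPoster;
}
```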

Infrastructure:
- All GenAI tasks use "ai-chat" connection string (configured manually in RavenDB Studio)
- Document refresh enabled for @refresh-based future post triggers
- "email-worker" data subscription expected in RavenDB Studio

Social media GenAI task:
- Generates platform-tailored TwitterText and RedditTitle for published posts
- Regenerates on post updates, skips historical backlog on initial deployment
- Sets @refresh metadata for future posts to trigger posting at publish time

Social posting subscription:
- RavenDB subscription on Posts collection replaces FluentScheduler polling
- Posts to Reddit (pending RedditSharp 2.0 migration) and Twitter (stub)
- Checks: published, AI text ready, correct tags, not disabled
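
The four checks listed above (published, AI text ready, correct tags, not disabled) could be combined along these lines. This is a sketch, not the subscription worker: `PublishAt` and the literal tag comparison are assumptions, while `Social.GeneratedAt` and `Social.DisableAutoPublish` are the fields named in this PR.

```javascript
// Gate evaluated before posting; `now` is passed in so the check is testable.
function isReadyToPost(post, now) {
  const social = post.Social;
  if (!social || !social.GeneratedAt) return false; // AI text not ready
  if (social.DisableAutoPublish === true) return false; // disabled on the model
  if (new Date(post.PublishAt) > now) return false; // not published yet
  const tags = post.Tags || [];
  if (tags.includes("@social/disable")) return false; // disabled by tag
  const socialTags = ["@social", "@social/reddit", "@social/twitter", "reddit"];
  return tags.some((tag) => socialTags.includes(tag));
}
```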

Tag convention (stored as slugs):
  @social         → post to all platforms
  @social/reddit  → Reddit only
  @social/twitter → Twitter only
  @social/disable → kill all social posting
  Legacy "reddit" tag still accepted for backward compatibility

Filter @-prefixed tags from all public views (post details, post list, related posts)

Model: SocialMedia (TwitterText, RedditTitle, GeneratedAt, DisableAutoPublish) on Post

Demote h1-h6 sizes inside .text-wrapper so body headings are visually
subordinate to the post title (h2 at 36px desktop / 24px mobile):
  h1 → 26px, h2 → 22px, h3 → 18px, h4+ → 14px

Both EmailSubscription and SocialPostingSubscription now call
EnsureSubscriptionExists() before getting the worker. If the named
subscription doesn't exist in RavenDB, it's created automatically.
No more manual setup required in RavenDB Studio.
- email-worker: filter 'where SendTo != null', start from LastDocument
- social-posting: filter 'where Social.GeneratedAt != null', start from LastDocument

ChangeVector = "LastDocument" ensures the subscription only processes
new documents, not the entire backlog. The query filters ensure the
subscription only fires for documents that are actually ready to process.

- Subscriptions trigger on lack of @refresh metadata
- Refactor the spam-filter GenAI task

Fixes in spam filter UpdateScript:
- Fix SpamDigest creation: load || default syntax was broken (comma operator),
  use explicit if/null check with put() passing metadata separately
- Fix undefined blogName: load BlogConfig in Valid branch before referencing it

EmailSubscription resolves SendTo from BlogConfig.OwnerEmail when not set
on the command. The GenAI UpdateScript no longer sets SendTo — the email
worker handles recipient resolution, keeping the UpdateScript simpler.
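
The recipient resolution described above amounts to a simple fallback. In this sketch `resolveSendTo` is a hypothetical name, and throwing when neither value is present is an assumption; the PR text does not say what the worker does in that case.

```javascript
// Prefer the recipient on the command; otherwise use the blog owner's address.
function resolveSendTo(cmd, blogConfig) {
  const sendTo = cmd.SendTo || (blogConfig && blogConfig.OwnerEmail);
  if (!sendTo) {
    throw new Error("No recipient: command has no SendTo and BlogConfig has no OwnerEmail");
  }
  return sendTo;
}
```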

Subscription query changed from 'SendTo != null' to 'Subject != null'
since SendTo is no longer set by the UpdateScript.

- SpamDigest uses load(id) || {} pattern instead of if/else
- Remove BlogName from UpdateScript email commands — the EmailSubscription
  resolves it from BlogConfig at send time (falls back to cmd.BlogName
  for backward compat with existing documents)
- BlogConfig is loaded once per command for both SendTo and BlogName

Fix MarkHam in admin PostsController: the old code iterated all comments
with IsSpam=true instead of just the selected ones. Now only processes
the ham array (comments selected by the admin).

Clean up subscription workers: replace async lambda + await
Task.CompletedTask with synchronous lambda + return Task.CompletedTask.

GenAI tasks extracted to separate files:
- Infrastructure/GenAiTasks/SpamFilterGenAiTask.cs
- Infrastructure/GenAiTasks/SeoAnalysisGenAiTask.cs
- Infrastructure/GenAiTasks/SocialMediaGenAiTask.cs

Social posting subscription:
- Set @refresh on future posts so subscription fires at publish time
- Move Social.GeneratedAt and DisableAutoPublish checks to subscription query
- Twitter posting via X API v2 with Bearer token from BlogConfig
- Reddit posting logged as pending RedditSharp 2.0 migration

Email templates:
- Switch from inline HTML to Scriban 7.2.0 templates
- Proper HTML email layout matching blog style (dark header #2c3e50)
- NewComment template: post link, comment body, author info, action buttons
- SpamDigest template: date summary, spam entries with preview, admin link

Spam rejection for repeat offenders:
- Commenter with >4 spam comments: return success but don't save comment
- User sees "posted soon" message but comment is silently discarded
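
A sketch of the silent-discard rule, assuming the spam count lives on the commenter record under a hypothetical `NumberOfSpamComments` field and that accepted comments are appended to a list as Pending. The threshold of 4 comes from the commit message; everything else is illustrative.

```javascript
const SPAM_THRESHOLD = 4; // "more than 4 spam comments" per the commit message

// Always reports success to the caller; only saves when the commenter is
// below the repeat-offender threshold.
function handleNewComment(commenter, comment, savedComments) {
  const spamCount = (commenter && commenter.NumberOfSpamComments) || 0;
  if (spamCount > SPAM_THRESHOLD) {
    return { success: true, saved: false }; // silently discarded
  }
  savedComments.push({ ...comment, SpamCheckStatus: "Pending" });
  return { success: true, saved: true };
}
```

The `saved` flag exists only so the sketch can be inspected; the user-facing response is the same in both branches.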

BlogConfig: add TwitterBearerToken field + admin settings UI

RavenDB patch put() signature is put(id, doc) or put(id, doc, changeVector).
The third argument must be a string (change vector), not a metadata object.
Set metadata via @metadata property on the document itself instead.
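
The difference between the two call shapes can be illustrated with a stand-in `put`. The fake store below is not RavenDB's implementation; it only mimics the argument check this commit message describes, so the failing and working calls can be compared side by side.

```javascript
// Stand-in for the patch-script put(); hypothetical, for illustration only.
function makeFakeStore() {
  const docs = new Map();
  function put(id, doc, changeVector) {
    const ok =
      changeVector === undefined ||
      changeVector === null ||
      typeof changeVector === "string";
    if (!ok) throw new TypeError("The change vector must be a string or null");
    docs.set(id, doc);
  }
  return { docs, put };
}

// Working shape: metadata travels inside the document under '@metadata'.
function buildEmailCommand(subject) {
  return { Subject: subject, "@metadata": { "@collection": "EmailCommands" } };
}
```

With this stand-in, `put(id, doc, { '@collection': 'EmailCommands' })` throws, while `put(id, buildEmailCommand('...'))` stores the document with its collection set through the embedded metadata.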

Fixes: "The change vector must be a string or null" error when creating
SpamDigest and NewComment EmailCommand documents from the UpdateScript.

Post model: replace flat SeoMetaDescription/SeoKeywords/SeoLastAnalyzedAt
with Post.Seo (SeoMetadata) — matches the Social/Integration pattern.

GenAI UpdateScript writes this.Seo = { MetaDescription, Keywords, LastAnalyzedAt }.
AutoMapper bridges nested model to flat view model properties.
PostInput stays flat for form binding, mapped to SeoMetadata on save.

All three GenAI tasks now try Add first, and if the task already exists,
fall back to Update (keeping the existing change vector). This ensures
prompt/script changes are deployed on app restart without requiring
manual deletion of the task in RavenDB Studio.
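
The Add-then-Update fallback can be sketched against an in-memory registry. Nothing here is the RavenDB client API: `makeRegistry`, `TaskAlreadyExistsError`, and `deployTask` are hypothetical names. The sketch also catches only the "already exists" error, which is a design choice of the example rather than a statement about how the PR handles failures.

```javascript
class TaskAlreadyExistsError extends Error {}

// Hypothetical registry standing in for the server-side task list.
function makeRegistry() {
  const tasks = new Map();
  return {
    tasks,
    add(config) {
      if (tasks.has(config.Name)) throw new TaskAlreadyExistsError(config.Name);
      tasks.set(config.Name, { ...config });
    },
    update(config) {
      if (!tasks.has(config.Name)) throw new Error("Unknown task: " + config.Name);
      tasks.set(config.Name, { ...config });
    },
  };
}

// Try Add first; fall back to Update only when the task already exists.
function deployTask(registry, config) {
  try {
    registry.add(config);
    return "added";
  } catch (err) {
    if (!(err instanceof TaskAlreadyExistsError)) throw err; // unrelated failure
    registry.update(config);
    return "updated";
  }
}
```

Calling `deployTask` on every startup means a changed prompt or script replaces the stored one, while errors unrelated to task existence still surface instead of being retried as updates.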

Copilot AI left a comment

Pull request overview

This PR replaces Akismet-based spam filtering with a RavenDB GenAI pipeline, adds AI-generated SEO metadata (meta description, keywords, JSON-LD structured data) to posts, introduces auto-publishing of posts to Twitter/Reddit driven by @social tag conventions and RavenDB subscriptions, and tweaks LESS so in-body headings sit below the post title. Comments are now created in a Pending state, evaluated by a spam-filter GenAI task, and either accepted (sending a NewComment email) or moved to spam (accumulating into a daily digest). Email delivery and social posting move from FluentScheduler/TaskExecutor to RavenDB data subscriptions (email-worker, social-posting).

Changes:

  • Add GenAI tasks (spam-filter, seo-analysis, social-media) registered at startup, plus EmailSubscription and SocialPostingSubscription workers; remove Akismet service, AddCommentTask, SendEmailTask, BackgroundTask/TaskExecutor.
  • Extend Post with Seo/Social sub-models, PostComments.Comment with SpamCheckStatus, BlogConfig with TwitterBearerToken; surface SEO data in admin and public views, and filter @-prefixed tags from public mappers.
  • Delete Web.config/Web.Debug.config/Web.Release.config, drop the Joel.Net.Akismet reference, add Scriban 7.2.0, and add LESS rules demoting h1..h6 inside .text-wrapper.

Reviewed changes

Copilot reviewed 41 out of 42 changed files in this pull request and generated 13 comments.

| File | Description |
| --- | --- |
| RaccoonBlog.Web/wwwroot/css/styles.less | Demote body heading sizes inside .text-wrapper. |
| RaccoonBlog.Web/Web.Release.config / Web.Debug.config / Web.config | Delete IIS/ASP.NET classic config files. |
| RaccoonBlog.Web/Views/Welcome/Index.cshtml | Remove Akismet key field from welcome form. |
| RaccoonBlog.Web/Views/PostDetails/Details.cshtml | Use AI meta description with fallback, render keywords + JSON-LD, hide @-prefixed related tags. |
| RaccoonBlog.Web/ViewModels/SpamDigestEmailViewModel.cs | New view model for daily spam digest email. |
| RaccoonBlog.Web/ViewModels/PostViewModel.cs | Add SEO fields to public post view model. |
| RaccoonBlog.Web/ViewModels/NewCommentEmailViewModel.cs | Add SpamCheckStatus field. |
| RaccoonBlog.Web/ViewModels/AdminPostDetailsViewModel.cs | Surface SpamCheckStatus on admin comment and SEO fields on post. |
| RaccoonBlog.Web/Services/AkismetService.cs | Delete Akismet service. |
| RaccoonBlog.Web/RaccoonBlog.Web.csproj | Drop Joel.Net.Akismet reference; add Scriban package. |
| RaccoonBlog.Web/Program.cs | Register GenAI tasks, enable document refresh, start subscriptions; drop AutoMapper/auth/index using/cert callback. |
| RaccoonBlog.Web/Models/SocialMedia.cs | New embedded model with Twitter/Reddit text and auto-publish flags. |
| RaccoonBlog.Web/Models/SeoMetadata.cs | New embedded SEO model. |
| RaccoonBlog.Web/Models/SendEmailCommand.cs | New EmailCommands document shape (incl. SpamDigest entries). |
| RaccoonBlog.Web/Models/PostComments.cs | Add SpamCheckStatus enum and field on Comment. |
| RaccoonBlog.Web/Models/Post.cs | Add Social, Seo, and SEO-related PostInput properties. |
| RaccoonBlog.Web/Models/BlogConfig.cs | Add TwitterBearerToken setting. |
| RaccoonBlog.Web/Infrastructure/Tasks/* | Delete TaskExecutor, BackgroundTask, SendEmailTask, AddCommentTask. |
| RaccoonBlog.Web/Infrastructure/SocialPostingSubscription.cs | New subscription worker that posts to Twitter (stub Reddit). |
| RaccoonBlog.Web/Infrastructure/Indexes/PostComments_CreationDate.cs | Exclude Pending comments from the index. |
| RaccoonBlog.Web/Infrastructure/GenAiTasks/SpamFilterGenAiTask.cs | Register spam-filter GenAI task + JS update script (creates emails/digest docs). |
| RaccoonBlog.Web/Infrastructure/GenAiTasks/SocialMediaGenAiTask.cs | Register social-media GenAI task. |
| RaccoonBlog.Web/Infrastructure/GenAiTasks/SeoAnalysisGenAiTask.cs | Register SEO-analysis GenAI task. |
| RaccoonBlog.Web/Infrastructure/EmailSubscription.cs | New subscription worker + Scriban templates for emails. |
| RaccoonBlog.Web/Infrastructure/AutoMapper/Profiles/Resolvers/TagsResolver.cs | Guard against null/empty tag collections. |
| RaccoonBlog.Web/Infrastructure/AutoMapper/Profiles/PostViewModelMapperProfile.cs | Map SEO fields and filter @ tags from public mapping. |
| RaccoonBlog.Web/Infrastructure/AutoMapper/Profiles/PostsViewModelMapperProfile.cs | Filter @ tags. |
| RaccoonBlog.Web/Infrastructure/AutoMapper/Profiles/PostsAdminViewModelMapperProfile.cs | Map SEO fields between PostInput/Post/Admin view models. |
| RaccoonBlog.Web/Infrastructure/AutoMapper/AutoMapperConfiguration.cs | Remove HttpRequest -> AddCommentTask.RequestValues map. |
| RaccoonBlog.Web/Helpers/SeoHelper.cs | New helper to emit JSON-LD BlogPosting. |
| RaccoonBlog.Web/Helpers/RedditHelper.cs | Include new social/social-reddit tags and respect Social.DisableAutoPublish. |
| RaccoonBlog.Web/Controllers/PostDetailsController.cs | Inline comment creation as Pending; hide pending comments from non-authors. |
| RaccoonBlog.Web/Areas/Admin/Views/Settings/Index.cshtml | Drop Akismet field; add Twitter bearer token field. |
| RaccoonBlog.Web/Areas/Admin/Views/Posts/Edit.cshtml | Read-only SEO panel. |
| RaccoonBlog.Web/Areas/Admin/Views/Posts/Details.cshtml | Show SEO summary and SpamCheckStatus badges. |
| RaccoonBlog.Web/Areas/Admin/Controllers/PostsController.cs | Replace Akismet calls with SpamCheckStatus updates in admin moderation. |

Comments suppressed due to low confidence (1)

RaccoonBlog.Web/Web.config:1

  • Web.config, Web.Debug.config, and Web.Release.config are being deleted as part of this PR, but the PR description focuses on SEO/spam/social media work and contains no rationale for removing the IIS/ASP.NET configuration files. These files configure the Raven connection string for production, mail/SMTP settings, MIME types, assembly binding redirects, dotless handler, machine key, etc. Removing them outright will break Release deployments (no Raven/Urls, no MIME maps, no binding redirects). If the project has migrated entirely off System.Web/IIS classic config to appsettings.json this should be called out explicitly in the PR description; otherwise these deletions look unrelated to the stated scope and likely unintentional.

Comment on lines 44 to +51
    .ForMember(x => x.TagsAsSlugs, o => o.Ignore())
    .ForMember(x => x.Tags, o => o.MapFrom(m => TagsResolver.ResolveTagsInput(m.Tags)))
    .ForMember(x => x.Seo, o => o.MapFrom(m => new SeoMetadata
    {
        MetaDescription = m.SeoMetaDescription,
        Keywords = TagsResolver.ResolveTagsInput(m.SeoKeywords),
        LastAnalyzedAt = m.SeoLastAnalyzedAt
    }))

Comment on lines +109 to +120
    var dig = load(digestId) || {
        SpamComments: [],
        Count: 0,
        '@metadata': {
            '@refresh': tomorrow.toISOString(),
            '@collection': 'EmailCommands'
        }
    };
    dig.Count++;
    dig.SpamComments.push(spamEntry);

    put(digestId, dig);

Comment on lines +132 to +155
    var post = load(this.Post.Id);
    var postTitle = post ? post.Title : '';

    var emailCmd = {
        Type: 'NewComment',
        View: 'NewComment',
        ReplyTo: $input.Email || '',
        Subject: 'Comment on: ' + postTitle + ' from ' + $input.Author,
        CommentId: $input.Id,
        Author: $input.Author || '',
        CommentBody: $input.Body || '',
        CommentEmail: $input.Email || '',
        CommentUrl: $input.Url || '',
        CreatedAt: new Date().toISOString(),
        IpAddress: $input.UserHostAddress || '',
        UserAgent: $input.UserAgent || '',
        CommenterId: $input.CommenterId || '',
        PostId: this.Post.Id || '',
        PostTitle: postTitle,
        PostSlug: post ? post.Slug : '',
        Key: post.ShowPostEvenIfPrivate,
        '@metadata': { '@collection': 'EmailCommands' }
    };
    put('EmailCommands/new-comment-' + $input.Id, emailCmd);
    };

    comments.Comments.Add(comment);
    post.CommentsCount = comments.Comments.Count;

    // Create or update commenter
    var newCommenter = commenter ?? new Commenter { Key = Guid.Parse(input.CommenterKey) };
    newCommenter.IsTrustedCommenter = commenter?.IsTrustedCommenter;

Comment on lines 66 to 67
    <div class="editor-label">
        @Html.LabelFor(model => model.AkismetKey)
    </div>
    <div class="editor-field">
        @Html.TextBoxFor(model => model.AkismetKey)
        @Html.ValidationMessageFor(model => model.AkismetKey)
    </div>

    <div class="editor-label">

Comment on lines +164 to +167
    catch
    {
        store.Maintenance.Send(new UpdateGenAiOperation(config.TaskId, config));
        _log.Info("GenAI spam filter task updated.");

Comment on lines +149 to +160
        store.Subscriptions.GetSubscriptionState(SubscriptionName);
    }
    catch (SubscriptionDoesNotExistException)
    {
        store.Subscriptions.Create(new SubscriptionCreationOptions
        {
            Name = SubscriptionName,
            Query = "from Posts where Social.GeneratedAt != null and (Social.DisableAutoPublish == false or Social.DisableAutoPublish == null) and not exists(@metadata.@refresh)",
            ChangeVector = "LastDocument"
        });
        _log.Info("Created data subscription '{Name}'.", SubscriptionName);
    }

Comment on lines +101 to +129
    private static string BuildEmailBody(SendEmailCommand cmd, string blogName)
    {
        var template = cmd.Type switch
        {
            "NewComment" => NewCommentTemplate,
            "SpamDigest" => SpamDigestTemplate,
            _ => "{{ subject }}"
        };

        var scribanTemplate = Scriban.Template.Parse(template);
        return scribanTemplate.Render(new
        {
            cmd.Subject,
            cmd.Author,
            cmd.CommentBody,
            cmd.CommentEmail,
            cmd.CommentUrl,
            cmd.IpAddress,
            cmd.UserAgent,
            cmd.PostTitle,
            cmd.PostId,
            cmd.PostSlug,
            cmd.Key,
            cmd.DigestDate,
            cmd.SpamComments,
            blog_name = blogName,
            comment_count = cmd.SpamComments?.Count ?? 0
        });
    }

Comment on lines +74 to +91
    using (var client = new SmtpClient())
    {
        var message = new MailMessage
        {
            IsBodyHtml = true,
            Body = BuildEmailBody(cmd, blogName),
            Subject = cmd.Subject
        };

        if (!string.IsNullOrEmpty(cmd.ReplyTo))
        {
            try { message.ReplyToList.Add(new MailAddress(cmd.ReplyTo)); }
            catch { }
        }

        message.To.Add(sendTo);
        client.Send(message);
    }