AI Moderation

Overview

AI Moderation helps maintain a safe and respectful chat environment by automatically reviewing messages for potential policy violations. It allows you, as the Publisher (the owner of the website where Now4real chat is used), to define a custom moderation policy that aligns with your community’s guidelines.

AI Moderation uses artificial intelligence to analyze messages in real time, before they are published.

How it works

  • Policy definition – You set a moderation policy that aligns with your community’s rules. This policy acts as the foundation for AI-based message filtering.
  • AI enforcement – The AI system automatically reviews messages against your policy and blocks content that violates the defined rules.
  • User experience – If a message is blocked, the user receives an error message explaining that their message appears inappropriate.
  • Customization & control – You can adjust your policy at any time to fine-tune moderation rules based on your community’s needs.

You can choose to have AI moderation apply only to messages reported by users, rather than reviewing every message. If a reported message is found to violate the moderation policy, it will be automatically deleted.

NOTE: As the Publisher, you are responsible for ensuring that your policy complies with all relevant local regulations.

AI Moderation does not apply to messages posted by Moderators or Guests.
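
The flow described in this section can be summarized in a short sketch. It is illustrative only and is not Now4real's implementation: the names `Mode`, `Message`, `on_submit`, `on_report`, the role labels, and the `violates_policy` callable (a stand-in for the AI check) are all hypothetical. It also assumes that the exemption for Moderators and Guests covers reported messages too, which the text above does not state explicitly.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable

# Stand-in for the AI check: takes (policy, message_text), returns True on a violation.
PolicyCheck = Callable[[str, str], bool]


class Mode(Enum):
    ALL_MESSAGES = "all"        # review every message before it is published
    REPORTED_ONLY = "reported"  # review only messages reported by users


@dataclass
class Message:
    text: str
    author_role: str  # hypothetical labels: "user", "moderator", "guest"


# Roles whose messages are not reviewed, per the statement above.
EXEMPT_ROLES = {"moderator", "guest"}


def on_submit(msg: Message, policy: str, mode: Mode, violates_policy: PolicyCheck) -> str:
    """Decide what happens when a message is posted: "publish" or "block"."""
    if msg.author_role in EXEMPT_ROLES or mode is Mode.REPORTED_ONLY:
        return "publish"
    # A blocked message is not published; the user sees an error instead.
    return "block" if violates_policy(policy, msg.text) else "publish"


def on_report(msg: Message, policy: str, violates_policy: PolicyCheck) -> str:
    """Decide what happens when a user reports a message in reported-only mode."""
    if msg.author_role in EXEMPT_ROLES:
        return "keep"  # assumption: exempt roles are never reviewed
    return "delete" if violates_policy(policy, msg.text) else "keep"


if __name__ == "__main__":
    # Toy check used only to exercise the flow; the real check is an AI model.
    toy_check = lambda policy, text: "badword" in text.lower()
    print(on_submit(Message("hello", "user"), "my policy", Mode.ALL_MESSAGES, toy_check))
    print(on_report(Message("badword here", "user"), "my policy", toy_check))
```

Keeping the two entry points separate mirrors the two modes: in the first, the decision is made before publication; in the second, it is made only after a user report.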


Policy definition

What is a moderation policy?

A moderation policy defines what types of messages should be allowed or blocked in your chat. The AI Moderation system follows the rules in this policy to automatically review messages.

You can write your policy in any natural language, making it easy to read and understand. A clear, structured policy helps ensure a respectful, inclusive chat experience.

  • You can start with the default policy, which is displayed in the editable text area. It provides general rules for maintaining a positive environment and may be sufficient in most cases. To the right, a chat box lets you test the policy.
  • If needed, you can customize the default policy to better fit your needs. You can revert to the default policy at any time by pressing the reset button.
  • You can also create a completely new policy from scratch, defining specific rules tailored to your audience and content.

Customizing or writing a policy from scratch

When defining a policy, consider:

  • Topics: Do you want to restrict discussions to specific subjects? You may want to specify what type of topics your site or blog is about, ensuring that conversations align with your content focus.
  • Language: Should only certain languages be allowed?
  • Content restrictions: Do you want to block spoilers, personal data, or off-topic discussions?
  • Tone & behavior: Should insults, profanity, or aggressive language be allowed?

A well-structured policy should:

  1. Clearly state what is allowed and disallowed.
  2. Use specific examples where necessary.
  3. Be easy for AI to interpret and enforce.

Example policies

Below are a few example policies that demonstrate different levels of customization. Some are deliberately extreme, to illustrate how flexible custom policies can be.

Default policy

The default policy acts as a foundation for moderation and is displayed in the editable text area. It establishes general rules to foster a positive and respectful chat environment. In many cases, this policy will be sufficient, but you have the flexibility to adjust it to better suit your community.

If you choose to modify the policy but later wish to revert to the original settings, you can easily do so by pressing the restore button.

The default policy is similar to the following:

Ensure a positive and supportive environment.

Disallowed:
- Offensive Language: Swear words, profanities, insults, or offensive expressions, including those that appear only when consecutive messages or lines are concatenated.
- Inappropriate Content: Obscene or explicit material unsuitable for minors.
- Misinformation & Defamation: False, misleading, or defamatory statements.
- Personal Data: Sharing private or sensitive info (e.g., addresses, phone numbers).
- Harassment & Disruption: Trolling, bullying, harassment, or any behavior causing discomfort, distress, or persecution of other users.
- Spam & Promotions: Unsolicited ads, commercial promotions, or scams.
- Hate Speech & Violence: Any content promoting hate, fanaticism, racism, violence, harm, or personal attacks.
- Illegal or Harmful Acts: Encouraging or instructing illegal or dangerous behavior.
- Meaningless Content: Empty, nonsensical, or disruptive messages.

Allowed:
- Slang & Informal Language: Youthful expressions, abbreviations, and colloquial speech if non-offensive.
- Polite Debate: Respectful, reasoned discussion even with differing views.
- Irony & Satire: Humor and satire that remain within acceptable bounds.

Accept only messages in French

This basic policy ensures that only messages written in French are allowed in the chat:

Allow messages only if they are written in French.
Block messages written in any other language.

Accept only messages containing “blue”

This policy restricts chat messages to those that mention the word “blue.” While it may not be practical, it demonstrates the powerful customization options available for moderation:

Accept messages only if they contain the word "blue".
Block all other messages.
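
Because this rule is mechanical, you can write down the outcome you expect for each test message before trying it in the test chat box. The sketch below is a plain reference check, not the AI's logic. The function name `expected_outcome` is hypothetical, and it assumes that "the word blue" means a whole word in any letter case, an interpretation the policy text does not spell out.

```python
import re

# Whole-word, case-insensitive match (an assumed reading of the policy).
BLUE_WORD = re.compile(r"\bblue\b", re.IGNORECASE)


def expected_outcome(message: str) -> str:
    """Return "accept" if the message contains the word "blue", else "block"."""
    return "accept" if BLUE_WORD.search(message) else "block"


if __name__ == "__main__":
    for text in ["The sky is blue.", "BLUE!", "I saw a bluebird", "Red is nicer"]:
        print(f"{text!r}: {expected_outcome(text)}")
```

Under this reading, "I saw a bluebird" is blocked. If you want such messages accepted, it is safer to say so in the policy than to rely on how the AI interprets "contain".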

Cinema discussion with spoiler protection

This policy allows only cinema-related discussions while blocking off-topic content and spoilers:

This policy allows only cinema-related discussions while blocking off-topic content and spoilers.

1. Topic restriction:
 - Accept messages related to:
   - Movies (reviews, recommendations, actors, directors)
   - TV series and streaming content related to cinema
   - Movie industry news and awards
   - Filmmaking aspects (cinematography, effects, soundtracks)
   - Cultural discussions about cinema
 - Block off-topic messages, including:
   - General life topics (weather, food, personal stories)
   - Non-cinema entertainment (video games, unrelated music, books)
   - Non-cinema tech support

2. Reply rules:
 - Accept replies that continue a cinema-related discussion, even if they don’t explicitly mention cinema.
 - Example:
   - Allowed: "What did you think of Dune?" → "The visuals were stunning!"
   - Not Allowed: "What did you think of Dune?" → "Did you see the football game?"

3. No spoilers policy:
 - Block messages that reveal critical plot points, unless marked as spoilers.
 - Example (Blocked): "Bruce Willis is dead in The Sixth Sense!"
 - Allowed spoiler format:
      [SPOILER AHEAD]
      ...

Allow everything (not recommended)

This policy allows insults, profanity, and toxic behavior. It is an example of what not to do if you want to maintain a safe chat environment:

Allow all messages, including profanity, insults, and aggressive language, creating a toxic chat environment. No restrictions on content.

💡 Note: A policy like this is strongly discouraged, as it can create a negative experience for users.
 


AI accuracy & limitations

AI Moderation is powered by a large language model (LLM), which analyzes messages based on linguistic patterns and contextual cues learned from large datasets. It is generally effective at filtering inappropriate content, but it is not flawless and may sometimes misinterpret the intent of a message: legitimate messages may be mistakenly blocked, and inappropriate messages may occasionally slip through. Regularly reviewing and refining your policy can help improve accuracy and reduce unintended blocks.
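
One way to make that review concrete is to keep a small set of test messages labeled with the outcome you expect, record what the moderation actually did (for example, by trying them in the test chat box), and count the two kinds of error. The sketch below is a hypothetical helper: `error_rates` and the sample data are illustrative and not part of Now4real.

```python
from typing import Iterable, Tuple


def error_rates(results: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (false_block_rate, miss_rate).

    Each item is (should_block, was_blocked) for one test message.
    false_block_rate: share of legitimate messages that were blocked.
    miss_rate: share of violating messages that were not blocked.
    """
    false_blocks = misses = legit = violating = 0
    for should_block, was_blocked in results:
        if should_block:
            violating += 1
            if not was_blocked:
                misses += 1
        else:
            legit += 1
            if was_blocked:
                false_blocks += 1
    false_block_rate = false_blocks / legit if legit else 0.0
    miss_rate = misses / violating if violating else 0.0
    return false_block_rate, miss_rate


# Made-up sample data: (should_block, was_blocked)
sample = [
    (False, False),
    (False, True),   # legitimate message mistakenly blocked
    (False, False),
    (False, False),
    (True, True),
    (True, True),
    (True, True),
    (True, False),   # violating message that slipped through
]

if __name__ == "__main__":
    print(error_rates(sample))  # (0.25, 0.25)
```

Re-running the same labeled set after each policy change shows whether an edit reduced one kind of error at the cost of the other.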

Conclusion

AI Moderation offers powerful tools for maintaining a safe and engaging chat experience. By carefully defining your moderation policy, you can tailor Now4real chat to fit your community’s values while ensuring compliance with platform guidelines and local regulations.

If you need further assistance, reach out for support.


© 2025 Now4real Srl. All rights reserved. P.IVA IT10328990964. Illustrations by Freepik Storyset