What is Slop Score?
Slop Score is a leaderboard and analysis tool that measures how strongly a given text resembles AI "slop". It looks specifically at words and patterns that occur more frequently in AI-generated text than in human writing.
Not an AI Detector!
This tool works differently from most AI detectors: it looks for glaringly over-used patterns rather than trying to classify a text as AI or human. It will flag words and phrases that smell of AI, but it won't reliably help you avoid AI detection. The best way to avoid AI detectors is to write in your own words.
Tips for the Analysis Tool
- Analysis works best on several pieces of writing on different topics. The more the better! It will still work on a single story or essay, but the results will be skewed by the topic.
- The analysis is optimised for creative writing and essays. It may work for other domains, but will likely under-report slop.
Slop Score Calculation
The Slop Score is a weighted composite of three components, each measuring a pattern typical of AI-generated text:
- 45% - Slop Words: Frequency of individual words that appear unnaturally often in LLM outputs
- 40% - Not-x-but-y Patterns: Frequency of contrast patterns like "not just X, but Y" which are overused by AI
- 15% - Slop Trigrams: Frequency of 3-word phrases that appear unnaturally often in LLM outputs
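As a rough illustration of how the weights above might combine, here is a minimal sketch. Only the three weights come from the list above. The regular expression is a hypothetical, simplified stand-in for "not just X, but Y" detection, and the assumption that each component is already normalised to a 0-100 scale before weighting is ours; the tool's real patterns and normalisation are not described here.

```python
import re

# Weights taken from the list above.
WEIGHTS = {"slop_words": 0.45, "not_x_but_y": 0.40, "slop_trigrams": 0.15}

# Hypothetical pattern: one simple form of the "not just X, but Y" contrast,
# kept within a single sentence. The real tool may use different patterns.
CONTRAST_RE = re.compile(
    r"\bnot (?:just|only|merely)\b[^.!?]{1,80}?\bbut\b", re.IGNORECASE
)


def contrast_rate(text):
    """Contrast-pattern matches per 1,000 words (an assumed unit)."""
    n_words = len(text.split())
    if n_words == 0:
        return 0.0
    return 1000.0 * len(CONTRAST_RE.findall(text)) / n_words


def slop_score(components):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

With these weights, a text that maxes out only the slop-word component would score 45, and one that maxes out all three would score 100.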
Slop Lists
The slop word and trigram lists are produced using the slop-forensics toolkit, which identifies words and n-grams that are statistically over-represented in LLM outputs compared to human writing.
The lists were computed by analyzing outputs from 10 different language models on a selection of essay and creative writing prompts, then comparing them against human-authored text.
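The idea of "statistically over-represented" can be sketched as a rate ratio between two corpora. The example below is not the slop-forensics implementation: the function names, the crude add-one smoothing, and the `min_ratio` threshold are all assumptions chosen for illustration.

```python
from collections import Counter
import re


def trigrams(text):
    """Lowercased word trigrams from a text."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]


def overrepresented(llm_texts, human_texts, min_ratio=2.0, smoothing=1.0):
    """Return (trigram, ratio) pairs whose LLM rate exceeds the human rate.

    The ratio is (smoothed LLM rate) / (smoothed human rate). The smoothing
    keeps trigrams unseen in the human corpus from dividing by zero.
    """
    llm = Counter(t for text in llm_texts for t in trigrams(text))
    human = Counter(t for text in human_texts for t in trigrams(text))
    llm_total = sum(llm.values()) or 1
    human_total = sum(human.values()) or 1

    results = []
    for gram, count in llm.items():
        llm_rate = (count + smoothing) / llm_total
        human_rate = (human[gram] + smoothing) / human_total
        ratio = llm_rate / human_rate
        if ratio >= min_ratio:
            results.append((gram, ratio))
    # Highest ratio first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

On real corpora a minimum count threshold would also be sensible, since rare trigrams produce noisy ratios.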
View the lists: Slop Words | Slop Trigrams
Over-represented Words & Trigrams
These sections show words and phrases that occur more frequently in the analyzed text than in typical human writing, based on:
- Words: Compared against the wordfreq baseline (a large corpus of human language)
- Trigrams: Compared against a human baseline corpus of essays and creative writing
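For the word comparison, a minimal sketch might look like the following. The `BASELINE_FREQ` values are invented placeholders, not real wordfreq data, and `FLOOR` is an assumed fallback for words missing from the baseline. In practice the baseline frequencies could be looked up with the wordfreq library's `word_frequency(word, "en")` instead of a hard-coded dictionary.

```python
from collections import Counter
import re

# Placeholder baseline: relative frequency of each word in human text.
# These numbers are invented for illustration only.
BASELINE_FREQ = {"the": 0.05, "a": 0.02, "tapestry": 0.000002, "walked": 0.0001}
FLOOR = 1e-8  # assumed frequency for words missing from the baseline


def overused_words(text, baseline=BASELINE_FREQ, top_n=5):
    """Rank words by (frequency in text) / (frequency in baseline)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    scored = [
        (word, (count / total) / max(baseline.get(word, FLOOR), FLOOR))
        for word, count in counts.items()
    ]
    # Most over-represented first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

This also shows why a single story skews results: any topic-specific word that is rare in the baseline will rank highly even if it is not slop.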
Additional Metrics
The tool also reports several writing-style metrics: lexical diversity (MATTR-500, the moving-average type-token ratio over a 500-word window), vocabulary level (Flesch-Kincaid grade level), sentence and paragraph length, and dialogue frequency. Together these give a broader view of the text's characteristics.
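Two of these metrics have standard definitions that are easy to sketch. MATTR averages the type-token ratio (unique words divided by total words) over every sliding window of a fixed size, and the Flesch-Kincaid grade level is a fixed formula over word, sentence, and syllable counts. The code below is an illustrative version, not the tool's own; the handling of texts shorter than the window is an assumption, and syllable counting is left to the caller.

```python
def mattr(tokens, window=500):
    """Moving-average type-token ratio over a sliding window of tokens."""
    if not tokens:
        return 0.0
    # Assumption: for short texts, fall back to the plain type-token ratio.
    if len(tokens) <= window:
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def flesch_kincaid_grade(n_words, n_sentences, n_syllables):
    """Standard Flesch-Kincaid grade level formula from raw counts."""
    return 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59
```

The fixed window is what makes MATTR comparable across texts of different lengths, which a plain type-token ratio is not.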
Leaderboard
The leaderboard shows how different language models score on the Slop metric, based on their outputs from a standardized set of prompts. Lower scores indicate more human-like writing patterns.