Building FormBouncer: A Layered Spam Filter for Contact Forms
The spam started almost immediately after I launched my contact form.
Within hours I had SEO agencies promising page one rankings, dropshippers offering miracle dog harnesses, crypto traders guaranteeing 40 percent daily returns, and at least one distant royal relative who urgently needed my assistance.
I tried the obvious solutions. Some were bloated. Some were heavily cloud-dependent. Some quietly nudged me toward Google reCAPTCHA and the wider Google ecosystem. None of that felt right. I did not want to bolt a third-party surveillance widget onto a simple contact form.
So I built my own.
FormBouncer, internally called FormShield because I had not picked a name when I started, is a simple API. You send it a submission. It gives you back a spam score and a verdict. That is it.
No widgets. No tracking scripts. Just a decision engine.
Why Not Just Use One Big Model?
Because spam is messy.
Relying entirely on machine learning feels clever, but it also introduces latency, operational complexity, and opaque decisions. On the other hand, relying purely on heuristics misses more subtle spam.
So I built it as a pipeline.
Each submission moves through four layers. Each layer adds signal. No single layer is allowed to dominate.
The final score is just the sum of those signals, capped at 1.0. If it crosses the configured threshold, it is spam.
Simple in principle. Surprisingly effective in practice.
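The core loop can be sketched in a few lines of Python. This is an illustration, not FormBouncer's actual code (the real API is Java); the layer interface, where each layer returns a signal and a list of reason codes, and the default threshold are assumptions.

```python
def score_submission(submission, layers, threshold=0.7):
    """Run each layer in order, sum the signals, cap at 1.0, compare to threshold."""
    score, reasons = 0.0, []
    for layer in layers:
        signal, layer_reasons = layer(submission)
        score += signal
        reasons.extend(layer_reasons)
        if score >= 1.0:
            # A definitive signal (e.g. a tripped honeypot) short-circuits
            score = 1.0
            break
    return {"score": score, "spam": score >= threshold, "reasons": reasons}
```

Layers stay ignorant of each other; the pipeline only ever sees numbers and reason codes.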
Layer One: The Honeypot
The first layer is almost insultingly simple.
A hidden field in the form. Real users never see it. Bots that blindly fill every field do.
If that field contains anything, the score immediately jumps to 1.0 and the pipeline stops.
It is not sophisticated. It does not need to be. It quietly eliminates a chunk of low effort spam before anything more expensive runs.
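As a sketch, assuming the hidden field is called `website` (the field name is an illustrative choice, not FormBouncer's):

```python
def honeypot_layer(submission):
    # Real users never see this hidden field; bots that auto-fill every
    # input do. Any content in it is treated as definitive spam.
    if submission.get("website", "").strip():
        return 1.0, ["honeypot_filled"]
    return 0.0, []
```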
Layer Two: Timing
Humans take time to type.
Bots do not.
The frontend sends two timestamps, when the form loaded and when it was submitted. Submissions under two seconds get a score bump. Submissions that claim to be sent before the page even loaded are flagged as invalid. Forms submitted more than 24 hours after loading are treated as suspicious, usually the result of automated retries with stale payloads.
It turns out time is a surprisingly strong signal.
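The timing checks above might look like this, assuming the frontend sends epoch-second timestamps; the score bumps of 0.4 and 0.3 are illustrative weights, not values from the article.

```python
def timing_layer(submission):
    elapsed = submission["submitted_at"] - submission["loaded_at"]
    if elapsed < 0:
        # Claims to have been sent before the page loaded: invalid
        return 1.0, ["submitted_before_load"]
    if elapsed < 2:
        # Under two seconds: faster than a human can type
        return 0.4, ["submission_too_fast"]
    if elapsed > 24 * 3600:
        # Stale payload, usually an automated retry
        return 0.3, ["stale_submission"]
    return 0.0, []
```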
Layer Three: Heuristics
This is where most of the practical filtering happens.
Instead of one brittle rule, there are multiple small signals:
- Link density
- Known spam phrase matches
- Disposable email domains
- Excessive uppercase usage
- Repeated submissions from the same IP
Each heuristic contributes a capped amount to the overall score. No single rule can push a submission over 0.5 by itself. The idea is accumulation, not domination.
A message with one link might be fine. A message with three links, spammy keywords, and an IP that has posted 15 times in an hour probably is not.
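A few of these heuristics are sketched below. The phrase list, domain list, and per-rule weights are illustrative; the article only specifies that no single rule may contribute more than 0.5. Repeat-IP tracking is omitted here because it needs shared state (FormBouncer uses Redis for that).

```python
import re

SPAM_PHRASES = ["page one rankings", "guaranteed returns"]  # illustrative
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com"}  # illustrative

def heuristics_layer(submission):
    msg = submission.get("message", "")
    email = submission.get("email", "")
    score, reasons = 0.0, []

    # Link density: each link adds a little, capped at 0.5
    links = len(re.findall(r"https?://", msg))
    if links:
        score += min(0.15 * links, 0.5)
        reasons.append("excessive_links" if links >= 3 else "contains_links")

    # Known spam phrases, capped at 0.5 total
    hits = [p for p in SPAM_PHRASES if p in msg.lower()]
    if hits:
        score += min(0.2 * len(hits), 0.5)
        reasons += [f"keyword_match:{p.replace(' ', '_')}" for p in hits]

    # Disposable email domain
    if email.rsplit("@", 1)[-1].lower() in DISPOSABLE_DOMAINS:
        score += 0.3
        reasons.append("disposable_email")

    # Excessive uppercase usage
    letters = [c for c in msg if c.isalpha()]
    if letters and sum(c.isupper() for c in letters) / len(letters) > 0.6:
        score += 0.2
        reasons.append("excessive_uppercase")

    return score, reasons
```

Each rule is capped, so the verdict comes from accumulation across rules, never from one rule alone.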
Layer Four: Machine Learning
For Pro+ users, a fourth layer runs.
A Multinomial Naive Bayes classifier trained on TF-IDF features evaluates the text content and returns a probability between 0 and 1.
The ML component lives in a separate Flask microservice written in Python. The core API is Spring Boot on Java 21. They talk over HTTP.
Keeping them separate was deliberate. It means I can retrain and redeploy the model independently. If the ML service goes down, the API does not. It simply falls back to the heuristic layers.
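The fallback behaviour might look like this sketch in Python (the real caller is the Java API; the endpoint path, payload shape, and response field name are assumptions):

```python
import json
import urllib.request

def ml_layer(submission, url="http://ml-service:5000/score", timeout=1.5):
    """Call the ML microservice; contribute nothing if it is unreachable."""
    payload = json.dumps({"text": submission.get("message", "")}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            prob = float(json.load(resp)["spam_probability"])
        return prob, [f"ml_score:{prob:.2f}"]
    except OSError:
        # Service down or timed out: the heuristic layers still stand
        return 0.0, ["ml_unavailable"]
```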
The expensive part is optional. The fundamentals still stand on their own.
Training the Model Properly
One of the harder problems was data.
Most public spam datasets are email corpora. Contact form spam is different. Shorter. Less structured. Often more blunt.
The training data pulls from several sources:
- SpamAssassin
- Enron-Spam
- Ling-Spam
- Real contact form submissions from my own site
The last source is the most valuable. Domain specific data matters more than academic purity.
The preprocessing pipeline deduplicates by content hash, applies a stratified train/test split to preserve class balance, and fits a TF-IDF vectorizer capped at 30,000 features with bigrams enabled. The model uses a smoothing alpha of 0.1. Both the vectorizer and classifier are serialised and loaded at service startup.
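Assuming scikit-learn (the article does not name the library) and labels of 1 for spam, that pipeline might look like:

```python
import hashlib
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

def train(texts, labels, vec_path="vectorizer.joblib", model_path="model.joblib"):
    # Deduplicate by content hash
    seen, dd_texts, dd_labels = set(), [], []
    for text, label in zip(texts, labels):
        h = hashlib.sha256(text.encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            dd_texts.append(text)
            dd_labels.append(label)

    # Stratified split preserves the spam/ham ratio in both partitions
    X_train, X_test, y_train, y_test = train_test_split(
        dd_texts, dd_labels, test_size=0.2, stratify=dd_labels, random_state=42)

    # TF-IDF capped at 30,000 features, unigrams + bigrams
    vectorizer = TfidfVectorizer(max_features=30_000, ngram_range=(1, 2))
    clf = MultinomialNB(alpha=0.1)
    clf.fit(vectorizer.fit_transform(X_train), y_train)

    # Serialise both artifacts so the Flask service can load them at startup
    joblib.dump(vectorizer, vec_path)
    joblib.dump(clf, model_path)
    return clf, vectorizer, X_test, y_test
```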
Nothing exotic. Just solid fundamentals.
Architecture Decisions I Am Glad I Made
The core API runs on Spring Boot with MySQL for persistence, Flyway for migrations, and Redis for rate limiting. Everything is containerised with Docker Compose.
Two authentication paths exist:
- JWT for dashboard users
- API keys for form submissions
Plans are stored in the database rather than hardcoded, which means features like ML scoring, threshold configuration, webhooks, and usage limits can be adjusted without redeploying.
But the decision I am most satisfied with is about data retention.
FormBouncer does not store PII.
It logs the score, verdict, threshold used, reason codes, IP, and user agent. It does not store the message content, the email address, or the sender’s name.
If a decision needs to be audited, the reason codes tell the story:
- excessive_links
- submission_too_fast
- keyword_match:bitcoin_profit
That is enough to understand why something was flagged without becoming a data warehouse for other people’s messages.
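For context, a stored decision record might look like this (field names are assumptions; the point is what is absent):

```json
{
  "score": 0.82,
  "verdict": "spam",
  "threshold": 0.7,
  "reasons": ["excessive_links", "submission_too_fast"],
  "ip": "203.0.113.7",
  "user_agent": "Mozilla/5.0"
}
```

No message body, no email address, no name.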
What Started as Frustration
This project started because I was annoyed.
I did not want my inbox polluted. I did not want to bolt someone else’s JavaScript onto my site. I did not want to pay per thousand requests just to block obvious garbage.
Now it has evolved into something that feels like a real product.
There is more to build. Webhooks. False positive reporting. Cleaner data ingestion for retraining. Better feedback loops.
But the core idea is already there.
Spam is not one problem. It is layers of weak signals.
Treat it that way and the system becomes both simpler and more robust.