KW Clusterized v1.0

Client-side keyword clustering tool using semantic similarity

Overview

KW Clusterized is a frontend-first keyword clustering application built to turn raw keyword lists into clean topical groups in seconds. It uses Jaccard similarity, word overlap analysis, and greedy agglomerative clustering to help SEOs, content strategists, and growth teams organize search intent without sending data to a server.

Features

Flexible Input — Paste comma-separated or newline-delimited keywords, or upload CSV, TXT, or TSV files

Batch Processing — Handles large keyword sets in a single pass, deduplicates entries, and groups them into reviewable clusters

Semantic Clustering — Groups keywords by word overlap and Jaccard similarity scoring instead of relying on exact-match rules

Agglomerative Grouping Logic — Uses greedy single-linkage clustering to merge related phrases into the most relevant existing cluster

Similarity Threshold Tuning — The core clustering engine supports configurable similarity thresholds, making it easy to adjust grouping strictness in code

Color-Coded Clusters — Visual cluster cards and auto-generated labels make cluster review fast and intuitive

Structured Export Format — Download cluster assignments as CSV with cluster ID, label, and keyword columns for spreadsheets and planning workflows

Client-Side Only — All analysis runs in the browser with zero server round-trips

Instant Results — No API calls, no loading spinners, immediate output
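The input and export features above can be sketched in a few lines of TypeScript. This is a minimal illustration, not the repository's code: the function names (parseKeywords, toCsv), the ClusterRow shape, the case-insensitive deduplication rule, and the CSV header names are all assumptions made for the example.

```typescript
// Hypothetical helpers for illustration; names and rules are assumptions,
// not the actual API of src/lib/clustering.ts.

interface ClusterRow {
  clusterId: number;
  label: string;
  keyword: string;
}

/** Split pasted text on commas, tabs, or line breaks, then trim and deduplicate. */
function parseKeywords(raw: string): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  for (const part of raw.split(/[,\t\r\n]+/)) {
    // Collapse internal runs of whitespace so "a  b" and "a b" match
    const keyword = part.trim().replace(/\s+/g, " ");
    if (keyword === "") continue;
    const key = keyword.toLowerCase(); // assumed: duplicates are case-insensitive
    if (seen.has(key)) continue;
    seen.add(key);
    result.push(keyword);
  }
  return result;
}

/** Quote a CSV field only when it contains a comma, quote, or line break. */
function escapeCsvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

/** Build CSV text with cluster ID, label, and keyword columns. */
function toCsv(rows: ClusterRow[]): string {
  const header = "cluster_id,label,keyword"; // assumed header names
  const lines = rows.map((r) =>
    [String(r.clusterId), escapeCsvField(r.label), escapeCsvField(r.keyword)].join(",")
  );
  return [header, ...lines].join("\n");
}
```

In a browser, the text passed to parseKeywords would typically come from a textarea or from a file read with the FileReader API, and the string returned by toCsv could be offered as a download.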

Algorithm

KW Clusterized uses a lightweight, explainable clustering approach designed for practical keyword grouping rather than opaque black-box scoring.

Normalize and tokenize keywords — Each keyword is lowercased, punctuation is removed, and low-signal stop words are filtered out so the algorithm focuses on meaningful terms.
Calculate Jaccard similarity — For every comparison, the app converts each keyword into a set of significant words and scores overlap with the Jaccard similarity coefficient:

J(A, B) = |A ∩ B| / |A ∪ B|

A score closer to 1 means two keywords share more meaningful vocabulary; a score closer to 0 means they are topically farther apart.
Apply semantic overlap bias — When two keywords share at least one meaningful term of four or more characters, the score receives a small boost. This improves practical grouping for phrases like content marketing strategy and content creation tips, where topical overlap matters more than raw token count alone.
Cluster with greedy agglomerative logic — Keywords are processed from longer phrases to shorter phrases. For each keyword, the algorithm measures similarity against existing clusters using single-linkage logic, meaning it compares against the most similar keyword already inside that cluster.
Respect a similarity threshold — If the best matching cluster meets the similarity threshold, the keyword joins that cluster. Otherwise, it seeds a new cluster. The result is a fast, deterministic form of agglomerative clustering that works well for SEO and content-planning workflows.
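The five steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in src/lib/clustering.ts: the stop-word list, the boost size (0.1), the default threshold, the use of word count to decide which phrase is "longer", and every function name are hypothetical.

```typescript
// Illustrative sketch only. STOP_WORDS, BOOST, and the default threshold
// are assumed values, not taken from the repository.
const STOP_WORDS = new Set(["a", "an", "the", "for", "to", "of", "in", "on", "and", "or", "with"]);
const BOOST = 0.1;

/** Step 1: lowercase, strip punctuation, drop stop words. */
function tokenize(keyword: string): Set<string> {
  const words = keyword
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")
    .split(/\s+/)
    .filter((w) => w.length > 0 && !STOP_WORDS.has(w));
  return new Set(words);
}

/** Step 2: J(A, B) = |A ∩ B| / |A ∪ B|. */
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  const shared = Array.from(a).filter((w) => b.has(w)).length;
  return shared / (a.size + b.size - shared);
}

/** Step 3: add a small boost when a shared term has four or more characters. */
function similarity(a: Set<string>, b: Set<string>): number {
  const hasLongShared = Array.from(a).some((w) => w.length >= 4 && b.has(w));
  return Math.min(1, jaccard(a, b) + (hasLongShared ? BOOST : 0));
}

interface Cluster {
  keywords: string[];
  tokenSets: Set<string>[];
}

function wordCount(keyword: string): number {
  return keyword.trim().split(/\s+/).length;
}

/** Steps 4 and 5: greedy single-linkage assignment against a threshold. */
function clusterKeywords(keywords: string[], threshold = 0.3): string[][] {
  // Longer phrases first (by word count, an assumption); ties sorted
  // alphabetically so the output is deterministic.
  const ordered = keywords
    .slice()
    .sort((x, y) => wordCount(y) - wordCount(x) || (x < y ? -1 : x > y ? 1 : 0));
  const clusters: Cluster[] = [];
  for (const keyword of ordered) {
    const tokens = tokenize(keyword);
    let best: Cluster | null = null;
    let bestScore = 0;
    for (const cluster of clusters) {
      // Single linkage: a cluster scores as well as its most similar member
      const score = Math.max(...cluster.tokenSets.map((t) => similarity(tokens, t)));
      if (score > bestScore) {
        bestScore = score;
        best = cluster;
      }
    }
    if (best !== null && bestScore >= threshold) {
      best.keywords.push(keyword);
      best.tokenSets.push(tokens);
    } else {
      clusters.push({ keywords: [keyword], tokenSets: [tokens] }); // seed a new cluster
    }
  }
  return clusters.map((c) => c.keywords);
}

const sample = [
  "content marketing strategy",
  "best running shoes",
  "content creation tips",
  "running shoes for women",
];
console.log(clusterKeywords(sample, 0.25));
// [["running shoes for women", "best running shoes"],
//  ["content creation tips", "content marketing strategy"]]
console.log(clusterKeywords(sample, 0.5).length); // 3
```

In this sketch, content marketing strategy and content creation tips share one of five distinct words, so their raw Jaccard score is 0.2. The assumed boost lifts it to roughly 0.3, which clears a threshold of 0.25 but not 0.5. That is why the stricter run splits them into separate clusters.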

Why KW Clusterized?

Most keyword clustering workflows still revolve around Python notebooks, scripts, or backend-heavy pipelines. KW Clusterized takes a different approach: a browser-native implementation built with Next.js and TypeScript, making it a strong portfolio piece as well as a practical tool.

Frontend-native architecture — No Python runtime, notebook workflow, or server queue required

Private by design — Keywords stay in the browser, which is useful for sensitive client datasets

Modern web deployment — Easy to run locally, share as a live demo, and deploy on Vercel

Accessible to frontend teams — Easier to extend for developers already working in React, Next.js, and TypeScript

Differentiated positioning — In a category dominated by Python implementations, KW Clusterized shows how keyword clustering can feel like a polished product instead of a script

Tech Stack

Layer	Technology
Framework	Next.js 14 (App Router)
Language	TypeScript 5
UI	React 18
Styling	Tailwind CSS 3
Clustering Engine	Custom Jaccard-based keyword clustering
File Handling	Browser FileReader API
Deployment	Vercel

Getting Started

Prerequisites

Node.js 18.17 or later
npm 9 or later

Installation

git clone https://github.com/seankrux/kw-clusterized.git
cd kw-clusterized
npm install
npm run dev

Open http://localhost:3000 in your browser.

Production Build

npm run build
npm run start

Project Structure

src/
  app/
    page.tsx                  # Main page
  components/
    KeywordClusterer.tsx      # Core clustering UI
  lib/
    clustering.ts             # Clustering algorithm

Deployment

vercel deploy

The live application is available at kw-clusterized.vercel.app.

Contributing

Contributions are welcome if they improve clustering quality, usability, documentation, or developer experience.

Fork the repository
Create a feature branch
Make focused, well-documented changes
Run a local sanity check with npm run build
Open a pull request with a clear summary of the improvement

High-value contribution areas include:

Exposing similarity threshold controls in the UI
Adding more export targets or data views
Improving cluster labeling heuristics
Expanding test coverage around clustering edge cases
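As a starting point for the labeling item above, one simple heuristic is to name a cluster after the words that appear in the most of its keywords. The sketch below is hypothetical: labelCluster and its maxWords parameter are invented for illustration and do not describe how the app currently generates labels.

```typescript
// Hypothetical labeling heuristic; not taken from the repository.
function labelCluster(keywords: string[], maxWords = 2): string {
  const counts = new Map<string, number>();
  for (const keyword of keywords) {
    // Count each word at most once per keyword
    const unique = new Set(keyword.toLowerCase().split(/\s+/).filter((w) => w.length > 0));
    unique.forEach((w) => counts.set(w, (counts.get(w) ?? 0) + 1));
  }
  // Most frequent first; ties sorted alphabetically for stable output
  const ranked = Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0)
  );
  return ranked
    .slice(0, maxWords)
    .map(([word]) => word)
    .join(" ");
}

console.log(labelCluster(["best running shoes", "running shoes for women"]));
// "running shoes"
```

A contribution in this area might filter stop words first or prefer the word order of the longest keyword. Either change is easy to cover with a few edge-case tests.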

Built by Sean G
