diff --git a/README.md b/README.md
index e215bc4..776021c 100644
--- a/README.md
+++ b/README.md
@@ -1,36 +1,80 @@
-This is a [Next.js](https://nextjs.org) project bootstrapped with [`create-next-app`](https://nextjs.org/docs/app/api-reference/cli/create-next-app).
+# followprint

-## Getting Started
+> Drop in a single Instagram data export ZIP and followprint analyzes your follow relationships and
+> activity patterns on the spot. **All processing happens inside the browser: no server, no upload, no login.**

-First, run the development server:
+## What it shows
+
+| Area | Contents |
+| --- | --- |
+| Relationship analysis | Mutual follows (`mutual`) / accounts only you follow (`nonMutual`) / accounts that only follow you (`fansOnly`) / pending / recently unfollowed / close friends / blocked / restricted |
+| Character card | 6 character types (Influencer / Butterfly / Observer / Selective / Explorer / Minimalist) + 4 scores (Social / Loyalty / Curiosity / Selectivity) + active hours + monthly follow rate |
+| Insights | Top 20 most-liked accounts, Top 20 saved posts, profile search / word search history, 24-hour login distribution, chat partners |
+
+## How to get your data
+
+1. In the Instagram app, go to **Settings → Your information and permissions → Download your information**
+2. Format: **JSON** (HTML is also supported)
+3. Data types: everything, or only `followers_and_following + activity`
+4. Drag the downloaded ZIP file onto the followprint page
+
+## Privacy
+
+- Every file in the ZIP is **extracted and parsed directly in the browser with `JSZip`**
+- Network requests other than fonts and static assets: **zero**
+- Every HTML parsing step passes through an explicit `DOMPurify` allowlist (a, div, span, p, td, tr, table, ...) before the markup reaches DOMParser
+- Refreshing the page clears the data from memory
+
+## Tech stack
+
+- **Next.js 16** (App Router, `output: "export"`, i.e. a static site)
+- **React 19** + TypeScript strict
+- **Tailwind v4**
+- **JSZip** + **DOMPurify** + **vitest** + **jsdom**
+- **i18n**: Korean / English toggle; parses both the KO and EN date formats used in Instagram exports
+
+## Development

 ```bash
-npm run dev
-# or
-yarn dev
-# or
-pnpm dev
-# or
-bun dev
+npm install
+npm run dev     # dev server
+npm run build   # static site build (output lands in out/)
+npm test        # vitest run
+npm run lint    # eslint
 ```

-Open [http://localhost:3000](http://localhost:3000) with your browser to see the result.
+## Tests
+
+The vitest cases live in `src/lib/__tests__/`:

-You can start editing the page by modifying `app/page.tsx`. The page auto-updates as you edit the file.
+- `parser.test.ts`: both JSON and HTML formats + mutual / nonMutual / fansOnly computation + INVALID_ZIP / UNSUPPORTED_FORMAT / malformed entries + 7-way classification (pending / unfollowed / closeFriends / blocked / restricted)
+- `parse-utils.test.ts`: KO / EN dates (AM / PM / 12 o'clock boundaries) + DOMPurify XSS regressions (script / onclick stripping)
+- `character.test.ts`: classification into the 6 character types + scores within the 0~100 range + highlight matching + empty input / tie cases
+- `insights-parser.test.ts`: regression guards for likedPosts / savedPosts / profileSearches / wordSearches / loginActivity / chatList

-This project uses [`next/font`](https://nextjs.org/docs/app/building-your-application/optimizing/fonts) to automatically optimize and load [Geist](https://vercel.com/font), a new font family for Vercel.
+CI (`.github/workflows/ci.yml`) runs them automatically on every push / PR.

-## Learn More
+## Character classification criteria

-To learn more about Next.js, take a look at the following resources:
+| Type | Condition |
+| --- | --- |
+| **Influencer** | followers / following ratio > 3 AND followers > 500 |
+| **Selective** | following < 200 AND mutual / following > 0.6 |
+| **Explorer** | pending / (pending + following) > 0.1 |
+| **Butterfly** | following > 300 AND mutual / following > 0.5 (or the default when mutualRate > 0.5) |
+| **Observer** | following > 300 AND mutual / following < 0.3 (or the default) |
+| **Minimalist** | following < 100 AND followers < 100 |

-- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js features and API.
-- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.
+Defined in `src/lib/character.ts`.

-You can check out [the Next.js GitHub repository](https://github.com/vercel/next.js) - your feedback and contributions are welcome!
+## Handling Instagram format changes

-## Deploy on Vercel
+Instagram occasionally changes the export directory structure and HTML class names. When a regression
+occurs, `src/lib/__tests__/parser.test.ts` and `insights-parser.test.ts` break first,
+and if `validateInstagramZip` in `parser.ts` does not accept the new path pattern,
+users see `INVALID_ZIP` or `EMPTY_DATA`. If either of these
+fails, suspect a change in the IG export format.
-The easiest way to deploy your Next.js app is to use the [Vercel Platform](https://vercel.com/new?utm_medium=default-template&filter=next.js&utm_source=create-next-app&utm_campaign=create-next-app-readme) from the creators of Next.js.
+## License

-Check out our [Next.js deployment documentation](https://nextjs.org/docs/app/building-your-application/deploying) for more details.
+MIT
diff --git a/eslint.config.mjs b/eslint.config.mjs
index 05e726d..12cc271 100644
--- a/eslint.config.mjs
+++ b/eslint.config.mjs
@@ -12,6 +12,9 @@ const eslintConfig = defineConfig([
     "out/**",
     "build/**",
     "next-env.d.ts",
+    // Local debug / fixture generation scripts — CommonJS one-offs that
+    // are never bundled into the site and don't need the Next lint rules.
+    "scripts/**",
   ]),
 ]);

diff --git a/src/lib/__tests__/insights-parser.test.ts b/src/lib/__tests__/insights-parser.test.ts
new file mode 100644
index 0000000..e76c798
--- /dev/null
+++ b/src/lib/__tests__/insights-parser.test.ts
@@ -0,0 +1,205 @@
+// Regression guards for the insights HTML parsers. The Instagram export
+// format is not stable — class names and label text change every few months —
+// and these parsers are the most fragile surface in the project. Any test
+// that goes red here is a strong signal that IG changed their layout.
+//
+// The fixtures below are minimal extracts of real exports, simplified to the
+// shape that each parser actually walks. They intentionally include the
+// extra wrapper divs and class noise that IG ships, so that selector changes
+// (e.g. dropping `_2piu`) are caught.
+
+import { describe, it, expect } from "vitest";
+import JSZip from "jszip";
+import { parseInsights } from "@/lib/insights-parser";
+
+async function buildZip(files: Record<string, string>): Promise<JSZip> {
+  const zip = new JSZip();
+  for (const [path, content] of Object.entries(files)) {
+    zip.file(path, content);
+  }
+  // Round-trip through generateAsync so that the resulting JSZip behaves the
+  // same as one loaded from disk (file metadata, not just in-memory shortcut).
+  const blob = await zip.generateAsync({ type: "blob" });
+  return JSZip.loadAsync(blob);
+}
+
+describe("parseInsights — likedPosts (KO label)", () => {
+  it("extracts usernames from `사용자 이름` rows", async () => {
+    const html = `
+
+
+ + +
사용자 이름alice
+
+
+ + +
사용자 이름bob
+
+ + `; + const zip = await buildZip({ + "your_instagram_activity/likes/liked_posts.html": html, + }); + const insights = await parseInsights(zip); + const names = insights.topLikedAccounts.map((r) => r.name).sort(); + expect(names).toEqual(["alice", "bob"]); + }); +}); + +describe("parseInsights — likedPosts (EN label)", () => { + it("extracts usernames from `Username` rows", async () => { + const html = ` + + + + +
Usernamecarol
Usernamedave
+ + `; + const zip = await buildZip({ + "your_instagram_activity/likes/liked_posts.html": html, + }); + const insights = await parseInsights(zip); + const names = insights.topLikedAccounts.map((r) => r.name).sort(); + expect(names).toEqual(["carol", "dave"]); + }); +}); + +describe("parseInsights — savedPosts (h2 usernames)", () => { + it("collects single-token h2 entries", async () => { + const html = ` + +

spaceship_one

+

not a username

+

cometchaser

+

this_is_too_long_to_be_a_real_instagram_handle_xxxxxxxx

+ + `; + const zip = await buildZip({ + "your_instagram_activity/saved/saved_posts.html": html, + }); + const insights = await parseInsights(zip); + const names = insights.topSavedAccounts.map((r) => r.name).sort(); + // "not a username" rejected (whitespace), 50+ char string rejected. + expect(names).toEqual(["cometchaser", "spaceship_one"]); + }); +}); + +describe("parseInsights — profileSearches", () => { + it("returns h2 names with extracted timestamps", async () => { + const html = ` + +
+

searched_user_1

+
3월 16, 2026 6:41 오후
+
+
+

searched_user_2

+
4월 1, 2026 9:00 오전
+
+ + `; + const zip = await buildZip({ + "your_instagram_activity/recent_searches/profile_searches.html": html, + }); + const insights = await parseInsights(zip); + expect(insights.profileSearches).toHaveLength(2); + expect(insights.profileSearches[0].name).toBe("searched_user_1"); + expect(insights.profileSearches[0].timestamp).toBeGreaterThan(0); + }); +}); + +describe("parseInsights — wordSearches", () => { + it("extracts query text from 검색 / Search rows", async () => { + const html = ` + + + + + +
검색
코딩
3월 16, 2026 6:41 오후
+ + + + +
Search
music
4월 1, 2026 9:00 오전
+ + `; + const zip = await buildZip({ + "your_instagram_activity/recent_searches/word_or_phrase_searches.html": html, + }); + const insights = await parseInsights(zip); + const queries = insights.wordSearches.map((r) => r.name).sort(); + expect(queries).toEqual(["music", "코딩"]); + }); +}); + +describe("parseInsights — loginActivity", () => { + it("counts ISO timestamps in h2 elements per hour", async () => { + const html = ` + +

2026-04-01T09:23:00Z

+

2026-04-01T09:45:00Z

+

2026-04-01T18:01:00Z

+ + `; + const zip = await buildZip({ + "security_and_login_information/login_activity.html": html, + }); + const insights = await parseInsights(zip); + expect(insights.loginHours[9]).toBe(2); + expect(insights.loginHours[18]).toBe(1); + expect(insights.loginHours.reduce((a, b) => a + b, 0)).toBe(3); + }); + + it("counts KO 오전/오후 cells in 12-hour clock", async () => { + const html = ` + + + + + +
3월 16, 2026 6:41 오후
3월 16, 2026 6:50 오후
3월 16, 2026 9:00 오전
+ + `; + const zip = await buildZip({ + "security_and_login_information/login_activity.html": html, + }); + const insights = await parseInsights(zip); + expect(insights.loginHours[18]).toBe(2); + expect(insights.loginHours[9]).toBe(1); + }); +}); + +describe("parseInsights — chats", () => { + it("extracts chat partner names from h2 a", async () => { + const html = ` + +

alice

+

bob

+ + `; + const zip = await buildZip({ + "your_instagram_activity/messages/chats.html": html, + }); + const insights = await parseInsights(zip); + expect(insights.chatNames.sort()).toEqual(["alice", "bob"]); + }); +}); + +describe("parseInsights — empty / missing files", () => { + it("returns zeros when none of the source files exist", async () => { + const zip = await buildZip({ + "followers_and_following/followers_1.html": "", + }); + const insights = await parseInsights(zip); + expect(insights.topLikedAccounts).toEqual([]); + expect(insights.topSavedAccounts).toEqual([]); + expect(insights.profileSearches).toEqual([]); + expect(insights.wordSearches).toEqual([]); + expect(insights.chatNames).toEqual([]); + expect(insights.loginHours).toEqual(new Array(24).fill(0)); + }); +}); diff --git a/src/lib/__tests__/parser.test.ts b/src/lib/__tests__/parser.test.ts index 8ce6f7f..cf26671 100644 --- a/src/lib/__tests__/parser.test.ts +++ b/src/lib/__tests__/parser.test.ts @@ -126,17 +126,16 @@ describe("parseInstagramZip", () => { await expect(parseInstagramZip(file)).rejects.toThrow("INVALID_ZIP"); }); - it("handles empty followers_and_following directory", async () => { + it("rejects an Instagram-shaped zip with no actual records as EMPTY_DATA", async () => { + // Validation passes (path contains "followers") but no parseable data + // exists. The parser surfaces this as EMPTY_DATA so the UI can tell the + // user "this looks like an IG export but the format may have changed", + // which is more actionable than rendering an empty dashboard. 
     const zip = new JSZip();
-    // Directory marker exists but no actual data files inside
     zip.file("followers_and_following/readme.txt", "empty export");
     const file = await zipToFile(zip);

-    const result = await parseInstagramZip(file);
-
-    expect(result.followers).toHaveLength(0);
-    expect(result.following).toHaveLength(0);
-    expect(result.mutual).toHaveLength(0);
+    await expect(parseInstagramZip(file)).rejects.toThrow("EMPTY_DATA");
   });

   it("parses pending, unfollowed, closeFriends, blocked, restricted", async () => {
diff --git a/src/lib/parser.ts b/src/lib/parser.ts
index 61ee6eb..8446724 100644
--- a/src/lib/parser.ts
+++ b/src/lib/parser.ts
@@ -151,6 +151,18 @@ function validateInstagramZip(zip: JSZip): void {
   if (!isInstagram) throw new Error("INVALID_ZIP");
 }

+function isAnalysisEmpty(a: AnalysisResult): boolean {
+  return (
+    a.followers.length === 0 &&
+    a.following.length === 0 &&
+    a.pendingRequests.length === 0 &&
+    a.recentlyUnfollowed.length === 0 &&
+    a.closeFriends.length === 0 &&
+    a.blockedAccounts.length === 0 &&
+    a.restrictedAccounts.length === 0
+  );
+}
+
 // ── Main entry ──

 export async function parseInstagramZip(
@@ -158,7 +170,11 @@ export async function parseInstagramZip(
 ): Promise<AnalysisResult> {
   const zip = await JSZip.loadAsync(file);
   validateInstagramZip(zip);
-  return analyzeZip(zip);
+  const analysis = await analyzeZip(zip);
+  if (isAnalysisEmpty(analysis)) {
+    throw new Error("EMPTY_DATA");
+  }
+  return analysis;
 }

 export async function parseFileFull(file: File): Promise {
@@ -172,5 +188,14 @@ export async function parseFileFull(file: File): Promise {
     parseInsights(zip),
   ]);

+  // The validate step only checks that *some* path mentions followers /
+  // following — that catches "you uploaded the wrong zip" — but it can still
+  // produce 0 records if Instagram changed their export schema. Surface that
+  // as a distinct error so the UI can tell the user "this looks like an IG
+  // export but the format may have changed" instead of an empty dashboard.
+  if (isAnalysisEmpty(analysis)) {
+    throw new Error("EMPTY_DATA");
+  }
+
   return { analysis, insights };
 }
diff --git a/src/locales/en.json b/src/locales/en.json
index ca10d19..aeb9c06 100644
--- a/src/locales/en.json
+++ b/src/locales/en.json
@@ -48,6 +48,7 @@
       "INVALID_ZIP": "This doesn't look like an Instagram data export. Make sure you downloaded the ZIP from Instagram.",
       "UNSUPPORTED_FORMAT": "Please upload a .zip file from Instagram's data export.",
       "FILE_TOO_LARGE": "File is too large. Maximum allowed size is 500 MB.",
+      "EMPTY_DATA": "We could read the ZIP, but it didn't contain any followers or following data. Instagram may have changed their export format — please request a fresh export, and if the problem persists, open an issue on GitHub.",
       "default": "Something went wrong. Please try again with a valid Instagram data export."
     }
   },
diff --git a/src/locales/ko.json b/src/locales/ko.json
index 7b4ab85..e3643b4 100644
--- a/src/locales/ko.json
+++ b/src/locales/ko.json
@@ -48,6 +48,7 @@
       "INVALID_ZIP": "인스타그램 데이터 내보내기 파일이 아닌 것 같습니다. 인스타그램에서 다운로드한 ZIP 파일인지 확인해주세요.",
       "UNSUPPORTED_FORMAT": "인스타그램 데이터 내보내기에서 받은 .zip 파일을 올려주세요.",
       "FILE_TOO_LARGE": "파일이 너무 큽니다. 최대 허용 크기는 500 MB입니다.",
+      "EMPTY_DATA": "ZIP은 정상적으로 읽었지만 팔로워/팔로잉 데이터가 들어있지 않습니다. 인스타그램이 내보내기 형식을 바꿨을 수 있습니다 — 다시 다운로드해보시고, 그래도 같은 문제라면 GitHub 이슈로 알려주세요.",
       "default": "문제가 발생했습니다. 유효한 인스타그램 데이터 파일로 다시 시도해주세요."
     }
   },
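
Review note: the diff adds the `EMPTY_DATA` code and its locale strings but does not show the UI code that turns a thrown error into a message. The sketch below illustrates one way a caller might do that lookup, with a fallback to the `default` entry. The names `ErrorMessages`, `enErrors`, and `errorMessageFor` are hypothetical, and the sample strings are abbreviated from `src/locales/en.json`; the real component code may differ.

```typescript
// Hypothetical shape of the error-message map in src/locales/*.json
// (assumption: a flat map of code -> message with a required "default").
type ErrorMessages = { default: string } & Record<string, string>;

// Abbreviated sample values for illustration; the full strings live in en.json.
const enErrors: ErrorMessages = {
  INVALID_ZIP: "This doesn't look like an Instagram data export.",
  EMPTY_DATA:
    "We could read the ZIP, but it didn't contain any followers or following data.",
  default: "Something went wrong.",
};

// Resolve a thrown value to a user-facing message.
// The parser throws `new Error("<CODE>")`, so the code is read from `message`.
function errorMessageFor(err: unknown, messages: ErrorMessages): string {
  const code = err instanceof Error ? err.message : "";
  // Own-property check so inherited keys such as "toString" never match.
  const hit = Object.prototype.hasOwnProperty.call(messages, code)
    ? messages[code]
    : undefined;
  return hit ?? messages.default;
}

console.log(errorMessageFor(new Error("EMPTY_DATA"), enErrors));
console.log(errorMessageFor(new Error("SOMETHING_ELSE"), enErrors));
```

Keeping the fallback inside the lookup means any new code thrown by the parser still produces a readable message until both locale files gain a matching key.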