ManasaDeshagouni/Readme.MD


~/about

I'm Manasa, a Master's student in Computer Science at San José State University and a Graduate Research Assistant working on zero-shot malware attribution with metric-learned embeddings.

Previously, I spent 2+ years at Optum (UnitedHealth Group) shipping production ML, GenAI features, and backend systems for a secure file-transfer platform.

I care about the part of ML that actually gets used: retrieval, inference, guardrails, evaluation, rollout safety, and the systems that turn predictions into action.

  • 🔬 Currently researching: zero-shot retrieval, metric learning, FAISS-based search
  • 🧠 Interested in: production ML, GenAI/RAG, embeddings, backend systems
  • 🤝 Open to: SWE, MLE, Applied ML, and Research Engineer roles

projects/featured

  ╭──────────────────────────────────────────╮
  │  🔍  "What was that paper about          │
  │       attention mechanisms I read         │
  │       last month?"                        │
  │                                           │
  │  Found 3 results in 47ms                  │
  │  ├── 📄 attention_is_all_you_need.pdf     │
  │  │   ✦ 0.94 relevance · chunk 3/12       │
  │  ├── 📝 transformer_notes.md              │
  │  │   ✦ 0.89 relevance · tagged: #nlp     │
  │  └── 🖼️ architecture_diagram.png          │
  │      ✦ 0.81 relevance · CLIP matched     │
  ╰──────────────────────────────────────────╯

Your second brain, with semantic superpowers.

A local-first AI agent that ingests everything — PDFs, notes, receipts, images, code snippets — and makes it all searchable by meaning, not keywords. Multimodal embeddings (MiniLM for text, CLIP for images), FAISS HNSW index, adaptive retrieval with temporal reranking.

| What it handles | How it performs |
| --- | --- |
| 100k+ documents indexed | < 500ms retrieval |
| 5 features: search, Q&A, summarize, rank, discover | Metadata filtering + temporal reranking |
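
One way temporal reranking could work is to blend each hit's similarity score with a recency term. The sketch below is only an illustration under assumptions: the `Hit` type, the exponential half-life decay, and the `half_life_days` and `recency_weight` values are hypothetical and are not taken from the Mnemosyne code.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    similarity: float  # vector-search score, assumed to lie in [0, 1]
    age_days: float    # time since the document was last touched


def temporal_rerank(hits, half_life_days=30.0, recency_weight=0.2):
    """Blend similarity with an exponential recency decay (hypothetical weights)."""
    def score(hit):
        # Recency is 1.0 for a brand-new document and 0.5 after one half-life.
        recency = 0.5 ** (hit.age_days / half_life_days)
        return (1.0 - recency_weight) * hit.similarity + recency_weight * recency
    return sorted(hits, key=score, reverse=True)


hits = [
    Hit("attention_is_all_you_need.pdf", similarity=0.94, age_days=400),
    Hit("transformer_notes.md", similarity=0.89, age_days=2),
]
reranked = temporal_rerank(hits)
print([h.doc_id for h in reranked])
```

With these toy numbers the two-day-old note overtakes the older PDF even though its raw similarity is slightly lower; setting `recency_weight=0.0` falls back to pure similarity order.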


  03:14:22 ⚠  ALERT  disk_full on prod-db-03
  03:14:22 ⚠  ALERT  disk_full on prod-db-03   ← duplicate, suppressed
  03:14:23 ⚠  ALERT  disk_full on prod-db-03   ← duplicate, suppressed
  03:14:24 🔔 PAGED  @sarah (on-call: infra)    ← 1.9s from first alert
  03:14:26 ✅ ACK    @sarah acknowledged         ← 220ms ack→resolve
  
  ┌─────────────────────────────┐
  │ 3 alerts → 1 page → 1 ack  │
  │ 58% noise eliminated       │
  └─────────────────────────────┘

Pages the right engineer. Kills the noise.

Multi-tenant on-call alerting SaaS with real-time dedup, correlation, and idempotent workers. JWT/HMAC-secured ingest. React + WebSocket console for live ack/resolve. Runs standalone or as a pre-filter ahead of PagerDuty/Opsgenie.
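
The HMAC half of the secured ingest can be sketched with the standard library alone. This is a minimal illustration, assuming the sender signs the raw request body with a per-tenant shared secret using HMAC-SHA256; the secret, payload shape, and function names are hypothetical.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Constant-time comparison, so timing does not leak how much of the signature matched."""
    return hmac.compare_digest(sign(secret, body), signature)


secret = b"tenant-shared-secret"  # hypothetical per-tenant secret
body = b'{"alert": "disk_full", "host": "prod-db-03"}'
signature = sign(secret, body)

print(verify(secret, body, signature))         # True
print(verify(secret, body + b" ", signature))  # False
```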

| Metric | Result |
| --- | --- |
| First-notify p95 | 1.9s @ 350 req/s |
| Duplicate suppression | 58% |
| Delivery success | 77% → 94% (retries + backoff + DLQ < 0.6%) |
| Ack→resolve p95 | 220ms |
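
Duplicate suppression of the kind shown in the log above can be approximated with a fingerprint and a time window. The sketch below is an assumption-laden illustration: the `Deduplicator` class, the `(tenant, alert_name, resource)` fingerprint, and the 300-second window are hypothetical, and a multi-worker deployment would keep this state in a shared store rather than in process memory.

```python
import time


class Deduplicator:
    """Suppress alerts whose fingerprint was already seen inside a time window."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        self._window_start = {}  # fingerprint -> time of the alert that opened the window

    def should_page(self, tenant, alert_name, resource, now=None):
        now = time.monotonic() if now is None else now
        fingerprint = (tenant, alert_name, resource)
        started = self._window_start.get(fingerprint)
        if started is not None and now - started < self.window_seconds:
            return False  # duplicate inside the window: suppress
        self._window_start[fingerprint] = now
        return True


dedup = Deduplicator(window_seconds=300.0)
decisions = [
    dedup.should_page("acme", "disk_full", "prod-db-03", now=t)
    for t in (0.0, 0.4, 1.2)
]
print(decisions)  # [True, False, False]
```

Three identical alerts inside the window produce a single page, matching the "3 alerts → 1 page" pattern in the example log.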


          AUDIO                          TEXT
            │                              │
   ┌────────▼────────┐          ┌──────────▼──────────┐
   │  Acoustic feats  │          │  DistilBERT + cues  │
   │  → BiLSTM        │          │  → text embedding    │
   └────────┬────────┘          └──────────┬──────────┘
            │         confidence            │
            └──────────┐  ┌────────────────┘
                    ┌──▼──▼──┐
                    │ XGBoost │  ← late fusion
                    │  Fuser  │
                    └────┬───┘
                         │
                    ┌────▼────┐
                    │ TRUTH or │
                    │ DECEPTION│
                    └─────────┘
                    
   accuracy: 89.4%  ·  precision: 93.5%

Your voice says more than your words.

Multimodal deception detection that fuses what you say with how you say it. Temporal acoustic features encoded via BiLSTM, transcript text encoded via DistilBERT with explicit lexical/linguistic cues, late-fused through XGBoost for robust classification on short, noisy clips.
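
Late fusion means each modality produces its own confidence first, and a small meta-classifier combines them. The sketch below illustrates that idea only: it swaps the XGBoost fuser for a dependency-free logistic regression, and the toy confidences, labels, and hyperparameters are invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_fuser(rows, labels, lr=0.5, epochs=2000):
    """Fit a logistic-regression meta-classifier on per-modality confidences.

    Stand-in for the XGBoost fuser; trained with plain stochastic gradient descent.
    """
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            pred = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = pred - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def fuse(weights, bias, features):
    """Probability of the positive class given (audio_conf, text_conf)."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


# Toy data: (audio_confidence, text_confidence) -> 1 = deceptive, 0 = truthful.
rows = [(0.9, 0.8), (0.8, 0.6), (0.7, 0.9), (0.2, 0.3), (0.1, 0.4), (0.3, 0.1)]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = fit_fuser(rows, labels)
print(fuse(weights, bias, (0.85, 0.75)) > 0.5)
```

A tree-based fuser such as XGBoost can also capture non-linear interactions between the two confidences, which this linear stand-in cannot.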


🎮 More builds → QuizChronicles
  ┌─ ROOM: "algo-arena" ──────── 4/10 players ─┐
  │                                              │
  │  🟢 alice    142 pts   solving Q3...         │
  │  🟢 bob      138 pts   submitted ✓  180ms   │
  │  🟡 charlie  120 pts   idle                  │
  │  🟢 you      155 pts   🏆 leading            │
  │                                              │
  │  ⏱️ 02:34 remaining                          │
  │  fan-out: 120ms p95 across 800 sockets       │
  └──────────────────────────────────────────────┘

Interactive coding + quiz platform with modular Spring Boot backend, sandboxed code execution, React + Monaco editor, real-time rooms, leaderboards, proctor controls over WebSockets, and a timed game-themed solo coding mode.

| Metric | Result |
| --- | --- |
| Submission p95 | 180ms @ 200 users |
| Fan-out p95 | 120ms @ 800 sockets |
| Dropped updates | < 0.5% |
| Cache speedup | 230 → 140ms (−39%) |


work/production

Note

Can't open-source proprietary code — but here's what I built and what it did.

Predictive Reliability Engine
Kafka → FastAPI+ONNX → Spring Boot gates
     scoring: p95 85ms @ 1.2k msgs/s

shadow mode → canary (14d, 0 FP) → prod
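
The canary-to-prod step above can be read as a simple promotion gate. The sketch below is a hypothetical illustration of that rule only (14 days observed, zero false positives); the `CanaryStats` type and `next_stage` function are invented names, and the shadow-to-canary criteria are left out because they are not stated here.

```python
from dataclasses import dataclass


@dataclass
class CanaryStats:
    days_observed: int
    false_positives: int


def next_stage(current: str, stats: CanaryStats, min_days: int = 14) -> str:
    """Promote canary -> prod only after `min_days` with zero false positives."""
    if current == "canary":
        if stats.days_observed >= min_days and stats.false_positives == 0:
            return "prod"
        return "canary"
    # Criteria for leaving shadow mode are not specified, so other stages stay put.
    return current


print(next_stage("canary", CanaryStats(days_observed=14, false_positives=0)))  # prod
print(next_stage("canary", CanaryStats(days_observed=14, false_positives=1)))  # canary
```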


🤖 GenAI Product Features
config/logs → sanitize → RAG retrieve
  → LLM (LLaMA-2 / Mistral-7B + LoRA)
  → validate schema → serve
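
The two guardrail steps at either end of this pipeline, sanitizing input and validating output, might look roughly like this. Everything specific here is an assumption for illustration: the redaction patterns, the required fields, and the function names are hypothetical and do not describe the production system.

```python
import json
import re

# Hypothetical redaction patterns; a real deployment would maintain a vetted list.
SECRET_PATTERNS = [
    re.compile(r"(?i)(password|token|api[_-]?key)\s*[=:]\s*\S+"),
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"),  # IPv4-looking strings
]

# Assumed output schema: field name -> expected Python type.
REQUIRED_FIELDS = {"summary": str, "severity": str, "actions": list}


def sanitize(text: str) -> str:
    """Mask secrets and addresses before the text reaches retrieval or the LLM."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def validate(raw_output: str) -> dict:
    """Parse model output as JSON and reject anything that misses the schema."""
    data = json.loads(raw_output)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or mistyped field: {field}")
    return data


clean = sanitize("login failed host=10.0.0.12 password=hunter2")
print(clean)  # login failed host=[REDACTED] [REDACTED]
```

Rejecting malformed output at the `validate` step means a bad generation fails loudly instead of being served.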


🔧 ECG Platform Services
20+ UIs + Spring Boot APIs
dual-schema rollout → 0 breakages
correlation IDs: UI→API→workers



research/active

🦠 Neural Fingerprints for Malware

Graduate Research Assistant @ SJSU

malware binary → image → ResNet encoder
  → L2 normalize → hypersphere embedding
  → FAISS HNSW → nearest family match

  train: 47 families (MalNet + MalImg)
  test:  17 unseen families (zero-shot)

Learning domain-robust embeddings with ProxyAnchor, Triplet, and SupCon losses so unseen malware can be clustered and retrieved — without retraining.

Key insight: Strong in-domain separation ≠ cross-domain generalization. The bottleneck is representation stability under dataset shift, not loss function choice.

| Metric | Result |
| --- | --- |
| Cross-domain retrieval | 88.02% (MalNet→MalImg) |
| Strict zero-shot | 57–67% (17 unseen families) |
| Best loss | ProxyAnchor |
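
The retrieval half of the pipeline (L2 normalize, then nearest family match) can be illustrated without any ML dependencies. The sketch below is a brute-force stand-in for the FAISS HNSW lookup, and the three-dimensional vectors and family names are made up; real embeddings would come from the ResNet encoder.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def nearest_family(query, gallery):
    """Return the family whose reference embedding has the highest cosine similarity.

    Brute-force stand-in for an HNSW index; on unit vectors the dot product
    equals cosine similarity.
    """
    q = l2_normalize(query)
    best_family, best_score = None, -2.0
    for family, embedding in gallery:
        score = sum(a * b for a, b in zip(q, l2_normalize(embedding)))
        if score > best_score:
            best_family, best_score = family, score
    return best_family, best_score


gallery = [("family_a", [1.0, 0.1, 0.0]), ("family_b", [0.0, 1.0, 0.2])]
family, score = nearest_family([0.9, 0.2, 0.0], gallery)
print(family)  # family_a
```

Because matching is a lookup against stored embeddings, adding an unseen family only requires inserting its reference embeddings into the gallery, not retraining the encoder.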

🏃 Cross-Domain HAR

Zero-shot Pocket Activity Recognition

4 source datasets (heterogeneous phones)
  → unified calibration pipeline
  → physics-aware temporal model
  → 3 zero-shot target datasets

  standing recall: 0% ──fix pipeline──→ ~99%

Multi-source domain adaptation for Sitting / Standing / Walking. Physics-aware calibrator auto-detects sampling rate, units, and orientation of unseen sensors.

Key insight: Standing recall collapsed to ~0% from preprocessing-distribution mismatch — not model weakness. Fixing the pipeline fixed the model.

| Metric | Result |
| --- | --- |
| Source-domain F1 | 94.1% (subject-disjoint) |
| Zero-shot transfer | ~95.9% (UTwente) |
| Standing recovery | 0% → ~99% |
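
Two of the calibrator's checks, sampling rate and units, can be sketched with simple heuristics. This is an illustrative guess at the idea, not the project's method: it assumes timestamps in seconds, accelerometer samples as `(x, y, z)` tuples, and a hypothetical magnitude threshold of 3.0 to tell g from m/s². Orientation detection is not covered.

```python
import math
import statistics

G = 9.80665  # standard gravity in m/s^2


def estimate_sampling_rate(timestamps_s):
    """Median inter-sample gap is robust to occasional dropped samples."""
    gaps = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    return 1.0 / statistics.median(gaps)


def to_m_per_s2(samples):
    """Guess the unit from the median magnitude: ~1 suggests g, ~9.8 suggests m/s^2."""
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    median_mag = statistics.median(magnitudes)
    scale = G if median_mag < 3.0 else 1.0  # 3.0 is an illustrative threshold
    return [(x * scale, y * scale, z * scale) for x, y, z in samples]


timestamps = [i * 0.02 for i in range(6)]  # a 50 Hz recording
samples_in_g = [(0.0, 0.0, 1.0), (0.01, 0.0, 0.99), (0.0, 0.02, 1.01)]

print(round(estimate_sampling_rate(timestamps)))  # 50
print(to_m_per_s2(samples_in_g)[0])               # (0.0, 0.0, 9.80665)
```

A unit mismatch of this kind is the sort of preprocessing-distribution shift that can zero out recall for a static class such as Standing, since the gravity magnitude is the main signal separating it from motion.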

papers/

| Year | Paper | Venue |
| --- | --- | --- |
| 🏆 2024 | Brain Tumor Detection using Machine Learning | ICCDS 2024 · Best Paper Award |
| 2023 | Deep Learning Techniques for Detection of Deepfakes | IJSRSET (ICSCR 2023) |

stack.yml

languages:
  - Java
  - Python
  - Go
  - C++
  - TypeScript
  - SQL

ml_and_ai:
  - PyTorch
  - TensorFlow
  - Transformers
  - ONNX Runtime
  - FAISS
  - PEFT / LoRA
  - LangChain
  - scikit-learn
  - XGBoost

backend_and_systems:
  - Spring Boot
  - Spring Security
  - FastAPI
  - Kafka
  - Redis
  - PostgreSQL
  - MongoDB
  - Docker
  - Kubernetes
  - GCP
  - AWS

frontend:
  - React
  - Angular
  - Tailwind
  - Vite

observability_and_testing:
  - Grafana
  - Prometheus
  - Selenium / Cucumber
  - JUnit
  - k6

"Research matters. Production proves it."

Pinned

  1. Mnemosyne (Public)

    A Personal Knowledge Intelligence System for Structured Memory, Semantic Retrieval, and Proactive Recall

    Python

  2. neural-fingerprints (Public)

    Image-based malware attribution using metric learning

    Python

  3. TruthReaper (Public)

    A dual-track deception detection system designed to classify spoken statements as truthful or deceptive.

    Python

  4. FetiiAI (Public)

    Intelligent Rideshare Analytics Platform

    Python