forked from icd/rentgen
rentgen/CLAUDE.md
root 6c1df64b40 chore: add Docker support and Claude Code documentation
- Added a Dockerfile with a multi-stage build (artifacts + dev environment)
- Added .dockerignore to optimize builds
- Added CLAUDE.md documenting the architecture and workflow for Claude Code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-25 13:27:29 +02:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Rentgen is a privacy-focused browser extension for Firefox and Chrome that intercepts web traffic, identifies third-party tracking, and visualizes stolen data (cookies, browsing history, etc.). It generates GDPR-compliant reports and email templates for Polish website administrators and the Personal Data Protection Office.

Language Note: The codebase is in English, but the extension UI and generated reports are in Polish. Comments and documentation may be bilingual.

Build & Development Commands

Standard Build Workflow

npm install                    # Install dependencies
npm run build                  # Build for Firefox (default)
npm run build:firefox          # Build for Firefox explicitly
npm run build:chrome           # Build for Chrome
npm run create-package         # Package into web-ext-artifacts/ directory
npm run build-addon            # Complete build: install + build + package

Development Workflow

npm run watch                  # Watch mode - auto-rebuild on file changes
npm run watch:firefox          # Watch for Firefox
npm run watch:chrome           # Watch for Chrome
npm run ext-test               # Run extension in temporary Firefox profile (web-ext run)

Quality Checks

npm run typecheck              # Run TypeScript type checking (tsc --noEmit)
npm run lint                   # Lint extension with web-ext lint

Testing in Browser

After building, load the temporary add-on:

  1. Firefox: Navigate to about:debugging → This Firefox → Load Temporary Add-on
  2. Chrome: Navigate to chrome://extensions/ → Enable Developer Mode → Load unpacked

Note: There are no automated test suites in this codebase. Testing is manual via browser extension loading.

Architecture Overview

Core Pattern: Event-Driven Singleton with Observer Pattern

Rentgen uses a centralized Memory singleton (background script) that:

  • Intercepts all HTTP requests via webRequest API
  • Maintains a hierarchical data structure: origin → shorthost → RequestCluster → StolenDataEntry
  • Emits 'change' events when data updates
  • Drives UI re-renders in React components via custom useEmitter hook

Key Components

Background Service (background.ts + memory.ts)

  • Memory class (extends SaferEmitter): Central orchestrator managing all extension state
    • Listens to webRequest.onBeforeRequest and webRequest.onBeforeSendHeaders
    • Maintains clusters map: origin → Map<shorthost, RequestCluster>
    • Emits 'change' events to notify UI components
    • Updates browser badge with domain count and color indicators
    • Accessible globally via getMemory() singleton
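
A minimal sketch of the singleton and its nested clusters map may make the shape easier to picture. The names MemorySketch, ClusterSketch, getMemorySketch and onChange are hypothetical; the real Memory class extends SaferEmitter and is fed by webRequest listeners, both of which this sketch leaves out.

```typescript
// Hypothetical, simplified stand-ins; the real classes carry far more state.
type ChangeListenerSketch = (shorthost: string) => void;

class ClusterSketch {
  requestCount = 0;
  constructor(public readonly shorthost: string) {}
}

class MemorySketch {
  // origin -> (shorthost -> cluster), mirroring the hierarchy described above
  readonly clusters = new Map<string, Map<string, ClusterSketch>>();
  private listeners: ChangeListenerSketch[] = [];

  onChange(listener: ChangeListenerSketch): void {
    this.listeners.push(listener);
  }

  register(origin: string, shorthost: string): ClusterSketch {
    let byHost = this.clusters.get(origin);
    if (!byHost) {
      byHost = new Map<string, ClusterSketch>();
      this.clusters.set(origin, byHost);
    }
    let cluster = byHost.get(shorthost);
    if (!cluster) {
      cluster = new ClusterSketch(shorthost);
      byHost.set(shorthost, cluster);
    }
    cluster.requestCount += 1;
    // Notify subscribers, standing in for emit('change', shorthost).
    for (const listener of this.listeners) listener(shorthost);
    return cluster;
  }
}

// Lazily created single instance, in the spirit of getMemory().
let memoryInstance: MemorySketch | undefined;
function getMemorySketch(): MemorySketch {
  if (!memoryInstance) memoryInstance = new MemorySketch();
  return memoryInstance;
}
```

Every caller of getMemorySketch() receives the same object, so the sidebar and the report window would observe the same clusters.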

Network Interception Layer

  • ExtendedRequest class (extended-request.ts): Wraps individual HTTP requests

    • Static registry: ExtendedRequest.by_id[requestId] for fast lookup
    • Two-phase initialization: constructor (body) + init() method (headers)
    • Detects third-party requests by comparing origins
    • Extracts "stolen data" from: cookies, query params, pathname, headers, request body
    • Generates HAR (HTTP Archive) format for reports
    • Calculates priority scores based on data sensitivity
  • RequestCluster class (request-cluster.ts): Groups requests by origin + shorthost

    • Aggregates StolenDataEntry items with deduplication
    • Tracks expanded/collapsed state for UI
    • Auto-marks suspicious entries (history exposure, tracking IDs)
    • Emits 'change' events on modifications
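
The deduplication mentioned for RequestCluster could look roughly like the sketch below. The key choice (source, name and value together) and the names ClusterEntrySketch and DedupClusterSketch are assumptions for illustration, not the repository's actual logic.

```typescript
// Hypothetical shapes; the real StolenDataEntry and RequestCluster hold more fields.
interface ClusterEntrySketch {
  source: string;
  name: string;
  value: string;
}

class DedupClusterSketch {
  private entries = new Map<string, ClusterEntrySketch>();

  // Returns true when the entry was new, false when it was a duplicate.
  add(entry: ClusterEntrySketch): boolean {
    // JSON.stringify on a tuple gives an unambiguous composite key.
    const key = JSON.stringify([entry.source, entry.name, entry.value]);
    if (this.entries.has(key)) return false;
    this.entries.set(key, entry);
    return true;
  }

  list(): ClusterEntrySketch[] {
    return Array.from(this.entries.values());
  }
}
```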

Data Classification

  • StolenDataEntry class (stolen-data-entry.ts): Individual data points
    • Sources: cookie, pathname, queryparams, header, request_body
    • Classifications: id (tracking ID), history (browsing history), location (geolocation)
    • Smart value parsing: recursively decodes Base64, JSON, URLs, nested structures
    • Priority calculation: combines value length, origin exposure, data type
    • Mark/unmark system for user selection in reports

Browser API Abstraction (lib/browser-api/)

  • Cross-browser compatibility layer selected at build time via TARGET env var
  • types.ts: Unified interface for tabs, badge, webRequest, cookies APIs
  • index.ts: Exports Chrome or Firefox implementation based on process.env.TARGET
  • Standardizes differences (e.g., browserAction vs action)
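
To illustrate the abstraction, here is a small sketch of one unified interface with two implementations chosen by target. BadgeApiSketch, makeBadgeApi and selectBrowserApi are hypothetical names, the implementations only record calls instead of touching a browser, and mapping the Chrome target to the action namespace is an assumption.

```typescript
// Hypothetical unified interface; the real types.ts also covers tabs, webRequest and cookies.
interface BadgeApiSketch {
  // Which native namespace the implementation would delegate to.
  readonly namespace: "browserAction" | "action";
  setBadgeText(text: string): void;
}

// Stand-in for a native API: records calls so the sketch runs outside a browser.
function makeBadgeApi(namespace: BadgeApiSketch["namespace"], calls: string[]): BadgeApiSketch {
  return {
    namespace,
    setBadgeText(text: string): void {
      calls.push(`${namespace}.setBadgeText(${JSON.stringify(text)})`);
    },
  };
}

function selectBrowserApi(target: string | undefined, calls: string[]): BadgeApiSketch {
  // In the real build the target would come from process.env.TARGET.
  // Firefox is treated as the default here, matching `npm run build`.
  return target === "chrome"
    ? makeBadgeApi("action", calls)
    : makeBadgeApi("browserAction", calls);
}
```

Callers depend only on the interface, so the rest of the code never needs to branch on the browser.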

Data Flow

HTTP Request Initiated
    ↓
Memory.onBeforeRequest → Create ExtendedRequest (capture body)
    ↓
Memory.onBeforeSendHeaders → ExtendedRequest.init() (capture headers)
    ↓
Extract stolen data (cookies, params, headers, body)
    ↓
Memory.register() → Check if third-party → Add to RequestCluster
    ↓
Memory.emit('change', shorthost) → Broadcast event
    ↓
React components (via useEmitter hook) → UI re-renders
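
The first two steps of this flow, two-phase initialization backed by a static registry, can be sketched as follows. The field names requestId, url and requestHeaders follow the webRequest details objects; the class name, the plain-string body and the handler names are simplified, hypothetical stand-ins.

```typescript
interface HeaderSketch {
  name: string;
  value?: string;
}

class TwoPhaseRequestSketch {
  // Static registry keyed by request id, in the spirit of ExtendedRequest.by_id.
  static by_id: Record<string, TwoPhaseRequestSketch> = {};
  headers: HeaderSketch[] = [];
  initialized = false;

  // Phase 1: created in onBeforeRequest, when the body is available.
  constructor(
    readonly requestId: string,
    readonly url: string,
    readonly body?: string
  ) {
    TwoPhaseRequestSketch.by_id[requestId] = this;
  }

  // Phase 2: completed in onBeforeSendHeaders, when headers are available.
  init(requestHeaders: HeaderSketch[]): void {
    this.headers = requestHeaders;
    this.initialized = true;
  }
}

function handleBeforeRequest(details: { requestId: string; url: string; body?: string }): void {
  new TwoPhaseRequestSketch(details.requestId, details.url, details.body);
}

function handleBeforeSendHeaders(details: {
  requestId: string;
  requestHeaders?: HeaderSketch[];
}): TwoPhaseRequestSketch | undefined {
  const request = TwoPhaseRequestSketch.by_id[details.requestId];
  if (!request) return undefined; // headers arrived for a request we never saw
  request.init(details.requestHeaders ?? []);
  return request;
}
```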

UI Components

Sidebar (components/sidebar/)

  • sidebar.tsx: Main extension UI listing third-party domains
  • stolen-data.tsx: Renders RequestClusters with filtering options
  • stolen-data-cluster.tsx: Expandable cluster showing individual StolenDataEntry items
  • Filters: minValueLength, cookiesOnly, cookiesOrOriginOnly
  • Real-time updates via useEmitter(Memory) hook

Report Window (components/report-window/)

  • Multi-stage report generation workflow:
    1. Survey: User questionnaire (role, tone, gender pronouns) via survey-react
    2. Screenshot: External service generates domain screenshots
    3. Preview: Final email/report content with GDPR violation analysis
  • deduce-problems.tsx: Analyzes survey answers to identify GDPR violations
  • har-converter.tsx: Generates filtered HAR archives
  • email-content.tsx: Renders Polish email template (polite or harsh tone)

Toolbar (components/toolbar/)

  • Browser action popup (top-right icon)

Build System

  • esbuild (esbuild.config.js): TypeScript → JavaScript bundler

    • Entry points: toolbar.tsx, sidebar.tsx, report-window.tsx, background.ts, diag.tsx, styles
    • External React libs loaded via globals (globalThis.React, globalThis.ReactDOM)
    • SCSS plugin for styling
    • Define flags: PLUGIN_NAME, PLUGIN_URL
    • Watch mode available for development
  • Target Selection: Set TARGET=firefox or TARGET=chrome before build to select browser API implementation

  • Manifest: Currently uses Manifest V2 (Firefox). Chrome support is partial and being expanded.
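
One plausible way for TARGET to reach the bundle is through esbuild's define option, sketched below as a plain options object. This is an assumption about how the selection could work, not a copy of esbuild.config.js: the entry point paths, the output directory and the PLUGIN_NAME value are placeholders.

```typescript
type TargetSketch = "firefox" | "chrome";

// Returns a plain object shaped like esbuild build options.
// Paths and define values are placeholders, not the repository's real ones.
function buildOptionsSketch(target: TargetSketch) {
  return {
    entryPoints: ["background.ts", "components/sidebar/sidebar.tsx"],
    bundle: true,
    outdir: "dist",
    minify: false, // this document notes that minification is disabled
    define: {
      // define values must be strings containing a JavaScript expression,
      // hence JSON.stringify to produce a quoted string literal.
      "process.env.TARGET": JSON.stringify(target),
      PLUGIN_NAME: JSON.stringify("example-name"),
    },
  };
}
```

With a define like this, a comparison against process.env.TARGET in lib/browser-api/index.ts would become a constant at build time.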

Important Implementation Details

Third-Party Detection Heuristics

When determining if a request is third-party (in extended-request.ts):

  1. Compare request origin with tab origin
  2. Check documentUrl and originUrl from webRequest details
  3. Use urlClassification.thirdParty if available
  4. Analyze frameAncestors for nested iframe scenarios
  5. Fall back to comparing hostnames

Stolen Data Extraction Strategy

Data is extracted from multiple sources, in priority order:

  1. Cookies: Via browser.cookies.getAll()
  2. Query Parameters: Parsed from URL search string
  3. Pathname: URL path segments
  4. Headers: Request headers (Cookie, Referer, etc.)
  5. Request Body: POST/PUT data (form data, JSON)
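
Sources 2 and 3 can be illustrated with the standard URL API alone. The sketch below is an assumption about the general approach; UrlEntrySketch and extractFromUrlSketch are hypothetical names, and the real extraction produces StolenDataEntry objects rather than plain records.

```typescript
interface UrlEntrySketch {
  source: "queryparams" | "pathname";
  name: string;
  value: string;
}

function extractFromUrlSketch(url: string): UrlEntrySketch[] {
  const parsed = new URL(url);
  const entries: UrlEntrySketch[] = [];

  // searchParams yields already percent-decoded names and values.
  parsed.searchParams.forEach((value, name) => {
    entries.push({ source: "queryparams", name, value });
  });

  // A bare "/" carries no information, so it is skipped.
  if (parsed.pathname !== "/") {
    entries.push({ source: "pathname", name: "pathname", value: parsed.pathname });
  }
  return entries;
}
```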

Value Parsing Chain

StolenDataEntry recursively decodes values:

  1. Detect Base64 encoding → decode
  2. Detect URL encoding → decode
  3. Detect JSON → parse
  4. Detect nested URLs → extract
  5. Stop at maximum recursion depth
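
The chain above can be sketched as a small recursive function. The depth limit of 5, the detection heuristics and all function names are assumptions; in particular the Base64 check (length, alphabet, printable result) can still misfire on short ordinary words, and this sketch does not descend into values nested inside parsed JSON or extract nested URLs.

```typescript
const MAX_DEPTH_SKETCH = 5; // assumed limit, not taken from the source

function tryBase64(value: string): string | undefined {
  if (value.length < 8 || value.length % 4 !== 0) return undefined;
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(value)) return undefined;
  try {
    const decoded = atob(value);
    // Accept only printable ASCII to limit false positives.
    return /^[\x20-\x7e]+$/.test(decoded) ? decoded : undefined;
  } catch {
    return undefined;
  }
}

function tryUrlDecode(value: string): string | undefined {
  if (!/%[0-9a-fA-F]{2}/.test(value)) return undefined;
  try {
    const decoded = decodeURIComponent(value);
    return decoded !== value ? decoded : undefined;
  } catch {
    return undefined; // malformed escape sequence
  }
}

function tryJson(value: string): unknown {
  const trimmed = value.trim();
  if (!(trimmed.startsWith("{") || trimmed.startsWith("["))) return undefined;
  try {
    return JSON.parse(trimmed);
  } catch {
    return undefined;
  }
}

function parseValueSketch(value: string, depth = 0): unknown {
  if (depth >= MAX_DEPTH_SKETCH) return value; // stop at maximum recursion depth
  const fromBase64 = tryBase64(value);
  if (fromBase64 !== undefined) return parseValueSketch(fromBase64, depth + 1);
  const fromUrl = tryUrlDecode(value);
  if (fromUrl !== undefined) return parseValueSketch(fromUrl, depth + 1);
  const fromJson = tryJson(value);
  if (fromJson !== undefined) return fromJson;
  return value;
}
```

Each successful decode feeds its output back into the same chain, which is what lets a Base64-wrapped JSON string end up as a parsed object.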

Auto-Marking Rules

RequestClusters automatically mark entries as suspicious if:

  • The value exposes browsing history (referrer, path info)
  • The entry is a cookie longer than 100 characters
  • The request goes to a known tracker (Google Analytics, Facebook, DoubleClick, etc.)
  • The entry is classified as id or history type
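
Expressed as a predicate, the rules might look like the sketch below. It assumes any single rule is enough to mark an entry; the field names, the exposesHistory flag and the short tracker list are illustrative placeholders rather than the repository's actual data.

```typescript
// Hypothetical entry shape for illustration only.
interface MarkableEntrySketch {
  source: "cookie" | "pathname" | "queryparams" | "header" | "request_body";
  value: string;
  classification?: "id" | "history" | "location";
  exposesHistory: boolean;
}

// Illustrative list; the real set of known trackers is not reproduced here.
const KNOWN_TRACKERS_SKETCH = ["google-analytics.com", "facebook.com", "doubleclick.net"];

function isKnownTracker(shorthost: string): boolean {
  return KNOWN_TRACKERS_SKETCH.some(
    (tracker) => shorthost === tracker || shorthost.endsWith("." + tracker)
  );
}

function shouldAutoMarkSketch(entry: MarkableEntrySketch, shorthost: string): boolean {
  if (entry.exposesHistory) return true;
  if (entry.source === "cookie" && entry.value.length > 100) return true;
  if (isKnownTracker(shorthost)) return true;
  return entry.classification === "id" || entry.classification === "history";
}
```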

Event System

  • SaferEmitter (safer-emitter.ts): EventEmitter wrapper with async emission
    • Uses setTimeout(..., 0) to decouple events from synchronous request handling
    • Prevents errors in listeners from breaking request flow
  • useEmitter Hook (components/sidebar/sidebar.tsx): React integration
    • Increments counter state on each event to trigger re-renders
    • Automatically subscribes/unsubscribes via useEffect
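
The deferred, error-isolating emission described for SaferEmitter can be sketched without any library. SaferEmitterSketch is a hypothetical stand-in: the injectable scheduler and the errors array are additions made here so the behaviour can be exercised deterministically, and the default scheduler uses setTimeout(..., 0) as described above.

```typescript
type ListenerSketch = (...args: unknown[]) => void;
type SchedulerSketch = (task: () => void) => void;

class SaferEmitterSketch {
  private listeners = new Map<string, ListenerSketch[]>();
  // Collected listener failures; a real implementation might log them instead.
  readonly errors: unknown[] = [];

  constructor(
    private schedule: SchedulerSketch = (task) => {
      setTimeout(task, 0);
    }
  ) {}

  on(event: string, listener: ListenerSketch): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  off(event: string, listener: ListenerSketch): void {
    const list = this.listeners.get(event) ?? [];
    this.listeners.set(event, list.filter((l) => l !== listener));
  }

  emit(event: string, ...args: unknown[]): void {
    // Copy the list so listeners added or removed during emission do not interfere.
    const snapshot = (this.listeners.get(event) ?? []).slice();
    for (const listener of snapshot) {
      // Each listener runs later and in isolation, so a throw cannot
      // break the caller or the remaining listeners.
      this.schedule(() => {
        try {
          listener(...args);
        } catch (error) {
          this.errors.push(error);
        }
      });
    }
  }
}
```

Because emit() returns before any listener runs, the request handler that triggered the event is never blocked or broken by UI code.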

Common Development Tasks

Adding a New Data Source

  1. Add extraction logic to ExtendedRequest.getAllData() in extended-request.ts
  2. Define new source type in StolenDataEntry.sources in stolen-data-entry.ts
  3. Update classification logic in StolenDataEntry.getClassification()
  4. Update UI components to display the new source (if needed)

Supporting a New Browser

  1. Create implementation in lib/browser-api/ (e.g., safari.ts)
  2. Update lib/browser-api/index.ts to export based on TARGET env var
  3. Add corresponding npm scripts in package.json (build:safari, etc.)
  4. Test browser-specific APIs for compatibility

Modifying Report Templates

  • Email templates: email-template-polite.js, email-template-harsh.js
  • Problem deduction: components/report-window/deduce-problems.tsx
  • Survey questions: components/report-window/questions.tsx
  • Templates are in Polish and follow GDPR complaint structure

Debugging Request Interception

  • Check ExtendedRequest.by_id registry for request lookup issues
  • Verify Memory.clusters structure for data organization
  • Inspect the background page for logging (Firefox: about:debugging → This Firefox → Inspect; Chrome: chrome://extensions/ → Inspect views)
  • Enable console.log in memory.ts event listeners

Project Constraints

  • No automated tests: Manual testing only via browser extension loading
  • Polish language: UI and reports are Polish-focused (English i18n is future work)
  • Manifest V2: Primary target is Firefox; Chrome V3 migration is in progress
  • External screenshot service: Report generation depends on external API
  • No minification: Currently disabled in esbuild config (commented out)
  • Node.js 16.x requirement: Specified in README

Repository Information

File Organization

rentgen/
├── background.ts              # Extension entry point
├── memory.ts                  # Central state manager (Memory singleton)
├── extended-request.ts        # HTTP request wrapper and data extraction
├── request-cluster.ts         # Request aggregation by domain
├── stolen-data-entry.ts       # Individual data point representation
├── safer-emitter.ts           # EventEmitter wrapper
├── util.ts                    # Utility functions
├── components/
│   ├── sidebar/               # Main extension UI
│   ├── toolbar/               # Browser action popup
│   └── report-window/         # Report generation workflow
├── lib/
│   └── browser-api/           # Cross-browser API abstraction
├── email-template-*.js        # Polish email templates
├── esbuild.config.js          # Build configuration
├── manifest.json              # Extension manifest (V2)
└── assets/                    # Icons and screenshots