# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Rentgen** is a privacy-focused browser extension for Firefox and Chrome that intercepts web traffic, identifies third-party tracking, and visualizes stolen data (cookies, browsing history, etc.). It generates GDPR-compliant reports and email templates for Polish website administrators and the Personal Data Protection Office.

**Language Note**: The codebase is in English, but the extension UI and generated reports are in Polish. Comments and documentation may be bilingual.

## Build & Development Commands

### Standard Build Workflow

```bash
npm install              # Install dependencies
npm run build            # Build for Firefox (default)
npm run build:firefox    # Build for Firefox explicitly
npm run build:chrome     # Build for Chrome
npm run create-package   # Package into web-ext-artifacts/ directory
npm run build-addon      # Complete build: install + build + package
```

### Development Workflow

```bash
npm run watch            # Watch mode - auto-rebuild on file changes
npm run watch:firefox    # Watch for Firefox
npm run watch:chrome     # Watch for Chrome
npm run ext-test         # Run extension in temporary Firefox profile (web-ext run)
```

### Quality Checks

```bash
npm run typecheck        # Run TypeScript type checking (tsc --noEmit)
npm run lint             # Lint extension with web-ext lint
```

### Testing in Browser

After building, load the temporary add-on:

1. Firefox: Navigate to `about:debugging` → This Firefox → Load Temporary Add-on
2. Chrome: Navigate to `chrome://extensions/` → Enable Developer Mode → Load unpacked

**Note**: There are no automated test suites in this codebase. Testing is manual via browser extension loading.

## Architecture Overview

### Core Pattern: Event-Driven Singleton with Observer Pattern

Rentgen uses a centralized **Memory** singleton (background script) that:

- Intercepts all HTTP requests via `webRequest` API
- Maintains hierarchical data structure: `origin → shorthost → RequestCluster → StolenDataEntry`
- Emits `'change'` events when data updates
- Drives UI re-renders in React components via custom `useEmitter` hook

### Key Components

#### Background Service (`background.ts` + `memory.ts`)

- **Memory class** (extends SaferEmitter): Central orchestrator managing all extension state
  - Listens to `webRequest.onBeforeRequest` and `webRequest.onBeforeSendHeaders`
  - Maintains `clusters` map: `origin → Map`
  - Emits `'change'` events to notify UI components
  - Updates browser badge with domain count and color indicators
  - Accessible globally via `getMemory()` singleton

#### Network Interception Layer

- **ExtendedRequest class** (`extended-request.ts`): Wraps individual HTTP requests
  - Static registry: `ExtendedRequest.by_id[requestId]` for fast lookup
  - Two-phase initialization: constructor (body) + `init()` method (headers)
  - Detects third-party requests by comparing origins
  - Extracts "stolen data" from: cookies, query params, pathname, headers, request body
  - Generates HAR (HTTP Archive) format for reports
  - Calculates priority scores based on data sensitivity
- **RequestCluster class** (`request-cluster.ts`): Groups requests by `origin + shorthost`
  - Aggregates StolenDataEntry items with deduplication
  - Tracks expanded/collapsed state for UI
  - Auto-marks suspicious entries (history exposure, tracking IDs)
  - Emits `'change'` events on modifications

#### Data Classification

- **StolenDataEntry class** (`stolen-data-entry.ts`): Individual data points
  - Sources: `cookie`, `pathname`, `queryparams`, `header`, `request_body`
  - Classifications: `id` (tracking ID), `history` (browsing history), `location` (geolocation)
  - Smart value parsing: recursively decodes Base64, JSON, URLs, nested structures
  - Priority calculation: combines value length, origin exposure, data type
  - Mark/unmark system for user selection in reports

#### Browser API Abstraction (`lib/browser-api/`)

- **Cross-browser compatibility layer** selected at build time via `TARGET` env var
  - `types.ts`: Unified interface for tabs, badge, webRequest, cookies APIs
  - `index.ts`: Exports Chrome or Firefox implementation based on `process.env.TARGET`
  - Standardizes differences (e.g., `browserAction` vs `action`)

### Data Flow

```
HTTP Request Initiated
    ↓
Memory.onBeforeRequest → Create ExtendedRequest (capture body)
    ↓
Memory.onBeforeSendHeaders → ExtendedRequest.init() (capture headers)
    ↓
Extract stolen data (cookies, params, headers, body)
    ↓
Memory.register() → Check if third-party → Add to RequestCluster
    ↓
Memory.emit('change', shorthost) → Broadcast event
    ↓
React components (via useEmitter hook) → UI re-renders
```

### UI Components

#### Sidebar (`components/sidebar/`)

- **sidebar.tsx**: Main extension UI listing third-party domains
- **stolen-data.tsx**: Renders RequestClusters with filtering options
- **stolen-data-cluster.tsx**: Expandable cluster showing individual StolenDataEntry items
- Filters: `minValueLength`, `cookiesOnly`, `cookiesOrOriginOnly`
- Real-time updates via `useEmitter(Memory)` hook

#### Report Window (`components/report-window/`)

- Multi-stage report generation workflow:
  1. **Survey**: User questionnaire (role, tone, gender pronouns) via `survey-react`
  2. **Screenshot**: External service generates domain screenshots
  3. **Preview**: Final email/report content with GDPR violation analysis
- **deduce-problems.tsx**: Analyzes survey answers to identify GDPR violations
- **har-converter.tsx**: Generates filtered HAR archives
- **email-content.tsx**: Renders Polish email template (polite or harsh tone)

#### Toolbar (`components/toolbar/`)

- Browser action popup (top-right icon)

### Build System

- **esbuild** (`esbuild.config.js`): TypeScript → JavaScript bundler
  - Entry points: toolbar.tsx, sidebar.tsx, report-window.tsx, background.ts, diag.tsx, styles
  - External React libs loaded via globals (`globalThis.React`, `globalThis.ReactDOM`)
  - SCSS plugin for styling
  - Define flags: `PLUGIN_NAME`, `PLUGIN_URL`
  - Watch mode available for development
- **Target Selection**: Set `TARGET=firefox` or `TARGET=chrome` before build to select browser API implementation
- **Manifest**: Currently uses Manifest V2 (Firefox). Chrome support is partial and being expanded.

## Important Implementation Details

### Third-Party Detection Heuristics

When determining if a request is third-party (in `extended-request.ts`):

1. Compare request origin with tab origin
2. Check `documentUrl` and `originUrl` from webRequest details
3. Use `urlClassification.thirdParty` if available
4. Analyze `frameAncestors` for nested iframe scenarios
5. Fall back to comparing hostnames

### Stolen Data Extraction Strategy

Data is extracted from multiple sources (priority order):

1. **Cookies**: Via `browser.cookies.getAll()`
2. **Query Parameters**: Parsed from URL search string
3. **Pathname**: URL path segments
4. **Headers**: Request headers (Cookie, Referer, etc.)
5. **Request Body**: POST/PUT data (form data, JSON)

### Value Parsing Chain

`StolenDataEntry` recursively decodes values:

1. Detect Base64 encoding → decode
2. Detect URL encoding → decode
3. Detect JSON → parse
4. Detect nested URLs → extract
5. Stop at maximum recursion depth

### Auto-Marking Rules

RequestClusters automatically mark entries as suspicious if:

- Value exposes browsing history (referrer, path info)
- Cookie length > 100 characters
- Known trackers: Google Analytics, Facebook, DoubleClick, etc.
- Classified as `id` or `history` type

### Event System

- **SaferEmitter** (`safer-emitter.ts`): EventEmitter wrapper with async emission
  - Uses `setTimeout(..., 0)` to decouple events from synchronous request handling
  - Prevents errors in listeners from breaking request flow
- **useEmitter Hook** (`components/sidebar/sidebar.tsx`): React integration
  - Increments counter state on each event to trigger re-renders
  - Automatically subscribes/unsubscribes via `useEffect`

## Common Development Tasks

### Adding a New Data Source

1. Add extraction logic to `ExtendedRequest.getAllData()` in `extended-request.ts`
2. Define new source type in `StolenDataEntry.sources` in `stolen-data-entry.ts`
3. Update classification logic in `StolenDataEntry.getClassification()`
4. Update UI components to display the new source (if needed)

### Supporting a New Browser

1. Create implementation in `lib/browser-api/` (e.g., `safari.ts`)
2. Update `lib/browser-api/index.ts` to export based on `TARGET` env var
3. Add corresponding npm scripts in `package.json` (`build:safari`, etc.)
4. Test browser-specific APIs for compatibility

### Modifying Report Templates

- **Email templates**: `email-template-polite.js`, `email-template-harsh.js`
- **Problem deduction**: `components/report-window/deduce-problems.tsx`
- **Survey questions**: `components/report-window/questions.tsx`
- Templates are in Polish and follow GDPR complaint structure

### Debugging Request Interception

- Check `ExtendedRequest.by_id` registry for request lookup issues
- Verify `Memory.clusters` structure for data organization
- Use browser DevTools → Extensions → Background Page for logging
- Enable `console.log` in `memory.ts` event listeners

## Project Constraints

- **No automated tests**: Manual testing only via browser extension loading
- **Polish language**: UI and reports are Polish-focused (English i18n is future work)
- **Manifest V2**: Primary target is Firefox; Chrome V3 migration is in progress
- **External screenshot service**: Report generation depends on external API
- **No minification**: Currently disabled in esbuild config (commented out)
- **Node.js 16.x requirement**: Specified in README

## Repository Information

- **Primary Repository**: https://git.internet-czas-dzialac.pl/icd/rentgen (Gitea)
- **Mirror**: GitHub (issues not accepted there)
- **Issue Tracking**: Email kontakt@internet-czas-dzialac.pl
- **License**: GPL-3.0-or-later
- **Authors**: Kuba Orlik, Arkadiusz Wieczorek (Internet. Time to act! Foundation)

## File Organization

```
rentgen/
├── background.ts           # Extension entry point
├── memory.ts               # Central state manager (Memory singleton)
├── extended-request.ts     # HTTP request wrapper and data extraction
├── request-cluster.ts      # Request aggregation by domain
├── stolen-data-entry.ts    # Individual data point representation
├── safer-emitter.ts        # EventEmitter wrapper
├── util.ts                 # Utility functions
├── components/
│   ├── sidebar/            # Main extension UI
│   ├── toolbar/            # Browser action popup
│   └── report-window/      # Report generation workflow
├── lib/
│   └── browser-api/        # Cross-browser API abstraction
├── email-template-*.js     # Polish email templates
├── esbuild.config.js       # Build configuration
├── manifest.json           # Extension manifest (V2)
└── assets/                 # Icons and screenshots
```
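
## Illustrative Sketch: Value Parsing Chain

The recursive decoding order listed under "Value Parsing Chain" (Base64, then URL encoding, then JSON, then nested URLs, with a depth limit) can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the code in `stolen-data-entry.ts`: the names `parseValue`, `walk`, `looksLikeBase64`, `isPrintable`, the `Parsed` type, the depth limit of 5, and the detection heuristics are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch of a recursive decode chain. None of these names or
// thresholds are taken from Rentgen's source; they are assumptions.
type Parsed = string | number | boolean | null | Parsed[] | { [key: string]: Parsed };

const MAX_DEPTH = 5; // assumed recursion limit

// Heuristic: Base64 alphabet only, padded length, and long enough to matter.
function looksLikeBase64(value: string): boolean {
  return value.length >= 8 && value.length % 4 === 0 && /^[A-Za-z0-9+/]+={0,2}$/.test(value);
}

// Accept a decode result only if it is printable ASCII or whitespace.
function isPrintable(value: string): boolean {
  return /^[\x20-\x7E\s]*$/.test(value);
}

function parseValue(value: string, depth = 0): Parsed {
  // 5. Stop at maximum recursion depth
  if (depth >= MAX_DEPTH) return value;

  // 1. Detect Base64 encoding, decode, and parse the result again
  if (looksLikeBase64(value)) {
    try {
      const decoded = atob(value);
      if (isPrintable(decoded)) return parseValue(decoded, depth + 1);
    } catch { /* not valid Base64, fall through */ }
  }

  // 2. Detect URL encoding, decode, and parse the result again
  if (/%[0-9A-Fa-f]{2}/.test(value)) {
    try {
      const decoded = decodeURIComponent(value);
      if (decoded !== value) return parseValue(decoded, depth + 1);
    } catch { /* malformed escape sequence, fall through */ }
  }

  // 3. Detect JSON and parse it, then decode any strings inside
  const trimmed = value.trim();
  if (trimmed[0] === "{" || trimmed[0] === "[") {
    try {
      return walk(JSON.parse(trimmed), depth + 1);
    } catch { /* not JSON, fall through */ }
  }

  // 4. Detect a nested URL and extract its parts
  if (/^https?:\/\//.test(value)) {
    try {
      const url = new URL(value);
      const params: { [key: string]: Parsed } = {};
      url.searchParams.forEach((v, k) => {
        params[k] = parseValue(v, depth + 1);
      });
      return { host: url.host, path: url.pathname, params };
    } catch { /* not a parseable URL, fall through */ }
  }

  return value;
}

// Recurse into parsed JSON so that encoded strings nested inside are decoded too.
function walk(node: unknown, depth: number): Parsed {
  if (typeof node === "string") return parseValue(node, depth);
  if (Array.isArray(node)) return node.map((item) => walk(item, depth));
  if (node !== null && typeof node === "object") {
    const out: { [key: string]: Parsed } = {};
    for (const key of Object.keys(node)) {
      out[key] = walk((node as Record<string, unknown>)[key], depth);
    }
    return out;
  }
  return node as number | boolean | null;
}
```

For example, `parseValue("https://example.com/a?x=1")` takes the nested-URL branch and returns an object with `host`, `path`, and `params` fields. The printable-output check after Base64 decoding is one possible way to limit false positives on short alphanumeric identifiers; the actual extension may use a different heuristic.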