forked from icd/rentgen

chore: add Docker support and Claude Code documentation

- Added a Dockerfile with a multi-stage build (artifacts + dev environment)
- Added .dockerignore to optimize builds
- Added CLAUDE.md documenting the architecture and workflow for Claude Code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
root 2025-10-25 13:27:29 +02:00
parent cf94d45ee1
commit 6c1df64b40
3 changed files with 311 additions and 0 deletions

16
.dockerignore Normal file

@@ -0,0 +1,16 @@
*.log
node_modules
sidebar.js
web-ext-artifacts/
lib/*
yarn-error.log
rentgen.zip
# Generated PNG icons (build artifacts)
assets/icons/*.png
assets/icon-addon-*.png
# Exception: do not ignore the `browser-api` directory inside `lib`
!/lib/browser-api/
Dockerfile

250
CLAUDE.md Normal file

@@ -0,0 +1,250 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
**Rentgen** is a privacy-focused browser extension for Firefox and Chrome that intercepts web traffic, identifies third-party tracking, and visualizes stolen data (cookies, browsing history, etc.). It generates GDPR-compliant reports and email templates for Polish website administrators and the Personal Data Protection Office.
**Language Note**: The codebase is in English, but the extension UI and generated reports are in Polish. Comments and documentation may be bilingual.
## Build & Development Commands
### Standard Build Workflow
```bash
npm install # Install dependencies
npm run build # Build for Firefox (default)
npm run build:firefox # Build for Firefox explicitly
npm run build:chrome # Build for Chrome
npm run create-package # Package into web-ext-artifacts/ directory
npm run build-addon # Complete build: install + build + package
```
### Development Workflow
```bash
npm run watch # Watch mode - auto-rebuild on file changes
npm run watch:firefox # Watch for Firefox
npm run watch:chrome # Watch for Chrome
npm run ext-test # Run extension in temporary Firefox profile (web-ext run)
```
### Quality Checks
```bash
npm run typecheck # Run TypeScript type checking (tsc --noEmit)
npm run lint # Lint extension with web-ext lint
```
### Testing in Browser
After building, load the temporary add-on:
1. Firefox: Navigate to `about:debugging` → This Firefox → Load Temporary Add-on
2. Chrome: Navigate to `chrome://extensions/` → Enable Developer Mode → Load unpacked
**Note**: There are no automated test suites in this codebase. Testing is manual via browser extension loading.
## Architecture Overview
### Core Pattern: Event-Driven Singleton with Observer Pattern
Rentgen uses a centralized **Memory** singleton (background script) that:
- Intercepts all HTTP requests via `webRequest` API
- Maintains hierarchical data structure: `origin → shorthost → RequestCluster → StolenDataEntry`
- Emits `'change'` events when data updates
- Drives UI re-renders in React components via custom `useEmitter` hook
### Key Components
#### Background Service (`background.ts` + `memory.ts`)
- **Memory class** (extends SaferEmitter): Central orchestrator managing all extension state
- Listens to `webRequest.onBeforeRequest` and `webRequest.onBeforeSendHeaders`
- Maintains `clusters` map: `origin → Map<shorthost, RequestCluster>`
- Emits `'change'` events to notify UI components
- Updates browser badge with domain count and color indicators
- Accessible globally via `getMemory()` singleton
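The nested `clusters` map and the singleton accessor can be pictured with a minimal sketch. It assumes a simplified cluster that only counts requests; `ClusterSketch`, `MemorySketch`, `register`, `onChange`, and `getMemorySketch` are hypothetical names for illustration, not the repository's actual code.

```typescript
// Hypothetical, simplified stand-in for RequestCluster: it only counts requests.
class ClusterSketch {
  readonly shorthost: string;
  requestCount = 0;
  constructor(shorthost: string) {
    this.shorthost = shorthost;
  }
}

class MemorySketch {
  // origin → (shorthost → cluster), mirroring the structure described above.
  private clusters = new Map<string, Map<string, ClusterSketch>>();
  private listeners: Array<(shorthost: string) => void> = [];

  onChange(listener: (shorthost: string) => void): void {
    this.listeners.push(listener);
  }

  // Assumed entry point: record one third-party request for an origin.
  register(origin: string, shorthost: string): void {
    let byHost = this.clusters.get(origin);
    if (!byHost) {
      byHost = new Map<string, ClusterSketch>();
      this.clusters.set(origin, byHost);
    }
    let cluster = byHost.get(shorthost);
    if (!cluster) {
      cluster = new ClusterSketch(shorthost);
      byHost.set(shorthost, cluster);
    }
    cluster.requestCount += 1;
    // Notify observers that this shorthost changed.
    for (const listener of this.listeners) listener(shorthost);
  }

  getClustersForOrigin(origin: string): Map<string, ClusterSketch> {
    return this.clusters.get(origin) ?? new Map<string, ClusterSketch>();
  }
}

// Lazily created single instance, analogous to the getMemory() accessor.
let instance: MemorySketch | undefined;
function getMemorySketch(): MemorySketch {
  if (!instance) instance = new MemorySketch();
  return instance;
}
```

Listeners are called synchronously here to keep the sketch short; the Event System section describes deferred emission.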
#### Network Interception Layer
- **ExtendedRequest class** (`extended-request.ts`): Wraps individual HTTP requests
- Static registry: `ExtendedRequest.by_id[requestId]` for fast lookup
- Two-phase initialization: constructor (body) + `init()` method (headers)
- Detects third-party requests by comparing origins
- Extracts "stolen data" from: cookies, query params, pathname, headers, request body
- Generates HAR (HTTP Archive) format for reports
- Calculates priority scores based on data sensitivity
- **RequestCluster class** (`request-cluster.ts`): Groups requests by `origin + shorthost`
- Aggregates StolenDataEntry items with deduplication
- Tracks expanded/collapsed state for UI
- Auto-marks suspicious entries (history exposure, tracking IDs)
- Emits `'change'` events on modifications
#### Data Classification
- **StolenDataEntry class** (`stolen-data-entry.ts`): Individual data points
- Sources: `cookie`, `pathname`, `queryparams`, `header`, `request_body`
- Classifications: `id` (tracking ID), `history` (browsing history), `location` (geolocation)
- Smart value parsing: recursively decodes Base64, JSON, URLs, nested structures
- Priority calculation: combines value length, origin exposure, data type
- Mark/unmark system for user selection in reports
#### Browser API Abstraction (`lib/browser-api/`)
- **Cross-browser compatibility layer** selected at build time via `TARGET` env var
- `types.ts`: Unified interface for tabs, badge, webRequest, cookies APIs
- `index.ts`: Exports Chrome or Firefox implementation based on `process.env.TARGET`
- Standardizes differences (e.g., `browserAction` vs `action`)
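As a rough illustration of build-time target selection, the sketch below picks one of two implementations from a target string. It assumes the Chrome build uses the `action` namespace and the Firefox build uses `browserAction`; the `BrowserApiSketch` shape and `selectBrowserApi` are hypothetical and do not reflect the real `types.ts`.

```typescript
type Target = "firefox" | "chrome";

// Hypothetical unified shape; the real interface lives in lib/browser-api/types.ts.
interface BrowserApiSketch {
  target: Target;
  // Namespace assumed to own the badge methods in this browser.
  badgeNamespace: "browserAction" | "action";
}

const firefoxApi: BrowserApiSketch = { target: "firefox", badgeNamespace: "browserAction" };
const chromeApi: BrowserApiSketch = { target: "chrome", badgeNamespace: "action" };

// In a real build the argument would come from process.env.TARGET.
function selectBrowserApi(target: string | undefined): BrowserApiSketch {
  // Firefox is the default, matching `npm run build`.
  return target === "chrome" ? chromeApi : firefoxApi;
}
```

Callers then depend only on the unified shape, so browser-specific naming stays inside one module.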
### Data Flow
```
HTTP Request Initiated
    ↓
Memory.onBeforeRequest → Create ExtendedRequest (capture body)
    ↓
Memory.onBeforeSendHeaders → ExtendedRequest.init() (capture headers)
    ↓
Extract stolen data (cookies, params, headers, body)
    ↓
Memory.register() → Check if third-party → Add to RequestCluster
    ↓
Memory.emit('change', shorthost) → Broadcast event
    ↓
React components (via useEmitter hook) → UI re-renders
```
### UI Components
#### Sidebar (`components/sidebar/`)
- **sidebar.tsx**: Main extension UI listing third-party domains
- **stolen-data.tsx**: Renders RequestClusters with filtering options
- **stolen-data-cluster.tsx**: Expandable cluster showing individual StolenDataEntry items
- Filters: `minValueLength`, `cookiesOnly`, `cookiesOrOriginOnly`
- Real-time updates via `useEmitter(Memory)` hook
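The three sidebar filters could combine roughly as follows. This is a sketch under assumptions: the `EntrySketch` shape and the `exposesOrigin` flag are hypothetical, and the reading of `cookiesOrOriginOnly` (keep cookies, or entries that expose the visited origin) is a guess from the option name.

```typescript
interface EntrySketch {
  source: "cookie" | "pathname" | "queryparams" | "header" | "request_body";
  value: string;
  // Hypothetical flag: the value contains the visited page's origin.
  exposesOrigin: boolean;
}

interface FilterOptions {
  minValueLength: number;
  cookiesOnly: boolean;
  cookiesOrOriginOnly: boolean;
}

function filterEntries(entries: EntrySketch[], opts: FilterOptions): EntrySketch[] {
  return entries.filter((entry) => {
    // Drop values shorter than the configured minimum.
    if (entry.value.length < opts.minValueLength) return false;
    // Strictest filter: cookies only.
    if (opts.cookiesOnly && entry.source !== "cookie") return false;
    // Looser filter: cookies, or anything that exposes the origin.
    if (opts.cookiesOrOriginOnly && entry.source !== "cookie" && !entry.exposesOrigin) return false;
    return true;
  });
}
```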
#### Report Window (`components/report-window/`)
- Multi-stage report generation workflow:
1. **Survey**: User questionnaire (role, tone, gender pronouns) via `survey-react`
2. **Screenshot**: External service generates domain screenshots
3. **Preview**: Final email/report content with GDPR violation analysis
- **deduce-problems.tsx**: Analyzes survey answers to identify GDPR violations
- **har-converter.tsx**: Generates filtered HAR archives
- **email-content.tsx**: Renders Polish email template (polite or harsh tone)
#### Toolbar (`components/toolbar/`)
- Browser action popup (top-right icon)
### Build System
- **esbuild** (`esbuild.config.js`): TypeScript → JavaScript bundler
- Entry points: toolbar.tsx, sidebar.tsx, report-window.tsx, background.ts, diag.tsx, styles
- External React libs loaded via globals (`globalThis.React`, `globalThis.ReactDOM`)
- SCSS plugin for styling
- Define flags: `PLUGIN_NAME`, `PLUGIN_URL`
- Watch mode available for development
- **Target Selection**: Set `TARGET=firefox` or `TARGET=chrome` before build to select browser API implementation
- **Manifest**: Currently uses Manifest V2 (Firefox). Chrome support is partial and being expanded.
## Important Implementation Details
### Third-Party Detection Heuristics
When determining if a request is third-party (in `extended-request.ts`):
1. Compare request origin with tab origin
2. Check `documentUrl` and `originUrl` from webRequest details
3. Use `urlClassification.thirdParty` if available
4. Analyze `frameAncestors` for nested iframe scenarios
5. Fall back to comparing hostnames
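The final fallback, comparing hostnames, can be sketched as below. It covers only that last step and assumes a naive "last two labels" shorthost, which is wrong for suffixes such as `co.uk`; `hostnameOf`, `shortHost`, and `isThirdParty` are hypothetical helpers, not the functions in `extended-request.ts`.

```typescript
// Returns the hostname, or undefined when the string is not a valid URL.
function hostnameOf(url: string): string | undefined {
  try {
    return new URL(url).hostname;
  } catch {
    return undefined;
  }
}

// Naive approximation: keep the last two dot-separated labels.
function shortHost(hostname: string): string {
  return hostname.split(".").slice(-2).join(".");
}

function isThirdParty(requestUrl: string, pageUrl: string | undefined): boolean {
  const requestHost = hostnameOf(requestUrl);
  const pageHost = pageUrl ? hostnameOf(pageUrl) : undefined;
  // Without both hosts no decision is possible; this sketch treats that as first-party.
  if (!requestHost || !pageHost) return false;
  return shortHost(requestHost) !== shortHost(pageHost);
}
```

A production version would more likely rely on a public-suffix list than on label counting.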
### Stolen Data Extraction Strategy
Data is extracted from multiple sources (priority order):
1. **Cookies**: Via `browser.cookies.getAll()`
2. **Query Parameters**: Parsed from URL search string
3. **Pathname**: URL path segments
4. **Headers**: Request headers (Cookie, Referer, etc.)
5. **Request Body**: POST/PUT data (form data, JSON)
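For the two URL-based sources, a minimal sketch using the standard `URL` API might look like this. `ExtractedEntry` and `extractFromUrl` are hypothetical names, and cookies, headers, and request bodies are left out because they need the browser APIs.

```typescript
interface ExtractedEntry {
  source: "queryparams" | "pathname";
  name: string;
  value: string;
}

function extractFromUrl(rawUrl: string): ExtractedEntry[] {
  const url = new URL(rawUrl);
  const entries: ExtractedEntry[] = [];
  // searchParams yields percent-decoded values.
  url.searchParams.forEach((value, name) => {
    entries.push({ source: "queryparams", name, value });
  });
  // A bare "/" path carries no information worth reporting.
  if (url.pathname !== "/") {
    entries.push({ source: "pathname", name: "pathname", value: url.pathname });
  }
  return entries;
}
```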
### Value Parsing Chain
`StolenDataEntry` recursively decodes values:
1. Detect Base64 encoding → decode
2. Detect URL encoding → decode
3. Detect JSON → parse
4. Detect nested URLs → extract
5. Stop at maximum recursion depth
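The decoding chain above can be sketched as a small recursive function. This is an illustration under assumptions, not the `StolenDataEntry` implementation: `parseValue`, `MAX_DEPTH`, and the Base64 heuristic are hypothetical, and nested-URL extraction (step 4) is omitted.

```typescript
const MAX_DEPTH = 5; // Assumed limit; the real value is not stated here.

// Heuristic only: short alphanumeric words can also match this pattern.
function looksLikeBase64(s: string): boolean {
  return s.length >= 8 && s.length % 4 === 0 && /^[A-Za-z0-9+/]+={0,2}$/.test(s);
}

function isPrintable(s: string): boolean {
  return /^[\x20-\x7E\s]*$/.test(s);
}

function parseValue(value: unknown, depth = 0): unknown {
  if (depth >= MAX_DEPTH) return value;
  if (Array.isArray(value)) return value.map((v) => parseValue(v, depth + 1));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = parseValue(inner, depth + 1);
    }
    return out;
  }
  if (typeof value !== "string") return value;

  // 1. Base64: accept the decoding only if it yields printable text.
  if (looksLikeBase64(value)) {
    try {
      const decoded = atob(value);
      if (isPrintable(decoded)) return parseValue(decoded, depth + 1);
    } catch {
      // Not valid Base64; fall through to the next decoder.
    }
  }
  // 2. URL encoding: decode when at least one %XX escape is present.
  if (/%[0-9A-Fa-f]{2}/.test(value)) {
    try {
      const decoded = decodeURIComponent(value);
      if (decoded !== value) return parseValue(decoded, depth + 1);
    } catch {
      // Malformed escape sequence; keep the raw value.
    }
  }
  // 3. JSON objects and arrays.
  const trimmed = value.trim();
  if (trimmed.startsWith("{") || trimmed.startsWith("[")) {
    try {
      return parseValue(JSON.parse(trimmed), depth + 1);
    } catch {
      // Not JSON after all.
    }
  }
  return value;
}
```

The depth counter increases on every decoding step, so a value that keeps decoding to itself cannot loop forever.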
### Auto-Marking Rules
RequestClusters automatically mark entries as suspicious if:
- Value exposes browsing history (referrer, path info)
- Cookie length > 100 characters
- Known trackers: Google Analytics, Facebook, DoubleClick, etc.
- Classified as `id` or `history` type
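Read as a predicate, the rules above could look like the sketch below. The `MarkCandidate` shape, the `requestHost` field, and the three-entry host list are assumptions for illustration; history exposure is approximated by the `history` classification, and the repository's actual tracker list is not reproduced here.

```typescript
interface MarkCandidate {
  source: "cookie" | "pathname" | "queryparams" | "header" | "request_body";
  value: string;
  classification?: "id" | "history" | "location";
  // Hypothetical field: the host the data was sent to.
  requestHost: string;
}

// Illustrative hosts only, not the project's real list.
const KNOWN_TRACKER_HOSTS = ["google-analytics.com", "facebook.com", "doubleclick.net"];

function isKnownTracker(host: string): boolean {
  return KNOWN_TRACKER_HOSTS.some((t) => host === t || host.endsWith("." + t));
}

function shouldAutoMark(entry: MarkCandidate): boolean {
  // Tracking IDs and browsing history are always marked.
  if (entry.classification === "id" || entry.classification === "history") return true;
  // Long cookies are treated as likely identifiers.
  if (entry.source === "cookie" && entry.value.length > 100) return true;
  // Anything sent to a known tracker host is marked.
  return isKnownTracker(entry.requestHost);
}
```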
### Event System
- **SaferEmitter** (`safer-emitter.ts`): EventEmitter wrapper with async emission
- Uses `setTimeout(..., 0)` to decouple events from synchronous request handling
- Prevents errors in listeners from breaking request flow
- **useEmitter Hook** (`components/sidebar/sidebar.tsx`): React integration
- Increments counter state on each event to trigger re-renders
- Automatically subscribes/unsubscribes via `useEffect`
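A minimal sketch of the deferred, error-isolating emission described above follows. `SaferEmitterSketch` is a hypothetical stand-in, not the contents of `safer-emitter.ts`; it only demonstrates the `setTimeout(..., 0)` and `try`/`catch` idea.

```typescript
type Listener = (...args: unknown[]) => void;

class SaferEmitterSketch {
  private listeners = new Map<string, Listener[]>();

  on(event: string, listener: Listener): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  off(event: string, listener: Listener): void {
    const list = this.listeners.get(event) ?? [];
    this.listeners.set(event, list.filter((l) => l !== listener));
  }

  emit(event: string, ...args: unknown[]): void {
    for (const listener of this.listeners.get(event) ?? []) {
      // Defer each listener so emit() returns before any listener runs.
      setTimeout(() => {
        try {
          listener(...args);
        } catch (error) {
          // A failing listener is logged and never reaches the caller of emit().
          console.error(`listener for "${event}" failed`, error);
        }
      }, 0);
    }
  }
}
```

A React hook built on top of this would subscribe in `useEffect`, bump a counter in the listener, and call `off` in the cleanup function.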
## Common Development Tasks
### Adding a New Data Source
1. Add extraction logic to `ExtendedRequest.getAllData()` in `extended-request.ts`
2. Define new source type in `StolenDataEntry.sources` in `stolen-data-entry.ts`
3. Update classification logic in `StolenDataEntry.getClassification()`
4. Update UI components to display the new source (if needed)
### Supporting a New Browser
1. Create implementation in `lib/browser-api/` (e.g., `safari.ts`)
2. Update `lib/browser-api/index.ts` to export based on `TARGET` env var
3. Add corresponding npm scripts in `package.json` (`build:safari`, etc.)
4. Test browser-specific APIs for compatibility
### Modifying Report Templates
- **Email templates**: `email-template-polite.js`, `email-template-harsh.js`
- **Problem deduction**: `components/report-window/deduce-problems.tsx`
- **Survey questions**: `components/report-window/questions.tsx`
- Templates are in Polish and follow GDPR complaint structure
### Debugging Request Interception
- Check `ExtendedRequest.by_id` registry for request lookup issues
- Verify `Memory.clusters` structure for data organization
- Inspect the background page for logging (Firefox: `about:debugging` → This Firefox → Inspect; Chrome: `chrome://extensions/` → Inspect views)
- Enable `console.log` in `memory.ts` event listeners
## Project Constraints
- **No automated tests**: Manual testing only via browser extension loading
- **Polish language**: UI and reports are Polish-focused (English i18n is future work)
- **Manifest V2**: The primary target is Firefox; migration to Manifest V3 for Chrome is in progress
- **External screenshot service**: Report generation depends on external API
- **No minification**: Currently disabled in esbuild config (commented out)
- **Node.js 16.x requirement**: Specified in README
## Repository Information
- **Primary Repository**: https://git.internet-czas-dzialac.pl/icd/rentgen (Gitea)
- **Mirror**: GitHub (issues not accepted there)
- **Issue Tracking**: Email kontakt@internet-czas-dzialac.pl
- **License**: GPL-3.0-or-later
- **Authors**: Kuba Orlik, Arkadiusz Wieczorek (Internet. Time to act! Foundation)
## File Organization
```
rentgen/
├── background.ts # Extension entry point
├── memory.ts # Central state manager (Memory singleton)
├── extended-request.ts # HTTP request wrapper and data extraction
├── request-cluster.ts # Request aggregation by domain
├── stolen-data-entry.ts # Individual data point representation
├── safer-emitter.ts # EventEmitter wrapper
├── util.ts # Utility functions
├── components/
│ ├── sidebar/ # Main extension UI
│ ├── toolbar/ # Browser action popup
│ └── report-window/ # Report generation workflow
├── lib/
│ └── browser-api/ # Cross-browser API abstraction
├── email-template-*.js # Polish email templates
├── esbuild.config.js # Build configuration
├── manifest.json # Extension manifest (V2)
└── assets/ # Icons and screenshots
```

45
Dockerfile Normal file

@@ -0,0 +1,45 @@
# Rentgen Browser Extension - Docker Build
#
# Usage:
# Build and extract artifacts directly:
# docker buildx build . --target artifacts --output artifacts
#
# Or traditional build (creates full development environment):
# docker build -t rentgen .
# docker run --rm rentgen ls -lh /app/web-ext-artifacts/
#
# Run commands in the container:
# docker run --rm rentgen npm run build:chrome
# docker run --rm rentgen npm run typecheck
# Build stage
FROM node:lts AS builder
WORKDIR /app
# Copy package files for dependency installation (better layer caching)
COPY package.json package-lock.json ./
# Install dependencies
RUN npm install
# Copy source code (respecting .dockerignore)
COPY . .
# Build the extension for Firefox (default)
RUN npm run build
# Create the package
RUN npm run create-package
# Artifacts stage - only contains the built artifacts (for --output)
FROM scratch AS artifacts
# Copy only the built extension zip file to root
COPY --from=builder /app/web-ext-artifacts/*.zip /
# Default stage - full development environment
FROM builder
# Default command shows the built artifact
CMD ["ls", "-lh", "/app/web-ext-artifacts/"]