forked from icd/rentgen
chore: add Docker support and Claude Code documentation

- Added a Dockerfile with a multi-stage build (artifacts + dev environment)
- Added .dockerignore to optimize builds
- Added CLAUDE.md with architecture and workflow documentation for Claude Code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
parent cf94d45ee1
commit 6c1df64b40
16
.dockerignore
Normal file
@@ -0,0 +1,16 @@
.log
node_modules
sidebar.js
web-ext-artifacts/
lib/*
yarn-error.log
rentgen.zip

# Generated PNG icons (build artifacts)
assets/icons/*.png
assets/icon-addon-*.png

# Exception: do not ignore the `browser-api` directory inside `lib`
!/lib/browser-api/

Dockerfile
250
CLAUDE.md
Normal file
@@ -0,0 +1,250 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Rentgen** is a privacy-focused browser extension for Firefox and Chrome that intercepts web traffic, identifies third-party tracking, and visualizes stolen data (cookies, browsing history, etc.). It generates GDPR-compliant reports and email templates for Polish website administrators and the Personal Data Protection Office.

**Language Note**: The codebase is in English, but the extension UI and generated reports are in Polish. Comments and documentation may be bilingual.

## Build & Development Commands

### Standard Build Workflow
```bash
npm install # Install dependencies
npm run build # Build for Firefox (default)
npm run build:firefox # Build for Firefox explicitly
npm run build:chrome # Build for Chrome
npm run create-package # Package into web-ext-artifacts/ directory
npm run build-addon # Complete build: install + build + package
```

### Development Workflow
```bash
npm run watch # Watch mode - auto-rebuild on file changes
npm run watch:firefox # Watch for Firefox
npm run watch:chrome # Watch for Chrome
npm run ext-test # Run extension in temporary Firefox profile (web-ext run)
```

### Quality Checks
```bash
npm run typecheck # Run TypeScript type checking (tsc --noEmit)
npm run lint # Lint extension with web-ext lint
```

### Testing in Browser
After building, load the temporary add-on:
1. Firefox: Navigate to `about:debugging` → This Firefox → Load Temporary Add-on
2. Chrome: Navigate to `chrome://extensions/` → Enable Developer Mode → Load unpacked

**Note**: There are no automated test suites in this codebase. Testing is manual via browser extension loading.

## Architecture Overview

### Core Pattern: Event-Driven Singleton with Observer Pattern

Rentgen uses a centralized **Memory** singleton (background script) that:
- Intercepts all HTTP requests via `webRequest` API
- Maintains hierarchical data structure: `origin → shorthost → RequestCluster → StolenDataEntry`
- Emits `'change'` events when data updates
- Drives UI re-renders in React components via custom `useEmitter` hook

### Key Components

#### Background Service (`background.ts` + `memory.ts`)
- **Memory class** (extends SaferEmitter): Central orchestrator managing all extension state
  - Listens to `webRequest.onBeforeRequest` and `webRequest.onBeforeSendHeaders`
  - Maintains `clusters` map: `origin → Map<shorthost, RequestCluster>`
  - Emits `'change'` events to notify UI components
  - Updates browser badge with domain count and color indicators
  - Accessible globally via `getMemory()` singleton
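
The nested `clusters` map and the lazy singleton accessor described above can be sketched roughly as follows. This is a minimal illustration, not the code in `memory.ts`: `MemorySketch`, `ClusterSketch`, `getMemorySketch`, and `getClustersForOrigin` are hypothetical names, and the real `Memory` class also extends SaferEmitter and talks to the browser APIs.

```typescript
// Hypothetical, simplified stand-in for the real RequestCluster class.
class ClusterSketch {
    requestCount = 0;
    constructor(public readonly shorthost: string) {}
}

class MemorySketch {
    // origin → (shorthost → cluster), mirroring the documented hierarchy.
    private clusters = new Map<string, Map<string, ClusterSketch>>();

    // Find or create the cluster for this origin/shorthost pair and count the request.
    register(origin: string, shorthost: string): ClusterSketch {
        let byHost = this.clusters.get(origin);
        if (!byHost) {
            byHost = new Map<string, ClusterSketch>();
            this.clusters.set(origin, byHost);
        }
        let cluster = byHost.get(shorthost);
        if (!cluster) {
            cluster = new ClusterSketch(shorthost);
            byHost.set(shorthost, cluster);
        }
        cluster.requestCount += 1;
        return cluster;
    }

    // All clusters recorded for one origin (empty array if none).
    getClustersForOrigin(origin: string): ClusterSketch[] {
        const byHost = this.clusters.get(origin);
        return byHost ? Array.from(byHost.values()) : [];
    }
}

// Lazy singleton accessor, analogous in spirit to getMemory().
let memorySketch: MemorySketch | undefined;
function getMemorySketch(): MemorySketch {
    if (!memorySketch) {
        memorySketch = new MemorySketch();
    }
    return memorySketch;
}
```

Keying first by origin keeps per-tab lookups cheap: everything the sidebar needs for one visited site sits under a single outer key.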

#### Network Interception Layer
- **ExtendedRequest class** (`extended-request.ts`): Wraps individual HTTP requests
  - Static registry: `ExtendedRequest.by_id[requestId]` for fast lookup
  - Two-phase initialization: constructor (body) + `init()` method (headers)
  - Detects third-party requests by comparing origins
  - Extracts "stolen data" from: cookies, query params, pathname, headers, request body
  - Generates HAR (HTTP Archive) format for reports
  - Calculates priority scores based on data sensitivity

- **RequestCluster class** (`request-cluster.ts`): Groups requests by `origin + shorthost`
  - Aggregates StolenDataEntry items with deduplication
  - Tracks expanded/collapsed state for UI
  - Auto-marks suspicious entries (history exposure, tracking IDs)
  - Emits `'change'` events on modifications
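
The static registry and two-phase initialization can be illustrated with a small sketch. It assumes the two `webRequest` events for one request share a `requestId`; the class, field, and handler names below are hypothetical, and the real `ExtendedRequest` carries far more state.

```typescript
type HeaderSketch = { name: string; value?: string };

class ExtendedRequestSketch {
    // Static registry keyed by the browser-assigned request id.
    static by_id: Record<string, ExtendedRequestSketch> = {};

    headers: HeaderSketch[] = [];
    initialized = false;

    // Phase 1 (onBeforeRequest): URL and body are known, headers are not yet.
    constructor(
        public readonly requestId: string,
        public readonly url: string,
        public readonly body: string | null
    ) {
        ExtendedRequestSketch.by_id[requestId] = this;
    }

    // Phase 2 (onBeforeSendHeaders): attach the headers to the existing object.
    init(headers: HeaderSketch[]): void {
        this.headers = headers;
        this.initialized = true;
    }
}

// Hypothetical handler for the first event: create and register the wrapper.
function onBeforeRequestSketch(details: { requestId: string; url: string; body?: string }): void {
    new ExtendedRequestSketch(details.requestId, details.url, details.body ?? null);
}

// Hypothetical handler for the second event: look the wrapper up and finish it.
function onBeforeSendHeadersSketch(details: {
    requestId: string;
    requestHeaders?: HeaderSketch[];
}): ExtendedRequestSketch | undefined {
    const request = ExtendedRequestSketch.by_id[details.requestId];
    if (!request) {
        return undefined; // the first event was never seen for this id
    }
    request.init(details.requestHeaders ?? []);
    return request;
}
```

The registry exists because the body and the headers arrive in different events, so the second handler needs a way to find the object created by the first.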

#### Data Classification
- **StolenDataEntry class** (`stolen-data-entry.ts`): Individual data points
  - Sources: `cookie`, `pathname`, `queryparams`, `header`, `request_body`
  - Classifications: `id` (tracking ID), `history` (browsing history), `location` (geolocation)
  - Smart value parsing: recursively decodes Base64, JSON, URLs, nested structures
  - Priority calculation: combines value length, origin exposure, data type
  - Mark/unmark system for user selection in reports

#### Browser API Abstraction (`lib/browser-api/`)
- **Cross-browser compatibility layer** selected at build time via `TARGET` env var
  - `types.ts`: Unified interface for tabs, badge, webRequest, cookies APIs
  - `index.ts`: Exports Chrome or Firefox implementation based on `process.env.TARGET`
  - Standardizes differences (e.g., `browserAction` vs `action`)
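
A rough sketch of how such a build-time selection could look. The interface is heavily reduced and every name ending in `Sketch` is hypothetical; the mapping of Firefox to `browserAction` and Chrome to `action` is an assumption based on the usual Manifest V2 and V3 namespaces, not something confirmed by `lib/browser-api/`.

```typescript
// Hypothetical, heavily reduced version of the unified interface in types.ts.
interface BrowserApiSketch {
    name: "firefox" | "chrome";
    // Which namespace the badge calls are forwarded to (assumed mapping).
    badgeNamespace: "browserAction" | "action";
}

const firefoxApiSketch: BrowserApiSketch = { name: "firefox", badgeNamespace: "browserAction" };
const chromeApiSketch: BrowserApiSketch = { name: "chrome", badgeNamespace: "action" };

// Pick an implementation from the TARGET value; Firefox is the default build.
function selectBrowserApiSketch(target: string | undefined): BrowserApiSketch {
    switch (target) {
        case "chrome":
            return chromeApiSketch;
        case "firefox":
        case undefined:
            return firefoxApiSketch;
        default:
            throw new Error(`Unsupported TARGET: ${target}`);
    }
}
```

In the real module the argument would presumably come from `process.env.TARGET`; it is passed explicitly here so the sketch runs without a bundler.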

### Data Flow

```
HTTP Request Initiated
↓
Memory.onBeforeRequest → Create ExtendedRequest (capture body)
↓
Memory.onBeforeSendHeaders → ExtendedRequest.init() (capture headers)
↓
Extract stolen data (cookies, params, headers, body)
↓
Memory.register() → Check if third-party → Add to RequestCluster
↓
Memory.emit('change', shorthost) → Broadcast event
↓
React components (via useEmitter hook) → UI re-renders
```

### UI Components

#### Sidebar (`components/sidebar/`)
- **sidebar.tsx**: Main extension UI listing third-party domains
- **stolen-data.tsx**: Renders RequestClusters with filtering options
- **stolen-data-cluster.tsx**: Expandable cluster showing individual StolenDataEntry items
- Filters: `minValueLength`, `cookiesOnly`, `cookiesOrOriginOnly`
- Real-time updates via `useEmitter(Memory)` hook

#### Report Window (`components/report-window/`)
- Multi-stage report generation workflow:
  1. **Survey**: User questionnaire (role, tone, gender pronouns) via `survey-react`
  2. **Screenshot**: External service generates domain screenshots
  3. **Preview**: Final email/report content with GDPR violation analysis
- **deduce-problems.tsx**: Analyzes survey answers to identify GDPR violations
- **har-converter.tsx**: Generates filtered HAR archives
- **email-content.tsx**: Renders Polish email template (polite or harsh tone)

#### Toolbar (`components/toolbar/`)
- Browser action popup (top-right icon)

### Build System

- **esbuild** (`esbuild.config.js`): TypeScript → JavaScript bundler
  - Entry points: toolbar.tsx, sidebar.tsx, report-window.tsx, background.ts, diag.tsx, styles
  - External React libs loaded via globals (`globalThis.React`, `globalThis.ReactDOM`)
  - SCSS plugin for styling
  - Define flags: `PLUGIN_NAME`, `PLUGIN_URL`
  - Watch mode available for development

- **Target Selection**: Set `TARGET=firefox` or `TARGET=chrome` before build to select browser API implementation

- **Manifest**: Currently uses Manifest V2 (Firefox). Chrome support is partial and being expanded.

## Important Implementation Details

### Third-Party Detection Heuristics
When determining if a request is third-party (in `extended-request.ts`):
1. Compare request origin with tab origin
2. Check `documentUrl` and `originUrl` from webRequest details
3. Use `urlClassification.thirdParty` if available
4. Analyze `frameAncestors` for nested iframe scenarios
5. Fall back to comparing hostnames
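
A minimal sketch of the first and last steps (origin comparison, then hostname fallback). It deliberately skips `documentUrl`, `originUrl`, `urlClassification`, and `frameAncestors`, and the "last two labels" shorthost rule is a naive assumption that breaks on suffixes such as `co.uk`; both function names are hypothetical.

```typescript
// Naive shorthost: keep the last two labels of the hostname.
function shorthostSketch(hostname: string): string {
    return hostname.split(".").slice(-2).join(".");
}

// Returns true when the request appears to go to a different site than the tab.
function isThirdPartySketch(requestUrl: string, tabUrl: string): boolean {
    const request = new URL(requestUrl);
    const tab = new URL(tabUrl);

    // Step 1: identical origins are always first-party.
    if (request.origin === tab.origin) {
        return false;
    }
    // Step 5 (fallback): compare shortened hostnames, so that
    // cdn.example.com is not treated as foreign to www.example.com.
    return shorthostSketch(request.hostname) !== shorthostSketch(tab.hostname);
}
```

For instance, `isThirdPartySketch("https://tracker.test/p", "https://www.example.com/")` is `true`, while a request to `https://cdn.example.com/a.js` from the same tab is `false`.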

### Stolen Data Extraction Strategy
Data is extracted from multiple sources (priority order):
1. **Cookies**: Via `browser.cookies.getAll()`
2. **Query Parameters**: Parsed from URL search string
3. **Pathname**: URL path segments
4. **Headers**: Request headers (Cookie, Referer, etc.)
5. **Request Body**: POST/PUT data (form data, JSON)
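
The two URL-based sources (query parameters and pathname) are the easiest to illustrate in isolation. The sketch below uses only the standard `URL` API; `EntrySketch` and `extractFromUrlSketch` are hypothetical names, and cookies, headers, and request bodies are left out because they need browser extension APIs.

```typescript
type SourceSketch = "cookie" | "pathname" | "queryparams" | "header" | "request_body";

interface EntrySketch {
    source: SourceSketch;
    name: string;
    value: string;
}

// Collect query parameters first, then the pathname, from a request URL.
function extractFromUrlSketch(rawUrl: string): EntrySketch[] {
    const url = new URL(rawUrl);
    const entries: EntrySketch[] = [];

    // searchParams already percent-decodes names and values.
    url.searchParams.forEach((value, name) => {
        entries.push({ source: "queryparams", name, value });
    });

    // A bare "/" carries no information, so it is skipped.
    if (url.pathname !== "/") {
        entries.push({ source: "pathname", name: "pathname", value: url.pathname });
    }
    return entries;
}
```

Calling it on `https://tracker.test/pixel/123?uid=abc&ref=https%3A%2F%2Fexample.com` yields two `queryparams` entries (with `ref` decoded to `https://example.com`) followed by one `pathname` entry.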

### Value Parsing Chain
`StolenDataEntry` recursively decodes values:
1. Detect Base64 encoding → decode
2. Detect URL encoding → decode
3. Detect JSON → parse
4. Detect nested URLs → extract
5. Stop at maximum recursion depth
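
A sketch of such a decoding chain, covering Base64, URL encoding, JSON, and the depth limit; nested-URL extraction is omitted. The detection heuristics, the depth limit of 5, and all names are assumptions made for illustration and may differ from what `stolen-data-entry.ts` does.

```typescript
type DecodedSketch =
    | string | number | boolean | null
    | DecodedSketch[]
    | { [key: string]: DecodedSketch };

const MAX_DEPTH_SKETCH = 5; // assumed limit, not taken from the codebase

// Heuristic: plausible length, valid alphabet, optional padding.
function looksLikeBase64Sketch(value: string): boolean {
    return value.length >= 8 && value.length % 4 === 0 && /^[A-Za-z0-9+/]+={0,2}$/.test(value);
}

// Walk parsed JSON and decode every string found inside it.
function decodeNestedSketch(node: DecodedSketch, depth: number): DecodedSketch {
    if (typeof node === "string") return decodeValueSketch(node, depth);
    if (Array.isArray(node)) return node.map((item) => decodeNestedSketch(item, depth));
    if (node !== null && typeof node === "object") {
        const result: { [key: string]: DecodedSketch } = {};
        for (const key of Object.keys(node)) {
            result[key] = decodeNestedSketch(node[key], depth);
        }
        return result;
    }
    return node;
}

function decodeValueSketch(value: string, depth = 0): DecodedSketch {
    if (depth >= MAX_DEPTH_SKETCH) return value; // step 5: stop recursing

    if (looksLikeBase64Sketch(value)) { // step 1
        try {
            const decoded = atob(value);
            // Accept only printable ASCII, otherwise it was probably not Base64 text.
            if (/^[\x20-\x7e]+$/.test(decoded)) return decodeValueSketch(decoded, depth + 1);
        } catch { /* not valid Base64, fall through */ }
    }
    if (/%[0-9a-fA-F]{2}/.test(value)) { // step 2
        try {
            const decoded = decodeURIComponent(value);
            if (decoded !== value) return decodeValueSketch(decoded, depth + 1);
        } catch { /* malformed escape sequence, fall through */ }
    }
    if (/^[\[{]/.test(value)) { // step 3
        try {
            return decodeNestedSketch(JSON.parse(value), depth + 1);
        } catch { /* not JSON, fall through */ }
    }
    return value;
}
```

The printable-ASCII check matters because many ordinary words happen to be valid Base64; without it, a value like `password` would be replaced by binary noise.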

### Auto-Marking Rules
RequestClusters automatically mark entries as suspicious if:
- Value exposes browsing history (referrer, path info)
- Cookie length > 100 characters
- Known trackers: Google Analytics, Facebook, DoubleClick, etc.
- Classified as `id` or `history` type
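
These rules can be read as a single predicate. The sketch below assumes that any one rule is enough to mark an entry, that "exposes browsing history" can be approximated by the value containing the visited hostname, and that trackers are matched by shorthost; the host list and all names are illustrative, not taken from `request-cluster.ts`.

```typescript
type ClassificationSketch = "id" | "history" | "location";

interface MarkableEntrySketch {
    source: "cookie" | "pathname" | "queryparams" | "header" | "request_body";
    value: string;
    classification?: ClassificationSketch;
}

// Illustrative only; the real tracker list is not shown in this document.
const KNOWN_TRACKER_HOSTS_SKETCH = ["google-analytics.com", "facebook.com", "doubleclick.net"];

const LONG_COOKIE_THRESHOLD_SKETCH = 100;

function shouldAutoMarkSketch(
    entry: MarkableEntrySketch,
    shorthost: string,
    visitedHostname: string
): boolean {
    // Rule 1: the value leaks which site the user is on.
    const exposesHistory = entry.value.indexOf(visitedHostname) !== -1;
    // Rule 2: long cookies are likely to be identifiers.
    const longCookie =
        entry.source === "cookie" && entry.value.length > LONG_COOKIE_THRESHOLD_SKETCH;
    // Rule 3: the receiving host is a known tracker.
    const knownTracker = KNOWN_TRACKER_HOSTS_SKETCH.indexOf(shorthost) !== -1;
    // Rule 4: the entry was already classified as sensitive.
    const sensitiveClass = entry.classification === "id" || entry.classification === "history";

    return exposesHistory || longCookie || knownTracker || sensitiveClass;
}
```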

### Event System
- **SaferEmitter** (`safer-emitter.ts`): EventEmitter wrapper with async emission
  - Uses `setTimeout(..., 0)` to decouple events from synchronous request handling
  - Prevents errors in listeners from breaking request flow
- **useEmitter Hook** (`components/sidebar/sidebar.tsx`): React integration
  - Increments counter state on each event to trigger re-renders
  - Automatically subscribes/unsubscribes via `useEffect`
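
The deferred, error-isolating emission can be sketched as below. Unlike the real SaferEmitter, which is described as wrapping an EventEmitter, this version is a standalone class so it runs without imports; `SaferEmitterSketch` and `ListenerSketch` are hypothetical names.

```typescript
type ListenerSketch = (...args: unknown[]) => void;

class SaferEmitterSketch {
    private listeners = new Map<string, Set<ListenerSketch>>();

    on(event: string, listener: ListenerSketch): void {
        let set = this.listeners.get(event);
        if (!set) {
            set = new Set<ListenerSketch>();
            this.listeners.set(event, set);
        }
        set.add(listener);
    }

    off(event: string, listener: ListenerSketch): void {
        const set = this.listeners.get(event);
        if (set) {
            set.delete(listener);
        }
    }

    // Returns immediately; listeners run on a later tick via setTimeout(..., 0).
    emit(event: string, ...args: unknown[]): void {
        const registered = this.listeners.get(event);
        const current = registered ? Array.from(registered) : [];
        setTimeout(() => {
            for (const listener of current) {
                try {
                    listener(...args);
                } catch (error) {
                    // One failing listener must not stop the others or the caller.
                    console.error("listener failed", error);
                }
            }
        }, 0);
    }
}
```

Because `emit` only schedules work, the `webRequest` handler that triggered it returns before any UI code runs, and an exception inside a React-side listener never propagates into request handling.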

## Common Development Tasks

### Adding a New Data Source
1. Add extraction logic to `ExtendedRequest.getAllData()` in `extended-request.ts`
2. Define new source type in `StolenDataEntry.sources` in `stolen-data-entry.ts`
3. Update classification logic in `StolenDataEntry.getClassification()`
4. Update UI components to display the new source (if needed)

### Supporting a New Browser
1. Create implementation in `lib/browser-api/` (e.g., `safari.ts`)
2. Update `lib/browser-api/index.ts` to export based on `TARGET` env var
3. Add corresponding npm scripts in `package.json` (`build:safari`, etc.)
4. Test browser-specific APIs for compatibility

### Modifying Report Templates
- **Email templates**: `email-template-polite.js`, `email-template-harsh.js`
- **Problem deduction**: `components/report-window/deduce-problems.tsx`
- **Survey questions**: `components/report-window/questions.tsx`
- Templates are in Polish and follow GDPR complaint structure

### Debugging Request Interception
- Check `ExtendedRequest.by_id` registry for request lookup issues
- Verify `Memory.clusters` structure for data organization
- Use browser DevTools → Extensions → Background Page for logging
- Enable `console.log` in `memory.ts` event listeners

## Project Constraints

- **No automated tests**: Manual testing only via browser extension loading
- **Polish language**: UI and reports are Polish-focused (English i18n is future work)
- **Manifest V2**: Primary target is Firefox; Chrome V3 migration is in progress
- **External screenshot service**: Report generation depends on external API
- **No minification**: Currently disabled in esbuild config (commented out)
- **Node.js 16.x requirement**: Specified in README

## Repository Information

- **Primary Repository**: https://git.internet-czas-dzialac.pl/icd/rentgen (Gitea)
- **Mirror**: GitHub (issues not accepted there)
- **Issue Tracking**: Email kontakt@internet-czas-dzialac.pl
- **License**: GPL-3.0-or-later
- **Authors**: Kuba Orlik, Arkadiusz Wieczorek (Internet. Time to act! Foundation)

## File Organization

```
rentgen/
├── background.ts # Extension entry point
├── memory.ts # Central state manager (Memory singleton)
├── extended-request.ts # HTTP request wrapper and data extraction
├── request-cluster.ts # Request aggregation by domain
├── stolen-data-entry.ts # Individual data point representation
├── safer-emitter.ts # EventEmitter wrapper
├── util.ts # Utility functions
├── components/
│ ├── sidebar/ # Main extension UI
│ ├── toolbar/ # Browser action popup
│ └── report-window/ # Report generation workflow
├── lib/
│ └── browser-api/ # Cross-browser API abstraction
├── email-template-*.js # Polish email templates
├── esbuild.config.js # Build configuration
├── manifest.json # Extension manifest (V2)
└── assets/ # Icons and screenshots
```
45
Dockerfile
Normal file
@@ -0,0 +1,45 @@
# Rentgen Browser Extension - Docker Build
#
# Usage:
# Build and extract artifacts directly (select the artifacts stage explicitly,
# otherwise the final development stage is exported):
# docker buildx build . --target artifacts --output artifacts
#
# Or traditional build (creates full development environment):
# docker build -t rentgen .
# docker run --rm rentgen ls -lh /app/web-ext-artifacts/
#
# Run commands in the container:
# docker run --rm rentgen npm run build:chrome
# docker run --rm rentgen npm run typecheck

# Build stage
FROM node:lts AS builder

WORKDIR /app

# Copy package files for dependency installation (better layer caching)
COPY package.json package-lock.json ./

# Install dependencies
RUN npm install

# Copy source code (respecting .dockerignore)
COPY . .

# Build the extension for Firefox (default)
RUN npm run build

# Create the package
RUN npm run create-package

# Artifacts stage - only contains the built artifacts (for --output)
FROM scratch AS artifacts

# Copy only the built extension zip file to root
COPY --from=builder /app/web-ext-artifacts/*.zip /

# Default stage - full development environment
FROM builder

# Default command shows the built artifact
CMD ["ls", "-lh", "/app/web-ext-artifacts/"]