forked from icd/rentgen
chore: add Docker support and Claude Code documentation
- Added a Dockerfile with a multi-stage build (artifacts + dev environment)
- Added .dockerignore to optimize the build
- Added CLAUDE.md documenting the architecture and workflow for Claude Code

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
parent cf94d45ee1
commit 6c1df64b40
16
.dockerignore
Normal file
@@ -0,0 +1,16 @@
.log
node_modules
sidebar.js
web-ext-artifacts/
lib/*
yarn-error.log
rentgen.zip

# Generated PNG icons (build artifacts)
assets/icons/*.png
assets/icon-addon-*.png

# Exception: do not ignore the `browser-api` directory inside `lib`
!/lib/browser-api/

Dockerfile
250
CLAUDE.md
Normal file
@@ -0,0 +1,250 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

**Rentgen** is a privacy-focused browser extension for Firefox and Chrome that intercepts web traffic, identifies third-party tracking, and visualizes stolen data (cookies, browsing history, etc.). It generates GDPR-compliant reports and email templates for Polish website administrators and the Personal Data Protection Office.

**Language Note**: The codebase is in English, but the extension UI and generated reports are in Polish. Comments and documentation may be bilingual.

## Build & Development Commands

### Standard Build Workflow
```bash
npm install             # Install dependencies
npm run build           # Build for Firefox (default)
npm run build:firefox   # Build for Firefox explicitly
npm run build:chrome    # Build for Chrome
npm run create-package  # Package into web-ext-artifacts/ directory
npm run build-addon     # Complete build: install + build + package
```

### Development Workflow
```bash
npm run watch           # Watch mode - auto-rebuild on file changes
npm run watch:firefox   # Watch for Firefox
npm run watch:chrome    # Watch for Chrome
npm run ext-test        # Run extension in temporary Firefox profile (web-ext run)
```

### Quality Checks
```bash
npm run typecheck       # Run TypeScript type checking (tsc --noEmit)
npm run lint            # Lint extension with web-ext lint
```

### Testing in Browser
After building, load the temporary add-on:
1. Firefox: Navigate to `about:debugging` → This Firefox → Load Temporary Add-on
2. Chrome: Navigate to `chrome://extensions/` → Enable Developer Mode → Load unpacked

**Note**: There are no automated test suites in this codebase. Testing is manual via browser extension loading.

## Architecture Overview

### Core Pattern: Event-Driven Singleton with Observer Pattern

Rentgen uses a centralized **Memory** singleton (background script) that:
- Intercepts all HTTP requests via `webRequest` API
- Maintains hierarchical data structure: `origin → shorthost → RequestCluster → StolenDataEntry`
- Emits `'change'` events when data updates
- Drives UI re-renders in React components via custom `useEmitter` hook

### Key Components

#### Background Service (`background.ts` + `memory.ts`)
- **Memory class** (extends SaferEmitter): Central orchestrator managing all extension state
  - Listens to `webRequest.onBeforeRequest` and `webRequest.onBeforeSendHeaders`
  - Maintains `clusters` map: `origin → Map<shorthost, RequestCluster>`
  - Emits `'change'` events to notify UI components
  - Updates browser badge with domain count and color indicators
  - Accessible globally via `getMemory()` singleton
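
The nested `clusters` map can be pictured with a minimal sketch. `ClusterSketch` and `getOrCreateCluster` are hypothetical names used only for illustration and are not taken from `memory.ts`; the real `RequestCluster` carries far more state.

```typescript
// Hypothetical, simplified stand-in for the real RequestCluster class.
interface ClusterSketch {
  shorthost: string;
  requestCount: number;
}

// origin -> (shorthost -> cluster), mirroring the shape described above.
type Clusters = Map<string, Map<string, ClusterSketch>>;

// Look up the cluster for (origin, shorthost), creating both map levels on demand.
function getOrCreateCluster(
  clusters: Clusters,
  origin: string,
  shorthost: string
): ClusterSketch {
  let perOrigin = clusters.get(origin);
  if (perOrigin === undefined) {
    perOrigin = new Map<string, ClusterSketch>();
    clusters.set(origin, perOrigin);
  }
  let cluster = perOrigin.get(shorthost);
  if (cluster === undefined) {
    cluster = { shorthost: shorthost, requestCount: 0 };
    perOrigin.set(shorthost, cluster);
  }
  return cluster;
}

// Two requests to the same third party from one page share a single cluster.
const clusters: Clusters = new Map<string, Map<string, ClusterSketch>>();
getOrCreateCluster(clusters, "https://example.com", "tracker.test").requestCount += 1;
getOrCreateCluster(clusters, "https://example.com", "tracker.test").requestCount += 1;
getOrCreateCluster(clusters, "https://example.com", "ads.test").requestCount += 1;
```

Keying first by origin keeps everything observed on one visited site together, which is presumably what the sidebar needs when it lists third-party domains for the current tab.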

#### Network Interception Layer
- **ExtendedRequest class** (`extended-request.ts`): Wraps individual HTTP requests
  - Static registry: `ExtendedRequest.by_id[requestId]` for fast lookup
  - Two-phase initialization: constructor (body) + `init()` method (headers)
  - Detects third-party requests by comparing origins
  - Extracts "stolen data" from: cookies, query params, pathname, headers, request body
  - Generates HAR (HTTP Archive) format for reports
  - Calculates priority scores based on data sensitivity

- **RequestCluster class** (`request-cluster.ts`): Groups requests by `origin + shorthost`
  - Aggregates StolenDataEntry items with deduplication
  - Tracks expanded/collapsed state for UI
  - Auto-marks suspicious entries (history exposure, tracking IDs)
  - Emits `'change'` events on modifications
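
Aggregation with deduplication could look roughly like the sketch below. The deduplication key (source, name, value) and every identifier here are assumptions made for illustration; `request-cluster.ts` may well use a different key.

```typescript
// Hypothetical, reduced shape of a stolen data entry.
interface DataEntrySketch {
  source: string;
  name: string;
  value: string;
}

// Illustrative cluster that stores each distinct (source, name, value) triple once.
class RequestClusterSketch {
  readonly origin: string;
  readonly shorthost: string;
  private readonly entries = new Map<string, DataEntrySketch>();

  constructor(origin: string, shorthost: string) {
    this.origin = origin;
    this.shorthost = shorthost;
  }

  // Returns true only when the entry was not seen before, so a caller
  // could decide whether a 'change' event is worth emitting.
  add(entry: DataEntrySketch): boolean {
    const key = JSON.stringify([entry.source, entry.name, entry.value]);
    if (this.entries.has(key)) {
      return false;
    }
    this.entries.set(key, entry);
    return true;
  }

  getEntries(): DataEntrySketch[] {
    return Array.from(this.entries.values());
  }
}
```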

#### Data Classification
- **StolenDataEntry class** (`stolen-data-entry.ts`): Individual data points
  - Sources: `cookie`, `pathname`, `queryparams`, `header`, `request_body`
  - Classifications: `id` (tracking ID), `history` (browsing history), `location` (geolocation)
  - Smart value parsing: recursively decodes Base64, JSON, URLs, nested structures
  - Priority calculation: combines value length, origin exposure, data type
  - Mark/unmark system for user selection in reports

#### Browser API Abstraction (`lib/browser-api/`)
- **Cross-browser compatibility layer** selected at build time via `TARGET` env var
  - `types.ts`: Unified interface for tabs, badge, webRequest, cookies APIs
  - `index.ts`: Exports Chrome or Firefox implementation based on `process.env.TARGET`
  - Standardizes differences (e.g., `browserAction` vs `action`)
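
As a rough illustration of build-time selection, the sketch below picks an implementation from a target string. `BadgeApi`, `selectBrowserApi` and the two constant objects are hypothetical; the mapping of Firefox to `browserAction` and Chrome to `action` is an assumption based on the Manifest V2 and V3 naming. In the real build the value would come from `process.env.TARGET`; it is a plain parameter here so the sketch runs anywhere.

```typescript
// Hypothetical, heavily reduced version of the unified interface in types.ts.
interface BadgeApi {
  // Name of the browser namespace assumed to control the toolbar badge.
  readonly badgeNamespace: "browserAction" | "action";
}

const firefoxApi: BadgeApi = { badgeNamespace: "browserAction" };
const chromeApi: BadgeApi = { badgeNamespace: "action" };

// Choose an implementation for the given target; Firefox is the default,
// matching the default of `npm run build`.
function selectBrowserApi(target: string | undefined): BadgeApi {
  switch (target) {
    case "chrome":
      return chromeApi;
    case "firefox":
    case undefined:
      return firefoxApi;
    default:
      throw new Error("Unsupported TARGET: " + target);
  }
}
```

Failing loudly on an unknown target is a design choice of this sketch; it surfaces a mistyped `TARGET` at build time instead of silently shipping the wrong implementation.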

### Data Flow

```
HTTP Request Initiated
↓
Memory.onBeforeRequest → Create ExtendedRequest (capture body)
↓
Memory.onBeforeSendHeaders → ExtendedRequest.init() (capture headers)
↓
Extract stolen data (cookies, params, headers, body)
↓
Memory.register() → Check if third-party → Add to RequestCluster
↓
Memory.emit('change', shorthost) → Broadcast event
↓
React components (via useEmitter hook) → UI re-renders
```

### UI Components

#### Sidebar (`components/sidebar/`)
- **sidebar.tsx**: Main extension UI listing third-party domains
- **stolen-data.tsx**: Renders RequestClusters with filtering options
- **stolen-data-cluster.tsx**: Expandable cluster showing individual StolenDataEntry items
- Filters: `minValueLength`, `cookiesOnly`, `cookiesOrOriginOnly`
- Real-time updates via `useEmitter(Memory)` hook

#### Report Window (`components/report-window/`)
- Multi-stage report generation workflow:
  1. **Survey**: User questionnaire (role, tone, gender pronouns) via `survey-react`
  2. **Screenshot**: External service generates domain screenshots
  3. **Preview**: Final email/report content with GDPR violation analysis
- **deduce-problems.tsx**: Analyzes survey answers to identify GDPR violations
- **har-converter.tsx**: Generates filtered HAR archives
- **email-content.tsx**: Renders Polish email template (polite or harsh tone)

#### Toolbar (`components/toolbar/`)
- Browser action popup (top-right icon)

### Build System

- **esbuild** (`esbuild.config.js`): TypeScript → JavaScript bundler
  - Entry points: toolbar.tsx, sidebar.tsx, report-window.tsx, background.ts, diag.tsx, styles
  - External React libs loaded via globals (`globalThis.React`, `globalThis.ReactDOM`)
  - SCSS plugin for styling
  - Define flags: `PLUGIN_NAME`, `PLUGIN_URL`
  - Watch mode available for development

- **Target Selection**: Set `TARGET=firefox` or `TARGET=chrome` before build to select browser API implementation

- **Manifest**: Currently uses Manifest V2 (Firefox). Chrome support is partial and being expanded.

## Important Implementation Details

### Third-Party Detection Heuristics
When determining if a request is third-party (in `extended-request.ts`):
1. Compare request origin with tab origin
2. Check `documentUrl` and `originUrl` from webRequest details
3. Use `urlClassification.thirdParty` if available
4. Analyze `frameAncestors` for nested iframe scenarios
5. Fall back to comparing hostnames
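
A reduced sketch of such a check is shown below. It covers only an explicit classification flag and a hostname comparison, and the order of the checks is a simplification, not the order used in `extended-request.ts`. `RequestDetailsSketch`, `thirdPartyFlag` and `isThirdParty` are hypothetical names; `frameAncestors` handling is left out entirely.

```typescript
// Hypothetical subset of the details object passed to webRequest listeners.
interface RequestDetailsSketch {
  url: string;
  documentUrl?: string;
  originUrl?: string;
  // Stands in for a browser-supplied third-party classification, when present.
  thirdPartyFlag?: boolean;
}

// Decide whether a request goes to a different host than the page that caused it.
function isThirdParty(details: RequestDetailsSketch, tabUrl: string): boolean {
  // In this sketch an explicit classification wins over any comparison.
  if (details.thirdPartyFlag !== undefined) {
    return details.thirdPartyFlag;
  }
  // Prefer the document or origin URL of the request, then the tab URL.
  const firstPartyUrl = details.documentUrl ?? details.originUrl ?? tabUrl;
  const requestHost = new URL(details.url).hostname;
  const firstPartyHost = new URL(firstPartyUrl).hostname;
  return requestHost !== firstPartyHost;
}
```

Comparing full hostnames treats `cdn.example.com` as third-party to `example.com`; whether that is desirable depends on how the extension defines `shorthost`, which this sketch does not model.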

### Stolen Data Extraction Strategy
Data is extracted from multiple sources (priority order):
1. **Cookies**: Via `browser.cookies.getAll()`
2. **Query Parameters**: Parsed from URL search string
3. **Pathname**: URL path segments
4. **Headers**: Request headers (Cookie, Referer, etc.)
5. **Request Body**: POST/PUT data (form data, JSON)
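
The two URL-based sources can be illustrated without any browser API. The sketch below is a simplification under stated assumptions: `EntrySketch` and `extractFromUrl` are hypothetical, and naming path segments by their index is an invented convention, not the one used by `ExtendedRequest`. Cookies, headers and request bodies need the extension APIs and are omitted.

```typescript
// Source names as listed for StolenDataEntry.
type Source = "cookie" | "pathname" | "queryparams" | "header" | "request_body";

// Hypothetical, reduced entry shape.
interface EntrySketch {
  source: Source;
  name: string;
  value: string;
}

// Collect query parameters and non-empty path segments from a request URL.
function extractFromUrl(rawUrl: string): EntrySketch[] {
  const url = new URL(rawUrl);
  const entries: EntrySketch[] = [];

  // URLSearchParams already percent-decodes names and values.
  url.searchParams.forEach((value, name) => {
    entries.push({ source: "queryparams", name: name, value: value });
  });

  // Path segments have no natural name, so the index is used (an assumption).
  const segments = url.pathname.split("/").filter((segment) => segment.length > 0);
  segments.forEach((segment, index) => {
    entries.push({ source: "pathname", name: String(index), value: segment });
  });

  return entries;
}
```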

### Value Parsing Chain
`StolenDataEntry` recursively decodes values:
1. Detect Base64 encoding → decode
2. Detect URL encoding → decode
3. Detect JSON → parse
4. Detect nested URLs → extract
5. Stop at maximum recursion depth
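
A minimal sketch of such a chain follows. It covers Base64, URL encoding and JSON, and omits nested URL extraction. The detection heuristics, the `MAX_DEPTH` value and all function names are assumptions for illustration and are not taken from `stolen-data-entry.ts`.

```typescript
// Any value that can come out of JSON.parse.
type Decoded = string | number | boolean | null | Decoded[] | { [key: string]: Decoded };

// Assumed recursion limit; the real limit is not stated in this document.
const MAX_DEPTH = 5;

// Accept only strings that look like Base64 and decode to printable ASCII.
function tryDecodeBase64(text: string): string | undefined {
  if (text.length < 8 || text.length % 4 !== 0 || !/^[A-Za-z0-9+/]+={0,2}$/.test(text)) {
    return undefined;
  }
  try {
    const decoded = atob(text);
    return /^[\x20-\x7e]+$/.test(decoded) ? decoded : undefined;
  } catch (error) {
    return undefined;
  }
}

// Percent-decode only when it actually changes the string.
function tryDecodeUri(text: string): string | undefined {
  if (text.indexOf("%") === -1) {
    return undefined;
  }
  try {
    const decoded = decodeURIComponent(text);
    return decoded !== text ? decoded : undefined;
  } catch (error) {
    return undefined;
  }
}

// Parse only strings that start like a JSON object or array.
function tryParseJson(text: string): Decoded | undefined {
  const trimmed = text.trim();
  if (trimmed.charAt(0) !== "{" && trimmed.charAt(0) !== "[") {
    return undefined;
  }
  try {
    return JSON.parse(trimmed) as Decoded;
  } catch (error) {
    return undefined;
  }
}

// Apply the decoders repeatedly, descending into arrays and objects,
// and stop once MAX_DEPTH is reached.
function decodeValue(value: Decoded, depth: number = 0): Decoded {
  if (depth >= MAX_DEPTH) {
    return value;
  }
  if (typeof value === "string") {
    const next = tryDecodeBase64(value) ?? tryDecodeUri(value) ?? tryParseJson(value);
    return next === undefined ? value : decodeValue(next, depth + 1);
  }
  if (Array.isArray(value)) {
    return value.map((item) => decodeValue(item, depth + 1));
  }
  if (value !== null && typeof value === "object") {
    const result: { [key: string]: Decoded } = {};
    for (const key of Object.keys(value)) {
      result[key] = decodeValue(value[key], depth + 1);
    }
    return result;
  }
  return value;
}
```

The printable-ASCII check is there because many ordinary tokens happen to be valid Base64; without it, plain identifiers would be "decoded" into binary noise.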

### Auto-Marking Rules
RequestClusters automatically mark entries as suspicious if:
- Value exposes browsing history (referrer, path info)
- Cookie length > 100 characters
- Known trackers: Google Analytics, Facebook, DoubleClick, etc.
- Classified as `id` or `history` type
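
The rules above can be read as a single predicate, sketched here under stated assumptions. `MarkCandidate`, `shouldAutoMark` and the `KNOWN_TRACKER_HOSTS` list are hypothetical; the hostnames are illustrative examples for the named trackers, not the list used by the extension, and treating the rules as a plain OR is an interpretation of the bullet list.

```typescript
type Classification = "id" | "history" | "location";

// Hypothetical, reduced view of an entry for marking purposes.
interface MarkCandidate {
  source: "cookie" | "pathname" | "queryparams" | "header" | "request_body";
  value: string;
  classification?: Classification;
  exposesHistory: boolean;
}

// Illustrative hostnames only; the real tracker list is not shown in this document.
const KNOWN_TRACKER_HOSTS: string[] = ["google-analytics.com", "facebook.com", "doubleclick.net"];

// True when the host equals a known tracker host or is a subdomain of one.
function isKnownTracker(shorthost: string): boolean {
  return KNOWN_TRACKER_HOSTS.some((host) => {
    const suffix = "." + host;
    return shorthost === host || shorthost.slice(-suffix.length) === suffix;
  });
}

// An entry is marked when any one of the listed rules applies.
function shouldAutoMark(entry: MarkCandidate, shorthost: string): boolean {
  if (entry.exposesHistory) {
    return true;
  }
  if (entry.source === "cookie" && entry.value.length > 100) {
    return true;
  }
  if (isKnownTracker(shorthost)) {
    return true;
  }
  return entry.classification === "id" || entry.classification === "history";
}
```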

### Event System
- **SaferEmitter** (`safer-emitter.ts`): EventEmitter wrapper with async emission
  - Uses `setTimeout(..., 0)` to decouple events from synchronous request handling
  - Prevents errors in listeners from breaking request flow
- **useEmitter Hook** (`components/sidebar/sidebar.tsx`): React integration
  - Increments counter state on each event to trigger re-renders
  - Automatically subscribes/unsubscribes via `useEffect`
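
The deferred, error-isolating emission can be sketched without React. `SaferEmitterSketch` is a hypothetical class written for illustration, not the code in `safer-emitter.ts`; the injectable `Scheduler` is an addition of this sketch so the behaviour can be exercised deterministically, with `setTimeout(task, 0)` as the default.

```typescript
type Listener = (...args: unknown[]) => void;
type Scheduler = (task: () => void) => void;

class SaferEmitterSketch {
  private readonly listeners = new Map<string, Set<Listener>>();
  private readonly schedule: Scheduler;

  // By default each delivery is deferred with setTimeout(task, 0).
  constructor(schedule: Scheduler = (task) => { setTimeout(task, 0); }) {
    this.schedule = schedule;
  }

  // Subscribe and get back an unsubscribe function, the shape a
  // useEffect cleanup would need.
  on(event: string, listener: Listener): () => void {
    let registered = this.listeners.get(event);
    if (registered === undefined) {
      registered = new Set<Listener>();
      this.listeners.set(event, registered);
    }
    const target = registered;
    target.add(listener);
    return () => {
      target.delete(listener);
    };
  }

  // Deliver the event to each listener in its own deferred task, so a
  // throwing listener cannot affect the emitter's caller or other listeners.
  emit(event: string, ...args: unknown[]): void {
    const registered = this.listeners.get(event);
    if (registered === undefined) {
      return;
    }
    Array.from(registered).forEach((listener) => {
      this.schedule(() => {
        try {
          listener(...args);
        } catch (error) {
          console.error("listener for '" + event + "' failed", error);
        }
      });
    });
  }
}
```

A hook in the style of `useEmitter` would call `on('change', ...)` inside `useEffect`, bump a counter in the listener, and return the unsubscribe function as the cleanup.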

## Common Development Tasks

### Adding a New Data Source
1. Add extraction logic to `ExtendedRequest.getAllData()` in `extended-request.ts`
2. Define new source type in `StolenDataEntry.sources` in `stolen-data-entry.ts`
3. Update classification logic in `StolenDataEntry.getClassification()`
4. Update UI components to display the new source (if needed)

### Supporting a New Browser
1. Create implementation in `lib/browser-api/` (e.g., `safari.ts`)
2. Update `lib/browser-api/index.ts` to export based on `TARGET` env var
3. Add corresponding npm scripts in `package.json` (`build:safari`, etc.)
4. Test browser-specific APIs for compatibility

### Modifying Report Templates
- **Email templates**: `email-template-polite.js`, `email-template-harsh.js`
- **Problem deduction**: `components/report-window/deduce-problems.tsx`
- **Survey questions**: `components/report-window/questions.tsx`
- Templates are in Polish and follow GDPR complaint structure

### Debugging Request Interception
- Check `ExtendedRequest.by_id` registry for request lookup issues
- Verify `Memory.clusters` structure for data organization
- Use browser DevTools → Extensions → Background Page for logging
- Enable `console.log` in `memory.ts` event listeners

## Project Constraints

- **No automated tests**: Manual testing only via browser extension loading
- **Polish language**: UI and reports are Polish-focused (English i18n is future work)
- **Manifest V2**: Primary target is Firefox; Chrome V3 migration is in progress
- **External screenshot service**: Report generation depends on external API
- **No minification**: Currently disabled in esbuild config (commented out)
- **Node.js 16.x requirement**: Specified in README

## Repository Information

- **Primary Repository**: https://git.internet-czas-dzialac.pl/icd/rentgen (Gitea)
- **Mirror**: GitHub (issues not accepted there)
- **Issue Tracking**: Email kontakt@internet-czas-dzialac.pl
- **License**: GPL-3.0-or-later
- **Authors**: Kuba Orlik, Arkadiusz Wieczorek (Internet. Time to act! Foundation)

## File Organization

```
rentgen/
├── background.ts            # Extension entry point
├── memory.ts                # Central state manager (Memory singleton)
├── extended-request.ts      # HTTP request wrapper and data extraction
├── request-cluster.ts       # Request aggregation by domain
├── stolen-data-entry.ts     # Individual data point representation
├── safer-emitter.ts         # EventEmitter wrapper
├── util.ts                  # Utility functions
├── components/
│   ├── sidebar/             # Main extension UI
│   ├── toolbar/             # Browser action popup
│   └── report-window/       # Report generation workflow
├── lib/
│   └── browser-api/         # Cross-browser API abstraction
├── email-template-*.js      # Polish email templates
├── esbuild.config.js        # Build configuration
├── manifest.json            # Extension manifest (V2)
└── assets/                  # Icons and screenshots
```
45
Dockerfile
Normal file
@@ -0,0 +1,45 @@
# Rentgen Browser Extension - Docker Build
#
# Usage:
# Build and extract artifacts directly:
#   docker buildx build . --target artifacts --output artifacts
#
# Or traditional build (creates full development environment):
#   docker build -t rentgen .
#   docker run --rm rentgen ls -lh /app/web-ext-artifacts/
#
# Run commands in the container:
#   docker run --rm rentgen npm run build:chrome
#   docker run --rm rentgen npm run typecheck

# Build stage
FROM node:lts AS builder

WORKDIR /app

# Copy package files for dependency installation (better layer caching)
COPY package.json package-lock.json ./

# Install dependencies
RUN npm install

# Copy source code (respecting .dockerignore)
COPY . .

# Build the extension for Firefox (default)
RUN npm run build

# Create the package
RUN npm run create-package

# Artifacts stage - only contains the built artifacts (for --output)
FROM scratch AS artifacts

# Copy only the built extension zip file to root
COPY --from=builder /app/web-ext-artifacts/*.zip /

# Default stage - full development environment
FROM builder

# Default command shows the built artifact
CMD ["ls", "-lh", "/app/web-ext-artifacts/"]