Privacy Policy Analyzer
Welcome to the Privacy Policy Analyzer - an AI-powered tool designed to help you understand
and analyze privacy policies with ease.
π Features
- AI-Powered Analysis: Uses advanced language models to extract key information from privacy policies
- Comprehensive Scoring: Evaluates privacy policies across multiple dimensions
- Easy Integration: Simple Python API for seamless integration into your workflow
- Web Scraping: Automatically extracts privacy policy content from websites
- Caching: Intelligent caching system for improved performance
π― What It Does
The Privacy Policy Analyzer helps you:
- Extract Key Information: Automatically identify data collection practices, sharing policies,
and user rights
- Score Privacy Policies: Get quantitative scores on various privacy aspects
- Compare Policies: Analyze multiple privacy policies side by side
- Generate Reports: Create detailed analysis reports for stakeholders
πββοΈ Quick Start
Installation
# Clone the repository
git clone https://github.com/HappyHackingSpace/privacy-policy.git
cd privacy-policy
# Using uv (recommended)
uv sync
# Or using pip
pip install -e .
Basic Usage
# Analyze a homepage URL β auto-discovery will attempt to resolve a policy page
python -m src.main --url "https://example.com"
# Analyze a known policy URL directly
python -m src.main --url "https://example.com/privacy-policy" --no-discover
π Analysis Dimensions
The analyzer evaluates privacy policies across ten key dimensions.
Each dimension is scored on a 0β10 scale, then combined with weights to produce a final 0β100 overall score.
- Lawful Basis & Purpose: Whether the policy explains clear purposes for processing and, where relevant, the legal basis or justification.
- Collection & Minimization: How clearly the policy describes the types of data collected and whether collection is limited to what is necessary.
- Secondary Use & Limits: Whether the policy restricts or explains additional uses beyond the original purpose.
- Retention & Deletion: Clarity on how long data is kept, deletion practices, or criteria for determining retention.
- Third Parties & Processors: Disclosure of processors, vendors, or third parties with whom data is shared, and their roles.
- Cross-Border Transfers: Information on transfers outside the userβs country/region and safeguards in place.
- User Rights & Redress: How users can exercise rights such as access, correction, deletion, or complaint, and available escalation channels.
- Security & Breach: Security measures described and any statements about breach notification or handling.
- Transparency & Notice: Overall clarity, structure, contact details, and how users are informed of updates or changes.
- Sensitive Data, Children, Ads & Profiling: How sensitive categories are handled, rules for childrenβs data, use of data for advertising, and automated decision-making/profiling.
π§ Advanced Features
- Flexible Fetching: Choose between
auto
, http
, or selenium
modes.
- Configurable Chunking: Control
--chunk-size
, --chunk-overlap
, and --max-chunks
for long policies.
- Multiple Report Levels: Select
summary
, detailed
, or full
output.
- Model Override: Use
--model
or the OPENAI_MODEL
environment variable to select your OpenAI model.
π Documentation
π€ Contributing
We welcome contributions! Please see our Contributing Guide for details on how to get started.
π License
This project is licensed under the MIT License - see the LICENSE file for details.
π Support
Built with β€οΈ for privacy-conscious developers and organizations