# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Single-script Python tool that extracts credit card transactions from BAC Costa Rica statement PDFs. Parses sections B (purchases), D (other charges), and E (voluntary services) and outputs JSON.

## Dependencies

- pdfplumber (>=0.10.0)

## Commands

```bash
# Run tests
python testStatements/run_tests.py

# Run extractor
python bac_extract.py [options]

# Examples
python bac_extract.py EstadodeCuenta.pdf --pretty
python bac_extract.py statement.pdf -o output.json -v
```

Options:

- `-o, --output`: Output JSON path (default: transactions.json)
- `--pretty`: Pretty-print JSON
- `-v, --verbose`: Enable debug logging

## Architecture

The extraction pipeline:

1. Validates PDF is a BAC statement (`is_bac_statement`)
2. Iterates pages line-by-line, detecting section boundaries via `SECTIONS` dict patterns
3. Parses transactions matching `TRANSACTION_PATTERN` regex
4. Outputs card holders, transactions by section, and summaries

Key data structures:

- `SECTIONS`: Maps section IDs (B/D/E) to start/end regex patterns and output keys
- `SPANISH_MONTHS`: Spanish month abbreviations for date parsing

Key parsing functions:

- `parse_spanish_date`: Converts "15-ENE-25" to "2025-01-15"
- `parse_amount`: Handles "1,234.56" and trailing negatives "100.00-"
- `matches_patterns`: Generic regex pattern matcher for section detection
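
As a rough illustration of the parsing helpers listed above, the following sketch shows one way they could behave. It is not the code in `bac_extract.py`: the contents of `SPANISH_MONTHS`, the function signatures, the `float` return type, and the rule that two-digit years map to 20xx are all assumptions made for this example.

```python
import re

# Assumed mapping; the real SPANISH_MONTHS in bac_extract.py may differ.
# "SET" is included because Costa Rican usage often abbreviates setiembre that way.
SPANISH_MONTHS = {
    "ENE": 1, "FEB": 2, "MAR": 3, "ABR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AGO": 8, "SEP": 9, "SET": 9, "OCT": 10, "NOV": 11, "DIC": 12,
}


def parse_spanish_date(text):
    """Convert a date such as "15-ENE-25" to ISO format "2025-01-15".

    Assumes two-digit years belong to the 2000s.
    """
    day, month, year = text.strip().upper().split("-")
    return f"{2000 + int(year):04d}-{SPANISH_MONTHS[month]:02d}-{int(day):02d}"


def parse_amount(text):
    """Parse "1,234.56" or a trailing-negative amount such as "100.00-"."""
    cleaned = text.strip().replace(",", "")  # drop thousands separators
    negative = cleaned.endswith("-")
    if negative:
        cleaned = cleaned[:-1]
    value = float(cleaned)
    return -value if negative else value


def matches_patterns(line, patterns):
    """Return True if any regex in `patterns` matches somewhere in `line`."""
    return any(re.search(pattern, line) for pattern in patterns)


if __name__ == "__main__":
    print(parse_spanish_date("15-ENE-25"))  # 2025-01-15
    print(parse_amount("1,234.56"))         # 1234.56
    print(parse_amount("100.00-"))          # -100.0
```

The real implementation may return a different numeric type or handle malformed input differently; check `bac_extract.py` before relying on any detail shown here.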