# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Python tools for BAC Costa Rica credit card statement processing:
- `bac_extract.py`: Extracts transactions from statement PDFs to JSON
- `bac_analyze.py`: Analyzes JSON output with categorization and graphs

## Dependencies

- pdfplumber (>=0.10.0) - PDF extraction
- matplotlib (>=3.5.0) - graphs (optional, only for bac_analyze.py --graph)

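
Since matplotlib is optional, one common way to handle it is a guarded import. The snippet below is a minimal sketch of that pattern only; `load_pyplot` is a hypothetical helper, and how `bac_analyze.py` actually guards the import is not shown in this file.

```python
import importlib


def load_pyplot():
    """Return matplotlib.pyplot if matplotlib is installed, else None.

    Hypothetical helper for illustration; not taken from bac_analyze.py.
    """
    try:
        matplotlib = importlib.import_module("matplotlib")
    except ImportError:
        return None
    # Use a non-interactive backend so figures can be saved without a display.
    matplotlib.use("Agg")
    return importlib.import_module("matplotlib.pyplot")


plt = load_pyplot()
if plt is None:
    print("matplotlib is not installed; graph output is unavailable")
```

With a guard like this, text-only analysis keeps working when only pdfplumber is installed.
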
## Commands

```bash
# Run tests
python testStatements/run_tests.py

# Extract transactions from PDF
python bac_extract.py statement.pdf --pretty
python bac_extract.py statement.pdf -o output.json -v

# Analyze transactions (supports multiple JSON files)
python bac_analyze.py transactions.json
python bac_analyze.py *.json --graph all
python bac_analyze.py *.json --graph bar -o spending.png
python bac_analyze.py *.json --categories my_categories.json
```

## Architecture

### bac_extract.py

Extraction pipeline:
1. Validates PDF is a BAC statement (`is_bac_statement`)
2. Iterates pages line-by-line, detecting section boundaries via `SECTIONS` dict patterns
3. Parses transactions matching `TRANSACTION_PATTERN` regex
4. Outputs card holders, transactions by section, and summaries

Key data structures:
- `SECTIONS`: Maps section IDs (B/D/E) to start/end regex patterns and output keys
- `SPANISH_MONTHS`: Spanish month abbreviations for date parsing

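
The section-tracking and parsing steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `bac_extract.py`: the regexes, the `"purchases"` output key, the statement line layout, the month abbreviations, and the `parse_lines` helper are all invented here, and only section B is modelled.

```python
import re
from datetime import date

# Assumed stand-ins: the real SPANISH_MONTHS, SECTIONS and
# TRANSACTION_PATTERN in bac_extract.py may differ.
SPANISH_MONTHS = {
    "ENE": 1, "FEB": 2, "MAR": 3, "ABR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AGO": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DIC": 12,
}

SECTIONS = {
    "B": {
        "start": re.compile(r"^B\)"),   # hypothetical section header
        "end": re.compile(r"^Total"),   # hypothetical section footer
        "key": "purchases",             # hypothetical output key
    },
}

# Assumed line layout: "15-ENE-24 SUPERMERCADO XYZ 12,500.00"
TRANSACTION_PATTERN = re.compile(
    r"^(\d{2})-([A-Z]{3})-(\d{2})\s+(.+?)\s+(-?[\d,]+\.\d{2})$"
)


def parse_lines(lines):
    """Collect transactions per section from already-extracted text lines."""
    result = {spec["key"]: [] for spec in SECTIONS.values()}
    current = None  # spec of the section we are currently inside, if any
    for line in lines:
        line = line.strip()
        if current is None:
            # Outside any section: only look for a section start.
            for spec in SECTIONS.values():
                if spec["start"].search(line):
                    current = spec
                    break
            continue
        if current["end"].search(line):
            current = None
            continue
        match = TRANSACTION_PATTERN.match(line)
        if match and match.group(2) in SPANISH_MONTHS:
            day, month, year, description, amount = match.groups()
            parsed = date(2000 + int(year), SPANISH_MONTHS[month], int(day))
            result[current["key"]].append({
                "date": parsed.isoformat(),
                "description": description,
                "amount": float(amount.replace(",", "")),
            })
    return result


# Invented sample lines; the last one is ignored because no section is open.
sample = [
    "B) Detalle de compras",
    "15-ENE-24 SUPERMERCADO XYZ 12,500.00",
    "Total compras 12,500.00",
    "16-ENE-24 OUTSIDE ANY SECTION 1.00",
]
print(parse_lines(sample))
```

In the real tool the lines would come from the PDF, for example by splitting the text that pdfplumber's `page.extract_text()` returns for each page.
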
### bac_analyze.py

Analysis pipeline:
1. Loads transactions from one or more JSON files (purchases only)
2. Categorizes by matching description against patterns in `categories.json`
3. Aggregates by category and month, keeping CRC/USD separate
4. Outputs text summary and optional graphs (bar/pie/timeline/all)

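
The categorize-and-aggregate steps might look roughly like the sketch below. It assumes `categories.json` maps a category name to a list of substrings and that each transaction carries `date`, `description`, `amount`, and `currency` fields; the actual schemas and matching rules in `bac_analyze.py` are not documented here, so every name and value in the sketch is hypothetical.

```python
from collections import defaultdict

# Hypothetical categories.json content: category name -> substrings to match.
CATEGORIES = {
    "groceries": ["SUPERMERCADO", "PANADERIA"],
    "fuel": ["GASOLINERA"],
}


def categorize(description, categories):
    """Return the first category whose pattern occurs in the description."""
    upper = description.upper()
    for name, patterns in categories.items():
        if any(pattern.upper() in upper for pattern in patterns):
            return name
    return "uncategorized"


def aggregate(transactions, categories):
    """Sum amounts per (currency, month, category) so currencies never mix."""
    totals = defaultdict(float)
    for tx in transactions:
        month = tx["date"][:7]  # "YYYY-MM" prefix of an ISO date string
        category = categorize(tx["description"], categories)
        totals[(tx["currency"], month, category)] += tx["amount"]
    return dict(totals)


# Invented sample transactions for illustration only.
transactions = [
    {"date": "2024-01-15", "description": "Supermercado XYZ",
     "amount": 12500.0, "currency": "CRC"},
    {"date": "2024-01-20", "description": "GASOLINERA ABC",
     "amount": 20.0, "currency": "USD"},
    {"date": "2024-01-28", "description": "SUPERMERCADO XYZ",
     "amount": 7500.0, "currency": "CRC"},
]
for key, total in sorted(aggregate(transactions, CATEGORIES).items()):
    print(key, total)
```

Keeping the currency in the aggregation key is what prevents CRC and USD amounts from being summed together.
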