bac_tools/CLAUDE.md
2026-03-09 13:24:41 -06:00

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Single-script Python tool that extracts credit card transactions from BAC Costa Rica statement PDFs. It parses section B, "Detalle de compras del periodo" (purchase details for the period), and writes the transactions as JSON.
## Dependencies
- pdfplumber (>=0.10.0)
## Usage
```bash
python bac_extract.py <pdf_file> <card_suffix> [options]
# Examples
python bac_extract.py EstadodeCuenta.pdf 1234 --pretty
python bac_extract.py statement.pdf 1234 -o output.json -v
```
Options:
- `-o, --output`: Output JSON path (default: transactions.json)
- `--pretty`: Pretty-print JSON
- `-v, --verbose`: Enable debug logging
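
The positional arguments and options above could be declared with `argparse` roughly as follows. This is a minimal sketch based only on the usage text in this file; the helper name `build_parser` and the help strings are assumptions, and the actual script may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented CLI (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="bac_extract.py",
        description="Extract credit card transactions from a BAC statement PDF.",
    )
    # Positional arguments, in the order shown in the usage line.
    parser.add_argument("pdf_file", help="Path to the statement PDF")
    parser.add_argument("card_suffix", help="Last 4 digits of the card")
    # Optional flags, with the documented default for the output path.
    parser.add_argument("-o", "--output", default="transactions.json",
                        help="Output JSON path")
    parser.add_argument("--pretty", action="store_true",
                        help="Pretty-print JSON")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable debug logging")
    return parser


# Parse the second documented example without touching sys.argv.
args = build_parser().parse_args(["statement.pdf", "1234", "-o", "output.json", "-v"])
print(args.pdf_file, args.card_suffix, args.output, args.pretty, args.verbose)
```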
## Architecture
The extraction pipeline:
1. Validates PDF is a BAC statement (`is_bac_statement`)
2. Locates section B via regex patterns (`find_section_b_start`, `is_section_end`)
3. Extracts tables page-by-page using pdfplumber
4. Filters transactions by card suffix (last 4 digits)
5. Parses Spanish dates (D-MMM-YY format, e.g. "15-ENE-25") and amounts that use commas as thousands separators
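
Step 2 can be illustrated on plain text lines. The sketch below reuses the function names `find_section_b_start` and `is_section_end` from this file, but their signatures, both regex patterns, and the helper `section_b_lines` are assumptions made for illustration; in particular, treating the next lettered heading as the end of section B is a guess about the statement layout.

```python
import re

# Assumed patterns; the real script's regexes may differ.
SECTION_B_START = re.compile(r"B\)\s*Detalle de compras del periodo", re.IGNORECASE)
# Assumption: section B ends at the next lettered heading such as "C) ...".
SECTION_END = re.compile(r"^\s*[C-Z]\)\s+\S")


def find_section_b_start(lines):
    """Return the index of the section B heading, or None if it is absent."""
    for index, line in enumerate(lines):
        if SECTION_B_START.search(line):
            return index
    return None


def is_section_end(line):
    """Return True if the line looks like the heading of a later section."""
    return bool(SECTION_END.match(line))


def section_b_lines(lines):
    """Collect the lines between the section B heading and the next heading."""
    start = find_section_b_start(lines)
    if start is None:
        return []
    body = []
    for line in lines[start + 1:]:
        if is_section_end(line):
            break
        body.append(line)
    return body


# Made-up sample text, not taken from a real statement.
sample = [
    "A) Resumen",
    "B) Detalle de compras del periodo",
    "15-ENE-25 SUPERMERCADO 1,234.56",
    "C) Otros cargos",
]
print(section_b_lines(sample))
```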
Key parsing functions:
- `parse_spanish_date`: Converts "15-ENE-25" to "2025-01-15"
- `parse_amount`: Handles thousands separators ("1,234.56") and trailing minus signs ("100.00-")
- `extract_card_holder`: Matches "************1234 NAME" pattern
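
The behavior described for these three functions can be sketched as follows. The function names and the example inputs come from this file; the return types, the month table (including the Costa Rican "SET" spelling for September), the assumption that two-digit years fall in the 2000s, and the card-holder regex are all illustrative guesses rather than the script's actual code.

```python
import re
from datetime import date
from decimal import Decimal

# Assumed abbreviations; "SET" (setiembre) is included alongside "SEP".
SPANISH_MONTHS = {
    "ENE": 1, "FEB": 2, "MAR": 3, "ABR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AGO": 8, "SEP": 9, "SET": 9, "OCT": 10, "NOV": 11, "DIC": 12,
}

# Assumed pattern: a run of asterisks, the last 4 digits, then the holder name.
CARD_HOLDER = re.compile(r"\*+(\d{4})\s+(.+)")


def parse_spanish_date(text):
    """Convert a D-MMM-YY date such as "15-ENE-25" to ISO format."""
    day, month, year = text.strip().upper().split("-")
    # Assumption: two-digit years belong to the 2000s.
    return date(2000 + int(year), SPANISH_MONTHS[month], int(day)).isoformat()


def parse_amount(text):
    """Parse "1,234.56" or "100.00-" into a Decimal (negative if trailing "-")."""
    cleaned = text.strip().replace(",", "")
    negative = cleaned.endswith("-")
    if negative:
        cleaned = cleaned[:-1]
    value = Decimal(cleaned)
    return -value if negative else value


def extract_card_holder(line):
    """Return (card_suffix, holder_name), or None if the line does not match."""
    match = CARD_HOLDER.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


print(parse_spanish_date("15-ENE-25"))  # 2025-01-15
print(parse_amount("1,234.56"), parse_amount("100.00-"))
print(extract_card_holder("************1234 NAME"))
```

`Decimal` is used here to avoid floating-point rounding on currency values; the real script may well return floats instead.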