CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Single-script Python tool that extracts credit card transactions from BAC Costa Rica statement PDFs. Parses section "B) Detalle de compras del periodo" and outputs JSON.
Dependencies
- pdfplumber (>=0.10.0)
Usage
python bac_extract.py <pdf_file> <card_suffix> [options]
# Examples
python bac_extract.py EstadodeCuenta.pdf 1234 --pretty
python bac_extract.py statement.pdf 1234 -o output.json -v
Options:
- -o, --output: Output JSON path (default: transactions.json)
- --pretty: Pretty-print JSON
- -v, --verbose: Enable debug logging
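The usage line and options above could be wired up with `argparse` roughly as follows. This is a minimal sketch, not the script's actual parser: the function name `build_parser` and the help strings are assumptions, and only the positional arguments, flags, and default shown above come from this file.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser matching the documented usage line (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="bac_extract.py",
        description="Extract credit card transactions from a BAC statement PDF.",
    )
    parser.add_argument("pdf_file", help="Path to the statement PDF")
    parser.add_argument("card_suffix", help="Last 4 digits of the card")
    parser.add_argument("-o", "--output", default="transactions.json",
                        help="Output JSON path")
    parser.add_argument("--pretty", action="store_true",
                        help="Pretty-print JSON")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable debug logging")
    return parser


# Equivalent to: python bac_extract.py statement.pdf 1234 -o output.json -v
args = build_parser().parse_args(["statement.pdf", "1234", "-o", "output.json", "-v"])
print(args.pdf_file, args.card_suffix, args.output, args.pretty, args.verbose)
```

Keeping `card_suffix` as a string (rather than `int`) preserves leading zeros in suffixes such as `0042`.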
Architecture
The extraction pipeline:
- Validates that the PDF is a BAC statement (is_bac_statement)
- Locates section B via regex patterns (find_section_b_start, is_section_end)
- Extracts tables page-by-page using pdfplumber
- Filters transactions by card suffix (last 4 digits)
- Parses Spanish dates (D-MMM-YY format) and amounts with comma separators
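The section-location and card-filtering steps can be illustrated on plain text lines, without a PDF. The sketch below is a simplification under stated assumptions: the real script works on tables extracted by pdfplumber, and the regexes, the helper name `lines_for_card`, the idea that section B ends at the next lettered heading, and the sample lines are all hypothetical. Only the section title and the masked card pattern are taken from this file.

```python
import re

# Assumed patterns; the script's real regexes may differ.
SECTION_B_START = re.compile(r"B\)\s*Detalle de compras del periodo", re.IGNORECASE)
SECTION_END = re.compile(r"^\s*[C-Z]\)\s+\S")  # hypothetical: next lettered section
CARD_HEADER = re.compile(r"\*{12}(\d{4})\s+(.+)")  # "************1234 NAME"


def lines_for_card(lines, card_suffix):
    """Return the section B lines listed under the given card's header line."""
    in_section = False
    current_card = None
    kept = []
    for line in lines:
        if not in_section:
            # Skip everything until the section B heading is seen.
            in_section = bool(SECTION_B_START.search(line))
            continue
        if SECTION_END.match(line):
            break  # Reached the next section; stop collecting.
        header = CARD_HEADER.search(line)
        if header:
            current_card = header.group(1)  # Switch to this card's block.
            continue
        if current_card == card_suffix and line.strip():
            kept.append(line.strip())
    return kept


# Made-up sample lines, for illustration only.
sample = [
    "A) Resumen",
    "B) Detalle de compras del periodo",
    "************1234 JUAN PEREZ",
    "15-ENE-25 SUPERMERCADO 1,234.56",
    "************5678 ANA MORA",
    "16-ENE-25 FARMACIA 100.00-",
    "C) Otros cargos",
    "17-ENE-25 IGNORED 1.00",
]
print(lines_for_card(sample, "1234"))
```

Tracking the most recent card header, instead of matching the suffix on every row, assumes that transactions are grouped under a per-card header line in the statement.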
Key parsing functions:
- parse_spanish_date: Converts "15-ENE-25" to "2025-01-15"
- parse_amount: Handles "1,234.56" and trailing negatives "100.00-"
- extract_card_holder: Matches "************1234 NAME" pattern
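One possible shape for these three functions is sketched below. The function names and the example inputs and outputs come from the list above; the bodies, the return types, the month table (including accepting both "SEP" and "SET" for September), and the assumption that two-digit years fall in the 2000s are illustrative guesses rather than the script's actual implementation.

```python
import re
from datetime import date

# Spanish three-letter month abbreviations. Accepting both "SEP" and "SET"
# is an assumption, since "setiembre" is common in Costa Rica.
SPANISH_MONTHS = {
    "ENE": 1, "FEB": 2, "MAR": 3, "ABR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AGO": 8, "SEP": 9, "SET": 9, "OCT": 10, "NOV": 11, "DIC": 12,
}


def parse_spanish_date(text: str) -> str:
    """Convert 'D-MMM-YY' (e.g. '15-ENE-25') to ISO 'YYYY-MM-DD'."""
    day, month, year = text.strip().upper().split("-")
    # Assumption: two-digit years belong to the 2000s.
    return date(2000 + int(year), SPANISH_MONTHS[month], int(day)).isoformat()


def parse_amount(text: str) -> float:
    """Parse '1,234.56' or a trailing-negative '100.00-' into a float."""
    cleaned = text.strip().replace(",", "")  # Drop thousands separators.
    negative = cleaned.endswith("-")
    value = float(cleaned.rstrip("-"))
    return -value if negative else value


def extract_card_holder(line: str):
    """Return (suffix, name) from '************1234 NAME', or None if no match."""
    match = re.search(r"\*{12}(\d{4})\s+(.+)", line)
    return (match.group(1), match.group(2).strip()) if match else None


print(parse_spanish_date("15-ENE-25"))   # 2025-01-15
print(parse_amount("100.00-"))           # -100.0
print(extract_card_holder("************1234 NAME"))  # ('1234', 'NAME')
```

If exact cent arithmetic matters downstream, `decimal.Decimal` would be a safer return type for `parse_amount` than `float`.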