CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

Single-script Python tool that extracts credit card transactions from BAC Costa Rica statement PDFs. Parses section "B) Detalle de compras del periodo" and outputs JSON.

Dependencies

  • pdfplumber (>=0.10.0)

Usage

python bac_extract.py <pdf_file> [options]

# Examples
python bac_extract.py EstadodeCuenta.pdf --pretty
python bac_extract.py statement.pdf -o output.json -v

Options:

  • -o, --output: Output JSON path (default: transactions.json)
  • --pretty: Pretty-print JSON
  • -v, --verbose: Enable debug logging

Architecture

The extraction pipeline:

  1. Validates that the PDF is a BAC statement (is_bac_statement)
  2. Locates section B via regex patterns (find_section_b_start, is_section_end)
  3. Extracts tables page-by-page using pdfplumber
  4. Parses Spanish dates (D-MMM-YY format) and amounts that use commas as thousands separators
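
Step 2 of the pipeline above can be sketched on plain text lines, without pdfplumber. The start pattern is built from the section title quoted in the overview. The end pattern (any later lettered heading such as "C) ...") and the helper name section_b_lines are assumptions for illustration, and the sample lines are made up; the real find_section_b_start and is_section_end may work differently.

```python
import re

# Start of section B, based on the title quoted in the overview.
# Assumption: the PDF text may spell "periodo" with or without an accent.
SECTION_B_START = re.compile(
    r"^\s*B\)\s*Detalle\s+de\s+compras\s+del\s+per[ií]odo", re.IGNORECASE
)
# Assumption: the section ends at the next lettered heading ("C) ...", "D) ...").
SECTION_END = re.compile(r"^\s*[C-Z]\)\s+\S")


def section_b_lines(lines):
    """Return the lines strictly between the section B heading and the next heading."""
    inside = False
    collected = []
    for line in lines:
        if not inside:
            # Skip everything until the section B heading is seen.
            if SECTION_B_START.search(line):
                inside = True
            continue
        if SECTION_END.search(line):
            break
        collected.append(line)
    return collected


# Made-up sample text, for illustration only.
sample = [
    "A) Resumen",
    "B) Detalle de compras del periodo",
    "15-ENE-25 SUPERMERCADO 1,234.56",
    "C) Otros cargos",
    "ignored",
]
print(section_b_lines(sample))
```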

Key parsing functions:

  • parse_spanish_date: Converts "15-ENE-25" to "2025-01-15"
  • parse_amount: Handles "1,234.56" and trailing negatives "100.00-"
  • extract_card_holder: Matches "************1234 NAME" pattern
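
The three parsing functions above could look roughly like the following. This is a sketch built only from the examples in the list, not the repository's code. The signatures, the return types (ISO date string, Decimal, and a (last four digits, name) tuple), the mapping of two-digit years to 20YY, and the acceptance of "SET" for September are all assumptions.

```python
import re
from datetime import date
from decimal import Decimal

# Spanish month abbreviations. "SET" (setiembre) is included as an assumption,
# since that spelling is common in Costa Rica.
SPANISH_MONTHS = {
    "ENE": 1, "FEB": 2, "MAR": 3, "ABR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AGO": 8, "SEP": 9, "SET": 9, "OCT": 10, "NOV": 11, "DIC": 12,
}

# Masked card number (four or more asterisks, then four digits), then the name.
CARD_HOLDER = re.compile(r"\*{4,}(\d{4})\s+(\S.*)")


def parse_spanish_date(text: str) -> str:
    """Convert a D-MMM-YY date such as "15-ENE-25" to ISO format."""
    day, month, year = text.strip().upper().split("-")
    # Assumption: two-digit years belong to the 2000s.
    return date(2000 + int(year), SPANISH_MONTHS[month], int(day)).isoformat()


def parse_amount(text: str) -> Decimal:
    """Parse "1,234.56" or a trailing-negative amount such as "100.00-"."""
    cleaned = text.strip().replace(",", "")
    negative = cleaned.endswith("-")
    if negative:
        cleaned = cleaned[:-1]
    value = Decimal(cleaned)
    return -value if negative else value


def extract_card_holder(text: str):
    """Return (last four digits, holder name), or None if the pattern is absent."""
    match = CARD_HOLDER.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


print(parse_spanish_date("15-ENE-25"))
print(parse_amount("1,234.56"), parse_amount("100.00-"))
print(extract_card_holder("************1234 NAME"))
```

Decimal is used here to avoid floating-point rounding on currency values; the actual tool may return floats instead.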