Ingesting PDFs and why Gemini 2.0 changes everything
In 2021 I built a side project for my dad. The gist:
- Crawl a few thousand records from a county site (very outdated, very painful)
- Extract the same number of PDFs
- Get a specific set of data that could be on page 1 through 20, or not at all. Usually page 1 or 2
It took me roughly 2 weeks to get the crawler working, then 1-2 weeks asking my wife (thank you) to review the 200 most promising PDFs.
Last week that project became relevant again, and, of course, they'd updated the site (barely) and the crawler code was no longer valid.
This time it took ~3 hours with Cursor to get the crawler working. Then I tested Gemini on reading the PDFs. It... basically just worked. I spent 5 minutes tuning the prompt to pull additional data points I hadn't extracted 4 years ago and, again, it just worked. After an hour (and a small wait for a quota increase) it had processed ALL 2,000 records.
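For anyone curious what "Gemini reads the PDF" might look like in code, here is a minimal sketch, not the code from this project. It assumes the `google-genai` Python SDK, the model name `gemini-2.0-flash`, and an API key in a `GEMINI_API_KEY` environment variable. The field names in the prompt (`owner_name`, `parcel_id`) are hypothetical, since the actual data points aren't listed here.

```python
import json
import os
from pathlib import Path

FENCE = "`" * 3  # Markdown code-fence marker that models sometimes wrap JSON in

# Hypothetical field names, for illustration only.
PROMPT = (
    "Extract the following fields from this PDF and reply with JSON only: "
    '{"owner_name": string or null, "parcel_id": string or null}. '
    "Use null for any field that does not appear in the document."
)


def parse_fields(reply_text: str) -> dict:
    """Parse the model's reply, tolerating a Markdown fence around the JSON."""
    text = reply_text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (possibly tagged "json") and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    return json.loads(text)


def extract_from_pdf(pdf_path: Path, model: str = "gemini-2.0-flash") -> dict:
    """Send one PDF plus the prompt to Gemini and return the parsed fields."""
    # Imported lazily so parse_fields works even without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=model,
        contents=[
            types.Part.from_bytes(
                data=pdf_path.read_bytes(), mime_type="application/pdf"
            ),
            PROMPT,
        ],
    )
    return parse_fields(response.text)
```

Looping `extract_from_pdf` over a folder of downloaded PDFs is the whole pipeline in this sketch. In practice you would likely add retries and rate limiting, since quota limits were part of the story here.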
4 weeks of work in 4 hours.
Google is trying to win over enterprises by targeting legacy workflows (like PDF extraction) and competing on speed and cost. Gemini is incredibly cheap (6,000 pages / $1, per the original article), accurate, and fast.
Is it the smartest at reasoning? Not yet; o3 is safe there.
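To put that price in context, here is a back-of-the-envelope estimate for a job of this size. It uses the 6,000 pages per dollar figure quoted above and the 2,000 records from this project. The pages-per-PDF range of 1 to 20 is an assumption based on where the data could appear, not a measured number.

```python
PAGES_PER_DOLLAR = 6_000  # figure quoted from the original article
RECORDS = 2_000           # one PDF per record


def estimated_cost(records: int, pages_per_record: int,
                   pages_per_dollar: int = PAGES_PER_DOLLAR) -> float:
    """Estimated cost in dollars for processing every page of every PDF."""
    return records * pages_per_record / pages_per_dollar


# Assumed bounds: every PDF is 1 page long vs. every PDF is 20 pages long.
low = estimated_cost(RECORDS, 1)
high = estimated_cost(RECORDS, 20)
print(f"${low:.2f} to ${high:.2f}")  # $0.33 to $6.67
```

Even at the assumed upper bound, the whole batch costs less than a lunch.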