Portable Document Format (PDF) files are ubiquitous in our digital world. We use them for everything from sharing documents to filling out forms online. But working with PDFs isn’t always easy. That’s where artificial intelligence comes in.
We’ll be looking at how well the AI assistant Claude 2 handles PDF-related tasks. Claude 2, created by Anthropic, is designed to be helpful, harmless, and honest. We’ll put it through its paces on some common PDF actions to see if it lives up to these ideals when working with this important file format.
It’s essential to put AI assistants to the test in real-world scenarios. With AI becoming such a big part of our lives, it’s crucial to know what they excel at and where they might fall short. Claude seems to be emerging as a robust contender, possibly on par with models like GPT-4. We’re confident that users who understand these technologies will play a significant role in their successful adoption.
So join us as we explore whether Claude 2 can make working with PDFs easier or if its skills still need improvement. The results may surprise you.
Claude 2 stands out from other AI assistants for its built-in ability to analyze and work with PDF files. The researchers at Anthropic designed Claude 2 to parse and understand the structure of PDF documents using machine-learning techniques. This gives Claude 2 an inherent advantage in processing PDFs compared to other chatbots that would struggle to make sense of them. As one of the first AI models with a dedicated PDF analyzer component, Claude 2 is well positioned to excel at PDF-related tasks. In this post, we’ll examine how that specialized engineering translates to real-world proficiency with this ubiquitous document format, and whether Claude 2 can deliver on its promise of helpful, harmless, and honest assistance when working with PDFs. One practical limit to note before we start: the maximum file size is 10MB.
For the sake of our tutorial, we are going to use a Python tutorial PDF and see what we can get from it.
Let’s start by asking it questions whose answers we know are in the document. We will give it the prompt “What are Formatted String Literals?”. The answer sits fairly deep in the PDF, so this also tests whether Claude can only answer from text near the beginning of a document or from later sections as well.
Here is the result. The answers are pretty concise in the PDF.
Now, let’s try to get direct quotes from the file. Our prompt will now be “What are Function Annotations? Give me a quote from the document”.
Here is the result.
We checked the output against the PDF, and it is indeed a direct quote! Claude was even able to display the code snippet.
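If you want to confirm a quote like this yourself rather than by eye, a minimal sketch might look like the following. It assumes you have already extracted the PDF text into a string by some other means; the `document_text` sample and the helper names `normalize` and `is_direct_quote` are hypothetical and are not part of Claude or of any PDF library.

```python
import re


def normalize(text: str) -> str:
    """Straighten curly quotes and collapse whitespace so PDF line breaks do not cause false mismatches."""
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip()


def is_direct_quote(quote: str, document_text: str) -> bool:
    """Return True if the quote appears verbatim in the document, ignoring whitespace and quote style."""
    return normalize(quote) in normalize(document_text)


# Hypothetical stand-in for text extracted from the PDF (note the line break).
document_text = """Function annotations are completely optional metadata
information about the types used by user-defined functions."""

print(is_direct_quote("optional metadata information", document_text))
print(is_direct_quote("mandatory metadata", document_text))
```

A check like this only catches exact matches after normalization, so a paraphrase that the model presents as a quote would correctly be flagged as not verbatim.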
Now, we will try financial documents. We will upload Microsoft’s most recent quarterly report and prompt Claude with “According to the document. What was Microsoft’s Total revenue for the quarter?” Here is the screenshot of our results.
We can see that Claude provided the revenue figure we asked for, and it matched the report when we fact-checked it. Claude even pinpointed the exact page where this information could be found, and that was correct too.
Then we prompted it with, “What was the percentage change in revenue from last year?”. I wanted to see if it could do some analysis.
To my surprise, it was able to figure it out, and it gave the page number for the result as well. I didn’t even know this figure was in the document. I had expected it to take last year’s Q3 revenue and this year’s, then do the math to calculate the percentage difference itself.
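For reference, the calculation I had expected Claude to perform is simple enough to check by hand. Here is a minimal sketch of it; the function name `percent_change` is my own, and the two figures are hypothetical placeholders, not numbers from the Microsoft report.

```python
def percent_change(current: float, previous: float) -> float:
    """Year-over-year change as a percentage of the previous value."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# Hypothetical placeholder figures, NOT taken from the Microsoft report.
this_year_revenue = 110.0
last_year_revenue = 100.0

print(f"{percent_change(this_year_revenue, last_year_revenue):.1f}%")  # prints "10.0%"
```

Plugging in the two revenue figures from the report is a quick way to double-check any percentage the model reports.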
Claude currently imposes limitations on the number of requests you can make and may even have a waiting list for access. With these constraints in mind, it’s worthwhile to explore some alternative options.
Perplexity AI is a strong tool for question answering over documents. Users can upload files in plain text, code, or PDF format, and Perplexity will use the file contents to formulate answers. For short files, the language model analyzes the whole document. Long PDFs can also be chunked manually into topic areas and fed to GPT-4 for creative writing. Perplexity can analyze PDFs to answer questions directly from the documents, provide source citations for the answers it gives, compare and contrast research papers, find related documents or papers based on a query, analyze data and generate insights from various sources, visualize data and create graphics from various sources, and translate text from one language to another. A free account is limited to a certain number of requests. If you want Unlimited File Upload, you will need to subscribe for $20 per month.
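To give a feel for what chunking a long document involves, here is a minimal sketch that splits already-extracted text into overlapping word windows. This is not how Perplexity does it internally, which I don’t know; the function name `chunk_text` and the `max_words` and `overlap` parameters are assumptions made for illustration.

```python
def chunk_text(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of at most max_words words, repeating `overlap` words between neighbors."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# Tiny illustrative input; a real document would be far longer.
sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_text(sample, max_words=4, overlap=1):
    print(chunk)
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of sending a little duplicated text to the model.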
OpenAI has announced PDF analysis as a new ChatGPT feature in its latest update for ChatGPT Plus subscribers. This feature allows users to upload PDF files and other documents, which ChatGPT can then analyze. The chatbot can extract summaries and various data points, or even generate graphs and charts based on that data. The update also includes automatic tool switching, which allows ChatGPT to guess which tool users want based on context. The new features are currently in beta and have been available to ChatGPT Plus customers since October 2023.
Last but certainly not least, open-source solutions provide a compelling alternative. A plethora of open-source tools are available for PDF analysis, built on technologies such as LangChain or the Python data science stack and often integrated with vector databases. It’s worth noting that an open-source vector database option like pgvector can be significantly more cost-effective than commercial services like Pinecone. Beyond that, the open-source community on platforms like GitHub provides a wealth of accessible and customizable models to meet your PDF analysis needs.
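To show the basic idea behind these vector-based pipelines, here is a toy sketch using only the Python standard library. It is not LangChain or pgvector code: real setups use learned embeddings and a vector database, whereas this stand-in uses plain word counts and cosine similarity. The sample chunks and the names `embed`, `cosine`, and `top_chunks` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a learned embedding vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(question: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical chunks standing in for text extracted from several PDFs.
chunks = [
    "Formatted string literals let you embed expressions inside a string.",
    "Function annotations are optional metadata about types.",
    "Total revenue for the quarter is reported on the income statement.",
]
print(top_chunks("What was total revenue for the quarter?", chunks))
```

In a fuller pipeline, the retrieved chunks would be passed to a language model along with the question, and the similarity search would run inside the vector database instead of in a Python loop.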
I was really excited when we first tested out Claude’s PDF analyzer. The early results looked great. But you know how it goes with AI models: they’re not perfect. There were definitely some mistakes here and there. When I first started chatting with Claude about PDFs, it would get confused pretty often. But I’ve been continually impressed with how much better it’s gotten. Errors are way down compared to before, and in recent sessions I’ve seen few, if any.
It’s really promising to see this kind of improvement over time. I’m not saying it’s ready to replace human expertise just yet; obviously, you’d want to double-check things. We still need to keep an eye out for any potential issues. But I’m optimistic about where Claude’s PDF abilities are headed. This could end up being an incredibly useful tool. Of course, there are a lot of options out there, but this is a great one.