How to read content from a PDF document in Laravel 8

Sometimes we need to read the contents of PDF documents and store them in our database. In this blog post, we will see how to extract the text from a PDF document and store it.

I am going to use the smalot/pdfparser package to read the content. First, we will create the File model together with its migration and controller:

php artisan make:model File -mc

It will create 3 files: 1 - Migration file 2 - File model 3 - FileController

In the migration file, add the fields below:

Schema::create('files', function (Blueprint $table) {
    $table->id();
    $table->string('orig_filename', 100)->nullable();
    $table->string('mime_type', 50)->nullable();
    $table->bigInteger('filesize');
    $table->text('content')->nullable();
    $table->timestamps();
});
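One thing to keep in mind about the content column: on MySQL, a text column holds at most 65,535 bytes, and the text of a long PDF may be larger than that. One option is to declare the column with longText instead. Another is to trim the text before saving it. Below is a minimal sketch of the second option. The function name fitTextColumn and its default limit are my own assumptions for illustration; they are not part of Laravel or smalot/pdfparser.

```php
<?php

// Hypothetical helper, not part of Laravel or smalot/pdfparser.
// 65535 is the byte limit of a MySQL TEXT column; adjust it for other databases.
function fitTextColumn(string $text, int $maxBytes = 65535): string
{
    // Collapse runs of spaces and tabs that text extraction may leave behind.
    $clean = preg_replace('/[ \t]+/', ' ', $text) ?? $text;
    $clean = trim($clean);

    // mb_strcut() cuts by bytes but does not split a multi-byte UTF-8 character.
    return mb_strcut($clean, 0, $maxBytes, 'UTF-8');
}
```

It could be called on the extracted text just before it is assigned to the content field of the model.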

Now we will install the parser package using Composer. Run this command:

composer require smalot/pdfparser

The last thing we need is a Blade file, file.blade.php, to upload the PDF file. I am using Tailwind CSS.

<x-app-layout>
    <x-slot name="header">
        <h2 class="font-semibold text-xl text-gray-800 leading-tight">
            {{ __('Upload File') }}
        </h2>
    </x-slot>
    {{-- ex using component x-component_name --}}
    @if (Session::has('success'))
        <x-success>
            {{ session()->get('success') }}
        </x-success>
    @endif
    @if (Session::has('error'))
        <x-error>
            {{ session()->get('error') }}
        </x-error>
    @endif

    <div class="py-12">
        <div class="max-w-5xl mx-auto sm:px-6 lg:px-8">
            <div class="bg-white overflow-hidden shadow-sm sm:rounded-lg">
                <div class="p-6 bg-white border-b border-gray-200">
                    <form action="{{ route('file.store') }}" enctype="multipart/form-data"
                        method="POST">
                        @csrf
                        <div class="mb-2"> <span>Attachments</span>
                            <div
                                class="relative h-40 rounded-lg border-dashed border-2 border-gray-200 bg-white flex justify-center items-center hover:cursor-pointer">
                                <div class="absolute">
                                    <div class="flex flex-col items-center "> <i
                                            class="fa fa-cloud-upload fa-3x text-gray-200"></i>
                                        <span class="block text-gray-400 font-normal">Attach
                                            your files here</span> <span
                                            class="block text-gray-400 font-normal">or</span>
                                        <span class="block text-blue-400 font-normal">Browse
                                            files</span>
                                    </div>
                                </div>
                                <input type="file" class="h-full w-full opacity-0" name="file" >
                            </div>
                        </div>
                        <div class="mt-3 text-center pb-3">
                            <button type="submit"
                                class="w-full h-12 text-lg w-32 bg-blue-600 rounded text-white hover:bg-blue-700">
                                Save
                            </button>
                        </div>
                    </form>
                </div>
            </div>
        </div>
    </div>
</x-app-layout>

Next, register the routes in routes/web.php:

use App\Http\Controllers\FileController;
Route::get('file', [FileController::class, 'index'])->name('file');
Route::post('file', [FileController::class, 'store'])->name('file.store');

Our migration and Blade file are now ready. Next, let's look at the actual logic in our FileController.

<?php

namespace App\Http\Controllers;

use App\Models\File;
use Illuminate\Http\Request;
use Smalot\PdfParser\Parser;

class FileController extends Controller
{
    // Show the upload form.
    public function index()
    {
        return view('file');
    }

    // Validate the upload, extract its text and store it in the files table.
    public function store(Request $request)
    {
        // Validate first, so we never work with a missing or non-PDF upload.
        $request->validate([
            'file' => 'required|mimes:pdf',
        ]);

        $file = $request->file('file');
        $fileName = $file->getClientOriginalName();

        // Use the PDF parser to read the text content of the uploaded file.
        $pdfParser = new Parser();
        $pdf = $pdfParser->parseFile($file->path());
        $content = $pdf->getText();

        $upload_file = new File;
        $upload_file->orig_filename = $fileName;
        $upload_file->mime_type = $file->getMimeType();
        $upload_file->filesize = $file->getSize();
        $upload_file->content = $content;
        $upload_file->save();

        return redirect()->back()->with('success', 'File submitted');
    }
}
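The Blade file already displays an error flash message, but the controller never sets one. The parser can throw an exception when it cannot read a file (for example a secured or corrupt PDF), so one way to handle that is to wrap the extraction step. Here is a minimal sketch, assuming a helper named extractTextSafely, which is hypothetical and my own invention. It takes the extraction step as a callable so the error handling stays separate from the parser itself.

```php
<?php

// Hypothetical helper: runs the given extractor and turns any failure into null.
function extractTextSafely(callable $extractor, string $path): ?string
{
    try {
        return $extractor($path);
    } catch (\Throwable $e) {
        // The file could not be parsed, so report "no text" instead of crashing.
        return null;
    }
}
```

In store(), the extractor could be a closure such as function (string $path) { return (new Parser())->parseFile($path)->getText(); }. When the result is null, the controller could return redirect()->back()->with('error', 'Could not read this PDF') instead of saving the record.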

Hooray, we have read the contents of our PDF document and stored them in the database.

Happy Reading 😃
