How to Build a PDF Report Generator from Database Data

Most SaaS applications eventually need to generate reports. Sales summaries, usage breakdowns, financial statements, activity logs — users want something they can download, share, and file. The challenge is that building a PDF report generator is rarely as simple as it sounds.

You have to query the database, transform the data into something presentable, lay it out in a document, and produce a file that renders correctly in every PDF viewer. Each of those steps has friction. This guide walks through a practical approach: a data pipeline from database query to downloadable PDF, using a template-based API instead of a custom rendering stack.


Where PDF reports are actually useful

PDF report generation comes up in a few common scenarios:

  • Analytics dashboards: monthly active users, churn rate, revenue breakdown
  • Financial reports: invoices by period, payment summaries, tax exports
  • Operational reports: work orders completed, incident logs, delivery summaries
  • Compliance exports: audit trails, access logs, GDPR data exports
  • Customer-facing reports: usage reports for B2B customers, account statements

What these have in common: structured data from a database, a repeating layout, and a requirement for a consistent, printable output.


Designing the data pipeline

The pipeline has four stages:

  1. Query — fetch the raw records from your database
  2. Transform — shape the data into something your template expects
  3. Map — pass the variables to the PDF generation API
  4. Deliver — return the PDF URL to the client

The key insight is that stages 1 and 2 are your application's responsibility. Stage 3 is where a PDF API takes over. Keeping those boundaries clean means your report logic stays in your codebase, and your rendering logic stays out of it.


Template structure for reports

A good PDF report template uses variables for dynamic content and loops for repeating rows. When working with Transactional.dev, you define your template once in the editor using HTML and Tailwind CSS, then reference variables with Handlebars syntax.

A typical report template structure looks like this:

<!-- Report header -->
<div class="flex justify-between">
  <h1 class="text-2xl font-bold">{{ report.title }}</h1>
  <p class="text-gray-500">{{ report.period }}</p>
</div>

<!-- Summary metrics -->
<div class="grid grid-cols-3 gap-4 my-6">
  <div class="p-4 bg-gray-100 rounded">
    <p class="text-sm text-gray-600">Total Revenue</p>
    <p class="text-xl font-bold">{{ summary.totalRevenue }}</p>
  </div>
  <!-- more metric cards -->
</div>

<!-- Data table -->
<table class="w-full mt-4">
  <thead>
    <tr>
      <th>Date</th>
      <th>Description</th>
      <th>Amount</th>
    </tr>
  </thead>
  <tbody>
    {{#each transactions}}
    <tr>
      <td>{{ this.date }}</td>
      <td>{{ this.description }}</td>
      <td>{{ this.amount }}</td>
    </tr>
    {{/each}}
  </tbody>
</table>

The template is static. The data is dynamic. That separation is what makes the system maintainable.


DB query and data shaping (Node.js + Prisma)

Here is a practical example. You want to generate a monthly revenue report for a specific customer.

Step 1: Query the database

import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

async function getMonthlyReportData(customerId: string, year: number, month: number) {
  const startDate = new Date(year, month - 1, 1)
  const endDate = new Date(year, month, 0, 23, 59, 59)

  const [customer, transactions] = await Promise.all([
    prisma.customer.findUnique({
      where: { id: customerId },
      select: { name: true, email: true, plan: true }
    }),
    prisma.transaction.findMany({
      where: {
        customerId,
        createdAt: { gte: startDate, lte: endDate },
        status: 'COMPLETED'
      },
      select: {
        id: true,
        createdAt: true,
        description: true,
        amount: true,
        currency: true
      },
      orderBy: { createdAt: 'asc' }
    })
  ])

  return { customer, transactions }
}

Step 2: Transform the data

Raw database rows are not template-ready. You need to format dates, compute totals, and structure the object to match your template's variable schema.

function shapeReportData(customer: any, transactions: any[], year: number, month: number) {
  const monthName = new Date(year, month - 1, 1)
    .toLocaleString('en-US', { month: 'long', year: 'numeric' })

  const totalRevenue = transactions.reduce((sum, t) => sum + t.amount, 0)
  const formatter = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' })

  return {
    report: {
      title: `Revenue Report — ${monthName}`,
      period: monthName,
      generatedAt: new Date().toLocaleDateString('en-US')
    },
    customer: {
      name: customer.name,
      email: customer.email,
      plan: customer.plan
    },
    summary: {
      totalRevenue: formatter.format(totalRevenue / 100),
      transactionCount: transactions.length
    },
    transactions: transactions.map(t => ({
      date: new Date(t.createdAt).toLocaleDateString('en-US'),
      description: t.description,
      amount: formatter.format(t.amount / 100)
    }))
  }
}

This transformation step is where most of the real work happens. Amounts from the database are in cents; the template expects formatted strings. Dates are ISO timestamps; the template expects readable values. Do not skip this layer.


Calling the PDF generation API

With your data shaped and your template ready in Transactional.dev, generating the PDF is a single API call.

async function generateReport(customerId: string, year: number, month: number): Promise<string> {
  const { customer, transactions } = await getMonthlyReportData(customerId, year, month)
  const variables = shapeReportData(customer, transactions, year, month)

  const response = await fetch('https://api.transactional.dev/v1/generate', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-api-token': process.env.TRANSACTIONAL_API_TOKEN!
    },
    body: JSON.stringify({
      documentId: process.env.REPORT_DOCUMENT_ID,
      variables
    })
  })

  if (!response.ok) {
    const error = await response.json()
    throw new Error(`PDF generation failed: ${error.message}`)
  }

  const { url } = await response.json()
  return url
}

The response includes a url pointing to the generated PDF. You can return that URL directly to the client, store it in your database, or attach it to an email.


On-demand vs scheduled reports

There are two common delivery patterns:

On-demand: The user clicks "Download Report" in your app. Your backend runs the query, shapes the data, calls the API, and redirects the user to the PDF URL. This works well for small to medium datasets and when users need fresh data.

Scheduled: A cron job or job queue generates reports at fixed intervals (end of month, end of day) and stores the PDF URL for later retrieval. This works well when generation is slow, when you want to guarantee availability at a predictable time, or when you need to send the report by email automatically.

You can combine both: generate reports on a schedule, cache the URL, and serve it instantly when the user requests it.


Caching strategy

PDF URLs from Transactional.dev are signed and expire. If you want to serve the same report to multiple users or allow re-downloads, store the URL in your database when you generate it.

A simple approach:

async function getOrGenerateReport(customerId: string, year: number, month: number): Promise<string> {
  // Check cache first
  const cached = await prisma.report.findFirst({
    where: { customerId, year, month },
    orderBy: { createdAt: 'desc' }
  })

  if (cached && cached.pdfUrl) {
    return cached.pdfUrl
  }

  // Generate fresh
  const pdfUrl = await generateReport(customerId, year, month)

  // Store it
  await prisma.report.create({
    data: { customerId, year, month, pdfUrl }
  })

  return pdfUrl
}

For scheduled reports, generate once and store the result. For on-demand reports that are requested frequently, use a short-lived cache (30 to 60 minutes) to avoid hitting the API on every click.


Tips

  • Validate before generating: check that the date range has data before calling the API. Empty reports are confusing for users.
  • Handle pagination at the query level: if a customer has thousands of transactions in a month, fetch them in batches and merge, or limit the report scope.
  • Use a job queue for large reports: wrap generateReport in a background job (Bull, BullMQ, Inngest) and return a job ID to the client. Poll or use webhooks to notify when ready.
  • Log generation failures: PDF generation can fail due to template errors or network issues. Log the error with the customer ID and date range so you can reproduce it.
  • Test your template with edge cases: empty arrays, missing optional fields, very long strings in table cells. Handlebars will silently render an empty string for undefined variables.

Common mistakes

Passing raw database rows to the template: timestamps, amounts in cents, and null values will render incorrectly. Always transform your data first.

Assuming the URL is permanent: PDF URLs expire. If you need the file long-term, download it and store it in your own storage, or at minimum store the URL in your database immediately.

Not handling the transactions array when empty: if {{#each transactions}} iterates over an empty array, the table body renders nothing. Add an explicit empty state in your template.

Generating PDFs synchronously in an HTTP handler: for reports that take more than a second to generate, this will cause request timeouts. Move generation to a background job.

Mixing template logic with data logic: keep your Handlebars template simple. Complex formatting (currency, dates, conditionals on values) belongs in your data transformation layer, not in the template.


Conclusion

Building a PDF report generator is mostly a data problem. The heavy lifting is in writing the right query, transforming the results into a clean structure, and designing a template that handles your edge cases. The actual rendering step can be a single API call.

If you want to skip building and maintaining your own PDF rendering infrastructure, Transactional.dev gives you a template editor, a generation API, and hosted delivery. You define the template once and generate reports on demand with a POST request that returns a PDF URL. The free tier lets you test the full workflow without a credit card.

The pipeline is the same either way: query your database, shape your data, call the API, deliver the URL.