Building Report Queries

Master the mechanics of transforming data into insights with query language

Build custom reports and analytics by combining Payload's query language with aggregate functions, grouping, and filtering. Understanding the mechanics of report queries enables you to create any type of analysis from transaction reports to customer analytics. This guide covers the fundamental building blocks and techniques that power all reporting capabilities.

Report queries extend the standard query language with aggregate functions (sum, count, avg) and grouping capabilities. Once you understand these mechanics, you can build any report from simple summaries to complex multi-dimensional analysis without exporting data to external tools.

Prerequisites

Before building report queries, understand:

Query language

Learn the basics of filtering and querying Payload objects.

Query language reference

Functions

Understand date, string, and aggregate functions for reports.

Functions reference

Paging & ordering

Control result sets with pagination and sorting.

Paging and ordering reference

Query Components

Report queries consist of four main components that work together.

Specify which fields to return and apply functions to transform values:

select.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
 
# Select fields with functions to transform values
pl.Transaction.select(
    pl.attr.created_at.month(),    # Extract month from date
    pl.attr.amount.sum(),           # Sum transaction amounts
    pl.attr.id.count()              # Count transactions
)

Select capabilities:

Raw fields: Return attribute values as-is (pl.attr.status)
Function transforms: Apply functions to modify values (pl.attr.created_at.year())
Aggregate calculations: Use sum, count, avg, etc. (pl.attr.amount.sum())
Wildcard selection: Use *pl.attr to select all fields

Narrow results using conditional logic:

filter.py

import payload
 
pl = payload.Session("secret_key_3bW9...", api_version="v2")
 
# Filter results using conditional logic
pl.Transaction.filter_by(
    pl.attr.status.value == "processed",
    pl.attr.type == "payment",
    pl.attr.created_at.year() == 2024,
)

Filter capabilities:

Equality conditions: Match exact values
Comparison operators: Greater than, less than, ranges
OR logic: Match multiple values
Function filters: Apply functions in conditions
Nested attributes: Query related object properties

Organize results by dimensions for aggregation:

group.py

import payload
 
pl = payload.Session("secret_key_3bW9...", api_version="v2")
 
# Group results by dimensions for aggregation
pl.Account.select("count(id)").group_by(
    pl.attr.created_at.year(),
    pl.attr.created_at.month(),
)

Grouping capabilities:

Single dimension: Group by one attribute
Multiple dimensions: Group by multiple attributes
Function grouping: Group by function results
Date grouping: Group by time periods

Important: Fields used in group_by must also be included in the select statement. This ensures the grouping dimensions are returned in the results.

Control result organization and size:

order.py

import payload
 
pl = payload.Session("secret_key_3bW9...", api_version="v2")
 
# Control result organization and size
pl.Account.order_by(
    pl.attr.created_at.desc(),
).limit(100).all()

Ordering capabilities:

Ascending/descending: Control sort direction
Multiple criteria: Order by multiple fields
Function ordering: Order by calculated values
Pagination: Limit and offset results

Date Functions

Extract and manipulate date/time values for time-based analysis.

Available date functions

Date functions extract specific components from datetime fields:

Function	Returns	Example	Use Case
`year()`	Year (2024)	`pl.attr.created_at.year()`	Annual reports
`month()`	Month number (1-12)	`pl.attr.created_at.month()`	Monthly trends
`monthname()`	Month name (January)	`pl.attr.created_at.monthname()`	Readable reports
`day()`	Day of month (1-31)	`pl.attr.created_at.day()`	Daily analysis
`dayname()`	Day name (Monday)	`pl.attr.created_at.dayname()`	Weekly patterns
`dayofweek()`	Day number (0-6)	`pl.attr.created_at.dayofweek()`	Week grouping
`weekofyear()`	Week number (1-52)	`pl.attr.created_at.weekofyear()`	Weekly reports
`quarter()`	Quarter number (1-4)	`pl.attr.created_at.quarter()`	Quarterly analysis
`hour()`	Hour (0-23)	`pl.attr.created_at.hour()`	Hourly patterns

Date function examples

quarterly.py

import payload
 
pl = payload.Session("secret_key_3bW9...", api_version="v2")
 
# Monthly revenue report
monthly = (
    pl.Transaction.select(
        pl.attr.created_at.year(),
        pl.attr.created_at.month(),
        pl.attr.amount.sum(),
    )
    .filter_by(pl.attr.type == "payment")
    .group_by(pl.attr.created_at.year(), pl.attr.created_at.month())
    .order_by(
        pl.attr.created_at.desc(),
    )
    .all()
)

Date function tips: - Use year() and month() together for time-series - dayname() is more readable than dayofweek() for display - weekofyear() useful for weekly batch reports

quarter() simplifies quarterly financial reports - hour() reveals transaction timing patterns

Aggregate Functions

Calculate summary statistics across grouped records.

Numeric aggregates

Calculate values across records:

Function	Description	Example
`sum()`	Total of all values	`pl.attr.amount.sum()`
`avg()`	Average value	`pl.attr.amount.avg()`
`min()`	Minimum value	`pl.attr.amount.min()`
`max()`	Maximum value	`pl.attr.amount.max()`
`count()`	Count records	`pl.attr.id.count()`
`count_distinct()`	Count unique values	`pl.attr.customer_id.count_distinct()`
`stddev()`	Standard deviation	`pl.attr.amount.stddev()`
`variance()`	Statistical variance	`pl.attr.amount.variance()`

Using multiple aggregates

Combine multiple aggregate functions in a single query:

stats.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
 
# Comprehensive statistics with multiple aggregates
stats = pl.Transaction.select(
    pl.attr.status.value,
    pl.attr.amount.sum(),
    pl.attr.amount.avg(),
    pl.attr.amount.min(),
    pl.attr.amount.max(),
    pl.attr.id.count(),
    pl.attr.sender.account_id.count_distinct()
).group_by(
    pl.attr.status.value
).all()

Results show comprehensive statistics:

{
  "status": "processed",
  "sum(amount)": 1250000.0,
  "avg(amount)": 285.71,
  "min(amount)": 5.0,
  "max(amount)": 5000.0,
  "count(id)": 4375,
  "count_distinct(customer_id)": 892
}

Aggregate function tips: - sum() for totals (revenue, fees, counts) - avg() for averages (order value, transaction size) - count() for volume metrics - count_distinct() for unique entities (customers, products) - min()/max() for ranges and outliers - stddev() for variance analysis

Filtering Techniques

Apply filters to focus reports on specific data segments.

Limit reports to specific time periods:

last-30-days.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
from datetime import datetime, timedelta
 
# Last 30 days
thirty_days_ago = datetime.now() - timedelta(days=30)
 
recent = pl.Transaction.select(
    pl.attr.status.value,
    pl.attr.amount.sum()
).filter_by(
    pl.attr.created_at >= thirty_days_ago
).group_by(
    pl.attr.status.value
).all()
 
# Specific date range
start_date = datetime(2024, 1, 1)
end_date = datetime(2024, 12, 31)
 
year_2024 = pl.Transaction.select(
    pl.attr.created_at.month(),
    pl.attr.amount.sum()
).filter_by(
    pl.attr.created_at >= start_date,
    pl.attr.created_at < end_date
).group_by(
    pl.attr.created_at.month()
).all()

Use multiple conditions to narrow results:

filtered.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
 
# Use multiple conditions to narrow results
filtered = pl.Transaction.select(
    pl.attr.sender.account_id,
    pl.attr.amount.sum(),
    pl.attr.id.count()
).filter_by(
    pl.attr.type == 'payment',
    pl.attr.status.value == 'processed',
    pl.attr.amount > 100,
    pl.attr.created_at.year() == 2024,
    pl.attr.source == 'keyed'
).group_by(
    pl.attr.sender.account_id
).all()

Match multiple values for a field:

multi-status.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
 
# Match multiple status values
multi_status = pl.Transaction.select(
    pl.attr.created_at.month(),
    pl.attr.amount.sum()
).filter_by(
    (pl.attr.status.value == 'processed') | (pl.attr.status.value == 'processing'),
    pl.attr.type == 'payment'
).group_by(
    pl.attr.created_at.month()
).all()

Query Building Patterns

Common patterns for constructing effective queries.

Build queries incrementally:

incremental.py

import payload
 
pl = payload.Session("secret_key_3bW9...", api_version="v2")
 
# Step 1: Basic query
basic = pl.Transaction.select(pl.attr.amount.sum()).all()
 
# Step 2: Add filtering
with_filter = (
    pl.Transaction.select(pl.attr.amount.sum())
    .filter_by(pl.attr.type == "payment")
    .all()
)
 
# Step 3: Add grouping
with_grouping = (
    pl.Transaction.select(pl.attr.status.value, pl.attr.amount.sum())
    .filter_by(pl.attr.type == "payment")
    .group_by(pl.attr.status.value)
    .all()
)
 
# Step 4: Add ordering
complete = (
    pl.Transaction.select(pl.attr.status.value, pl.attr.amount.sum())
    .filter_by(pl.attr.type == "payment")
    .group_by(pl.attr.status.value)
    .order_by(pl.attr.amount.desc())
    .all()
)

Always filter first to reduce dataset size:

efficient.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
from datetime import datetime
 
# Good: Filter reduces data before grouping
efficient = pl.Transaction.select(
    pl.attr.sender.account_id,
    pl.attr.amount.sum()
).filter_by(
    pl.attr.created_at >= datetime(2024, 1, 1),  # Filter first
    pl.attr.type == 'payment',
    pl.attr.status.value == 'processed'
).group_by(
    pl.attr.sender.account_id
).all()
 
# Less efficient: Processing all data then grouping
less_efficient = pl.Transaction.select(
    pl.attr.sender.account_id,
    pl.attr.amount.sum()
).group_by(
    pl.attr.sender.account_id
).all()  # Groups everything, then you filter results

Make queries self-documenting:

descriptive.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
 
# Good: Clear, descriptive names
monthly_revenue_by_source = pl.Transaction.select(
    pl.attr.created_at.month(),
    pl.attr.source,
    pl.attr.amount.sum()
).filter_by(
    pl.attr.type == 'payment',
    pl.attr.status.value == 'processed'
).group_by(
    pl.attr.created_at.month(),
    pl.attr.source
).all()
 
# Avoid: Unclear names
results = pl.Transaction.select(
    pl.attr.created_at.month(),
    pl.attr.amount.sum()
).all()

Define common filters once, reuse across queries:

reuse.py

import payload
pl = payload.Session('secret_key_3bW9...', api_version='v2')
from datetime import datetime
 
# Define common filters
PROCESSED_PAYMENTS = [
    pl.attr.type == 'payment',
    pl.attr.status.value == 'processed'
]
 
CURRENT_YEAR = [
    pl.attr.created_at.year() == datetime.now().year
]
 
# Reuse across queries
revenue = pl.Transaction.select(
    pl.attr.created_at.month(),
    pl.attr.amount.sum()
).filter_by(*PROCESSED_PAYMENTS, *CURRENT_YEAR).group_by(
    pl.attr.created_at.month()
).all()
 
customers = pl.Transaction.select(
    pl.attr.sender.account_id,
    pl.attr.amount.sum()
).filter_by(*PROCESSED_PAYMENTS, *CURRENT_YEAR).group_by(
    pl.attr.sender.account_id
).all()

Best Practices

Follow these guidelines for efficient and maintainable queries.

Index commonly grouped fields:

Group by indexed fields when possible
Date fields are typically indexed
Customer and status fields benefit from indexes

Limit result sets:

# Use pagination for large result sets
report = (
    pl.Transaction.select(pl.attr.sender.account_id, pl.attr.amount.sum())
    .group_by(pl.attr.sender.account_id)
    .order_by(pl.attr.sender.account_id)
    .limit(100)
    .all()
)  # First 100 customers, ordered by account id

Filter before grouping:

# Efficient: Filter reduces dataset first
efficient = (
    pl.Transaction.select(pl.attr.status.value, pl.attr.amount.sum())
    .filter_by(pl.attr.created_at >= datetime(2024, 1, 1))  # Filter first
    .group_by(pl.attr.status.value)
    .all()
)

Handle null values:

Aggregate functions skip null values automatically
Use filters to exclude incomplete records when needed
Document assumptions about null handling

Timezone considerations:

from datetime import datetime, timezone
 
# Be explicit about timezone for date functions
utc_report = (
    pl.Transaction.select(pl.attr.created_at.date(), pl.attr.amount.sum())
    .filter_by(pl.attr.created_at >= datetime(2024, 1, 1, tzinfo=timezone.utc))
    .group_by(pl.attr.created_at.date())
    .all()
)

Consistent ordering:

Always specify order_by for reproducible results
Include secondary sort for ties
Use consistent direction (asc/desc)

Document complex logic:

twelve_months_ago = (datetime.now(timezone.utc) - timedelta(days=365)).strftime("%Y-%m-%dT%H:%M:%SZ")
 
customer_ltv = (
    pl.Transaction.select(
        pl.attr.sender.account_id,
        pl.attr.amount.sum(),
        pl.attr.amount.avg(),
        pl.attr.id.count(),
    )
    .filter_by(
        pl.attr.type == "payment",
        pl.attr.status.value == "processed",
        pl.attr.created_at >= twelve_months_ago,
    )
    .group_by(pl.attr.sender.account_id)
    .order_by(pl.attr.sender.account_id)
    .limit(1000)
    .all()
)

Test incrementally:

Start with simple query
Add one component at a time
Verify results at each step
Debug issues before adding complexity

Troubleshooting

Common issues and solutions when building queries.

Problem: Query returns no results

Solutions:

Check filter conditions are not too restrictive
Verify date ranges include data
Test without filters to confirm data exists
Check for timezone issues in date filters

# Debug: Remove filters incrementally
debug = (
    pl.Transaction.select(pl.attr.id.count())
    .filter_by(
        # Comment out filters one at a time to isolate issue
        # pl.attr.status == 'processed',
        # pl.attr.created_at.year() == 2024
    )
    .all()
)

Problem: Query takes too long to execute

Solutions:

Add filters to reduce dataset size
Limit result set with .limit()
Use indexed fields in filters and groups
Break complex queries into smaller parts

# Optimize: Filter and limit
optimized = (
    pl.Transaction.select(pl.attr.sender.account_id, pl.attr.amount.sum())
    .filter_by(
        pl.attr.created_at >= "2024-01-01T00:00:00Z",  # Reduce dataset
        pl.attr.type == "payment",
    )
    .group_by(pl.attr.sender.account_id)
    .order_by(pl.attr.sender.account_id)
    .limit(100)
    .all()
)  # Limit results

Problem: Sums or counts don't match expectations

Solutions:

Check for duplicate records in grouping
Verify filters are correct
Account for null values
Consider transaction types (payments vs refunds)

# Verify: Check raw data first
raw = pl.Transaction.filter_by(pl.attr.sender.account_id == "acct_123").all()
 
# Then compare to aggregated
aggregated = (
    pl.Transaction.select(
        pl.attr.sender.account_id, pl.attr.amount.sum(), pl.attr.id.count()
    )
    .filter_by(pl.attr.sender.account_id == "acct_123")
    .group_by(pl.attr.sender.account_id)
    .all()
)

Problem: Group by not working as expected

Solutions:

Ensure grouped fields are in select
Check function syntax
Verify nested attribute paths
Confirm no circular references

# Correct: Include grouped fields in select
correct = (
    pl.Transaction.select(pl.attr.status.value, pl.attr.amount.sum())  # Include this
    .group_by(pl.attr.status.value)  # When grouping by this
    .all()
)
 
# Incorrect: Missing grouped field
# This will error
incorrect = (
    pl.Transaction.select(pl.attr.amount.sum())  # Missing status
    .group_by(pl.attr.status.value)  # Error: status not in select
    .all()
)

Next Steps

Now that you understand the mechanics, explore specific report types

Revenue Reports

Build revenue analysis, trends, and financial reports with Revenue Reports guide, analyze transaction volumes and processing metrics, and create comprehensive financial statements for accounting teams.

Customer Analytics

Create customer lifetime value and behavior analysis with Customer Analytics documentation, segment customers by spending patterns and activity, and identify high-value customers for targeted marketing campaigns.

Handling Batched Payments Revenue Reports