Building Report Queries
Master the mechanics of transforming data into insights with query language
Build custom reports and analytics by combining Payload's query language with aggregate functions, grouping, and filtering. Understanding the mechanics of report queries enables you to create any type of analysis from transaction reports to customer analytics. This guide covers the fundamental building blocks and techniques that power all reporting capabilities.
Report queries extend the standard query language with aggregate functions (sum, count, avg) and grouping capabilities. Once you understand these mechanics, you can build any report from simple summaries to complex multi-dimensional analysis without exporting data to external tools.
Prerequisites
Before building report queries, understand:
Query Components
Report queries consist of four main components that work together.
Specify which fields to return and apply functions to transform values:
Select capabilities:
- Raw fields: Return attribute values as-is (
pl.attr.status) - Function transforms: Apply functions to modify values (
pl.attr.created_at.year()) - Aggregate calculations: Use sum, count, avg, etc. (
pl.attr.amount.sum()) - Wildcard selection: Use
*pl.attrto select all fields
Narrow results using conditional logic:
Filter capabilities:
- Equality conditions: Match exact values
- Comparison operators: Greater than, less than, ranges
- OR logic: Match multiple values
- Function filters: Apply functions in conditions
- Nested attributes: Query related object properties
Organize results by dimensions for aggregation:
Grouping capabilities:
- Single dimension: Group by one attribute
- Multiple dimensions: Group by multiple attributes
- Function grouping: Group by function results
- Date grouping: Group by time periods
Important: Fields used in group_by must also be included in the select statement. This
ensures the grouping dimensions are returned in the results.
Control result organization and size:
Ordering capabilities:
- Ascending/descending: Control sort direction
- Multiple criteria: Order by multiple fields
- Function ordering: Order by calculated values
- Pagination: Limit and offset results
Date Functions
Extract and manipulate date/time values for time-based analysis.
Available date functions
Date functions extract specific components from datetime fields:
| Function | Returns | Example | Use Case |
|---|---|---|---|
year() | Year (2024) | pl.attr.created_at.year() | Annual reports |
month() | Month number (1-12) | pl.attr.created_at.month() | Monthly trends |
monthname() | Month name (January) | pl.attr.created_at.monthname() | Readable reports |
day() | Day of month (1-31) | pl.attr.created_at.day() | Daily analysis |
dayname() | Day name (Monday) | pl.attr.created_at.dayname() | Weekly patterns |
dayofweek() | Day number (0-6) | pl.attr.created_at.dayofweek() | Week grouping |
weekofyear() | Week number (1-52) | pl.attr.created_at.weekofyear() | Weekly reports |
quarter() | Quarter number (1-4) | pl.attr.created_at.quarter() | Quarterly analysis |
hour() | Hour (0-23) | pl.attr.created_at.hour() | Hourly patterns |
Date function examples
Date function tips: - Use year() and month() together for time-series - dayname() is
more readable than dayofweek() for display - weekofyear() useful for weekly batch reports
-
quarter()simplifies quarterly financial reports -hour()reveals transaction timing patterns
Aggregate Functions
Calculate summary statistics across grouped records.
Numeric aggregates
Calculate values across records:
| Function | Description | Example |
|---|---|---|
sum() | Total of all values | pl.attr.amount.sum() |
avg() | Average value | pl.attr.amount.avg() |
min() | Minimum value | pl.attr.amount.min() |
max() | Maximum value | pl.attr.amount.max() |
count() | Count records | pl.attr.id.count() |
count_distinct() | Count unique values | pl.attr.customer_id.count_distinct() |
stddev() | Standard deviation | pl.attr.amount.stddev() |
variance() | Statistical variance | pl.attr.amount.variance() |
Using multiple aggregates
Combine multiple aggregate functions in a single query:
Results show comprehensive statistics:
{
"status": "processed",
"sum(amount)": 1250000.0,
"avg(amount)": 285.71,
"min(amount)": 5.0,
"max(amount)": 5000.0,
"count(id)": 4375,
"count_distinct(customer_id)": 892
}Aggregate function tips: - sum() for totals (revenue, fees, counts) - avg() for
averages (order value, transaction size) - count() for volume metrics - count_distinct()
for unique entities (customers, products) - min()/max() for ranges and outliers -
stddev() for variance analysis
Filtering Techniques
Apply filters to focus reports on specific data segments.
Limit reports to specific time periods:
Use multiple conditions to narrow results:
Match multiple values for a field:
Query Building Patterns
Common patterns for constructing effective queries.
Build queries incrementally:
Always filter first to reduce dataset size:
Make queries self-documenting:
Define common filters once, reuse across queries:
Best Practices
Follow these guidelines for efficient and maintainable queries.
Index commonly grouped fields:
- Group by indexed fields when possible
- Date fields are typically indexed
- Customer and status fields benefit from indexes
Limit result sets:
# Use pagination for large result sets
report = (
pl.Transaction.select(pl.attr.sender.account_id, pl.attr.amount.sum())
.group_by(pl.attr.sender.account_id)
.order_by("desc(sum(amount))")
.limit(100)
.all()
) # Top 100 customersFilter before grouping:
# Efficient: Filter reduces dataset first
efficient = (
pl.Transaction.select(pl.attr.status, pl.attr.amount.sum())
.filter_by(pl.attr.created_at >= datetime(2024, 1, 1)) # Filter first
.group_by(pl.attr.status)
.all()
)Handle null values:
- Aggregate functions skip null values automatically
- Use filters to exclude incomplete records when needed
- Document assumptions about null handling
Timezone considerations:
from datetime import datetime, timezone
# Be explicit about timezone for date functions
utc_report = (
pl.Transaction.select(pl.attr.created_at.date(), pl.attr.amount.sum())
.filter_by(pl.attr.created_at >= datetime(2024, 1, 1, tzinfo=timezone.utc))
.group_by(pl.attr.created_at.date())
.all()
)Consistent ordering:
- Always specify
order_byfor reproducible results - Include secondary sort for ties
- Use consistent direction (asc/desc)
Document complex logic:
# Calculate customer lifetime value for active customers
# Includes total spend, average order value, and purchase frequency
# Filters to customers with at least one transaction in last 12 months
twelve_months_ago = (
datetime.now(timezone.utc) - timedelta(days=365)
).isoformat()
twelve_months_ago = twelve_months_ago.replace("+00:00", "Z")
customer_ltv = (
pl.Transaction.select(
pl.attr.sender.account_id,
pl.attr.amount.sum(), # Total lifetime spend
pl.attr.amount.avg(), # Average order value
pl.attr.id.count(), # Purchase frequency
)
.filter_by(
pl.attr.type == "payment",
pl.attr.status == "processed",
pl.attr.created_at >= twelve_months_ago,
)
.group_by(pl.attr.sender.account_id)
.order_by("desc(sum(amount))")
.limit(1000)
.all()
)Test incrementally:
- Start with simple query
- Add one component at a time
- Verify results at each step
- Debug issues before adding complexity
Troubleshooting
Common issues and solutions when building queries.
Problem: Query returns no results
Solutions:
- Check filter conditions are not too restrictive
- Verify date ranges include data
- Test without filters to confirm data exists
- Check for timezone issues in date filters
# Debug: Remove filters incrementally
debug = (
pl.Transaction.select(pl.attr.id.count())
.filter_by(
# Comment out filters one at a time to isolate issue
# pl.attr.status == 'processed',
# pl.attr.created_at.year() == 2024
)
.all()
)Problem: Query takes too long to execute
Solutions:
- Add filters to reduce dataset size
- Limit result set with
.limit() - Use indexed fields in filters and groups
- Break complex queries into smaller parts
# Optimize: Filter and limit
optimized = (
pl.Transaction.select(pl.attr.sender.account_id, pl.attr.amount.sum())
.filter_by(
pl.attr.created_at >= "2024-01-01T00:00:00Z", # Reduce dataset
pl.attr.type == "payment",
)
.group_by(pl.attr.sender.account_id)
.order_by("desc(sum(amount))")
.limit(100)
.all()
) # Limit resultsProblem: Sums or counts don't match expectations
Solutions:
- Check for duplicate records in grouping
- Verify filters are correct
- Account for null values
- Consider transaction types (payments vs refunds)
# Verify: Check raw data first
raw = pl.Transaction.filter_by(pl.attr.sender.account_id == "acct_123").all()
# Then compare to aggregated
aggregated = (
pl.Transaction.select(
pl.attr.sender.account_id, pl.attr.amount.sum(), pl.attr.id.count()
)
.filter_by(pl.attr.sender.account_id == "acct_123")
.group_by(pl.attr.sender.account_id)
.all()
)Problem: Group by not working as expected
Solutions:
- Ensure grouped fields are in select
- Check function syntax
- Verify nested attribute paths
- Confirm no circular references
# Correct: Include grouped fields in select
correct = (
pl.Transaction.select(pl.attr.status, pl.attr.amount.sum()) # Include this
.group_by(pl.attr.status) # When grouping by this
.all()
)
# Incorrect: Missing grouped field
# This will error
incorrect = (
pl.Transaction.select(pl.attr.amount.sum()) # Missing status
.group_by(pl.attr.status) # Error: status not in select
.all()
)Next Steps
Now that you understand the mechanics, explore specific report types
Revenue Reports
Build revenue analysis, trends, and financial reports with Revenue Reports guide, analyze transaction volumes and processing metrics, and create comprehensive financial statements for accounting teams.
Customer Analytics
Create customer lifetime value and behavior analysis with Customer Analytics documentation, segment customers by spending patterns and activity, and identify high-value customers for targeted marketing campaigns.