Database pagination offset problem¶
The database pagination offset problem refers to the performance degradation experienced when using LIMIT and OFFSET to retrieve deep pages within a large dataset. Specifically, query speed decreases significantly as the offset value increases^[600-developer__database__mysql__大数据分页优化.md].
Root Cause¶
The primary performance bottleneck is caused by the presence of SELECT * in conjunction with a large offset.^[600-developer__database__mysql__大数据分页优化.md] When MySQL processes the query, it traverses the primary key index to locate the starting offset, but the wildcard forces the database to perform a "row lookup" (reading the full row data) for every row skipped during the traversal^[600-developer__database__mysql__大数据分页优化.md] .
This process of seeking the index and then retrieving the full row data for discarded rows creates substantial overhead^[600-developer__database__mysql__大数据分页优化.md].
Optimization Strategies¶
There are three common approaches to mitigate this issue^[600-developer__database__mysql__大数据分页优化.md]:
- Business Restrictions: Limit the maximum depth of pagination, effectively preventing users from accessing extremely deep pages^[600-developer__database__mysql__大数据分页优化.md].
-
Seek Method: Instead of an offset, pass the ID of the last record from the previous page as a parameter. The query is then modified to filter where the ID is greater than the starting value^[600-developer__database__mysql__大数据分页优化.md] :
SELECT * FROM table WHERE id > 3000000 LIMIT 10; -
Delayed Association (Deferred Join): This method uses a subquery to fetch only the IDs required for pagination, then joins back to the original table to retrieve the full data^[600-developer__database__mysql__大数据分页优化.md] . This allows the database to scan the covering index (ID only) for the offset without performing row lookups for the skipped rows^[600-developer__database__mysql__大数据分页优化.md] .
SELECT table.* FROM table INNER JOIN ( SELECT id FROM table LIMIT 3000000, 10 ) AS tmp ON tmp.id = table.id;This approach typically keeps query response time within one second^[600-developer__database__mysql__大数据分页优化.md] .
Related Concepts¶
- [[Database Indexing]]
- [[Query Optimization]]
- [[SQL Performance]]
Sources¶
600-developer__database__mysql__大数据分页优化.md