Overview: why batching long tasks matters
Many WordPress tasks (migrating data, regenerating images, exporting large datasets, bulk meta updates) can exceed PHP max execution time or memory limits when executed in a single request. Running them in smaller batches through the REST API and a JavaScript runner avoids timeouts, gives progress feedback to users, and enables resumable, safe operations that behave well on shared hosts.
High-level architecture
- Server: expose a REST route that accepts job id, offset and limit (or a cursor) and processes one batch per request.
- Server storage: persist job state (remaining items, processed count, status) in a transient, option, or custom table so batches can resume reliably.
- Client: JavaScript loop (using fetch) that repeatedly calls the REST endpoint until the job reports completion, updating UI progress and handling retries/backoff.
- Security: validate nonces and capabilities on the REST endpoint; make handlers idempotent or safely reentrant.
Design choices and tradeoffs
- Chunk size: smaller chunks reduce the risk of timeouts but increase the number of requests. Choose based on typical item complexity: start small (e.g., 50 items) and increase if safe.
- Storage: transients/options are simple but limited; a custom table is best for many concurrent jobs or large metadata.
- Retries: implement limited retries and exponential backoff on the client to handle transient server or network errors.
- Concurrency: single-threaded polling is simplest; multiple concurrent requests can increase throughput but require careful locking to avoid double-processing.
- Idempotency: ensure processing an item twice is harmless (e.g., skip if already processed).
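As a concrete sketch of the retry guidance above, a small helper can compute the backoff delay before each retry (the constants and the cap are illustrative assumptions, not fixed recommendations):

```javascript
// Exponential backoff with jitter: delay doubles per attempt, capped at
// maxDelay, plus a small random component so concurrent clients don't
// all retry at the same instant.
function backoffDelay(attempt, baseDelay = 300, maxDelay = 10000) {
  const exp = Math.min(baseDelay * Math.pow(2, attempt), maxDelay);
  const jitter = Math.random() * 100; // spread out simultaneous retries
  return exp + jitter;
}
```

The client loop would then `await` a timeout of `backoffDelay(attempts)` milliseconds before the next request.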
REST response contract (recommended)
Have the REST endpoint return a consistent JSON structure so the client can decide next actions. Example fields:
| Field | Description |
| --- | --- |
| `job_id` | Unique identifier for this job |
| `processed` | Number of items processed so far |
| `total` | Total number of items to process (if known) |
| `next_offset` | Offset (or cursor) to use for the next batch, or `null` if done |
| `done` | Boolean indicating completion |
| `errors` | Array of error messages (if any) produced in this batch |
| `time` | Server-side time used to process this batch (ms) |
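For instance, a mid-run response following this contract might look like the following (all values illustrative):

```json
{
  "job_id": "export_2025_09",
  "processed": 150,
  "total": 1200,
  "next_offset": 150,
  "done": false,
  "errors": [],
  "time": 842
}
```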
Example: REST route and handler (PHP)
The following example shows a simple plugin-style REST route that handles a single batch per request. It uses a transient to store job state. Replace the capability check and nonce validation with what fits your context.
```php
<?php
add_action( 'rest_api_init', function () {
	register_rest_route( 'my-plugin/v1', '/job', array(
		'methods'             => 'POST',
		'callback'            => 'my_plugin_process_batch',
		'permission_callback' => function ( WP_REST_Request $request ) {
			// Example: require manage_options capability and a valid nonce.
			$nonce = $request->get_header( 'x-wp-nonce' );
			if ( ! wp_verify_nonce( $nonce, 'wp_rest' ) ) {
				return new WP_Error( 'invalid_nonce', 'Invalid nonce', array( 'status' => 403 ) );
			}
			return current_user_can( 'manage_options' );
		},
	) );
} );

function my_plugin_process_batch( WP_REST_Request $request ) {
	$params = $request->get_json_params();
	$job_id = isset( $params['job_id'] ) ? sanitize_text_field( $params['job_id'] ) : null;
	$offset = isset( $params['offset'] ) ? intval( $params['offset'] ) : 0;
	$limit  = isset( $params['limit'] ) ? intval( $params['limit'] ) : 50;

	if ( empty( $job_id ) ) {
		return rest_ensure_response( array(
			'code'    => 'missing_job_id',
			'message' => 'job_id is required',
		) );
	}

	// Load or initialize job state stored as a transient.
	$transient_key = "my_plugin_job_{$job_id}";
	$state         = get_transient( $transient_key );
	if ( false === $state ) {
		$state = array(
			'processed' => 0,
			'failed'    => 0,
			'total'     => null, // optional: set if you can compute the total
			'status'    => 'running',
		);
	}

	$start_time = microtime( true );
	$errors     = array();

	// Example: get items to process with offset and limit.
	// Replace with your data source (posts, external API, custom table, etc.).
	$query_args = array(
		'post_type'      => 'post',
		'posts_per_page' => $limit,
		'offset'         => $offset,
		'post_status'    => 'any',
		'fields'         => 'ids',
	);
	$query = new WP_Query( $query_args );
	$ids   = $query->posts;

	foreach ( $ids as $id ) {
		try {
			// Idempotent processing: check a meta flag before processing.
			if ( get_post_meta( $id, '_my_plugin_processed', true ) ) {
				continue;
			}

			// Perform the heavy operation for a single item.
			// Example: regenerate a heavy meta, call an external API, etc.
			// This needs to be short and safe for each item.
			update_post_meta( $id, 'processed_at', current_time( 'mysql' ) );

			// Mark processed.
			update_post_meta( $id, '_my_plugin_processed', 1 );
			$state['processed']++;
		} catch ( Exception $e ) {
			$errors[]         = sprintf( 'ID %d: %s', $id, $e->getMessage() );
			$state['failed']++;
		}
	}

	$done            = empty( $ids ) || count( $ids ) < $limit; // if fewer than limit, assume finished
	$state['status'] = $done ? 'completed' : 'running';
	if ( $done ) {
		$state['completed_at'] = current_time( 'mysql' );
	}

	// Persist state and return the response.
	set_transient( $transient_key, $state, 12 * HOUR_IN_SECONDS );

	$response = array(
		'job_id'      => $job_id,
		'processed'   => $state['processed'],
		'failed'      => $state['failed'],
		'total'       => $state['total'],
		'next_offset' => $done ? null : $offset + $limit,
		'done'        => $done,
		'errors'      => $errors,
		'time'        => round( ( microtime( true ) - $start_time ) * 1000 ), // ms
	);

	return rest_ensure_response( $response );
}
```
Notes on the PHP example
- Idempotency: the handler checks a post meta flag to avoid processing the same item twice.
- State storage: using a transient keyed by job_id to track progress. For large-scale or concurrent jobs, a custom DB table with explicit locks is safer.
- Responsibility split: each invocation processes only a small number of items and returns an updated offset.
Persisting job state: transients vs custom table
For simple use cases, transients or options suffice. For robust multi-job or multi-user cases, create a custom table. Example SQL to create a lightweight job table:
```sql
CREATE TABLE wp_my_plugin_jobs (
	job_id VARCHAR(191) PRIMARY KEY,
	user_id BIGINT UNSIGNED NOT NULL,
	status VARCHAR(32) NOT NULL,
	processed BIGINT UNSIGNED DEFAULT 0,
	failed BIGINT UNSIGNED DEFAULT 0,
	total BIGINT UNSIGNED DEFAULT NULL,
	meta LONGTEXT,
	created_at DATETIME NOT NULL,
	updated_at DATETIME NOT NULL
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```
Use `$wpdb->insert` / `$wpdb->update` to persist and `$wpdb->get_row` to read. Wrap updates in transactions if necessary (or implement optimistic locking with a version field).
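An optimistic-locking update might look like the following sketch. Note that the `version` column is an assumption here: it is not part of the schema above and would need to be added to the table. Zero affected rows means another worker updated the row first, so the caller should re-read and retry.

```sql
-- Only succeeds if the version we read is still current;
-- an affected-row count of 0 means another worker won the race.
UPDATE wp_my_plugin_jobs
SET processed  = processed + 50,
    version    = version + 1,
    updated_at = NOW()
WHERE job_id  = 'export_2025_09'
  AND version = 7;
```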
Client-side JavaScript: fetch loop, progress and retries
The client repeatedly calls the REST endpoint until done. This example uses an AbortController for cancellation, exponential backoff on errors, and a cap on retry attempts.
```javascript
// Example: run a job from the browser.
async function runJob({ jobId, limit = 50, nonce, onProgress }) {
  let offset = 0;
  let done = false;
  let attempts = 0;
  const maxAttempts = 5;
  const baseDelay = 300; // ms
  const controller = new AbortController();

  // Expose a way to cancel from outside by returning the controller.
  const result = {
    controller,
    promise: (async () => {
      while (!done) {
        try {
          const resp = await fetch('/wp-json/my-plugin/v1/job', {
            method: 'POST',
            headers: {
              'Content-Type': 'application/json',
              'X-WP-Nonce': nonce,
            },
            body: JSON.stringify({ job_id: jobId, offset, limit }),
            signal: controller.signal,
          });

          if (!resp.ok) {
            throw new Error('Network error: ' + resp.status);
          }

          const json = await resp.json();

          // If your REST endpoint returns WP_Error objects, check for code/message.
          if (json.code && !json.job_id) {
            throw new Error(json.message || 'Server error');
          }

          // Report progress to the consumer.
          if (typeof onProgress === 'function') {
            onProgress({
              processed: json.processed,
              total: json.total,
              failed: json.failed,
              done: json.done,
            });
          }

          if (json.done) {
            done = true;
            return json;
          }

          // Reset attempts on success.
          attempts = 0;
          offset = json.next_offset === null ? offset + limit : json.next_offset;
        } catch (err) {
          attempts++;
          if (attempts > maxAttempts) {
            throw new Error('Max retries exceeded: ' + err.message);
          }
          // Exponential backoff + jitter.
          const delay = Math.pow(2, attempts) * baseDelay + Math.random() * 100;
          await new Promise((resolve) => setTimeout(resolve, delay));
        }
      }
    })(),
  };

  return result;
}

// Usage:
const runner = await runJob({
  jobId: 'export_2025_09',
  limit: 100,
  nonce: my_nonce_from_wp_localize,
  onProgress: ({ processed, total, failed, done }) => {
    // Update the UI.
    console.log('Processed:', processed, 'Total:', total, 'Failed:', failed, 'Done:', done);
  },
});

// To cancel:
// runner.controller.abort();
```
Client-side UI hints
- Render a progress bar using processed/total. If total is unknown, show items processed and a spinner.
- Allow cancellation: call AbortController.abort() and optionally tell the server to mark job as aborted.
- Persist job_id in localStorage so the UI can resume polling after browser reload (if the job is resumable server-side).
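The resume hint above can be sketched as a few small helpers. The storage object is injected so the logic is testable outside a browser; in the page you would pass `window.localStorage`. The key name is illustrative:

```javascript
// Persist and restore the active job id so the UI can offer to resume
// polling after a reload (assuming the job itself is resumable server-side).
const JOB_KEY = 'my_plugin_active_job';

function saveJob(storage, jobId) {
  storage.setItem(JOB_KEY, jobId);
}

function resumeJob(storage) {
  return storage.getItem(JOB_KEY); // null when there is nothing to resume
}

function clearJob(storage) {
  storage.removeItem(JOB_KEY);
}
```

On page load, check `resumeJob(window.localStorage)` and, if it returns an id, restart the polling loop with it; call `clearJob` once the job reports `done`.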
Advanced: locking, concurrency, and safe parallelism
If you want multiple workers (browser tabs or server cron) to process the same job concurrently, you need locking to avoid double-processing. Approaches:
- Row locking (custom table): use SELECT … FOR UPDATE (where supported) or a status column with an atomic update (`WHERE status = 'pending'`) to claim a batch.
- Work queues: store item-level rows and atomically set claimed_by / claimed_at columns.
- Optimistic locking: include a version or token that gets updated on each change; if the update fails, retry.
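The status-column approach above can be sketched as a single atomic UPDATE. The `claimed_by` column is an assumption (it is not in the earlier schema); the point is that only one worker's statement will match the WHERE clause, so the affected-row count tells each worker whether it won the claim:

```sql
-- Claim a pending job atomically: at most one concurrent worker sees
-- affected-rows = 1; the others see 0 and move on.
UPDATE wp_my_plugin_jobs
SET status     = 'processing',
    claimed_by = 42,          -- id of the claiming worker (illustrative)
    updated_at = NOW()
WHERE job_id = 'export_2025_09'
  AND status = 'pending';
```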
Error handling best practices
- Return clear error codes and messages from REST endpoints (use WP_Error with proper status codes).
- Log serious failures on the server and store per-item errors for later inspection.
- Make per-item processing idempotent so retrying a batch is safe.
- Limit retries on the client to avoid tight infinite loops.
Real-world plugin example: start + poll endpoints
A typical workflow:
- User clicks Start job. Client calls REST endpoint /start-job which creates a job record (returns job_id).
- Client runs the poller loop calling /job with job_id, offset and limit until completion.
- Server marks job completed when done and returns final summary.
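On the client, the first step of this workflow is a single request. A minimal sketch, assuming the route path and the response shape described above (the nonce comes from something like `wp_localize_script`):

```javascript
// Kick off a job and hand the returned job_id to the batch runner.
async function startJob(nonce) {
  const resp = await fetch('/wp-json/my-plugin/v1/start-job', {
    method: 'POST',
    headers: { 'X-WP-Nonce': nonce },
  });
  if (!resp.ok) {
    throw new Error('Could not start job: ' + resp.status);
  }
  const { job_id, total } = await resp.json();
  return { jobId: job_id, total };
}
```

The returned `jobId` is then passed to the polling loop, and `total` (if known) seeds the progress bar.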
Example: start-job endpoint (PHP)
```php
<?php
add_action( 'rest_api_init', function () {
	register_rest_route( 'my-plugin/v1', '/start-job', array(
		'methods'             => 'POST',
		'callback'            => 'my_plugin_start_job',
		'permission_callback' => function () {
			return current_user_can( 'manage_options' );
		},
	) );
} );

function my_plugin_start_job( WP_REST_Request $request ) {
	// Determine the total if possible (optional).
	$counts = wp_count_posts( 'post' );
	$total  = $counts->publish + $counts->draft; // simplistic example

	// Create a simple job id (use uniqid() or wp_generate_uuid4() in newer WP).
	$job_id = wp_generate_uuid4();

	$state = array(
		'job_id'     => $job_id,
		'status'     => 'running',
		'created_by' => get_current_user_id(),
		'created_at' => current_time( 'mysql' ),
		'processed'  => 0,
		'failed'     => 0,
		'total'      => $total,
	);

	set_transient( "my_plugin_job_{$job_id}", $state, 12 * HOUR_IN_SECONDS );

	return rest_ensure_response( array( 'job_id' => $job_id, 'total' => $total ) );
}
```
Testing and performance tips
- Start with small limits and gradually increase while monitoring server time and memory usage.
- Log the time taken by each batch server-side; if a single item is slow, optimize that operation.
- If your batch performs many DB writes, wrap item-level writes into a single transaction (when using custom tables) to reduce overhead.
- Use object caching and avoid heavy WP_Query inside tight loops (fetch only IDs, then process in smaller queries).
Security checklist
- Validate nonces and use capability checks in permission_callback.
- Sanitize all input and escape output where appropriate.
- Make sure the job_id is unguessable (use UUID) and authorize who can view/control a job.
- Limit job lifetime with expirations and allow an admin to cancel stale jobs.
Summary: checklist to implement a robust batched runner
- Design your REST contract (job start, batch process endpoints, optional cancel/status endpoints).
- Choose a persistent storage strategy for job state (transient, option, custom table).
- Make per-item processing idempotent and short.
- Implement client polling with exponential backoff, cancellation, and progress UI updates.
- Add server-side validation, logging, and a way to inspect or resume jobs.
- Test with different chunk sizes, concurrency levels, and on real hosting to find safe defaults.
Further reading and libraries
- WordPress REST API documentation
- WP Background Processing (a useful library for background tasks)
- Sanitizing and validating input in WordPress