How to run long tasks in batches with REST and JS in WordPress

Overview: why batching long tasks matters

Many WordPress tasks (migrating data, regenerating images, exporting large datasets, bulk meta updates) can exceed PHP max execution time or memory limits when executed in a single request. Running them in smaller batches through the REST API and a JavaScript runner avoids timeouts, gives progress feedback to users, and enables resumable, safe operations that behave well on shared hosts.

High-level architecture

  • Server: expose a REST route that accepts job id, offset and limit (or a cursor) and processes one batch per request.
  • Server storage: persist job state (remaining items, processed count, status) in a transient, option, or custom table so batches can resume reliably.
  • Client: JavaScript loop (using fetch) that repeatedly calls the REST endpoint until the job reports completion, updating UI progress and handling retries/backoff.
  • Security: validate nonces and check capabilities on the REST endpoint; make handlers idempotent or safely reentrant.

Design choices and tradeoffs

  • Chunk size: smaller chunks reduce the risk of timeouts but increase the number of requests. Choose based on typical item complexity: start small (e.g., 50 items) and increase if safe.
  • Storage: transients/options are simple but limited; a custom table is best for many concurrent jobs or large metadata.
  • Retries: implement limited retries and exponential backoff on the client to handle transient server or network errors.
  • Concurrency: single-threaded polling is simplest; multiple concurrent requests can increase throughput but require careful locking to avoid double-processing.
  • Idempotency: ensure processing an item twice is harmless (e.g., skip if already processed).
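The chunk-size tradeoff can also be handled dynamically. As an illustrative sketch (the function name, target time, and clamping bounds below are assumptions for this example, not part of any WordPress API), the client can scale the next batch's limit against the server-reported processing time of the previous batch:

```javascript
// Hypothetical helper: adapt the batch size so each request takes roughly
// targetMs of server time. All defaults here are illustrative starting points.
function nextChunkSize(currentLimit, lastBatchMs, { targetMs = 5000, min = 10, max = 500 } = {}) {
  if (!lastBatchMs || lastBatchMs <= 0) return currentLimit; // no data yet: keep the limit
  // Scale the limit proportionally to how far the last batch was from the target.
  const scaled = Math.round(currentLimit * (targetMs / lastBatchMs));
  // Clamp so one slow or fast batch cannot swing the size to extremes.
  return Math.max(min, Math.min(max, scaled));
}
```

Feeding the `time` field from each batch response into a helper like this keeps batches near a safe duration without hand-tuning per host.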

REST response contract (recommended)

Have the REST endpoint return a consistent JSON structure so the client can decide next actions. Example fields:

job_id: Unique identifier for this job
processed: Number of items processed so far
total: Total number of items to process (if known)
next_offset: Offset (or cursor) to use for the next batch, or null if done
done: Boolean indicating completion
errors: Array of error messages (if any) produced in this batch
time: Server-side time used to process this batch (ms)
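As a quick illustration of how a client might consume this contract, the helper below maps one parsed batch response to the client's next step (a sketch only; the action strings are made up for this example):

```javascript
// Decide the next client action from a batch response that follows the contract.
function nextAction(json) {
  // A WP_Error-style payload carries a code/message instead of job fields.
  if (json.code || !json.job_id) {
    return { action: 'error', message: json.message || 'Server error' };
  }
  if (json.done) {
    return { action: 'finish', processed: json.processed };
  }
  // Not done yet: continue from the offset the server handed back.
  return { action: 'continue', offset: json.next_offset };
}
```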

Example: REST route and handler (PHP)

The following example shows a simple plugin-style REST route that handles a single batch per request. It uses a transient to store job state. Replace the capability check and nonce validation with what fits your context.

<?php
add_action( 'rest_api_init', function () {
    register_rest_route( 'my-plugin/v1', '/job', array(
        'methods'             => 'POST',
        'callback'            => 'my_plugin_process_batch',
        'permission_callback' => function ( WP_REST_Request $request ) {
            // Example: require the manage_options capability and a valid nonce
            $nonce = $request->get_header( 'X-WP-Nonce' );
            if ( ! wp_verify_nonce( $nonce, 'wp_rest' ) ) {
                return new WP_Error( 'invalid_nonce', 'Invalid nonce', array( 'status' => 403 ) );
            }
            return current_user_can( 'manage_options' );
        },
    ) );
} );

function my_plugin_process_batch( WP_REST_Request $request ) {
    $params = $request->get_json_params();
    $job_id = isset( $params['job_id'] ) ? sanitize_text_field( $params['job_id'] ) : null;
    $offset = isset( $params['offset'] ) ? intval( $params['offset'] ) : 0;
    $limit  = isset( $params['limit'] ) ? intval( $params['limit'] ) : 50;

    if ( empty( $job_id ) ) {
        return rest_ensure_response( array(
            'code'    => 'missing_job_id',
            'message' => 'job_id is required',
        ) );
    }

    // Load or initialize the job state stored as a transient
    $transient_key = "my_plugin_job_{$job_id}";
    $state = get_transient( $transient_key );
    if ( false === $state ) {
        $state = array(
            'processed' => 0,
            'failed'    => 0,
            'total'     => null, // optional: set if you can compute the total
            'status'    => 'running',
        );
    }

    $start_time = microtime( true );
    $errors     = array();

    // Example: get items to process with offset and limit.
    // Replace with your data source (posts, external API, custom table, etc.)
    $query_args = array(
        'post_type'      => 'post',
        'posts_per_page' => $limit,
        'offset'         => $offset,
        'post_status'    => 'any',
        'fields'         => 'ids',
    );
    $query = new WP_Query( $query_args );
    $ids   = $query->posts;

    foreach ( $ids as $id ) {
        try {
            // Idempotent processing: check a meta flag before processing
            if ( get_post_meta( $id, '_my_plugin_processed', true ) ) {
                continue;
            }

            // Perform the heavy operation for a single item.
            // Example: regenerate a heavy meta, call an external API, etc.
            // This needs to be short and safe for each item.
            update_post_meta( $id, 'processed_at', current_time( 'mysql' ) );

            // Mark processed
            update_post_meta( $id, '_my_plugin_processed', 1 );
            $state['processed']++;
        } catch ( Exception $e ) {
            $errors[] = sprintf( 'ID %d: %s', $id, $e->getMessage() );
            $state['failed']++;
        }
    }

    $done = empty( $ids ) || count( $ids ) < $limit; // if fewer items than $limit, assume finished
    $state['status'] = $done ? 'completed' : 'running';
    if ( $done ) {
        $state['completed_at'] = current_time( 'mysql' );
    }

    // Persist state and return the response
    set_transient( $transient_key, $state, 12 * HOUR_IN_SECONDS );

    $response = array(
        'job_id'      => $job_id,
        'processed'   => $state['processed'],
        'failed'      => $state['failed'],
        'total'       => $state['total'],
        'next_offset' => $done ? null : $offset + $limit,
        'done'        => $done,
        'errors'      => $errors,
        'time'        => round( ( microtime( true ) - $start_time ) * 1000 ), // ms
    );

    return rest_ensure_response( $response );
}
?>

Notes on the PHP example

  • Idempotency: the handler checks a post meta flag to avoid processing the same item twice.
  • State storage: using a transient keyed by job_id to track progress. For large-scale or concurrent jobs, a custom DB table with explicit locks is safer.
  • Responsibility split: each invocation processes only a small number of items and returns an updated offset.

Persisting job state: transients vs custom table

For simple use cases, transients or options suffice. For robust multi-job or multi-user cases, create a custom table. Example SQL to create a lightweight job table:

CREATE TABLE wp_my_plugin_jobs (
  job_id VARCHAR(191) PRIMARY KEY,
  user_id BIGINT UNSIGNED NOT NULL,
  status VARCHAR(32) NOT NULL,
  processed BIGINT UNSIGNED DEFAULT 0,
  failed BIGINT UNSIGNED DEFAULT 0,
  total BIGINT UNSIGNED DEFAULT NULL,
  meta LONGTEXT,
  created_at DATETIME NOT NULL,
  updated_at DATETIME NOT NULL
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Use $wpdb->insert / $wpdb->update to persist and $wpdb->get_row to read. Wrap updates in transactions if necessary (or implement optimistic locking with a version field).

Client-side JavaScript: fetch loop, progress and retries

The client repeatedly calls the REST endpoint until the job is done. This example uses an AbortController for cancellation, exponential backoff on errors, and a capped number of retry attempts.

// Example: run a job from the browser
async function runJob({ jobId, limit = 50, nonce, onProgress }) {
  let offset = 0;
  let done = false;
  let attempts = 0;
  const maxAttempts = 5;
  const baseDelay = 300; // ms
  const controller = new AbortController();

  // Expose a way to cancel from outside by returning the controller.
  const result = {
    controller,
    promise: (async () => {
      while (!done) {
        try {
          const resp = await fetch('/wp-json/my-plugin/v1/job', {
            method: 'POST',
            headers: {
              'Content-Type': 'application/json',
              'X-WP-Nonce': nonce,
            },
            body: JSON.stringify({ job_id: jobId, offset, limit }),
            signal: controller.signal,
          });
          if (!resp.ok) {
            throw new Error('Network error: ' + resp.status);
          }
          const json = await resp.json();

          // If your REST endpoint returns WP_Error objects, check for code/message
          if (json.code || !json.job_id) {
            throw new Error(json.message || 'Server error');
          }

          // Report progress to the consumer
          if (typeof onProgress === 'function') {
            onProgress({
              processed: json.processed,
              total: json.total,
              failed: json.failed,
              done: json.done,
            });
          }

          if (json.done) {
            done = true;
            return json;
          }

          // Reset attempts on success
          attempts = 0;
          offset = json.next_offset === null ? offset + limit : json.next_offset;

        } catch (err) {
          attempts++;
          if (attempts > maxAttempts) {
            throw new Error('Max retries exceeded: ' + err.message);
          }
          // Exponential backoff + jitter
          const delay = Math.pow(2, attempts) * baseDelay + Math.random() * 100;
          await new Promise(resolve => setTimeout(resolve, delay));
        }
      }
    })(),
  };

  return result;
}

// Usage:
const runner = await runJob({
  jobId: 'export_2025_09',
  limit: 100,
  nonce: my_nonce_from_wp_localize,
  onProgress: ({ processed, total, failed, done }) => {
    // update UI
    console.log('Processed:', processed, 'Total:', total, 'Failed:', failed, 'Done:', done);
  },
});

// To cancel:
// runner.controller.abort()

Client-side UI hints

  • Render a progress bar using processed/total. If total is unknown, show items processed and a spinner.
  • Allow cancellation: call abort() on the AbortController and optionally tell the server to mark the job as aborted.
  • Persist job_id in localStorage so the UI can resume polling after browser reload (if the job is resumable server-side).
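The last hint can be sketched as follows. The key name is an arbitrary choice for this example, and the storage object is injectable (defaulting to window.localStorage in a browser) so the helpers also work outside a browser:

```javascript
// Persist the active job id so the UI can offer to resume after a reload.
const JOB_KEY = 'my_plugin_active_job'; // arbitrary key, not a WP convention

function saveActiveJob(jobId, storage = window.localStorage) {
  storage.setItem(JOB_KEY, jobId);
}

function resumeActiveJob(storage = window.localStorage) {
  return storage.getItem(JOB_KEY); // null if there is no job to resume
}

function clearActiveJob(storage = window.localStorage) {
  storage.removeItem(JOB_KEY);
}
```

On page load, call resumeActiveJob(); if it returns a job_id and the server still knows that job, restart the polling loop instead of starting a new job.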

Advanced: locking, concurrency, and safe parallelism

If you want multiple workers (browser tabs or server cron) to process the same job concurrently, you need locking to avoid double-processing. Approaches:

  • Row locking (custom table): use SELECT … FOR UPDATE (where supported) or a status column with an atomic update (WHERE status = 'pending') to claim a batch.
  • Work queues: store item-level rows and atomically set claimed_by / claimed_at columns.
  • Optimistic locking: include a version or token that gets updated on each change; if the update fails, retry.
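The optimistic-locking idea can be illustrated in plain JavaScript. This is an in-memory sketch of the concept only; in WordPress you would express the same check as a single atomic UPDATE with a WHERE clause on the version column via $wpdb, which assumes a schema like the job table above:

```javascript
// Claim a batch using compare-and-swap on a version counter.
// A worker must present the version it read; if another worker changed the
// row in the meantime, the claim fails and the caller re-reads and retries.
function claimBatch(jobRow, expectedVersion, workerId) {
  if (jobRow.version !== expectedVersion) {
    return false; // stale read: someone else got there first
  }
  jobRow.claimed_by = workerId;
  jobRow.version += 1; // bump so concurrent claimers holding the old version fail
  return true;
}
```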

Error handling best practices

  • Return clear error codes and messages from REST endpoints (use WP_Error with proper status codes).
  • Log serious failures on the server and store per-item errors for later inspection.
  • Make per-item processing idempotent so retrying a batch is safe.
  • Limit retries on the client to avoid tight infinite loops.

Real-world plugin example: start and poll endpoints

A typical workflow:

  1. User clicks Start job. Client calls REST endpoint /start-job which creates a job record (returns job_id).
  2. Client runs the poller loop calling /job with job_id, offset and limit until completion.
  3. Server marks job completed when done and returns final summary.
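Step 1 of this workflow might look like the sketch below on the client side. The /start-job route matches the PHP example in this section; the injectable fetchImpl parameter is only there so the flow can be exercised without a network:

```javascript
// Ask the server to create a job record and return its id.
async function startJob({ nonce, fetchImpl = fetch }) {
  const resp = await fetchImpl('/wp-json/my-plugin/v1/start-job', {
    method: 'POST',
    headers: { 'X-WP-Nonce': nonce },
  });
  if (!resp.ok) {
    throw new Error('Failed to start job: ' + resp.status);
  }
  const { job_id, total } = await resp.json();
  return { jobId: job_id, total };
}
```

The returned jobId is then handed straight to the runJob() poller shown earlier.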

Example: start-job endpoint (PHP)

<?php
add_action( 'rest_api_init', function () {
    register_rest_route( 'my-plugin/v1', '/start-job', array(
        'methods'             => 'POST',
        'callback'            => 'my_plugin_start_job',
        'permission_callback' => function () {
            return current_user_can( 'manage_options' );
        },
    ) );
} );

function my_plugin_start_job( WP_REST_Request $request ) {
    // Determine the total if possible (optional)
    $counts = wp_count_posts( 'post' );
    $total  = $counts->publish + $counts->draft; // simplistic example

    // Create a job id (use uniqid() or, in WP 4.7+, wp_generate_uuid4())
    $job_id = wp_generate_uuid4();

    $state = array(
        'job_id'     => $job_id,
        'status'     => 'running',
        'created_by' => get_current_user_id(),
        'created_at' => current_time( 'mysql' ),
        'processed'  => 0,
        'failed'     => 0,
        'total'      => $total,
    );

    set_transient( "my_plugin_job_{$job_id}", $state, 12 * HOUR_IN_SECONDS );

    return rest_ensure_response( array( 'job_id' => $job_id, 'total' => $total ) );
}
?>

Testing and performance tips

  • Start with small limits and gradually increase while monitoring server time and memory usage.
  • Log the time taken by each batch server-side; if a single item is slow, optimize that operation.
  • If your batch performs many DB writes, wrap item-level writes into a single transaction (when using custom tables) to reduce overhead.
  • Use object caching and avoid heavy WP_Query inside tight loops (fetch only IDs, then process in smaller queries).

Security checklist

  • Validate nonces and use capability checks in permission_callback.
  • Sanitize all input and escape output where appropriate.
  • Make sure the job_id is unguessable (use UUID) and authorize who can view/control a job.
  • Limit job lifetime with expirations and allow an admin to cancel stale jobs.

Summary: checklist to implement a robust batched runner

  1. Design your REST contract (job start, batch process endpoints, optional cancel/status endpoints).
  2. Choose a persistent storage strategy for job state (transient, option, custom table).
  3. Make per-item processing idempotent and short.
  4. Implement client polling with exponential backoff, cancellation, and progress UI updates.
  5. Add server-side validation, logging, and a way to inspect or resume jobs.
  6. Test with different chunk sizes, concurrency levels, and on real hosting to find safe defaults.

