• ianatkins

    (@ianatkins)


    Hello.

    We have some custom code that assigns taxonomies to an attachment. After assigning the taxonomy we want to update the Algolia index.

    At the moment we manually trigger do_action( 'attachment_updated' ), which then gets picked up by the ‘sync_item’ function.

    This isn’t very efficient, as we then see multiple individual batch API requests in Algolia, updating a single record at a time.

    Is there any way we can update multiple records at once, similar to how things run when triggering a manual re-index?

    I saw there is the sync_term_posts function. Could we use that, and if so, would we need to instantiate a class to use it?

    Thanks

    Ian.

Viewing 12 replies - 1 through 12 (of 12 total)
  • Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    Hmm, as you point out, you’re using attachment_updated already, which indeed is the hook we listen for as well in the Posts changes watcher class.

    That said, does running a bulk re-index handle updating all the parts you’re expecting with the term data?

    I suppose my main question is how are you scripting things out here? Like is this a large bulk import process? or is it someone manually updating things individually as needed?

    Thread Starter ianatkins

    (@ianatkins)

    Hi Michael.

    Yes running a bulk re-index catches the updates, but the site has 100k images.

    On the frontend the user can assign anywhere from 1 to 100 images to a taxonomy term (for example), so it doesn’t make sense to trigger a full re-index. Ideally we’re just looking for a way to trigger a small batch by sending an array of the affected IDs.

    Thanks,

    Ian

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    To be clear, is it the term that is getting a picture associated with it? or is it the attachment post that’s getting a term associated? Asking because we have these term hooks as well https://github.com/WebDevStudios/wp-search-with-algolia/blob/main/includes/watchers/class-algolia-term-changes-watcher.php

    Thread Starter ianatkins

    (@ianatkins)

    Hi Michael,

    We’re assigning a taxonomy term to the attachment post, using wp_set_object_terms.

    Thanks for the watchers link. It looks like the handle_changes function is hooked in on wp_set_object_terms, but if I’m reading it correctly, handle_changes syncs the terms themselves, not the posts attached to the terms.

    I’m not sure that’s logical, as the terms themselves don’t change when calling wp_set_object_terms, but the term relationships do (e.g. which terms are assigned to which objects). Shouldn’t that function sync the $object_id’s, not the $term_id’s?

    If there is a function we can call to manually batch sync a list of post/attachment IDs, then we should be able to hook it all together.

    Thanks,

    Ian

    • This reply was modified 3 months ago by ianatkins.
    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    It’s logical in that we do have the ability to index and search taxonomy terms, though I admit I don’t know how much that’s actively used.

    Out of curiosity, and forgive me if you have, but have you verified whether or not part or all of the attachment changes are being pushed as they’re being saved?

    My head says this should be getting run each time you trigger do_action( 'attachment_updated' ), since we hook in with add_action( 'attachment_updated', array( $this, 'sync_item' ) ); though you may want to pass the attachment ID as well.

    Core calls it like do_action( 'attachment_updated', $post_id, $post_after, $post_before ), but our plugin is only expecting/using $post_id.

    Thread Starter ianatkins

    (@ianatkins)

    Hi Michael.

    Yes, we see individual API requests in Algolia when attachment_updated fires, each request covering a single attachment. Ideally I’m trying to batch those requests.

    For example, when I tag 16 images, I see 32 individual API requests, with data for 1 attachment per request (one request for the searchable index, another for the attachment index).

    I’m trying to get it working more like a re-index, where the API requests carry 100 posts per request.

    Thanks.
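To put numbers on the difference described above, here is a small sketch of the request arithmetic. It assumes one record per attachment and uses the batch size of 100 seen during a re-index; the function names are made up for illustration and are not part of the plugin.

```php
<?php
// Illustration only: this counts API requests, it does not talk to Algolia.
// Assumption: each attachment produces exactly one record per index.

/**
 * Requests needed when every item is pushed on its own, once per index.
 */
function requests_per_item( int $items, int $indices ): int {
    return $items * $indices;
}

/**
 * Requests needed when items are grouped into batches, once per index.
 */
function requests_batched( int $items, int $indices, int $batch_size ): int {
    return (int) ceil( $items / $batch_size ) * $indices;
}

echo requests_per_item( 16, 2 ), "\n";     // 32
echo requests_batched( 16, 2, 100 ), "\n"; // 2
```

So tagging 16 images across the two indices would drop from 32 requests to 2 if the same 100-per-request batching applied.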

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    I hope I’m not coming off as dismissive in any way, just trying to think through everything with a not super-common use case (terms on attachments).

    To be certain, is it a case of you add a tag, it fires off the API requests for the 2 indexes, you add another tag, 2 more requests, and repeat?

    Or is it that you add tags (my-tag, another-tag, one-more), click a save button, and you get a single API request for each of the 2 indexes?

    And ideally you’re wanting to “collect” the attachments to update and send, say, 16 pictures to update all with one request (for each index)?

    Thread Starter ianatkins

    (@ianatkins)

    Hey Michael,

    Not at all, appreciate the support and that it’s a niche use case!

    Yes, it’s two API calls for each post updated at the moment (essentially one pair with each trigger of attachment_updated).

    In the example below, $collection_media_ids is an array of the attachment posts to update. Ideally I’d like to trigger a sync after the foreach, sending just the array of IDs to update rather than making individual syncs.

    // Add to existing collection
    $collection_id = absint( $collection_id );

    // Update
    foreach ( $collection_media_ids as $media_id ) {

        wp_set_object_terms( $media_id, $collection_id, 'collection', true );

        // Trigger do_action so Algolia index updates
        do_action( 'attachment_updated', $media_id, false, false );

    }

    If I don’t trigger do_action, I don’t see any activity in the Algolia logs, so nothing seems to get synced.

    Thanks.

    Ian.
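The "sync once after the foreach" flow described above could be shaped roughly like this. It is only a sketch: assign_then_sync_once(), $assign_term and $sync_batch are hypothetical names, with the first callable standing in for wp_set_object_terms() and the second for whatever batch sync call turns out to be available. Stand-in callables are used so the flow runs on its own, outside WordPress.

```php
<?php
// Sketch only. $assign_term and $sync_batch are hypothetical callables:
// the first stands in for wp_set_object_terms(), the second for a batch
// sync call that this thread has not yet identified.

function assign_then_sync_once( array $media_ids, callable $assign_term, callable $sync_batch ): array {
    $changed_ids = array();

    foreach ( $media_ids as $media_id ) {
        // Only queue IDs whose term assignment succeeded.
        if ( $assign_term( (int) $media_id ) ) {
            $changed_ids[] = (int) $media_id;
        }
    }

    // One call after the loop instead of one per attachment.
    if ( ! empty( $changed_ids ) ) {
        $sync_batch( $changed_ids );
    }

    return $changed_ids;
}

// Usage with stand-in callables.
$sync_calls = array();
$changed    = assign_then_sync_once(
    array( 11, 12, 13 ),
    function ( int $media_id ): bool {
        return 12 !== $media_id; // Pretend the assignment for ID 12 failed.
    },
    function ( array $ids ) use ( &$sync_calls ): void {
        $sync_calls[] = $ids; // Record what would have been sent.
    }
);
```

In real code the first callable would presumably wrap wp_set_object_terms() and treat a WP_Error return as a failure, so that only attachments that actually changed get queued.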

    Thread Starter ianatkins

    (@ianatkins)

    Hey Michael,

    I just tried the below after my foreach loop. It wiped the index, but it did add the specific IDs in one hit (by passing the IDs as the second parameter to re_index).

    // Sync to algolia
    $algolia_plugin = \Algolia_Plugin_Factory::create();
    $indices[] = new \Algolia_Posts_Index( 'attachment' );
    $client = $algolia_plugin->get_api()->get_client();
    $index_name_prefix = $algolia_plugin->get_settings()->get_index_name_prefix();

    foreach ( $indices as $index ) {
        $index->set_name_prefix( $index_name_prefix );
        $index->set_client( $client );

        try {
            $index->re_index( 1, $collection_media_ids );
        } catch ( AlgoliaException $exception ) {
            error_log( $exception->getMessage() ); // phpcs:ignore -- Legacy.
        }
    }

    Will try the update_records method instead, once I dig into the code a bit more!

    Thread Starter ianatkins

    (@ianatkins)

    Hey Michael.

    Below is as far as I’ve got. Calling update_records doesn’t work, as it’s a private method. I have just called sync on the individual index I’m interested in, rather than firing the action that other plugins might be hooked into.

    Do let me know if there is a more efficient way to batch multiple syncs in one hit without rebuilding the whole index.

    Thanks!

    // Add to existing collection
    $collection_id = absint( $collection_id );

    // Sync to algolia
    $algolia_plugin = \Algolia_Plugin_Factory::create();
    $synced_indices_ids = $algolia_plugin->get_settings()->get_synced_indices_ids();
    $index_name_prefix = $algolia_plugin->get_settings()->get_index_name_prefix();
    $client = $algolia_plugin->get_api()->get_client();

    $index = new \Algolia_Posts_Index( 'attachment' );
    $index->set_name_prefix( $index_name_prefix );
    $index->set_client( $client );
    $index->set_enabled( true );

    // Update
    foreach ( $collection_media_ids as $media_id ) {

        // Add Taxonomy Term
        wp_set_object_terms( $media_id, $collection_id, 'collection', true );

        // Sync to Algolia index
        $media = get_post( $media_id );

        // Sync to Algolia
        try {
            $index->sync( $media );
        } catch ( AlgoliaException $exception ) {
            error_log( $exception->getMessage() ); // phpcs:ignore -- Legacy.
        }
    }
    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    That last bit is very similar to some code we have in our premium Pro addon, used to handle whether or not sold-out products get indexed. However, for that we use $index->sync( $item ); and it is still just one at a time. This method is found in the Algolia_Index class.

    Part of me wants to say to collect the attachment IDs and sync, but I need to remember we still need to collect all the intended data for each, which is done by the get_records() methods you may have seen.

    Ultimately it’s a question of how to most efficiently get to a point that you could run $index->saveObjects( $sanitized_records ); which is a method from their PHP client, and i **think** would handle multiple posts in the WordPress context. I can’t promise on that part offhand.
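As a rough sketch of that idea: build the records for every collected ID first, then hand them to saveObjects() in chunks. Everything here besides the saveObjects() method name is an assumption for illustration. FakeIndex is a stand-in for the Algolia client's index object, $build_records is a hypothetical stand-in for the plugin's get_records() logic (this thread does not confirm a public way to call it), and the objectID format is made up.

```php
<?php
// Sketch only. FakeIndex stands in for the Algolia PHP client's index
// object; $build_records is a hypothetical stand-in for get_records().

final class FakeIndex {
    /** @var array[] Each entry is the list of records sent in one call. */
    public $calls = array();

    public function saveObjects( array $records ): void {
        $this->calls[] = $records;
    }
}

/**
 * Builds records for every ID, then pushes them in chunks.
 *
 * @return int Number of saveObjects() calls made.
 */
function push_records_in_batches( $index, array $ids, callable $build_records, int $batch_size = 100 ): int {
    $records = array();

    foreach ( $ids as $id ) {
        // One item may produce several records, so merge rather than append.
        $records = array_merge( $records, $build_records( $id ) );
    }

    $calls = 0;
    foreach ( array_chunk( $records, $batch_size ) as $chunk ) {
        $index->saveObjects( $chunk );
        $calls++;
    }

    return $calls;
}

// Usage: 16 attachments, one illustrative record each.
$index = new FakeIndex();
$calls = push_records_in_batches(
    $index,
    range( 1, 16 ),
    function ( int $id ): array {
        return array( array( 'objectID' => $id . '-0', 'post_id' => $id ) );
    }
);
```

With 16 IDs and a batch size of 100 this makes a single saveObjects() call per index. Whether the real records can be built from outside the plugin's index classes is the open question.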

    Plugin Contributor Michael Beckwith

    (@tw2113)

    The BenchPresser

    hey @ianatkins

    Did a solution ever get found here? or are you still needing some help?
