
Embeddings Issues

Greyed Out Buttons

If you can’t use the Add button because it’s greyed out, it’s simply because you haven’t selected an environment to work with. On the right-side panel, there’s a dropdown menu—if it still says “Select”, it means no environment has been chosen yet. Select the environment you want to use, and the Add button will become active.

Error Icon

When adding a new embedding, you might see an error icon. This indicates the embedding could not be added due to an unexpected issue. Hover your mouse over the icon to view the specific error message. This message will help you identify and resolve the problem quickly.

Common Errors

Unauthorized: Malformed domain – Unauthorized (401)

This means your Pinecone server domain is incorrect. Please double-check the host URL in your Pinecone account and ensure it’s set correctly in AI Engine. If the domain is correct, your API key might be invalid or expired, or your project settings may have changed.

Error code 3: Vector dimension 1536 does not match the dimension of the index 1024

This means the dimension you chose when creating your index does not match the dimension of the vectors produced by your Embeddings model in AI Engine. Either change your Embeddings model to one that matches the index dimension, or delete your index and create a new one that matches the model you’re using in AI Engine. Follow the documentation when creating your index to avoid this issue.
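As a rough illustration of the check behind this error, the sketch below compares a model’s output dimension with the dimension of an index. The helper name and the lookup table are made up for this example and are not part of AI Engine or Pinecone; the values listed are the default output sizes of OpenAI’s embedding models.

```python
# Default output dimensions of OpenAI embedding models.
MODEL_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_dimensions(model: str, index_dimension: int) -> str:
    """Return a short diagnosis for a model/index pair (hypothetical helper)."""
    model_dimension = MODEL_DIMENSIONS[model]
    if model_dimension == index_dimension:
        return "ok"
    return (
        f"mismatch: {model} produces {model_dimension}-dimension vectors, "
        f"but the index expects {index_dimension}"
    )


# The situation from the error message above: a 1536 model and a 1024 index.
print(check_dimensions("text-embedding-ada-002", 1024))
```

If the result is a mismatch, the fix is the same as described above: pick a model that matches the index, or recreate the index with the model’s dimension.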

If you’re not sure what your error message means, please contact support so we can help you identify and resolve the issue directly.

Embeddings Are Not Used

Make sure you can find the embedding you are expecting by using the Search mode in the embeddings tab. If nothing appears, check that your minimum score setting is low enough; if it is too high, no embedding at all might be picked up.
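To see why a high minimum score can return nothing, here is a minimal sketch of a score threshold. The data is made up, the function is hypothetical, and the 0 to 1 score scale is an assumption for illustration; check the scale your own settings use.

```python
def filter_by_min_score(matches, min_score):
    """Keep only matches whose similarity score reaches the threshold."""
    return [m for m in matches if m["score"] >= min_score]


# Made-up search results, for illustration only.
matches = [
    {"title": "Refund policy", "score": 0.82},
    {"title": "Shipping times", "score": 0.41},
    {"title": "Opening hours", "score": 0.27},
]

print(len(filter_by_min_score(matches, 0.35)))  # 2 matches kept
print(len(filter_by_min_score(matches, 0.90)))  # 0 matches kept
```

With the threshold at 0.90, even the best match is discarded, which is what an empty search result usually looks like in practice.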

Sometimes your query might not be precise enough for a vector to be returned. Be sure to use terms that match what you are looking for.

It’s a good idea to check whether embeddings are being queried at all, that is, whether AI Engine actually sent a request to transform the user’s input into an embedding so it can be matched against the content of your vector database. Use the Queries tab to confirm that queries are being sent: you should see a request made with the ada model, which is used to fetch the data from the vector database.

Then, you can use the Discussion tab to directly see which embeddings were found and added to the chatbot context. These will appear in gray just above the chatbot’s response. You can also inspect the content of each embedding by checking the Context section under Information.

Please ensure that there is enough room in your context for your embeddings. In the settings, you will find the context max length parameter, which determines how many characters your embeddings can occupy in the context. If this setting is empty, it is treated as 0, which means the embeddings get no space at all and, consequently, won’t be used. If it is too low, your embeddings may be cut off, leaving a context that doesn’t really help your chatbot.
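The effect of this setting can be pictured with a small sketch. This is not AI Engine’s actual implementation; the function and its behavior are assumptions that simply mirror the description above (an empty value counts as 0, a low value truncates).

```python
def build_context(embeddings, max_length):
    """Concatenate embedding texts until the character budget is used up."""
    budget = max_length or 0  # an empty setting (None) is treated as 0
    context = ""
    for text in embeddings:
        remaining = budget - len(context)
        if remaining <= 0:
            break
        context += text[:remaining]  # cut the text if it does not fit
    return context


texts = ["Our store opens at 9am.", "Refunds take 5 days."]
print(repr(build_context(texts, None)))  # nothing fits
print(repr(build_context(texts, 10)))    # first text is cut off
print(repr(build_context(texts, 1000)))  # everything fits
```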

All the embeddings are stored locally in the wp_mwai_vectors table. You should be able to view them there and perform manual operations if needed. For example, deleting all embeddings at once directly in this table can be faster than removing them through the AI Engine embeddings table. If you are unsure about what you are doing, it is recommended to create a backup first using a reliable tool like the excellent BlogVault.

If everything seems to work fine but you are still not getting the answer you want, the cause may be the behavior of the model you are using. Sometimes, contextual data is ignored if the model judges it irrelevant or if your instructions are contradictory or unrelated. For instance, a GPT model will most likely respond that it can’t access current data when your prompt asks for it, even though you expect it to take that data from the embedding.

On Pinecone

If you are using Pinecone, you can log in to your account and look inside your index directly to see how many vectors are inserted in each namespace. It should be updating in real-time, so you can check the difference with what’s showing up on AI Engine.

If you are using the Pinecone free tier, please note that 7 days of inactivity will result in the termination of your Pinecone project. Log in to your account to make sure it still exists.

Orphan Vectors

If some “N/A” or orphan embeddings are created, it may be due to discrepancies between your database and the Pinecone registry (this is explained further down in this documentation). You should be able to delete them manually until they are all gone. If it keeps happening, look inside your PHP logs for any errors, and try to change or refresh your index and namespace. If there are too many of them, please refer to the last part of this documentation.

The Pinecone database contains only vectorized data, which means that by itself, it’s essentially a collection of numbers that can’t be directly used. This is why the AI Engine also utilizes your database to store the corresponding vectorized values in a textual format. You can think of the AI Engine’s database as a translation book for your Pinecone Database. Therefore, it’s important to ensure that both databases have the same number of entries.

When there’s a discrepancy between the two databases, you might encounter orphaned vectors appearing in the ‘Embedding’ table. This indicates that a result was returned from a request to Pinecone, but no corresponding textual value (translation) was found in your database. In such cases, you can either add the missing value or remove the orphaned vector to maintain a clean dataset. In some instances, Pinecone may store the textual value in metadata, and the AI Engine will automatically use it to create a valid ‘OK’ embedding.

To start this process, you can use the ‘Sync Pull’ option, which asks Pinecone for every saved vector, allowing you to either clean up the orphans or fill them with the corresponding textual data.
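Conceptually, finding orphans is a comparison between two sets of IDs. The sketch below is only an illustration of that idea, with a hypothetical helper and made-up IDs; it does not call Pinecone or read the WordPress database.

```python
def find_discrepancies(pinecone_ids, local_ids):
    """Compare vector IDs from both stores (hypothetical helper)."""
    pinecone_ids, local_ids = set(pinecone_ids), set(local_ids)
    return {
        # In Pinecone but with no text in the local database.
        "orphans": sorted(pinecone_ids - local_ids),
        # In the local database but never stored in Pinecone.
        "missing_in_pinecone": sorted(local_ids - pinecone_ids),
    }


result = find_discrepancies(["a1", "b2", "c3"], ["a1", "c3", "d4"])
print(result)
```

Here "b2" would show up as an orphan, because Pinecone returns it but no matching text exists locally.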


SSL-related errors come from your hosting service. Unfortunately, there is nothing we can do on our side, except give you a refund, but we’re sure you would prefer it to work. 🙌

These errors usually indicate an SSL connection problem, which typically occurs when your server has an outdated cURL package or SSL protocol. A firewall on your server is another possible cause. Your web host provider should have more information about this issue.

Embeddings Are Not Auto-Synced

If you notice that your embeddings are not updating automatically in the AI Engine plugin, it could be due to WordPress cron jobs not running as expected. This guide will help you understand the possible causes and how to resolve the issue.

Possible Causes

  • WordPress Cron Jobs Not Running: The AI Engine relies on scheduled events (cron jobs) to update embeddings every few minutes. If these cron jobs aren’t running, embeddings won’t auto-sync.
  • Server-Level Password Protection: If your website (including wp-cron.php) is behind server-level password protection (like cPanel directory privacy), cron jobs can be prevented from executing.
  • Plugin Conflicts: Security, caching, or optimization plugins might interfere with scheduled events.
  • Misconfigured wp-config.php: Disabling WP-Cron in the configuration file stops all scheduled tasks.

How to Diagnose the Issue

You can go into the AI Engine settings and enable the Dev Tools. This will give you access to a new tab with various options to help you test how the crons are working on your website. The “Run Task” button will force AI Engine to run the tasks immediately—normally, these tasks are run every 10 minutes.

If you need to dive deeper into debugging crons, you can follow these steps:

Install WP Crontrol Plugin

  • Step: Install and activate the WP Crontrol plugin.
  • Purpose: This plugin lets you view and manage all cron events in WordPress.

Check Cron Events

  • Navigate to: Tools > Cron Events in your WordPress dashboard.
  • Look for:
    • The mwai_5mn event (used by AI Engine for embeddings).
    • Any errors or warnings next to scheduled events (e.g., HTTP 401 errors).

Identify Errors

  • HTTP 401 Errors: Indicate unauthorized access, possibly due to password protection blocking wp-cron.php.
  • No Scheduled Events: Suggests that WP-Cron might be disabled or not functioning.
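If you want to check from outside WordPress whether wp-cron.php is reachable, a small probe like the one below can help. It is a sketch using only Python’s standard library; the function names are hypothetical, the diagnosis texts are our own wording, and example.com is a placeholder for your site URL.

```python
import urllib.error
import urllib.request


def diagnose_status(status: int) -> str:
    """Map an HTTP status returned for wp-cron.php to a likely cause."""
    if status == 200:
        return "wp-cron.php is reachable"
    if status in (401, 403):
        return (
            "access denied: password protection or a security rule "
            "may be blocking wp-cron.php"
        )
    return f"unexpected status {status}: check your server and PHP logs"


def probe_wp_cron(site_url: str, timeout: float = 10.0) -> str:
    """Request wp-cron.php and describe the result."""
    url = site_url.rstrip("/") + "/wp-cron.php"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return diagnose_status(response.status)
    except urllib.error.HTTPError as error:  # 4xx and 5xx responses
        return diagnose_status(error.code)
    except urllib.error.URLError as error:  # DNS, TLS, or connection failure
        return f"could not connect: {error.reason}"


# Example (performs a real HTTP request, so it is left commented out):
# print(probe_wp_cron("https://example.com"))
```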

Solutions

Remove Server-Level Password Protection

If your site is password-protected at the server level (e.g., via cPanel):

  • Disable Directory Privacy:
    • Log in to your hosting control panel.
    • Navigate to Directory Privacy or similar.
    • Remove the password protection from your website directory.
  • Alternative: Use a WordPress plugin like Password Protected to restrict site access without blocking wp-cron.php.

Check for Plugin Conflicts

  • Disable Other Plugins:
    • Temporarily deactivate security, caching, or optimization plugins.
    • Check if embeddings start auto-syncing.
  • Reactivate Plugins One by One:
    • Identify if a specific plugin is causing the conflict.

Verify WP-Cron Is Enabled

  • Check wp-config.php:
    • Access your website files via FTP or a file manager.
    • Open the wp-config.php file.
    • Ensure there’s no line that says define('DISABLE_WP_CRON', true);. If it exists, remove it or set it to false.
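If you have a copy of the file at hand, the check can also be automated. The following sketch scans the text of a wp-config.php file for an active DISABLE_WP_CRON line; the helper is hypothetical and only handles the simple one-line form shown above.

```python
import re

# Matches define('DISABLE_WP_CRON', <value>) with either quote style.
PATTERN = re.compile(
    r"""define\(\s*['"]DISABLE_WP_CRON['"]\s*,\s*([^)]+?)\s*\)""",
    re.IGNORECASE,
)


def wp_cron_disabled(config_text: str) -> bool:
    """Return True if the config text sets DISABLE_WP_CRON to true."""
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("//", "#", "*", "/*")):
            continue  # ignore commented-out lines
        match = PATTERN.search(stripped)
        if match:
            return match.group(1).strip("'\"").lower() in ("true", "1")
    return False


sample = "<?php\ndefine('DISABLE_WP_CRON', true);\n"
print(wp_cron_disabled(sample))  # True: WP-Cron is disabled in this sample
```

To run it on a real file, read the file first, for example with `open("wp-config.php").read()`, and pass the text to the function.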

Use an External Cron Job

If you prefer or if WP-Cron isn’t reliable:

  • Set Up a Server Cron Job:
    • Schedule a cron job on your server to call wp-cron.php at regular intervals.
    • This bypasses the need for visitors to trigger cron events.
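As an illustration, the script below calls wp-cron.php once; your server’s scheduler would then run it at whatever interval you choose. This is a sketch rather than an official setup: the function names are hypothetical, example.com is a placeholder, and many hosts offer their own cron interface that can call the same URL directly.

```python
import urllib.request


def build_cron_url(site_url: str) -> str:
    """Build the wp-cron.php URL for a site."""
    return site_url.rstrip("/") + "/wp-cron.php?doing_wp_cron"


def trigger_wp_cron(site_url: str, timeout: float = 30.0) -> int:
    """Call wp-cron.php once and return the HTTP status code."""
    url = build_cron_url(site_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


print(build_cron_url("https://example.com/"))

# Example (performs a real HTTP request, so it is left commented out):
# trigger_wp_cron("https://example.com")
```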

Confirm the Fix

  1. Edit a Post: Make a change to any post to initiate the embedding process.
  2. Visit Your Website: Load any page to trigger WP-Cron (if using WP-Cron).
  3. Check Embeddings:
    • Go to AI Engine > Embeddings.
    • Verify that the embeddings have been updated.

Embeddings not auto-syncing is often due to issues with WordPress cron jobs. By ensuring wp-cron.php is accessible and cron jobs are running correctly, you can resolve the issue and keep your embeddings up-to-date automatically.