Vector Index API Reference

Index Management Operations

These operations are performed at the client level and don't require connecting to a specific index.

vectorstackai.PreciseSearch.list_indexes

list_indexes()

Lists information about all available indexes.

Retrieves metadata for all indexes associated with the current API key.

Returns:

Name Type Description
list_info List[Dict[str, Any]]

A list of dictionaries, where each dictionary contains information about an index with the following keys:

  • index_name (str): The name of the index.
  • status (str): The current status of the index ("initializing" or "ready").
  • num_records (int): The number of records stored in the index.
  • dimension (int): The dimensionality of the vectors in the index.
  • metric (str): The distance metric used for similarity search ("cosine" or "dotproduct").
  • features_type (str): The type of features stored ("dense" or "hybrid").
  • embedding_model_name (str): The name of the embedding model used (if applicable).
  • optimized_for_latency (bool): Indicates whether the index is optimized for low-latency queries.
Source code in src/vectorstackai/client.py
def list_indexes(self) -> List[Dict[str, Any]]:
    """Lists information about all available indexes.

    Retrieves metadata for all indexes associated with the current API key.

    Returns:
        list_info: A list of dictionaries, where each dictionary contains information about an index with the following keys:

            - index_name (str): The name of the index.
            - status (str): The current status of the index ("initializing" or "ready").
            - num_records (int): The number of records stored in the index.
            - dimension (int): The dimensionality of the vectors in the index.
            - metric (str): The distance metric used for similarity search ("cosine" or "dotproduct").
            - features_type (str): The type of features stored ("dense" or "hybrid").
            - embedding_model_name (str): The name of the embedding model used (if applicable).
            - optimized_for_latency (bool): Indicates whether the index is optimized for low-latency queries.
    """
    return api_resources.Index.list_indexes(connection_params=self.connection_params)
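
As a rough illustration of consuming the returned list, the sketch below filters for indexes that are ready to use. The helper name `ready_index_names` and the sample data are hypothetical; in real use the list would come from calling `list_indexes()` on a client.

```python
from typing import Any, Dict, List


def ready_index_names(list_info: List[Dict[str, Any]]) -> List[str]:
    """Return the names of indexes whose status is "ready"."""
    return [info["index_name"] for info in list_info if info.get("status") == "ready"]


# Made-up sample shaped like the documented return value of list_indexes().
sample_list_info = [
    {"index_name": "products", "status": "ready", "num_records": 1200,
     "dimension": 384, "metric": "cosine", "features_type": "dense",
     "embedding_model_name": "none", "optimized_for_latency": False},
    {"index_name": "articles", "status": "initializing"},
]

print(ready_index_names(sample_list_info))
```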

vectorstackai.PreciseSearch.create_index

create_index(index_name, embedding_model_name='none', dimension=None, metric='dotproduct', features_type='dense')

Creates a new vector index with the specified parameters.

Parameters:

Name Type Description Default
index_name str

Name of the index to create.

required
embedding_model_name str

Name of the embedding model to use. There are two kinds of embedding models:

  • Integrated models: pre-trained models hosted on the vector2search platform.
  • Non-integrated models: custom models hosted on your own platform or application.

Set embedding_model_name to "none" to use your own embedding model (i.e., a non-integrated model).

'none'
dimension Optional[int]

Vector dimension (required for non-integrated models).

None
metric Optional[str]

Distance metric for comparing dense and sparse vectors. Must be one of "cosine" or "dotproduct".

'dotproduct'
features_type Optional[str]

Type of features used in the index. Must be one of "dense" or "hybrid" (sparse + dense).

'dense'
Source code in src/vectorstackai/client.py
def create_index(self, 
                 index_name: str, 
                 embedding_model_name: str = 'none', 
                 dimension: Optional[int] = None, 
                 metric: Optional[str] = 'dotproduct', 
                 features_type: Optional[str] = 'dense') -> None:
    """Creates a new vector index with the specified parameters.

    Args:
        index_name: Name of the index to create.
        embedding_model_name: Name of the embedding model to use. There are two kinds of embedding models:
            - Integrated models: These are pre-trained models hosted on the vector2search platform.
            - Non-integrated models: These are custom models hosted on your platform/application.
            - Set "embedding_model_name" to "none" for using your own embedding model (i.e. non-integrated model).
        dimension: Vector dimension (required for non-integrated models).
        metric: Distance metric for comparing dense and sparse vectors. Must be one of "cosine" or "dotproduct".
        features_type: Type of features used in the index. Must be one of "dense" or "hybrid" (sparse + dense).
    """
    # Convert "None" to "none"
    if embedding_model_name == 'None':
        embedding_model_name = 'none'

    json_data = {
        "index_name": index_name,
        "embedding_model_name": embedding_model_name,
        "dimension": dimension,
        "metric": metric,
        "features_type": features_type,
    }
    response_json = api_resources.Index.create_index(json_data=json_data, 
                                                     connection_params=self.connection_params)
    print(f"{response_json['detail']}")
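
The parameter constraints described above can be checked on the client side before a request is sent. The following is only a sketch: `check_create_index_args` is a hypothetical helper, not part of the library, and the server may apply additional rules.

```python
from typing import Optional

ALLOWED_METRICS = {"cosine", "dotproduct"}
ALLOWED_FEATURES_TYPES = {"dense", "hybrid"}


def check_create_index_args(embedding_model_name: str = "none",
                            dimension: Optional[int] = None,
                            metric: str = "dotproduct",
                            features_type: str = "dense") -> None:
    """Hypothetical pre-check; raises ValueError if the documented rules are violated."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}")
    if features_type not in ALLOWED_FEATURES_TYPES:
        raise ValueError(f"features_type must be one of {sorted(ALLOWED_FEATURES_TYPES)}")
    # A non-integrated model ("none") needs an explicit vector dimension.
    if embedding_model_name == "none" and dimension is None:
        raise ValueError("dimension is required when embedding_model_name is 'none'")


# Passes silently: own embedding model with an explicit dimension.
check_create_index_args(embedding_model_name="none", dimension=384, metric="cosine")
```

If the check passes, the same arguments can be forwarded to `create_index` together with `index_name`.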

vectorstackai.PreciseSearch.index_info

index_info(index_name)

Retrieves information about a specific vector index.

This method searches for the index specified by index_name within the list of available indexes. If the index exists, it returns a dictionary containing information about the index. This method is useful to get information about the index without having to connect to it.

Parameters:

Name Type Description Default
index_name str

Name of the index to retrieve information for.

required

Returns:

Name Type Description
index_info Dict[str, Any]

A dictionary containing information about the index with the following keys:

  • index_name (str): The name of the index.
  • status (str): The current status of the index ("initializing" or "ready").
  • num_records (int): The number of records stored in the index.
  • dimension (int): The dimensionality of the vectors in the index.
  • metric (str): The distance metric used for similarity search.
  • features_type (str): The type of features stored.
  • embedding_model_name (str): The name of the embedding model used (if applicable).
  • optimized_for_latency (bool): Whether the index is optimized for low-latency queries.
Source code in src/vectorstackai/client.py
def index_info(self, index_name: str) -> Dict[str, Any]:
    """Retrieves information about a specific vector index.

    This method searches for the index specified by `index_name` within the list of available indexes. 
    If the index exists, it returns a dictionary containing information about the index. This method 
    is useful to get information about the index without having to connect to it.

    Args:
        index_name: Name of the index to retrieve information for.

    Returns:
        index_info: A dictionary containing information about the index with the following keys:

            - index_name (str): The name of the index.
            - status (str): The current status of the index ("initializing" or "ready").
            - num_records (int): The number of records stored in the index.
            - dimension (int): The dimensionality of the vectors in the index.
            - metric (str): The distance metric used for similarity search.
            - features_type (str): The type of features stored.
            - embedding_model_name (str): The name of the embedding model used (if applicable).
            - optimized_for_latency (bool): Whether the index is optimized for low-latency queries.
    """
    info_all_indexes = self.list_indexes()
    for info_index in info_all_indexes:
        if info_index['index_name'] == index_name:
            return info_index

    raise ValueError(f"Index {index_name} not found in the list of existing indexes")

vectorstackai.PreciseSearch.index_status

index_status(index_name)

Retrieves the status of a specific vector index.

This method retrieves the status of the index specified by index_name. Here are the possible statuses:

  • "initializing": The index is being initialized.
  • "ready": The index is ready for use.
  • "failed": The index failed to initialize.
  • "deleting": The index is being deleted.
  • "undergoing_optimization_for_latency": The index is undergoing optimization for better latency and throughput.

Parameters:

Name Type Description Default
index_name str

Name of the index to retrieve status for.

required

Returns:

Name Type Description
index_status str

The current status of the index.

Source code in src/vectorstackai/client.py
def index_status(self, index_name: str) -> str:
    """Retrieves the status of a specific vector index.

    This method retrieves the status of the index specified by `index_name`.
    Here are the possible statuses:

    - "initializing": The index is being initialized.
    - "ready": The index is ready for use.
    - "failed": The index failed to initialize.
    - "deleting": The index is being deleted.
    - "undergoing_optimization_for_latency": The index is undergoing optimization for better latency and throughput.

    Args:
        index_name: Name of the index to retrieve status for.

    Returns:
        index_status (str): The current status of the index.
    """
    return self.index_info(index_name)['status']
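
Because index creation is not instantaneous, callers often poll the status until the index is usable. The sketch below shows one way to do that; `wait_until_ready` is a hypothetical helper, and treating "failed" and "deleting" as terminal states is an assumption. It takes a callable so it can run here without a network connection; in real use that callable could wrap `index_status` for a given index name.

```python
import time
from typing import Callable


def wait_until_ready(get_status: Callable[[], str],
                     timeout_s: float = 300.0,
                     poll_interval_s: float = 2.0) -> str:
    """Poll get_status() until it returns "ready"; raise on a terminal state or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "ready":
            return status
        # Assumption: an index in these states will never become ready.
        if status in ("failed", "deleting"):
            raise RuntimeError(f"Index cannot become ready; status is '{status}'")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Index still '{status}' after {timeout_s} seconds")
        time.sleep(poll_interval_s)


# Stand-in for a real status call, so the sketch runs offline.
fake_statuses = iter(["initializing", "initializing", "ready"])
print(wait_until_ready(lambda: next(fake_statuses), poll_interval_s=0.0))
```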

vectorstackai.PreciseSearch.connect_to_index

connect_to_index(index_name)

Connects to an existing vector index and returns an IndexObject for further operations.

This method searches for the index specified by index_name within the list of available indexes. If the index exists, it returns an IndexObject configured with the current connection parameters, which can be used to perform operations such as upsert, search, and more on the index.

Parameters:

Name Type Description Default
index_name str

The name of the index to connect to.

required

Returns:

Name Type Description
IndexObject IndexObject

An object that provides methods to interact with the specified vector index.

Source code in src/vectorstackai/client.py
def connect_to_index(self, index_name: str) -> IndexObject:
    """Connects to an existing vector index and returns an IndexObject for further operations.

    This method searches for the index specified by `index_name` within the list of available indexes. 
    If the index exists, it returns an `IndexObject` configured with the current connection parameters, 
    which can be used to perform operations such as upsert, search, and more on the index.

    Args:
        index_name (str): The name of the index to connect to.

    Returns:
        IndexObject: An object that provides methods to interact with the specified vector index.
    """ 

    info_all_indexes = self.list_indexes()
    for index in info_all_indexes:
        if index['index_name'] == index_name:
            return IndexObject(index_name=index_name, connection_params=self.connection_params)

    raise ValueError(f"Index {index_name} not found in the list of existing indexes")

vectorstackai.PreciseSearch.delete_index

delete_index(index_name, ask_for_confirmation=True)

Deletes a vector index by its name.

Permanently deletes the specified index and all its contents. The deletion is asynchronous, and the deleted index cannot be recovered. Note, this method is useful for deleting an index without having to connect to it.

Parameters:

Name Type Description Default
index_name str

Name of the index to delete.

required
ask_for_confirmation bool

Whether to ask for confirmation before deleting the index.

True
Source code in src/vectorstackai/client.py
def delete_index(self, index_name: str, ask_for_confirmation: bool = True) -> None:
    """Deletes a vector index by its name.

    Permanently deletes the specified index and all its contents. 
    The deletion is asynchronous, and the deleted index cannot be recovered. 
    Note, this method is useful for deleting an index without having to connect to it.

    Args:
        index_name (str): Name of the index to delete.
        ask_for_confirmation (bool): Whether to ask for confirmation before deleting the index.
    """

    # Ask the user to confirm the deletion
    #########################################################
    if ask_for_confirmation:
        print(f"Are you sure you want to delete index '{index_name}'? "
              f"This action is irreversible.")
        confirm = input("Type 'yes' to confirm: ")
        if confirm != 'yes':
            print("Deletion cancelled.")
            return

    response_json = api_resources.Index.delete_index(index_name=index_name, 
                                     connection_params=self.connection_params)
    print(f"{response_json['detail']}")

Index Operations

These operations are performed on a specific index after connecting to it.

vectorstackai.objects.index.IndexObject.set_similarity_scale

set_similarity_scale(dense_scale=1.0, sparse_scale=1.0)

Set the scale values for dense and sparse similarity scores in hybrid search.

The similarity in a hybrid index is computed as a weighted sum of the dense and sparse similarity scores:

similarity = dense_similarity * dense_scale + 
             sparse_similarity * sparse_scale

This method allows you to set the scale values for the dense and sparse similarity scores. Each scale value must be between 0 and 1 (inclusive), and at least one of them must be non-zero.

Note

In a dense index, the scale values are ignored. The similarity is computed as:

similarity = dense_similarity.

Parameters:

Name Type Description Default
dense_scale float

The scale value for the dense similarity score. Defaults to 1.0.

1.0
sparse_scale float

The scale value for the sparse similarity score. Defaults to 1.0.

1.0
Source code in src/vectorstackai/objects/index.py
def set_similarity_scale(self, 
                        dense_scale: float = 1.0, 
                        sparse_scale: float = 1.0) -> None:
    """Set the scale values for dense and sparse similarity scores in hybrid search.

    The similarity in a hybrid index is computed as a weighted sum of the dense and 
    sparse similarity scores:

        similarity = dense_similarity * dense_scale + 
                     sparse_similarity * sparse_scale

    This method allows you to set the scale values for the dense and sparse similarity 
    scores. Each scale value must be between 0 and 1 (inclusive), and at least one of 
    them must be non-zero.

    Note:
        In a dense index, the scale values are ignored. The similarity is computed as:

            similarity = dense_similarity.

    Args:
        dense_scale: The scale value for the dense similarity score.
            Defaults to 1.0.
        sparse_scale: The scale value for the sparse similarity score.
            Defaults to 1.0.
    """
    if self.features_type == 'dense':
        warnings.warn("Scale values are ignored for dense indexes; they will not be used for search.")

    # Validate scale values
    if dense_scale == 0.0 and sparse_scale == 0.0:
        raise ValueError("At least one of the scale values must be set to a non-zero value.")
    if dense_scale < 0.0 or dense_scale > 1.0:
        raise ValueError("dense_scale must be between 0.0 and 1.0")
    if sparse_scale < 0.0 or sparse_scale > 1.0:
        raise ValueError("sparse_scale must be between 0.0 and 1.0")

    self.dense_similarity_scale = dense_scale
    self.sparse_similarity_scale = sparse_scale

    if sparse_scale == 0.0:
        warnings.warn("Sparse similarity scale is set to 0.0; sparse features will not be used for search.")
    if dense_scale == 0.0:
        warnings.warn("Dense similarity scale is set to 0.0; dense features will not be used for search.")
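
To make the weighted sum concrete, here is a small worked example. It only mirrors the formula and range checks shown above; the actual scoring happens on the server, and `hybrid_similarity` and the input scores are made up for illustration.

```python
def hybrid_similarity(dense_similarity: float,
                      sparse_similarity: float,
                      dense_scale: float = 1.0,
                      sparse_scale: float = 1.0) -> float:
    """Weighted sum of dense and sparse scores, with the same range checks as above."""
    if dense_scale == 0.0 and sparse_scale == 0.0:
        raise ValueError("At least one of the scale values must be set to a non-zero value.")
    for name, scale in (("dense_scale", dense_scale), ("sparse_scale", sparse_scale)):
        if not 0.0 <= scale <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    return dense_similarity * dense_scale + sparse_similarity * sparse_scale


# Made-up scores: 0.8 * 1.0 + 0.5 * 0.5, so the sparse contribution drops from 0.5 to 0.25.
print(hybrid_similarity(0.8, 0.5, dense_scale=1.0, sparse_scale=0.5))
```

Setting `sparse_scale` to 0.0 reduces the score to the dense term alone, which matches the behavior of a dense index.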

vectorstackai.objects.index.IndexObject.upsert

upsert(batch_ids, batch_metadata=None, batch_vectors=None, batch_sparse_values=None, batch_sparse_indices=None)

Upsert a batch of vectors and associated metadata to the index.

This method upserts a batch of dense or sparse vectors, along with their associated metadata in the index. Note, if a datapoint with the same ID already exists, its metadata, vector, and sparse vector will be updated with the new values.

Parameters:

Name Type Description Default
batch_ids List[str]

List of unique identifiers for each vector.

required
batch_metadata Optional[List[Dict[str, Any]]]

List of dictionaries containing metadata for each vector. For indexes configured with an integrated embedding model, each dictionary should include a 'text' key whose value is used to compute the embeddings.

None
batch_vectors Optional[List[List[float]]]

List of dense vectors (each represented as a list of floats) corresponding to each ID. This field is required for indexes configured with a non-integrated embedding model and should be omitted for indexes configured with an integrated embedding model.

None
batch_sparse_values Optional[List[List[float]]]

List of values in the sparse vector (each as a list of floats) corresponding to each ID. Required for upserting in a hybrid index.

None
batch_sparse_indices Optional[List[List[int]]]

List of indices in the sparse vector (each as a list of ints) corresponding to each ID. Required for upserting in a hybrid index.

None
Source code in src/vectorstackai/objects/index.py
def upsert(self, 
           batch_ids: List[str], 
           batch_metadata: Optional[List[Dict[str, Any]]] = None, 
           batch_vectors: Optional[List[List[float]]] = None, 
           batch_sparse_values: Optional[List[List[float]]] = None, 
           batch_sparse_indices: Optional[List[List[int]]] = None) -> None:
    """Upsert a batch of vectors and associated metadata to the index.

    This method upserts a batch of dense or sparse vectors, along with their associated metadata in the index. Note, if a datapoint with the same ID already exists, its metadata, vector, and sparse vector will be updated with the new values.

    Args:
        batch_ids: List of unique identifiers for each vector.
        batch_metadata: List of dictionaries containing metadata for each vector. For indexes configured 
            with an integrated embedding model, each dictionary should include a 'text' key whose 
            value is used to compute the embeddings.
        batch_vectors: List of dense vectors (each represented as a list of floats) corresponding to 
            each ID. This field is required for indexes configured with a non-integrated embedding 
            model and should be omitted for indexes configured with an integrated embedding model.
        batch_sparse_values: List of values in the sparse vector (each as a list of floats) 
            corresponding to each ID. Required for upserting in a hybrid index.
        batch_sparse_indices: List of indices in the sparse vector (each as a list of ints) 
            corresponding to each ID. Required for upserting in a hybrid index.
    """
    # Validate input types
    self._validate_upsert_input(batch_ids, batch_metadata, batch_vectors, batch_sparse_values, batch_sparse_indices) 

    json_data = {
        "ids": batch_ids,
        "metadata": batch_metadata,
        "vectors": batch_vectors,
        "sparse_values": batch_sparse_values,
        "sparse_indices": batch_sparse_indices,
    }
    response = api_resources.Index.upsert(self.index_name, json_data, self.connection_params)
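
The batch arguments are parallel lists, one entry per ID. The sketch below shows one way to assemble them for a hybrid index that uses a non-integrated model. The record layout and the `build_hybrid_batch` helper are assumptions made for illustration, and sorting the sparse entries by index is a choice for determinism, not a documented requirement.

```python
from typing import Any, Dict, List


def build_hybrid_batch(records: List[Dict[str, Any]]) -> Dict[str, list]:
    """Turn per-record dicts into the parallel lists that upsert() accepts.

    Each record is assumed (hypothetically) to look like:
        {"id": str, "vector": [float, ...], "sparse": {index: value}, "metadata": {...}}
    """
    batch: Dict[str, list] = {
        "batch_ids": [], "batch_metadata": [], "batch_vectors": [],
        "batch_sparse_values": [], "batch_sparse_indices": [],
    }
    for record in records:
        # Sort by sparse index so indices and values stay aligned and deterministic.
        sparse_items = sorted(record["sparse"].items())
        batch["batch_ids"].append(record["id"])
        batch["batch_metadata"].append(record.get("metadata", {}))
        batch["batch_vectors"].append(list(record["vector"]))
        batch["batch_sparse_indices"].append([index for index, _ in sparse_items])
        batch["batch_sparse_values"].append([float(value) for _, value in sparse_items])
    return batch


records = [
    {"id": "doc-1", "vector": [0.1, 0.2, 0.3], "sparse": {7: 0.5, 2: 1.5},
     "metadata": {"text": "first"}},
    {"id": "doc-2", "vector": [0.4, 0.5, 0.6], "sparse": {11: 2.0},
     "metadata": {"text": "second"}},
]
batch = build_hybrid_batch(records)
print(batch["batch_sparse_indices"], batch["batch_sparse_values"])
```

The keys of the returned dictionary match the parameter names of `upsert`, so the batch can be passed as keyword arguments to a connected index object.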

vectorstackai.objects.index.IndexObject.search

search(top_k=10, query_text=None, query_vector=None, query_sparse_values=None, query_sparse_indices=None, return_metadata=True)

Search the index for entries similar to the query.

Finds entries in the index that are most similar to the provided query. Query can be a text (if using an integrated embedding model) or a dense vector (if using a non-integrated embedding model), along with a sparse vector (if using a hybrid index).

Parameters:

Name Type Description Default
top_k int

Number of top-k results to return (should be >= 1).

10
query_text str

Query text (required for integrated embedding models).

None
query_vector List[float]

Query vector (required for non-integrated embedding models).

None
query_sparse_values List[float]

Query sparse values (required for hybrid indexes).

None
query_sparse_indices List[int]

Query sparse indices (required for hybrid indexes).

None
return_metadata bool

Whether to return metadata for each result (optional, defaults to True).

True

Returns:

Name Type Description
search_results List[Dict[str, Any]]

List of dictionaries containing search results, sorted in descending order of similarity scores. Each dictionary contains:

  • id (str): ID of the retrieved vector.
  • similarity (float): Similarity score between the query and the retrieved vector.
  • metadata (dict): Metadata associated with the vector (present if return_metadata=True, otherwise defaults to an empty dict).
Source code in src/vectorstackai/objects/index.py
def search(self, 
           top_k: int = 10, 
           query_text: Optional[str] = None, 
           query_vector: Optional[List[float]] = None, 
           query_sparse_values: Optional[List[float]] = None,
           query_sparse_indices: Optional[List[int]] = None,
           return_metadata: bool = True) -> List[Dict[str, Any]]:
    """Search the index for entries similar to the query.

    Finds entries in the index that are most similar to the provided query. Query can be a text (if using an integrated embedding model) or a dense vector (if using a non-integrated embedding model), along with a sparse vector (if using a hybrid index).

    Args:
        top_k: Number of top-k results to return (should be >= 1).
        query_text: Query text (required for integrated embedding models).
        query_vector: Query vector (required for non-integrated embedding models).
        query_sparse_values: Query sparse values (required for hybrid indexes).
        query_sparse_indices: Query sparse indices (required for hybrid indexes).
        return_metadata: Whether to return metadata for each result (optional, defaults to True).

    Returns:
        search_results: List of dictionaries containing search results, sorted in descending order of similarity scores. Each dictionary contains:

            - id (str): ID of the retrieved vector.
            - similarity (float): Similarity score between the query and the retrieved vector.
            - metadata (dict): Metadata associated with the vector (present if return_metadata=True, otherwise defaults to an empty dict).
    """
    self._validate_search_input(top_k, query_text, query_vector, query_sparse_values, query_sparse_indices)
    json_data = {
        "top_k": top_k,
        "query_text": query_text,
        "return_metadata": return_metadata,
        "query_vector": query_vector,
        "query_sparse_values": query_sparse_values,
        "query_sparse_indices": query_sparse_indices,
        "dense_similarity_scale": self.dense_similarity_scale,
        "sparse_similarity_scale": self.sparse_similarity_scale
    }
    response = api_resources.Index.search(self.index_name, json_data, self.connection_params)
    return response['search_results']
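
Since the results arrive as a list of dictionaries sorted by descending similarity, post-processing is plain list handling. A minimal sketch follows, using made-up results shaped like the documented return value; `filter_by_similarity` is a hypothetical helper.

```python
from typing import Any, Dict, List


def filter_by_similarity(search_results: List[Dict[str, Any]],
                         min_similarity: float) -> List[Dict[str, Any]]:
    """Keep results scoring at least min_similarity, preserving their order."""
    return [result for result in search_results if result["similarity"] >= min_similarity]


# Made-up results shaped like the documented return value of search().
sample_results = [
    {"id": "doc-1", "similarity": 0.92, "metadata": {"text": "first"}},
    {"id": "doc-7", "similarity": 0.61, "metadata": {"text": "seventh"}},
    {"id": "doc-3", "similarity": 0.18, "metadata": {}},
]

for result in filter_by_similarity(sample_results, min_similarity=0.5):
    print(result["id"], result["similarity"])
```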

vectorstackai.objects.index.IndexObject.info

info()

Get information about the index.

If the index is still being created (i.e., not yet ready), the returned dictionary includes only "index_name" and "status" (which is "initializing"). Once the index is configured, the status is set to "ready" and the returned dictionary includes additional information.

Returns:

Name Type Description
index_info Dict[str, Any]

A dictionary containing information about the index with the following keys:

  • index_name (str): The name of the index.
  • status (str): The current status of the index ("initializing" or "ready").
  • num_records (int): The number of records stored in the index.
  • dimension (int): The dimensionality of the vectors in the index.
  • metric (str): The distance metric used for similarity search ("cosine" or "dotproduct").
  • features_type (str): The type of features stored ("dense" or "hybrid").
  • embedding_model_name (str): The name of the embedding model used (if applicable).
  • optimized_for_latency (bool): Indicates whether the index is optimized for low-latency queries.
Source code in src/vectorstackai/objects/index.py
def info(self) -> Dict[str, Any]:
    """Get information about the index.

    If the index is still being created (i.e., not yet ready), the returned dictionary 
    includes only `"index_name"` and `"status"` (which is `"initializing"`). Once the index is 
    configured, the status is set to `"ready"` and the returned dictionary includes additional information.

    Returns:
        index_info: A dictionary containing information about the index with the following keys:

            - index_name (str): The name of the index.
            - status (str): The current status of the index ("initializing" or "ready").
            - num_records (int): The number of records stored in the index.
            - dimension (int): The dimensionality of the vectors in the index.
            - metric (str): The distance metric used for similarity search ("cosine" or "dotproduct").
            - features_type (str): The type of features stored ("dense" or "hybrid").
            - embedding_model_name (str): The name of the embedding model used (if applicable).
            - optimized_for_latency (bool): Indicates whether the index is optimized for low-latency queries.
    """
    return api_resources.Index.info(self.index_name, self.connection_params)
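
Code that reads this dictionary should tolerate the reduced payload returned while the index is initializing. The sketch below is one hedged way to do that; `describe_index` and the sample dictionaries are hypothetical.

```python
from typing import Any, Dict


def describe_index(info: Dict[str, Any]) -> str:
    """Build a one-line summary that tolerates the reduced "initializing" payload."""
    name = info["index_name"]
    status = info["status"]
    if status != "ready":
        # Only index_name and status are guaranteed to be present here.
        return f"{name}: {status}"
    return (f"{name}: ready, {info.get('num_records', 0)} records, "
            f"dimension {info.get('dimension')}, metric {info.get('metric')}")


print(describe_index({"index_name": "articles", "status": "initializing"}))
print(describe_index({"index_name": "products", "status": "ready", "num_records": 10,
                      "dimension": 3, "metric": "cosine"}))
```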

vectorstackai.objects.index.IndexObject.delete

delete(ask_for_confirmation=True)

Deletes the vector index.

Permanently deletes the index and all its contents. The deletion is asynchronous, and the deleted index cannot be recovered.

Parameters:

Name Type Description Default
ask_for_confirmation bool

Whether to ask for confirmation before deleting the index. Defaults to True. When True, the user will be prompted to type 'yes' to confirm deletion.

True

Returns:

Type Description
None

None

Raises:

Type Description
ValueError

If the index doesn't exist or cannot be deleted.

Source code in src/vectorstackai/objects/index.py
def delete(self, ask_for_confirmation: bool = True) -> None:
    """Deletes the vector index.

    Permanently deletes the index and all its contents. The deletion is asynchronous, 
    and the deleted index cannot be recovered. 

    Args:
        ask_for_confirmation (bool): Whether to ask for confirmation before deleting the index. Defaults to True. When True, the user will be prompted to type 'yes' to confirm deletion.

    Returns:
        None

    Raises:
        ValueError: If the index doesn't exist or cannot be deleted.
    """
    # Ask the user to confirm the deletion
    #########################################################
    if ask_for_confirmation:
        print(f"Are you sure you want to delete index '{self.index_name}'? "
              f"This action is irreversible.")
        confirm = input("Type 'yes' to confirm: ")
        if confirm != 'yes':
            print("Deletion cancelled.")
            return 

    api_resources.Index.delete_index(self.index_name, 
                                     self.connection_params)
    print(f"Request accepted: Index '{self.index_name}' deletion scheduled.")

vectorstackai.objects.index.IndexObject.delete_vectors

delete_vectors(ids)

Deletes vectors from the index by their IDs.

Permanently removes the specified vectors from the index based on their unique identifiers. The deletion cannot be undone. This operation is performed synchronously and is all-or-nothing: if even one of the IDs does not exist, the operation raises an error and the index state remains unchanged (i.e., no vectors are deleted).

Parameters:

Name Type Description Default
ids List[str]

A list of string IDs identifying the vectors to delete from the index. Each ID must correspond to a vector previously added to the index.

required
Source code in src/vectorstackai/objects/index.py
def delete_vectors(self, ids: List[str]) -> None:
    """Deletes vectors from the index by their IDs.

    Permanently removes the specified vectors from the index based on their unique identifiers. The deletion cannot be undone. This operation is performed synchronously and is all-or-nothing: if even one of the IDs does not exist, the operation raises an error and the index state remains unchanged (i.e., no vectors are deleted).

    Args:
        ids: A list of string IDs identifying the vectors to delete from the index.
            Each ID must correspond to a vector previously added to the index.
    """
    assert isinstance(ids, list), "ids must be a list"
    assert len(ids) > 0, "ids must be a non-empty list"
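    # all() is required here: a non-empty list is always truthy, so asserting on the list itself never fails
    assert all(isinstance(id, str) for id in ids), "each element of ids must be a string"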

    json_data = {
        "delete_vector_ids": ids,
    }
    api_resources.Index.delete_vectors(self.index_name, json_data, self.connection_params)
    print(f"Successfully deleted {len(ids)} vectors from index {self.index_name}")
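
Because a single unknown ID aborts the whole deletion, it can help to separate known from unknown IDs beforehand. The sketch below assumes the caller keeps its own set of IDs it has upserted; `split_known_ids` is a hypothetical helper, not part of the library.

```python
from typing import Iterable, List, Set, Tuple


def split_known_ids(ids: Iterable[str], known_ids: Set[str]) -> Tuple[List[str], List[str]]:
    """Split ids into (present, missing) relative to known_ids, keeping order and dropping duplicates."""
    present: List[str] = []
    missing: List[str] = []
    seen: Set[str] = set()
    for vector_id in ids:
        if vector_id in seen:
            continue
        seen.add(vector_id)
        (present if vector_id in known_ids else missing).append(vector_id)
    return present, missing


# Made-up bookkeeping of IDs the application believes are in the index.
known = {"doc-1", "doc-2", "doc-3"}
present, missing = split_known_ids(["doc-2", "doc-9", "doc-2", "doc-1"], known)
print(present, missing)
```

Only the `present` list would then be passed to `delete_vectors`, provided it is non-empty.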

vectorstackai.objects.index.IndexObject.optimize_for_latency

optimize_for_latency()

Optimizes the index for better latency and throughput.

This method triggers an optimization process in the background to improve the latency and throughput of the index for search operations.

Source code in src/vectorstackai/objects/index.py
def optimize_for_latency(self) -> None:
    """
    Optimizes the index for better latency and throughput.

    This method triggers an optimization process in the background to improve the latency and throughput of the index for search operations.
    """
    api_resources.Index.optimize_for_latency(self.index_name, self.connection_params)
    print(f"Request accepted: Index '{self.index_name}' optimization scheduled.")