Cohere provides text generation and representation models powering business applications to generate text, summarize, search, cluster, classify, and utilize Retrieval Augmented Generation (RAG). Today, we’re announcing the availability of Cohere Command Light and Cohere Embed English and multilingual models on Amazon Bedrock. They’re joining the already available Cohere Command model.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, including AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon, along with a broad set of capabilities to build generative AI applications, simplifying the development while maintaining privacy and security. With this launch, Amazon Bedrock further expands the breadth of model choices to help you build and scale enterprise-ready generative AI. You can read more about Amazon Bedrock in Antje’s post here.
Command is Cohere’s flagship text generation model. It is trained to follow user commands and to be useful in business applications. Embed is a set of models trained to produce high-quality embeddings from text documents.
Embeddings are one of the most fascinating concepts in machine learning (ML). They are central to many applications that process natural language, recommendations, and search algorithms. Given any type of document, text, image, video, or sound, it is possible to transform it into a suite of numbers, known as a vector. Embeddings refer specifically to the technique of representing data as vectors in such a way that it captures meaningful information, semantic relationships, or contextual characteristics. In simple terms, embeddings are useful because the vectors representing similar documents are “close” to each other. In more formal terms, embeddings translate semantic similarity as perceived by humans to proximity in a vector space. Embeddings are typically generated through training algorithms or models.
Cohere Embed is a family of models trained to generate embeddings from text documents. Cohere Embed comes in two forms, an English language model and a multilingual model, both of which are now available in Amazon Bedrock.
There are three main use cases for text embeddings:
Semantic searches – Embeddings enable searching collections of documents by meaning, which leads to search systems that better incorporate context and user intent compared to existing keyword-matching systems.
Text Classification – Build systems that automatically categorize text and take action based on the type. For example, an email filtering system might decide to route one message to sales and escalate another message to tier-two support.
Retrieval Augmented Generation (RAG) – Improve the quality of a large language model (LLM) text generation by augmenting your prompts with data provided in context. The external data used to augment your prompts can come from multiple data sources, such as document repositories, databases, or APIs.
Imagine you have hundreds of documents describing your company policies. Due to the limited size of prompts accepted by LLMs, you have to select relevant parts of these documents to be included as context into prompts. The solution is to transform all your documents into embeddings and store them in a vector database, such as OpenSearch.
When a user wants to query this corpus of documents, you transform the user’s natural language query into a vector and perform a similarity search on the vector database to find the most relevant documents for this query. Then, you embed (pun intended) the original query from the user and the relevant documents surfaced by the vector database together in a prompt for the LLM. Including relevant documents in the context of the prompt helps the LLM generate more accurate and relevant answers.
You can now integrate Cohere Command Light and Embed models in your applications written in any programming language by calling the Bedrock API or using the AWS SDKs or the AWS Command Line Interface (AWS CLI).
Cohere Embed in action
Those of you who regularly read the AWS News Blog know we like to show you the technologies we write about.
We’re launching three distinct models today: Cohere Command Light, Cohere Embed English, and Cohere Embed multilingual. Writing code to invoke Cohere Command Light is no different than for Cohere Command, which is already part of Amazon Bedrock. So for this example, I decided to show you how to write code to interact with Cohere Embed and review how to use the embedding it generates.
To get started with a new model on Bedrock, I first navigate to the AWS Management Console and open the Bedrock page. Then, I select Model access on the bottom left pane. Then I select the Edit button on the top right side, and I enable access to the Cohere model.
Now that I know I can access the model, I open a code editor on my laptop. I assume you have the AWS Command Line Interface (AWS CLI) configured, which will allow the AWS SDK to locate your AWS credentials. I use Python for this demo, but I want to show that Bedrock can be called from any language. I also share a public gist with the same code sample written in the Swift programming language.
Back to Python, I first run the ListFoundationModels API call to discover the modelId
for Cohere Embed.
import boto3
import json
import numpy
bedrock = boto3.client(service_name='bedrock', region_name='us-east-1')
listModels = bedrock.list_foundation_models(byProvider='cohere')
print("n".join(list(map(lambda x: f"{x['modelName']} : { x['modelId'] }", listModels['modelSummaries']))))
Running this code produces the list:
Command : cohere.command-text-v14
Command Light : cohere.command-light-text-v14
Embed English : cohere.embed-english-v3
Embed Multilingual : cohere.embed-multilingual-v3
I select cohere.embed-english-v3
model ID and write the code to transform a text document into an embedding.
cohereModelId = 'cohere.embed-english-v3'
# For the list of parameters and their possible values,
# check Cohere's API documentation at https://docs.cohere.com/reference/embed
coherePayload = json.dumps({
'texts': ["This is a test document", "This is another document"],
'input_type': 'search_document',
'truncate': 'NONE'
})
bedrock_runtime = boto3.client(
service_name='bedrock-runtime',
region_name='us-east-1'
)
print("nInvoking Cohere Embed...")
response = bedrock_runtime.invoke_model(
body=coherePayload,
modelId=cohereModelId,
accept='application/json',
contentType='application/json'
)
body = response.get('body').read().decode('utf-8')
response_body = json.loads(body)
print(np.array(response_body['embeddings']))
The response is printed
[ 1.234375 -0.63671875 -0.28515625 ... 0.38085938 -1.2265625 0.22363281]
Now that I have the embedding, the next step depends on my application. I can store this embedding in a vector store or use it to search similar documents in an existing store, and so on.
To learn more, I highly recommend following the hands-on instructions provided by this section of the Amazon Bedrock workshop. This is an end-to-end example of RAG. It demonstrates how to load documents, generate embeddings, store the embeddings in a vector store, perform a similarity search, and use relevant documents in a prompt sent to an LLM.
Availability
The Cohere Embed models are available today for all AWS customers in two of the AWS Regions where Amazon Bedrock is available: US East (N. Virginia) and US West (Oregon).
AWS charges for model inference. For Command Light, AWS charges per processed input or output token. For Embed models, AWS charges per input tokens. You can choose to be charged on a pay-as-you-go basis, with no upfront or recurring fees. You can also provision sufficient throughput to meet your application’s performance requirements in exchange for a time-based term commitment. The Amazon Bedrock pricing page has the details.
With this information, you’re ready to use text embeddings with Amazon Bedrock and the Cohere Embed models in your applications.