How to Build a ChatGPT-like API using AWS Lambda

How to Build This Solution

Here are the steps you can follow to build this solution on your own.

In this project you’ll learn how to deploy an AWS Lambda function that recreates the functionality of ChatGPT’s backend (we are not building the frontend). The function takes in an array of messages and forms its response based on all of those messages together.

This project builds on the Basic Chat API project. They share the same base code, but there are a few differences:

  • The Lambda function will be updated to process a messages object.

  • The client input will be structured to send more context.

If you don’t have experience working with OpenAI’s GPT APIs, then check out the other project first.

How it Works

When we work with GPT in the context of a long, flowing chat, we need to send the model the whole conversation on every request. This helps the API formulate a response that takes the entire conversation into account.

To demonstrate how this works, we will show you how to build a Python function that processes an array of messages from a client. In our case, the client will be the curl command-line tool, but you could build a web or mobile interface to do the same.

The messages array that we’ll pass to the function will be structured as follows:

        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What's the capital of California?"},
        {"role": "assistant", "content": "The capital of California is Sacramento."},
        {"role": "user", "content": "How many people live in the capital?"}

A few notes on this:

  • The array should start out with a role=system. This is used to tell GPT what role it should emulate.

  • Any input from the user should have the role=user. This should be in the order that the user asks questions.

  • Any output from GPT should have the role=assistant. This should be in the order given.
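
The ordering rules above can be checked on the client before a request is sent. The following is a minimal sketch: `validate_messages` is a hypothetical helper, not part of the OpenAI library, and it only enforces the conventions listed in these notes.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Return a list of problems found in a messages array (empty if none).

    Hypothetical helper: it only checks the conventions described above.
    """
    if not messages:
        return ["messages array is empty"]

    problems = []

    # The conversation should open with the system prompt.
    if messages[0].get("role") != "system":
        problems.append("first message should have role=system")

    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {index} has unknown role {role!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {index} is missing string content")

    return problems


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the capital of California?"},
    {"role": "assistant", "content": "The capital of California is Sacramento."},
    {"role": "user", "content": "How many people live in the capital?"},
]

print(validate_messages(conversation))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to show the user everything that is wrong with a request at once.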

Great, now let’s get started!

Get Your AWS Credentials

If you're using the Skillmix Labs feature, open the lab settings (the beaker icon) on the right side of the code editor. Then, click the Start Lab button to start the lab environment.

Wait for the credentials to load. Then run this in the terminal:

$ aws configure
AWS Access Key ID [None]: 
AWS Secret Access Key [None]: 
Default region name [None]: us-west-2
Default output format [None]: json

Be sure to name your credentials profile 'smx-lab'.

Note: If you're using your own AWS account, you'll need to ensure that you've created and configured an AWS CLI profile named smx-lab.

Get Your OpenAI API Key

You will need your own OpenAI key for this project. Head over to the OpenAI website and follow these steps.

  1. Create an account.

  2. Create an API Key by going to this page:

Remember: save your key somewhere safe. Don’t share it with anyone. You can also delete it after this lab for extra precaution.
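
One way to keep the key out of your source file is to read it from an environment variable when the code starts. This is a minimal sketch under assumptions: the variable name `OPENAI_API_KEY` and the helper `load_api_key` are illustrative choices, and the function code used in this lab hardcodes the key instead.

```python
import os


def load_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Fetch the API key from the environment, failing loudly if it is absent.

    `name` is an assumed variable name; use whatever you configure.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running")
    return key


# Example with an explicit mapping, so no real key is needed to try it out.
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))  # prints sk-example
```

Failing at startup with a clear message is usually easier to debug than an authentication error from the API later on.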

Create the Python File

Once the lab has started, it’s time to create the Python file we’ll be working with. We’ll create the file in the lab environment (on the remote development server). Follow these steps to create it:

  1. In the Files pane, click on the + document icon.

  2. In the modal, name the file

  3. Click the Create button.

  4. Click on the file to open it for editing.

Write the Python Function

It’s now time to write the Python function. This is a relatively simple function. The first thing to note is that we will create it according to the AWS Lambda specification. Mainly, the function needs to accept the event and context objects, and return a status code and body.

In the file, enter in this Python code, click the Save button in the editor, and we’ll review the code afterwards.

import json
import openai

# API Key for OpenAI
openai.api_key = ""

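def lambda_handler(event, context):
    # Parse the JSON request body and pull out the conversation history
    body = json.loads(event['body'])
    messages = body['messages']

    # Send the whole conversation to the model
    response_text = ""
    ai_response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=messages,
        temperature=0
    )

    # Extract the assistant's reply from the first choice
    response_text += ai_response.choices[0]['message']['content'].strip()

    return {
        'statusCode': 200,
        'body': json.dumps(response_text)
    }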

Code Review

  • import json: Imports the JSON library, which is used for parsing and generating JSON.

  • import openai: Imports the OpenAI library to facilitate interactions with the OpenAI API.

  • The openai.api_key is set to a hardcoded string. This provides the API key for the OpenAI service, allowing the function to authenticate and interact with the OpenAI API. Note in production we would want to use a secrets manager.

  • The function lambda_handler is defined with two parameters: event and context. These are standard parameters for AWS Lambda functions.

    • event: Contains data about the incoming request, e.g., headers, query string parameters, and body.

    • context: Provides information about the runtime and configuration settings.

  • The body of the incoming event is extracted and parsed from JSON format into a Python dictionary.

  • The parsed body is expected to contain a key 'messages', the value of which is taken as a prompt to be provided to the GPT model.

  • The openai.ChatCompletion.create() method is called with:

    • A specific model ID: "gpt-4".

    • The previously created messages.

    • A temperature value of 0, which makes the model's outputs close to deterministic by favoring the most likely response.

  • The response from the model is extracted from the choices attribute and is appended to the response_text.

  • The function returns a response with a statusCode of 200, indicating success. The body of the response contains the AI model's generated content in JSON format.
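
The handler reviewed above assumes a well-formed request. As a sketch of how the same flow could be exercised locally without calling OpenAI, the variant below takes the model call as a parameter and returns a 400 for a malformed body. The names `make_handler` and `fake_complete`, and the shape of the 400 response, are assumptions for illustration and not part of the lab code.

```python
import json


def make_handler(complete):
    """Build a Lambda-style handler around `complete(messages) -> str`.

    `complete` stands in for the OpenAI call so the handler can run offline.
    """
    def lambda_handler(event, context):
        try:
            body = json.loads(event["body"])
            messages = body["messages"]
        except (KeyError, TypeError, json.JSONDecodeError):
            # Malformed request: report it instead of raising.
            return {
                "statusCode": 400,
                "body": json.dumps("Request body must be JSON with a 'messages' key"),
            }

        reply = complete(messages).strip()
        return {"statusCode": 200, "body": json.dumps(reply)}

    return lambda_handler


def fake_complete(messages):
    # Hypothetical stand-in for the model: echoes the last message.
    return f"You asked: {messages[-1]['content']}"


handler = make_handler(fake_complete)
event = {"body": json.dumps({"messages": [{"role": "user", "content": "Hi"}]})}
print(handler(event, None))
```

Passing the model call in as an argument keeps the request parsing and response formatting testable on their own.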

Deploy to AWS Lambda

Next, let’s deploy this code to AWS Lambda. We’ll use the AWS CLI in the Skillmix Editor to complete this step.

Note: this requires that you previously ran aws configure as specified above.

We need to perform a few tasks to do this. Go to the terminal now and enter these commands.

# install dependencies
$ apt-get update
$ apt-get -y install dialog
$ apt-get install zip
$ apt-get install less

# create the openai package .zip (needed dependency)
# the version is pinned below 1.0 because openai.ChatCompletion was removed in 1.0
$ pip install "openai<1" -t ./package
$ cd package
$ zip -r ../ .

# create the .zip that includes the package and python function
$ cd ..
$ zip -g

# create an IAM trust policy document for our role
$ cat <<EOL >> policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": ""
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
EOL

# create the IAM Role for our lambda function
aws iam create-role \
    --role-name LambdaExecutionRole \
    --assume-role-policy-document file://policy.json

# attach the policy to our role
aws iam attach-role-policy \
    --role-name LambdaExecutionRole \
    --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

# create the lambda function
# be sure to replace YOUR_ACCOUNT_ID with your account ID
$ aws lambda create-function --function-name MyOpenAIFunction \
--zip-file fileb:// \
--handler main.lambda_handler \
--runtime python3.10 \
--timeout 180 \
--role arn:aws:iam::YOUR_ACCOUNT_ID:role/LambdaExecutionRole

NOTE: If the response opens in a pager, press q to exit it.
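
The trust policy document written to policy.json earlier can also be generated programmatically, which avoids hand-typing nested JSON. This sketch assumes the service principal is `lambda.amazonaws.com`, the principal AWS Lambda uses to assume an execution role; `build_trust_policy` is a hypothetical helper name.

```python
import json


def build_trust_policy(service="lambda.amazonaws.com"):
    """Return an IAM trust policy allowing `service` to assume the role.

    Assumption: `lambda.amazonaws.com` is the intended service principal.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Print the document; redirect the output to policy.json if you want a file.
print(json.dumps(build_trust_policy(), indent=4))
```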

# create the function URL config
aws lambda create-function-url-config \
    --function-name MyOpenAIFunction \
    --auth-type NONE \
    --cors "AllowOrigins"="*"

# add permissions to open the Lambda to public access
aws lambda add-permission \
    --statement-id public-access \
    --function-name MyOpenAIFunction \
    --action lambda:InvokeFunctionUrl \
    --principal "*" \
    --function-url-auth-type NONE

# get the function URL
aws lambda get-function-url-config \
    --function-name MyOpenAIFunction
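
The last command prints a JSON document, and the URL is in its `FunctionUrl` field. As a small sketch, the field can be pulled out in Python; the sample output below is made up for illustration (real output contains more fields and your own URL), and `extract_function_url` is a hypothetical helper.

```python
import json

# Made-up sample of the CLI output; real output has more fields and a real URL.
sample_output = """
{
    "FunctionUrl": "https://example-id.lambda-url.us-west-2.on.aws/",
    "AuthType": "NONE"
}
"""


def extract_function_url(cli_output):
    """Return the FunctionUrl value from get-function-url-config JSON output."""
    return json.loads(cli_output)["FunctionUrl"]


print(extract_function_url(sample_output))
```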


With that done, we should be able to make a POST call to the function. Replace <your_function_url> with your function URL from above.

# first query
curl -X POST <your_function_url> \
     -H "Content-Type: application/json" \
     -d '{"messages":[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of California?"}
     ]}'

If everything works, you should get a response similar to "The capital of California is Sacramento."
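
The same request can be sent from Python instead of curl. This is a sketch using only the standard library; `build_request` and `ask` are hypothetical helper names, and the URL below is a placeholder you would replace with your own function URL.

```python
import json
import urllib.request


def build_request(function_url, messages):
    """Create the POST request that mirrors the curl command above."""
    payload = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        function_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(function_url, messages):
    """Send the messages and return the assistant's reply as a string."""
    with urllib.request.urlopen(build_request(function_url, messages)) as resp:
        # The function returns a JSON-encoded string, so decode it.
        return json.loads(resp.read().decode("utf-8"))


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of California?"},
]

# Placeholder URL: building the request does not contact the network.
request = build_request("https://example.invalid/", messages)
print(request.get_method(), request.full_url)
```

Calling `ask(your_function_url, messages)` with a real function URL would perform the request and return the reply text.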

Then, execute another query. This one will have the assistant's response, and a new question that implies the system knows what capital we're talking about.

# second query
curl -X POST <your_function_url> \
     -H "Content-Type: application/json" \
     -d '{"messages":[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of California?"},
        {"role": "assistant", "content": "The capital of California is Sacramento."},
        {"role": "user", "content": "What is the population of the capital?"}
     ]}'
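
Because the function keeps no history between calls, the client is responsible for carrying the conversation forward: after each reply, it appends the assistant's answer and the next question before calling again. A minimal sketch of that bookkeeping follows; `extend_conversation` is a hypothetical helper.

```python
def extend_conversation(messages, assistant_reply, next_question):
    """Return a new messages list with the reply and follow-up appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_question},
    ]


first_query = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of California?"},
]

second_query = extend_conversation(
    first_query,
    "The capital of California is Sacramento.",
    "What is the population of the capital?",
)

for message in second_query:
    print(message["role"], "-", message["content"])
```

The helper returns a new list instead of mutating its input, so the earlier query stays available if you want to retry it.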


If you’re having any issues, use your credentials to log in to the AWS Console and go to the CloudWatch Logs dashboard. You’ll see the logs for this Lambda function. You can look at those logs to see what’s happening.
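
Output written through Python's `logging` module inside a Lambda function ends up in those CloudWatch logs, so a few log lines can make debugging easier. The sketch below wraps a handler with logging; `with_logging` and `demo_handler` are hypothetical names and not part of the lab code.

```python
import functools
import json
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def with_logging(handler):
    """Wrap a Lambda-style handler so each call and any failure is logged."""
    @functools.wraps(handler)
    def wrapper(event, context):
        logger.info("Received event with keys: %s", sorted(event))
        try:
            result = handler(event, context)
        except Exception:
            # logger.exception records the full traceback in the log stream.
            logger.exception("Handler failed")
            return {"statusCode": 500, "body": json.dumps("Internal error")}
        logger.info("Returning status %s", result["statusCode"])
        return result
    return wrapper


@with_logging
def demo_handler(event, context):
    # Hypothetical handler that fails when the body is missing.
    body = json.loads(event["body"])
    return {"statusCode": 200, "body": json.dumps(len(body["messages"]))}


print(demo_handler({"body": json.dumps({"messages": []})}, None))
print(demo_handler({}, None)["statusCode"])
```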