Authentication

Authenticating with Public/Private Keypair

This form of authentication allows users to generate a keypair, send Kensho the public key, and sign requests with their private key.

The steps to configure authentication are as follows:

  1. Generate an RSA keypair per the instructions below
  2. Email support@kensho.com with your PEM-encoded public key as an attachment, and we will respond with a Client ID
  3. Create and sign a JWT using your private key
  4. Use Kensho's Okta API to generate an authentication token
  5. Use the returned token in API requests to Extract

Read on for detailed instructions.

Generate an RSA Keypair

In this guide, we will use the openssl command-line tool, which is available on most Unix systems. First, generate a 2048-bit RSA private key:

openssl genrsa -out private.pem 2048

Next, extract the public key:

openssl rsa -in private.pem -outform PEM -pubout -out public.pem

Send Kensho Your Public Key

Send an email to support@kensho.com with your PEM-encoded public key (public.pem) as an attachment. We will respond with your Client ID. This ID is not a secret.

Important: Do not send us your private key!
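Before attaching a file, it can help to confirm that it really is the public key. The sketch below is one way to do that using only the Python standard library; the helper name classify_pem is hypothetical and not part of any Kensho tooling. It only inspects the PEM header line and does not validate the key itself.

```python
def classify_pem(text: str) -> str:
    """Return 'public', 'private', or 'unknown' based on the PEM header line."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("-----BEGIN ") and line.endswith("-----"):
            # Extract the label between the BEGIN marker and the trailing dashes,
            # e.g. "PUBLIC KEY", "RSA PRIVATE KEY", or "PRIVATE KEY".
            label = line[len("-----BEGIN "):-len("-----")]
            if label.endswith("PRIVATE KEY"):
                return "private"
            if label.endswith("PUBLIC KEY"):
                return "public"
            return "unknown"
    return "unknown"


# Inline sample data standing in for the contents of public.pem.
sample = "-----BEGIN PUBLIC KEY-----\nMIIB...\n-----END PUBLIC KEY-----\n"
print(classify_pem(sample))
```

In practice you would read public.pem from disk and only send it if the result is 'public'.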

Create and Sign a JWT

Most languages have JWT libraries. In this example, we make use of PyJWT, a JWT library for Python.

import jwt
import time

with open("private.pem", "rb") as f:
    private_key = f.read()

client_id = "<from above email>"
iat = int(time.time())
encoded = jwt.encode(
    {
        "aud": "https://kensho.okta.com/oauth2/default/v1/token",
        "exp": iat + (30 * 60),  # expire in 30 minutes
        "iat": iat,
        "sub": client_id,
        "iss": client_id,
    },
    private_key,
    algorithm="RS256",
)
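To sanity-check the claims before sending the JWT to Okta, you can decode its payload locally. The following is a minimal sketch using only the standard library; peek_jwt_claims is a hypothetical helper name. It does not verify the signature, so it is suitable for debugging only. It accepts bytes as well as str because older PyJWT releases return bytes from jwt.encode.

```python
import base64
import json


def peek_jwt_claims(token) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    if isinstance(token, bytes):
        token = token.decode("ascii")
    # A signed JWT has three dot-separated segments: header, payload, signature.
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    # Segments are base64url-encoded with padding stripped; restore the padding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

For example, peek_jwt_claims(encoded) should return a dictionary whose "iss" and "sub" both equal your Client ID and whose "exp" is 1800 seconds after "iat".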

Generate an API Token

Make a request to Okta using the JWT to retrieve a non-expiring authentication token. Note that Content-Type is specified as application/x-www-form-urlencoded. When you call requests.post() in Python, the data dictionary is automatically converted into a form-encoded string like this:

client_assertion=xxxxxxx&scope=kensho:app:extract&grant_type=client_credentials&client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer

If you are using another programming language, make sure you send the data in this format rather than as JSON.

import requests

response = requests.post(
    "https://kensho.okta.com/oauth2/default/v1/token",
    headers={
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    },
    data={
        "scope": "kensho:app:extract",
        "grant_type": "client_credentials",
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": encoded,
    }
)
token = response.json()["access_token"]
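If you want to see what a form-encoded body looks like without making a request, the standard library can produce one. This is an illustrative sketch: the "xxxxxxx" assertion is a placeholder for your signed JWT. Note that urlencode percent-encodes reserved characters such as the colons, so the output differs in appearance from a hand-written string with literal colons, but it decodes to the same fields.

```python
from urllib.parse import parse_qs, urlencode

form_fields = {
    "scope": "kensho:app:extract",
    "grant_type": "client_credentials",
    "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
    "client_assertion": "xxxxxxx",  # placeholder for the signed JWT
}

# Build the application/x-www-form-urlencoded body: key=value pairs joined by "&".
body = urlencode(form_fields)
print(body)

# Decoding the body recovers the original fields, colons included.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
print(decoded == form_fields)
```

Clients in other languages should produce an equivalent body with their own form-encoding helper rather than serializing the fields as JSON.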

Verify Token

To test out your new token, run:

curl -H "Authorization: Bearer <your token>" https://extract.kensho.com/me

If you get a response with your client ID, you're in the money!
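The same check can be made from Python. The sketch below uses the standard library's urllib rather than requests so that it has no dependencies; build_me_request is a hypothetical helper name. It only constructs the request, and the commented lines show how you might send it.

```python
import urllib.request


def build_me_request(token: str) -> urllib.request.Request:
    """Build a GET request to the /me endpoint carrying the bearer token."""
    return urllib.request.Request(
        "https://extract.kensho.com/me",
        headers={"Authorization": f"Bearer {token}"},
    )


# To actually send the request (requires network access and a valid token):
# with urllib.request.urlopen(build_me_request(token)) as resp:
#     print(resp.read().decode("utf-8"))
```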

Authenticating with Trial Account

This guide is meant for users who would like to access Extract programmatically. If you do not yet have an account, sign up for access.

To get started, visit your account to retrieve your refresh token.

Once you have the refresh token, you can use it to generate an access token which you will include when you make requests to the API.

import requests
resp = requests.get("https://extract.kensho.com/oauth2/refresh?refresh_token=<YOUR TOKEN HERE>")
access_token = resp.json()["access_token"]
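Indexing the JSON response directly raises a bare KeyError if the token request failed. A small wrapper can surface a clearer message. This is a sketch under an assumption: it looks for "error" and "error_description" fields, which standard OAuth 2.0 error responses use, but the exact error format returned by this endpoint is not documented here. The helper name extract_access_token is hypothetical.

```python
def extract_access_token(payload: dict) -> str:
    """Return the access token from a parsed token response, or raise a clear error."""
    try:
        return payload["access_token"]
    except KeyError:
        # Assumed OAuth-style error fields; fall back to the whole payload otherwise.
        detail = payload.get("error_description") or payload.get("error") or payload
        raise RuntimeError(f"No access_token in response: {detail}") from None
```

You would call it as extract_access_token(resp.json()) in place of resp.json()["access_token"].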

Once you have this token, you can send documents for extraction.

import time

import requests

filename = "<PATH TO YOUR DOCUMENT>"  # placeholder: the local file to extract
request_url = "https://extract.kensho.com/v2/extractions"
headers = {"Authorization": f"Bearer {access_token}"}

print("Sending a document to extract")
with open(filename, "rb") as f:
    response = requests.post(request_url, files={"file": f}, headers=headers)
response.raise_for_status()
request_id = response.json()["request_id"]
response_url = f"{request_url}/{request_id}"

print(f"Waiting for job {request_id}")
response = requests.get(response_url, headers=headers)
# Poll every two seconds while the job is still pending.
while response.status_code == 200 and response.json()["status"] == "pending":
    time.sleep(2)
    response = requests.get(response_url, headers=headers)
if response.status_code == 200:
    print(response.json())
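The polling loop above keeps running for as long as the job reports 'pending'. If you would rather give up after a fixed time, the loop can be wrapped in a helper with a deadline. This is a minimal sketch: wait_for_result, its parameters, and the default five-minute timeout are all assumptions for illustration, not part of the Extract API. The fetch_status argument is any callable that returns the parsed JSON status for the job.

```python
import time


def wait_for_result(fetch_status, poll_seconds=2.0, timeout_seconds=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the status is no longer 'pending' or the timeout passes."""
    deadline = clock() + timeout_seconds
    while True:
        result = fetch_status()
        if result.get("status") != "pending":
            return result
        if clock() >= deadline:
            raise TimeoutError(f"Job still pending after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Example wiring with requests (response_url and headers as defined earlier):
# result = wait_for_result(lambda: requests.get(response_url, headers=headers).json())
```

The sleep and clock parameters exist only so the helper can be exercised in tests without real delays.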