Authenticating with Public/Private Keypair
This form of authentication allows users to generate a keypair, send Kensho the public key, and sign requests with their private key.
The steps to configure authentication are as follows:
- Generate an RSA keypair per the instructions below
- Email support@kensho.com with your PEM encoded public key as an attachment, and we will respond with a Client ID
- Create and sign a JWT using your private key
- Use Kensho's Okta API to generate an authentication token
- Use the returned token in API requests to Extract
Read on for detailed instructions.
Generate an RSA Keypair
In this guide, we will use the openssl command-line tool, which is available on most Unix systems. First, generate a 2048-bit RSA private key:
openssl genrsa -out private.pem 2048
Next, extract the public key:
openssl rsa -in private.pem -outform PEM -pubout -out public.pem
Send Kensho Your Public Key
Send an email to support@kensho.com with your PEM encoded public key as an attachment. We will respond with your Client ID. This ID is not a secret.
Important: Do not send us your private key!
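Before attaching the file, it can be worth confirming that it really is the public half of the keypair. The following is a minimal sketch using only the Python standard library. The helper name is_public_key_pem is hypothetical, and it only inspects the PEM header and footer lines rather than parsing the key itself.

```python
def is_public_key_pem(pem_text: str) -> bool:
    """Return True if the text looks like a PEM public key with no private key in it."""
    lines = [line.strip() for line in pem_text.strip().splitlines()]
    if not lines:
        return False
    # Any PEM boundary line labelled as a private key means the file must not be sent.
    if any(line.startswith("-----") and "PRIVATE KEY" in line for line in lines):
        return False
    # `openssl rsa -pubout` writes a block with exactly these boundary lines.
    return (
        lines[0] == "-----BEGIN PUBLIC KEY-----"
        and lines[-1] == "-----END PUBLIC KEY-----"
    )
```

For example, reading public.pem as text and passing its contents to is_public_key_pem should return True, while the contents of private.pem should return False.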
Create and Sign a JWT
Most languages have JWT libraries. In this example, we make use of PyJWT, a JWT library for Python.
import jwt
import time

with open("private.pem", "rb") as f:
    private_key = f.read()

client_id = "<from above email>"
iat = int(time.time())
encoded = jwt.encode(
    {
        "aud": "https://kensho.okta.com/oauth2/default/v1/token",
        "exp": iat + (30 * 60),  # expire in 30 minutes
        "iat": iat,
        "sub": client_id,
        "iss": client_id,
    },
    private_key,
    algorithm="RS256",
)
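To double-check the claims before sending the JWT anywhere, the payload can be inspected locally. This is a minimal sketch using only the standard library. The function name decode_jwt_claims_unverified is hypothetical, and it deliberately does not verify the signature, so it is suitable for debugging only.

```python
import base64
import json


def decode_jwt_claims_unverified(token: str) -> dict:
    """Decode the payload segment of a JWT without checking its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

Calling decode_jwt_claims_unverified(encoded) should return a dictionary whose "sub" and "iss" both equal your Client ID and whose "exp" is 1800 seconds after "iat". If your PyJWT version returns bytes from jwt.encode (versions before 2.0 did), call encoded.decode() first.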
Generate an API Token
Make a request to Okta using the JWT to retrieve a non-expiring authentication token.
Note that Content-Type is specified as application/x-www-form-urlencoded. When you call requests.post() in Python, the data dictionary is automatically converted into a string formatted like this:
client_assertion=xxxxxxx&scope=kensho:app:extract&grant_type=client_credentials&client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer
If you are using another programming language, make sure you send the data in the format shown above rather than as JSON.
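The form encoding described above can be reproduced with the Python standard library, which may help when porting the request to another language. This is a sketch rather than what requests does internally. The "xxxxxxx" value is a placeholder for the signed JWT.

```python
from urllib.parse import parse_qs, urlencode

form_fields = {
    "scope": "kensho:app:extract",
    "grant_type": "client_credentials",
    "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
    "client_assertion": "xxxxxxx",  # placeholder for the signed JWT
}

# Serialize the fields as an application/x-www-form-urlencoded body.
body = urlencode(form_fields)
print(body)

# Decoding the body recovers the original values.
decoded = {key: values[0] for key, values in parse_qs(body).items()}
```

Note that urlencode percent-encodes reserved characters, so each ":" appears as %3A in the encoded body. A form-decoding server reads both spellings as the same value.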
import requests

response = requests.post(
    "https://kensho.okta.com/oauth2/default/v1/token",
    headers={
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    },
    data={
        "scope": "kensho:app:extract",
        "grant_type": "client_credentials",
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": encoded,
    },
)
token = response.json()["access_token"]
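If the token request fails, indexing the JSON body for "access_token" raises a bare KeyError. The sketch below surfaces a more useful message instead. The helper name extract_access_token is hypothetical, and it assumes the failure body carries the standard OAuth 2.0 "error" and "error_description" fields, which may not hold for every failure mode.

```python
def extract_access_token(status_code: int, payload: dict) -> str:
    """Return the access token, or raise with whatever error details the body carries."""
    if status_code != 200 or "access_token" not in payload:
        # Assumed field names, taken from the OAuth 2.0 error response format.
        error = payload.get("error", "unknown_error")
        description = payload.get("error_description", "no description provided")
        raise RuntimeError(
            f"token request failed ({status_code}): {error}: {description}"
        )
    return payload["access_token"]
```

With the request above, this could be called as extract_access_token(response.status_code, response.json()).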
Verify Token
To test out your new token, run
curl -H "Authorization: Bearer <your token>" https://extract.kensho.com/me
If you get a response with your client ID, you're in the money!
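The same check can be made from Python without curl. This is a minimal sketch using only the standard library. The helper name build_me_request is hypothetical, and the request is built separately from sending it so that the header can be inspected first.

```python
import urllib.request


def build_me_request(token: str) -> urllib.request.Request:
    """Build a GET request to the /me endpoint carrying the bearer token."""
    return urllib.request.Request(
        "https://extract.kensho.com/me",
        headers={"Authorization": f"Bearer {token}"},
    )


# To actually send it:
#     with urllib.request.urlopen(build_me_request(token)) as resp:
#         print(resp.read().decode())
```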
Authenticating with Trial Account
This guide is meant for users who would like to access Extract programmatically. If you do not yet have an account, sign up for access.
To get started, visit your account to retrieve your refresh token.
Once you have the refresh token, you can use it to generate an access token which you will include when you make requests to the API.
import requests

resp = requests.get("https://extract.kensho.com/oauth2/refresh?refresh_token=<YOUR TOKEN HERE>")
access_token = resp.json()["access_token"]
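Pasting the refresh token directly into the URL works as long as the token contains no characters that need escaping. As a precaution, the query string can be built with the standard library instead. This is a sketch, and the helper name build_refresh_url is hypothetical.

```python
from urllib.parse import urlencode

REFRESH_ENDPOINT = "https://extract.kensho.com/oauth2/refresh"


def build_refresh_url(refresh_token: str) -> str:
    """Return the refresh URL with the token percent-encoded as a query parameter."""
    return f"{REFRESH_ENDPOINT}?{urlencode({'refresh_token': refresh_token})}"
```

The resulting URL can be passed to requests.get() in place of the hand-written string.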
Once you have this token, you can send documents for extraction.
import time

filename = "<PATH TO YOUR DOCUMENT>"  # the file you want to send for extraction
request_url = "https://extract.kensho.com/v2/extractions"
headers = {"Authorization": f"Bearer {access_token}"}

print("Sending a document to extract")
with open(filename, "rb") as f:
    response = requests.post(request_url, files={"file": f}, headers=headers)
response.raise_for_status()
request_id = response.json()["request_id"]
response_url = f"{request_url}/{request_id}"

print(f"Waiting for job {request_id}")
response = requests.get(response_url, headers=headers)
# Poll every two seconds while the job is still pending.
while response.status_code == 200 and response.json()["status"] == "pending":
    time.sleep(2)
    response = requests.get(response_url, headers=headers)

if response.status_code == 200:
    print(response.json())
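The polling loop above runs for as long as the job stays pending. A bounded variant is sketched below, assuming only that the status body contains a "status" field that reads "pending" until the job finishes. The function name wait_for_result, its parameters, and the default timeout are hypothetical choices, not part of the Extract API.

```python
import time
from typing import Callable


def wait_for_result(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job is no longer pending or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result.get("status") != "pending":
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still pending after {timeout} seconds")
        sleep(interval)
```

With the variables from the example above, this could be used as wait_for_result(lambda: requests.get(response_url, headers=headers).json()). Passing the fetch step in as a callable keeps the loop independent of any particular HTTP library.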