Getting started with Form Recognizer (preview)

When I first heard about Form Recognizer, I thought to myself, "Wow, there are so many forms in the real world that need to be parsed and digitized!" Once the v2 preview came out, I finally decided to build an app that will parse through all of our user feedback from the last AI Hack Day.

Form Recognizer is an Azure Cognitive Service that lets us parse the text on forms into a structured format. It also gives us the location and size (bounding box) of where information was entered or written, along with the OCR'd text values.

For feedback forms, this means I can get feedback from users by merely uploading their scanned forms to a trained custom Form Recognizer model.
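
To make "bounding box" a bit more concrete, here's a minimal Python sketch. It assumes the service returns each box as a flat list of eight numbers (the x and y of four corners), which is my reading of the v2 preview output and may differ in your version. The `Box` type and `to_box` helper are names I made up for illustration.

```python
from typing import NamedTuple, Sequence


class Box(NamedTuple):
    """Axis-aligned rectangle: top-left corner plus size."""
    left: float
    top: float
    width: float
    height: float


def to_box(polygon: Sequence[float]) -> Box:
    """Convert [x1, y1, x2, y2, x3, y3, x4, y4] into a left/top/width/height box.

    Assumption: the input is four corner points stored as eight numbers.
    """
    if len(polygon) != 8:
        raise ValueError("expected 8 numbers (4 corner points)")
    xs = polygon[0::2]  # every even index is an x coordinate
    ys = polygon[1::2]  # every odd index is a y coordinate
    left, top = min(xs), min(ys)
    return Box(left, top, max(xs) - left, max(ys) - top)
```

Using min and max instead of trusting the corner order means a slightly rotated scan still gives a sensible rectangle.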

Prerequisites

In this blog post, we'll attempt to recognize user feedback with Form Recognizer by training a custom model rather than using a built-in one.

We'll need:

  1. Filled out forms
  2. Form Recognizer service
  3. Azure Storage Account
  4. Form Recognizer tools

This will take a bit of setup but once it's working, you can add as many projects as you want to recognize different forms.

1. Filled out forms

You'll need at least 6 filled out forms for this demo.
As co-organizer of AI Hack Day, I'm using the SSW User Group Evaluation Survey, which you can download, fill out, and scan if you don't have your own forms.

Microsoft also has its own data set if you prefer prefilled forms: https://github.com/Azure-Samples/cognitive-services-REST-api-samples/blob/master/curl/form-recognizer/sample_data.zip

2. Create Form Recognizer

To get started, you can either go to Azure Portal and create a new resource for Form Recognizer or use PowerShell with Azure CLI to create that resource faster.

az cognitiveservices account create --kind FormRecognizer --location EastUS --name [forms-name] --resource-group [resource-group-name] --sku F0 --yes
az cognitiveservices account keys list --name [forms-name] --resource-group [resource-group-name]

Copy Endpoint and Key1 for later use. If you forget to copy them, you'll be able to access them later from Azure Portal under Quick start in the Form Recognizer resource.
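
If you'd rather not paste the key into scripts later, one option is to keep both values in environment variables. Here's a small Python sketch, assuming the hypothetical variable names `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_KEY` (pick whatever you like). The `Ocp-Apim-Subscription-Key` header is the one Cognitive Services REST calls use to pass the key; `load_settings` is just a helper name I chose.

```python
import os
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the endpoint and key, and prepare the auth header for REST calls.

    Assumption: the two variable names below are my own convention.
    """
    endpoint = env.get("FORM_RECOGNIZER_ENDPOINT", "").strip().rstrip("/")
    key = env.get("FORM_RECOGNIZER_KEY", "").strip()
    if not endpoint.startswith("https://"):
        raise ValueError("FORM_RECOGNIZER_ENDPOINT should be an https:// URL")
    if not key:
        raise ValueError("FORM_RECOGNIZER_KEY is missing")
    return {
        "endpoint": endpoint,
        "headers": {"Ocp-Apim-Subscription-Key": key},
    }
```

Passing a plain dictionary instead of the real environment makes it easy to try out without touching your machine's settings.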

3. Create and configure Azure Blob Storage

We need Azure Blob Storage to store our training data, which we'll use to label our forms, and to store the results.

Before starting, download Microsoft Azure Storage Explorer.

The easiest way to create blob storage is:

  1. Go to Azure Portal
  2. Go to your resource group
  3. On the left list click Storage
  4. Click "Storage Account - blob, file, table, queue"
  5. Basics tab
    1. Add name
    2. Change Replication to be "Locally-redundant storage (LRS)"
    3. Change Access tier to be "Cool" (we don't need high performance)
  6. Click "Review + create" and "Create" (no other changes required)
  7. Once the resource is created, open Microsoft Azure Storage Explorer
    1. Add your subscription and find your storage account
  8. Right-click on the "Blob Containers", click "Configure CORS Settings..." and "Add"
    1. Allowed origins: *
    2. Allowed methods: DELETE,GET,HEAD,MERGE,POST,OPTIONS,PUT,PATCH
    3. Allowed headers: *
    4. Exposed headers: *
    5. Max age: 200
    6. Click "Save" twice
  9. Create a new blob container under "Blob Containers"
    1. Right-click the new container and click "Get Shared Access Signature..."
    2. Set Expiry Time to something distant
    3. Add all permissions (it will fail later on if not everything is selected)
    4. Click "Create"
    5. Copy "URL" (if you don't copy it now, it's lost and you'll need to create a new one!)
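
Since a wrong permission or an expired link only shows up later, it can be handy to sanity-check the copied URL. The sketch below reads the `sp` (permissions) and `se` (expiry) query parameters of a shared access signature. Treat the "required" set of read, write, delete and list as my assumption; I simply ticked everything. `inspect_sas_url` and `parse_expiry` are illustrative names, and the URL in the usage note is fake.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: read (r), write (w), delete (d) and list (l) are the minimum needed.
REQUIRED_PERMISSIONS = "rwdl"
# Expiry formats this sketch understands; all are treated as UTC.
_EXPIRY_FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%dT%H:%MZ", "%Y-%m-%d")


def parse_expiry(raw: str) -> datetime:
    """Parse the `se` value into a timezone-aware UTC datetime."""
    for fmt in _EXPIRY_FORMATS:
        try:
            return datetime.strptime(raw, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"unrecognised expiry value: {raw!r}")


def inspect_sas_url(url: str, now: Optional[datetime] = None) -> dict:
    """Report missing permissions and whether the signature has expired."""
    query = parse_qs(urlparse(url).query)
    permissions = set(query.get("sp", [""])[0])
    expiry = parse_expiry(query.get("se", [""])[0])
    now = now or datetime.now(timezone.utc)
    return {
        "missing": sorted(set(REQUIRED_PERMISSIONS) - permissions),
        "expires": expiry,
        "expired": expiry <= now,
    }
```

For example, `inspect_sas_url("https://example.blob.core.windows.net/forms?sp=rl&se=2030-01-01T00%3A00%3A00Z&sig=abc")` reports `d` and `w` as missing, which is a hint to go back and create a new signature.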

4. Get Form Recognizer Tools

For our demo, we'll use an existing Microsoft tool for making custom forms. Make sure you have Docker installed and running.

Run the following command to pull the tool onto your machine:

docker pull mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool

Create a project

Whew, that's a lot of prep work for one service, but now we can start!

Run Form Recognizer Tool via Docker:

docker run -it -p 3000:80 mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool eula=accept

Once it's running, go to "http://localhost:3000" and create a new project.

  1. Add name
  2. "Add Connection"
    1. Add a name for the connection
    2. Add the URL you copied at the end of configuring Azure Blob Storage ("Get Shared Access Signature")
  3. Folder path is empty in this case
  4. Form Recognizer Service Uri and Api Key are Endpoint and Key1 from when we created the service

We finally have our project created!

Start labeling

You'll notice that the project is empty and it doesn't give you options to upload images. To get your forms in (images or PDF), you'll have to upload them to the Blob Storage we created for this project. You can use Microsoft Azure Storage Explorer to upload your forms.

TIP: Don't upload all of your images because you want to leave some of them for testing.
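
If you have a folder full of scans, a few lines of Python can pick the hold-out set for you. This is only a sketch: `split_forms` is a made-up helper, the default of two test files is arbitrary, and the fixed seed is there so you get the same split every time.

```python
import random
from typing import List, Sequence, Tuple


def split_forms(files: Sequence[str], holdout: int = 2,
                seed: int = 0) -> Tuple[List[str], List[str]]:
    """Return (files to upload for training, files kept back for testing)."""
    if holdout < 0 or holdout >= len(files):
        raise ValueError("holdout must be >= 0 and smaller than the number of files")
    shuffled = sorted(files)               # start from a stable order
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    testing = sorted(shuffled[:holdout])
    training = sorted(shuffled[holdout:])
    return training, testing
```

Upload only the first list to the blob container and keep the second one on disk for the Testing step.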

Once they are uploaded, refresh the page on the Form Recognizer Tool.

  1. Wait for OCR to complete
  2. Create tags in the right column you want to recognize (name, email, etc.)
  3. Select the detected text (it will be light green)
  4. Select the correct tag (e.g. "Position")
  5. The selected text will now have the same color as the tag

Do this for all of the text you want to be tagged on all of the forms.

NOTE: This tool supports only one form per image/PDF. A file with multiple pages will be treated as a single multipage form!

Training

  1. Go to the Train icon
  2. Click "Train"
  3. Done! 🎉

Testing

  1. Go to the icon with two squares
  2. Browse
  3. Predict! 🎉

You can also download the JSON file or use an HTTP request (coming in a later blog post).

NOTE: If you are using the free tier, this may take a while because of the rate limit.
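
If you download the JSON, pulling the tagged values out takes only a few lines. This is a rough sketch that assumes the result has an `analyzeResult.documentResults[].fields` structure where each tag maps to an object with `text` and `confidence`. Check your own file, since the preview format may differ. `extract_fields` is just a name I picked.

```python
from typing import Any, Dict


def extract_fields(result: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Collect {tag: {"text": ..., "confidence": ...}} from a parsed result.

    Assumption: the JSON layout described above; tags with no value are skipped.
    """
    fields: Dict[str, Dict[str, Any]] = {}
    documents = result.get("analyzeResult", {}).get("documentResults", [])
    for document in documents:
        for tag, value in document.get("fields", {}).items():
            if value is None:  # the tag was not found on this form
                continue
            fields[tag] = {
                "text": value.get("text"),
                "confidence": value.get("confidence"),
            }
    return fields
```

Load the downloaded file with `json.load` and pass the resulting dictionary in. Sorting the output by confidence is a quick way to find the values that need a manual check.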

What's next

Form Recognizer works well for the most part; however, there are a few issues I need to address:

  1. OCR isn't always accurate, especially when someone's handwriting is hard to read
  2. This version doesn't recognize checkboxes (the feature is on their backlog)
  3. When uploading a multipage PDF, it treats it as a single form on multiple pages. We need to split it up into multiple images/PDFs
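
For the third issue, the first step is working out which pages belong to which form. Here's a tiny sketch, assuming every form in the scan has the same number of pages. `form_page_ranges` is a made-up helper, and the actual splitting would still need a PDF library of your choice.

```python
from typing import List, Tuple


def form_page_ranges(total_pages: int, pages_per_form: int) -> List[Tuple[int, int]]:
    """Return 1-based, inclusive (first_page, last_page) pairs, one per form."""
    if pages_per_form < 1 or total_pages < 0:
        raise ValueError("page counts must be positive")
    if total_pages % pages_per_form != 0:
        raise ValueError("the scan does not divide evenly into forms")
    return [
        (start, start + pages_per_form - 1)
        for start in range(1, total_pages + 1, pages_per_form)
    ]
```

A six-page scan of two-page forms gives `[(1, 2), (3, 4), (5, 6)]`, so each pair becomes its own file to upload.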

At the moment, I'm experimenting with recognizing checkboxes myself. Still, I'm happy that we can quickly turn images into structured data, then do a bit of correction and add ratings rather than typing everything from scratch. I'm looking forward to the day when our feedback forms for hack days and user groups are automated and we can send feedback to guest speakers much quicker (and enjoy the weekend instead of entering data! 😁).