Document Detection

Detecting documents in images and preprocessing them.

Overview

Document Detection is one of the intelligence services of the Filestack platform. It detects a document in an image, warps it so that it fills the whole image, and can preprocess it (for example, de-noising and distortion reduction) to increase the accuracy of an OCR engine during text extraction. Please see the following resources to learn more:

Resources

The Document Detection API only accepts images with a resolution of no more than 2000x2000 pixels. You can chain a Resize task before it to bring a larger image within this limit.
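The size check can be sketched as follows. This is a minimal illustration, not part of any Filestack SDK: the helper name resize_task_for is hypothetical, and the sketch assumes that you already know the image dimensions, that the limit applies to width and height separately, and that resize=h:<HEIGHT> keeps the aspect ratio.

```python
import math
from typing import Optional

MAX_SIDE = 2000  # limit stated above: no more than 2000x2000 pixels


def resize_task_for(width: int, height: int) -> Optional[str]:
    """Return a resize task string, or None if the image already fits.

    Assumption: resize=h:<HEIGHT> scales the width proportionally.
    """
    if width <= MAX_SIDE and height <= MAX_SIDE:
        return None
    # Scale factor that brings the longer side down to the limit.
    scale = min(MAX_SIDE / width, MAX_SIDE / height)
    # Round down so the result never exceeds the limit.
    target_height = max(1, math.floor(height * scale))
    return f"resize=h:{target_height}"
```

When the function returns a task string, it would go before doc_detection in the task chain; when it returns None, no resize is needed.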

Processing API

Document Detection is available as a synchronous operation in the Processing API using the following task:

doc_detection=coords:<coords>,preprocess:<preprocess>

The coords and preprocess parameters in the above signature are optional. If you omit them and use just /doc_detection/, the default values coords:false and preprocess:true apply.
To use this task in the Processing API, you must use a security policy and signature. See the security policies documentation to learn more.
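As a rough sketch of how a policy and signature pair might be produced, the code below assumes the scheme described in Filestack's security documentation: the policy is URL-safe Base64-encoded JSON, and the signature is the hex HMAC-SHA256 of that policy keyed with your app secret. The policy fields expiry and call, the "convert" call value, and both function names are assumptions made for illustration; confirm them against the security policies documentation before relying on them.

```python
import base64
import hashlib
import hmac
import json
from typing import List, Tuple


def make_security(app_secret: str, expiry: int, calls: List[str]) -> Tuple[str, str]:
    """Return (policy, signature) for use in a Processing API URL.

    Assumed policy shape: {"expiry": <unix timestamp>, "call": [...]}.
    """
    policy_json = json.dumps({"expiry": expiry, "call": calls})
    policy = base64.urlsafe_b64encode(policy_json.encode("utf-8")).decode("ascii")
    # Sign the encoded policy string, not the raw JSON.
    signature = hmac.new(
        app_secret.encode("utf-8"), policy.encode("ascii"), hashlib.sha256
    ).hexdigest()
    return policy, signature


def security_segment(policy: str, signature: str) -> str:
    """Format the pair as the security=p:<POLICY>,s:<SIGNATURE> URL segment."""
    return f"security=p:{policy},s:{signature}"
```

Generate the pair on a server you control, since the app secret must not be exposed to clients.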

Parameters

Parameter   Type     Default  Description
coords      boolean  false    Indicates whether the task returns the coordinates of the detected document in the image.
preprocess  boolean  true     Indicates whether the task returns the preprocessed image (true) or only the warped one (false).
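To illustrate how the two parameters map onto the task string, here is a small sketch. The helper name doc_detection_task is hypothetical; it only formats the string and leaves the defaults to the service whenever a parameter is omitted.

```python
from typing import Optional


def doc_detection_task(
    coords: Optional[bool] = None, preprocess: Optional[bool] = None
) -> str:
    """Build the doc_detection task segment of a Processing API URL.

    Parameters left as None are omitted, so the service defaults
    (coords:false, preprocess:true) apply.
    """
    params = []
    if coords is not None:
        params.append(f"coords:{str(coords).lower()}")
    if preprocess is not None:
        params.append(f"preprocess:{str(preprocess).lower()}")
    if not params:
        return "doc_detection"
    return "doc_detection=" + ",".join(params)
```

For example, doc_detection_task(coords=True) produces doc_detection=coords:true, and doc_detection_task() produces the bare doc_detection task.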

Response

  • Original image:

  • /doc_detection=coords:true/
{
    "coords": [
        [415, 727],
        [1065, 719],
        [1072, 1624],
        [421, 1633]
    ]
}
The points are (x, y) pairs listed in this order: top left, top right, bottom right, and bottom left corner of the document.

Response Parameters

Parameter  Type   Description
coords     array  The (x, y) coordinates of the four corners of the detected document in your image.
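  • /doc_detection=coords:false,preprocess:true/ returns the preprocessed, warped image of the document.

  • /doc_detection=coords:false,preprocess:false/ returns the warped image of the document without preprocessing.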
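When coords is enabled, the JSON response can be turned into named corners on the client side. The sketch below assumes the corner order described above (top left, top right, bottom right, bottom left); the function names are hypothetical, and the sample body reuses the coordinates from the example response.

```python
import json
from typing import Dict, Tuple

Point = Tuple[int, int]
CORNER_ORDER = ("top_left", "top_right", "bottom_right", "bottom_left")


def named_corners(response_body: str) -> Dict[str, Point]:
    """Map the four [x, y] pairs in a coords response to corner names."""
    coords = json.loads(response_body)["coords"]
    if len(coords) != 4:
        raise ValueError(f"expected 4 corner points, got {len(coords)}")
    return {name: (x, y) for name, (x, y) in zip(CORNER_ORDER, coords)}


def bounding_box(corners: Dict[str, Point]) -> Tuple[int, int, int, int]:
    """Return (min_x, min_y, max_x, max_y) enclosing the detected document."""
    xs = [x for x, _ in corners.values()]
    ys = [y for _, y in corners.values()]
    return min(xs), min(ys), max(xs), max(ys)


sample = '{"coords": [[415, 727], [1065, 719], [1072, 1624], [421, 1633]]}'
corners = named_corners(sample)
box = bounding_box(corners)
```

The bounding box is an axis-aligned rectangle around the detected quadrilateral, which can be handy for a quick crop preview before requesting the warped image.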

Examples

  • Get the coordinates of the detected document in the image (the result is the same for either value of preprocess):
https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:true,preprocess:true/<HANDLE>
  • Get the preprocessed warped document from your original image:
https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:false,preprocess:true/<HANDLE>
  • Get the warped document from your original image:
https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:false,preprocess:false/<HANDLE>
  • Use doc_detection in a chain with other tasks such as resize:
https://cdn.filestackcontent.com/security=p:<POLICY>,s:<SIGNATURE>/resize=h:<HEIGHT>/doc_detection=coords:false,preprocess:true/<HANDLE>
  • Use doc_detection with an external URL or a storage alias:
https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:<COORDS>,preprocess:<PREPROCESS>/<EXTERNAL_URL>
https://cdn.filestackcontent.com/<FILESTACK_API_KEY>/security=p:<POLICY>,s:<SIGNATURE>/doc_detection=coords:<COORDS>,preprocess:<PREPROCESS>/src://<STORAGE_ALIAS>/<PATH_TO_FILE>
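The URL patterns above all follow the same layout, which can be assembled programmatically. The sketch below is only an illustration: the function name processing_url is hypothetical, the source is passed through unchanged (no percent-encoding is applied), and the policy and signature are assumed to have been generated already.

```python
from typing import Optional, Sequence

CDN_BASE = "https://cdn.filestackcontent.com"


def processing_url(
    tasks: Sequence[str],
    source: str,
    policy: str,
    signature: str,
    api_key: Optional[str] = None,
) -> str:
    """Join the URL segments in the order used by the examples above.

    source is a file handle, an external URL, or a src://<alias>/<path>
    reference. Pass api_key for the external URL and src:// forms, which
    put the API key right after the host in the examples.
    """
    segments = [CDN_BASE]
    if api_key is not None:
        segments.append(api_key)
    segments.append(f"security=p:{policy},s:{signature}")
    segments.extend(tasks)  # tasks are applied left to right in the chain
    segments.append(source)
    return "/".join(segments)
```

A chained call would pass, for example, ["resize=h:1000", "doc_detection=coords:false,preprocess:true"] as tasks, which places the resize task before doc_detection in the URL.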

Workflows Task Configuration

Visit the Creating Workflows Tutorial to learn how to use the Workflows UI to configure your tasks and the logic between them.

The Document Detection task is available under the Intelligence tasks category.

Workflows Parameters

Parameter   Type     Default  Description
Task Name   string   -        Unique name of the task. It is included in the webhook response and can be used to build the logic described below.
coords      boolean  false    Indicates whether the task returns the coordinates of the detected document in the image.
preprocess  boolean  true     Indicates whether the task returns the preprocessed image (true) or only the warped one (false).

Logic

The Document Detection task returns the following responses to the workflow:

  • If coords is enabled:
{
    "data": {
        "coords": [
            [
                "horizontal (x) coordinate of top left corner",
                "vertical (y) coordinate of top left corner"
            ],
            [
                "horizontal (x) coordinate of top right corner",
                "vertical (y) coordinate of top right corner"
            ],
            [
                "horizontal (x) coordinate of bottom right corner",
                "vertical (y) coordinate of bottom right corner"
            ],
            [
                "horizontal (x) coordinate of bottom left corner",
                "vertical (y) coordinate of bottom left corner"
            ]
        ]
    }
}
  • If coords is not enabled:
{
    "url": "the URL where the image is stored",
    "mimetype": "image/<image_format>",
    "size": "image size"
}

Logic Parameters

Parameter  Type        Description
data       dictionary  Contains the coordinates of the detected document.
coords     array       The (x, y) coordinates of the four corners of the detected document.
url        string      The URL of the result file.
mimetype   string      The type and format of the result file.
size       integer     The size of the result file in bytes.

Based on the response from the task, you can build logic that tells the workflow how dependent tasks should be executed. For example, to run another task only if the size of the result file is greater than or equal to a specific value, you can use the following rule:

size gte 300000
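To make the meaning of such a rule concrete, the sketch below evaluates it locally against a task response. It is not how the Workflows engine is implemented; the function name rule_matches is hypothetical, only the gte operator shown above is supported, and the rule is assumed to have the form "<field> <operator> <integer>".

```python
import operator
from typing import Any, Callable, Dict

# Only "gte" appears in the rule above; add other operators only after
# checking which names the Workflows UI actually supports.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {"gte": operator.ge}


def rule_matches(rule: str, result: Dict[str, Any]) -> bool:
    """Evaluate a '<field> <operator> <integer>' rule against a task result."""
    field, op_name, raw_value = rule.split()
    if op_name not in OPERATORS:
        raise ValueError(f"unsupported operator: {op_name}")
    if field not in result:
        # For example, a coords response has no "size" field.
        return False
    return OPERATORS[op_name](result[field], int(raw_value))
```

With a result whose size is 351697 bytes, the rule size gte 300000 matches, so the dependent task would run.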

In the Workflows UI, this rule looks like the following example:

You can visit the Creating Workflows Tutorial to learn how to use the Workflows UI to configure your tasks and the logic between them.

Webhook

Below you can find an example webhook payload for a Document Detection task on a sample image:

  • If coords is enabled:
{
    "id": 77053449,
    "action": "fs.workflow",
    "timestamp": 1559236562,
    "text": {
        "workflow": "c587a2a7-9e66-4a88-8cd2-12b47fbba4dc",
        "jobid": "33083c2d-26f2-41f0-ba92-3617911f6828",
        "createdAt": "2019-05-30T17:09:45.67206082Z",
        "updatedAt": "2019-05-30T17:09:47.484986646Z",
        "sources": [
            "i6h2GStRSUexc6iR4fFq"
        ],
        "results": {
            "doc_detection_1554316107837": {
                "data": {
                    "coords": [
                        [
                            91,
                            56
                        ],
                        [
                            1980,
                            71
                        ],
                        [
                            1381,
                            2048
                        ],
                        [
                            87,
                            1938
                        ]
                    ]
                }
            }
        },
        "status": "Finished"
    }
}
  • If coords is not enabled:
{
    "id": 67015448,
    "action": "fs.workflow",
    "timestamp": 1551981663,
    "text": {
        "workflow": "0a3c0172-e7d5-4994-a6e0-b02hs7hv3q3t",
        "createdAt": "2019-03-07T16:28:13.520745456Z",
        "updatedAt": "2019-03-07T16:28:15.452548223Z",
        "sources": [
            "jTHlNSQRQgj5gsg5p0y3"
        ],
        "results": {
            "doc_detection_1551973696993": {
                "url": "http://cdn.filestackcontent.com/A3d9FYlESAMd92hc3G60W/wf://0a3c0172-e7d5-4994-a6e0-b02hs7hv3q3t/jTHlNSQRQgj5gsg5p0y3/3c17c5dc-dd93-4cf1-abac-0whs7b5v2p/7e900f75d25d12a93305171ts5b8ps3v",
                "mimetype": "image/jpeg",
                "size": 351697
            }
        },
        "status": "Finished"
    }
}
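A webhook receiver has to handle both payload shapes shown above. The sketch below reads one task result from an already parsed payload; the function name doc_detection_result is hypothetical, the task name is whatever you configured in the workflow, and treating any status other than "Finished" as an error is an assumption based on the sample payloads only.

```python
from typing import Any, Dict


def doc_detection_result(payload: Dict[str, Any], task_name: str) -> Dict[str, Any]:
    """Summarize one Document Detection result from a workflow webhook payload."""
    text = payload["text"]
    if text.get("status") != "Finished":
        raise ValueError(f"workflow not finished: {text.get('status')!r}")
    result = text["results"][task_name]
    if "data" in result:
        # coords enabled: four [x, y] pairs, starting at the top left corner.
        points = [tuple(point) for point in result["data"]["coords"]]
        return {"kind": "coords", "coords": points}
    # coords not enabled: the task stored an image and reports where it is.
    return {
        "kind": "image",
        "url": result["url"],
        "mimetype": result["mimetype"],
        "size": result["size"],
    }
```

In a real receiver, the payload would come from parsing the request body as JSON, and the returned summary could drive further processing such as cropping or downloading the stored image.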

Please visit the webhooks documentation page to learn more.