Model capabilities

Each service relies on an underlying AI model.
The model's capabilities determine which types of output the service can produce and which parts of the response are populated.

A model can support:

  • Information extraction
  • Classification
  • Information extraction and classification

Note

Not all models have the same capabilities. Some models extract information only, some classify documents only, and others support both.

Available capabilities

Capability              Description
information_extraction  Extracts structured data from the document, such as names, identifiers, dates, or other configured fields.
classification          Assigns the document to one of the available service categories.

Response content by capability

The response structure is generally consistent across services, but the content of some fields depends on the capabilities supported by the model.

Information extraction only

When a model supports information extraction only:

Response field                        Behaviour
service_fields                        Populated with the fields configured for the service.
service_categories                    Provided, but empty.
document.category                     null.
document_pages[].entities             Populated with the extracted entities for each page.
document_pages[].aggregated_entities  Populated when aggregated entities are available.

Example response (information extraction only):
{
  "service_fields": {
    "surname_1": {
      "tag_alias": "Surname",
      "tag_multiple_values": true
    }
  },
  "service_categories": {},
  "document": {
    "category": null
  },
  "document_pages": [
    {
      "entities": {
        "surname_1": [
          {
            "text": "VERDI",
            "confidence": 99.43
          }
        ]
      },
      "aggregated_entities": {}
    }
  ]
}
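
As an illustration of reading an extraction response, the following Python sketch gathers the extracted text values per field across all pages. The function name `collect_entity_texts` and the `sample` payload are hypothetical; only the keys shown in the example above (`document_pages`, `entities`, `text`) are taken from this page. It treats a missing or empty `entities` object as "nothing extracted", so it should also tolerate responses from classification-only models.

```python
from typing import Any


def collect_entity_texts(response: dict[str, Any]) -> dict[str, list[str]]:
    """Gather the extracted text values for each field across all pages."""
    texts: dict[str, list[str]] = {}
    for page in response.get("document_pages") or []:
        # "entities" is provided but empty when the model does not extract.
        for field, values in (page.get("entities") or {}).items():
            texts.setdefault(field, []).extend(v["text"] for v in values)
    return texts


# Hypothetical payload, trimmed from the example response above.
sample = {
    "document_pages": [
        {
            "entities": {"surname_1": [{"text": "VERDI", "confidence": 99.43}]},
            "aggregated_entities": {},
        }
    ]
}

print(collect_entity_texts(sample))  # {'surname_1': ['VERDI']}
```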

Classification only

When a model supports classification only:

Response field                        Behaviour
service_categories                    Populated with the available document categories.
document.category                     Populated with the selected document category.
service_fields                        Provided, but empty.
document_pages[].entities             Provided, but empty.
document_pages[].aggregated_entities  Provided, but empty.

Example response (classification only):
{
  "service_fields": {},
  "service_categories": {
    "id_card": "Identity Card",
    "passport": "Passport",
    "health_card": "Health Card",
    "other": "Other"
  },
  "document": {
    "category": {
      "value": "health_card",
      "confidence": 98.4
    }
  },
  "document_pages": [
    {
      "entities": {},
      "aggregated_entities": {}
    }
  ]
}
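
A client reading the classification result has to allow for `document.category` being null. The Python sketch below shows one way to do that. The function name `selected_category` and the `sample` payload are hypothetical, and the lookup of a display label assumes that `document.category.value` matches a key of `service_categories`, as it does in the example above; this page does not state that as a guarantee.

```python
from typing import Any, Optional


def selected_category(response: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the selected category with its display label, or None."""
    category = (response.get("document") or {}).get("category")
    if category is None:
        # Extraction-only models return document.category as null.
        return None
    labels = response.get("service_categories") or {}
    return {
        "value": category["value"],
        # Assumption: category values are keys of service_categories.
        "label": labels.get(category["value"]),
        "confidence": category["confidence"],
    }


# Hypothetical payload, trimmed from the example response above.
sample = {
    "service_categories": {"passport": "Passport", "health_card": "Health Card"},
    "document": {"category": {"value": "health_card", "confidence": 98.4}},
}

print(selected_category(sample))
# {'value': 'health_card', 'label': 'Health Card', 'confidence': 98.4}
```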

Information extraction and classification

When a model supports both information extraction and classification, the response contains both extracted data and classification results.

Response field                        Behaviour
service_fields                        Populated with the fields configured for the service.
service_categories                    Populated with the available document categories.
document.category                     Populated with the selected document category.
document_pages[].entities             Populated with the extracted entities for each page.
document_pages[].aggregated_entities  Populated when aggregated entities are available.

Example response (information extraction and classification):
{
  "service_fields": {
    "surname_1": {
      "tag_alias": "Surname",
      "tag_multiple_values": true
    },
    "name_1": {
      "tag_alias": "Name",
      "tag_multiple_values": true
    }
  },
  "service_categories": {
    "id_card": "Identity Card",
    "passport": "Passport",
    "health_card": "Health Card",
    "other": "Other"
  },
  "document": {
    "category": {
      "value": "health_card",
      "confidence": 98.4
    }
  },
  "document_pages": [
    {
      "entities": {
        "surname_1": [
          {
            "text": "VERDI",
            "confidence": 99.43
          }
        ],
        "name_1": [
          {
            "text": "GIUSEPPE",
            "confidence": 99.47
          }
        ]
      },
      "aggregated_entities": {}
    }
  ]
}

Note

A field can be present in the response but empty when the model does not support the related capability.

Warning

Do not assume that all services return the same populated content. The populated sections of the response depend on the capabilities of the model used by the service.
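
One way to act on this warning is to check which sections are populated before reading them. The Python sketch below infers the capabilities from the response itself, based on the behaviour tables on this page: `service_fields` is populated only when the model extracts information, and `service_categories` only when it classifies. The function name `populated_capabilities` is hypothetical, and the check is a heuristic, not an official capability lookup.

```python
from typing import Any


def populated_capabilities(response: dict[str, Any]) -> set[str]:
    """Infer the model capabilities from the populated response sections."""
    capabilities: set[str] = set()
    if response.get("service_fields"):
        # Non-empty only when the model supports information extraction.
        capabilities.add("information_extraction")
    if response.get("service_categories"):
        # Non-empty only when the model supports classification.
        capabilities.add("classification")
    return capabilities


# Hypothetical classification-only payload.
response = {
    "service_fields": {},
    "service_categories": {"passport": "Passport", "other": "Other"},
    "document": {"category": {"value": "passport", "confidence": 97.0}},
    "document_pages": [{"entities": {}, "aggregated_entities": {}}],
}

print(populated_capabilities(response))  # {'classification'}
```

Code that consumes several services can branch on the returned set instead of assuming that entities or a category are always present.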