Document Structuration

Format

Structured data is returned as an array that contains one dictionary per document.

If options.rdf_export is set to True, the response is instead an object with two keys: data, which holds the array of document dictionaries, and rdf, which holds the RDF string.
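Because the response has two possible shapes, client code may want to normalize it before use. The following is a minimal sketch, assuming the response has already been parsed into Python objects; the helper name split_response and the sample payloads are hypothetical and not part of the API.

```python
from typing import Any, List, Optional, Tuple


def split_response(response: Any) -> Tuple[List[dict], Optional[str]]:
    """Return (documents, rdf) for either response shape."""
    if isinstance(response, dict) and "data" in response:
        # rdf_export enabled: documents under "data", RDF string under "rdf".
        return response["data"], response.get("rdf")
    # Default shape: the response is already the array of documents.
    return response, None


# Hypothetical sample payloads, for illustration only.
plain = [{"project_id": "p1", "document_id": "d1", "cards": []}]
with_rdf = {"data": plain, "rdf": "<rdf string>"}

documents, rdf = split_response(with_rdf)
```

Downstream code can then always work with a list of documents and treat rdf as optional.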

Key | Type | Description
--- | --- | ---
project_id | string | Identifier for the project associated with the document.
document_id | string | Identifier for the processed document.
cards | Structuration Card[] | Array of structuration cards containing detailed information.
question_info | object | Information related to any questions extracted from the document.
pattern_data | object | Information about patterns used in the structuration process.
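As an illustration of how these keys might be read, here is a small sketch that builds a per-document summary. The function name summarize_documents and the sample document are hypothetical; only the key names come from the table above.

```python
from typing import Any, Dict, List


def summarize_documents(documents: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build a compact summary for each document dictionary."""
    summary = []
    for doc in documents:
        summary.append({
            "project_id": doc.get("project_id"),
            "document_id": doc.get("document_id"),
            # Treat a missing "cards" key as an empty list.
            "card_count": len(doc.get("cards", [])),
            # True only when the object is present and non-empty.
            "has_question_info": bool(doc.get("question_info")),
            "has_pattern_data": bool(doc.get("pattern_data")),
        })
    return summary


# Hypothetical sample document, for illustration only.
sample_documents = [
    {
        "project_id": "p1",
        "document_id": "d1",
        "cards": [{"u_id": "c1"}, {"u_id": "c2"}],
        "question_info": {},
        "pattern_data": {"example": True},
    }
]

summary = summarize_documents(sample_documents)
```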

Structuration Cards

The cards key contains an array of structuration cards, each providing detailed information about different aspects of the processed document.

Card Structure

Each card in the cards array has the following structure:

  • time: Timestamp indicating when the card was processed.
  • u_id: Unique identifier for the card.
  • class_id: Array of classification IDs associated with the card.
  • is_perso: Boolean indicating whether the card represents personal information.
  • is_lettria: Boolean indicating whether the card is generated by Lettria.
  • is_individual: Boolean indicating whether the card represents an individual.
  • long_path: Detailed path indicating the classification hierarchy.
  • positions: Array of positions within the document.
  • categories: Array of categories associated with the card.
  • lemma: Lemma associated with the card.
  • negation: Boolean indicating whether negation is present.
  • plural: Boolean indicating plurality.
  • document_paths: Array of paths within the document.
  • attribute: Object containing attribute information.
  • possessing: Object containing possession information.
  • taste: Object containing taste-related information.
  • event: Object containing event-related information.
  • social: Object containing social information.
  • link: Object containing linking information.
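
The card fields listed above can be combined in client code. The sketch below groups card lemmas by category and skips negated cards by default. It assumes that categories is an array of strings and lemma is a string, which the list above does not state explicitly; the function name lemmas_by_category and the sample cards are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def lemmas_by_category(
    cards: List[Dict[str, Any]], include_negated: bool = False
) -> Dict[str, List[str]]:
    """Group card lemmas under each category the card belongs to."""
    grouped = defaultdict(list)
    for card in cards:
        # Skip cards flagged as negated unless the caller asks for them.
        if card.get("negation") and not include_negated:
            continue
        for category in card.get("categories", []):
            grouped[category].append(card.get("lemma"))
    return dict(grouped)


# Hypothetical sample cards, for illustration only.
sample_cards = [
    {"u_id": "c1", "lemma": "car", "categories": ["vehicle"], "negation": False},
    {"u_id": "c2", "lemma": "bike", "categories": ["vehicle", "sport"], "negation": False},
    {"u_id": "c3", "lemma": "boat", "categories": ["vehicle"], "negation": True},
]

grouped = lemmas_by_category(sample_cards)
```

Passing include_negated=True keeps negated cards in the result, which may be useful when negation is handled in a later step.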

Next steps