ISCC - Main High-Level Functions

SDK main top-level functions.

code_iscc(fp, name=None, description=None, meta=None, **options)

Generate a complete ISCC-CODE for the given file.

This function creates a full ISCC-CODE by combining Meta, Content, Data, and Instance Codes. It automatically detects the media type and processes the file accordingly.

The function performs the following steps:

1. Reads the file and determines its media type.
2. Generates Data & Instance Codes for all file types using the code_sum function.
3. For supported media types, generates Content Code and optional Semantic Code.
4. If enabled, generates Meta-Code from embedded or provided metadata.
5. Combines all generated code units into a single ISCC-CODE.
6. Merges metadata from all ISCC units.

ISCC-CODE is a composite identifier that consists of multiple ISCC-UNITs, each serving a specific purpose:

  • Meta-Code: Based on normalized metadata (title, description)
  • Semantic-Code: Based on semantic features (experimental, requires additional packages)
  • Content-Code: Based on the content features (text, image, audio, video)
  • Data-Code: Based on the raw binary data (similarity preserving hash)
  • Instance-Code: Based on the exact binary data (cryptographic hash)

Note:

  • The behavior can be customized through the sdk_opts settings. For example, setting fallback to True will allow processing of unsupported media types in a fallback mode instead of raising an exception.
  • For processing container files (like EPUB with embedded files), set process_container to True to extract and process contained files.

Parameters:

Name Type Description Default
fp

Path object or str representing the filepath of the file to process.

required
name

Optional name to override extracted metadata.

None
description

Optional description to override extracted metadata.

None
meta

Optional metadata (dict or Data-URL as string) to override extracted metadata.

None
options

Keyword arguments forwarded to sdk_opts:

  • extract_meta - Whether to extract metadata. Default: True
  • fallback - Process unsupported media types. Default: False
  • add_units - Include ISCC-UNITS in metadata. Default: False
  • create_meta - Create Meta-Code. Default: True
  • wide - Enable wide mode for ISCC-SUM with Data & Instance codes only. Default: False
  • experimental - Enable experimental semantic codes. Default: False
  • process_container - Process container files and extract contained files. Default: False
  • granular - Generate additional granular fingerprints. Default: False

{}

Returns:

Type Description

IsccMeta object with complete ISCC-CODE and merged metadata from all ISCC-UNITs.

Raises:

Type Description
idk.IsccUnsupportedMediatype

If the media type is not supported. By default, the function will raise this exception for unsupported media types, as sdk_opts.fallback is False by default.
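The fallback behavior described above can be sketched as a small retry wrapper. This is a minimal sketch and not part of the SDK: `code_with_fallback` is a hypothetical helper, and the coding function and exception type are passed in as arguments so the snippet runs without iscc-sdk installed. The commented usage lines assume the package is importable as `iscc_sdk`.

```python
def code_with_fallback(fp, code_fn, unsupported_exc, **options):
    """Try strict processing first; retry with fallback=True if unsupported.

    code_fn:          a callable with a code_iscc-like signature (fp, **options)
    unsupported_exc:  the exception type raised for unsupported media types
    """
    try:
        return code_fn(fp, **options)
    except unsupported_exc:
        # Retry in fallback mode, as described for the `fallback` option.
        return code_fn(fp, **{**options, "fallback": True})


# Possible usage (assumption: iscc-sdk is installed and importable as iscc_sdk):
#
#   import iscc_sdk as idk
#   meta = code_with_fallback(
#       "asset.bin", idk.code_iscc, idk.IsccUnsupportedMediatype, name="My Asset"
#   )
```

Passing `fallback=True` directly on the first call is simpler when strict behavior is never needed; the wrapper is only useful if you want to notice which files fell back.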

code_iscc_mt(fp, name=None, description=None, meta=None, **options)

Generate a complete ISCC-CODE for the given file using multithreading.

This function creates a full ISCC-CODE by combining Meta, Content, Data, and Instance Codes. It automatically detects the media type and processes the file accordingly.

The function performs the following steps:

1. Reads the file and determines its MIME type.
2. Generates Instance and Data Codes for all file types.
3. For supported media types, generates Content and Meta Codes.
4. Combines all generated codes into a single ISCC-CODE.
5. Merges metadata from all ISCC units.

Note:

  • This function uses multithreading to improve performance.
  • The behavior can be customized through the sdk_opts settings. For example, setting fallback to True will allow processing of unsupported media types in a fallback mode instead of raising an exception.
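As a rough illustration of why multithreading can help here, independent units can be computed concurrently and collected afterwards. The sketch below uses only the standard library. `unit_a`, `unit_b`, and `compute_units_threaded` are hypothetical stand-ins, and this is not how code_iscc_mt is actually implemented.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def unit_a(data: bytes) -> str:
    # Stand-in for one independent unit computation (not a real ISCC-UNIT).
    return hashlib.sha256(data).hexdigest()[:16]


def unit_b(data: bytes) -> str:
    # Stand-in for a second, independent unit computation.
    return hashlib.blake2b(data, digest_size=8).hexdigest()


def compute_units_threaded(data: bytes) -> dict:
    """Run independent unit generators in parallel and collect results by name."""
    tasks = {"unit_a": unit_a, "unit_b": unit_b}
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {name: pool.submit(fn, data) for name, fn in tasks.items()}
        # .result() blocks until the task is done and re-raises worker exceptions.
        return {name: future.result() for name, future in futures.items()}
```

Because each task only reads the shared input, no locking is needed; the results are merged once all futures have completed.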

Parameters:

Name Type Description Default
fp

str or Path object representing the filepath of the file to process.

required
name

Optional name to override extracted metadata.

None
description

Optional description to override extracted metadata.

None
meta

Optional metadata (dict or Data-URL as string) to override extracted metadata.

None
options

Keyword arguments forwarded to sdk_opts:

  • fallback - Process unsupported media types. Default: False
  • add_units - Include ISCC-UNITS in metadata. Default: False
  • create_meta - Create Meta-Code unit from embedded metadata. Default: True
  • wide - Enable wide mode for ISCC-SUM with Data & Instance codes only. Default: False
  • experimental - Enable experimental semantic codes. Default: False

{}

Returns:

Type Description

IsccMeta object with complete ISCC-CODE and merged metadata from all ISCC-UNITs.

Raises:

Type Description
idk.IsccUnsupportedMediatype

If the media type is not supported. By default, the function will raise this exception for unsupported media types, as sdk_opts.fallback is False by default.

code_meta(fp, name=None, description=None, meta=None, **options)

Generate a Meta-Code for a digital asset.

Creates an ISCC Meta-Code based on normalized metadata extracted from the file. If no name is found in metadata, the filename will be used instead.
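The name precedence implied here (explicit argument, then extracted metadata, then filename) can be sketched as follows. `resolve_name` is a hypothetical helper and not SDK code. Whether the SDK falls back to the full filename or only its stem is not stated here, so using `Path(fp).name` is an assumption.

```python
from pathlib import Path


def resolve_name(fp, name=None, extracted=None):
    """Pick the name to feed into Meta-Code creation.

    Precedence: explicit `name` > name found in extracted metadata > filename.
    """
    extracted = extracted or {}
    if name:
        return name
    if extracted.get("name"):
        return extracted["name"]
    # Assumption: the full filename (with extension) is used as the fallback.
    return Path(fp).name
```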

Parameters:

Name Type Description Default
fp

Filepath used for Meta-Code creation.

required
name

Optional name to override extracted metadata.

None
description

Optional description to override extracted metadata.

None
meta

Optional metadata (Data-URL as string or dict) to override extracted metadata.

None
options

Keyword arguments forwarded to sdk_opts: extract_meta - Whether to extract metadata. Default: True; bits - Bit-length of the generated Meta-Code UNIT. Default: 64

{}

Returns:

Type Description

ISCC metadata including Meta-Code and extracted metadata fields.

code_content(fp, **options)

Detect mediatype and create corresponding Content-Code.

Analyzes the file to determine its media type and routes the processing to the appropriate specialized function (code_text, code_image, code_audio, or code_video).
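The routing described above amounts to a lookup from media type to handler. The sketch below is a simplified illustration with assumptions: it guesses the type from the file extension via the standard `mimetypes` module, whereas the SDK analyzes the file itself, and it routes only on the top-level type (so a document format such as `application/pdf` would not map to code_text here). `pick_handler`, `HANDLERS`, and `UnsupportedMediatype` are hypothetical names.

```python
import mimetypes

# Top-level media type -> name of the specialized function listed above.
HANDLERS = {
    "text": "code_text",
    "image": "code_image",
    "audio": "code_audio",
    "video": "code_video",
}


class UnsupportedMediatype(Exception):
    """Stand-in for idk.IsccUnsupportedMediatype."""


def pick_handler(fp: str) -> str:
    """Return the handler name for a file, based on its guessed media type."""
    mediatype, _encoding = mimetypes.guess_type(fp)
    top_level = (mediatype or "").split("/")[0]
    try:
        return HANDLERS[top_level]
    except KeyError:
        raise UnsupportedMediatype(f"{fp}: {mediatype}") from None
```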

Parameters:

Name Type Description Default
fp

Filepath

required
options

Keyword arguments forwarded to sdk_opts: extract_meta - Whether to extract metadata. Default: True; create_thumb - Whether to create a thumbnail. Default: True

{}

Returns:

Type Description

Content-Code wrapped in ISCC metadata.

Raises:

Type Description
idk.IsccUnsupportedMediatype

If the media type is not supported.

code_text(fp, text=None, **options)

Generate Content-Code Text.

Creates a Text-Code by extracting and processing text content from document files. Can optionally extract metadata and create a thumbnail representation of the text.

Parameters:

Name Type Description Default
fp

Filepath used for Text-Code creation.

required
text

Optional cleaned text. If provided, the function will skip text extraction.

None
options

Keyword arguments forwarded to sdk_opts:

  • extract_meta - Whether to extract metadata. Default: True
  • create_thumb - Whether to create a thumbnail. Default: True
  • bits - Bit-length of the generated Text-Code UNIT. Default: 64
  • granular - Whether to generate additional granular fingerprints. Default: False

{}

Returns:

Type Description

ISCC metadata including Text-Code.

code_text_semantic(fp, text=None, **options)

Generate Semantic-Code Text. (Requires iscc-sct to be installed)

Parameters:

Name Type Description Default
fp

Filepath used for semantic Text-Code creation.

required
text

Optional cleaned text. If provided, the function will skip text extraction.

None

Raises:

Type Description
idk.EnvironmentError

If iscc-sct is not installed.
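To avoid hitting this exception at call time, the optional dependency can be checked up front. The following is a minimal sketch using only the standard library. `is_installed` is a hypothetical helper, and the import name `iscc_sct` for the iscc-sct package is an assumption.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Possible usage (assumption: iscc-sct is importable as `iscc_sct`):
#
#   if is_installed("iscc_sct"):
#       meta = idk.code_text_semantic("document.txt")
#   else:
#       print("Semantic Text-Code unavailable: iscc-sct is not installed")
```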

code_image(fp, **options)

Generate Content-Code Image.

Creates an Image-Code by normalizing and processing the visual content of image files. The image is normalized according to SDK options (transparency handling, border trimming, ...).

Parameters:

Name Type Description Default
fp

Filepath used for Image-Code creation.

required
options

Keyword arguments forwarded to sdk_opts: extract_meta - Whether to extract metadata. Default: True; create_thumb - Whether to create a thumbnail. Default: True; bits - Bit-length of the generated Image-Code UNIT. Default: 64

{}

Returns:

Type Description

ISCC metadata including Image-Code.

code_image_semantic(fp, **options)

Generate Semantic-Code Image. (Requires iscc-sci to be installed)

Parameters:

Name Type Description Default
fp

Filepath used for semantic Image-Code creation.

required

Raises:

Type Description
idk.EnvironmentError

If iscc-sci is not installed.

code_audio(fp, **options)

Generate Content-Code Audio.

Creates an Audio-Code by extracting acoustic fingerprints from audio files. Uses chromaprint/fpcalc to generate audio features for similarity matching.

Parameters:

Name Type Description Default
fp

Filepath used for Audio-Code creation.

required
options

Keyword arguments forwarded to sdk_opts: extract_meta - Whether to extract metadata. Default: True; create_thumb - Whether to create a thumbnail. Default: True; bits - Bit-length of the generated Audio-Code UNIT. Default: 64

{}

Returns:

Type Description

ISCC metadata including Audio-Code.

code_video(fp, **options)

Generate Content-Code Video.

Creates a Video-Code by extracting and processing visual features from video frames. Uses MPEG-7 signature tools to extract frame-based features and optionally detect scene changes.

Parameters:

Name Type Description Default
fp

Filepath used for Video-Code creation.

required
options

Keyword arguments forwarded to sdk_opts:

  • extract_meta - Whether to extract metadata. Default: True
  • create_thumb - Whether to create a thumbnail. Default: True
  • granular - Generate additional fingerprints based on scenes. Default: False
  • video_store_mp7sig - Whether to store extracted MP7 Video signature file. Default: False
  • bits - Bit-length of the generated Video-Code UNIT. Default: 64

{}

Returns:

Type Description

ISCC metadata including Video-Code.

code_data(fp, **options)

Create ISCC Data-Code.

The Data-Code is a similarity preserving hash of the raw input data that allows for detection of similar binary data regardless of file format or metadata differences.

Parameters:

Name Type Description Default
fp

Filepath used for Data-Code creation.

required
options

Keyword arguments forwarded to sdk_opts: bits - Bit-length of the generated Data-Code UNIT. Default: 64

{}

Returns:

Type Description

ISCC metadata including Data-Code.

code_instance(fp, **options)

Create ISCC Instance-Code.

The Instance-Code is a cryptographic hash (blake3) of the input data. Its purpose is to serve as a checksum that detects even minimal changes to the data of the referenced media asset. For cryptographically secure integrity checking, a full 256-bit multihash is provided in the datahash field.
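The checksum property can be illustrated with any cryptographic hash: changing a single byte yields an unrelated digest. The sketch below uses `hashlib.blake2b` from the standard library as a stand-in, because blake3 is not in the standard library. It does not produce an Instance-Code or a multihash, and `digest_256` is a hypothetical helper.

```python
import hashlib


def digest_256(data: bytes) -> str:
    """Return a 256-bit digest as hex (blake2b used as a stand-in for blake3)."""
    return hashlib.blake2b(data, digest_size=32).hexdigest()


original = b"The quick brown fox jumps over the lazy dog"
modified = b"The quick brown fox jumps over the lazy cog"  # one byte changed

# The two digests differ even though the inputs differ by a single byte.
digests_differ = digest_256(original) != digest_256(modified)
```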

Parameters:

Name Type Description Default
fp

Filepath used for Instance-Code creation.

required
options

Keyword arguments forwarded to sdk_opts: bits - Bit-length of the generated Instance-Code UNIT. Default: 64

{}

Returns:

Type Description

ISCC metadata including Instance-Code, datahash, and filesize.

code_sum(fp, **options)

Create an ISCC-CODE with Data- and Instance-Code UNITs in a single pass.
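The "single pass" idea is that the file is read once and every chunk is fed to both computations. The sketch below shows this pattern with two standard-library hashers as stand-ins. Neither is the real Data-Code or Instance-Code algorithm, and `read_chunks` and `digests_single_pass` are hypothetical helpers.

```python
import hashlib


def read_chunks(fp, chunk_size=64 * 1024):
    """Yield the file's bytes in fixed-size chunks."""
    with open(fp, "rb") as stream:
        while chunk := stream.read(chunk_size):
            yield chunk


def digests_single_pass(chunks):
    """Feed every chunk to two hashers so the data is traversed only once."""
    first = hashlib.sha256()                   # stand-in for the first computation
    second = hashlib.blake2b(digest_size=32)   # stand-in for the second computation
    size = 0
    for chunk in chunks:
        first.update(chunk)
        second.update(chunk)
        size += len(chunk)
    return first.hexdigest(), second.hexdigest(), size
```

For a file on disk, the two helpers combine as `digests_single_pass(read_chunks(path))`, which avoids reading a large file twice.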

Parameters:

Name Type Description Default
fp

Filepath used for ISCC-CODE Sum creation.

required
options

Keyword arguments forwarded to sdk_opts:

  • bits - Bit-length for Data-Code body. Default: 64
  • wide - Whether to create a wide ISCC-CODE (128-bit UNITs) instead of a narrow one (64-bit UNITs). Default: False
  • add_units - Include individual ISCC-UNITs in result. Default: False

{}

Returns:

Type Description

ISCC metadata.