ISCC - Main High-Level Functions
SDK main top-level functions.
code_iscc(fp, name=None, description=None, meta=None, **options)
Generate a complete ISCC-CODE for the given file.
This function creates a full ISCC-CODE by combining Meta, Content, Data, and Instance Codes. It automatically detects the media type and processes the file accordingly.
The function performs the following steps:
1. Reads the file and determines its media type.
2. Generates Data & Instance Codes for all file types using the code_sum function.
3. For supported media types, generates Content Code and optional Semantic Code.
4. If enabled, generates Meta-Code from embedded or provided metadata.
5. Combines all generated code units into a single ISCC-CODE.
6. Merges metadata from all ISCC units.
ISCC-CODE is a composite identifier that consists of multiple ISCC-UNITs, each serving a specific purpose:
- Meta-Code: Based on normalized metadata (title, description)
- Semantic-Code: Based on semantic features (experimental, requires additional packages)
- Content-Code: Based on the content features (text, image, audio, video)
- Data-Code: Based on the raw binary data (similarity preserving hash)
- Instance-Code: Based on the exact binary data (cryptographic hash)
Note:
- The behavior can be customized through the sdk_opts settings. For example, setting
fallback to True will allow processing of unsupported media types in a
fallback mode instead of raising an exception.
- For processing container files (like EPUB with embedded files), set process_container
to True to extract and process contained files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Path object or str representing the filepath of the file to process. | required |
| name | | Optional name to override extracted metadata. | None |
| description | | Optional description to override extracted metadata. | None |
| meta | | Optional metadata (dict or Data-URL as string) to override extracted metadata. | None |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | IsccMeta object with complete ISCC-CODE and merged metadata from all ISCC-UNITs. |
Raises:
| Type | Description |
|---|---|
| idk.IsccUnsupportedMediatype | If the media type is not supported. The exception is raised for unsupported media types unless fallback mode is enabled, because sdk_opts.fallback defaults to False. |
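A minimal usage sketch for the function above. It assumes the SDK is importable as `iscc_sdk` (aliased to `idk`, matching the exception name in the table) and that the returned IsccMeta object exposes the code as an `iscc` attribute; neither detail is stated on this page. The `generate` parameter is a hypothetical hook that lets the helper be exercised without the SDK installed.

```python
from pathlib import Path


def iscc_for_file(fp, generate=None, **options):
    """Return the ISCC-CODE string for a file, or None for unsupported media.

    `generate` is a hypothetical injection point for testing. When it is
    omitted, the SDK's code_iscc function is used.
    """
    unsupported = ()  # an empty tuple in `except` catches nothing
    if generate is None:
        import iscc_sdk as idk  # assumption: the package import name

        generate = idk.code_iscc
        unsupported = (idk.IsccUnsupportedMediatype,)
    try:
        meta = generate(Path(fp), **options)
    except unsupported:
        return None
    return meta.iscc  # assumption: IsccMeta carries the code in `.iscc`
```

With the SDK installed, a call such as `iscc_for_file("photo.jpg", name="Holiday photo")` would forward `name` to `code_iscc` and return the resulting code string.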
code_iscc_mt(fp, name=None, description=None, meta=None, **options)
Generate a complete ISCC-CODE for the given file using multithreading.
This function creates a full ISCC-CODE by combining Meta, Content, Data, and Instance Codes. It automatically detects the media type and processes the file accordingly.
The function performs the following steps:
1. Reads the file and determines its MIME type.
2. Generates Instance and Data Codes for all file types.
3. For supported media types, generates Content and Meta Codes.
4. Combines all generated codes into a single ISCC-CODE.
5. Merges metadata from all ISCC units.
Note:
- This function uses multithreading to improve performance.
- The behavior can be customized through the sdk_opts settings. For example, setting fallback to True will allow processing of unsupported media types in a fallback mode instead of raising an exception.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | str or Path object representing the filepath of the file to process. | required |
| name | | Optional name to override extracted metadata. | None |
| description | | Optional description to override extracted metadata. | None |
| meta | | Optional metadata (dict or Data-URL as string) to override extracted metadata. | None |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | IsccMeta object with complete ISCC-CODE and merged metadata from all ISCC-UNITs. |
Raises:
| Type | Description |
|---|---|
| idk.IsccUnsupportedMediatype | If the media type is not supported. The exception is raised for unsupported media types unless fallback mode is enabled, because sdk_opts.fallback defaults to False. |
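This page does not show how the SDK schedules its threads. As a rough illustration of the general idea only, the sketch below runs independent unit generators concurrently with the standard library's `concurrent.futures` and merges their metadata. The `run_units_threaded` helper, the generator callables, and the dict-based results are all hypothetical stand-ins, not SDK functions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_units_threaded(fp, generators):
    """Run each unit generator on fp in its own thread and merge the results.

    `generators` is a list of callables that take a filepath and return a dict.
    Results are merged in list order, so later generators win on key clashes.
    """
    with ThreadPoolExecutor(max_workers=len(generators) or 1) as pool:
        futures = [pool.submit(gen, fp) for gen in generators]
        merged = {}
        for future in futures:
            # result() blocks until the worker is done and re-raises its errors
            merged.update(future.result())
    return merged
```

Threads help here mainly when the individual steps spend their time in I/O or in native code that releases the interpreter lock.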
code_meta(fp, name=None, description=None, meta=None, **options)
Generate a Meta-Code for a digital asset.
Creates an ISCC Meta-Code based on normalized metadata extracted from the file. If no name is found in metadata, the filename will be used instead.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for Meta-Code creation. | required |
| name | | Optional name to override extracted metadata. | None |
| description | | Optional description to override extracted metadata. | None |
| meta | | Optional metadata (Data-URL as string or dict) to override extracted metadata. | None |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata including Meta-Code and extracted metadata fields. |
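The name handling described for code_meta (an explicit `name` overrides extracted metadata, and the filename is the last resort) can be sketched as a simple precedence chain. The `resolve_name` helper and the dict-shaped `extracted` argument are hypothetical. Whether the SDK uses the full filename or strips the extension is not stated here, so this sketch simply uses `Path.name`.

```python
from pathlib import Path


def resolve_name(fp, extracted=None, name=None):
    """Pick the name used as Meta-Code input.

    Order: explicit `name` argument, then the extracted metadata,
    then the filename as a last resort.
    """
    extracted = extracted or {}
    # Assumption: the fallback keeps the extension (Path.name, not Path.stem)
    return name or extracted.get("name") or Path(fp).name
```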
code_content(fp, **options)
Detect mediatype and create corresponding Content-Code.
Analyzes the file to determine its media type and routes the processing to the appropriate specialized function (code_text, code_image, code_audio, or code_video).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath | required |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | Content-Code wrapped in ISCC metadata. |
Raises:
| Type | Description |
|---|---|
| idk.IsccUnsupportedMediatype | If the media type is not supported. |
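The routing described for code_content can be pictured as a lookup from media type family to a specialized function. The sketch below is a simplified stand-in, not the SDK's implementation: it guesses the type from the file extension with the standard library's `mimetypes` module, whereas the SDK's own detection is not shown on this page. `UnsupportedMediatype`, `route_content`, and the `handlers` mapping are hypothetical names.

```python
import mimetypes


class UnsupportedMediatype(Exception):
    """Hypothetical stand-in for idk.IsccUnsupportedMediatype."""


def route_content(fp, handlers):
    """Call the handler registered for the top-level media type of fp.

    `handlers` maps "text", "image", "audio" or "video" to a callable
    that accepts the filepath.
    """
    mediatype, _ = mimetypes.guess_type(str(fp))  # extension-based guess only
    family = mediatype.split("/")[0] if mediatype else None
    try:
        handler = handlers[family]
    except KeyError:
        raise UnsupportedMediatype(f"Unsupported mediatype: {mediatype}") from None
    return handler(fp)
```

A real router would also need to map document formats such as PDF or EPUB, whose MIME types start with `application/`, to text processing; this sketch omits that step.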
code_text(fp, text=None, **options)
Generate Content-Code Text.
Creates a Text-Code by extracting and processing text content from document files. Can optionally extract metadata and create a thumbnail representation of the text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for Text-Code creation. | required |
| text | | Optional cleaned text. If provided, the function will skip text extraction. | None |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata including Text-Code. |
code_text_semantic(fp, text=None, **options)
Generate Semantic-Code Text. (Requires iscc-sct to be installed)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for semantic Text-Code creation. | required |
| text | | Optional cleaned text. If provided, the function will skip text extraction. | None |
Raises:
| Type | Description |
|---|---|
| idk.EnvironmentError | If iscc-sct is not installed. |
code_image(fp, **options)
Generate Content-Code Image.
Creates an Image-Code by normalizing and processing the visual content of image files. The image is normalized according to SDK options (transparency handling, border trimming, ...).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for Image-Code creation. | required |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata including Image-Code. |
code_image_semantic(fp, **options)
Generate Semantic-Code Image. (Requires iscc-sci to be installed)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for semantic Image-Code creation. | required |
Raises:
| Type | Description |
|---|---|
| idk.EnvironmentError | If iscc-sci is not installed. |
code_audio(fp, **options)
Generate Content-Code Audio.
Creates an Audio-Code by extracting acoustic fingerprints from audio files. Uses chromaprint/fpcalc to generate audio features for similarity matching.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for Audio-Code creation. | required |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata including Audio-Code. |
code_video(fp, **options)
Generate Content-Code Video.
Creates a Video-Code by extracting and processing visual features from video frames. Uses MPEG-7 signature tools to extract frame-based features and optionally detect scene changes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for Video-Code creation. | required |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata including Video-Code. |
code_data(fp, **options)
Create ISCC Data-Code.
The Data-Code is a similarity preserving hash of the raw input data that allows for detection of similar binary data regardless of file format or metadata differences.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for Data-Code creation. | required |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata including Data-Code. |
code_instance(fp, **options)
Create ISCC Instance-Code.
The Instance-Code is a cryptographic hash (blake3) of the input data.
Its purpose is to serve as a checksum that detects even minimal changes
to the data of the referenced media asset. For cryptographically secure integrity
checking, a full 256-bit multihash is provided with the datahash field.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for Instance-Code creation. | required |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata including Instance-Code, datahash, and filesize. |
code_sum(fp, **options)
Create an ISCC-CODE with Data- and Instance-Code UNITs in a single pass.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| fp | | Filepath used for ISCC-CODE Sum creation. | required |
| options | | Keyword arguments forwarded to | {} |
Returns:
| Type | Description |
|---|---|
| | ISCC metadata. |
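The "single pass" idea behind code_sum can be illustrated with plain standard-library hashing: the file is read once in chunks, and every chunk is fed to two independent hashers. This is only a sketch of the I/O pattern. `hash_single_pass` is a hypothetical helper, and `sha256` and `blake2b` are stand-ins chosen because they ship with Python; they are not the SDK's Data-Code algorithm or blake3, and their digests will not match ISCC output.

```python
import hashlib


def hash_single_pass(fp, chunk_size=1024 * 1024):
    """Read the file once and feed every chunk to two independent hashers."""
    first = hashlib.sha256()  # stand-in only, not a similarity-preserving hash
    second = hashlib.blake2b(digest_size=32)  # stand-in only, not blake3
    filesize = 0
    with open(fp, "rb") as infile:
        while chunk := infile.read(chunk_size):
            first.update(chunk)
            second.update(chunk)
            filesize += len(chunk)
    return first.hexdigest(), second.hexdigest(), filesize
```

Reading the data once matters for large files, where a second full read would roughly double the I/O cost.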