ISCC - Main High-Level Functions
SDK main top-level functions.
code_iscc(fp, name=None, description=None, meta=None, **options)
Generate a complete ISCC-CODE for the given file.
This function creates a full ISCC-CODE by combining Meta, Content, Data, and Instance Codes. It automatically detects the media type and processes the file accordingly.
The function performs the following steps:
1. Reads the file and determines its media type.
2. Generates Data & Instance Codes for all file types using the code_sum function.
3. For supported media types, generates Content Code and optional Semantic Code.
4. If enabled, generates Meta-Code from embedded or provided metadata.
5. Combines all generated code units into a single ISCC-CODE.
6. Merges metadata from all ISCC units.
An ISCC-CODE is a composite identifier that consists of multiple ISCC-UNITs, each serving a specific purpose:
- Meta-Code: Based on normalized metadata (title, description)
- Semantic-Code: Based on semantic features (experimental, requires additional packages)
- Content-Code: Based on the content features (text, image, audio, video)
- Data-Code: Based on the raw binary data (similarity preserving hash)
- Instance-Code: Based on the exact binary data (cryptographic hash)
Note:
- The behavior can be customized through the sdk_opts settings. For example, setting fallback to True allows unsupported media types to be processed in a fallback mode instead of raising an exception.
- To process container files (such as EPUB files with embedded files), set process_container to True to extract and process the contained files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Path object or str representing the filepath of the file to process. | required |
name | | Optional name to override extracted metadata. | None |
description | | Optional description to override extracted metadata. | None |
meta | | Optional metadata (dict or Data-URL as string) to override extracted metadata. | None |
extract_meta | | Whether to extract metadata. | True |
fallback | | Process unsupported media types. | False |
add_units | | Include ISCC-UNITS in metadata. | False |
create_meta | | Create Meta-Code. | True |
wide | | Enable wide mode for ISCC-SUM with Data & Instance codes only. | False |
experimental | | Enable experimental semantic codes. | False |
process_container | | Process container files and extract contained files. | False |
granular | | Generate additional granular fingerprints. | False |
Returns:
Type | Description |
---|---|
IsccMeta object with complete ISCC-CODE and merged metadata from all ISCC-UNITs. |
Raises:
Type | Description |
---|---|
idk.IsccUnsupportedMediatype | If the media type is not supported. By default, the function raises this exception for unsupported media types because sdk_opts.fallback is False. |
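The numbered steps and the fallback note above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the SDK's implementation: `plan_units`, `SUPPORTED`, `SEMANTIC`, and `UnsupportedMediatype` are hypothetical names, and classifying a file by the top-level part of its media type is a simplifying assumption.

```python
from typing import List

# Hypothetical: media type families with Content-Code support, taken from
# the unit list above (text, image, audio, video).
SUPPORTED = {"text", "image", "audio", "video"}
# Hypothetical: families for which this page documents a semantic function.
SEMANTIC = {"text", "image"}


class UnsupportedMediatype(Exception):
    """Stand-in for idk.IsccUnsupportedMediatype."""


def plan_units(mediatype: str, create_meta: bool = True,
               experimental: bool = False, fallback: bool = False) -> List[str]:
    """Return the names of the ISCC-UNITs the documented steps would produce."""
    family = mediatype.split("/", 1)[0]
    supported = family in SUPPORTED
    if not supported and not fallback:
        # Without fallback, unsupported media types raise an exception.
        raise UnsupportedMediatype(mediatype)
    units = []
    if create_meta:
        units.append("Meta-Code")
    if experimental and family in SEMANTIC:
        units.append("Semantic-Code")
    if supported:
        units.append("Content-Code")
    # Data & Instance Codes are generated for all file types.
    units += ["Data-Code", "Instance-Code"]
    return units
```

With these assumptions, `plan_units("image/png")` yields Meta, Content, Data, and Instance units, while an unsupported type with `fallback=True` yields only the units that do not depend on the media type.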
code_meta(fp, name=None, description=None, meta=None, **options)
Generate Meta-Code for digital asset.
Creates an ISCC Meta-Code based on normalized metadata extracted from the file. If no name is found in metadata, the filename will be used instead.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for Meta-Code creation. | required |
name | | Optional name to override extracted metadata. | None |
description | | Optional description to override extracted metadata. | None |
meta | | Optional metadata (Data-URL as string or dict) to override extracted metadata. | None |
extract_meta | | Whether to extract metadata. | True |
bits | | Bit-length of the generated Meta-Code UNIT. | 64 |
Returns:
Type | Description |
---|---|
ISCC metadata including Meta-Code and extracted metadata fields. |
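The name precedence described for code_meta (explicit argument, then extracted metadata, then filename) can be sketched as follows. `resolve_name` and the `extracted` dict are hypothetical, and using the filename without its extension is an assumption; this page does not say how the SDK derives a name from the filename.

```python
from pathlib import Path
from typing import Optional


def resolve_name(fp: str, name: Optional[str] = None,
                 extracted: Optional[dict] = None) -> str:
    """Pick the name used for the Meta-Code (illustrative precedence only)."""
    if name:
        # An explicitly provided name overrides extracted metadata.
        return name
    embedded = (extracted or {}).get("name")
    if embedded:
        return embedded
    # Assumption: fall back to the filename without its extension.
    return Path(fp).stem
```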
code_content(fp, **options)
Detect mediatype and create corresponding Content-Code.
Analyzes the file to determine its media type and routes the processing to the appropriate specialized function (code_text, code_image, code_audio, or code_video).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath | required |
extract_meta | | Whether to extract metadata. | True |
create_thumb | | Whether to create a thumbnail. | True |
Returns:
Type | Description |
---|---|
Content-Code wrapped in ISCC metadata. |
Raises:
Type | Description |
---|---|
idk.IsccUnsupportedMediatype | If the media type is not supported. |
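The routing performed by code_content can be pictured as a lookup from media type family to specialized function. The sketch below is illustrative only: `route`, `HANDLERS`, and `UnsupportedMediatype` are hypothetical, and guessing the media type from the file extension with the standard library is a stand-in for whatever detection the SDK actually performs.

```python
import mimetypes

# Hypothetical mapping from media type family to the specialized function
# named in the description above.
HANDLERS = {
    "text": "code_text",
    "image": "code_image",
    "audio": "code_audio",
    "video": "code_video",
}


class UnsupportedMediatype(Exception):
    """Stand-in for idk.IsccUnsupportedMediatype."""


def route(fp: str) -> str:
    """Return the name of the function that would handle the file."""
    # Assumption: extension-based guess instead of content inspection.
    mediatype, _ = mimetypes.guess_type(fp)
    family = (mediatype or "").split("/", 1)[0]
    try:
        return HANDLERS[family]
    except KeyError:
        raise UnsupportedMediatype(mediatype or "unknown") from None
```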
code_text(fp, text=None, **options)
Generate Content-Code Text.
Creates a Text-Code by extracting and processing text content from document files. Can optionally extract metadata and create a thumbnail representation of the text.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for Text-Code creation. | required |
text | | Optional cleaned text. If provided, the function will skip text extraction. | None |
extract_meta | | Whether to extract metadata. | True |
create_thumb | | Whether to create a thumbnail. | True |
bits | | Bit-length of the generated Text-Code UNIT. | 64 |
granular | | Whether to generate additional granular fingerprints. | False |
Returns:
Type | Description |
---|---|
ISCC metadata including Text-Code. |
code_text_semantic(fp, text=None, **options)
Generate Semantic-Code Text. (Requires iscc-sct to be installed)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for semantic Text-Code creation. | required |
text | | Optional cleaned text. If provided, the function will skip text extraction. | None |
Raises:
Type | Description |
---|---|
idk.EnvironmentError | If iscc-sct is not installed. |
code_image(fp, **options)
Generate Content-Code Image.
Creates an Image-Code by normalizing and processing the visual content of image files. The image is normalized according to SDK options (transparency handling, border trimming, ...).
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for Image-Code creation. | required |
extract_meta | | Whether to extract metadata. | True |
create_thumb | | Whether to create a thumbnail. | True |
bits | | Bit-length of the generated Image-Code UNIT. | 64 |
Returns:
Type | Description |
---|---|
ISCC metadata including Image-Code. |
code_image_semantic(fp, **options)
Generate Semantic-Code Image. (Requires iscc-sci to be installed)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for semantic Image-Code creation. | required |
Raises:
Type | Description |
---|---|
idk.EnvironmentError | If iscc-sci is not installed. |
code_audio(fp, **options)
Generate Content-Code Audio.
Creates an Audio-Code by extracting acoustic fingerprints from audio files. Uses chromaprint/fpcalc to generate audio features for similarity matching.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for Audio-Code creation. | required |
extract_meta | | Whether to extract metadata. | True |
create_thumb | | Whether to create a thumbnail. | True |
bits | | Bit-length of the generated Audio-Code UNIT. | 64 |
Returns:
Type | Description |
---|---|
ISCC metadata including Audio-Code. |
code_video(fp, **options)
Generate Content-Code Video.
Creates a Video-Code by extracting and processing visual features from video frames. Uses MPEG-7 signature tools to extract frame-based features and optionally detect scene changes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for Video-Code creation. | required |
extract_meta | | Whether to extract metadata. | True |
create_thumb | | Whether to create a thumbnail. | True |
granular | | Generate additional fingerprints based on scenes. | False |
video_store_mp7sig | | Whether to store the extracted MP7 video signature file. | False |
bits | | Bit-length of the generated Video-Code UNIT. | 64 |
Returns:
Type | Description |
---|---|
ISCC metadata including Video-Code. |
code_data(fp, **options)
Create ISCC Data-Code.
The Data-Code is a similarity preserving hash of the raw input data that allows for detection of similar binary data regardless of file format or metadata differences.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for Data-Code creation. | required |
bits | | Bit-length of the generated Data-Code UNIT. | 64 |
Returns:
Type | Description |
---|---|
ISCC metadata including Data-Code. |
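Similarity-preserving hashes are commonly compared by counting the bits in which two digests differ (the Hamming distance): fewer differing bits suggest more similar inputs. The sketch below assumes the digest bodies of two Data-Codes are already available as equal-length byte strings; `hamming_distance` is a hypothetical helper and decoding an ISCC string into bytes is not shown.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length digests."""
    if len(a) != len(b):
        raise ValueError("digests must have equal length")
    # XOR leaves a 1 wherever the two inputs disagree.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))
```

For 64-bit digests the result ranges from 0 (identical) to 64 (every bit differs). Any threshold for treating two files as similar is application-specific and not defined here.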
code_instance(fp, **options)
Create ISCC Instance-Code.
The Instance-Code is a cryptographic hash (blake3) of the input data. Its purpose is to serve as a checksum that detects even minimal changes to the data of the referenced media asset. For cryptographically secure integrity checking, a full 256-bit multihash is provided in the datahash field.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for Instance-Code creation. | required |
bits | | Bit-length of the generated Instance-Code UNIT. | 64 |
Returns:
Type | Description |
---|---|
ISCC metadata including Instance-Code, datahash, and filesize. |
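The relationship between a short checksum, a full multihash, and the file size can be sketched with the standard library. This is an illustration under stated assumptions, not the SDK's code: SHA-256 from `hashlib` stands in for blake3 (which is not in the standard library), so the multihash prefix is `1220` (sha2-256, 32 bytes) and the values will not match real ISCC output. `instance_sketch` is a hypothetical name, and truncating the digest to `bits` bits is an assumption.

```python
import hashlib


def instance_sketch(path, bits=64, chunk_size=1024 * 1024):
    """Hash a file in chunks; return a short checksum, a multihash, and the size."""
    hasher = hashlib.sha256()  # stand-in for blake3
    filesize = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
            filesize += len(chunk)
    digest = hasher.digest()
    return {
        # Assumption: the short checksum is the leading `bits` bits of the digest.
        "checksum": digest[: bits // 8].hex(),
        # Multihash: hash function code (0x12), digest length (0x20), digest.
        "datahash": "1220" + digest.hex(),
        "filesize": filesize,
    }
```

Changing a single byte of the file changes the whole digest, which is what makes the short checksum useful for change detection and the full multihash suitable for integrity checks.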
code_sum(fp, **options)
Create an ISCC-CODE with Data- and Instance-Code UNITs.
Reads file data only once and creates both Data-Code and Instance-Code in one go.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fp | | Filepath used for ISCC-CODE Sum creation. | required |
wide | | Whether to create a wide ISCC-CODE (128-bit UNITs) instead of a narrow one (64-bit UNITs). | False |
Returns:
Type | Description |
---|---|
ISCC metadata. |
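Reading the file only once while producing two results amounts to feeding every chunk to both processors in the same loop. The sketch below shows that pattern with two `hashlib` objects as stand-ins for the Data-Code and Instance-Code processors; `single_pass` is a hypothetical helper and neither hash is the algorithm the SDK uses.

```python
import hashlib


def single_pass(path, consumers, chunk_size=1024 * 1024):
    """Read the file once and pass each chunk to every consumer.

    Each consumer only needs an update(bytes) method.
    """
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            for consumer in consumers:
                consumer.update(chunk)
    return consumers


def sum_sketch(path):
    """Produce two independent digests from one read of the file."""
    # Stand-ins: any two objects with update() would work here.
    first, second = single_pass(path, [hashlib.sha256(), hashlib.blake2b()])
    return first.hexdigest(), second.hexdigest()
```

For large files this halves the I/O compared with running two separate functions that each read the file on their own.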