forge.onnx.inference
forge.onnx.inference.get_inference_function
get_inference_function(model: ModelProto | str | bytes | PathLike, providers: str | None = None, opt_level: str | None = None, **kwargs) -> Callable
Creates a LEIP Runtime Engine inference function from the given model.
This function loads an ONNX model and returns a callable inference function that can be used to run predictions. The returned function automatically handles input and output names and shapes, and executes inference using the specified execution provider and optimization level.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Union[ModelProto, str, bytes, PathLike]` | The ONNX model to be loaded. Can be an in-memory `ModelProto`, a path to an ONNX file (`str` or `PathLike`), or serialized model `bytes`. | required |
| `providers` | `Optional[str]` | The execution provider to use for inference. Can be `"cpu"`, `"cuda"`, or `"tensorrt"`. If not provided, defaults to the best available provider. | `None` |
| `opt_level` | `Optional[str]` | The level of graph optimization to apply during model loading. One of `"disable"`, `"basic"`, `"extended"`, or `"all"`. | `None` |
| `**kwargs` | | Additional keyword arguments to pass to `ONNXOptions`. | `{}` |
Returns:

| Name | Type | Description |
|---|---|---|
| `Callable` | `Callable` | A callable function that takes DLPack-compatible tensor(s) as input and returns a dictionary mapping output names to their corresponding DLPack-compatible tensors. The function has additional metadata attributes such as `output_names`. |
Example
inference_fn = get_inference_function("model.onnx", providers="cuda")
output = inference_fn(input_tensor)
output_name = inference_fn.output_names[0]
print(output[output_name])
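A slightly fuller sketch, assuming NumPy arrays are accepted as DLPack-compatible inputs (NumPy 1.22+ implements the DLPack protocol); the model path and input shape below are placeholders, and the exact tensor types accepted depend on the LEIP Runtime build:

```python
import numpy as np

from forge.onnx.inference import get_inference_function

# Placeholder model path; substitute your own ONNX file.
inference_fn = get_inference_function("model.onnx", providers="cpu", opt_level="basic")

# Placeholder shape/dtype; NumPy arrays are DLPack-compatible on NumPy >= 1.22.
dummy_input = np.zeros((1, 3, 224, 224), dtype=np.float32)

outputs = inference_fn(dummy_input)  # dict mapping output names to tensors

for name in inference_fn.output_names:  # metadata attribute shown in the example above
    print(name, outputs[name])
```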
Raises:

| Type | Description |
|---|---|
| `TypeError` | If the model type is not recognized. |
forge.onnx.inference.get_inference_session
get_inference_session(model: ModelProto | str | bytes | PathLike, providers: str | None = None, opt_level: str | None = None, **kwargs) -> LatentRuntimeEngine
Creates a LEIP Runtime Engine inference session.
This helper function initializes and returns a LEIP Runtime Engine with the specified model, execution provider, and optimization level. Additional options can be set via keyword arguments.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Union[ModelProto, str, bytes, PathLike]` | The ONNX model to load. Can be an in-memory `ModelProto`, a path to an ONNX file (`str` or `PathLike`), or serialized model `bytes`. | required |
| `providers` | `Optional[str]` | The execution provider to use. Can be `"CPU"`, `"CUDA"`, or `"TRT"`. If not provided, defaults to `"CPU"`. | `None` |
| `opt_level` | `Optional[str]` | The level of graph optimization to apply. Refer to the LEIP Runtime documentation for valid optimization levels. | `None` |
| `**kwargs` | | Additional keyword arguments to pass to `ONNXOptions`. | `{}` |
Returns:

| Name | Type | Description |
|---|---|---|
| `LatentRuntimeEngine` | `LatentRuntimeEngine` | The initialized LEIP Runtime Engine that can be used for running inference on the provided model. |
Example
session = get_inference_session("model.onnx", providers="CUDA")
session.infer(input_tensor)
outputs = session.get_outputs()
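A minimal end-to-end sketch, using only the calls shown above (`infer` and `get_outputs`); the model path and input shape are placeholders, and NumPy arrays are assumed to be accepted as input tensors:

```python
import numpy as np

from forge.onnx.inference import get_inference_session

# Placeholder model path; substitute your own ONNX file.
session = get_inference_session("model.onnx", providers="CPU")

dummy_input = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder shape/dtype
session.infer(dummy_input)       # run inference on the engine
outputs = session.get_outputs()  # retrieve the resulting output tensors
print(outputs)
```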
Raises:

| Type | Description |
|---|---|
| `TypeError` | If the model type is not recognized. |