fastdeploy.RuntimeOption Class Reference

Option object used when creating a new Runtime object. More...

Public Member Functions

void SetModelPath (string model_path, string params_path="", ModelFormat format=ModelFormat.PADDLE)
 Set the path of the model file and the parameter file. More...
 
void SetModelBuffer (string model_buffer, string params_buffer="", ModelFormat format=ModelFormat.PADDLE)
 Specify memory buffers for the model and parameters; used when the model and parameters are loaded directly from memory. More...
 
void UseCpu ()
 Use the CPU for inference; the runtime runs on the CPU by default.
 
void UseGpu (int gpu_id=0)
 Use an NVIDIA GPU for inference.
 
void UseRKNPU2 (rknpu2_CpuName rknpu2_name=rknpu2_CpuName.RK3588, rknpu2_CoreMask rknpu2_core=rknpu2_CoreMask.RKNN_NPU_CORE_0)
 Use RKNPU2 (e.g. RK3588/RK356X) for inference.
 
void UseTimVX ()
 Use TimVX (e.g. RV1126/A311D) for inference.
 
void UseAscend ()
 Use Huawei Ascend for inference.
 
void UseKunlunXin (int kunlunxin_id=0, int l3_workspace_size=0xfffc00, bool locked=false, bool autotune=true, string autotune_file="", string precision="int16", bool adaptive_seqlen=false, bool enable_multi_stream=false)
 Turn on a KunlunXin XPU. More...
 
void UseSophgo ()
 Use Sophgo for inference.
 
void UsePaddleInferBackend ()
 Set Paddle Inference as the inference backend; supports CPU/GPU.
 
void UseOrtBackend ()
 Set ONNX Runtime as the inference backend; supports CPU/GPU.
 
void UseSophgoBackend ()
 Set SOPHGO Runtime as the inference backend; supports SOPHGO hardware.
 
void UseTrtBackend ()
 Set TensorRT as the inference backend; supports GPU only.
 
void UsePorosBackend ()
 Set Poros as the inference backend; supports CPU/GPU.
 
void UseOpenVINOBackend ()
 Set OpenVINO as the inference backend; supports CPU only.
 
void UseLiteBackend ()
 Set Paddle Lite as the inference backend; supports Arm CPU only.
 
void UsePaddleLiteBackend ()
 Set Paddle Lite as the inference backend; supports Arm CPU only.
 

Detailed Description

Option object used when creating a new Runtime object.
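
A minimal configuration sketch follows, using only the members documented on this page. The model paths are hypothetical placeholders, the parameterless constructor is assumed from the C# binding style, and the final Runtime construction step is omitted since it is outside this class's documentation.

    using fastdeploy;

    class RuntimeOptionDemo
    {
        static void Main()
        {
            var option = new RuntimeOption();
            // Hypothetical paths; a Paddle model ships as a model file plus a params file.
            option.SetModelPath("ResNet50/model.pdmodel", "ResNet50/model.pdiparams",
                                ModelFormat.PADDLE);
            option.UseGpu(0);               // infer on NVIDIA GPU 0; UseCpu() for CPU
            option.UsePaddleInferBackend(); // Paddle Inference backend (CPU/GPU)
            // The configured option is then passed when creating a Runtime or model.
        }
    }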

Member Function Documentation

◆ SetModelBuffer()

void fastdeploy.RuntimeOption.SetModelBuffer (string model_buffer, string params_buffer = "", ModelFormat format = ModelFormat.PADDLE)  [inline]

Specify memory buffers for the model and parameters; used when the model and parameters are loaded directly from memory.

Parameters
  [in]  model_buffer   The string containing the model's memory buffer
  [in]  params_buffer  The string containing the parameters' memory buffer
  [in]  format         Format of the loaded model
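
For illustration, a sketch of buffer-based loading. The file paths are placeholders, and File.ReadAllText is used only to make the sketch self-contained; in real use the string buffers would come from wherever the serialized model already resides in memory.

    using System.IO;
    using fastdeploy;

    var option = new RuntimeOption();
    // Fill the string buffers documented in the signature above (placeholder paths).
    string modelBuffer = File.ReadAllText("ResNet50/model.pdmodel");
    string paramsBuffer = File.ReadAllText("ResNet50/model.pdiparams");
    option.SetModelBuffer(modelBuffer, paramsBuffer, ModelFormat.PADDLE);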

◆ SetModelPath()

void fastdeploy.RuntimeOption.SetModelPath (string model_path, string params_path = "", ModelFormat format = ModelFormat.PADDLE)  [inline]

Set path of model file and parameter file.

Parameters
  [in]  model_path   Path of the model file, e.g. ResNet50/model.pdmodel for a Paddle format model or ResNet50/model.onnx for an ONNX format model
  [in]  params_path  Path of the parameter file; only used when the model format is Paddle, e.g. ResNet50/model.pdiparams
  [in]  format       Format of the loaded model
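
As a sketch, both documented path layouts; the ModelFormat.ONNX enum member is assumed from the ONNX case described above, and the paths are placeholders.

    using fastdeploy;

    var paddleOption = new RuntimeOption();
    // Paddle format: topology file plus a separate parameter file.
    paddleOption.SetModelPath("ResNet50/model.pdmodel", "ResNet50/model.pdiparams",
                              ModelFormat.PADDLE);

    var onnxOption = new RuntimeOption();
    // ONNX format: weights live inside the single .onnx file, so params_path stays empty.
    onnxOption.SetModelPath("ResNet50/model.onnx", "", ModelFormat.ONNX);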

◆ UseKunlunXin()

void fastdeploy.RuntimeOption.UseKunlunXin (int kunlunxin_id = 0,
                                            int l3_workspace_size = 0xfffc00,
                                            bool locked = false,
                                            bool autotune = true,
                                            string autotune_file = "",
                                            string precision = "int16",
                                            bool adaptive_seqlen = false,
                                            bool enable_multi_stream = false)  [inline]

Turn on KunlunXin XPU.

Parameters
  kunlunxin_id         The KunlunXin XPU card to use (default is 0).
  l3_workspace_size    The size of the video memory allocated as L3 cache; the maximum is 16 MB.
  locked               Whether the allocated L3 cache can be locked. If false, the L3 cache is not locked and can be shared by multiple models, and models sharing the L3 cache are executed sequentially on the card.
  autotune             Whether to autotune the conv operators in the model. If true, the first execution of a conv operator of a given shape automatically searches for a better algorithm, improving the performance of subsequent conv operators of the same shape.
  autotune_file        Path of the autotune file. If specified, the algorithms recorded in the file are used and autotuning is not performed again.
  precision            Calculation precision of multi_encoder.
  adaptive_seqlen      Whether the input of multi_encoder is variable length.
  enable_multi_stream  Whether to enable multi-stream on the KunlunXin XPU.
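
A sketch of enabling the XPU with the documented defaults spelled out; only the card id deviates from the defaults here, and the model path is a placeholder.

    using fastdeploy;

    var option = new RuntimeOption();
    option.SetModelPath("ResNet50/model.pdmodel", "ResNet50/model.pdiparams");
    // Card 1, default 16 MB-capped L3 workspace, unlocked cache, autotune on,
    // no autotune file, int16 precision, fixed-length input, single stream.
    option.UseKunlunXin(1, 0xfffc00, false, true, "", "int16", false, false);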
