What is the correct, future-proof way of deploying a PyTorch Python model in C++ for inference?
Can you please point me to a working example of a PyTorch Python model deployed in a simple piece of C++ code?
I followed the example step by step and, in the compilation phase of the inference, got this error:
(.aoti) (base) raphy@raohy:~/AOTInductor/example/build$ CMAKE_PREFIX_PATH=/path/to/python/install/site-packages/torch/share/cmake cmake ..
-- The C compiler identification is GNU 13.3.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at CMakeLists.txt:4 (find_package):
By not providing "FindTorch.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "Torch", but
CMake did not find one.
Could not find a package configuration file provided by "Torch" with any of
the following names:
TorchConfig.cmake
torch-config.cmake
Add the installation prefix of "Torch" to CMAKE_PREFIX_PATH or set
"Torch_DIR" to a directory containing one of the above files. If "Torch"
provides a separate development package or SDK, be sure it has been
installed.
-- Configuring incomplete, errors occurred!
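A note on the command above: it passes the literal placeholder /path/to/python/install/site-packages/torch/share/cmake to CMAKE_PREFIX_PATH, so find_package(Torch) has nowhere to look. The real path inside the active virtualenv can be printed from Python; a small helper, not part of the tutorial:

import torch
# Prints the directory to pass to CMAKE_PREFIX_PATH,
# e.g. .../site-packages/torch/share/cmake
print(torch.utils.cmake_prefix_path)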
This is what I’ve done:
(base) raphy@raohy:~$ mkdir AOTInductor
(base) raphy@raohy:~$ cd AOTInductor/
(base) raphy@raohy:~/AOTInductor$ python -m venv .aoti
(base) raphy@raohy:~/AOTInductor$ source .aoti/bin/activate
(.aoti) (base) raphy@raohy:~/AOTInductor$
(.aoti) (base) raphy@raohy:~/AOTInductor$ pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
Looking in indexes: https://download.pytorch.org/whl/cpu
Collecting torch
(.aoti) (base) raphy@raohy:~/AOTInductor/example$ python model.py
/usr/bin/ld: warning: /tmp/torchinductor_raphy/cx7jxbnff2tlwdz2gpv4yy5zoxvd7b6o2t5zekqulqe6zo5ld5vs/ctwashdztcg4lyazvnlkmavrejyhfhfrtcama5gexx73mlv3sp2u/cdxfaagbu5nqhrxwdtuvuvihnixco5qjerruqr26ubzmganyzfeq.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
@desertfire The previous error was due to my silly mistake. Solved. But now I get a different kind of error:
(.aoti) (base) raphy@raohy:~/AOTInductor/example/build$ CMAKE_PREFIX_PATH=/home/raphy/AOTInductor/.aoti/lib/python3.12/site-packages/torch/share/cmake cmake ..
-- The C compiler identification is GNU 13.3.0
-- The CXX compiler identification is GNU 14.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Warning at /home/raphy/AOTInductor/.aoti/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/home/raphy/AOTInductor/.aoti/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:121 (append_torchlib_if_found)
CMakeLists.txt:4 (find_package)
-- Found Torch: /home/raphy/AOTInductor/.aoti/lib/python3.12/site-packages/torch/lib/libtorch.so
-- Configuring done (0.5s)
-- Generating done (0.0s)
-- Build files have been written to: /home/raphy/AOTInductor/example/build
(.aoti) (base) raphy@raohy:~/AOTInductor/example/build$ cmake --build . --config Release
[ 50%] Building CXX object CMakeFiles/aoti_example.dir/inference.cpp.o
[100%] Linking CXX executable aoti_example
[100%] Built target aoti_example
(.aoti) (base) raphy@raohy:~/AOTInductor/example/build$ ls
aoti_example CMakeCache.txt CMakeFiles cmake_install.cmake Makefile
(.aoti) (base) raphy@raohy:~/AOTInductor/example/build$ ./aoti_example
terminate called after throwing an instance of 'std::runtime_error'
what(): Failed to initialize zip archive: file open failed
Aborted (core dumped)
Is the initial CMake warning related?
CMake Warning at /home/raphy/AOTInductor/.aoti/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:22 (message):
static library kineto_LIBRARY-NOTFOUND not found.
Call Stack (most recent call first):
/home/raphy/AOTInductor/.aoti/lib/python3.12/site-packages/torch/share/cmake/Torch/TorchConfig.cmake:121 (append_torchlib_if_found)
CMakeLists.txt:4 (find_package)
I searched for information about this warning and found this recent, unsolved GitHub issue:
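An aside on those two symptoms: the kineto warning from TorchConfig.cmake is a known message with the pip CPU wheels and is generally harmless. The std::runtime_error about the zip archive instead suggests that ./aoti_example could not open the AOTInductor package (the .pt2 file) at the path baked into inference.cpp, so the binary has to be run from a directory where that path resolves. For reference, the package comes from the Python side; a minimal sketch, assuming a recent PyTorch where torch._inductor.aoti_compile_and_package is available (the Net module and its shapes are illustrative, not the exact tutorial code):

import torch

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(10, 5)

    def forward(self, x):
        return self.fc(x)

with torch.no_grad():
    ep = torch.export.export(Net(), (torch.randn(8, 10),))
    # Writes model.pt2 into the current working directory; the C++
    # loader must be given a path that resolves to this file at runtime.
    torch._inductor.aoti_compile_and_package(ep, package_path="model.pt2")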
If you’re CPU-only, ExecuTorch, the edge-focused runtime, might be a good solution for you as well: Setting Up ExecuTorch — ExecuTorch 0.5 documentation
It is also a runtime for torch.export-ed models. It is more constrained (you don’t have access to the full libtorch), but it has a much smaller footprint (a runtime on the order of kB).
(executorch) raphy@raohy:~/executorch$ ./cmake-out/executor_runner --model_path ../example_files/add.pte
I 00:00:00.000294 executorch:executor_runner.cpp:82] Model file ../example_files/add.pte is loaded.
I 00:00:00.000308 executorch:executor_runner.cpp:91] Using method forward
I 00:00:00.000317 executorch:executor_runner.cpp:138] Setting up planned buffer 0, size 48.
I 00:00:00.000350 executorch:executor_runner.cpp:161] Method loaded.
I 00:00:00.000369 executorch:executor_runner.cpp:171] Inputs prepared.
I 00:00:00.000395 executorch:executor_runner.cpp:180] Model executed successfully.
I 00:00:00.000399 executorch:executor_runner.cpp:184] 1 outputs:
Output 0: tensor(sizes=[1], [2.])
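For orientation, the add.pte used above comes from the ExecuTorch getting-started flow. A minimal sketch of how such a file can be produced, assuming the executorch Python package is installed (the Add module mirrors the runner output above, 1. + 1. = 2.):

import torch
from executorch.exir import to_edge

class Add(torch.nn.Module):
    def forward(self, x, y):
        return x + y

ep = torch.export.export(Add(), (torch.ones(1), torch.ones(1)))
program = to_edge(ep).to_executorch()
with open("add.pte", "wb") as f:
    program.write_to_file(f)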
I’ve already fine-tuned the model and saved the fine-tuned model as safetensors:
(.bftner) (base) raphy@raohy:~/BertFineTuningForNERPyTorch$ ls -lah
total 132K
drwxrwxr-x 7 raphy raphy 4,0K feb 19 14:49 .
drwxr-x--- 156 raphy raphy 12K feb 19 13:31 ..
-rw-rw-r-- 1 raphy raphy 5,3K feb 18 11:50 '=0.26.0'
-rw-rw-r-- 1 raphy raphy 8,3K feb 19 14:35 BERT-NER-ExportableToExecuteTorch.py
-rw-rw-r-- 1 raphy raphy 8,2K feb 18 15:42 BERT-NER.py
drwxrwxr-x 6 raphy raphy 4,0K feb 18 11:41 .bftner
drwxrwxr-x 2 raphy raphy 4,0K feb 18 14:47 ner_model
drwxrwxr-x 5 raphy raphy 4,0K feb 18 17:14 results
drwxrwxr-x 2 raphy raphy 4,0K feb 18 14:47 tokenizer
(.bftner) (base) raphy@raohy:~/BertFineTuningForNERPyTorch$
(.bftner) (base) raphy@raohy:~/BertFineTuningForNERPyTorch$ ls -lah ./ner_model/
total 416M
drwxrwxr-x 2 raphy raphy 4,0K feb 18 14:47 .
drwxrwxr-x 7 raphy raphy 4,0K feb 19 15:24 ..
-rw-rw-r-- 1 raphy raphy 896 feb 18 18:47 config.json
-rw-rw-r-- 1 raphy raphy 416M feb 18 18:47 model.safetensors
(.bftner) (base) raphy@raohy:~/BertFineTuningForNERPyTorch$ ls -lah ./results/
total 20K
drwxrwxr-x 5 raphy raphy 4,0K feb 18 17:14 .
drwxrwxr-x 7 raphy raphy 4,0K feb 19 15:24 ..
drwxrwxr-x 2 raphy raphy 4,0K feb 18 17:14 checkpoint-1756
drwxrwxr-x 2 raphy raphy 4,0K feb 18 14:03 checkpoint-2634
drwxrwxr-x 2 raphy raphy 4,0K feb 18 14:47 checkpoint-3512
(.bftner) (base) raphy@raohy:~/BertFineTuningForNERPyTorch$ ls -lah ./tokenizer/
total 940K
drwxrwxr-x 2 raphy raphy 4,0K feb 18 14:47 .
drwxrwxr-x 7 raphy raphy 4,0K feb 19 15:24 ..
-rw-rw-r-- 1 raphy raphy 125 feb 18 18:47 special_tokens_map.json
-rw-rw-r-- 1 raphy raphy 1,2K feb 18 18:47 tokenizer_config.json
-rw-rw-r-- 1 raphy raphy 695K feb 18 18:47 tokenizer.json
-rw-rw-r-- 1 raphy raphy 227K feb 18 18:47 vocab.txt
I’m confused and lost. How should I proceed now in order to convert this fine-tuned model into a model executable by ExecuTorch in a desktop environment?
Hi @raphael10-collab, thanks for giving ExecuTorch a try. The workflow is as follows:
Load your .safetensors into your BERT model (torch.nn.Module):
from safetensors.torch import load_model
load_model(model, "model.safetensors")
# Instead of model.load_state_dict(load_file("model.safetensors"))
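In this particular setup, where ner_model/ holds a config.json plus model.safetensors written by Hugging Face transformers, the load can also go through from_pretrained, which reads the safetensors file directly. A sketch, assuming the directory was produced by save_pretrained():

from transformers import AutoModelForTokenClassification

# Reconstructs the module from config.json and loads model.safetensors.
model = AutoModelForTokenClassification.from_pretrained("./ner_model")
model.eval()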
Use torch.export() to export the torch.nn.Module into an ExportedProgram. You need to prepare the example inputs and the dynamic shape information. See this wiki for instructions. Some code examples (a fuller sketch with dynamic shapes follows the snippet below):
import torch
ep = torch.export.export(model, example_args)
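For a BERT token classifier, the example inputs are integer token-id tensors, and the sequence dimension is the natural candidate for a dynamic shape. A sketch, assuming the Hugging Face forward signature takes (input_ids, attention_mask) positionally; the vocab size and length bounds here are illustrative:

import torch
from torch.export import export, Dim

# Illustrative example inputs: batch of 1, sequence of 16 token ids.
example_args = (
    torch.randint(0, 30522, (1, 16), dtype=torch.long),  # input_ids
    torch.ones(1, 16, dtype=torch.long),                 # attention_mask
)

# Mark dimension 1 (the sequence length) as dynamic between 2 and 512.
seq = Dim("seq", min=2, max=512)
dynamic_shapes = {
    "input_ids": {1: seq},
    "attention_mask": {1: seq},
}

ep = export(model, example_args, dynamic_shapes=dynamic_shapes)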
Then you would need to use the ExecuTorch APIs to export the ExportedProgram into a .pte file. Please find the instructions here. You can look at the example in export_hf_util.py (it may not directly apply to your case), and your code will look similar to this:
from executorch.exir import to_edge
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner

program = to_edge(ep).to_backend(XnnpackPartitioner()).to_executorch()
filename = "model.pte"
with open(filename, "wb") as f:
    program.write_to_file(f)
Once you have the model.pte you can run it using ./cmake-out/executor_runner --model_path model.pte.
Thank you Larry. I’ve asked for help in the Discord channel because I don’t understand which dynamic shape information I should use when exporting the torch.nn.Module into an ExportedProgram, since the model.safetensors I produced is just a fine-tuning of the BERT model. Should I use the shapes from bert/modeling.py at master · google-research/bert · GitHub (input_ids: int32 Tensor of shape [batch_size, seq_length] containing word ids)?
import torch
from torch.export import export
# https://pytorch.org/docs/stable/export.html

class Mod(torch.nn.Module):
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ner_results = nlp(text)
        return ner_results

seq_length = len(text)
batch_size = 1
example_args = (torch.rand(batch_size, seq_length))

#exported_program: torch.export.ExportedProgram = export(
#    Mod(), args=example_args
#)
ep = torch.export.export(Mod(), example_args)
print(exported_program)
ep = torch.export.export(Mod(), example_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/__init__.py", line 368, in export
return _export(
^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/_trace.py", line 1035, in wrapper
raise e
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/_trace.py", line 1008, in wrapper
ep = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/exported_program.py", line 128, in wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/_trace.py", line 1970, in _export
return _export_for_training(
^^^^^^^^^^^^^^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/_trace.py", line 1035, in wrapper
raise e
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/_trace.py", line 1008, in wrapper
ep = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/exported_program.py", line 128, in wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/_trace.py", line 1821, in _export_for_training
) = _process_export_inputs(mod, args, kwargs, dynamic_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/raphy/BertFineTuningForNERPyTorch/.bftner/lib/python3.12/site-packages/torch/export/_trace.py", line 1075, in _process_export_inputs
raise UserError(
torch._dynamo.exc.UserError: Expecting `args` to be a tuple of example positional inputs, got <class 'torch.Tensor'>
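The last line is the actual problem: export() expects a tuple of example positional inputs, and (torch.rand(batch_size, seq_length)) without a trailing comma is just a parenthesized Tensor, not a tuple. The immediate fix:

# A one-element tuple needs a trailing comma:
example_args = (torch.rand(batch_size, seq_length),)
ep = torch.export.export(Mod(), example_args)

Note that tracing would then still fail for other reasons: nlp and text are not defined inside the exported forward (export captures tensor operations on the module's inputs, not a pipeline call on a raw string), and BERT expects integer token ids rather than torch.rand floats; see the dynamic-shapes sketch earlier in the thread.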