# mistral.rs PyO3 Bindings: mistralrs

`mistralrs` is a Python package that provides an API for mistral.rs. We build `mistralrs` with the maturin build tool.

## Installation

1. Install the required system packages:

   - `libssl` (e.g., `sudo apt install libssl-dev`)
   - `pkg-config` (e.g., `sudo apt install pkg-config`)

2. Install Rust (https://rustup.rs/):

   ```shell
   curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
   source $HOME/.cargo/env
   ```

3. Set your Hugging Face token (skip this step if the token is already set, your model is not gated, or you plan to use the `token_source` parameter in Python or on the command line):

   ```shell
   mkdir -p ~/.cache/huggingface
   touch ~/.cache/huggingface/token
   echo <HF_TOKEN_HERE> > ~/.cache/huggingface/token
   ```
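To sanity-check the token file from Python, here is a minimal stdlib-only sketch (the helper name is made up for illustration; the default path matches the commands above):

```python
from pathlib import Path

def read_hf_token(path=None):
    """Return the Hugging Face token, or None if the file is absent or empty.

    `path` defaults to the standard cache location used in step 3;
    pass an explicit path to check a different file.
    """
    token_file = Path(path) if path else Path.home() / ".cache" / "huggingface" / "token"
    if not token_file.is_file():
        return None
    token = token_file.read_text().strip()
    return token or None
```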
4. Download the code:

   ```shell
   git clone https://github.com/EricLBuehler/mistral.rs.git
   cd mistral.rs
   ```

5. `cd` into the directory for building `mistralrs`: `cd mistralrs-pyo3`

6. Install maturin, the Rust + Python build tool. Maturin requires an active Python virtual environment such as venv or conda; the `mistralrs` package will be installed into that environment:

   ```shell
   pip install "maturin[patchelf]"
   ```

7. Install `mistralrs` by executing the following in this directory. Features such as `cuda` or `flash-attn` may be specified with the `--features` argument, just as they would be for `cargo run`.

   The base build command is:

   ```shell
   maturin develop -r
   ```

   - To build for CUDA:

     ```shell
     maturin develop -r --features cuda
     ```

   - To build for CUDA with flash attention:

     ```shell
     maturin develop -r --features "cuda flash-attn"
     ```

   - To build for Metal:

     ```shell
     maturin develop -r --features metal
     ```

   - To build for Accelerate:

     ```shell
     maturin develop -r --features accelerate
     ```

   - To build for MKL:

     ```shell
     maturin develop -r --features mkl
     ```
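Note that multiple features go in a single space-separated, quoted `--features` value, as in the CUDA + flash attention example. A small illustrative sketch of how such a command line is assembled (the helper is not part of maturin's API, just a demonstration of the quoting rule):

```python
import shlex

def maturin_develop_cmd(features=()):
    """Compose a release-mode `maturin develop` command line.

    Multiple features join into one quoted --features argument,
    matching the "cuda flash-attn" example above.
    """
    cmd = ["maturin", "develop", "-r"]
    if features:
        cmd += ["--features", " ".join(features)]
    return shlex.join(cmd)

print(maturin_develop_cmd(["cuda", "flash-attn"]))
# → maturin develop -r --features 'cuda flash-attn'
```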

Please find the API docs here and the type stubs here; the stubs are another great form of documentation.

We also provide a cookbook here!

## Example

```python
from mistralrs import ModelKind, MistralLoader, ChatCompletionRequest

# The tokenizer and config come from the base model_id; the weights
# come from the quantized GGUF repository.
kind = ModelKind.QuantizedGGUF
loader = MistralLoader(
    model_id="mistralai/Mistral-7B-Instruct-v0.1",
    kind=kind,
    no_kv_cache=False,
    repeat_last_n=64,
    quantized_model_id="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
    quantized_filename="mistral-7b-instruct-v0.1.Q4_K_M.gguf",
)
runner = loader.load()

# Requests use an OpenAI-style chat completion shape.
res = runner.send_chat_completion_request(
    ChatCompletionRequest(
        model="mistral",
        messages=[
            {"role": "user", "content": "Tell me a story about the Rust type system."}
        ],
        max_tokens=256,
        frequency_penalty=1.0,
        top_p=0.1,
        temperature=0.1,
    )
)
print(res)
```
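To pull just the assistant's text out of the response, a small helper like the following works on both attribute-style objects and plain dicts. It assumes the response follows the OpenAI-style `choices[0].message.content` schema suggested by the request shape above; adjust if the actual response object differs:

```python
def first_reply(res):
    """Extract choices[0].message.content from an OpenAI-style response.

    The schema is an assumption based on the OpenAI-compatible request
    shape, not a documented guarantee of the mistralrs response type.
    """
    def get(obj, name):
        # Support both dict payloads and attribute-style response objects.
        return obj[name] if isinstance(obj, dict) else getattr(obj, name)

    choices = get(res, "choices")
    return get(get(choices[0], "message"), "content")
```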