Insights: ggerganov/llama.cpp
September 15, 2024 – September 22, 2024
Overview
32 Releases published by 1 person
- b3759 published Sep 15, 2024
- b3760 published Sep 15, 2024
- b3761 published Sep 15, 2024
- b3763 published Sep 16, 2024
- b3764 published Sep 16, 2024
- b3766 published Sep 16, 2024
- b3765 published Sep 16, 2024
- b3767 published Sep 16, 2024
- b3770 published Sep 16, 2024
- b3771 published Sep 16, 2024
- b3772 published Sep 16, 2024
- b3774 published Sep 17, 2024
- b3775 published Sep 17, 2024
- b3777 published Sep 17, 2024
- b3778 published Sep 17, 2024
- b3779 published Sep 17, 2024
- b3781 published Sep 18, 2024
- b3782 published Sep 18, 2024
- b3783 published Sep 18, 2024
- b3785 published Sep 18, 2024
- b3786 published Sep 19, 2024
- b3787 published Sep 19, 2024
- b3788 published Sep 20, 2024
- b3789 published Sep 20, 2024
- b3790 published Sep 20, 2024
- b3795 published Sep 20, 2024
- b3797 published Sep 21, 2024
- b3798 published Sep 21, 2024
- b3799 published Sep 21, 2024
- b3800 published Sep 22, 2024
- b3801 published Sep 22, 2024
- b3802 published Sep 22, 2024
40 Pull requests merged by 21 people
- CUDA: enable Gemma FA for HIP/Pascal (#9581) merged Sep 22, 2024
- llama: remove redundant loop when constructing ubatch (#9574) merged Sep 22, 2024
- RWKV v6: RWKV_WKV op CUDA implementation (#9454) merged Sep 22, 2024
- ggml-alloc : fix list of allocated tensors with GGML_ALLOCATOR_DEBUG (#9573) merged Sep 21, 2024
- Update CUDA graph on scale change plus clear nodes/params (#9550) merged Sep 21, 2024
- CI: Provide prebuilt windows binary for hip (#9467) merged Sep 21, 2024
- quantize : improve type name parsing (#9570) merged Sep 20, 2024
- sync : ggml (#9567) merged Sep 20, 2024
- CUDA: fix sum.cu compilation for CUDA < 11.7 (#9562) merged Sep 20, 2024
- examples : flush log upon ctrl+c (#9559) merged Sep 20, 2024
- Perplexity input data should not be unescaped (#9548) merged Sep 20, 2024
- server : clean-up completed tasks from waiting list (#9531) merged Sep 19, 2024
- Imatrix input data should not be unescaped (#9543) merged Sep 19, 2024
- ggml : fix n_threads_cur initialization with one thread (#9538) merged Sep 18, 2024
- scripts : verify py deps at the start of compare (#9520) merged Sep 18, 2024
- llama : use reserve/emplace_back in sampler_sample (#9534) merged Sep 18, 2024
- bugfix: structured output response_format does not match openai (#9527) merged Sep 18, 2024
- server : fix OpenSSL build by removing invalid LOG_INFO references (#9529) merged Sep 18, 2024
- [SYCL] set context default value to avoid memory issue, update guide (#9476) merged Sep 18, 2024
- llama-bench: correct argument parsing error message (#9524) merged Sep 17, 2024
- add env variable for parallel (#9513) merged Sep 17, 2024
- Fixed n vocab (#9511) merged Sep 17, 2024
- threadpool: skip polling for unused threads (#9461) merged Sep 17, 2024
- llama.cpp: Add a missing header for cpp23 (#9508) merged Sep 17, 2024
- IBM Granite Architecture (#9412) merged Sep 17, 2024
- llama: public llama_n_head (#9512) merged Sep 17, 2024
- ggml : move common CPU backend impl to new header (#9509) merged Sep 16, 2024
- llama : rename n_embed to n_embd in rwkv6_time_mix (#9504) merged Sep 16, 2024
- ggml: link MATH_LIBRARY not by its full path (#9339) merged Sep 16, 2024
- convert : identify missing model files (#9397) merged Sep 16, 2024
- cmake : do not hide GGML options + rename option (#9465) merged Sep 16, 2024
- IQ4_NL sgemm + Q4_0 AVX optimization (#9422) merged Sep 16, 2024
- Implement OLMoE architecture (#9462) merged Sep 16, 2024
- Support MiniCPM3. (#9322) merged Sep 16, 2024
- main: option to disable context shift (#9484) merged Sep 16, 2024
- metal : handle zero-sized allocs (#9466) merged Sep 16, 2024
- nix: update flake.lock (#9488) merged Sep 16, 2024
- common : reimplement logging (#9418) merged Sep 15, 2024
- gguf-split : add basic checks (#9499) merged Sep 15, 2024
- CMake: correct order of sycl flags (#9497) merged Sep 15, 2024
12 Pull requests opened by 11 people
- llama : add reranking support (#9510) opened Sep 16, 2024
- docs: update server streaming mode documentation (#9519) opened Sep 17, 2024
- llama: (proposal) propagating the results of `graph_compute` to the user interface (#9525) opened Sep 17, 2024
- musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (#9526) opened Sep 18, 2024
- Implementations for Q4_0_8_8 quantization based functions - AVX512 version of ggml_gemm_q4_0_8x8_q8_0 (#9532) opened Sep 18, 2024
- add solar pro support (#9541) opened Sep 18, 2024
- server: disable context shift (#9544) opened Sep 19, 2024
- baby-llama : use unnamed namespace in baby_llama_layer (#9557) opened Sep 20, 2024
- CUDA: Enable K-shift operation for -ctk q8_0 (limited) (#9571) opened Sep 20, 2024
- [SYCL] add missed dll file in package (#9577) opened Sep 21, 2024
- Revert "[SYCL] fallback mmvq" (#9579) opened Sep 21, 2024
- nix: update flake.lock (#9586) opened Sep 22, 2024
54 Issues closed by 15 people
- Add theme Rose Pine (#9584) closed Sep 22, 2024
- Bug: Gemma2 9B FlashAttention is offloaded to CPU on AMD (HIP) (#9580) closed Sep 22, 2024
- Add a new `llama_load_model_from_buffer()` method to compliment `llama_load_model_from_file()` (#6311) closed Sep 22, 2024
- Bug - Can't build vulkan backend on RISC-V platform anymore (#8488) closed Sep 22, 2024
- Add lightweight tests for LoRA (#8708) closed Sep 22, 2024
- Bug: 2 tests fail (#8906) closed Sep 22, 2024
- Bug: Inference fails with "llama_get_logits_ith: invalid logits id 7, reason: no logits" in ollama (#8911) closed Sep 22, 2024
- Bug: Quantized kv cache caused performance drop on Apple silicon (#8918) closed Sep 22, 2024
- build ERROR: Failed building wheel for pyyaml (#8919) closed Sep 22, 2024
- Bug: Latest version of convert_hf_to_gguf not compatible with gguf 0.9.1 from pip (#8925) closed Sep 22, 2024
- ERROR: Can't Compile llama.cpp on Mac OS Sequoia (September 2024 update) (#9575) closed Sep 21, 2024
- Bug: Flash attention reduces vulkan performance by ~50% (#9572) closed Sep 21, 2024
- How to convert a finetuned non-LLM model in .pt format into .gguf? #8790 (#8791) closed Sep 21, 2024
- Bug: Quantizing HuggingFaceM4/Idefics3-8B-Llama3 fails with error (#8902) closed Sep 21, 2024
- Bug: Unreachable code warnings (#8904) closed Sep 21, 2024
- Bug: KV cache load/save is slow (#8915) closed Sep 21, 2024
- Bug: Update to "convert_hf_to_gguf.py" (#8920) closed Sep 21, 2024
- Bug: server crash when changing LoRA scale while using CUDA (#9451) closed Sep 21, 2024
- Bug: Llama-Quantize Not Working with Capital Letters (T^T) (#9569) closed Sep 20, 2024
- Bug: Fail to compile after commit 202084d31d4247764fc6d6d40d2e2bda0c89a73a (#9554) closed Sep 20, 2024
- Bug: llama-cli does not show the results of the performance test when SIGINT (#9558) closed Sep 20, 2024
- Feature Request: Processing one token takes the same amount of time as processing 40 tokens (CUDA/MMVQ) (#8869) closed Sep 20, 2024
- openblas not working. (#8882) closed Sep 20, 2024
- Bug: task ids not removed from waiting_tasks for /v1/chat/completions call (#9528) closed Sep 19, 2024
- Huge performance degradation using latest branch on Intel Core Ultra 7 155H (#8328) closed Sep 19, 2024
- Bug: The quantization model suffers from infinite replies and does not stop (#8861) closed Sep 19, 2024
- Bug: [SYCL] silently failed on windows (#9540) closed Sep 18, 2024
- Bug: llama-cli generates incoherent output with full gpu offload (#9535) closed Sep 18, 2024
- Bug: llama-server structured output response_format does not match openai docs (#9522) closed Sep 18, 2024
- Support BitNet b1.58 ternary models (#5761) closed Sep 18, 2024
- Feature Request: server : make chat_example available through /props endpoint (#8694) closed Sep 18, 2024
- Feature Request: multiple queues or multiple threads to load model files. (#8796) closed Sep 18, 2024
- Mistral-large-instruction-2407 cannot be quantified (#8807) closed Sep 18, 2024
- Bug: Failed to build ggml-mainline-vulkan_autogen (#8844) closed Sep 18, 2024
- Bug: Gemma 2 incoherent output when using quantized k cache without Flash Attention (#8853) closed Sep 18, 2024
- Bug: llama-server: when result doesn't fit in max_tokens, finished_reason should be length (#8856) closed Sep 18, 2024
- Bug: Crash in Release Mode when built with Xcode 16 (& since Xcode 15.3) (#9514) closed Sep 17, 2024
- Bug: Last 2 Chunks In Streaming Mode Come Together In Firefox (#9502) closed Sep 17, 2024
- Can't load a Q4 model on 12gb vram (#9517) closed Sep 17, 2024
- Bug: ERROR-hf-to-gguf (#9483) closed Sep 16, 2024
- Bug: Build failure in master on Ubuntu 24.04 with CUDA enabled (#9473) closed Sep 16, 2024
- Bug: Missing Sanity Check in convert_hf_to_gguf.py (#9245) closed Sep 16, 2024
- Add support for OLMoE-1B-7B / 7B (#9380) closed Sep 16, 2024
- Bug: ggml_backend_metal_buffer_type_alloc_buffer: error: failed to allocate buffer (#9460) closed Sep 16, 2024
- Bug: [SYCL] linker fails with undefined reference to symbol (#9490) closed Sep 16, 2024
- When using GPU (OpenCL), the reply speed is slower and all replies are incorrect?? (#7661) closed Sep 16, 2024
- Bug: Unable to load grammar from `json.gbnf` example (#7991) closed Sep 16, 2024
- Bug: FPE (Floating Point Exception) in gguf_init_from_file due to division by zero (#8816) closed Sep 16, 2024
- Feature Request: Bare metal build (#8820) closed Sep 16, 2024
- Feature Request: Please provide a Linux Vulkan binary (#8825) closed Sep 16, 2024
- llama : reimplement logging (#8566) closed Sep 15, 2024
- Bug: can not merge gguf, gguf_init_from_file: invalid magic characters '' (#9498) closed Sep 15, 2024
- Bug: Unable to quantise Uncensored Mistral NeMo Model (#9363) closed Sep 15, 2024
22 Issues opened by 22 people
- Bug: false sharing in threadpool makes ggml_barrier() needlessly slow (#9588) opened Sep 22, 2024
- Bug: passing `tfs_z` crashes the server (#9587) opened Sep 22, 2024
- Feature Request: Support Jina V3 arch (#9585) opened Sep 21, 2024
- Bug: Templates are swapped for Mistral and Llama 2 in llama-server when using --chat-template (#9583) opened Sep 21, 2024
- Bug: Vulkan not compile (#9582) opened Sep 21, 2024
- Bug: ROCM 7900xtx output random garbage with qwen1.5/14B after recent update (#9568) opened Sep 20, 2024
- Bug: gguf pypi package corrupts environment (#9566) opened Sep 20, 2024
- Bug: Release version less accurate than Debug version consistently (#9564) opened Sep 20, 2024
- Bug: Model isn't loading (#9563) opened Sep 20, 2024
- [CANN]Bug: Can't compile ggml/src/CMakeFiles/ggml.dir/ggml-cann/acl_tensor.cpp.o (#9560) opened Sep 20, 2024
- Bug: Unreadable output from android example project (#9555) opened Sep 20, 2024
- Feature Request: Support GRIN-MoE by Microsoft (#9552) opened Sep 19, 2024
- Bug: KV quantization fails when using vulkan (#9551) opened Sep 19, 2024
- Bug: Build fails on i386 systems (#9545) opened Sep 19, 2024
- Error compiling using CUDA on Jetson Orin nx (#9533) opened Sep 18, 2024
- Bug: Lower performance in pre-built binary llama-server, Since llama-b3681-bin-win-cuda-cu12.2.0-x64 (#9530) opened Sep 18, 2024
- Bug: duplicate vulkan devices being detected on windows (#9516) opened Sep 17, 2024
- metal : increase GPU duty-cycle during inference (#9507) opened Sep 16, 2024
- Bug: Lower performance in SYCL vs IPEX LLM. (#9505) opened Sep 16, 2024
- Bug: llama-bench: split-mode flag doesn't recognize argument 'none' (#9501) opened Sep 16, 2024
67 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- vocab: refactor tokenizer to reduce the overhead of creating multi times tokenizer (#9449) commented on Sep 20, 2024 • 11 new comments
- Support video understanding (#9165) commented on Sep 17, 2024 • 3 new comments
- Update clip.cpp (#9482) commented on Sep 18, 2024 • 2 new comments
- ggml: Add run-time detection of neon, i8mm and sve (#9331) commented on Sep 20, 2024 • 2 new comments
- IBM Granite MoE Architecture (#9438) commented on Sep 17, 2024 • 1 new comment
- server : add Hermes-3 tool call support (WIP) (#9254) commented on Sep 19, 2024 • 1 new comment
- Add Intel Advanced Matrix Extensions (AMX) support to ggml (#8998) commented on Sep 18, 2024 • 1 new comment
- Merging #7568 with #7430 (Implementing LLaMA 3 torch to gguf conversion) (#7651) commented on Sep 20, 2024 • 1 new comment
- server : ability to disable context shift (#9390) commented on Sep 21, 2024 • 0 new comments
- Bug: Mac build failed using make (#9157) commented on Sep 21, 2024 • 0 new comments
- Feature Request: Add support for Phi-3.5 MoE and Vision Instruct (#9119) commented on Sep 21, 2024 • 0 new comments
- Bug: llama-server api first query very slow (#9492) commented on Sep 21, 2024 • 0 new comments
- Feature Request: Support for Qwen2-VL (#9246) commented on Sep 21, 2024 • 0 new comments
- Phi 3 medium/small support (#7439) commented on Sep 21, 2024 • 0 new comments
- Bug: NikolayKozloff/madlad400-10b-mt-Q8_0-GGUF works with llama-cli but doesn't work with llama-server (#9030) commented on Sep 21, 2024 • 0 new comments
- obtain attention matrices during inference, similar to the output_attentions=True parameter in the transformers package (#9122) commented on Sep 21, 2024 • 0 new comments
- Bug: cannot create std::vector larger than max_size() (#9391) commented on Sep 21, 2024 • 0 new comments
- Feature Request: Pixtral by Mistral support (pixtral-12b-240910) (#9440) commented on Sep 20, 2024 • 0 new comments
- llama : store token ids in the KV Cache (#9113) commented on Sep 20, 2024 • 0 new comments
- Refactor: decide the future of llama_tensor_get_type() (#8736) commented on Sep 20, 2024 • 0 new comments
- llama : add test for saving/loading sessions to the CI (#2631) commented on Sep 17, 2024 • 0 new comments
- Improve `cvector-generator` (#8724) commented on Sep 21, 2024 • 0 new comments
- Bug: Throughput (tokens/sec) does not scale with increasing batch sizes in Intel GPUs (#9097) commented on Sep 22, 2024 • 0 new comments
- Bug: High CPU usage and bad output with flash attention on ROCm (#8893) commented on Sep 22, 2024 • 0 new comments
- Bug: Llava not working on android (#8436) commented on Sep 22, 2024 • 0 new comments
- llama : fix K-shift with quantized K (wip) (#5653) commented on Sep 20, 2024 • 0 new comments
- Server: enable lookup decoding (#6828) commented on Sep 18, 2024 • 0 new comments
- added implementation of DRY sampler (#6839) commented on Sep 22, 2024 • 0 new comments
- Changes for the existing quant strategies / FTYPEs and new ones (#8836) commented on Sep 19, 2024 • 0 new comments
- Add lora test workflow (WIP) (#9058) commented on Sep 20, 2024 • 0 new comments
- server: add repeat penalty sigmoid (#9076) commented on Sep 15, 2024 • 0 new comments
- llama : initial Mamba-2 support (#9126) commented on Sep 18, 2024 • 0 new comments
- convert : refactor rope_freqs generation (#9396) commented on Sep 18, 2024 • 0 new comments
- naming : normalize the name of callback-related identifiers (#9405) commented on Sep 16, 2024 • 0 new comments
- Question: How to generate an MPS gputrace (#6506) commented on Sep 17, 2024 • 0 new comments
- changelog : `libllama` API (#9289) commented on Sep 17, 2024 • 0 new comments
- llama : support sliding window attention (#3377) commented on Sep 17, 2024 • 0 new comments
- Bug: On a 3 GPU System [A-C] not using CUDA_VISIBLE_DEVICES but using tensor split [1,1,0] should not allocate ANY memory on GPU C (#8827) commented on Sep 17, 2024 • 0 new comments
- Bug: GGML_ASSERT(llama_add_eos_token(model) != 1) failed llama-server critical error with flan-t5 models (#8990) commented on Sep 17, 2024 • 0 new comments
- Bug: Couldn't load GGUF file into Transformers (#9021) commented on Sep 17, 2024 • 0 new comments
- Vulkan adreno error (#9064) commented on Sep 17, 2024 • 0 new comments
- llama : support reranking API endpoint and models (#8555) commented on Sep 16, 2024 • 0 new comments
- llama : speed-up grammar sampling (#4218) commented on Sep 16, 2024 • 0 new comments
- Bug: andriod compiling bug, with vulkan open (#9489) commented on Sep 16, 2024 • 0 new comments
- Feature Request: T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge (#8485) commented on Sep 16, 2024 • 0 new comments
- Bug: src/llama.cpp:15099: Deepseek2 does not support K-shift (#8862) commented on Sep 16, 2024 • 0 new comments
- Feature Request: Support vulkan when building on Android (#8933) commented on Sep 16, 2024 • 0 new comments
- Feature Request: RDMA support for rpc back ends (#9493) commented on Sep 15, 2024 • 0 new comments
- Bug: There is an issue to execute llama-baby-llama. (#9478) commented on Sep 15, 2024 • 0 new comments
- server : improvements and maintenance (#4216) commented on Sep 15, 2024 • 0 new comments
- Feature Request: NPU Support (#9181) commented on Sep 20, 2024 • 0 new comments
- Bug: MinGW build fails to load models with "error loading model: PrefetchVirtualMemory unavailable" (#9311) commented on Sep 20, 2024 • 0 new comments
- Feature Request: InternVL2 Support ? (#8848) commented on Sep 20, 2024 • 0 new comments
- Encounter the "newline in constant" error while compiling with MSVC (#8334) commented on Sep 20, 2024 • 0 new comments
- Bug: llama3.1 8B GGUF parallel inferring process leads to endless repeating results (#9104) commented on Sep 20, 2024 • 0 new comments
- llama : refactor llama_vocab (#9369) commented on Sep 19, 2024 • 0 new comments
- Bug: llama_print_timings seems to accumulate load_time/total_time in `llama-bench` (#9286) commented on Sep 19, 2024 • 0 new comments
- Feature Request: Support Codestral Mamba (#8519) commented on Sep 19, 2024 • 0 new comments
- BF16 has no CUDA support (#8941) commented on Sep 19, 2024 • 0 new comments
- Bug: Unable to load phi3:3B(2.2GB) model on Apple M1 Pro (#9049) commented on Sep 19, 2024 • 0 new comments
- [CANN]Feature Request: Support OrangeAIPRO 310b CANN (#9481) commented on Sep 18, 2024 • 0 new comments
- Support speculative decoding in `server` example (#5877) commented on Sep 18, 2024 • 0 new comments
- Running Lllava in interactive mode just Quits after generating response without waiting for next prompt. (#3593) commented on Sep 18, 2024 • 0 new comments
- llama : tool for evaluating quantization results per layer (#2783) commented on Sep 18, 2024 • 0 new comments
- metal : compile-time kernel args and params (#4085) commented on Sep 18, 2024 • 0 new comments
- ggml : unified CMake build (#6913) commented on Sep 18, 2024 • 0 new comments
- How to utilize GPU on Android to accelerate inference? (#8705) commented on Sep 18, 2024 • 0 new comments