• 0 Posts
  • 7 Comments
Joined 3 years ago
Cake day: June 28th, 2023



  • Well, if you’re that sensitive, then fine. In essence, it’s just the classic fear of public speaking. If you’re uncomfortable expressing your opinion in front of a wide audience, why do it? Not everyone in the audience likes you, yes, that’s right. There is no tragedy in that, and it’s not a reason to ban everyone. Just don’t speak out publicly; gather an interest group and discuss topics there. There are worse things in the world than a vote against you.

    P.S. And you don’t have to stress so much that this isn’t criticism of me, because it doesn’t matter. Gather your will and just say what you want, and let people do with it what they want. That’s how it works. There are moderators for everything else.

    UPD: I’m Russian, by the way, so I can catch downvotes just for breathing. Not very often, but still. Well, that’s life.



  • I don’t know about you, but I believe that people who gave me a negative vote can still create content that interests me. After all, if a person doesn’t agree with my opinion, that doesn’t make their comments and posts any worse for me. Otherwise, I could say something stupid that I’d be ashamed to reread in 5 years, and end up blocking half the platform.

    Lemmy already has one of the softest voting systems. On Habr, for example, if your rating is negative, you can’t write comments at all.


  • Yes, it is. But I have llama-swap and Open WebUI. If you spend some time on the llama-swap configuration (a rough sketch is at the end of this comment), you have a good chance of running a model across 2 cards through llama.cpp. The gains, however, won’t be 2x, of course, and they fall off non-linearly with the number of cards. You also need a motherboard with enough PCIe lanes (2 PCIe x16 slots or more). But it’s still cheaper than one large card. Example:

    # Split one GGUF model across two AMD GPUs (ROCm) with llama.cpp's llama-server
    HIP_VISIBLE_DEVICES=0,1 \
    /opt/llama.cpp/build/bin/llama-server \
      --host 127.0.0.1 \
      --port 8082 \
      --model /storage/models/model.gguf \
      --n-gpu-layers all \
      --split-mode layer \
      --tensor-split 1,1 \
      --ctx-size 32768 \
      --batch-size 512 \
      --ubatch-size 512 \
      --flash-attn on \
      --parallel 1
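
    Once it’s up, you can sanity-check it through llama-server’s HTTP API: it exposes a /health endpoint and an OpenAI-compatible /v1/chat/completions route on whatever you passed to --port. The prompt here is just a placeholder:

    # quick liveness check
    curl http://127.0.0.1:8082/health

    # OpenAI-compatible chat endpoint served by llama-server
    curl http://127.0.0.1:8082/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{"messages":[{"role":"user","content":"Hello from two GPUs"}]}'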
    

    There is a less stable but higher-throughput alternative: --split-mode row

    P.S. By the way, a single RX 9070 XT on my instance translates posts and comments. You can test it if you want. =)
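
    And here is roughly how such a command can be wired into llama-swap, as promised above. This is a minimal sketch: the models/cmd/env keys and the ${PORT} macro are how I remember llama-swap’s YAML schema, and the model name and config path are made up, so check the project README for the exact format:

    # Hypothetical llama-swap config; model name and paths are placeholders.
    # llama-swap fills in ${PORT} itself when it launches the backend.
    cat > /opt/llama-swap/config.yaml <<'EOF'
    models:
      "my-model":
        env:
          - "HIP_VISIBLE_DEVICES=0,1"
        cmd: >
          /opt/llama.cpp/build/bin/llama-server
          --host 127.0.0.1 --port ${PORT}
          --model /storage/models/model.gguf
          --n-gpu-layers all --split-mode layer --tensor-split 1,1
          --ctx-size 32768 --flash-attn on
    EOF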