@sntx

sntx@lemm.ee · 21 days ago

https://en.m.wikipedia.org/wiki/Gestapo

sntx@lemm.ee · 27 days ago

This. I’ve been hosting+using searxng for a while and the experience is just great ^^

No ads, a manageable amount of websites built on AI content, fast searches, resilient service, …

sntx@lemm.ee · 1 month ago

I’m curious, how do you run the 4x3090s? The FE cards would need 4x3=12 PCIe slots and 4x16=64 PCIe lanes… Did you NVLink them? What about transient power spikes? Any clock or even VBIOS mods?
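For reference, the slot and lane math above as a tiny sketch. It assumes each FE card is triple-slot and would ideally get a full x16 link; the variable names are just for illustration.

```python
# Assumptions: 3 slots of width per FE card, a full x16 link per card.
cards = 4
slots_per_card = 3
lanes_per_card = 16

total_slots = cards * slots_per_card   # physical slot width needed
total_lanes = cards * lanes_per_card   # PCIe lanes for full-bandwidth links

print(f"{total_slots} slots, {total_lanes} lanes")
```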

sntx@lemm.ee · 1 month ago

I’m also on p2p 2x3090 with 48GB of VRAM. Honestly it’s a nice experience, but still somewhat limiting…

I’m currently running deepseek-r1-distill-llama-70b-awq with the aphrodite engine, though the same applies to llama-3.3-70b. It works great and is way faster than ollama, for example. But my max context is around 22k tokens. More VRAM would allow me more context, and even more VRAM would allow for speculative decoding, CUDA graphs, …

Maybe I’ll drop down to a 35b model to get more context and a bit of speed. But I don’t really want to justify the possible decrease in answer quality.
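To give a feel for why context eats VRAM, here is a rough back-of-the-envelope sketch of the KV cache size. It assumes the Llama 70B architecture numbers (80 layers, 8 KV heads through grouped-query attention, head dimension 128) and an unquantized 16-bit KV cache. What the engine actually allocates will differ, so treat it as an estimate only.

```python
def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV cache bytes needed for one token.

    The leading factor 2 accounts for storing both a key and a value
    vector per layer. bytes_per_value=2 assumes a 16-bit cache.
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# Assumed Llama 70B shape: 80 layers, 8 KV heads, head dimension 128.
per_token = kv_cache_bytes_per_token(num_layers=80, num_kv_heads=8, head_dim=128)

context_tokens = 22_000  # roughly the max context mentioned above
total_gib = per_token * context_tokens / 2**30

print(f"{per_token / 1024:.0f} KiB per token, {total_gib:.1f} GiB for {context_tokens} tokens")
```

Under these assumptions the cache alone takes several GiB on top of the quantized weights, which is why a smaller model frees up room for more context.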

sntx@lemm.ee · 2 months ago

I’m running such a setup!

This is my nixos config, though feel free to ignore it, since it’s optimized for me and not others.

How did I achieve your described setup?

nixos + flakes & colmena: Sync system config & updates
impermanence through btrfs snapshots: destroy all non-declarative state between reboots to avoid drift between systems
syncthing: synchronise ALL user files between systems (at least my server is always online to reduce sync inconsistencies from only having a single device active at a time)
rustic: hourly backups from all devices to the same repos; since this is deduplicated and my systems are mostly synchronised, I have a very clear record of my file histories

sntx@lemm.ee · 2 months ago

The added info from pv is also nice ^^

sntx@lemm.ee · 4 months ago

It’s affected by the write-hole phenomenon. In BTRFS’s case, that can mean that perfectly good old data might get corrupted without any notice.

sntx@lemm.ee · 4 months ago

83 Posts, 1626 Comments of completely unliked 0-bit information posts without metadata like time of post.

sntx@lemm.ee · 5 months ago

Thanks for the writeup! So far I’ve been using ollama, but I’m always open for trying out alternatives. To be honest, it seems I was oblivious to the existence of alternatives.

Your post is suggesting that the same models with the same parameters generate different results when run on different backends?

I can see how the backend would have an influence on handling concurrent API calls, RAM/VRAM efficiency, supported hardware/drivers and general speed.

But going as far as having different context windows and quality degrading issues is news to me.

sntx@lemm.ee · 5 months ago

Is there an inherent benefit to using NVLINK? Should I specifically try out Aphrodite over the other recommendations when having 2x 3090 with NVLINK available?

sntx@lemm.ee · 6 months ago

This sounds like a horror story to me.

sntx@lemm.ee · 7 months ago

nah, we have run0 at home

sntx@lemm.ee · edited 8 months ago

Yes, Taler by design allows identification of the receiver.

It does not reveal the sender.

It allows you to create and arbitrate your own tokens and to create your own “bank”.

Here’s a video doing a good job of explaining it in detail.

sntx@lemm.ee · 8 months ago

I have three things to say:

Everyone, please make sure you’ve set up sound disk encryption
That’s not a surprise (for me at least)
It’s not much different on mobile (db is unencrypted) - check out molly (signal fork) if you want to encrypt it. However, an encrypted db means no messages until you decrypt it.

sntx@lemm.ee · 8 months ago

@ https://lemmy.ml/c/simplex

sntx@lemm.ee · 9 months ago

This is the same setup I’m running, I can highly recommend it.