

One thing I would do differently is setup LDAP and OIDC so you can use the same authentication credentials for different apps (at least the ones that support them). I use LLDAP and Authelia for this purpose.
One thing I would do differently is setup LDAP and OIDC so you can use the same authentication credentials for different apps (at least the ones that support them). I use LLDAP and Authelia for this purpose.
I found a VRAM calculator for LLMs here: https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Wow it seems like for 128K context size you do need a lot of VRAM (~55 GB). Qwen 72B will take up ~39 GB so you would either need 4x 24GB Nvidia cards or the Mac Pro 192 GB RAM. Probably the cheapest option would be to deploy GPU instances on a service like Runpod. I think you would have to do a lot of processing before you get to the breakeven point of your own machine.
The context cache doesn’t take up too much memory compared to the model. The main benefit of having a lot of VRAM is that you can run larger models. I think you’re better off buying a 24 GB Nvidia card from a cost and performance standpoint.
I would suggest an Intel N100 mini PC if you are planning to transcode video files with Plex. Intel Quick Sync performs better than AMD for media transcoding.
I wasn’t sure if it was AI or not. According to the description on GitHub:
Utilizes state-of-the-art algorithms to identify duplicates with precision based on hashing values and FAISS Vector Database using ResNet152.
Isn’t ResNet152 a neural network model? I was careful to say neural network instead of AI or machine learning.
Yeah, the duplicate finder uses a neural network to find duplicates I think. I went through my wedding album that had a lot of burst shots and it was able to detect similar images well.
Not sure if you’re aware, but Immich has a duplicate finder
Zotero + WebDAV for me too (I use the Caddy WebDAV module).