

It was over eleven years ago at this point, so my memory may be hazy on the details, but I remember something happening in the major version change that pissed me off enough to switch off of it. 🤔
Licenses for Sublime Text 2 just said "and future updates". I remember the "lifetime" thing being a selling point on Product Hunt. This was back in 2013, though, and the weird way the licensing change was handled made me switch to emacs.
Before Sublime Text 3, all updates were included in the single license, not just major revision updates. This was back in 2013.
Before the one-license-per-version switch in 2013, the license stated "and future updates", which they honored, but then they switched to requiring a new license for each major version for some reason. I remember that being the primary reason I switched to emacs.
After having been shafted by sublime text I will never believe anything called a “lifetime subscription” is such.
A "lifetime subscription" is just an "until we decide otherwise" subscription.
You're going to have to learn Python.
Here’s a good overview: https://huggingface.co/docs/transformers/training
Or open source groups can make a fully open repro of it: https://github.com/huggingface/open-r1
Yeah, surprisingly few people know a git remote can literally be any folder outside of your tree, over almost any kind of connection.
I thought about doing a forge but realized that if I was the only one working on this stuff then I could do the same thing by setting my remote to a folder on my NAS.
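As a sketch of that setup (paths here are hypothetical, with /tmp standing in for a mounted NAS share; any mounted NFS/SMB folder works the same way):

```shell
# Any plain folder can be a git remote: create a bare repo on the "NAS"
mkdir -p /tmp/nas
git init --bare /tmp/nas/myproject.git

# A local working repo with one commit (identity set inline for the example)
git init -b main /tmp/work
cd /tmp/work
git -c user.name=me -c user.email=me@example.com commit --allow-empty -m "init"

# Point a remote at the folder and push over the plain filesystem transport
git remote add nas /tmp/nas/myproject.git
git push nas main
```

The same remote URL also works over SSH (`user@nas:/path/to/myproject.git`), which is all a personal "forge" really needs.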
It’s your queries + your IP combined with the rest of the data the net collects from you that identifies you.
Oh, yes, but the DRM exemption clause means that you can reverse engineer the changes and continue releasing them under the GPL.
Edit: as an example, we should probably be looking at the DuckStation situation evolving right now:
"Releasing the modified version to the public" would cover them re-closing the source and then releasing that newly closed source, so they can't relicense it and then release the built version of the code.
At least not easily; this is where case law would likely need to be consulted, because the way it's worded, how "modified" is interpreted in this context would need to be examined.
One of the few practical things AI might be good at:
NewPipe can do PeerTube as well.
Yes, of course; there's nothing gestalt about model training: fixed inputs result in fixed outputs.
I suppose the importance of the openness of the training data depends on your view of what a model is doing.
If you feel like a model is more like a media file that the model loaders play back, where the prompt is a type of control over how you access the model, then yes, I suppose from a trustworthiness aspect there's not much value in the model's training corpus being open.
I see models more in terms of how any other text encoder or serializer works, as if you were, say, manually encoding text. While there is a very low chance of any "malicious code" being executed, what matters is that you can check your expectations about how your inputs are being encoded against what the provider is telling you.
As an example attack vector, much like any malicious-replacement attack: if I were to download a pre-trained model from what I thought was a reputable source, but was man-in-the-middled and provided with a maliciously trained model, suddenly the system I was relying on that uses that model is compromised in terms of the expected text output. Obviously that exact problem could be fixed with some hash checking, but I hope you see that in some cases even that wouldn't be enough (such as a malicious "official" provenance).
As these models become more prevalent, being able to guarantee integrity will become more and more of an issue.
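For the hash-checking part, a minimal sketch (the file name and the idea of a provider-published digest are illustrative assumptions):

```python
import hashlib

def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks, so multi-GB
    model files never need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# Compare against a digest the provider publishes out-of-band, e.g.:
# assert sha256_file("model.safetensors") == published_digest  # hypothetical names
```

Note this only helps if the published digest itself comes over a trusted channel; a compromised "official" source defeats it, which is exactly the provenance problem above.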
I’ve seen this said multiple times, but I’m not sure where the idea that model training is inherently non-deterministic is coming from. I’ve trained a few very tiny models deterministically before…
I'm not sure where you got that idea. Model training isn't inherently non-deterministic. Making fully reproducible models is apparently 360ai's entire modus operandi.
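A toy illustration of the point (not an LLM, just seeded gradient descent on a tiny linear regression): with a fixed seed and fixed data, two training runs produce bit-identical weights. On GPUs you additionally have to pin kernel determinism and data order, but that's an engineering choice, not something inherent to training.

```python
import random

def train(seed, steps=200, lr=0.1):
    """Train a 1-D linear model with seeded data and seeded init;
    same seed in => same weights out, down to the last bit."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(64)]
    ys = [3.0 * x + 0.5 for x in xs]      # ground-truth line
    w, b = rng.random(), rng.random()     # seeded initialization
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
        gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
        w, b = w - lr * gw, b - lr * gb
    return w, b

assert train(0) == train(0)  # bit-identical across runs
```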
There are VERY FEW fully open LLMs. Most are the equivalent of source-available in licensing and, at best, only partially open source, because all they provide you with is the pretrained model.
To be fully open source they need to publish both the model and the training data; being "fully reproducible" is what makes the model trustworthy.
In that vein there’s at least one project that’s turning out great so far:
Holy crap there are still working nitter instances? God bless
I highly recommend Obtainium to anyone who wants to keep their apps updated without needing a central repository (save for the APKs that only publish on F-Droid etc.).