Abstract solutions for content recognition with a bot on a server is not a platform specific issue. The dev is skilled and likely on Matrix too.
It is a bot that identifies CSAM images. They are a very skilled dev. The problem is content recognition on a server. So in abstract, it is the same problem.
Search for posts or contact db0. IIRC they worked with LW admin and others to create a filter for this using a very small AI model. It should be on their Git.
My computer was 150 watts back 20 years ago and is 750 now. I am sitting beside a second monitor, window AC, portable heater, DMM, Oscilloscope, soldering iron, hotplate, toaster oven, 2nd soldering iron, laser printer, 3d printer, soldering fume extractor, HEPA air purifier, router, server, three different lamps, two fans, a rack mounted programmable power supply, three other benchtop power supplies, solder pot, laminator, hot air rework station, home theater stereo, and countless chargers for projects and devices. That is all just within my bedroom. Not everyone is a Maker like me, or under life circumstances where they need such things in a compact and accessible space due to physical disability. Perhaps that makes me biased to one extreme. This kind of accessibility for Maker tools is something that did not exist 20 years ago. You might find a few outliers with lots of space, money, skills, and time that did similar things, but these were not some below average middle class fool like me and they were far more rare than they are now. I am very careful about what I have plugged in and powered at the same time even with my room wired for more load than is typical.
The whole line is dangerous. Often from something like a window AC or portable heater added to to same outlet as room lighting and a TV or computer. The AC or heater is already near the maximum current for the line, usually 1400 watts. If you feel the outlets or wire in the walls they will be warm or hot. We have a lot more stuff plugged in than the era when most houses were built. Any connection issues can cause further problems. The wall outlet and circuit breaker are common. If your lights start flickering at all, you need to find the issue quickly. That isn’t always going to happen, but in many cases flickering lights are a sign of the problem.
There are many levels of objections to this behavior. Some may choose to remove comments after a given amount of time, although that is rather pointless. Any company can setup an instance and use the initial synchronization of federated instances to capture all data.
Some people might simply avoid sharing any personal information that can dox them or correlate them with other profiles.
Then there are people like myself. I do not really care about how this account is correlated. I am only here for the human social connectedness. I object to all collection of personal data and view the trade of such data as digital slavery and a gross violation of fundamental human rights.
My personal threat model is that I avoid any potential situation where my dwell time, and page views are monitored and used to manipulate and exploit me. I’m particularly concerned with how the best and brightest psychology majors have been getting into social media and marketing jobs. I noticed a pattern of how I was motivated to make frivolous purchases over time when I engaged with corporate media sources. I have never responded to ads directly, but when I shopped on a platform, suddenly I encountered more content relative to that platform. After many projects I started asking myself why I chose to do x/y/z, and it was usually due to some suggested content I had watched.
Around the time I came to Lemmy, I disconnected from all corporate social media. I won’t even run most apps if they are connected to the internet. I do most stuff in a browser only. I separate social media from any shopping. I also run a whitelist firewall for most of my devices.
I am protecting myself from any viewer retention algorithms that might directly or indirectly use the human propensity for masochistic negative attraction and attachment. I found this damaging on platforms like FB in the first years of my physical disability a decade ago.
So one might say, my actionable threat model is direct manipulation based concerns.
I use it if I do not have my own server running. There are many times I am looking for a word or info that is not very compatible with a search engine but is effective for AI.
I think there are a lot of people that just do not understand AI, and how to use it effectively. There seems to be a user filter where low Machiavellian individuals with poor abstractive thinking skills struggle to understand the subjective nature of the world in every space and how all of our collective incorrectness is present in AI too.
Perhaps they only played with very small models of the Llama 2 era, which were much harder to effectively use. Or maybe they played with llama.cpp before April of last year when the special tokens were incorrectly defaulted to GPT2 tokens for all models in the first ~260 positions.
I forget the name of the issue, but seem to recall there is some kind of unsolved database indexing paradox that is why search engines suck now, but that is a vague memory. Likely AI is just the whipping boy for people that are getting left behind by technology they are unable to learn effectively.
OCR 5 from F-droid was really good for me like 2+ years ago, but when I tried it more recently it was garbage. It really stood out to me around 2 years ago because around 5 years ago I tried translating a Chinese datasheet for one of the Atmel uC clone microcontrollers and OCR was not fun then.
Maybe have a look at Huggingface spaces and see if anyone has a better methodology setup as an example. Or look at the history of the models and see if one of the older ones is still available.
I’ve never had great results with tesseract if the image has compression so the mixed background sounds like a nightmare. There is probably some JavaScript stream in there but good luck accessing it. BR is hot garbage for a standard.
Plan 9
Need max AVX instructions. Anything with P/E cores is junk. Only enterprise P cores have the max AVX instructions. When P/E are mixed the advanced AVX is disabled in microcode because the CPU scheduler is unable to determine if a process thread contains an AVX instruction and there is no asymmetrical scheduler that handles this. Prior to early 12k series Intel, the microcode for P enterprise could allegedly run if swapped manually. This was “fused off” to prevent it, probably because Linux could easily be adapted to asymmetrical scheduling but Windows would probably not. The whole reason W11 had to be made was because of the E-cores and the way the scheduler and spin up of idol cores works, at least according to someone on Linux Plumbers for the CPU scheduler ~2020. There are already asymmetric schedulers in Android ARM.
Anyways I think it was on Gamer’s Nexus in the last week or two that Intel was doing some all P core consumer stuff. I’d look at that. According to chips and cheese, the primary CPU bottleneck for tensors is the bus width and clock management of the L2 to L1 cache.
I do alright with my laptop, but haven’t tried R1 stuff yet. The 70B llama2 stuff that I ran was untenable for CPU only with a 12700 with just CPU. It is a little slower than my reading pace when split with a 16 GB GPU, and that was running a 4 bit quantization version.
Not unless an http port is open too. If the only port is https, you have to have the certificate. Like with my AI stuff it acts like the host is down if I try to connect with http. You have to have the certificate to decrypt anything at all from the host.
Sorta, you have to install your certificate authority into the browser and it might complain about verifying that but it will still connect with the encryption.
deleted by creator
I mean more like a self signed TLS certificate with your own host manually set in the browser. Then only make the TLS port available, or something like that. If you have access to both(all) devices, you should be able to fully encrypt by bruit force and without registering the certificate with anyone. That is what I do with AI at home.
I’ve half ass thought about this but never have tried to actually self host. If you have access to all devices, why not just use your own self signed certificates to encrypt everything and require the certificate for all connections? Then there is never a way to log in or connect right? The only reason for any authentication is to make it possible to use any connection to dial into your server. So is that a bug or a feature. Maybe I’m missing something fundamental in this abstract concept that someone will tell me?
White list firewall
You generally want to use a trusted protection module (TPM) chip like what is on most current computers and Pixel phones. The thing to understand about the TPM chips is that they have a set of unique internal keys that cannot be accessed at all. These keys are used to hash against and create other keys. The inaccessibility of this unique keyset is the critical factor.
If you store keys in any regular memory, you are taking a chance. Maybe check out Joe Grand’s YT stuff. He had posted about hacking legit keys to recover large crypto amounts. Joe is behind the JTAGulator, if you have ever seen that one, and was a famous child hacker going by “Kingpin.”
I recall reading somewhere about a software implementation of TPM for secure boot, but I didn’t look into it very deeply and do not recall where I read about it. Probably on Gentoo, Arch, or maybe in the book Beyond Bios (terrible)
Andrew Huang used to have stuff up on YT that would be relevant to real security of such a device, but you usually need to know where he wrote articles to find links because most of his stuff is publicly listed on YT. He has also removed a good bit over the years when certain exploits are unfixable like accessing the 8051 microcontroller built into most SD cards and running transparently. Andrew is the author of Hacking the Xbox which involved basically a man in the middle attack on a high speed PCIE (IIRC) connection.
It would be a ton of work to try to reverse engineer what you have created and implemented in such a device. Unless you’re storing millions, it is probably not something anyone is going to mess with.
datatas
They are in a lot of IoT devices that are not hobby and dev related too. Like my folk’s smoker grill has one that is also on a ridiculous AWS connection and designed to try and stay on 24/7 like proper stalkerware nonsense.