Off-and-on trying out an account over at @tal@oleo.cafe due to scraping bots bogging down lemmy.today to the point of near-unusability.

  • 1 Post
  • 176 Comments
Joined 2 years ago
Cake day: October 4th, 2023

  • Having a limited attack surface will reduce exposure.

    If, say, the only thing you’re exposing is a WireGuard VPN, then unless there’s a misconfiguration or a remotely-exploitable bug in WireGuard, you’re fine as far as random people running exploit scanners are concerned.

    I’m not too worried about stuff like (vanilla) Apache, OpenSSH, or WireGuard, the “big” stuff that has a lot of eyes on it. I’d be a lot more dubious about niche stuff that some guy just threw together.

    To put this in perspective, you gotta remember that most software people run isn’t sandboxed, and it can phone home: games on Steam, your Web browser (which has a lot of sites that might attack it if it has bugs), plugins for that Web browser, some guy’s open-source project. Each of those is a potential vector too. Sure, some random script kiddie running an exploit scanner is a potential risk, but my bet is that if you look at the actual number of compromises via that route, it’s probably rather lower than via plain old malware.

    It’s good to be aware of what you’re doing when you expose something to the Internet, but also to keep perspective. A lot of people out there run services exposed to the Internet every day; they need to do so to make things work.




  • https://stackoverflow.com/questions/30869297/difference-between-memfree-and-memavailable

    Rik van Riel’s comments when adding MemAvailable to /proc/meminfo:

    /proc/meminfo: MemAvailable: provide estimated available memory

    Many load balancing and workload placing programs check /proc/meminfo to estimate how much free memory is available. They generally do this by adding up “free” and “cached”, which was fine ten years ago, but is pretty much guaranteed to be wrong today.

    It is wrong because Cached includes memory that is not freeable as page cache, for example shared memory segments, tmpfs, and ramfs, and it does not include reclaimable slab memory, which can take up a large fraction of system memory on mostly idle systems with lots of files.

    Currently, the amount of memory that is available for a new workload, without pushing the system into swap, can be estimated from MemFree, Active(file), Inactive(file), and SReclaimable, as well as the “low” watermarks from /proc/zoneinfo.

    However, this may change in the future, and user space really should not be expected to know kernel internals to come up with an estimate for the amount of free memory.

    It is more convenient to provide such an estimate in /proc/meminfo. If things change in the future, we only have to change it in one place.

    Looking at the htop source:

    https://github.com/htop-dev/htop/blob/main/MemoryMeter.c

       /* we actually want to show "used + shared + compressed" */
       double used = this->values[MEMORY_METER_USED];
       if (isPositive(this->values[MEMORY_METER_SHARED]))
          used += this->values[MEMORY_METER_SHARED];
       if (isPositive(this->values[MEMORY_METER_COMPRESSED]))
          used += this->values[MEMORY_METER_COMPRESSED];
    
       written = Meter_humanUnit(buffer, used, size);
    

    It’s adding used, shared, and compressed memory to get the amount actually tied up, but it disregards cached memory, which, based on the kernel comment above, is problematic, since some of that cached memory may not actually be available for use.

    top, on the other hand, uses the kernel’s MemAvailable directly; so does procps’s free, which is what the snippet below is from:

    https://gitlab.com/procps-ng/procps/-/blob/master/src/free.c

    	printf(" %11s", scale_size(MEMINFO_GET(mem_info, MEMINFO_MEM_AVAILABLE, ul_int), args.exponent, flags & FREE_SI, flags & FREE_HUMANREADABLE));
    

    In short: You probably want to trust /proc/meminfo’s MemAvailable (which is what top will show), and htop is probably giving a misleadingly low number.
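    To make the difference concrete, here’s a minimal sketch (not code from htop or procps, just an illustration) that reads /proc/meminfo and prints the naive “MemFree + Cached” figure next to the kernel’s MemAvailable:

       /* meminfo_compare.c: print the old "free + cached" heuristic next to
        * the kernel's MemAvailable estimate.  Illustration only; the field
        * names are the standard /proc/meminfo ones. */
       #include <stdio.h>
       #include <string.h>

       /* Return the value (in kB) of a /proc/meminfo field, or -1 if absent. */
       static long long meminfo_kb(const char *field)
       {
           FILE *f = fopen("/proc/meminfo", "r");
           char line[256];
           long long value = -1;
           size_t len = strlen(field);

           if (!f)
               return -1;
           while (fgets(line, sizeof line, f)) {
               if (strncmp(line, field, len) == 0 && line[len] == ':') {
                   sscanf(line + len + 1, "%lld", &value);
                   break;
               }
           }
           fclose(f);
           return value;
       }

       int main(void)
       {
           /* The old heuristic the kernel commit message warns about. */
           printf("naive free+cached: %lld kB\n",
                  meminfo_kb("MemFree") + meminfo_kb("Cached"));
           /* The kernel's own estimate of what a new workload can use. */
           printf("MemAvailable:      %lld kB\n", meminfo_kb("MemAvailable"));
           return 0;
       }

    On a box with a lot of data sitting in tmpfs or a lot of reclaimable slab, the two numbers should diverge noticeably, which is exactly the situation the commit message describes.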



  • If databases are involved they usually offer some method of dumping all data to some kind of text file. Usually relying on their binary data is not recommended.

    It’s not so much about text versus binary. The problem is that a normal backup program that just treats a live database file as an ordinary file to back up is liable to have the DBMS write to the database while it’s being backed up, resulting in a backed-up file that’s a mix of old and new versions and may be corrupt.

    Either:

    1. The DBMS needs to have a way to create a dump — possibly triggered by the backup software, if it’s aware of the DBMS — that won’t change during the backup

    or:

    2. One needs filesystem-level support for grabbing an atomic snapshot (e.g. take a snapshot using something like btrfs and back up the snapshot rather than the live filesystem), which avoids the issue of the database file changing while the backup runs.

    In general, if this is a concern, I’d tend to favor #2 as an option, because it’s an all-in-one solution that deals with all of the problems of files changing while being backed up: DBMSes are just a particularly thorny example of that.

    Full disclosure: I mostly use ext4 myself, rather than btrfs. But I also don’t run live DBMSes.

    EDIT: Plus, #2 also provides consistency across different files on the filesystem, though that’s usually less critical. Like, you won’t run into a situation where software on your computer updates File A, then does a sync(), then updates File B, but your backup program grabs the new version of File B and the old version of File A. Absent help from the filesystem, your backup program won’t know where write barriers spanning different files are happening.

    In practice, that’s not usually a huge issue, since fewer software packages are going to be affected by this than by write ordering internal to a single file, but a program is permitted, under Unix filesystem semantics, to expect that that write order persists and to kerplode if it doesn’t…and a traditional backup won’t preserve it the way that a backup with help from the filesystem can.
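    To illustrate that cross-file ordering point, here’s a minimal sketch (the file names are made up for the example) of the “update File A, sync, update File B” pattern; a file-by-file backup that copies A before B can still end up holding the new B alongside the old A, while a snapshot of the whole filesystem taken at one instant can’t:

       /* ordering_demo.c: the "write A, sync, write B" pattern described
        * above.  "file_a" and "file_b" are hypothetical names. */
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>

       /* Overwrite `path` with `text`; returns 0 on success. */
       static int write_file(const char *path, const char *text)
       {
           FILE *f = fopen(path, "w");
           if (!f)
               return -1;
           fputs(text, f);
           return fclose(f);
       }

       int main(void)
       {
           /* Step 1: update File A with the new generation of data. */
           if (write_file("file_a", "generation 2\n") != 0)
               return EXIT_FAILURE;

           /* Step 2: write barrier.  Everything above is on disk before
            * anything below starts. */
           sync();

           /* Step 3: update File B, which now assumes the new File A exists. */
           if (write_file("file_b", "generation 2\n") != 0)
               return EXIT_FAILURE;

           /* A backup that copied file_a before this program ran, and file_b
            * after it finished, holds old A + new B: an ordering no program
            * ever produced on disk. */
           return EXIT_SUCCESS;
       }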



  • I think that the problem will be if software comes out that doesn’t target home PCs. That’s not impossible. I mean, that happens today with Web services. Closed-weight AI models aren’t going to be released to run on your home computer. I don’t use Office 365, but I understand that at least some of that is a cloud service.

    Like, say the developer of Video Game X says “I don’t want to target a ton of different pieces of hardware. I want to tune for a single one. I don’t want to target multiple OSes. I’m tired of people pirating my software. I can reduce cheating. I’m just going to release for a single cloud platform.”

    Nobody is going to take your hardware away. And you can probably keep running Linux or whatever. But…not all the new software you want to use may be something that you can run locally, if it isn’t released for your platform. Maybe you’ll use some kind of thin-client software — think telnet, ssh, RDP, VNC, etc. for past iterations of this — to use that software remotely on your Thinkpad. But…you can’t run it yourself.

    If it happens, I think that that’s what you’d see. More and more software would just be available only to run remotely. Phones and PCs would still exist, but they’d increasingly run a thin client, not run software locally. Same way a lot of software migrated to web services that we use with a Web browser, but with a protocol and software more aimed at low-latency, high-bandwidth use. Nobody would ban existing local software, but a lot of it would stagnate. A lot of new and exciting stuff would only be available as an online service. More and more people would buy computers that are only really suitable for use as a thin client — fewer resources, closer to a smartphone than what we conventionally think of as a computer.

    EDIT: I’d add that this is basically the scenario that the AGPL is aimed at dealing with. The concern was that people would just run open-source software as a service. They could build on that base and make their own improvements, but they’d never distribute binaries to end users, so they wouldn’t hit the traditional GPL’s obligation to release source to anyone who gets the binary. The AGPL extends that obligation to people who merely use the software over a network.


  • I will say that, realistically, in terms purely of physical distance, a lot of the world’s population is in a city and probably isn’t too far from a datacenter.

    https://calculatorshub.net/computing/fiber-latency-calculator/

    It’s about five microseconds of latency per kilometer down fiber optics, so roughly ten microseconds per kilometer for a round trip.

    I think a larger issue might be bandwidth for some applications. Like, if you want to unicast uncompressed video to every computer user, say, you’re going to need an ungodly amount of bandwidth.

    DisplayPort looks like it’s currently up to 80 Gb/sec. Okay, not everyone is currently saturating that, but if you want comparable capability, that’s what you’re going to have to be moving from a datacenter to every user. For video alone. And that’s assuming that they don’t have multiple monitors or something. (Some back-of-the-envelope arithmetic is sketched at the end of this comment.)

    I can believe that it is cheaper to have many computers in a datacenter. I am not sold that any gains will more than offset the cost of the staggering fiber rollout that this would require.

    EDIT: There are situations where it is completely reasonable to use (relatively) thin clients. That’s, well, what a lot of the Web is — browser thin clients accessing software running on remote computers. I’m typing this comment into Eternity before it gets sent to a Lemmy instance on a server in Oregon, much further away than the closest datacenter to me. That works fine.

    But “do a lot of stuff in a browser” isn’t the same thing as “eliminate the PC entirely”.
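    To put rough numbers on the latency and bandwidth points above, here’s a small back-of-the-envelope sketch; the 50 km distance and the 4K/120 Hz/10-bit display are assumptions I picked for illustration, not measurements:

       /* thin_client_numbers.c: back-of-the-envelope arithmetic for the
        * figures discussed above.  Distance and display parameters are
        * assumed example values. */
       #include <stdio.h>

       int main(void)
       {
           /* Light in fiber covers roughly 200,000 km/s: about 5 us per km
            * one way, about 10 us per km round trip. */
           const double us_per_km_one_way = 5.0;
           const double distance_km = 50.0;   /* hypothetical user-to-datacenter distance */

           printf("round-trip fiber latency for %.0f km: %.0f us\n",
                  distance_km, distance_km * us_per_km_one_way * 2.0);

           /* Uncompressed video is roughly width * height * bits-per-pixel * fps.
            * 4K at 10-bit 4:4:4 (30 bits/pixel) and 120 Hz is one plausible
            * "comparable to a local DisplayPort link" target. */
           const double width = 3840, height = 2160, bits_per_pixel = 30, fps = 120;
           const double gbit_per_sec = width * height * bits_per_pixel * fps / 1e9;

           printf("uncompressed 4K@120 10-bit video: %.1f Gb/s per user\n",
                  gbit_per_sec);
           return 0;
       }

    Even that single stream works out to roughly 30 Gb/s of raw video per user before any compression, which is the scale of per-user bandwidth a “no local computing” world would have to deliver.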









  • I’m not familiar with FreshRSS, but assuming that there’s something in the protocol that lets a reader push up a “read” bit on a per-article basis — this page references a “GReader” API — I’d assume that that’d depend on the client, not the server.

    If the client attempts an update, fails, and never retries, then I imagine that it wouldn’t work. If it retries until it gets the bit set, I’d imagine that it does work (a sketch of that retry pattern is at the end of this comment). The FreshRSS server can’t really be a factor, because it has no way of knowing that a client tried to talk to it while it was off.

    EDIT: Some of the clients in the table on the page I linked to say that they “work offline”, so I assume that the developers at least have some level of disconnected operation in mind.

    The RSS readers I’ve always used are strictly pull. They don’t set bits on the server, and any “read” flag lives only on the client.
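    To make the retry point concrete, here’s a minimal sketch of the kind of client-side loop I have in mind; push_read_flag() is a stand-in for whatever call a given client makes against its GReader-style sync API, not a real function from FreshRSS or any particular reader:

       /* read_flag_retry.c: sketch of a client that queues "mark as read"
        * updates and retries them until the server acknowledges them.  The
        * network call is faked for the example. */
       #include <stdbool.h>
       #include <stdio.h>

       #define PENDING_MAX 64

       /* Stand-in for the real sync call; returns true if the server accepted
        * the update. */
       static bool push_read_flag(const char *article_id)
       {
           printf("marking %s read... server unreachable\n", article_id);
           return false;   /* pretend the server is still offline */
       }

       int main(void)
       {
           /* Articles the user read while the server was unreachable. */
           const char *pending[PENDING_MAX] = { "article-123", "article-456" };
           int npending = 2;

           /* On each sync, retry whatever is still queued.  Items only leave
            * the queue once the server accepts them, so read state eventually
            * catches up when the server comes back. */
           for (int pass = 0; pass < 3 && npending > 0; pass++) {
               int remaining = 0;
               for (int i = 0; i < npending; i++) {
                   if (!push_read_flag(pending[i]))
                       pending[remaining++] = pending[i];   /* keep for next sync */
               }
               npending = remaining;
           }

           printf("%d update(s) still queued\n", npending);
           return 0;
       }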


  • tal@lemmy.today to Selfhosted@lemmy.world · LVM question

    Secondly, is there a benefit to creating an LVM volume with a btrfs filesystem vs just letting btrfs handle it?

    Like, btrfs on top of LVM versus btrfs directly on the disk? Well, the former gives you access to LVM features. If you want to use lvmcache or something, you’d want it on LVM.



  • Some other interesting tidbits, if you’re not familiar with them.

    Japan twice shelled locations in the contiguous US with the deck guns of submarines, though the shelling caused very limited damage.

    https://en.wikipedia.org/wiki/Bombardment_of_Ellwood

    Firing in the dark from a submarine buffeted by waves, it was inevitable that rounds would miss their target. One round passed over Wheeler’s Inn, whose owner Laurence Wheeler promptly called the Santa Barbara County Sheriff’s Office. A deputy sheriff assured him that warplanes were already on their way, but none arrived. The Japanese shells destroyed a derrick and a pump house, while the Ellwood Pier and a catwalk suffered minor damage.

    https://en.wikipedia.org/wiki/Bombardment_of_Fort_Stevens

    Most Japanese rounds landed in a nearby baseball field or a swamp, although one landed close to Battery Russell and another next to a concrete pillbox. One round damaged several large telephone cables, the only real damage that Tagami caused. A total of seventeen explosive shells were fired at the fort.[5]

    Japan also tried to start forest fires in the contiguous US by dropping incendiary bombs from a floatplane that was carried, disassembled, aboard a large submarine.

    The impact was also limited, though one bomb did start a fire that was extinguished.

    https://en.wikipedia.org/wiki/Lookout_Air_Raids

    On September 9, 1942, a Japanese Yokosuka E14Y Glen floatplane, launched from a Japanese submarine, dropped two incendiary bombs with the intention of starting a forest fire.

    Fujita dropped two bombs, one on Wheeler Ridge on Mount Emily in Oregon. The location of the other bomb is unknown. The Wheeler Ridge bomb started a small fire 16 km (9.9 mi) due east of Brookings.[6]

    The two men proceeded to the location and were able to keep the fire under control.

    Ultimately, the town and the pilot wound up ending things on pretty good terms:

    Twenty years later, Fujita was invited back to Brookings. Before he made the trip the Japanese government was assured he would not be tried as a war criminal. In Brookings, Fujita served as Grand Marshal for the local Azalea Festival.[1] At the festival, Fujita presented his family’s 400-year-old samurai sword to the city as a symbol of reconciliation. Fujita made a number of additional visits to Brookings, serving as an “informal ambassador of peace and friendship”.[7] Impressed by his welcome in the United States, in 1985 Fujita invited three students from Brookings to Japan. During the visit of the Brookings-Harbor High School students to Japan, Fujita received a dedicatory letter from an aide of President Ronald Reagan “with admiration for your kindness and generosity”. Fujita returned to Brookings in 1990, 1992, and 1995. In 1992 he planted a tree at the bomb site as a gesture of peace. In 1995, he moved the samurai sword from the Brookings City Hall into the new library’s display case. He was made an honorary citizen of Brookings several days before his death on September 30, 1997, at the age of 85.[8] In October 1998, his daughter, Yoriko Asakura, buried some of Fujita’s ashes at the bomb site.

    Japan had used biological weapons in China and also considered using them against the US, by dropping bubonic plague.

    This was ultimately not carried out. Had it been, it would have happened very late in the war, with Japan in pretty dire military straits by that point.

    https://en.wikipedia.org/wiki/Operation_PX

    Operation PX (Japanese: PX作戦, romanized: PX Sakusen), also known as Operation Cherry Blossoms at Night (夜桜作戦 Yozakura Sakusen)[1] was a planned Japanese military attack on civilians in the United States using biological weapons, devised during World War II. The proposal was for Imperial Japanese Navy submarines to launch seaplanes that would deliver weaponized bubonic plague, developed by Unit 731 of the Imperial Japanese Army, to the West Coast of the United States. The operation was abandoned shortly after its planning was finalized in March 1945 due to the strong opposition of General Yoshijirō Umezu, Chief of the Army General Staff.

    The plan for the attack involved Seiran aircraft launched by submarine aircraft carriers upon the West Coast of the United States—specifically, the cities of San Diego, Los Angeles, and San Francisco. The planes would spread weaponized bubonic plague, cholera, typhus, dengue fever, and other pathogens in a biological terror attack upon the population. The submarine crews would infect themselves and run ashore in a suicide mission.[3][4][5][6]