
Posts: 18 · Comments: 455 · Joined: 2 yr. ago

  • AI tools were apparently used for locating the bugs but the reports were real and legit.

    Yes, but the FFmpeg developers do not know this until after they triage all the bug reports they are getting swamped with. If Google really wants a fix for their 6.0 CVE immediately (because again, part of the problem here was Google's security team breathing down the necks of the maintainers), then Google can submit a fix. Until then, FFmpeg devs have to keep figuring out whether the more critical-looking issues they receive are actually critical.

    It’s nuts to suggest continuing to ship something with known vulnerabilities without, at minimum,

    Again, the problem is false-positive vulnerabilities. "9.0 CVEs" (that are potentially real) must be triaged before Google's 6.0 CVE.

    It would be great if Google could fix it, but ffmpeg is very hard to work in, not just because of the code organization but because of the very specialized knowledge needed to mess around inside a codec. It would be simpler and probably better for Google to contribute development funding since they depend on the software so heavily.

    Except Google does fix issues and contribute funding. Summer of Code, bug bounties, and other programs run by Google contribute both funding and fixes to these projects. We are mad because Google has paid for fixes to more critical issues in the past, but all of a sudden they are demanding free labor for medium-severity security issues from swamped volunteers.

    Being able to find bugs (say by fuzzing

    Fuzzing is great! But Google's Big Sleep project is GenAI-based. Fuzzing is part of its pipeline, but the inputs and outputs are not significantly distinct from the other GenAI reports that ffmpeg receives.

    Those approaches would be ridiculous bloat, the idea is just supply some kind of wrapper that runs the codec in a chrooted separate process communicating through pipes under ptrace control or however that’s done these days.

    Chroot only works on Linux/Unix and requires root, so it doesn't work in rootless environments. Every sandboxing technology comes with some form of tradeoff, and it's not ffmpeg's responsibility to make those decisions for you or your organization.

    Anyway, sandboxing on Linux is basically broken when it comes to high-value targets like Google. I don't want to go into detail, but I would recommend reading madaidan's insecurities (I mentioned gVisor earlier because gVisor is Google's own solution to flaws in existing Linux sandboxing). Another problem is that the ffmpeg people probably care about performance a lot more than security. They made that tradeoff, and if you want to undo it, it's not really their job to make that decision for you. It's not a binary anyway, more like a sliding scale, and "secure enough for Google" is not the same as "secure enough for the average desktop user".

    I saw earlier you mentioned Google keeping vulnerabilities secret and using them against people or something like that, but it just doesn't work that way lmao. Google is such a large and high-value organization that they essentially have to treat every employee as a potential threat, so "keeping vulns internal" doesn't really work. Trying to keep a vulnerability internal will 100% result in it getting leaked and then used against them.

  • It would be great if Google could fix it, but ffmpeg is very hard to work in, not just because of the code organization but because of the very specialized knowledge needed to mess around inside a codec. It would be simpler and probably better for Google to contribute development funding since they depend on the software so heavily.

    It’s nuts to suggest continuing to ship something with known vulnerabilities without, at minimum, removing it from the default build and labelling it as having known issues. If you don’t have the resources to fix the bug that’s understandable, but own up to it and tell people to be careful with that module.

    You have no fucking clue how modern software development and deployment works. Getting rid of all CVEs is actually insanely hard, something that only orgs like Google can reasonably do, and even Google regularly falls short. The vast majority of organizations and institutions have given up on eliminating CVEs from the products they use. "Don't ship software with vulnerabilities" sounds good in a vacuum, but the reality is that most people simply settle for something secure enough for their risk level. I bet you that if you go through any piece of software on your system right now, you can find CVEs in it.

    You don't need to outrun a hungry bear, you just need to outrun the person next to you. Cybersecurity is about risk management, not risk elimination. You can't afford risk elimination.

  • It might be appropriate for ffmpeg to get rid of such obscure codecs

    This is why compilation flags exist. You can compile software to exclude features, and the corresponding code is removed, decreasing the attack surface. But it's not really ffmpeg's job to tell you which compilation flags to pick; that is the responsibility of the people integrating and deploying it into their systems (Google).
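
    As a rough illustration (a sketch only; the exact set of decoders, demuxers, and parsers to enable is made up here, see ./configure --help for the real options), an integrator can build FFmpeg as a whitelist of only what they need:

    # Everything not explicitly enabled is compiled out, shrinking the attack surface.
    ./configure \
      --disable-everything \
      --enable-decoder=h264 --enable-decoder=aac \
      --enable-demuxer=mov --enable-demuxer=matroska \
      --enable-parser=h264 --enable-parser=aac \
      --enable-protocol=file
    make -j"$(nproc)"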

    Sandbox them somehow so RCE’s can’t escape from them, even at an efficiency cost

    This is similar to the above. It's not really ffmpeg's job to pick a sandboxing technology (Docker, seccomp, SELinux, k8s, Borg, gVisor, Kata); that is the responsibility of the people integrating and deploying the software.

    That's why it's irritating when these companies whine about stuff that should be handled by the above two practices, asking for immediate fixes via their security programs. Half of our frustration is them asking volunteers to promptly fix CVEs with scores below 6 (while simultaneously being willing to contribute fixes or pay for higher-scoring CVEs under their bug bounty programs). This is a very important thing to note. In further comments, you seem to be misunderstanding the relationship Google and ffmpeg have here: Google's (and other companies') security program is to apply pressure to fix the vulnerabilities promptly. This is not the same thing as "Here's a bug, fix it at your leisure". Dealing with this pressure is tiring and burns maintainers out.

    The other half is that they reveal their own security practices aren't up to par when they whine about stuff like this and demand immediate fixes. I mean, it says it in the article:

    Thus, as Mark Atwood, an open source policy expert, pointed out on Twitter, he had to keep telling Amazon to not do things that would mess up FFmpeg because, he had to keep explaining to his bosses that “They are not a vendor, there is no NDA, we have no leverage, your VP has refused to help fund them, and they could kill three major product lines tomorrow with an email. So, stop, and listen to me … ”

    Anyway, the CVE being mentioned has been fixed, if you dig into it: https://xcancel.com/FFmpeg/status/1984178359354483058#m

    But it really should have been fixed by Google, since they brought it up, because there is no real guarantee that volunteers will keep fixing things in the future; burnt-out volunteers will just quit instead. libxml2 decided to straight up stop doing responsible disclosure because they got tired of people asking them to fix vulnerabilities with free labor, and now treats all security issues as ordinary bug reports that get fixed when maintainers have the time.

    The other problem is that the report was AI generated, and part of the issue here is that ffmpeg (and curl, and a few other projects) have been swamped with false positives. These AIs generate a security report that looks plausible, maybe even with a non-working PoC. This wastes a ton of volunteer time, because maintainers have to spend a lot of time filtering through these reports and figuring out what's real and what is not.

    So of course, ffmpeg is not really going to prioritize the 6.0 CVE when they are swamped with all of these potentially real "9.0 UlTrA BaD CrItIcAl cVe" and have to figure out if any of them are real first before even doing work on them.

  • Microblog clients, which may expect Mastodon-like interfaces, do this by default.

  • I like Incus a lot, but it's not as easy to create complex virtual networks as it is with Proxmox, which is frustrating in educational/learning environments.

  • This is untrue; Proxmox is not a wrapper around libvirt. It has its own API and its own methods of running VMs.

  • Yes, this is where Docker's limitations begin to show, and people begin looking at tools like Kubernetes for things like advanced, granular control over the flow of network traffic.
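
    To give a feel for what that control looks like, here is a minimal NetworkPolicy sketch (the names and labels are made up for illustration): only pods labelled app: linkwarden may reach the database pods, everything else is dropped.

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: db-allow-linkwarden-only   # hypothetical name
    spec:
      podSelector:
        matchLabels:
          app: linkwarden-db           # the pods being protected
      policyTypes:
        - Ingress
      ingress:
        - from:
            - podSelector:
                matchLabels:
                  app: linkwarden      # the only allowed client pods

    Whether this is actually enforced depends on the cluster's network plugin supporting NetworkPolicy.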

    Because such a thing is basically impossible in Docker AFAIK. Responses like the ones you're getting tend to appear when the thing a user is attempting to do is anywhere from significantly non-trivial to basically impossible.

    An easy way around this, if you still want to use Docker, is addressing the below bit, directly:

    no isolation anymore, i.e qbit could access (or at least ping) to linkwarden’s database since they are all in the same VPN network.

    As long as you have changed the default passwords for the databases and services, and keep the services up to date, it should not be a concern that the services have network-level access to each other: without the ability to authenticate to or exploit each other, there is nothing they can do.

    If you insist on trying to get some level of network isolation between services while continuing to use Docker, your only real option is iptables* rules. This is where things get very painful, because iptables rules have no persistence by default and they are kind of a mess to deal with. Also, Docker implements its own iptables setup instead of using the standard one, which results in weird behavior like Docker containers bypassing the firewall when they expose ports.

    You will need a fairly good understanding of iptables in order to do this. In addition, if you do go down this path, I will warn you that you cannot create iptables rules based on IP addresses alone, as the IP addresses of Docker containers are ephemeral and change; you must create rules based on the hostnames of containers, which adds further complexity compared to just blocking by IP. EDIT: OR, you could give your containers static IP addresses.
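
    For illustration, a rough sketch of the static-address variant (names, images, and addresses are all made up, and whether same-bridge container-to-container traffic actually traverses these rules depends on the kernel's bridge-netfilter hook, which Docker normally sets up):

    # Give the containers fixed addresses on a user-defined network.
    docker network create --subnet 172.30.0.0/24 vpn_net
    docker run -d --name qbittorrent --network vpn_net --ip 172.30.0.10 lscr.io/linuxserver/qbittorrent
    docker run -d --name linkwarden-db --network vpn_net --ip 172.30.0.20 postgres:16

    # DOCKER-USER is the chain Docker reserves for user rules; it is evaluated
    # before Docker's own forwarding rules. This drops traffic from qbittorrent
    # to the database, and it will not survive a reboot unless you persist it.
    sudo iptables -I DOCKER-USER -s 172.30.0.10 -d 172.30.0.20 -j DROP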

    A good place to start is here. You will probably have to spend a lot of time learning all of the terminology and concepts listed here, and more. Perhaps you have better things to do with your time?

    *Um, 🤓 ackshually it's nftables, but the iptables-nft command offers a transparent compatibility layer enabling easier migrations from the older and no longer used iptables

    EDIT: And of course nobody has done this before and chatgpt isn't helpful. These are the kinds of problems where chatgpt/LLMs begin to fall apart and are completely unhelpful. Just "no, you're wrong" over and over again as you force your way through using actual expertise.

    You can block traffic to a Docker container by its hostname using iptables, but there’s an important nuance: iptables works with IP addresses, not hostnames. So you’ll first need to resolve the container’s hostname to its IP address and then apply the rule.

    You’re right—container IPs change, so matching a single IP is brittle. Here are robust, hostname-friendly ways to block a container that keep working across restarts.

    Exactly — good catch. The rule: sudo iptables -I DOCKER-USER 1 -m set --match-set blocked_containers dst -j DROP matches any packet whose destination is in that set, regardless of direction, so it also drops outgoing packets from containers to those IPs.

    You’re absolutely right on both points:

    With network_mode: "container:XYZ", there is no “between-containers” network hop. Both containers share the same network namespace (same interfaces, IPs, routing, conntrack, and iptables). There’s nothing to firewall “between” them at L3/L2—the kernel sees only one stack.

    Alright, I will confess that I didn't know this. This piece of info from chatgpt changes what you want to do from "significantly non-trivial" to "basically impossible". It means the containers do not have separate IP addresses/network stacks for you to isolate from each other; they all share a single network namespace. You would have to isolate traffic based on other factors, like the process ID or user ID, which are not really inherently tied to the container.
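
    For context, this is the kind of setup being described (a sketch; the images and service names are illustrative): one container reuses another's network namespace via network_mode.

    services:
      vpn:
        image: qmcgaw/gluetun                    # example VPN container
      qbittorrent:
        image: lscr.io/linuxserver/qbittorrent
        # Reuse the vpn service's network namespace: same interfaces, same IPs,
        # so there is no network hop between the two containers to firewall.
        network_mode: "service:vpn"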

    As a bonus:

    Docker’s ICC setting historically controls inter-container comms on bridge networks (default bridge or a user-defined bridge with enable_icc=). It doesn’t universally control every mode, and it won’t help when two containers share a netns.

    Useful for understanding terminology, I guess, but there is a class of problems these tools really struggle to solve. I like to assign problems like this to people; they will often reach for chatgpt at first, but then they get frustrated and quickly realize chatgpt is not a substitute for using your brain.

  • This is a very good writeup.

    Do you think supabase or other similar solutions also have these pitfalls?

  • Okay. This sounds very strange, but I had a similar issue with the nintendo switch pro controller and binding of isaac. I played around with antimicrox, but the real solution I found was to launch steam and leave it running. Then, my pro controller would magically work.

    I didn't have to launch the game via steam either, which is what made it even stranger to me.

  • Databases are special. They often implement their own optimizations, faster than more general system-level optimizations.

    For example: https://www.postgresql.org/docs/current/wal-intro.html

    Because WAL restores database file contents after a crash, journaled file systems are not necessary for reliable storage of the data files or WAL files. In fact, journaling overhead can reduce performance, especially if journaling causes file system data to be flushed to disk. Fortunately, data flushing during journaling can often be disabled with a file system mount option, e.g., data=writeback on a Linux ext3 file system. Journaled file systems do improve boot speed after a crash.

    I didn't see much in the docs about swap, but I wouldn't be surprised if Postgres also had memory optimizations, like its own form of in-memory compression.

    Your best bet is probably to ask someone who is familiar with the internals of postgres.

  • The cloud, and any form of managed database, inverts this. User accounts are extremely easy, as they are automatically provisioned with secrets you can easily rotate, along with the database itself. There is less of a worry about user rights as well, as you can dedicate one "instance" of a database to certain types of data, instead of having more than one database within one instance.

    And then, traffic is commonly going to be routed through untrusted networks, hence the desire for encryption in transit.

  • There are a few apps that I think fit this use case really well.

    LanguageTool is a spelling and grammar checker that has a server-client model. LibreOffice now has built-in LanguageTool integration, where it can access a server of your choosing. I make it access the server I run locally, since Arch Linux packages languagetool.
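
    If you run it by hand instead of from a distro package, the standalone download ships a small HTTP server you can point LibreOffice at (a sketch; the jar path and port are just examples):

    # Start a local LanguageTool HTTP server on port 8081.
    java -cp languagetool-server.jar org.languagetool.server.HTTPServer --port 8081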

    Another is Stirling-PDF, a really good PDF manipulation program that people like, which comes as a server with a web interface.

  • I use this for kubernetes secrets with sops. It works great.
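
    For context, the workflow is roughly this (a sketch; the file names are placeholders and the key configuration lives in your .sops.yaml):

    # Encrypt a Kubernetes Secret manifest before committing it; decrypt on apply.
    sops --encrypt secret.yaml > secret.enc.yaml
    sops --decrypt secret.enc.yaml | kubectl apply -f -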

  • I've seen three cases where the docker socket gets exposed to the container (perhaps there are more, but I haven't seen them):

    1. Watchtower, which does auto updates and/or notifies people
    2. Nextcloud AIO, which uses a management container that controls the docker socket to deploy the rest of the stuff nextcloud wants.
    3. Traefik, which reads the docker socket to automatically reverse proxy services (see the sketch after this list).
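
    All three boil down to the same pattern: the container mounts the host's Docker socket so it can talk to the daemon. A minimal compose sketch (image tag and service name are illustrative):

    services:
      traefik:
        image: traefik:v3
        ports:
          - "80:80"
        volumes:
          # Read-only mount of the host's Docker socket; Traefik watches it to
          # discover containers to proxy. Still a powerful privilege to hand out.
          - /var/run/docker.sock:/var/run/docker.sock:ro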

    Nextcloud does the AIO because Nextcloud is a complex service, and it grows to be very complex if you want more features or performance. The AIO handles deploying all the tertiary services for you, but something like this is how you would do it yourself: https://github.com/pimylifeup/compose/blob/main/nextcloud/signed/compose.yaml . Also, that example docker compose does not include other services, like Collabora Office, which is the Google Docs/Sheets/Slides alternative, a web-based office suite.

    Compare this to the Kubernetes deployment, which, yes, may look intimidating at first. But actually, many of the complexities that the Docker deploy of Nextcloud has are automated away. Enabling Collabora Office is just collabora.enabled: true in its configuration. Tertiary services like Redis or the database are included in the Kubernetes package as well. Instead of you configuring the containers yourself, it lets you configure the database parameters via YAML, and other nice things.

    For case 3, Kubernetes has a feature called an "Ingress", which is essentially a standardized configuration for a reverse proxy that you can either run separately or get provided as part of the package. For example, the Nextcloud Kubernetes package I linked above has a way to handle ingresses in its config.
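
    As a rough idea of what that configuration looks like (a sketch only; the key names are from memory and should be checked against the chart's values.yaml):

    collabora:
      enabled: true        # pulls in the Collabora Office deployment
    postgresql:
      enabled: true        # chart-managed database instead of hand-rolled containers
    redis:
      enabled: true
    ingress:
      enabled: true        # the chart's standardized reverse-proxy configuration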

    Kubernetes handles these things pretty well, and it's part of why I switched. I do auto-upgrade, but only my services, and only within the supported stable release, which is compatible with auto upgrades and won't break anything. This lets me get automatic security updates for a period of time before having to do a manual and potentially breaking upgrade.

    TLDR: You are asking questions that Kubernetes has answers to.

  • Try the yaml language server by Red Hat, it comes with a docker compose validator.
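
    One way to wire it up (hedged: the inline-schema comment and the compose-spec schema URL are as I remember them, so double-check the extension's docs) is to point a compose file at the schema explicitly:

    # yaml-language-server: $schema=https://raw.githubusercontent.com/compose-spec/compose-spec/master/schema/compose-spec.json
    services:
      web:
        image: nginx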

    But in general, off the top of my head, dashes = list. No dashes is a dictionary.

    So this is a list:

    thing:
        - 1
        - 2

    And this is a dictionary:

    dict:
        key1: value1
        key2: value2

    And then they can be combined into a list of dictionaries:

    listofdicts:
        - key1dict1: value1dict1
        - key1dict2: value1dict2
          key2dict2: value2dict2

    And another thing to note is that YAML will convert things like this into a string. So if you have ports with an entry 8080:80, that entry is read as a string, which is a clue that this is a string in a list rather than a dictionary.
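
    For example, in a compose-style snippet (quoting the mapping is my own habit, to sidestep YAML's base-60 integer quirk with values like 22:22):

    ports:
      - "8080:80"   # a list containing one string, not a dictionary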

  • The amazon appstore had this crazy setup where you could get microtransactions in certain games without spending any real money. I must have spent over $1000 on jetpack joyride. I unlocked everything.