• 0 Posts
Joined 1 year ago
Cake day: June 17th, 2023


  • No, it’s built into the protocol: think of it as if every HTTP request forced you to attach a tiny additional box containing the solution to a math puzzle.

    The twist is that you want the math puzzle to be easy to create and verify, but hard to compute. The harder the puzzle you solve, the more you get prioritized by the service that sent you the puzzle.

    If your puzzle is cheaper to create than your service is to host, then it’s much harder to DDoS you, since attackers get stuck at the puzzle rather than reaching your expensive service.
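
    A minimal sketch of such a puzzle, assuming a hashcash-style scheme (find a nonce whose SHA-256 hash starts with a given number of zero bits). The function names and the 16-byte challenge are my own choices for illustration, not taken from any specific protocol:

```python
import hashlib
import os


def make_challenge() -> bytes:
    """Server side: creating a puzzle is just drawing random bytes."""
    return os.urandom(16)


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check a solution."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: brute force, about 2**difficulty hashes on average."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=12)
    print(nonce, verify(challenge, nonce, difficulty=12))
```

    Creating and verifying cost one random draw and one hash, while solving costs exponentially more as the difficulty goes up, which is the asymmetry described above.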

  • Standard lossless compression (without further assumptions) is already very close to optimal: at some point the pure entropy of these huge datasets just cannot be compressed any further.

    The most likely savior in this case would be procedural rendering (i.e. instead of storing textures and meshes, you store a function that deterministically generates the meshes and textures). These are already starting to become popular due to better engine support, but pose a huge challenge from a design POV (the nice, e.g. Blender-esque, interfaces don’t really translate well to this kind of process).
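
    As a toy illustration of "store the function, not the data": the sketch below rebuilds a grey-scale texture from nothing but a seed. The hash-based noise and the function names are assumptions made for the example; real engines use far nicer generators.

```python
import hashlib


def texel(seed: int, x: int, y: int) -> int:
    """Deterministic grey value (0-255) for one pixel, derived from a hash."""
    digest = hashlib.sha256(f"{seed}:{x}:{y}".encode()).digest()
    return digest[0]


def generate_texture(seed: int, width: int, height: int) -> list[list[int]]:
    """Rebuild the whole texture from the seed; no pixel data is stored."""
    return [[texel(seed, x, y) for x in range(width)] for y in range(height)]
```

    Shipping the seed and the generator costs a few bytes regardless of the resolution you later render at, which is where the storage saving comes from.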

  • ZickZack@kbin.social to Technology@lemmy.world • *Permanently Deleted*
    1 year ago

    They will make it open source, just tremendously complicated and expensive to comply with.
    In general, if you see a group proposing regulations, it’s usually to cement their own positions: e.g. openai is a frontrunner in ML for the masses, but doesn’t really have a technical edge against anyone else, therefore they run to congress to “please regulate us”.
    Regulatory compliance is always expensive and difficult, which means it favors people that already have money and systems running right now.

    There are so many ways this can be broken, intentionally or unintentionally. It’s also a great way to identify e.g. government critics in order to shut them down (e.g. if you are Chinese and everything is uniquely tagged to you: would you write about Tiananmen Square?), or to get monopolies on (dis)information.
    This is not literally trying to force everyone to get a license for producing creative or factual work, but it’s very close, since you can easily discriminate against any creative or factual sources you find unwanted.

    In short, even if this is an absolutely flawless, perfect implementation of what they want to do, it will have catastrophic consequences.

  • Everything using the ActivityPub standard has open likes (see https://www.w3.org/TR/2018/REC-activitypub-20180123/ for the standard), and logically it makes sense to do this to allow for verification of “likes”:
    If you did not do that, a malicious instance could much more easily just shove a bunch of likes onto another instance’s post, while, if you have “like authors” it’s much easier to do like moderation.
    Effectively ActivityPub treats all interactions like comments, where you have a “from” and “to” field just like email does (just imagine you could send messages without having an originator: email would have unusable levels of spam and harassment).
    Specifically, here is an example of a simple activity:

    POST /outbox/ HTTP/1.1
    Host: dustycloud.org
    Authorization: Bearer XXXXXXXXXXX
    Content-Type: application/ld+json; profile="https://www.w3.org/ns/activitystreams"

    {
      "@context": ["https://www.w3.org/ns/activitystreams",
                   {"@language": "en"}],
      "type": "Like",
      "actor": "https://dustycloud.org/chris/",
      "name": "Chris liked 'Minimal ActivityPub update client'",
      "object": "https://rhiaro.co.uk/2016/05/minimal-activitypub",
      "to": ["https://rhiaro.co.uk/#amy"],
      "cc": "https://e14n.com/evan"
    }

    As you can see, this has a very “email like” structure with a sender, receiver, and content. The difference is mostly that you can also publish a “type” that allows for more complex interactions (e.g. if the type is comment, then Lemmy knows to put it into the comments, if the type is like, it knows to put it with the likes, etc…).
    The actual protocol is a little more complex, but if you replace “ActivityPub” with “typed email” you are correct 99% of the time.
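
    To make the “typed email” picture concrete, here is a rough sketch of how a receiving service could route an incoming activity by its “type” field. The handler names and the routing table are hypothetical and not how Lemmy or kbin actually implement it:

```python
import json

RAW_ACTIVITY = """
{
  "@context": "https://www.w3.org/ns/activitystreams",
  "type": "Like",
  "actor": "https://dustycloud.org/chris/",
  "object": "https://rhiaro.co.uk/2016/05/minimal-activitypub",
  "to": ["https://rhiaro.co.uk/#amy"]
}
"""


def handle_like(activity: dict) -> str:
    """Record who liked what; the actor makes the like attributable."""
    return f"{activity['actor']} liked {activity['object']}"


def handle_unknown(activity: dict) -> str:
    """Fallback for types this toy service does not understand."""
    return f"ignored activity of type {activity.get('type')!r}"


# Hypothetical routing table: one handler per activity type.
HANDLERS = {"Like": handle_like}


def dispatch(raw: str) -> str:
    """Parse an activity and hand it to the handler for its type."""
    activity = json.loads(raw)
    if "actor" not in activity:
        # Like an email without a sender: reject it outright.
        raise ValueError("activity without an originator")
    handler = HANDLERS.get(activity.get("type"), handle_unknown)
    return handler(activity)
```

    Because every activity carries an actor, a like can always be traced back to an account on some instance, which is what makes moderating likes possible.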

    The different services, like lemmy, kbin, mastodon, or peertube are now just specific instantiations of this standard. E.g. a “like” might have slightly different effects on different services (hence also the confusion with “boosting” vs “liking” on kbin)

  • It really depends on what you want: I really like obsidian which is cross-platform and uses basically vanilla markdown which makes it easy to switch should this project go down in flames (there are also plugins that add additional syntax which may not be portable, but that’s as expected).

    There’s also logseq which has much more bespoke syntax (major extensions to markdown), but is also OSS meaning there’s no real danger of it suddenly vanishing from one day to the next.
    Specifically Logseq is much heavier than obsidian both in the app itself and the features it adds to markdown, while obsidian is much more “markdown++” with a significant part of the “++” coming from plugins.

    In my experience logseq is really nice for short-term note taking (e.g. lists, reminders, etc) and obsidian is much nicer for long-term notes.

    Some people also like notion, but I never got into that: it requires much more structure ahead of time and is very locked down (it also obviously isn’t self-hosted). I can see notion being really nice for people that want less general note-taking and more custom “forms” to fill out (e.g. traveling checklists, production planning, etc…).

    Personally, I would always go with obsidian, just for the peace of mind that the markdown plays well with other markdown editors, which is important for me if I want a long-running knowledge base.
    Unfortunately I cannot tell you anything with regards to collaboration, since I do not use that feature in any note-taking system.

  • For example, if you had an 8-bit integer represented by a bunch of qbits in a superposition of states, it would have every possible value from 0-255 and could be computed with as though it were every possible value at once until it is observed, the probability wave collapses, and a finite value emerges. Is this not the case?

    Not really, or at least it’s not a good way of thinking about it. Imagine it more like rigging coin tosses: You don’t have every single configuration at the same time, but rather you have a joint probability over all bits which get altered to produce certain useful distributions.
    To get something out, you then make a measurement that returns the correct result with a certain probability (i.e. it’s a probabilistic Turing machine rather than a nondeterministic one).
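
    A small classical sketch of that picture, assuming an ideal two-qbit Bell state (the state and the helper names are chosen just for illustration): the state is a joint distribution over bit strings, and each measurement returns exactly one of them.

```python
import math
import random
from collections import Counter

# Amplitudes for the two-qbit basis states |00>, |01>, |10>, |11>.
# This is a Bell state: the two bits are perfectly correlated.
AMPLITUDES = {
    "00": 1 / math.sqrt(2),
    "01": 0.0,
    "10": 0.0,
    "11": 1 / math.sqrt(2),
}


def probabilities(amplitudes: dict[str, complex]) -> dict[str, float]:
    """Born rule: the probability of an outcome is |amplitude|**2."""
    return {state: abs(a) ** 2 for state, a in amplitudes.items()}


def measure(amplitudes: dict[str, complex], rng: random.Random) -> str:
    """One measurement returns a single bit string, not all of them."""
    probs = probabilities(amplitudes)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(Counter(measure(AMPLITUDES, rng) for _ in range(1000)))
```

    You never read out “00 and 11 at once”: you only ever see samples, and the algorithm design problem is shaping the amplitudes so that the useful answer is the likely sample.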

    This can be very useful since sampling from a distribution can sometimes be much nicer than actually solving a problem (e.g. you replace a solver with a simulator of the output).
    In traditional computing this can also be done but that gives you the fundamental problem of sampling from very complex probability distributions which involves approximating usually intractable integrals.

    However, there are also massive limitations to the type of things a quantum computer can model in this way since quantum theory is inherently linear (i.e. no climate modelling regardless of how often people claim they want to do it).
    There’s also the question of how many things exist where it is more efficient to build such a distribution and sample from it, rather than having a direct solver.
    If you look at the classic quantum algorithms (e.g. https://en.wikipedia.org/wiki/Quantum_algorithm; this is of course not an exhaustive list, but it gives a pretty good overview), you can see that there aren’t really that many algorithms out there where it makes sense to use quantum computing. Pretty much all of them are asymptotically only barely faster than, or the same speed as, classical ones, and most of them rely on the problem you are looking at being a black-box one.
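
    Grover’s search is a typical example of such a black-box algorithm with a modest (quadratic) speed-up. The sketch below simulates it classically on an ideal, noise-free state vector; the function names are mine, and the simulation itself of course gains nothing over a plain loop:

```python
import math


def grover_success_probability(n_items: int, marked: int, iterations: int) -> float:
    """Simulate Grover's search on an ideal, noise-free state vector."""
    amplitudes = [1 / math.sqrt(n_items)] * n_items  # uniform superposition
    for _ in range(iterations):
        # Oracle (the black box): flip the sign of the marked entry.
        amplitudes[marked] = -amplitudes[marked]
        # Diffusion step: reflect every amplitude about the mean.
        mean = sum(amplitudes) / n_items
        amplitudes = [2 * mean - a for a in amplitudes]
    return amplitudes[marked] ** 2


def optimal_iterations(n_items: int) -> int:
    """Roughly (pi / 4) * sqrt(N) oracle calls, versus about N / 2 classically."""
    return math.floor(math.pi / 4 * math.sqrt(n_items))
```

    For 16 entries this gives 3 oracle calls instead of an expected 8 or so classical lookups, and even then the measurement only returns the marked entry with high probability rather than with certainty.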

    Remember that one of the largest useful problems that was ever solved on a quantum computer up until now was factoring the number 21 with a specialised version of Shor’s algorithm that only works for that number (since the full Shor’s algorithm would need many orders of magnitude more qbits than exist on the entire planet).

    There’s also the problem of logical vs physical qbits: In computer science we like to work with “perfect” qbits that are mathematically ideal, i.e. completely noise free. However, physical qbits are really fragile and react to pretty much anything and everything, which adds a lot of noise to the system. This problem also gets worse the larger you scale your system.

    The latter is a fundamental problem: the whole point of quantum computers is that you can combine random states to “virtually” build a complex distribution before you sample from it. This can be much faster since the virtual model can capture dependencies that are intractable to work with on a classical system, but that dependency monster also means that any noise in the system is going to negatively affect everything else as you scale up to more qbits.
    That’s why people expect real quantum computers to have many orders of magnitude more qbits than you would theoretically need.

    It also means that you cannot trivially scale up a physical quantum algorithm: a physical Grover’s on a list with 10 entries might look very different from a physical Grover’s on a list with 11 entries.
    This makes quantum computing a nonstarter for many problems where you cannot pay the time it takes to engineer a custom solution.
    And even worse: you cannot even test whether your fancy new algorithm works in a simulator, since the stuff you are trying to simulate is specifically the intractable quantum noise (something which, ironically, a quantum computer is excellent at simulating).

    In general you should be really careful when looking at quantum computing articles, since it’s very easy to build some weird distribution that is basically impossible for a normal computer to work with, but that doesn’t mean it’s something practical. E.g. just starting the quantum computer, “booping” one bit, then waiting for 3ns will give you a quantum noise distribution that is intractable to simulate with a classical computer. The same is true if you don’t do anything with the quantum computer at all: there are literal research teams of top scientists whose job boils down to “what are quantum computers computing if we don’t give them instructions”.

    Meanwhile, the progress of classical or e.g. hybrid analog computing is much faster than that of quantum computing, which means that the only people really deeply invested in quantum computing are the ones that cannot afford to miss out, just in case there is in fact something:

    • finance
    • defence
    • security

  • Peertube is inherently very scalable with relatively little cost due to an artifact of all social media platforms: Most of the traffic is driven by a tiny amount of videos/magazines/etc…

    For services like youtube, you can use this as a way to quickly cache data close to the place it’s going to be streamed: e.g. Netflix works with ISPs to install small servers at their locations to lessen the burden on their (and the ISPs) systems.
    But with centralised systems you can only push this so far since ultimately everything is still concentrated at one central location.

    Hypothetically, if you could stop this super-linear scaling in the number of users (you need to pay per user plus the overhead generated from managing them at scale), you could easily compete against the likes of youtube simply because, at sufficient scale, all the other effects get amortized away.

    Peertube does exactly this by serving the videos as webtorrents: essentially this means that for every “chunk” of a video you download, you also host that chunk for other people to download. That means that peertube itself theoretically only has to host every unique video once (or less than once, since the chunks are in the network for a while), meaning you rid yourself of the curse of linear scaling in the number of users and only scale sub-linearly with the number of unique videos (how sub-linear depends on the lifetime of your individual torrents, i.e. how long a single video chunk stays available for others).
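
    A back-of-the-envelope model of that difference, as a sketch only: `peer_hit_rate` is a made-up parameter standing in for how often another viewer still holds the chunk you need, and real swarms are much messier.

```python
def central_upload_bytes(viewers: int, video_bytes: int) -> int:
    """Centralised hosting: the server sends the full video to every viewer."""
    return viewers * video_bytes


def swarm_upload_bytes(viewers: int, video_bytes: int, peer_hit_rate: float) -> float:
    """Swarm hosting: the server seeds the video once and covers peer misses.

    peer_hit_rate is a hypothetical parameter: the fraction of chunk requests
    that other viewers can serve because they still hold the chunk.
    """
    if viewers == 0:
        return 0.0
    misses = (viewers - 1) * (1.0 - peer_hit_rate)
    return video_bytes * (1.0 + misses)
```

    With a hit rate of 1.0 the instance uploads each video exactly once no matter how many people watch it; with a hit rate of 0.0 the model falls back to the centralised cost.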

    The costs that remain for every peertube instance are essentially the file hosting costs (and encoding the video, but that also only scales with the number of videos and could be pushed onto the uploader using WASM video encoders).
    Storage itself isn’t cheap, but also not ungodly expensive (especially since you can amortize the costs over a long time as your platform grows, with storage prices in continual massive decline).

    Platforms like Netflix and youtube cannot do this because

    1. Netflix is a paid service, and people don’t want to do the hosting job for netflix after having already paid for the service
    2. Youtube has to serve ads, which is incompatible with the “users host the content” method

    In general, torrenting is a highly reliable and well-tested method that scales fantastically well to large data needs (it quite literally becomes more efficient the more people use it).