I don't understand.
If someone writes a reddit post and says "I'm fasting for Ramadan," can I not infer from that public post that the user is probably Muslim?
To be precise, the "lossless" mode is still a compression algorithm. They just didn't implement the step that actually makes it lossless.
From the write up:
JBIG2, the image format used in the affected PDFs, has both lossless and lossy operation modes. "Pattern Matching & Substitution" (PM&S) is one of the standard operation modes for lossy JBIG2, and "Soft Pattern Matching" (SPM) for lossless JBIG2 (see the paper by Paul Howard et al.). In the JBIG2 standard, these techniques are called "Symbol Matching".
PM&S is lossy, SPM is lossless. Both operation modes share the same basics: images are cut into small segments, which are grouped by similarity. For every group, only a representative segment is saved, and it gets reused in place of the other group members, which may cause character substitution. Unlike PM&S, SPM corrects such errors by additionally saving difference images that capture how the reused symbols differ from the original image. This correction step seems to have been left out by Xerox.
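To make that concrete, here's a toy sketch of symbol matching in Python (my own simplification, not the actual JBIG2 codec): with spm=False you get PM&S-style substitution, which is effectively the Xerox failure mode; with spm=True the stored difference images make decoding exact.

```python
import numpy as np

def compress(tiles, threshold=2, spm=True):
    """Toy symbol matching over small binary tiles.
    spm=True  -> SPM: also store difference images (lossless).
    spm=False -> PM&S: substitution only (lossy, the Xerox failure mode)."""
    reps, refs, residuals = [], [], []
    for tile in tiles:
        # Reuse an existing representative if it's "close enough".
        match = next((i for i, r in enumerate(reps)
                      if np.count_nonzero(r != tile) <= threshold), None)
        if match is None:
            reps.append(tile.copy())
            match = len(reps) - 1
        refs.append(match)
        residuals.append(tile ^ reps[match] if spm else None)
    return reps, refs, residuals

def decompress(reps, refs, residuals):
    # Without residuals, a "6" tile grouped with an "8" decodes as an "8".
    return [reps[i] ^ res if res is not None else reps[i].copy()
            for i, res in zip(refs, residuals)]
```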
Why does this image look like an AI-generated screenshot? The letter spacing and weights are all wrong.
He's written up his findings in English, for anyone who prefers English over German or text over video.
But basically the JBIG2 image compression algorithm used in those scanners looked for certain repeating patterns, and incorrectly compressed certain portions of the image into "close enough" blocks of pixels. Unfortunately, that meant that scanned number data wasn't guaranteed to be accurate, even when the decoded output clearly looked like a number with no distortion or noise.
It's worth the full read.
They prosecuted and convicted a guy under the CFAA for figuring out the URL schema for an AT&T website designed to be accessed by the iPad when it first launched, and then just visiting that site by trying every URL in a script. And then his lawyer (the foremost expert on the CFAA) got his conviction overturned:
https://www.eff.org/cases/us-v-auernheimer
We have to maintain that fight, to make sure that the legal system doesn't criminalize normal computer tinkering, like using scripts or even browser settings in ways that site owners don't approve of.
That doesn’t logically follow so no, that would not make an ad blocker unauthorized under the CFAA.
The CFAA also criminalizes "exceeding authorized access" in every place it criminalizes accessing without authorization. My position is that mere permission (in a colloquial sense, not necessarily technical IT permissions) isn't enough to define authorization. Social expectations and even contractual restrictions shouldn't be enough to define "authorization" in this criminal statute.
To purposefully circumvent that access would be considered unauthorized.
Even as a normal non-bot user who sees the cloudflare landing page because they're on a VPN or happen to share an IP address with someone who was abusing the network? No, circumventing those gatekeeping functions is no different than circumventing a paywall on a newspaper website by deleting cookies or something. Or using a VPN or relay to get around rate limiting.
The idea of criminalizing scrapers or scripts would be a policy disaster.
gaining unauthorized access to a computer system
And my point is that defining "unauthorized" to include visitors using unauthorized tools/methods to access a publicly visible resource would be a policy disaster.
If I put a banner on my site that says "by visiting my site you agree not to modify the scripts or ads displayed on the site," does that make my visit with an ad blocker "unauthorized" under the CFAA? I think the answer should obviously be "no," and that the way to define "authorization" is whether the website puts up some kind of login/authentication mechanism to block or allow specific users, not to put a simple request to the visiting public to please respect the rules of the site.
To me, a robots.txt is more like a friendly request to unauthenticated visitors than it is a technical implementation of some kind of authentication mechanism.
Scraping isn't hacking. I agree with the Third Circuit and the EFF: If the website owner makes a resource available to visitors without authentication, then accessing those resources isn't a crime, even if the website owner didn't intend for site visitors to use that specific method.
Fuck that. I don't need prosecutors and the courts to rule that accessing publicly available information in a way that the website owner doesn't want is literally a crime. That logic would extend to ad blockers and editing HTML/js in an "inspect element" tag.
How does this compare to Maia, which is a similar project for an engine that's supposed to play more like a human?
The value a thing creates is only part of whether the investment into it is worth it.
It's entirely possible that all of the money that is going into the AI bubble will create value that will ultimately benefit someone else, and that those who initially invested in it will have nothing to show for it.
In the late 90's, U.S. regulatory reform around telecom prepared everyone for an explosion of investment in hard infrastructure assets around telecommunications: cell phones were starting to become a thing, consumer internet held a ton of promise. So telecom companies started digging trenches and laying fiber, at enormous expense to themselves. Most ended up in bankruptcy, and the actual assets eventually became owned by those who later bought those assets for pennies on the dollar, in bankruptcy auctions.
Some companies owned fiber routes that they didn't even bother using, and in the early 2000's there was a shitload of dark fiber scattered throughout the United States. Eventually the bandwidth needs of near universal broadband gave that old fiber some use. But the companies that built it had already collapsed.
If today's AI companies can't actually turn a profit, they're going to be forced to sell off their expensive assets at some point. Maybe someone else can make money with them. But the life cycle of this tech is much shorter than the telecom infrastructure I was describing earlier, so a stale LLM might very well become worthless within years. Or it's only a stepping stone toward a distilled model that costs a fraction as much to run.
So I'm not seeing a compelling case for investing in AI today. Even if you agree that it will provide value, it doesn't make sense to invest $10 to get $1 of value.
Intel is best thought of as two businesses, where their historical dominance in one (actually fabricating semiconductors) protected their dominance in another (designing logic chips), despite not actually being the best at that.
Intel's fabs represented the cutting edge in semiconductor manufacturing, and their superiority in that business almost killed AMD, who just couldn't keep up. Eventually, AMD decided they wouldn't try to keep up with cutting edge semiconductor manufacturing, and spun off their fabs as an independent company called Global Foundries in 2009.
But Intel hit a wall in semiconductor manufacturing, making very slow progress with a new type of transistor known as the finFET, with lots of roadblocks and challenges. The biggest delays came around Intel's 10nm process, where they never got yields to where they should have been, while other foundries like Samsung and TSMC surpassed them. And so their actual CPU business suffered, because AMD, now a fabless chip designer, could go all in on TSMC's more advanced processes. Because they were fabless, AMD also pioneered advanced packaging for "chiplet" designs: different pieces of silicon connected so that they act like a single chip, but where each component is small enough that imperfections don't hurt yield as badly, and where cheap and expensive processes can each be matched to the parts of the "chip" that actually need the performance and precision.
Meanwhile, Apple was competing with Qualcomm and Samsung in the mobile System on a Chip (SoC) systems for phones, and developed its own silicon expertise. Eventually, they were able to scale up performance (with TSMC's help) to make a competitive laptop chip based on the principles of their mobile chip design (and then eventually desktop chips). That allowed them to stop buying Intel chips, and switch to their own designs, manufactured by TSMC. Qualcomm is also attempting to get into the laptop/small PC market by scaling up their mobile chip designs, also manufactured by TSMC.
Intel can get things right if it catches up with or surpasses TSMC in the next paradigm of semiconductor manufacturing. Transistors are changing from finFET (where TSMC has utter dominance) to GAAFET (where Intel, TSMC, and Samsung are all jockeying for position), and manufacturers are trying out backside power delivery (where the transistors are powered from underneath rather than from the cluttered top side). Intel has basically gone all in on their 18A process, and in a sense it's a bit of a clean slate in their competition with TSMC (and to a lesser degree Samsung, and a new company out of Japan named Rapidus), and possibly even with Chinese companies like SMIC.
But there are negative external signs. Intel has acknowledged that they don't have a lot of outside customers signing up for foundry services, so they're not exactly poaching any clients from TSMC. If that's happening while TSMC is making absurd profits, it suggests that the potential clients who have seen Intel's tech under NDA believe Intel is falling further behind TSMC. At that point, Intel will struggle to compete on logic chips (CPUs against AMD and Apple and maybe Qualcomm, discrete GPUs against AMD and NVIDIA), if they're all just paying TSMC to make the chips for them.
So I don't think all of their layoffs make a ton of sense, but I understand that they're really trying to retake the lead on fabrication, with everything else a lesser priority.
Apple's discounting strategy is generally to sell last year's model, sometimes the model before that, with roughly $200 discounts for each year since its release. They sometimes release a lower spec model (the 16e is the current example, prior SE models or even the mini models from previous generations were part of this strategy as well) and that sometimes means the 2-year-old model isn't kept available as long.
That's where their 5-7 year support window really shines, in that they can just sell older models as discounted models, knowing that the new owner will still get 3-5 years of support.
The other thing is that the used market for iPhones is pretty robust. I can go buy used phones that are 3 or 4 years old and still get a good 1-4 years of additional support. At least in the U.S., if you told me my budget for a phone was gonna be $300 for the next 2 years, I think I'd probably buy a used iPhone.
As it currently stands, I'm still on Pixels on a 2 year cycle, but I also know that my "sell used to offset the price of my new phone" strategy also would be much cheaper if I did it with iPhones instead of Pixels.
The sun loses 130 billion tons of matter in solar wind every day.
But how much can be caught?
From the sun, the angular diameter of the earth (12,756 km wide, 149,000,000 km away) is something like 0.004905 degrees (or 0.294 arc minutes or 17.66 arc seconds).
Imagining a circle the size of the earth, at the distance of the earth, catching all of the solar wind, we're still looking at something that is about 127.8 × 10^6 square kilometers. A sphere with a radius of the Earth's average distance from the sun would be about 279.0 × 10^15 square km in total surface area. So, oversimplifying with the assumption that the solar wind is uniformly distributed, an earth-sized solar wind catcher would only catch about 4.58 × 10^−10 of the solar wind.
Taking your 130 billion tons number, that means this earth-sized solar wind catcher could catch about 59.5 tons of matter per day, almost all of it hydrogen and helium, with even the heavier elements present tending to be light ones near the top of the periodic table. Even if we could theoretically use all of it, would that truly be enough to meet humanity's mining needs?
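For anyone who wants to double-check the arithmetic, here's the whole back-of-the-envelope calculation in Python (same rounded inputs as above):

```python
import math

earth_diameter_km = 12_756
au_km = 149_000_000
solar_wind_tons_per_day = 130e9

# Angular diameter of Earth as seen from the Sun (small-angle approximation).
ang_deg = math.degrees(earth_diameter_km / au_km)       # ~0.004905 degrees

disc_area = math.pi * (earth_diameter_km / 2) ** 2      # ~127.8 x 10^6 km^2
sphere_area = 4 * math.pi * au_km ** 2                  # ~279.0 x 10^15 km^2
fraction = disc_area / sphere_area                      # ~4.58 x 10^-10

print(solar_wind_tons_per_day * fraction)               # ~59.5 tons/day
```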
It's like the relationship between mathematics and accounting. Sure, almost everything accountants do involves math in some way, but it's relatively simple math, a tiny subset of what all of mathematics is about, and the actual study of math doesn't really touch on the principles of accounting.
Computer science is a theoretical discipline that can be studied without computers. It's about complexity theory and algorithms and data structures and the mathematical/logical foundations of computing. Actual practical programming work doesn't really touch on that, although many people are aware of those concepts and might keep them in the back of their mind while coding.
People who get downvoted a lot end up with a ‘low reputation’ indicator next to their name. You’ll know it when you see it.
Upvotes in meme communities do not add to reputation.
I think any kind of reputation score should be community specific. There are users whose commenting style fits one community but not another, and their overall reputation should be understood in the context of which communities actually like them rather than some kind of global average.
What did he order at McDonald's?
Gears have powerbands, CVTs are always in the sweet spot.
Isn't that basically true of automatics with 8+ gears, too?
Who's in the overlap of the Venn diagram between "uses some kind of custom OS on their phone where the camera app doesn't automatically read QR codes" and "doesn't know how to install or use software that can read QR codes"?
I don't have a phone that can scan QR codes.
QR codes are just a plain-text encoding scheme. If you can screenshot one, you have access to FOSS software that can decode it, and you can paste the decoded URL into your browser.
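For example, a minimal sketch using the FOSS zbar bindings (assumes pyzbar, Pillow, and the zbar library are installed; the filename is just a placeholder):

```python
from PIL import Image
from pyzbar.pyzbar import decode

# Decode any QR codes / barcodes found in the screenshot.
for symbol in decode(Image.open("screenshot.png")):
    print(symbol.type, symbol.data.decode("utf-8"))
```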
What counts as an algorithm? Surely it can't be the actual definition of algorithm.
Because in most forum software (even the older stuff that predates reddit or social media) if I just click on a username, that fetches from the database every comment that the user has ever made, usually sorted in reverse chronological order. That technically fits the definition of an algorithm, and presents that user's authored content in a manner that correlates the comments with the same user, regardless of where it originally appeared (in specific threads).
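The whole "algorithm" amounts to something like this (a sketch against a hypothetical SQLite schema, not any actual forum's code):

```python
import sqlite3

# Hypothetical schema: comments(username, community, body, created_at).
conn = sqlite3.connect("forum.db")
history = conn.execute(
    """SELECT community, body, created_at
         FROM comments
        WHERE username = ?
        ORDER BY created_at DESC""",
    ("some_user",),
).fetchall()
```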
So if it generates a webpage that shows the person once made a comment in a cooking subreddit that says "I'm a Muslim and I love the halal version" next to a comment posted to a college admissions subreddit that says "I graduated from Harvard in 2019" next to a comment posted to a gardening subreddit that says "I live in Berlin," does reddit violate the GDPR by assembling this information all in one place?