I choose the books I buy from Anna's Archive.
I choose the comics I buy from readComicsOnline.
I choose the [European] graphic novels I buy from #WONTTELL.
And I am one of the best customers of these 3 physical shops, in my town.
So sure, I don't buy the latest trends based on ads. I investigate a lot to buy GREAT stuff. Sometimes the shopkeeper has a headache finding the obscure stuff I discovered online that NOBODY knows exists.
Am I an exception?
I don't know, but those services are great for maintaining freedom of choice.
Many years ago, I was involved in a movie release group. Pretty much everybody in that group owned more VHSs/DVDs than the typical person. This is probably not surprising, since the time and effort one needs to put into that is rather large.
Those who only downloaded were more of a mixed bag; some of them were not in the US and might not be able to see a domestic release of the movies any time soon. Some proudly claimed that they never bought any media because paying for it when you could pirate was for losers.
I spent a small fortune on a record collection. Then the record format was abandoned and it was all CDs. I spent a small fortune re-buying that same record collection, insofar as the records were even available as CDs. Then we went all digital (yes, I know CDs were already digital) and it became MP3s. So I ripped my CD collection and consigned the discs to a box in my attic. I will not be spending money on Spotify or whatever other service to listen to stuff that I already have.
Movies... I spent a small fortune on a movie collection. Then I moved countries and to my surprise found that my movies wouldn't play anymore. So I ripped the DVDs to digital media and played them using open source software. This saved a small fortune and was more convenient as well. I think I still have the DVDs.
I spent a large fortune on books. Thousands of them. Typically read once, a much smaller number read multiple times. So I gave away my books, except for a few hundred that I still keep. I support the authors that I like by buying their books but I read on screens not on paper because my eyesight sucks and on screens I can set the font to whatever I want rather than to what the publisher thought was optimal.
There is no way the media companies are going to guilt-trip me over any of this; besides, I have read both Janis Ian's and Courtney Love's pieces on the recording industry.
Copyright is great: it has enabled lots of people to earn a living creating content. But it has also become a weapon in an ever more absurd war between consumers and middlemen, with the producers caught in an uncomfortable position in the background.
What's interesting is that the middlemen brought this all on themselves: they equated buying a physical copy of a production with licensing IP, but the general public didn't think that way at all: they bought a book, they bought a record, they bought a movie. And passing on what you've bought when you no longer need it was and still is such an ingrained part of our culture that it felt really weird to have restrictions placed on what you could do with stuff you bought and paid for. So when the format changed from physical to nothing (bits) plenty of people felt that this was not quite what we had agreed to, after all we were paying for the medium as much as we were paying for the content so how come we paid the same or even more as before? And now we paid and got something that we could no longer share with others. No way to easily pass that e-book to someone else (talk about malicious compliance), no way to send the song you just paid for through Spotify or iTunes to someone else to let them hear it after you are done with it. You don't own the medium any more so therefore you own nothing at all.
And those publishers and movie producers are laughing all the way to the bank whilst doing nothing at all except for playing bank.
I can't even reliably pay for a second copy of an ebook for friends. They literally won't take your money for cross-region sales or whatever due to asinine market restrictions.
The French comic pirate scene has an interesting rule where they keep a ~6 month time lag on what they release. The scene is small enough that the rule generally works.
It's a really good trade-off. I would never have gotten into these comics without piracy but now if something catches my eye, I don't mind buying on release (and stripping the DRM for personal use).
Most of my downloading is closer to collecting/hoarding/cataloguing behaviour but if I fully read something I enjoy, I'll support the author in some way.
Similar. Anna's Archive has become a more convenient alternative to the campus library. I can grab something while at home, get the info I need, and delete. If the title is worthwhile, I'll buy my own copy. I don't buy more books than I did before, but my satisfaction rate is higher, since I can check the contents before buying.
On the other hand, I buy way more movies than I used to, because upload sites have exposed me to many good films that I would never have heard of otherwise.
No, I'm the same. A lot of stuff I read is hard-to-get philosophy or from obscure authors, so I first get them from Anna's Archive. Reading them on paper is much better so I try to find a physical copy later.
Years ago I was following development of an indie game. The developers wanted to provide a DRM-free experience.
The game had some online functionality (leaderboard or something). They were surprised when the number of accounts accessing the online functionality exceeded their sales by a dramatic number. The developer updates grew more and more sad as they switched from discussing new features to pleading with people to actually buy the game instead of copying it. Eventually they called it quits and gave up because the game, while very popular, was so widely pirated that few people actually paid.
Whenever the piracy topic comes up I hear people do mental gymnastics to justify it, like claiming they spend more than average and therefore their piracy is a net win. Yet when we get small peeks into numbers and statistics like with video game piracy, it’s not hard to see that the majority of people who pirate things are just doing it because they get what they want and don’t have to pay for it.
The difficult bit is working out what percentage of pirated copies are actually replacing a sale that would have happened if the content wasn't available to pirate. The more dramatic industry numbers like to claim it's 100%, which is ridiculous. It's certainly more than 0%, though.
I'd assume that for your indie game, there were a lot of people who wound up thinking "I would play this if it's free, but I wouldn't spend $X on it." Adding successful DRM wouldn't have done anything to them but drive them away, and reduce the amount of buzz the game received. But then, particularly in the indie game space, maybe trading away a lot of buzz for a couple hundred more full-price game sales would have been completely worth it...
This is where the concept of services like Xbox Game Pass seems to be landing. Once someone has paid their fairly-small-amount each month, every game is now "free". Much like fairly-cheap streaming music basically stopped music piracy from being mainstream, cheap game-services might have the same impact on the game industry.
Though, much like streaming music, whether it turns out to be economically viable for the average game studio is certainly a question.
(For the sake of completeness: I don't pirate anything, so I have nothing to justify here.)
Sales or economics are not the only things a developer may care about. Some people want control over their work and will be upset by people pirating their game even if it doesn't mean they lose a sale. Similarly, many artists do not want you to repost their art or use their art as your profile picture.
Ok, but should we care if those developers/artists get what they want? Some companies would also really like to take games they have sold you away from you so they can sell you the next installment. Some developers don't want certain groups of people they dislike to enjoy their game. Not all things that developers want are reasonable.
I think part of the question though is also, would they have been as popular as they were without piracy, which does provide some advertising benefits through audience exposure. It is easy to say a really popular game would still be popular without piracy, but some lesser known games might never have gained any attention at all if there weren't enough people spreading word about it. Of course trying to quantify the sales and word-of-mouth benefits from that sort of thing is extremely difficult.
> The game had some online functionality (leaderboard or something). They were surprised when the number of accounts accessing the online functionality exceeded their sales by a dramatic number. The developer updates grew more and more sad as they switched from discussing new features to pleading with people to actually buy the game instead of copying it. Eventually they called it quits and gave up because the game, while very popular, was so widely pirated that few people actually paid.
Ok, but why? Was the game actually unprofitable, or did they just feel bad about some people getting it for free? You need to remember that a pirated copy does not equal a lost sale - in fact, sales may even be higher than they would be without piracy as popularity gained from pirated copies also translates to more legitimate buyers.
Your story sounds like "World of Goo," which reported a 90% piracy rate from comparing unique IP addresses to number sold. Despite that, they didn't quit and recently released "World of Goo 2" still DRM free.
Yes, hit games are still popular enough for sequels (World of Goo 2 came out 16 years after the first one, according to Wikipedia, which is an unusually long time). I remember World of Goo being one of the few choices of games for iPad when it was young.
But the vast majority of developers aren't lucky enough to have massive hits, and so money differences can still matter.
> out of 100 people doing that how many actually buy product in the end???? if net gain is positive then developer would not pay millions to license DRM
Let's not pretend that markets and companies are actually rational.
I'm exactly the same. I tend to get the first book of any series that interests me and read a third before I decide whether to buy it or not. I do buy about 3-4 books a month (mostly EPUB, DRM-free preferred) plus about 10 European graphic novels (paper books only) a month, so I'm a heavy consumer, I think.
I follow the newsletter from Borderlands Books in San Francisco. I usually buy one book off their best seller list a month (sometimes I’ll stop in and buy three or four).
I’ve recently started using my local library’s mobile app and I love it. (I typically use this for re-reading or audiobooks for plane trips)
I’m tempted to donate my entire bookshelf to the library and let them store and maintain it for me :-p
I don't think I follow. There's no recommendation engine in AA, right? Do you download a bunch of books from AA, read them, then if you happened to like one enough, you will buy it from a local bookstore?
Some Lovecraft letters were translated into French some weeks ago. Great reading! There, Lovecraft gives his opinion about the literature and art of his time.
And he mentions Nicholas Roerich. No idea who this guy was, but hey, pretty interesting painter (thank God for Google Images!). Ok, let's check on AA if there is a definitive book about his art.
No luck, but that very same guy wrote many books about Hinduism and eastern Asia. After a few downloads on AA, no big deal, I am not so fond of them. Except for one that I knew nothing about (the name is Altai Himalaya, and I have absolutely no clue why this one is catching my attention, but it does).
That's definitely what I call serendipity.
And that thing happens a lot when you have full access to whatever content is available. [and you are curious by nature]
In the end, in retrospect, such widespread access permits serendipity at a level that is absurdly miraculous!
That’s exactly how I do it. I enjoy reading DRM-free epubs on my Kobo, and whenever I finish a book I enjoyed, I buy it from the local sci-fi bookshop. I buy about 90% of all books I read.
I used to do that with games back when I played. I was always a staunch advocate of, if it's good, people will pay, and I didn't want to be a hypocrite despite refusing to buy most games because they could not be returned afterwards. Even newer services that offer refunds make it more difficult than I'm willing to put up with. If I played it most of the way through, I bought it.
Also, I tend to look for obscure and old books (I love old travelogues) and once I find one that really gets me, you'll be sure to receive it as a gift, if I think you'd be someone (or in a place in life) who would enjoy it.
So, I might not buy it for myself, but I make my decision on the pirated version and then buy more than my share when it's truly a gem. If I don't end up recommending it or buying it for someone, that usually means it was something which I'd be ok not to have consumed.
Studies show that the biggest pirates of content are also the biggest buyers of content. The theory is that piracy functions as a way to deepen paid fandom not to erase it.
We will never have real data on this. But simply on its face, I find it extremely hard to believe that most consumers have a strong enough moral compass to go out of their way to buy something they already have access to. Maybe they will for a tiny handful of special books that they want hard copies of, or authors they really like, but not for most media they consume.
This type of system also becomes a popularity contest for creators; you are supporting the people you like as opposed to whose work you want to read. If an author says something you disagree with, it's easy to just read their work without paying them. I'm not against consumer boycotts, but it should generally come with a sacrifice on both sides--for consumers, that means missing out on the product or service.
You are free to feel however you want about this. I can certainly see the immense societal value of making things accessible to more people. But I flat out don't believe the "piracy doesn't lead to lost sales" shtick, of course it does.
'The Dutch firm Ecorys was commissioned to research the impact of piracy for several months, eventually submitting a 304-page report to the EU in May 2015. The report concluded that: “In general, the results do not show robust statistical evidence of displacement of sales by online copyright infringements. That does not necessarily mean that piracy has no effect but only that the statistical analysis does not prove with sufficient reliability that there is an effect.”
The report found that illegal downloads and streams can actually boost legal sales of games, according to the report. The only negative link the report found was with major blockbuster films: “The results show a displacement rate of 40 percent which means that for every ten recent top films watched illegally, four fewer films are consumed legally.”'
Very interesting report, and I'm not discounting it, but another factor is that maybe the pricing effect is already baked in from years of piracy. For example, back in the early 2000s, when P2P file sharing was being used to download music, the music industry had to resort to the iTunes Store to compete, which allowed users to buy just one song for a dollar instead of the entire album (and then, later on, to music streaming services). The damage was done decades ago, and even though P2P file sharing isn't big today, its effects are still with us (no music executive is going to go back to forcing people to buy an entire album to get just one or two songs).
But maybe this report is taking this into account too?
Unfortunately, absence of evidence ≠ evidence of absence.
I obviously don't have time to read a 300 page report—I wish I did—but the conclusion says:
> With regard to total effects of online copyright infringements on legal transactions, there are no robustly significant findings. The strongest finding applies to films/TV-series, where a displacement rate of 27 with an error margin of roughly 36 per cent (two times the standard error) only indicates that online copyright infringements are much more likely to have negative than positive effects.
The conclusion goes on to discuss each type of media. Here's the section on games:
> For games, the estimated effect of illegal online transactions on sales is positive because only free games are more likely displaced by online copyright infringements than not. The overall estimate is 24 extra legal transactions (including free games) for every 100 online copyright infringements, with an error margin of 45 per cent (two times the standard error). The positive effect of illegal downloads and streams on the sales of games may be explained by players getting hooked and then paying to play the game with extra bonuses or at extra levels.
If this is what was meant by "illegal downloads and streams can actually boost legal sales of games" (and it's possible they're talking about something else which isn't in the conclusion), I don't find that convincing. It's within the margin of error and includes free transactions.
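To make the margin-of-error point concrete, here is a rough sketch that treats each figure quoted above as "point estimate plus or minus two standard errors" and checks whether zero falls inside the resulting range. The two number pairs are the ones quoted from the report's conclusion; the function names and the assumption that the margin is a symmetric band in the same units as the estimate are mine, not the report's.

```python
def interval(estimate, margin):
    """Return (low, high) for a symmetric band: estimate +/- margin."""
    return estimate - margin, estimate + margin


def includes_zero(estimate, margin):
    """True when 'no effect at all' lies inside the band."""
    low, high = interval(estimate, margin)
    return low <= 0 <= high


# Figures as quoted above (per 100 online copyright infringements).
quoted = {
    "films/TV-series, displaced legal transactions": (27, 36),
    "games, extra legal transactions": (24, 45),
}

for label, (estimate, margin) in quoted.items():
    low, high = interval(estimate, margin)
    print(f"{label}: {low} to {high}; "
          f"includes zero: {includes_zero(estimate, margin)}")
```

Under that reading, both ranges straddle zero, which is all "within the margin of error" means here. It says nothing about whether the survey method itself is sound.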
Moreover, I firmly believe that we are never going to have good data on this! You're trying to measure two things that are virtually impossible to measure with any accuracy: (1) how much piracy is taking place, and (2) what would sales have been without the piracy.
(I've edited my comment to actually quote the paper)
Or it's evidence that the effect can't be measured, which is what I'm trying to say.
I honestly don't understand how you would even attempt to measure something like this. There's no counterfactual. How can you possibly know what sales would have been without piracy?
This study appears to be relying on survey results. That seems questionable to me, because no one wants to admit "I totally would buy more books if piracy wasn't an option, but I choose piracy because I like having money and I think authors deserve to starve." I'm exaggerating for the sake of effect, but really, how can anyone ever know what they would have purchased under different circumstances? It's human nature to self-rationalize your actions. And yet, despite this, the study still didn't find statistically significant results!
Maybe if one country ever manages to truly cut off access to piracy websites, and there's another economically and sociologically similar country where piracy remains readily available, it will be possible to get some valid data on this question. I mostly hope this doesn't ever happen, because while I'm not a fan of piracy, I am a fan of the free internet!
Absence of proof is not proof of absence, and Sagan should have said that.
Absence of evidence is evidence of absence if evidence was sought and not found, and much of science is based on this. Or if evidence of presence should be expected ... consider for example the absence of evidence of an elephant in your living room.
This saying should die along with "you can't prove a negative"--Euclid proved that there is no greatest prime over 2000 years ago. What can't be proven is a universal empirical--positive or negative--such as "no raven is white" or "all ravens are black".
> The report found a lack of evidence that piracy displaces sales.
This isn't true though, as they conclude a 40% displacement in blockbuster movie sales. You would need a better analysis of their methodology to dismiss their other conclusions.
As far as I can tell from the conclusion, everything was within the margin of error, so my assumption is that it's random noise. If there's a place in the paper that says otherwise, please let me know what page it's on. If I'm misreading the results, please let me know that as well.
The 40% figure seems to come from section 8.2, p.152, which the authors describe as "robust".
However, having seen the report now, this section on top films seems to use a different methodology to that used for books, so it's not really comparable, and in general I wouldn't put much confidence in these results anyway.
> I find it extremely hard to believe that most consumers have a strong enough moral compass to go out of their way to buy something they already have access to.
This is zero-sum thinking. Do you oppose libraries on the same principle?
Sometimes making a thing accessible can increase the overall market for the good, because it trains the behavior. The market for books requires readers, and readers are created by people reading.
No, because libraries have to buy the books! If lots of people check out a book, the library will have to buy more copies! Yes, maybe the authors lose out on some revenue, but there's a clear relationship between number of readers and the author getting paid for their work.
This is also why I thought the Internet Archive's lending library was great! I'm aware they got sued anyway, and I think that's a real shame.
Yes, but whereas libraries need to buy more copies of books that lots of people check out, Anna's Archive only ever needs one. Not exactly sustainable for the author.
As I said, I loved the Internet Archive's approach to this! That's very much not what Anna's Archive is doing.
Libraries have to replace paperback books after ~20 checkouts on average. (This number is from memory but I'm quite sure it's in this range.) Hardcover books last a bit longer but of course are also more expensive.
I agree the industry would have a hard time surviving off library sales alone, in the same way that most businesses rely on multiple revenue streams to make ends meet, but I think library revenue is much more significant than you're making it out to be!
It's also likely true that a library that bought 10 copies of a book initially is unlikely to buy 10 more copies once there have been 200 circulations and they are needing to be replaced. They may only buy 5 replacement copies since the book is likely to be less popular than at initial release so it will take much longer for the next 100 circulations to occur.
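The arithmetic in the last few paragraphs can be sketched like this. The ~20 checkouts per paperback is the from-memory figure given above, and the 10 initial and 5 replacement copies are the example numbers from the previous paragraph; none of these are measured values, and the function names are just for illustration.

```python
CHECKOUTS_PER_COPY = 20  # rough, from-memory figure quoted above


def circulations_supported(copies, checkouts_per_copy=CHECKOUTS_PER_COPY):
    """Total loans a batch of copies can absorb before wearing out."""
    return copies * checkouts_per_copy


def copies_per_100_loans(total_copies, total_loans):
    """Copies the library ends up buying for every 100 loans."""
    return 100 * total_copies / total_loans


initial_copies = 10
replacement_copies = 5

first_run = circulations_supported(initial_copies)       # loans on the initial batch
second_run = circulations_supported(replacement_copies)  # loans on the replacements

total_copies = initial_copies + replacement_copies
total_loans = first_run + second_run

print(first_run, second_run, total_copies, total_loans)
print(copies_per_100_loans(total_copies, total_loans))
```

With a fixed wear-out rate, the library buys the same number of copies per 100 loans no matter how the purchases are spread out. Buying fewer replacements just means the later loans happen more slowly or through a waiting list.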
As for anecdota, I have more than once borrowed a library book and then purchased a copy so I could read it again or to finish it if demand is strong enough that I would have to wait weeks or months to be able to borrow it again.
Have you tried borrowing a mildly popular recent book from the library? There's often a digital queue of 20+ people with reservations.
There's plenty of incentive for most people to buy the real book rather than wait for the queue.
(I've also found libraries a useful way to discover lesser-known authors, since you can quickly sample/browse books on the shelves. But they won't have all of the books published by those unknown authors, so I end up buying/ordering other things by them.)
The principle of virtual libraries is the same as physical ones: only one person has access to the book at any given time. For popular books, either the library has to buy more copies (or digital licenses) or else it rations access by waiting list. The idea is sound IMO.
I would not buy a book after downloading it from Anna's Archive. But that's the wrong question in my opinion. You should be asking why aren't most books available in a DRM free format?
The main reason to download "pirated" books is that they get rid of all annoying barriers that exist in "legitimate" copies. It's a better product.
> You should be asking why aren't most books available in a DRM free format?
Because most people don't care! I wish they did, because I'm like you, I do care about owning DRM free media! I buy video games from GOG wherever possible, and audiobooks from a combination of downpour.com and libro.fm. Guess what most people do? They buy games on Steam and audiobooks on Audible.
Audible is the one that really breaks my heart! Games and movies I understand, because the DRM free sources have such narrow selections, but I can find just about any audiobook I want on either Downpour or libro.fm; every once in a while I'll come across an Audible exclusive, but it doesn't happen frequently. And yet, everybody uses Audible!
And, sure, there are known ways to strip Audible DRM, but with DRM free stores so readily accessible, why wouldn't you use those?
> but I can find just about any [DRM-free] audiobook I want on either Downpour or libro.fm
Just had a browse of Downpour. They say that it's mostly DRM-free. I don't get it. How come the rights holders don't complain? My experience of DRM-free e-books is that the available titles are, let's say, nothing I would want to read. And audiobooks have higher production value because of the voice acting. What A-list authors are narrating their own books and then allowing them to be sold DRM-free?
Unless something changed recently, every title on Downpour is DRM free when bought (as opposed to rented). I've been using Downpour for more than a decade and own tons of books. Libro.fm is slightly newer and IMO has slightly nicer UX, but both websites have mostly the same (wide) selection of titles.
I can't tell you why publishers make the decisions they do, but there's no trick here, if that's what you're asking. DRM free audio books are widely available and have been widely available for a long time now.
The real question is, why does Audible insist on putting DRM on their audiobooks when the publishers clearly don't care? I don't know the answer to that either, but the upshot is that everyone should stop buying from Audible!
If only sales on Downpour were possible outside the US. I just tried to buy a K. J. Parker. Does not sell to the EU. I haven't tested libro.fm because their ToS doesn't tell me if non-US sales are prohibited and I'm not going to make an account just to try.
Perhaps, but it’s a bit moot once you have the book and a reader which opens it. Anna’s Archive is a better service because it doesn’t matter what reader you’ve got and the content is there. It was the same with Netflix when it was the only streaming service: it had everything easily accessible.
Once again, I repeat, discovering something completely unexpected makes this discovery moment "special". Personally, I materialize that discovery by making it real in my real life. So I buy a physical copy.
That is also a way to build a me-compliant environment and not let the algorithms decide what I am surrounded with. [let's be frank, algorithms suck at finding who you are and what you will like!]
I bought a book or two after downloading but they had forewords in new editions or I had wanted to search something in the digital edition quickly as a one off and peruse the physical copy at leisure later.
> I'm not against consumer boycotts, but it should generally come with a sacrifice on both sides--for consumers, that means missing out on the product or service.
I'm curious as to why you feel this way, genuinely. The decision to boycott means that there is no sale, full stop, so no money is being handed over. Why does anything after that matter? The important part, the money, is already decided from the start.
Because otherwise there's no incentive not to boycott. One of the nice things about capitalism is that even unpopular people can make money if they make a product people want to buy. It adds a level of realness to society, above status-games and popularity-contests.
That makes the very silly assumption that the default is to boycott everything, which is really not the case. People at large definitely still default to purchasing things first, for all sorts of reasons from just feeling that it is moral to the service being convenient to just enjoying and wanting to support the work itself. This is self evident in the fact that boycotts essentially never actually kill anything because the majority still favors paying.
The default is to not buy something. People don't like losing money. If you can get something without losing money, it's super easy to rationalize why you're skipping the losing-money part. People tend to make decisions which are in their financial interest.
I've seen lots of people on this site that pay for YouTube. I've met real people that have subscriptions to porn sites. They fork out money for stuff that's pretty much always already freely available, for basically no reason except maybe convenience or slightly better service. People spend money all the time, for stuff they want and care about. If they didn't want or care about it, they wouldn't buy it or pirate it.
But if this is already true by default, then we're back to square one where the important financial decision was already made. Again, if it was already decided by default that there is no sale to be made, then whatever the end user does after that is irrelevant.
But beside that, in my last response I gave you three very common reasons that people do buy things against their own financial interests, and you've ignored that part. How do you fit that into your argument?
Homo economicus is a poor model of human behaviour. Per https://en.wikipedia.org/wiki/Homo_economicus#Sociologists, both neurobiological and anthropological research suggest that unsolicited gift-giving is a natural human behaviour.
It's nothing to do with morals or conscience; pure self-interest incites me to take action and buy physical copies or official ebooks or collector's editions or CDs or lossless digital releases of works I first consume pirated. I want creators I like to make more stuff. I feel good looking at my bookshelf filled with things I enjoy. I don't like throwing out or donating tons of books every year because they're no good and I couldn't tell until I bought and read them.
In several countries customers are forced to pay a special tax on empty media (storage), with the intention that the proceeds be redistributed among the copyright owners.
Some of these countries follow the Roman law principle, i.e. whatever is not explicitly forbidden by law is simply not forbidden (as opposed to common law).
In some countries downloading the published media (eg a film after the official release) is permitted.
And those who download, paid for it in the form of tax.
Directive 2001/29/EC for the EU only (Article 5).
Other countries rely on provisions of WCT, 1996 (Art 10) and WPPT, 1996 (Art 16).
TBH I don't think those laws are conscionable because the money collected through those taxes is mainly paid to entrenched copyright cartels instead of being distributed to creators in a fair way.
You are probably right, I am not representative of the vast majority of people who consume products, whereas I collect [what I consider to be, for me] GREAT stuff.
But one of the points I also wanted to highlight is that I knew nothing about that stuff and would have had no opportunity to taste it and be convinced that it is GREAT stuff [for me].
And to come back to your comment regarding creators: the thing that I hate is creators [for example, writers who are interviewed on the radio] who sell their book with a marvelous speech, when the content turns out to be very so-so. As a consumer I feel robbed.
Books seem somewhat unique to me in that the physical product is better or at least different from the digital one, so it kind of makes sense to buy it even if you already have a digital copy. This is unlike e.g. streaming services where the paid service is strictly worse than the pirated one (e.g. no offline, doesn't work at all with some monitors/setups, only low bitrates allowed).
"Better" is of course subjective. Digital is better to me: I can read the digital version on my laptop, phone, or e-reader. I prefer the e-reader, but don't like to carry it everywhere; at the very least I can always read on my phone if that's all I have on me.
I'm someone who used to be a voracious reader. In my childhood alone I would devour paperbacks and hardcovers like nobody's business. My summers were spent destroying the full summer reading list distributed by my school in weeks, and then going to the library to find more things to read. I have had thousands and thousands of physical books in my hands during my life. But I still prefer digital.
I only purchase digital books that either have no DRM, or strippable DRM.
You feel. You think.
Google up the studies of piracy and you’ll see that the biggest pirates are also the biggest buyers. Replace your private opinion with some science.
The reframing that will help you understand this is that these people are fans (I stole this framing from Cory Doctorow, who releases his books online for free and encourages his fans to buy a copy if they like it). Fandom is a positive-sum game. The more you do it and the deeper you go with it, the more you’re happy to pay the people who create the content you love.
The easier it is for you to find new content the easier it is for you to become a fan of a new thing.
For example: I want to buy a copy of Prince Pückler’s Hints on Landscape Gardening. I can’t find a physical copy anywhere and I’m not sure if it’s worth $120 for a reprint or $500 for an older version. I could pirate it (I use that word loosely since this work is obviously in the public domain) and check it out, but I haven’t bothered so I haven’t bought a copy. This is a case of me NOT pirating and therefore NOT engaging with new content.
> But I flat out don't believe the "piracy doesn't lead to lost sales" shtick, of course it does.
I'm not as certain as you are. Correlation does not imply causation, but media sales have trended upwards in the age of piracy which leads to some interesting hypotheses.
A few years ago Shirley Manson (lead singer of the 90s band Garbage) accused YouTube of making its fortune off the backs of content creators - basically charging the entire enterprise as being one big exercise in copyright infringement. And yet the music industry, as well as Hollywood, seem to be doing better and better each year in terms of dollars made. Some of the distribution models have changed - broadcast and cable television are pretty dead in the water, but the entertainment industries in general seem to be doing better than ever. And yeah lots of individual artists are still getting raw deals from Spotify and labels etc. as they always have. But industry-wise, in terms of dollar amounts, it seems there's more money to be made than ever before from creating and selling entertainment.
The statement you made that I absolutely agree with is that it's hard to get real world data on this. An individual who is able to get free access to something may be unlikely to ever pay for that same thing. But the answer to the question: "Does piracy hurt the industry's bottom line, or help it on the whole?" is a very difficult question to answer. And we have to consider the even harder stuff to measure. Things like: is a teenager who pirates recorded media more or less likely to buy merch and concert tickets? More or less likely to buy a special edition package with tangible collector items?
At the end of the day, I have no clue.
I also offer all of this being very pro-capitalism and pro-intellectual-property. I don't condone piracy. But if we're just looking at raw data and trying to form our hypothesis, we have to start with the fact that the raw data points to upwards trends on the whole.
> but media sales have trended upwards in the age of piracy which leads to some interesting hypotheses.
But they were also on an upward trend before the age of piracy, so it's perfectly plausible to think they would be even higher. The same technologies that enable digital piracy also lower the cost of legal distribution, so you'd expect to see the industry doing better at the same time that piracy is rising.
Now, I'm of course not shedding too many tears for the major Hollywood studios, but I would like to live in a world with more niche films and games, and of course it's still quite difficult to make a living as an author or musician—a few manage it, most don't.
We agree that we don't have data—but to me, it just makes intuitive sense that a large majority of pirates are pirating lots of things they would have otherwise bought. For piracy to counteract that by generating buzz or aiding discovery or whatever it is... well, it would have to be an awful lot of buzz!
Occasionally in life, intuitions are dead wrong, and actual data leads to surprising discoveries. However, when faced with a lack of data, the first assumption shouldn't be "reality is the opposite of whatever I'd intuitively expect," that makes no sense.
I think there's a ton of motivated reasoning going on, and it just really bothers me. If you're going to pirate stuff, at least be honest with yourself about it.
> I find it extremely hard to believe that most consumers have a strong enough moral compass to go out of their way to buy something they already have access to
I like the idea that consumers only buy stuff out of moral obligation.
Like if you went to your ethical friend’s house and saw that he had empty book cases and no art on his walls because he hasn’t yet been imbued with the requisite moral fervor necessary to buy anything. It’s hard for him to be sure what he’s obligated to buy or that he’s obligated to buy anything since it would be wrong of him to know what’s inside any book without buying it first.
And then you went to your no-good, dirty, downright despicable friend’s house and it’s full of books and art because for every 20 books he pirates he buys one, and because he’s just so darn unethical he pirates a lot of books
OK, it's not only obscure stuff. It's more blasts from the past that really deserve better exposure. In terms of non-Marvel/DC comics: things from Bernie Wrightson, P. Craig Russell, Georges Bess, Alberto Breccia, Moebius, Druillet, Schuiten/Peeters, and others.
In terms of letters, once again the almighty Lovecraft letters are really jaw-dropping!
For movies, I discovered Vincent Price, Sam Peckinpah, John Ford, Wim Wenders.
So nothing really out of the "normality", but they are no longer marketed and are slowly fading to grey.
To be fair, the theory with the whole coin thing is solid, and I'd say it should count as something to be proud of even if in reality it gets tainted by speculative investments.
Yeah. I personally think the original bitcoin whitepaper is a work of genius. Balancing the soft game theory incentives with hard cryptographic guarantees is really cool.
I'd love to see more systems exploring this combination approach. There is a saying about not being able to solve a social problem with technology. Bitcoin is the blueprint on how to do that.
It's everything that came after that point that is the problem.
What is the status on I2P these days? I used to run a lot of stuff on it. It was a lot of fun. It was like this cozy alternative development of internet, where things still felt like 1997.
The numbers are interesting and a bit surprising to me.
I remember a time when people would have seedboxes for private trackers and data hoarders would brag about having TBs of storage, and yet only a handful of people are seeding the complete collection(s). I understand not everyone has or can seed multiple TBs of data, but I was expecting there to be a lot of seeders for torrents of a few hundred GB.
Interesting to see that sci-hub is about 90TB and libgen-non-fiction is 77.5TB. To me, these are the two archives that really need protecting because this is the bulk of scientific knowledge - papers and textbooks.
I keep about 16TB of personal storage space in a home server (spread over 4 spinning disks). The idea of expanding to ~200 TB however seems... intimidating. You're looking at roughly 12 16TB disks (not counting any for redundancy). Going the refurbished enterprise SATA drive route, that is still going to run you about $180/drive, or roughly $2200 in drives.
I'm not quite there as far as disposable income to throw at it, but I know many people out there who are; doubling that cost for redundancy and throwing in a bit for the server hardware, $5k to keep a current cache of all our written scientific knowledge seems reasonable.
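To make the back-of-the-envelope numbers above easy to re-run with other drive sizes or prices, here is a minimal Python sketch. The function names and the `server_cost` parameter are made up for illustration; the 200 TB, 16 TB and $180 figures are the ones quoted in the comment. It assumes a drive's advertised capacity is fully usable, which ignores filesystem and RAID overhead. Rounding up gives 13 drives for exactly 200 TB, slightly more than the "roughly 12" estimate.

```python
import math


def drives_needed(total_tb: float, drive_tb: float) -> int:
    """Smallest whole number of drives that can hold total_tb (no redundancy)."""
    return math.ceil(total_tb / drive_tb)


def mirror_cost(total_tb: float, drive_tb: float, price_per_drive: float,
                copies: int = 1, server_cost: float = 0.0) -> float:
    """Rough hardware cost: drives for `copies` full copies plus a flat server cost.

    server_cost is a hypothetical placeholder; the comment gives no figure for it.
    """
    return drives_needed(total_tb, drive_tb) * copies * price_per_drive + server_cost


if __name__ == "__main__":
    print(drives_needed(200, 16))            # single copy, figures from the comment
    print(mirror_cost(200, 16, 180))         # drives only
    print(mirror_cost(200, 16, 180, copies=2))  # doubled for redundancy
```

With the quoted figures this lands at $2340 for one copy and $4680 for two, so "about $5k" with some server hardware on top is in the right range.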
The interesting thing is these storage sizes aren't really growing. Sci-Hub stopped updating the papers in 2022? And honestly, with the advent of slop publications since then, what is in that 170TB is likely to remain the most important portion of the contrib for a long time.
True but it matters a lot less in many fields because things have been moving to arXiv and other open access options, anyway. The main time I need sci-hub is for older articles. And that's a huge advantage of sci-hub--they have things like old foreign journal articles even the best academic libraries don't have.
As for mirroring it all, $2200 is beyond my budget too, but it would be nothing for a lot of academic departments, if the line item could be "characterized" the right way. To me it has been a bit of a nuisance working with libgen down the last couple months, like the post mentioned, and I would have loved for a local copy. I don't see it happening, but if libgen/sci-hub/annas archive goes the way of napster/scour, many academics would be in a serious fix.
It's 167.5, not ~200, and you can get disks much larger than 16 TB these days - a quick check shows 30 TB being sold in normal consumer stores although ~20 TB disks may still be more affordable per byte.
In text form only (no charts, plots, etc)- yes, pretty much all published 'science' (by that I mean something that appeared in a mass publication - paper, book, etc, not simply notes in people's notebooks) in the last 400 years likely fits into 20TB or so if converted completely to ASCII text and everything else is left out. Text is tiny.
The problem is it's not all text, you need the images, the plots, etc, and smartly, interstitially compressing the old stuff is still a very difficult problem even in this age of AI.
I have an archive of about 8TB of mechanical and aerospace papers dating back to the 1930s, and the biggest of them are usually scanned in documents, especially stuff from the 1960s and 70s, that have lots of charts and tables that take up a considerable amount of space, even in black and white only, due to how badly old scans compress (noise on paper prints, scanned in, just doesn't compress). Also many of those journals have the text compressed well, but they have a single, color, HUGE cover image as the first page of the PDF, that turns the PDF from 2MB into 20MB. Things like that could, maybe, be omitted to save space...
But as time goes on I start to become more against space-saving via truncation of those kinds of scanned documents. My reasoning is that storage is getting cheaper and cheaper, and at some point the cost to store and retrieve those 80-90MB PDFs that are essentially total page by page image scans is going to be completely negligible. And I think you lose something by taking those papers and taking the covers out, or OCR'ing the typed pages and re-typesetting them to unicode (de-rasterize the scan), even when done perfectly (and when not done perfectly, you get horrible mistakes in things like equations, especially). I think we need to preserve everything to a quality level that is nearly as high as can be.
> In text form only (no charts, plots, etc)- yes, pretty much all published 'science' (by that I mean something that appeared in a mass publication - paper, book, etc, not simply notes in people's notebooks) in the last 400 years likely fits into 20TB or so if converted completely to ASCII text and everything else is left out. Text is tiny.
20 TB of uncompressed text is roughly 6 TB compressed.
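The 20 TB to roughly 6 TB estimate implies a compression ratio of about 0.3. A minimal Python sketch for checking that kind of ratio on your own sample of text follows. It uses zlib from the standard library purely as a stand-in (the comment does not say which compressor was assumed), the function names are hypothetical, and the real ratio will depend heavily on the corpus and the compressor.

```python
import zlib


def compression_ratio(text: str, level: int = 9) -> float:
    """Compressed size divided by original size for a UTF-8 encoded string."""
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("cannot measure the ratio of empty input")
    return len(zlib.compress(raw, level)) / len(raw)


def implied_ratio(uncompressed_tb: float, compressed_tb: float) -> float:
    """Ratio implied by a before/after size estimate."""
    return compressed_tb / uncompressed_tb


def estimate_compressed_tb(uncompressed_tb: float, ratio: float) -> float:
    """Scale a corpus size by a measured (or assumed) compression ratio."""
    return uncompressed_tb * ratio


if __name__ == "__main__":
    # Figures from the comment: 20 TB of text, about 6 TB compressed.
    print(implied_ratio(20, 6))
```

To test the estimate, run `compression_ratio` over a representative chunk of plain-text papers and feed the result into `estimate_compressed_tb(20, ratio)`.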
I just find it crazy that for about $100 i can buy an external hard drive that would fit in my pocket that can in theory carry around the bulk of humanity's collected knowledge.
What a time to be alive. Imagine telling someone this 100 years ago. Hell, imagine telling someone this 20 years ago.
I was reading a book series from my local library and for reasons I don’t understand they were missing the third or fourth book in the series. Probably damaged or lost. I even thought I could check the local (especially used) bookstores, buy a copy and then gift it to the library, but there’s a new edition that has a completely different vibe and size, with 2024 prices so I thought better of it. So I’d heard of Anna’s Archive and I got it there. Then it turned out one of the last books was unavailable too, can’t recall if it was missing or someone else had it out and wasn’t going to return it any time soon.
I was just trying to finish this writer’s corpus on a reread of their later material. It’s not that I’m cheap. I own a paper and audiobook copy of several of my favorite books. Including this author, so I’ve paid her twice. I just avoided the trap some of my friends long ago were falling into of hoarding books, by only keeping books I intend to read again. So any completionist tendencies have always been resolved via library or electronic editions.
I’m getting older now, and my first real confrontation with my own mortality came up with books. I have several years worth of books even if I were retired and reading three or four a week. New things come out all the time, and new voices. I haven’t read some of these books in ten years or more. Am I really going to read them again before… So a couple years ago I reread Dune for what will likely be the last time and sold my ratty old yellow copies to a used bookstore. If I do it again it will likely be audiobook.
People have likely already been mirroring it quietly for years.
IRL, "scanparties" used to be a thing if you were in the "bookz scene" around the turn of the century. (Where you and a small group of others go to a public library, hit the limits of your library cards and often clear out entire sections of shelves focused around a particular topic, meet someplace to scan/"cam" everything you borrowed as quickly as you can for processing and uploading in the near future, then return them all within a few days, and repeat this until you get bored or have other things to do.)
Precisely. To be clear, I don't agree with a comment upthread saying the "shoutout" is what might potentially do harm to the IA in court. I think the actual act of having scraped all those books from the IA's lending system could potentially do harm to the IA in court. The publishers can now point to all the copies of the books in the wild that IA had in their lending system and argue that IA's system is not legally acceptable. It was on shaky enough ground already.
I believe this was already brought up in the court proceedings, and Brewster Kahle already addressed it in April 2024: «Trying to blow protections we have put on files, for instance, does not help us– and usually hurts».
IA lending books with "weak" DRM also hurts efforts in reducing DRM and reforming copyright though and that is much more important in the long term. It was always a deal with the devil that IA should have never made and them now being at odds with others that preserve those books and actually make them available only makes that more clear.
It's like a food kitchen under a tyrannical regime complaining that people passing their food to rebels might get them shut down.
The only people facing consequences are the license-holders. Online lending libraries aren't missing a copy now that AA archived it, and there's not really a substantial cost to the hosters in network bandwidth.
Am I missing something here? As a user I don't empathize with anyone but the archivers.
I am curious how they’re funded. How they are able to stay online. Surely there must be people, governments etc with deep pockets that would want to take them down?
Can confirm this is happening. But the money paid is tiny. Think thousands of dollars, not millions. Not enough to keep the lights on. I would assume they do pretty well from donations.
Can you donate to them without someone claiming you're donating money to a criminal enterprise and getting you in trouble? I mean, without using bitcoins
If #1 is a reference to a famous quote from Stewart Brand, founder of the Whole Earth Catalog, it's only part of the quote. The rest is relevant:
> On the one hand you have—the point you’re making Woz—is that information sort of wants to be expensive because it is so valuable—the right information in the right place just changes your life. On the other hand, information almost wants to be free because the costs of getting it out is getting lower and lower all of the time. So you have these two things fighting against each other
He stated later more succinctly:
> Information Wants To Be Free. Information also wants to be expensive. ...That tension will not go away
It's not a quote, but a statement. And even if it were a quote, random other quotes from the same person are not relevant. "This is just a part of the quote" people are so annoying. Like guess why it is "only a part of a quote"? Because some parts are neat, insightful and true, and some other parts are irrelevant and garbage.
Sorry, this was a more general rant, because it is so annoying every single time.
In this case: Who the hell cares about that random guy's random views? How is it relevant in this conversation?
For me, it was useful to clarify that "information wants to be free" was "information wants to be gratis", not "information wants to be libre". I didn't realize it referred to cost.
That's not a real tension. There is no case where the inherent value of some commodity keeps its price high despite easy availability. That's the point of the "diamonds in the desert" thought experiment.
Inherent value provides a ceiling on the price of whatever it is.
Availability also provides a ceiling on the price.
If I give you two theorems that say C < 300 and also C < 10, why would you describe those as being "in tension" with each other?
The tension arises because in some cases, at least for a while, the availability can be suppressed. Like when some expert releases an expensive ebook or video course "Secrets of X". Ofc many such books are scams, but assume for sake of argument the information is actually valuable. The initial buyers are motivated not to share it. It remains a scarce commodity for a while. But all it takes is one person to make a torrent, and the game is over. So there are two incentives -- one trying to keep it scarce, and the other trying to make it free.
Copyright was created because we realized that it takes effort to put works together (in the original case it was educational information) but that distribution can be done without rewarding that initial effort. Which then results in the initial effort not happening. Which then ends up in a dumber, less intelligent, idea poor world without those works.
Society agreed to copyright because of the social benefit of having people willing to put effort/expense into creating works. We're not talking zero value internet BS, but real works. People who create the works don't make them scarce, their distribution is infinitely scalable. They just make it so that they are compensated.
Most information is not easily available, it is purposefully hidden because knowledge is power and money. And that's through all fields and not only Coca-Cola recipes.
The argument is that authors will stop making information publicly available because piracy takes away the value. So instead information will be hidden in vaults and do good only for a few people. Like how maps used to be top state secrets.
The obvious fix for this is to either eliminate trade secret protections in favor of patents, or make them conditioned upon escrow with the government to be released to the public domain after some time (perhaps half the time of a patent).
Don't want to release your recipe ever? Tough cookies when your lead scientists bring it to a competitor.
Trade secrets are counter to the purpose of "IP" law. The public has no interest in protecting them and every interest in... not doing that.
Until every new born child is forcefully implanted with a microchip in their brain at birth, you will never be able to stop people from thinking and having secrets.
If people are not fairly compensated for sharing their secrets and discoveries with the public, they won't do it. They'll take it to the grave if need be. And we lose out on information which can benefit an enormous number of people.
So the quoted person is absolutely right that there is a great tension between these two factors. How should great ideas be greatly compensated while giving the widest access possible? Neither piracy nor expensive access to information is the right solution.
Trade secrets never expire and sharing them is a crime, so currently people can take them to their grave and the government will have their backs in doing so. A single person's secret is also unlikely to matter much next to the potential of global corporations' secrets, and the nature of corporations is that they are made of people who have little reason not to take an offer with a competitor after they've learned the necessary secrets to do their job. Hence, don't protect those corporations unless they offer something in return (explicitly divulging them/contributing to the common knowledgebase). Without that protection, knowledge can more naturally spread.
The fair compensation they should be offered is time limited protection. Otherwise it should simply be legal for any of their employees to spread that knowledge. Giving unlimited protection to not divulge knowledge is counter to the entire point of "IP" law.
"The" Coca-Cola formula would have lost its patent restrictions a century ago. It's still unshared. Why exactly should we continue to grant any legal protection from an employee sharing it?
What are social security numbers if not just another bit of information that wants to be free?
Or perhaps you are saying that people that have an interest in the availability of particular information should have some control on that information's freedom...
The idea that any widely transmitted identifiers' confidentiality should be its primary method of security is asinine.
The failure of any exploit regarding SSNs or the like is not on the offending party, but on each using party's failure to implement even a modicum of actual security.
A widely transmitted identifier that tons of organizations need to ask you for taxes is not secret. It's used to precisely identify who you claim to be. It's your username. There's not much to say about also treating it as your password except that it's asinine. It's like treating your first name/last name as a secret password.
People can do good things and bad things simultaneously. Unless me supporting the good things directly enables also the bad things, I don't see a reason to throw out the good thing.
He said he personally suspects, I don't think that was more than a throwaway comment. Besides, if my enemy is dismantling an institute in my society that I want dismantled, I'm not going to complain.
I'm sick and tired of this misquote; as it was merely an observation of trends, and was never meant to be a moral maxim or mandate. If you truly believe information needs to be free as a moral mandate, share your company's source code first.
Meta illegally scraped 80TB of data from Anna's archive, Libgen, Zlib etc. I'm sure other tech giants did too. Without paying them a cent, costing these projects $$$ in bandwidth/hosting etc.
when I hear people complain about these projects it just sounds like hypocrisy.
Kudos to the team behind this project! It looks like they have improved UI in last year. The crucial problem right now is to remain accessible or to survive. I have no idea how much effort is being put into it. I wonder is it possible to remain afloat despite all efforts to take them down?
There was a pretty major UI update in the past 2-5 days-ish.
Apologies for the minor grumble, but on mobile I used to be able to browse search results much more effectively; the new design only fits ~4-5 results on a screen.
If there's a book that only has e-book versions on amazon, what is the best way to ensure the author gets money? I'd rather not fill my little apartment with paperbacks, and ordering a paperback and then returning it sounds kind of wasteful. Although I guess I could buy a paperback downtown and drop it off at the used book shop .. What do other people do, when they want to pay authors and read e-books without aiding and abetting Bezos?
It is neither humorous nor strange because that formulation omits authors.
How many authors who write the books in Anna's archive are happy about it?
I personally am pro Anna's archive (and sci-hub, etc) because I believe it benefits society to have better-read citizens. That said, I have some misgivings, because under our current system, there are issues with law and remuneration.
IMO, Scihub and the ebook parts of AA should be considered differently and not conflated.
In particular, Scihub is in opposition to the parasitic international publishers who dominate and control scientific publishing for profit, mostly on the backs of science generated by academia and other not-in-it-for-the-profit folks.
In contrast, downloading ebooks may, in some cases, lead to individual authors being hit in the pocket, in a profession it’s already hard to make a living from.
(I wish we’d figured out a better way to organise book publishing without publishing companies getting in the way and taking their large slice, allowing authors to profit more directly.)
Even if that might be the case now, I doubt that holds if piracy becomes truly widespread.
I would suspect A pirates book B and tells C about it, C buys book B is a lot more common than A pirates book B and likes it enough to buy it
I have no data to support this, and while I have paid for things I could access for free, I'm sufficiently pessimistic about human nature to doubt that's the norm.
Piracy has been "truly widespread" for decades now.
Most people who are able to, still pay for things, especially if they're convenient. Even when those services actually add additional restrictions to their access to the media they think they're paying for.
That's absurd. I could potentially believe the conclusion that piracy doesn't take away from sales (that is, most people who pirate would otherwise do without, and not buy a copy). But the idea that many/most (or even some significantly-small percentage) of people who pirate will buy copies of the things they like? No, that doesn't pass the sniff test.
I do. When I was poor – I couldn't do it. Now that I'm wealthy and can afford any book, I prefer to take a quick look at online version and then buy a physical copy.
If you and I would support the works we think are good, why wouldn't others? I keep noticing that people constantly expect worse morals from others than how they claim they are themselves
It's easy to add a "me too" onto the existing list but that's not my point. I think we generally can expect better from the average person than we instinctively do. If 50% of people are just as honest as we are (if we're average persons which, on average, we are), that would be easily worth it if free distribution of a book gets you a 3x bigger reach as compared to when people have to pay up front. I'm not aware of research confirming or refuting this (of course I'd like to believe that information can be free), but it doesn't seem so outlandish to me that we can ignore the option altogether by doing a sniff test
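The "50% honest, 3x reach" argument above is simple arithmetic, sketched here in Python so the numbers can be varied. The function names and the baseline of 1000 readers are hypothetical. The sketch also assumes that everyone reached in the pay-up-front case is a buyer and that every payer pays the same price, neither of which the comment states.

```python
def paying_readers(reach: float, pay_rate: float) -> float:
    """Number of readers who end up paying, given total reach and the share who pay."""
    return reach * pay_rate


def break_even_pay_rate(reach_multiplier: float) -> float:
    """Share of free readers who must pay for free distribution to match paid-up-front sales."""
    return 1.0 / reach_multiplier


if __name__ == "__main__":
    baseline = 1000  # hypothetical number of readers when payment is required up front
    print(paying_readers(baseline, 1.0))        # pay up front: every reader paid
    print(paying_readers(3 * baseline, 0.5))    # free, 3x reach, half pay afterwards
    print(break_even_pay_rate(3))               # pay rate at which the two are equal
```

Under those assumptions the free scenario yields 1.5x the paying readers, and free distribution comes out ahead as long as more than a third of readers pay afterwards. Whether a 3x reach or a 50% pay rate is realistic is exactly the open empirical question.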
This is true for me! For authors I like, I might read a few epubs, then buy their entire series in hardcover (or paperback if no hardcover is available) to have in my bookshelves for rainy days.
Depends. I've seen some in favor and some against. Academics who have their papers paywalled by publishing entities against their own wills are generally for it.
Academics get their income from their university positions, and don't get any royalties from sales of their articles. Instead, the benefit they get from publishing is to their reputation, and for that it's better for their work to be as available as possible.
It's completely different for a writer who gets their income from sales of their work, obviously
Yep. And not that you asked, but my own opinion (not theirs) is that even writers who get income from sales will be fine either way. Reading a book for free and then buying it to support the author if you want to has been a practice for longer than the internet has existed. It's exactly how libraries have always worked!
The Council of Europe has decided that the websites of RT (formerly Russia Today) and Sputnik News may no longer be transmitted. The website you are trying to visit falls under this European sanction.
VodafoneZiggo is obligated to enforce the sanction and has blocked the website.”
Does Anna's Archive or a similar site host, say, the complete New York Times (pre-1930) as a full PDF download set? And every other newspaper too?
Tons of public domain sources are locked into websites like Newspapers.com or the nearly-dead and now completely unsearchable old Google News / Newspaper.
It would be nice if the massive pursuit of AI training data resulted in some fully-legal open source alternatives to these proprietary, outdated, or abandoned sites. I know some of it is available via the Internet Archive, etc., but something new with an AI-powered search and finding aid sounds so useful.
I imagine it's possible to achieve this through torrents from Anna's, but you'd have to search and compile the list of all individual PDFs.
> something new with an AI-powered search
With enough time and willingness, someone could put all the old NYT issues through optical character recognition and convert them to text; then make it available to large language models for semantic search of some kind. Ideally public cultural funds could support the effort as academic research.
This is surprising. I thought last I heard they'd arrested the guy who was suspected of running the site, about a year or so ago. Guess I'm misremembering.
Also I'm surprised Cloudflare hasn't shut them down like they do for other dodgy sites.
When accessing from Belgium the link is blocked by Cloudflare:
Error HTTP 451
Unavailable For Legal Reasons
In response to a legal order, Cloudflare has taken steps to limit access to this website through Cloudflare's pass-through security and CDN services within Belgium
CF is in a position such that if they aren't cooperating with national laws, then they are actively hindering them. National governments don't like that, and will have ISPs block CF wholesale if that's what accomplishes their goals.
To operate in Belgium, they have to follow local laws and comply with legal orders. They either make the site unavailable to local IPs or leave that market.
I'm unable to resolve the domain on EE UK - looks like it's DNS blocked.
By comparison, on my work network (TalkTalk) I can resolve the domain but I get a connection reset from the site.
I think this might be the first time I've hit a DNS block. It feels rather eerie seeing people talking about a site that, from my point of view, doesn't even exist...
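The two symptoms described above (the name not resolving on one network, a connection reset on another) surface as different exceptions in most socket APIs. Here is a minimal Python sketch that tells them apart. The function names and labels are made up for illustration. Caveats: a failed lookup does not by itself prove a deliberate block, and a reset that only happens during the TLS handshake would not show up here, because this probe stops after the TCP connect.

```python
import socket


def classify_failure(exc: BaseException) -> str:
    """Map a connection error to a rough label for how the attempt failed."""
    if isinstance(exc, socket.gaierror):
        return "dns-failure"        # the name did not resolve at all
    if isinstance(exc, ConnectionResetError):
        return "connection-reset"   # resolved, but the connection was torn down
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"            # resolved, but nothing answered in time
    return "other"


def probe(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Try a plain TCP connect and report how it went (TCP only, no TLS)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "reachable"
    except OSError as exc:
        return classify_failure(exc)
```

Running `probe("example.org")` from each network (substituting the host you care about) shows whether the failure happens at the lookup stage or later.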
There's an inconsistent censoring of numerous websites across the UK. In short, the biggest ISPs (a list which changes over time) will block various sites (TPB, libgen, AA, and others), based on court orders taken out at different times. In general, it's a good idea to use Private Relay if you're using Apple devices and have access to it, no matter what network you're on, and if you're doing anything you don't want your ISP to capture, you should be using VPNs and/or Tor.
There are a lot of legitimate reasons to want to use scraping sites that UK copyright law is not nuanced enough to protect, and so blanket bans just end up emerging at the demands of copyright owners (which more often than not, means Disney or Springer).
Yes, Ofcom really needs to sort this out properly. I shouldn't be able to access this site from a UK ISP. Makes no sense that it's blocked on some and not others.
Idk, I went there a couple of times, I just love the people, the country. It’s a trip back in time. So it was my “random pick” for an exit node. And now I can read rt.com, sail the high seas, open any libgen or Anna's Archive. They're not part of the EU, seem far away from it (no euro, guarded borders, ditched their communist dictator who completely isolated the country ~40 years ago). Perhaps they are less easily coerced into censoring as practiced by countries primarily governed based on GDP and what the big corps want (although everybody seems to smoke everywhere so they could use some of that EU influence).
There's "431 Request Header Fields Too Large" which you will see occasionally. But after that 451 is the only other 400-level error code above 429. It was chosen as a reference to the book Fahrenheit 451.
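If you want to check that claim, Python's standard library carries the registered status codes in `http.HTTPStatus`. A small sketch follows; the helper names `client_errors_above` and `is_legal_block` are made up for this example, and the result depends on which codes your Python version knows about.

```python
from http import HTTPStatus


def client_errors_above(threshold: int) -> list[int]:
    """Return the 4xx status codes known to the stdlib above `threshold`."""
    return [s.value for s in HTTPStatus if threshold < s.value < 500]


def is_legal_block(status_code: int) -> bool:
    """True when a response status is 451 Unavailable For Legal Reasons."""
    return status_code == HTTPStatus.UNAVAILABLE_FOR_LEGAL_REASONS
```

`client_errors_above(429)` lists whatever sits between 429 and 500, and `is_legal_block(response_status)` is the check a client would do to tell a legal block apart from an ordinary 403 or 404.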
Know am going to be downvoted into oblivion, but as a composer, can see it from the side of creators. Yeah, making their products free is starving these industries. For instance, in music, there is already very little money in music (think about how many musicians you personally know who can make a living off of music, besides being a music teacher). And, the music industry is still not even the same size as it was in the 90's - global revenue in 2024 was $29 billion, while in 1994, it was $35 billion (and that's not even taking into account inflation).
Yes, there are many other reasons why the music industry fell, but when your main demographic can always go to bittorrent to get their music if prices are too high, then there is only so much you can do with the price of music.
Yeah, I remember the 90's, music was huge, and there were so many good bands (Smashing Pumpkins, Nirvana, REM, White Stripes... Or if you're more into popular music, Michael Jackson, Whitney Houston...). Now, music is de-valued and cheap and our music scene has been decimated. Personally, think we should try to find ways to support musicians, writers, thinkers, artists...
... but if you have a different opinion, no worries. But, if you can, give it thought.
The ideal situation would be building a society that believes everyone deserves to be fed, clothed, and housed regardless of their ability to make profitable things. Weird how politically unpopular that seems to be.
Both producers and consumers of media are in the same boat of barely surviving. Maybe we can work with each other instead of against each other? :)
Streaming has replaced piracy, and scammed artists in the process. You can complain to the labels for that.
As for why I download: I am legally forbidden from buying the music that I want. Either it's the selling label geoblocking, or they only sell versions in a shitty format like mp3. I'm not jumping through hoops to give you my money, either I can buy FLAC files, or I download.
I want convenience, the same way users want it. Artists discovered that they were scammed by the labels instead of the pirates.
The devaluation happened through streaming services. Instead of spending dozens of dollars on subscriptions, BitTorrent and last.fm enable me to find what I like and spend the money on Bandcamp instead, where it actually reaches the artist I am buying from.
I can just get a Spotify subscription instead though, if you insist.
I think a lot has happened since the 90's, and you rightfully point out that there was very little money in music to begin with. Labels generally always took a very large fraction of a physical CD sale, for example, so the model was rather rigged from the beginning (and recorded music doesn't have that long of a history, anyway).
In general, I'd argue that Spotify will be more toxic to the industry (or the artists' livelihood) than piracy. Streaming is even more predatory and centralized than labels in the 90's, but with an important caveat: it's legal. When people engage in piracy there is at least some awareness of, say, the pirate being at fault in the transaction — even though, as someone else already mentioned, people who pirate might contribute, or engage in other ways, with the creators. But with streaming, it got normalized to pay artists a fraction of a cent per stream (and the terms get progressively worse). I've countless times heard the argument "at least they get paid something!"
Bandcamp, for example, seems like a much fairer ideal for the industry. Luckily, the Epic buyout a few years ago did not immediately ruin the business.
As for the music in the 90's...music has changed. Naturally, one could argue that these are also exciting times: one can singlehandedly produce a record, distribute it independently, and be touring all over Europe without ever having to sign off to a major label. Is this not a good thing — or at least, a notable one? Of course, there's still great music around.
Yeah, usually, have also read that the only ones to make money on Spotify are major artists. They take a huge chunk of the money distributed to musicians. At least for me, have never heard of any musician making a living off of their Spotify sales, not even close.
And Bandcamp does seem nice, wish it took off more.
And yes, I do completely agree with you that there are some big positives with today's music landscape. The rise of Digital Audio Workstations (DAW) to create your own music was a revolution, as is youtube for getting your music to the masses. Seems like a ton of musicians got their break from this these days...
...So as we talk, am thinking, maybe piracy has become an unimportant aspect of the music industry?? Hmm... Well, one aspect is missing, the seasoned engineers, producers, marketers and managers who can get your music created, promoted and performed all without the musician needing to learn all this themselves. It really is a lot of work!
There's also the effect that new musicians are competing for attention with an ever growing catalog of top artists. I already have hundreds of CDs, so I'm not particularly inclined to go find whatever the 2025 version of the Smashing Pumpkins is because I already have the old one. Looking at this year's Billboard 200, I don't think I'd be interested in SZA or Lil Baby. Bowie died almost 10 years ago. I guess I'm good with what I've got.
Definitely... and think about your comment, it's probably what we've all heard, that the teens/twenties is the target demographic for the music industry, as they're the ones who go out and buy things. Yeah, I don't buy that much music these days, maybe a few songs and albums per year (and I'm in music!).
In the 90s the good bands got lucky that their distributors picked them up and promoted them etc. You just don't remember the amount of crap that was on at any given point in time.
Today you have instant access to millions of songs around the world in every genre imaginable: https://everynoise.com/ And not just to the whatever few records your local store carried, or what the Big Four paid the radio stations to promote.
I do agree that youtube has made it much easier to self-promote, and that today's model has replaced the old one and is doing decently. Still, at least by the numbers, the music industry is smaller than it used to be. Unfortunately, money is a powerful resource, and it's not like the music industry took everything and completely screwed over the musicians. They helped struggling musicians survive, giving them a chance to make it, while taking care of a lot of the non-music-related tasks that are actually very time consuming - promotion, lining up performances, lining up interviews, learning the successful strategies for giving a band a chance to succeed, networking... It is really another job in itself and is very difficult.
Labels still do this today, but it's just the number of opportunities for musicians is smaller.
Although, again, do agree that youtube (and somewhat spotify from what I've heard) has made a huge difference. I've heard a few times that Youtube is probably one of the best resources for self promoting music, but being good at making videos on youtube is not easy to do well and is also another job in itself.
> I do agree that youtube has made it much easier to self-promote
And Spotify. And Apple Music, to an extent. And even SoundCloud.
> They helped struggling musicians survive, giving them a chance to make it,
Survivorship bias. You're completely ignoring the artists that never got the attention of distributors, or got immediately dropped, or dropped after the first disappointing (by studio standards) sales, or screwed out of revenue and royalties, or...
> Labels still do this today, but it's just the number of opportunities for musicians is smaller.
Labels still do this to the same extent as before. They spend about as much money and, percentage wise, keep as much money as before. It's even easier for them because a whole layer of physically printing and distributing media (tapes and CDs) is gone.
And the number of opportunities for artists increased, but became more complex.
In 2012 an artist otherwise unknown outside South Korea reached a billion views on Youtube, resulting in worldwide tours. Now there are millions of unknowns on the same platforms. It's never been easier to promote your art, and it's never been more complex because there are so many others.
Always been the case. I have a late boomer early Gen x friend, who will insist that music was better back in the day, and that everyone was listening to Zeppelin and such, and nothing else. You can pull up the billboard charts for any year he waxes about and read off the top n, and rarely if ever find a track from the bands he claimed "everyone listened to."
Survivorship bias is and always has been real. If you don't believe me, think about the last time you heard Tubthumping from Chumbawamba on the radio or in a commercial.
I'm not convinced that every pirate download equals a lost sale. Certainly sometimes it does, but I don't think it's the case that creators lose much revenue due to piracy. I think the big music labels and giant publishers might -- might. But that's not the same as creators losing money. And we're also unable to count how often piracy results in concert ticket sales that may have otherwise not happened.
> but when your main demographic can always go to bittorrent to get their music if prices are too high, then there is only so much you can do with the price of music.
And that's the thing: if the prices are too high, in the absence of piracy, most people are going to just do without. There's no lost sale when someone decides to do without rather than pay a price they think is unreasonable.
I think the shift in the music landscape you see is due to three things: 1) your tastes have changed, and everyone looks at the "good old days" with a fondness and appreciation that is often undeserved, 2) the music industry itself has changed, moving away from the album-sales model, and fully embracing streaming (I believe around 70% of revenue comes from streaming these days), and 3) it is easier and cheaper than ever to create high-quality music; sure you need some level of talent, but many of the financial barriers to recording your own music (like the need for an expensive recording studio) have lessened or evaporated entirely.
> And, the music industry is still not even the same size as it was in the 90's - global revenue in 2024 was $29 billion, while in 1994, it was $35 billion
This seemed surprising to me, so I did a little bit of light research. This isn't true. Revenue was steadily rising until around 1999, started dropping during the main time of digital disruption, to a low in 2014. In 2024, revenues were 1.5x what they were in the ~1999 peak.
Now, if you do inflation-adjust those numbers, you get a picture more like what you're saying, with a peak around 1999, a sharp decline, and then only a partial recovery.
But total revenue is only one part of the picture, and we can't judge creator impact solely upon that. And at the end of the day, no one is entitled to revenue. Sell a compelling product at a price people are willing to pay, and you'll make money.
Outside of streaming, I personally don't see many compelling products out there when it comes to music. I bought CDs and cassettes as a kid, but I don't see physical media, or even digital album bundles, as purchases worthy of my time. I have a YouTube Music subscription, and that fulfills the entirety of my at-home or on-the-go music needs. On top of that, I go to concerts and festivals when my favorite music is in town, and I'll sometimes buy some merch (like a festival t-shirt). Beyond that, I just don't see a need to spend money on music. (When I think about it, though, I probably do spend more money on music today than I did when I was buying physical media! Some of that is due to my better financial situation now, to be sure, but not all.)
> Personally, think we should try to find ways to support musicians, writers, thinkers, artists...
I absolutely agree, but I don't think piracy has the big negative effect on creators that you think it does.
Appreciate your view, and am no expert at this, but as you mentioned, the numbers do speak for themselves. Yeah, it isn't just "the good old days," we all who followed the music industry saw a huge decline in revenue in the 2000's (it was catastrophic and was a punch to the gut). It just kept going down year after year. And as you mentioned, if you adjust for inflation, the size of the industry is still smaller than it used to be...
Regardless, yeah, the music industry took a huge hit, and is looking better these days with streaming (which saved it), but it's still not great.
> And that's the thing: if the prices are too high, in the absence of piracy, most people are going to just do without. There's no lost sale when someone decides to do without rather than pay a price they think is unreasonable.
Agreed, if prices are too high, yes, they'll do without. But in the past, on average, it seems like most people did actually purchase CD's and DVD's, me included. Most of us had quite a sizable collection, and would routinely visit music stores to pay $20 to buy a CD, just because they liked one or two songs (and that's in 90's money). Yes, the music industry took a lot of the share of revenue, but that industry still is what promoted and supported the musicians.
I agree with you. There's a huge sense of entitlement from people who pirate, and the most absurd set of excuses. I bet most of them would shoplift if it was consequence free. And then complain that shops were going out of business.
I'd like them to enable torrents for single files, like Internet Archive does. Waiting too long to be able to download a file is kind of annoying.
We are still alive and kicking. In recent weeks we’ve seen increased attacks on our mission. We are taking steps to harden our infrastructure and operational security. The work of securing humanity’s legacy is worth fighting for.
Since we started in 2022, we have liberated tens of millions of books, scientific articles, magazines, newspapers, and more. These are now forever protected from destruction by natural disasters, wars, budget cuts, and other catastrophes, thanks to everyone who helps with torrenting.
Anna’s Archive itself has organized some of the largest scrapes: we acquired tens of millions of files from IA Controlled Digital Lending, HathiTrust, DuXiu, and many more.
We have also scraped and published the largest book metadata collections in history: WorldCat, Google Books, and others. With this we’ll be able to identify which books are still missing from our collections, and prioritize saving the rarest ones.
Much thanks to all of our volunteers for making these projects happen.
We’ve forged some incredible partnerships. We’ve partnered with two LibGen forks, STC/Nexus, and Z-Library. We’ve secured tens of millions of additional files through these partnerships. And they are helping the mission by mirroring our files.
Unfortunately we have seen the disappearance of one of the LibGen forks. We don’t have further information about what happened there, but are saddened by this development.
There is a new entrant: WeLib. They appear to have mirrored most of our collection, and use a fork of our codebase. We have copied some of their user interface improvements, and are grateful for that push. Sadly, we are not seeing them share any new collections, nor share their codebase improvements. Since they haven’t shown commitment to contributing back to the ecosystem, we advise extreme caution. We recommend not using them.
In the meantime, we have some exciting projects in the works. We have hundreds of terabytes in new collections sitting on our servers, waiting to be processed. If you’re at all interested in helping out, feel free to check out our Volunteering and Donate pages. We run all of this on a minimal budget, so any help is greatly appreciated.
Please remain up. Libgen no longer works. I've used IRC for fiction and non-fiction but tech books need Anna's Archive and Libgen. I buy the physical copy with company budget to pay the author but I need DRM-free ebooks to read comfortably on my Tab S9 Ultra.
Not accurate. You are probably looking at a site like https://libgen.ac/ which states clearly at the top: "Not a Part of Library Genesis. ex libgen.io, libgen.org"
Given that big tech has been scraping everything ever written to train LLMs, are there specialized prompts to trick models into spitting out copyrighted works?
Kind of... the fact that they have the actual data behind a "soft" paywall (waiting times and terribly slow transfers otherwise) makes me a bit skeptical of their "goodwill".
No such thing as free when bandwidth costs money.
Any service online that is handing out things for free without restriction is getting its return through unscrupulous means and shouldn't be trusted.
Anna's Archive straddles the line enough to allow people to download books for free but not at too great an expense to the volunteers who pay out of pocket to support the project.
Information and well-crafted sentences are available on the Language Tree, easily plucked by anyone at zero cost. It's greedy for those so-called novelists and subject matter experts to expect a living wage.
"Information wants to be free," which means that any cost of producing that information can be abstracted away due to ideological inconvenience.
Then show me the easily available "information on the language tree" to solve the unsolved problems in science.
Btw. books are not mere information, they are also products of effort and sacrifice and intentions. They are also embedded in an economic system of paper, books, ink, transport and what not producers.
So you are either poor or too lazy to buy a book from the store. But this doesn't justify mind theft or its distribution.
Governments. You forgot governments. They take the bulk of the money, especially in Europe.
~25% VAT and then the publishers and retailers take their cut. The government takes another 40% in income and payroll taxes from that. The leftovers are what the author gets.
Buying from yourself is probably the biggest markup you can get.
Very little. Aside from high-profile/best-selling authors who do make a decent amount of money, the vast majority of writers do it because they love doing it, not because they expect to become rich.
I believe you only hit the paywall when you try to use the search engine & download individual files. They still offer the underlying data for free archival/mirroring via torrents.
I'll also say that when too much money starts becoming a part of this, trouble will increase dramatically. I realize this sort of endeavor costs a lot of time and money, but it's a line we should probably be aware of.
I know you're joking, but what the AI training lawsuits have said so far is that training and digitizing used books that you bought is fair use, but piracy isn't.
The Internet has been redesigned. It's just not been redesigned with your interests in mind and at least some of the "attacks" are features to the right people.
The precursor to Bitcoin was this interesting project called Hashcash. It was built to combat email spam and forced the sender to spend compute solving a moderately hard hash puzzle and put the result in the header. The person who receives the email can easily verify that the sender "paid" the cost.
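To make the idea concrete, here is a rough Python sketch of that kind of proof of work. It is a simplification and not the real Hashcash header format: the `resource:counter` stamp layout, the 12-bit default difficulty and the function names `mint` and `verify` are all assumptions for illustration. The asymmetry is the point: minting takes thousands of hash attempts on average, verifying takes one.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, bits: int = 12) -> str:
    """Sender side: brute-force a counter until the stamp's hash is rare enough."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int = 12) -> bool:
    """Receiver side: a single hash confirms the sender did the work."""
    if not stamp.startswith(resource + ":"):
        return False
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits
```

A sender would call `mint("alice@example.com")` and put the returned stamp in a mail header; the recipient calls `verify(stamp, "alice@example.com")`. A real scheme also needs a date and a double-spend check so stamps can't be reused.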
Proof of work and micropayments (eg. Xanadu or Internet Mail 2000) schemes solve spamming and LLM scraping, but are more expensive or more CPU-intensive.
P2P systems like FreeNet too, but they are harder to use and more storage intensive and make it easier to spy on individual users.
Tor solves UK-like surveillance laws but it's slower and makes it easier to spam.
Decentralization and interoperability, including the TCP routing protocols, give the network the ability to grow freely, but make those kinds of attacks easier.
The easiest way to mitigate those problems will be to decrease the openness and centralize more. It might lead to even worse things than DDoS.
I fully agree. It's difficult though because I genuinely believe that the solution space overlaps with cryptography, which is quickly discounted as a viable option because it is now laden with negative connotations.
Cryptography has negative connotations? Like what? Do you mean cryptocurrency by any chance? (If so, it's feasible to practice cryptography without touching cryptocurrency).
- DRM.
- Owner-unfriendly device locks (such as manufacturer-controlled secure boot or locked-down OSes).
- Inability to audit network traffic from one's own devices, i.e. an IoT device.
- Remote attestation, when in opposition to open computing.
I could also see folks seeing the use of cryptography as "having something to hide" - I don't personally agree.
Because the vast majority of people don't want this, and not for some nefarious reason or because they're stupid, but because we don't want to enable blatant fraud and abuse, among other things.
(Not to mention the astronomical technical work it would be; you can't just replace "The Entire Internet")
Right? I mean I love what they're doing. But at the same time please, stop claiming to be holy angels trying to build an archive for historical purposes. You're a terrific piracy site, period.
What is it then that they love doing? Is there a long-term thrill in being a piracy site? I don't think so. No truth in the angel story, but they do say the site aims to "catalog all the books in existence" and "track humanity's progress toward making all these books easily available in digital form".
Zoom out: Anna's Archive and every incarnation of the shadow library that exists is like the Library of Alexandria. In 150 years the copyright holders of the hour will be meaningless and nobody will care who got monetized or whatever. The point will be that a small number of vigilantes preserved human knowledge for posterity, and not even a half-second of thought will be given to the "crimes" that were involved in doing so.
I mean, you don't personally know any of them, do you? How could you possibly know what their motivations are?
And even if their motivations are less than pure, I will 100% get behind the mission of preserving humanity's literary output. If that's the outcome, I don't care about their motivations.
If efforts like this are to be sustainable in any lasting way, participants need to be cooperative, not parasitic.
I agree with the Anna's Archive team: it serves no one to have one of these players in the space hoarding their own collections and not sharing them with other archiving projects. It makes the collection extremely vulnerable and at risk of becoming lost knowledge as time goes on.
I disagree with how this is framed. Shadow libraries thrive on decentralization; any other server mirroring a collection is better than no mirrors at all.
I'm not sure how you disagree with this.
Decentralization relies on multiple copies in multiple places.
The fact is that WeLib is not allowing other libraries like Anna's Archive to mirror or copy their exclusive collection, hence the recommendation not to use them.
Otherwise, please explain how I am missing your point.
> If efforts like this are to be sustainable in any lasting way, participants need to be cooperative, not parasitic
That is an odd demand for a site that thrives on piracy. Don't steal from the thieves? When you take from others it's liberation, when others take from you it's parasitic. That's certainly a convenient coincidence.
They steal it, but give everyone free access. You can download it for free, but can also torrent everything. They don't hoard for themselves, but everyone gets access to what they have. That is the crucial difference.
Only giving access to your material over downloads means that people have to pay if they want to get more of it. If those people don't share it then the material is going to be lost again.
Torrenting all the material, using their frontend as a base, and just making money off it is different.
They're not saying that the experience of using them will be bad; they're saying that participants in the ecosystem who are not cooperative are a net negative on the future of the movement. As a user, you may not see that directly, but only over time if resources are taken away from the cooperative parties.
Fuck that site. Offers people links to free PDF downloads of my book that I worked on for 32 years and finally got published by Pantheon Books in 2017. I didn't work all that fucking time for criminals like these to just break copyright law and make the book available for free. Fuck Anna's Archive, and I hope they go down in legal flames ASAP.
I hope you wrote that book more for personal pleasure and fulfillment than monetary gain. Over 32 years, you would have to be a best seller, given the price of your book on Amazon (without counting the free audiobook you offer if someone starts a trial), to be making minimum wage.
If you did that for passion and the book is good, it will definitely have a bigger impact if people can read your stories without having to go through Jeff or a bookstore (many English books are very hard to acquire outside of the US).
So, rejoice in the fact that someone thought your book was worth making available for the few who even know how to use these kinds of online libraries (most people in the world don't). Bitterness over loss of revenue is definitely not worth it, especially after having put 32 years of life into it.
Unfortunately I don't really care about 60s US tech "scene" but the cover seems nice.
It may be a minority, but not all authors share your view. Paulo Coelho [1] says “a person who does not share is not only selfish, but bitter and alone”. Sorry gotta say it, your tone matches.
I can promise you that the site isn't the reason your book flopped financially. That is just what the vast majority of books do, especially ones on such niche topics.
I'm sorry you feel that way and it's understandable to be frustrated by them allowing piracy of something you've worked so long on.
That being said, do you know if their offering of your material has had a significant impact on your revenue, or is it more the principle of the matter?
This strikes me as a bit ironic, if you're serious, as you list your current work as covering the entirety of the Beatles discography. Are you paying them for the rights?
I actually think it's ironic for precisely that reason. Similar to covering music, there is a legal precedent for making books available in public libraries - though most cover artists don't pay the royalties, and in this case this online library is not paying the GP. In the case that GP did in fact pay the fee, I rescind my criticism.
My understanding is that libraries do pay fees to stock books, some of which goes back to the original author. Anna's Archive does not pay anything back to the authors.
I think GP's criticism is valid. The toplevel poster is creating work that leverages the creativity of others. Regardless of whether or not he's paid a fee to do so, it's still funny to see the indignation about sharing, when the person's current project involves using the work of others.
There is both a qualitative and quantitative difference between covering/remixing the art of others, vs. just putting the original up for ~~sale~~ free.
I never heard of the site. But looking at it now, I can't see how it's anything else other than piracy.
I looked up one of my favorite authors ( https://annas-archive.org/search?q=scott+sigler ) and you can download practically his lifetime's worth of work in 5 minutes. This is not some author who lived 200 years ago - he is living and writing books now and this is his livelihood.
Don't modern artists do this all the time? I mean, if you understand that you exist in a digital world where copying data is not only free and easy, but also the simple nature of computers, and that people do it all the time, can you really be surprised when your digital creation that's put into this world is treated like everything else?
Cultures are created to protect power structures. Culture is the enforcer of authority.
Culture distorts principles in order to defend the authority of evil. Culture must convince you that it is not wrong when law subjugates your worth and destroys your freedom. Culture convinces people of this by perverting the concept of morality.
Morality is liberty. Immorality is evil. The exercise and defense of freedom are moral. The destruction of freedom is immoral. This is the pure truth of morality.
Prudence is the proper application of principle. Imprudence is foolishness. Prudence is not morality. It is not immoral to kick a heavy stone with your bare foot, but it would probably be foolish. Prudence is a question of applying the principles and wisdom you have gathered in your life to achieve the goals you have for yourself. This is made possible by liberty. Without liberty, prudence is meaningless. Morality must come before prudence.
The great lie of culture is that authority is not bound by morality, and that authority can enforce its own prudence upon you. The great lie of culture is that you are worth less than law.
Cultures teach that intentions of prudence can be enforced by law. In this fashion they gain excuse to control the lives of people.
In order for people to learn, grow, and find happiness, people must be free to test their understanding of principles. With freedom, they can do this by a process of faith, trial and error. In this fashion children grow from immaturity to maturity. In this fashion human beings gain wisdom.
Cultures are agents of evil. The objective of evil is the damnation of your ability to grow strong in wisdom. The objective of evil is the destruction of your worth. In order to gain control over you, culture spreads the lie that authority is not bound by morality. It teaches that authority can destroy freedom at will, and claims prudence as the reason you should willingly submit. In the name of defending you, culture claims that the destruction of freedom is morality. Cultures pretend that evil is good and that good is evil.
Prudence can be found all around you. It is found in the choices you make every day. Even when a mistake is made, you learn prudence. Prudence cannot be enforced. To enforce prudence is law. Law is lie. Without the freedom to choose, you cannot learn prudence. You cannot be happy.
Morality can be found all around you. Wherever you find it, you will find joy. Wherever you find immorality, you will find misery. Culture enforces authority by destroying freedom with law. This is immorality. - The End of all Evil, Jeremy Locke
You have invested in an idea that has been created by power structures through culture, that you are getting harmed by someone else's freedom. The people that will/want to support your work will do so out of a desire to do so, not because law says it's right.
Many people are deceived that law breakers are immoral and harmful to society, but I don't think that's the case. Most laws are created to subjugate people (i.e., take away their agency). Laws created by power structures, which are ultimately designed to benefit their creators or supporters, have done a very good job of convincing the subjugated that their interests align. Those that have been deceived by a system of laws that benefit the powerful are too invested in demanding a return for their efforts. Whatever happened to the priority of making the world a better place first and foremost and having faith that you will be compensated in some fashion for your efforts?
I think you must be using an unusual definition of culture. As I understand it, culture is, broadly speaking, the shared values and practices of a group of people.
The only way to avoid having culture, in the usual sense, is to prevent groups of people from existing.
It is unusual. We have been conditioned to believe that culture is created by shared values. But actually it is guided and molded by authority to create the illusion that it's driven by society. Obviously this isn't true in all cases, but for most, it's my belief that it is.
People can exist outside the constraints of a culture that is imposed on them by understanding the human value and worth they are born with, instead of looking to institutions and governments to give it to them.
In a society that doesn't have a centralized governing factor where the powerful impose their will on the people, then yes, I agree that it's created by a shared understanding among its people. But that's not the case for 95% of the world's cultures.
Oh, gotcha - if you'll permit me to paraphrase: it's not culture itself that you find evil; but that the powerful tend to warp the culture to protect their own interests.
Right. IMHO culture, at least for a very long time now, is used as a vehicle to push agendas, and people should be very wary about what to believe from what society says about a great many things.
I would agree if those shared values and practices grew entirely organically. But unfortunately people in power have a lot of, well, power, to shape culture.
People like this, because people like free stuff, and like to rationalize getting free stuff. Occasionally, someone who likes free stuff styles themself a freedom fighter, though their values do not otherwise seem to extend beyond getting free stuff.
Some AI company techbros like this data trove even harder, and limit their pretending to publicly saying things like "we're changing the world" (and "AI could be bad if you don't give us money and lock out competitors") but really only care about wealth and power.
Certain sanctioned countries that culturally value literature and science might also appreciate this. (This last category, I'm much-much more sympathetic to, and wish them well in their intellectual pursuits and appreciation of the humanities, though we should really find a better way to share that doesn't undermine Western economies and many people's livelihoods.)
I share your concern for the livelihood of authors (and your skepticism regarding the naiveté that often surrounds pro-piracy rhetoric), but I don't think that's fair to the question here. Unlike in the case of music or film, most users are not just trying to get the latest NY Times best-selling novel. The percent of books made accessible through these services that are tied to an author's income through consumer sales is negligible. Most specialist literature, whether in the natural sciences or the humanities, is priced under the assumption that university libraries are the ones making the purchase, often more or less automatically. Yet even and perhaps especially in the US (I know nothing of the library culture in certain sanctioned countries), it's increasingly rare that university libraries have open stacks for non-students and there are incredibly few public libraries that actually provide access to scholarly works, past or present -- New York Public Library and the Library of Congress in DC are the ones I've used personally, but I'm sure there are a handful of others.
Moreover, the countless AI companies now buying and pulping copies of every book in existence seem to be really changing the used book market. Prices are going up dramatically, and before this year it was very rare to not find a single copy in the world of whatever old book one desired.
As someone who spends a disproportionate amount on books and shares your concern for not making life even more difficult for authors, these services going away would be a tremendous regression.
Don't forget the video piracy thread had a lot of justification to the effect of 'the people that work on these shows/movies don't get paid enough anyways, so it's ok for me to pirate'. Wait, so you think they should get paid more for their work, that what they do is worth being paid for, just not by you? Weirdest flex.
No, I've absolutely seen that argument made online as justification for music and movie piracy, many times, for many years.
People rationalizing aren't mental giants. Piracy is generally by people who want free stuff. Not by philosophers who arrived at piracy through some line of reasoning other than wanting free stuff.
Anyone who doesn't train on all material available, legal or otherwise, will be outcompeted by teams that do, including those based in countries that don't respect Western copyright law. It's that simple.
Either this practice is judged (or legislated) to be fair use, or copyright is done. It's also that simple.
I'm not convinced that LLMs and other AI models need to train on all material available. A representative sample is better.
I'll ignore the legality aspects in my response. I think coming up with a representative sample of all relevant information would be better in the long term (teams will not be outcompeted on long time horizons). Why don't the companies do this? Because it is easier to just "carpet bomb the parameter space" and worry about the potential confounding [1] and sampling bias [2] later. Coming up with a representative sample requires domain expertise and that is expensive in terms of time and money. But it reduces the total amount of training data and should reduce the amount of time and resources it takes to build the models. That may matter now that models are quite large.
This is definitely a design decision with tradeoffs on both sides. I can entertain the notion that we don't have time to sample things, but I think we are all too often dismissing the long-term benefits of proper sampling.
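To make the sampling idea concrete, here is a minimal sketch of stratified sampling in Python. Everything in it is an assumption for illustration: the `stratified_sample` helper, the `domain` field, and the toy corpus sizes are made up, and a real training pipeline would be far more involved.

```python
import random
from collections import defaultdict


def stratified_sample(docs, key, per_stratum, seed=0):
    """Draw up to `per_stratum` documents from each group defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    groups = defaultdict(list)
    for doc in docs:
        groups[key(doc)].append(doc)
    sample = []
    for name in sorted(groups):
        members = groups[name]
        # Never ask for more documents than the stratum actually contains.
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical, heavily skewed corpus: fiction dominates, law is rare.
corpus = (
    [{"domain": "fiction", "id": i} for i in range(900)]
    + [{"domain": "chemistry", "id": i} for i in range(90)]
    + [{"domain": "law", "id": i} for i in range(10)]
)
subset = stratified_sample(corpus, key=lambda d: d["domain"], per_stratum=10)
```

With a fixed quota per domain, the 10 law documents carry the same weight as the 900 fiction ones. Deciding what the strata and quotas should be is exactly the part that needs the domain expertise.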
(In terms of the legality aspects, judges are trying to "split the baby" [3] in my opinion by saying that training on stuff you got legally is OK but training on pirated material isn't. So nobody is going to recommend training on pirated material in the first place.)
So, what? Authors and rights holders are supposed to just take it?
Copyright law exists for a reason. Trying to improve an LLM doesn't give you the right to flout our legal system. Yes, other countries might have an advantage in LLM training as a result but so be it.
> Authors and rights holders are supposed to just take it?
If it's judged as fair use, then yes. And then it's not flouting anything.
Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.
For example, nonfiction authors already "just take it" when reviews describe the main points of their book without paying them a cent. The justification is that it's for the greater good, and rights are limited.
Judges have recently ruled [1] that training on legally obtained materials constitutes fair use, but we will have to see in the long term if that ruling holds up.
> Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.
That's a rather bastardized and twisted representation of copyright and fair use.
The "whole point" of copyright was to promote the authorship of original creative works by legally protecting the financial income of those authors. The "whole point" of fair use was to make exceptions in cases where it's clear that the usage doesn't result in a market substitute and deprive original authors of their income.
The end-goal of LLMs is to ingest all of that original content and reproduce it with expert-level accuracy, promising to be the know-all, end-all product. If wildly optimistic predictions of LLM proponents turn out to be correct then they will never buy a book again, they will have no reason to. And this is precisely what the copyright was designed to protect authors against.
> If wildly optimistic predictions of LLM proponents turn out to be correct then they will never buy a book again, they will have no reason to. And this is precisely what the copyright was designed to protect authors against.
And under those circumstances, your opinion is that copyrighted books should continue to exist, with full legal protection?
How could anyone, including the authors, possibly benefit from an obsolete paradigm like that? At that hypothetical point, your attachment to legacy copyright law would arguably hold back human progress as a whole, not just impede a few greedy corporations from training models on illegally-downloaded books.
Sure, but copyright was designed to accomplish clearly defined goals and LLMs clearly undermine those goals. The motivation and spirit of the law are extremely plainly stated, you don't need to be a legal expert to understand it.
We should absolutely have a discussion about modernizing copyright (and patent!) protections. But it has to be done through a democratic process, companies shouldn't be allowed to just ignore laws that are inconvenient to their business model.
> At that hypothetical point, your attachment to legacy copyright law would arguably hold back human progress as a whole
There won't be any progress if nobody is getting paid for their work. Either copyright stands and LLMs aren't allowed to train without compensation, or they get an exemption and there will be nothing left to train on in a few years.
>the whole point of fair use is to benefit society
I'll stop you right there - I really don't think that applies at all. Does 'society' really benefit when the whole thing is a funnel for enormous amounts of wealth to go to already-gigantic companies like Microsoft?
Yes, if it helps me get my own job done more effectively, efficiently, and economically. That's how our society works. You and I benefit from this, too, not just Microsoft.
If you don't like it, there's a process for changing how it works, but don't expect an easy path to success. Various people will object, and will have to be won over to your way of thinking.
> If you don't like it, there's a process for changing how it works
Except the converse is true. Copyright law today governs how fair use works and even so, how material can be obtained, licensed, etc. To change it to explicitly allow what you're suggesting would require changing copyright law.
If you think copyright law as we know it will survive what's happening today, then... wow. No chance.
Copyright is not a natural right. We pulled it out of our asses, very recently at that, to meet socioeconomic goals that existed at the time. It can and will go back where it came from, if it turns out that AI is indeed a better way to organize, analyze, and distribute human knowledge.
Even if AI doesn't turn out to be anything all that revolutionary, we'll still need to update the law to address both training input and ownership of generated content. Congress and eventually the international community will have to resolve a large number of conflicting legal judgments, unless we want to leave it up to SCOTUS in the US and various unelected judges and bureaucrats elsewhere.
> Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.
If I was a writer, I'd consider publishing my works under a license that explicitly bans AI training. What happens when those works inevitably get ingested by an LLM?
That clause of your license wouldn't be legally enforceable.
Your license can only operate with what copyright allows you to withhold initially.
A license that banned AI training cannot be enforced. It is meaningless. The same way you can't write a book with a license that readers are not allowed to write reviews of it.
Fair use cannot be restricted by license like that.
(You can enter into individual contracts with people, which is how terms like NDAs work, but those actually have to be signed and stuff, and you can't do it with public information like published writing.)
Copyright law indeed exists for a reason. And that reason was that church and crown felt threatened by the power of printing presses to distribute ideas they couldn't control. 'To promote the useful arts' has always been a way to sell the idea to the masses.
Meta managed to get into a private ebook torrent tracker called Bibliotik a few years ago to use for training Llama and the resulting publicity essentially killed the tracker.
Just curious - what is the future of services like these? More and more content will be AI-generated, to some degree. Should that content be aggregated as well?
In the future, the curation function of libraries will become even more important. Libraries, and even bookstores, both physical and online, will probably use their capacity to separate the wheat from the chaff as a competitive advantage. There's no value in a place where AI slop is prevalent.
Whoever is running it must be doing really well for themselves laundering all that crypto.
Also, interestingly, they don't offer a Tor onion service, even though the admin is most certainly technically competent to administer one, given that he no doubt uses Tor to insulate himself from his enterprise and launder crypto. What is the reasoning for that?
Your comment seems like a non sequitur to me. Whether something is a "non-profit" has nothing to do with whether it receives or spends money. (See, e.g. the American Red Cross's ~$4B/yr budget.) It's about what it does with the money it has.
Obviously, since Anna's Archive is breaking the law, it can't conform itself to the normal legal/regulatory system that governs non-profit organizations. It can certainly still claim to be acting in the spirit of a non-profit, and it's up to you to decide whether you trust that claim. Nobody's forcing you to give them money.
It may have that connotation to you, but in general (at least in the US) non-profit organizations are not required to have independent audits. Typically, that requirement only happens if they receive a certain amount of government funding. An organization may choose to undergo audits in order to make people feel better about donating to it.
I really, really don't think that anybody is being fooled or misled into thinking that Anna's Archive is a "legitimate" audited organization when they describe themselves as a non-profit.
> The connotation of a non-profit is that it's being audited.
This is very geography-specific. In the US, 501(c)(3)s (what most people think of when they say "non-profit" where I am) have no general requirement for audits. There's also plenty of non-profit-by-some-definition organizations that never file a Form 1023, giving up some benefits of the 501(c)(3) regulations but in exchange being even less regulated.
At least in the US, claiming that you are a nonprofit implies that contributions are tax deductible. Claiming that you are a nonprofit when contributions are not tax deductible might be considered fraudulent.
Not true. There are different classes of nonprofit and they are not all tax deductible. Some nonprofits opt to forgo pursuing that status because it involves a lot of extra administration/filing requirements.
You're responding to a different point than the one I made. It's true that being a "nonprofit" doesn't logically entail that donations will be tax deductible. But it still implies it to potential donors. The former is a matter of logic, the latter is a matter of psychology. Both are relevant.
Yes, there are multiple classes of nonprofit, not all of which are tax deductible. But it is also true that holding yourself out to the public as a "nonprofit" has the potential to mislead because it may imply to potential donors that contributions would be tax deductible. That is why responsible (or at least well advised) nonprofits disclose which they are, because claiming you're a "nonprofit" in marketing materials, without further explanation, can mislead potential donors.
They are already very much in breach of US law, which they have always been clear about. That aside, they don’t claim that contributions to them are tax deductible.
I would love to see someone try to explain to the IRS why all those purchases of Amazon gift cards and Monero for the transparently illegal organization should be deductible though
Is Cosa Nostra a non-profit? The question doesn't make sense. It's a category error.
A non-profit is a corporate legal structure. An unregistered organization could be a cabal, a gang, a syndicate, a fellowship, a religion, a movement, a private club, or something else.
The intent is still important. While from a legal point of view a terrorist cell cannot be registered as a non-profit, it typically spends whatever funds it can secure to further its political goals, not on increasing the wealth of its owners or participants. A typical criminal band though is a for-profit entity.
Given the amount of hosting and storage needed to sustain this project, nobody is getting rich off of donations.
Not to mention that the lifestyle tradeoffs that inevitably come with international fugitive status do not lend themselves to a very comfortable life.
The usage of crypto is entirely one of necessity, as controlling information and knowledge is something powerful people have clear stakes in. Many countries wield their financial systems to hold or acquire power. Information and knowledge is one form of such power.
Everything points to the Anna's Archive team being passionate ideologues as opposed to some criminal enterprise focused on profit motives.
> Not to mention the lifestyle tradeoffs that inevitably come with international fugitive status do not lend themselves to a very comfortable life.
Anonymous international fugitive?
> Nobody is getting rich off of donations.
How can anyone aside from the beneficiary know that?
The extent to which the controller can get rich off this enterprise depends entirely on the unknown quantity of donated funds (and deals with AI companies) and his skill at laundering crypto (which darknet marketplace controllers doing far more illegal stuff can do).
I'll believe that when they publish financial statements.
> Everything points to the Anna's Archive team being passionate ideologues as opposed to some criminal enterprise focused on profit motives.
"Passionate ideologues" who make you pay if you want to download anything at speeds greater than 10KB/s, how nice of them. I would rather just support the author, thank you.
I generally support piracy, but these piracy-as-a-business vultures who've been showing up in the shadow library scene need to go.
Illegal doesn't at all have to mean immoral or particularly wrong either. Laws are complex constructions, often created for decidedly hypocritical reasons of benefitting some at the expense of others.
Thus, who gives a shit if they're taking money from those who voluntarily subscribe? They still offer an absolutely incredible free service to who knows how many people who otherwise wouldn't be able to afford access to so much information.
Given the behavior of the pro-copyright business interests and legal bodies of the world, and the outright hypocrisy of openly creating one set of rules on content piracy for certain corporations while applying another, harsher rule system for those who aren't so nicely connected, smug moralizing about something like Anna's Archive has little grounding.
And aside from picking random crap out of your ass for smearing arbitrarily, what shred of evidence do you have of anyone there laundering crypto, and how?
> what shred of evidence do you have of anyone there laundering crypto, and how
The controller's freedom. If they didn't launder it they wouldn't be free.
> They still offer an absolutely incredible free service
Actually their free downloads aren't particularly good when compared to some of the other online services that 'leech' from them.
And their torrent strategy could be altruistic, but it could also be self-interested: it spreads storage costs around, attracts more contributions, and provides insurance against hard drive seizures.
What mainly interests me is how much money they are actually making, I suspect it's very profitable.
>What mainly interests me is how much money they are actually making, I suspect it's very profitable.
Well, it's about calculating their site support, storage, server and bandwidth costs. What might those be? Aside from these, I've seen them claim they use volunteers for much of their site support, and they certainly don't pay, or need to pay, anything for marketing, since word of mouth (partly through notoriety and partly through uniqueness coupled with extreme usefulness) is more than enough to keep them famous.
I choose the books I buy, from Anna's Archive. I choose the comics I buy from readComicsOnline. I choose the [european] graphic novels I buy from #WONTTELL.
And I am one of the best customers of these 3 physical shops, in my town.
So sure, I don't buy the latest trends based on ads. I investigate a lot to buy GREAT stuff. Sometimes the shopkeeper gets a headache trying to find the obscure stuff I discovered online that NOBODY knows exists.
Am I an exception?
I don't know but those services are great to maintain a freedom of choice.
It's complicated.
Many years ago, I was involved in a movie release group. Pretty much everybody in that group owned more VHSs/DVDs than the typical person. This is probably not surprising, since the time and effort one needs to put into that is rather large.
Those who only downloaded were more of a mixed bag; some of them were not in the US and might not be able to see a domestic release of the movies any time soon. Some proudly claimed that they never bought any media because paying for it when you could pirate was for losers.
I spent a small fortune on a record collection. Then the record format was abandoned and it was all CDs. I spent a small fortune re-buying that same record collection, insofar as the records were even available as CDs. Then we went all digital (yes, I know CDs were already digital) and it became MP3s. So I ripped my CD collection and consigned the discs to a box in my attic. I will not be spending money on Spotify or whatever other service to listen to stuff that I already have.
Movies... I spent a small fortune on a movie collection. Then I moved countries and to my surprise found that my movies wouldn't play anymore. So I ripped the DVDs to digital media and played them using open source software. This saved a small fortune and was more convenient as well. I think I still have the DVDs.
I spent a large fortune on books. Thousands of them. Typically read once, a much smaller number read multiple times. So I gave away my books, except for a few hundred that I still keep. I support the authors that I like by buying their books but I read on screens not on paper because my eyesight sucks and on screens I can set the font to whatever I want rather than to what the publisher thought was optimal.
There is no way the media companies are going to guilt trip me over any of this, besides that I read both Janis Ian and Courtney Love's pieces on the recording industry.
Copyright is great, it has enabled lots of people to earn a living creating content. But it has also become a weapon in an ever more absurd war between consumers and middle men, the producers caught in some uncomfortable position in the background.
What's interesting is that the middlemen brought this all on themselves: they equated buying a physical copy of a production with licensing IP, but the general public didn't think that way at all: they bought a book, they bought a record, they bought a movie. And passing on what you've bought when you no longer need it was and still is such an ingrained part of our culture that it felt really weird to have restrictions placed on what you could do with stuff you bought and paid for. So when the format changed from physical to nothing (bits) plenty of people felt that this was not quite what we had agreed to, after all we were paying for the medium as much as we were paying for the content so how come we paid the same or even more as before? And now we paid and got something that we could no longer share with others. No way to easily pass that e-book to someone else (talk about malicious compliance), no way to send the song you just paid for through Spotify or iTunes to someone else to let them hear it after you are done with it. You don't own the medium any more so therefore you own nothing at all.
And those publishers and movie producers are all laughing to the bank whilst doing nothing at all except for playing bank.
Oh, the transition to ephemeral copies definitely changes things, and was well after my active time in a movie piracy group.
The (in)ability to loan, trade, and bequeath media is a real loss in the ephemeral media era, and should be a serious topic of any copyright reform.
I can't even reliably pay for a second copy of an ebook for friends. They literally won't take your money for cross-region sales or whatever, due to asinine market restrictions.
The French comic pirate scene has an interesting rule where they keep a ~6 month time lag on what they release. The scene is small enough that the rule generally works.
It's a really good trade-off. I would never have gotten into these comics without piracy but now if something catches my eye, I don't mind buying on release (and stripping the DRM for personal use).
Most of my downloading is closer to collecting/hoarding/cataloguing behaviour but if I fully read something I enjoy, I'll support the author in some way.
Are there any links you could share for.. ..ehm.. ..research purposes?
I've downloaded a few from yggtorrent, but there might be some more niche/less public sites I'm not aware of
Similar. Anna's Archive has become a more convenient alternative to the campus library. I can grab something while at home, get the info I need, and delete. If the title is worthwhile, I'll buy my own copy. I don't buy more books than I did before, but my satisfaction rate is higher, since I can check the contents before buying.
On the other hand, I buy way more movies than I used to, because upload sites have exposed me to many good films that I would never have heard of otherwise.
No, I'm the same. A lot of stuff I read is hard-to-get philosophy or from obscure authors, so I first get them from Anna's Archive. Reading them on paper is much better so I try to find a physical copy later.
> Am I an exception?
Years ago I was following development of an indie game. The developers wanted to provide a DRM-free experience.
The game had some online functionality (leaderboard or something). They were surprised when the number of accounts accessing the online functionality exceeded their sales by a dramatic number. The developer updates grew more and more sad as they switched from discussing new features to pleading with people to actually buy the game instead of copying it. Eventually they called it quits and gave up because the game, while very popular, was so widely pirated that few people actually paid.
Whenever the piracy topic comes up I hear people do mental gymnastics to justify it, like claiming they spend more than average and therefore their piracy is a net win. Yet when we get small peeks into numbers and statistics like with video game piracy, it’s not hard to see that the majority of people who pirate things are just doing it because they get what they want and don’t have to pay for it.
The difficult bit is working out what percentage of pirated copies are actually replacing a sale that would have happened if the content wasn't available to pirate. The more dramatic industry numbers like to claim it's 100%, which is ridiculous. It's certainly more than 0%, though.
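As a back-of-the-envelope illustration of why that percentage matters so much, here is a small Python sketch. The function name and every number in it are hypothetical, not figures from any real game.

```python
def displaced_sales(pirated_copies: int, displacement_rate: float) -> float:
    """Estimate sales lost to piracy.

    displacement_rate is the assumed fraction of pirated copies that would
    otherwise have been purchases (0.0 = none, 1.0 = every single one).
    """
    if not 0.0 <= displacement_rate <= 1.0:
        raise ValueError("displacement_rate must be between 0 and 1")
    return pirated_copies * displacement_rate


# Hypothetical numbers: 9,000 pirated copies alongside 1,000 paid ones.
pirated = 9_000
for rate in (0.0, 0.05, 0.5, 1.0):
    print(f"rate={rate:.2f} -> about {displaced_sales(pirated, rate):.0f} lost sales")
```

The same count of pirated copies gives anything from zero to 9,000 lost sales depending on the assumed rate, which is why the headline piracy figure alone says little about lost revenue.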
I'd assume that for your indie game, there were a lot of people who wound up thinking "I would play this if it's free, but I wouldn't spend $X on it." Adding successful DRM wouldn't have done anything to them but drive them away, and reduce the amount of buzz the game received. But then, particularly in the indie game space, maybe trading away a lot of buzz for a couple hundred more full-price game sales would have been completely worth it...
This is where the concept of services like Xbox Game Pass seems to be landing. Once someone has paid their fairly-small-amount each month, every game is now "free". Much like fairly-cheap streaming music basically stopped music piracy from being mainstream, cheap game-services might have the same impact on the game industry.
Though, much like streaming music, whether it turns out to be economically viable for the average game studio is certainly a question.
(For the sake of completeness: I don't pirate anything, so I have nothing to justify here.)
> It's certainly more than 0%, though.
Is it? You also need to account for sales that only happened because someone learned of the game via a pirated copy.
Compare what Nintendo sells to what other publishers sell on PC...
Sales or economics is not the only thing a developer may care about. Some people want control over their work and will be upset by people pirating their game even if it doesn't mean they lose a sale. Similarly, many artists do not want you to repost their art or use their art as your profile picture.
Ok, but should we care if those developers/artists get what they want? Some companies would also really like to take games they have sold you away from you so they can sell you the next installment. Some developers don't want certain groups of people they dislike to enjoy their game. Not all things that developers want are reasonable.
Sure, but the specific thing the person I was replying to said the developer was complaining about was not getting paid.
I think part of the question though is also, would they have been as popular as they were without piracy, which does provide some advertising benefits through audience exposure. It is easy to say a really popular game would still be popular without piracy, but some lesser known games might never have gained any attention at all if there weren't enough people spreading word about it. Of course trying to quantify the sales and word-of-mouth benefits from that sort of thing is extremely difficult.
> The game had some online functionality (leaderboard or something). They were surprised when the number of accounts accessing the online functionality exceeded their sales by a dramatic number. The developer updates grew more and more sad as they switched from discussing new features to pleading with people to actually buy the game instead of copying it. Eventually they called it quits and gave up because the game, while very popular, was so widely pirated that few people actually paid.
Ok, but why? Was the game actually unprofitable, or did they just feel bad about some people getting it for free? You need to remember that a pirated copy does not equal a lost sale - in fact, sales may even be higher than they would be without piracy, as popularity gained from pirated copies also translates to more legitimate buyers.
Your story sounds like "World of Goo," which reported a 90% piracy rate from comparing unique IP addresses to number sold. Despite that, they didn't quit and recently released "World of Goo 2" still DRM free.
Yes, hit games are still popular enough for sequels (World of Goo 2 came out 16 years after the first one, according to Wikipedia, which is an unusually long time). I remember World of Goo being one of the few choices of games for iPad when it was young.
But the vast majority of developers aren't lucky enough to have massive hits, and so money differences can still matter.
Yeah, people pretend that "piracy" is good because they can try a product before buying, which is true, but let's be real:
out of 100 people doing that, how many actually buy the product in the end? If the net gain were positive, developers would not pay millions to license DRM.
> out of 100 people doing that, how many actually buy the product in the end? If the net gain were positive, developers would not pay millions to license DRM.
Let's not pretend that markets and companies are actually rational.
Which indie game was that?
[flagged]
I'm exactly the same. I tend to get the first book of any series that interests me and read a third before I decide whether to buy it or not. I do buy about 3-4 books a month (mostly epub drm free preferred) plus about 10 european graphic novels (paper books only) a month so I'm a heavy consumer I think.
I follow the newsletter from Borderlands Books in San Francisco. I usually buy one book off their best seller list a month (sometimes I’ll stop in and buy three or four)
I’ve recently started using my local library’s mobile app and I love it. (I typically use this for re-reading or audiobooks for plane trips) I’m tempted to donate my entire bookshelf to the library and let them store and maintain it for me :-p
I don't think I follow. There's no recommendation engine in AA, right? Do you download a bunch of books from AA, read them, then if you happened to like one enough, you will buy it from a local bookstore?
Let me give you an example.
Some Lovecraft letters were translated into French some weeks ago. Great reading! There, Lovecraft gives his opinion about the literature and art of his time.
And he mentions Nicolas Roerich. No idea who this guy was, but hey, pretty interesting painter (thank god for Google Images!). Ok, let's check on AA if there is a definitive book about his art.
No luck, but that very same guy wrote many books about Hinduism and East Asia. After a few downloads on AA, no big deal, I am not so fond of them. Except for one that I knew nothing about (the name is Altai Himalaya, and I have absolutely no clue why this one is piquing my attention, but it does).
That's definitely what I call serendipity.
And that thing happens a lot when you have full access to whatever content is available. [and you are curious by nature]
In the end, in retrospect, such widespread access permits serendipity at a level that is absurdly miraculous!
That’s exactly how I do it. I enjoy reading DRM-free epubs on my Kobo, and whenever I finish a book I enjoyed, I buy it from the local sci-fi bookshop. I buy about 90% of all books I read.
I used to do that with games back when I played. I was always a staunch advocate of, if it's good, people will pay, and I didn't want to be a hypocrite despite refusing to buy most games because they could not be returned afterwards. Even newer services that offer refunds make it more difficult than I'm willing to put up with. If I played it most of the way through, I bought it.
[flagged]
Let's agree that I use piracy to find the things that match my tastes. [something that the legal offers fail to provide conveniently]
Pretty sure anyone who both pirates and buys can call bullshit on you. Also science. Science can call bullshit on you.
Same here.
Also, I tend to look for obscure and old books (I love old travelogues) and once I find one that really gets me, you'll be sure to receive it as a gift, if I think you'd be someone (or in a place in life) who would enjoy it.
So, I might not buy it for myself, but I make my decision on the pirated version and then buy more than my share when it's truly a gem. If I don't end up recommending it or buying it for someone, that usually means it was something which I'd be ok not to have consumed.
If you haven't read it, The Long Ride by Lloyd Sumner is (as I remember it) an excellent read.
Studies show that the biggest pirates of content are also the biggest buyers of content. The theory is that piracy functions as a way to deepen paid fandom not to erase it.
>readComicsOnline
I'll never get over piracy sites blocking VPNs...
maybe my tinfoil hat is on too tight, but to me that behavior sounds an awful lot like what a honeypot would do...
> Am I an exception?
Yes, I think you're an exception, sorry.
We will never have real data on this. But simply on its face, I find it extremely hard to believe that most consumers have a strong enough moral compass to go out of their way to buy something they already have access to. Maybe they will for a tiny handful of special books that they want hard copies of, or authors they really like, but not for most media they consume.
This type of system also becomes a popularity contest for creators; you are supporting the people you like as opposed to whose work you want to read. If an author says something you disagree with, it's easy to just read their work without paying them. I'm not against consumer boycotts, but it should generally come with a sacrifice on both sides--for consumers, that means missing out on the product or service.
You are free to feel however you want about this. I can certainly see the immense societal value of making things accessible to more people. But I flat out don't believe the "piracy doesn't lead to lost sales" shtick, of course it does.
https://gizmodo.com/the-eu-suppressed-a-300-page-study-that-...
From above:
'The Dutch firm Ecorys was commissioned to research the impact of piracy for several months, eventually submitting a 304-page report to the EU in May 2015. The report concluded that: “In general, the results do not show robust statistical evidence of displacement of sales by online copyright infringements. That does not necessarily mean that piracy has no effect but only that the statistical analysis does not prove with sufficient reliability that there is an effect.”
The report found that illegal downloads and streams can actually boost legal sales of games, according to the report. The only negative link the report found was with major blockbuster films: “The results show a displacement rate of 40 percent which means that for every ten recent top films watched illegally, four fewer films are consumed legally.”'
Very interesting report, and I am not discounting it, but another factor is that maybe the pricing effect is already baked in from years of piracy. For example, back in the early 2000s, when P2P file sharing was being used to download music, the music industry had to resort to the iTunes Store to compete, which allowed users to buy just one song for a dollar instead of the entire album (and then later on, to music streaming services). The damage was done decades ago, and even though P2P file sharing isn't big today, its effects are still with us (no music executive is going to go back to forcing people to buy an entire album to get just one or two songs).
But, maybe this report is taking this into account too??
Unfortunately, absence of evidence ≠ evidence of absence.
I obviously don't have time to read a 300 page report—I wish I did—but the conclusion says:
> With regard to total effects of online copyright infringements on legal transactions, there are no robustly significant findings. The strongest finding applies to films/TV-series, where a displacement rate of 27 with an error margin of roughly 36 per cent (two times the standard error) only indicates that online copyright infringements are much more likely to have negative than positive effects.
The conclusion goes on to discuss each type of media. Here's the section on games:
> For games, the estimated effect of illegal online transactions on sales is positive because only free games are more likely displaced by online copyright infringements than not. The overall estimate is 24 extra legal transactions (including free games) for every 100 online copyright infringements, with an error margin of 45 per cent (two times the standard error). The positive effect of illegal downloads and streams on the sales of games may be explained by players getting hooked and then paying to play the game with extra bonuses or at extra levels.
If this is what was meant by "illegal downloads and streams can actually boost legal sales of games" (and it's possible they're talking about something else which isn't in the conclusion), I don't find that convincing. It's within the margin of error and includes free transactions.
Moreover, I firmly believe that we are never going to have good data on this! You're trying to measure two things that are virtually impossible to measure with any accuracy: (1) how much piracy is taking place, and (2) what would sales have been without the piracy.
(I've edited my comment to actually quote the paper)
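To make the "within the margin of error" point concrete, here is a minimal Python sketch using only the two figures quoted from the report's conclusion above (27 with a margin of 36 for films/TV, 24 with a margin of 45 for games). It assumes the margin is expressed in the same units as the estimate and is applied symmetrically; the function names are made up for illustration and are not from the report.

```python
def interval(estimate, margin):
    """Return the (low, high) range implied by an estimate and a symmetric error margin."""
    return estimate - margin, estimate + margin


def sign_is_determined(estimate, margin):
    """True only if the whole range lies on one side of zero."""
    low, high = interval(estimate, margin)
    return low > 0 or high < 0


# Figures as quoted from the report's conclusion (units assumed to match).
films_tv = interval(27, 36)  # displacement estimate for films/TV series
games = interval(24, 45)     # extra legal transactions per 100 infringements

print("films/TV range:", films_tv, "sign determined:", sign_is_determined(27, 36))
print("games range:", games, "sign determined:", sign_is_determined(24, 45))
```

Under those assumptions both ranges straddle zero, which is all that "no robustly significant findings" says here: the data as summarised cannot pin down the direction of the effect for either category.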
>Unfortunately, absence of evidence ≠ evidence of absence.
A study showing no statistically significant effect is not an absence of evidence, it is evidence of the absence of a large effect.
Or it's evidence that the effect can't be measured, which is what I'm trying to say.
I honestly don't understand how you would even attempt to measure something like this. There's no counterfactual. How can you possibly know what sales would have been without piracy?
This study appears to be relying on survey results. That seems questionable to me, because no one wants to admit "I totally would buy more books if piracy wasn't an option, but I choose piracy because I like having money and I think authors deserve to starve." I'm exaggerating for the sake of effect, but really, how can anyone ever know what they would have purchased under different circumstances? It's human nature to self-rationalize your actions. And yet, despite this, the study still didn't find statistically significant results!
Maybe if one country ever manages to truly cut off access to piracy websites, and there's another economically and sociologically similar country where piracy remains readily available, it will be possible to get some valid data on this question. I mostly hope this doesn't ever happen, because while I'm not a fan of piracy, I am a fan of the free internet!
Absence of proof is not proof of absence, and Sagan should have said that.
Absence of evidence is evidence of absence if evidence was sought and not found, and much of science is based on this. Or if evidence of presence should be expected ... consider for example the absence of evidence of an elephant in your living room.
This saying should die along with "you can't prove a negative"--Euclid proved that there is no greatest prime over 2000 years ago. What can't be proven is a universal empirical--positive or negative--such as "no raven is white" or "all ravens are black".
> The report found a lack of evidence that piracy displaces sales.
This isn't true though, as they conclude a 40% displacement in blockbuster movie sales. You would need a better analysis of their methodology to dismiss their other conclusions
As far as I can tell from the conclusion, everything was within the margin of error, so my assumption is that it's random noise. If there's a place in the paper that says otherwise, please let me know what page it's on. If I'm misreading the results, please let me know that as well.
The 40% figure seems to come from section 8.2, p.152, which the authors describe as "robust".
However, having seen the report now, this section on top films seems to use a different methodology to that used for books, so it's not really comparable, and in general I wouldn't put much confidence in these results anyway.
> I find it extremely hard to believe that most consumers have a strong enough moral compass to go out of their way to buy something they already have access to.
This is zero-sum thinking. Do you oppose libraries on the same principle?
Sometimes making a thing accessible can increase the overall market for the good, because it trains the behavior. The market for books requires readers, and readers are created by people reading.
> Do you oppose libraries on the same principle?
No, because libraries have to buy the books! If lots of people check out a book, the library will have to buy more copies! Yes, maybe the authors lose out on some revenue, but there's a clear relationship between number of readers and the author getting paid for their work.
This is also why I thought the Internet Archive's lending library was great! I'm aware they got sued anyway, and I think that's a real shame.
If we take this to the logical extreme - someone had to buy the book in order to upload it to Anna's Archive in the first place.
Yes, but whereas libraries need to buy more copies of books that lots of people check out, Anna's Archive only ever needs one. Not exactly sustainable for the author.
As I said, I loved the Internet Archive's approach to this! That's very much not what Anna's Archive is doing.
Still, libraries buy what, maybe 5 copies of a mildly popular book? I don't think that would be sustainable either if those were the only books sold.
Libraries have to replace paperback books after ~20 checkouts on average. (This number is from memory but I'm quite sure it's in this range.) Hardcover books last a bit longer but of course are also more expensive.
I agree the industry would have a hard time surviving off library sales alone, in the same way that most businesses rely on multiple revenue streams to make ends meet, but I think library revenue is much more significant than you're making it out to be!
It's also likely true that a library that bought 10 copies of a book initially is unlikely to buy 10 more copies once there have been 200 circulations and they are needing to be replaced. They may only buy 5 replacement copies since the book is likely to be less popular than at initial release so it will take much longer for the next 100 circulations to occur.
As for anecdota, I have more than once borrowed a library book and then purchased a copy so I could read it again or to finish it if demand is strong enough that I would have to wait weeks or months to be able to borrow it again.
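The replacement arithmetic in the comments above can be sketched in a few lines of Python. The ~20 checkouts per paperback is the figure quoted from memory in the thread, not a verified statistic, and the helper names are hypothetical.

```python
CHECKOUTS_PER_PAPERBACK = 20  # rough figure quoted from memory in the thread


def copies_worn_out(circulations, checkouts_per_copy=CHECKOUTS_PER_PAPERBACK):
    """Number of copies fully used up after a given number of checkouts."""
    return circulations // checkouts_per_copy


def copies_bought_per_100_readers(checkouts_per_copy=CHECKOUTS_PER_PAPERBACK):
    """Copies a library has to buy for every 100 checkouts, assuming steady replacement."""
    return 100 / checkouts_per_copy


# 200 circulations wear out the 10 copies bought at release.
print(copies_worn_out(200))
# Steady state: about 5 purchased copies per 100 library readers.
print(copies_bought_per_100_readers())
```

This only models wear-out. As noted above, a library may replace fewer copies than it wore out once demand for the title has dropped, so the steady-state figure is an upper bound on purchases in that case.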
Have you tried borrowing a mildly popular recent book from the library? There's often a digital queue of 20+ people with reservations.
There's plenty of incentive for most people to buy the real book rather than wait for the queue.
(I've also found libraries a useful way to discover lesser-known authors, since you can quickly sample/browse books on the shelves. But they won't have all of the books published by those unknown authors... so I end up buying/ordering other things by them)
The principle of virtual libraries is the same as physical ones: only one person has access to the book at any given time. For popular books, either the library has to buy more copies (or digital licenses) or else it rations access by waiting list. The idea is sound IMO.
> If we take this to the logical extreme
I think this is a situation where doing so doesn't make much sense. This is all about compromising, I think that must be the premise.
I would not buy a book after downloading it from Anna's Archive. But that's the wrong question in my opinion. You should be asking why aren't most books available in a DRM free format?
The main reason to download "pirated" books is that they get rid of all annoying barriers that exist in "legitimate" copies. It's a better product.
> You should be asking why aren't most books available in a DRM free format?
Because most people don't care! I wish they did, because I'm like you, I do care about owning DRM free media! I buy video games from GOG wherever possible, and audiobooks from a combination of downpour.com and libro.fm. Guess what most people do? They buy games on Steam and audiobooks on Audible.
Audible is the one that really breaks my heart! Games and movies I understand, because the DRM free sources have such narrow selections, but I can find just about any audiobook I want on either Downpour or libro.fm; every once in a while I'll come across an audible exclusive, but it doesn't happen frequently. And yet, everybody uses Audible!
And, sure, there are known ways to strip Audible DRM, but with DRM free stores so readily accessible, why wouldn't you use those?
>but I can find just about any [DRM-free] audiobook I want on either Downpour or libro.fm
Just had a browse of Downpour. They say that it's mostly DRM-free. I don't get it. How come the rights holders don't complain? My experience of DRM-free e-books is that the available titles are, let's say, nothing I would want to read. And audiobooks have higher production value because of the voice acting. What A-list authors are narrating their own books and then allowing them to be sold DRM-free?
Unless something changed recently, every title on Downpour is DRM free when bought (as opposed to rented). I've been using Downpour for more than a decade and own tons of books. Libro.fm is slightly newer and IMO has slightly nicer UX, but both websites have mostly the same (wide) selection of titles.
I can't tell you why publishers make the decisions they do, but there's no trick here, if that's what you're asking. DRM free audio books are widely available and have been widely available for a long time now.
The real question is, why does Audible insist on putting DRM on their Audiobooks when the publishers clearly don't care? I don't know the answer to that either, but the upshot is that everyone should stop buying from Audible!
If only sales on downpour were possible outside the US. I just tried to buy a K. J. Parker. Does not sell to the EU. I haven't tested libro.fm because their ToS doesn't tell me if non-US sales are prohibited and I'm not going to make an account just to try.
Perhaps, but it’s a bit moot once you have the book and a reader which opens it. Anna’s Archive is a better service because it doesn’t matter what reader you’ve got and the content is there. It was the same with Netflix when it was the only streaming service: it had everything easily accessible.
Gabe figured it out eons ago, Steam is the proof.
Once again, I repeat, discovering something completely unexpected makes this discovery moment "special". Personally, I materialize that discovery by making it real in my real life. So I buy a physical copy. That is also a way to build a me-compliant environment and not let the algorithms decide what I am surrounded with. [let's be frank, algorithms suck at finding who you are and what you will like!]
I bought a book or two after downloading but they had forewords in new editions or I had wanted to search something in the digital edition quickly as a one off and peruse the physical copy at leisure later.
Your other points aside...
> I'm not against consumer boycotts, but it should generally come with a sacrifice on both sides--for consumers, that means missing out on the product or service.
I'm curious as to why you feel this way, genuinely. The decision to boycott means that there is no sale, full stop, so no money is being handed over. Why does anything after that matter? The important part, the money, is already decided from the start.
Because otherwise there's no incentive not to boycott. One of the nice things about capitalism is that even unpopular people can make money if they make a product people want to buy. It adds a level of realness to society, above status-games and popularity-contests.
That makes the very silly assumption that the default is to boycott everything, which is really not the case. People at large definitely still default to purchasing things first, for all sorts of reasons from just feeling that it is moral to the service being convenient to just enjoying and wanting to support the work itself. This is self evident in the fact that boycotts essentially never actually kill anything because the majority still favors paying.
The default is to not buy something. People don't like losing money. If you can get something without losing money, it's super easy to rationalize why you're skipping the lose-money part. People tend to make decisions which are in their financial interest.
I've seen lots of people on this site that pay for YouTube. I've met real people that have subscriptions to porn sites. They fork out money for stuff that's pretty much always already freely available, for basically no reason except maybe convenience or slightly better service. People spend money all the time, for stuff they want and care about. If they didn't want or care about it, they wouldn't buy it or pirate it.
> The default is to not buy something.
But if this is already true by default, then we're back to square one where the important financial decision was already made. Again, if it was already decided by default that there is no sale to be made, then whatever the end user does after that is irrelevant.
But beside that, in my last response I gave you three very common reasons that people do buy things against their own financial interests, and you've ignored that part. How do you fit that into your argument?
Homo economicus is a poor model of human behaviour. Per https://en.wikipedia.org/wiki/Homo_economicus#Sociologists, both neurobiological and anthropological research suggest that unsolicited gift-giving is a natural human behaviour.
It's nothing to do with morals or conscience; pure self-interest incites me to take action and buy physical copies or official ebooks or collector's editions or CDs or lossless digital releases of works I first consume pirated. I want creators I like to make more stuff. I feel good looking at my bookshelf filled with things I enjoy. I don't like throwing out or donating tons of books every year because they're no good and I couldn't tell until I bought and read them.
In several countries customers are forced to pay a special tax on blank media (storage), with the intention that the proceeds be redistributed among the copyright owners.
Some of these countries follow the Roman law principle, i.e. whatever is not explicitly forbidden by law is simply not forbidden (as opposed to common law).
In some countries downloading the published media (eg a film after the official release) is permitted.
And those who download, paid for it in the form of tax.
Directive 2001/29/EC for the EU only (Article 5).
Other countries rely on provisions of WCT, 1996 (Art 10) and WPPT, 1996 (Art 16)
https://en.m.wikipedia.org/wiki/Private_copying_levy has several countries listed, with examples/extent of these laws
I hope you support downloading books/films/TV shows/music by the customers who paid for this privilege.
TBH don't think those laws are conscionable because the money collected through those taxes is mainly paid to entrenched copyright cartels instead of being distributed to creators in a fair way.
You are probably right, I am not representative of the vast majority of people who consume products, whereas I collect [what I consider to be, for me] GREAT stuff.
But one of the points I also wanted to highlight is that I knew nothing about that stuff and would have had no opportunity to taste it and be convinced that it is GREAT stuff [for me].
And to come back to your comment regarding creators. What I hate are creators [for example writers who are interviewed on the radio] who sell their book with a marvelous speech, but the content turns out to be very so-so. As a consumer I feel robbed.
Books seem somewhat unique to me in that the physical product is better or at least different from the digital one, so it kind of makes sense to buy it even if you already have a digital copy. This is unlike e.g. streaming services where the paid service is strictly worse than the pirated one (e.g. no offline, doesn't work at all with some monitors/setups, only low bitrates allowed).
"Better" is of course subjective. Digital is better to me: I can read the digital version on my laptop, phone, or e-reader. I prefer the e-reader, but don't like to carry it everywhere; at the very least I can always read on my phone if that's all I have on me.
I'm someone who used to be a voracious reader. In my childhood alone I would devour paperbacks and hardcovers like nobody's business. My summers were spent destroying the full summer reading list distributed by my school in weeks, and then going to the library to find more things to read. I have had thousands and thousands of physical books in my hands during my life. But I still prefer digital.
I only purchase digital books that either have no DRM, or strippable DRM.
You feel. You think. Google up the studies of piracy and you’ll see that the biggest pirates are also the biggest buyers. Replace your private opinion with some science.
The reframing that will help you understand this is that these people are fans (I stole this framing from Cory Doctorow, who releases his books online for free and encourages his fans to buy a copy if they like it). Fandom is a positive sum game. The more you do it and the deeper you go with it, the more you’re happy to pay the people who create the content you love.
The easier it is for you to find new content the easier it is for you to become a fan of a new thing.
For example: I want to buy a copy of Prince Pückler’s hints on landscape architecture. I can’t find a physical copy anywhere and I’m not sure if it’s worth $120 for a reprint or $500 for an older version. I could pirate it (I use that word loosely since this work is obviously in the public domain) and check it out, but I haven’t bothered so I haven’t bought a copy. This is a case of me NOT pirating and therefore NOT engaging with new content.
It is not science. Don't fool yourself that you have science on your side when it is just some shitty survey.
> But I flat out don't believe the "piracy doesn't lead to lost sales" shtick, of course it does.
I'm not as certain as you are. Correlation does not imply causation, but media sales have trended upwards in the age of piracy which leads to some interesting hypotheses.
A few years ago Shirley Manson (lead singer of the 90s band Garbage) accused YouTube of making its fortune off the backs of content creators - basically charging the entire enterprise as being one big exercise in copyright infringement. And yet the music industry, as well as Hollywood, seem to be doing better and better each year in terms of dollars made. Some of the distribution models have changed - broadcast and cable television are pretty dead in the water, but the entertainment industries in general seem to be doing better than ever. And yeah lots of individual artists are still getting raw deals from Spotify and labels etc. as they always have. But industry-wise, in terms of dollar amounts, it seems there's more money to be made than ever before from creating and selling entertainment.
The statement you made that I absolutely agree with is that it's hard to get real world data on this. An individual who is able to get free access to something may be unlikely to ever pay for that same thing. But the question "Does piracy hurt the industry's bottom line, or help it on the whole?" is a very difficult one to answer. And we have to consider the even harder stuff to measure. Things like: is a teenager who pirates recorded media more or less likely to buy merch and concert tickets? More or less likely to buy a special edition package with tangible collector items?
At the end of the day, I have no clue.
I also offer all of this being very pro-capitalism and pro-intellectual-property. I don't condone piracy. But if we're just looking at raw data and trying to form our hypothesis, we have to start with the fact that the raw data points to upwards trends on the whole.
> but media sales have trended upwards in the age of piracy which leads to some interesting hypotheses.
But they were also on an upward trend before the age of piracy, so it's perfectly plausible to think they would be even higher. The same technologies that enable digital piracy also lower the cost of legal distribution, so you'd expect to see the industry doing better at the same time that piracy is rising.
Now, I'm of course not shedding too many tears for the major Hollywood studios, but I would like to live in a world with more niche films and games, and of course it's still quite difficult to make a living as an author or musician—a few manage it, most don't.
We agree that we don't have data—but to me, it just makes intuitive sense that a large majority of pirates are pirating lots of things they would have otherwise bought. For piracy to counteract that by generating buzz or aiding discovery or whatever it is... well, it would have to be an awful lot of buzz!
Occasionally in life, intuitions are dead wrong, and actual data leads to surprising discoveries. However, when faced with a lack of data, the first assumption shouldn't be "reality is the opposite of whatever I'd intuitively expect," that makes no sense.
I think there's a ton of motivated reasoning going on, and it just really bothers me. If you're going to pirate stuff, at least be honest with yourself about it.
> I find it extremely hard to believe that most consumers have a strong enough moral compass to go out of their way to buy something they already have access to
I like the idea that consumers only buy stuff out of moral obligation.
Like if you went to your ethical friend’s house and saw that he had empty book cases and no art on his walls because he hasn’t yet been imbued with the requisite moral fervor necessary to buy anything. It’s hard for him to be sure what he’s obligated to buy or that he’s obligated to buy anything since it would be wrong of him to know what’s inside any book without buying it first.
And then you went to your no-good, dirty, downright despicable friend’s house and it’s full of books and art because for every 20 books he pirates he buys one, and because he’s just so darn unethical he pirates a lot of books
Can you recommend some of the obscure stuff?
Ok, it's not only obscure stuff. More like blasts from the past that really deserve better exposure. In terms of non-Marvel/DC comics: things from Bernie Wrightson, P. Craig Russell, Georges Bess, Alberto Breccia, Moebius, Druillet, Schuiten/Peeters, and others. In terms of letters, once again the almighty Lovecraft letters are really jaw-dropping! For movies, I discovered Vincent Price, Sam Peckinpah, John Ford, Wim Wenders.
So nothing really out of the "normality", but they are no longer marketed and are slowly fading to grey.
Bernie Wrightson's work is beyond awesome, fills me with nostalgia for a time I never lived through.
The Roots of the Swamp Thing collection is really fun and serves as a fantastic hors d'oeuvre for reading the famed Alan Moore run.
Shadow libraries maintainers deserve a Nobel prize for their contributions to humanity. Satoshi would be proud.
Satoshi's pride:
* ability to fund shadow libraries without fear of censorship
* lists with a single item still count as lists
To be fair, the theory with the whole coin thing is solid, and I'd say it should count as something to be proud of even if in reality it gets tainted by speculative investments.
Yeah. I personally think the original bitcoin whitepaper is a work of genius. Balancing the soft game theory incentives with hard cryptography guarantees is really cool.
I'd love to see more systems exploring this combination approach. There is a saying about not being able to solve a social problem with technology. Bitcoin is the blueprint on how to do that.
It's everything that came after that point that is the problem.
You want to stop things happening after other things?
> ability to fund shadow libraries without fear of censorship
Bitcoin is much worse than cash in that regard
sure except for all the reasons cash doesn’t work for this
who do you hand the cash to in order to fund a website?
Sending cash via snail mail to buy stuff online exists. While Anna's archive does not support it, it certainly exists.
Ok, but now you're not comparing bitcoin to cash but rather to cash+mail, which has many more tracking opportunities.
What about Monero?
That's why most shadow libraries are funded with cash.
aaronsw would be proud, too.
Perhaps he could spare a few coins, chump change to him to help out.
Might need more than a few as the price would tank if his wallets came out of dormancy.
Superman will freak out the world if he kills, so he doesn't. Does Satoshi want to avoid freaking out the BC holders?
The name Satoshi Nakamoto literally translates to "central intelligence."
Not according to this thread:
https://news.ycombinator.com/item?id=41449007
Only when you completely disregard Japanese syntax and the fact that East Asian names tend to be made of Chinese characters with good meanings.
They made LLMs possible, for good or bad.
Also, they provide a torrents list that anyone can seed and be part of the long-term preservation.
https://annas-archive.org/torrents
I'm surprised i2p torrents are still not popular enough to be offered as an option by sites like this.
I'd assume there are many people who don't help out purely because of legal fears, something i2p could help with.
I2P's major drawback when torrenting is speed. Assuming a speed of 500 kbps, it would take 2,000 days to download a 10 TB torrent.
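For anyone who wants to check that estimate, here is a minimal sketch of the arithmetic. It assumes "500 kbps" means kilobits per second and that the 10 TB is decimal terabytes; `download_days` is a made-up helper name, not part of any I2P tooling.

```python
def download_days(size_tb: float, rate_kbps: float) -> float:
    """Days needed to move size_tb terabytes at rate_kbps kilobits per second."""
    bits = size_tb * 1e12 * 8             # decimal terabytes -> bits
    seconds = bits / (rate_kbps * 1e3)    # kilobits/s -> bits/s
    return seconds / 86_400               # seconds -> days


print(round(download_days(10, 500)))  # 1852
```

That comes out to about 1,850 days, so the rough 2,000-day figure holds up. If the rate were 500 kilobytes per second instead, the same torrent would take roughly an eighth of that.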
What is the status on I2P these days? I used to run a lot of stuff on it. It was a lot of fun. It was like this cozy alternative development of internet, where things still felt like 1997.
The numbers are interesting and a bit surprising to me.
I remember a time when people would have seedboxes for private trackers, data hoarders brag about having TBs of storage and yet only a handful of people are seeding the complete collection(s). I understand not everyone has or can seed multiple TBs of data but I was expecting there to be a lot of seeders for torrents with few hundreds of GBs.
Interesting to see that sci-hub is about 90TB and libgen-non-fiction is 77.5TB. To me, these are the two archives that really need protecting because this is the bulk of scientific knowledge - papers and textbooks.
I keep about 16TB of personal storage space in a home server (spread over 4 spinning disks). The idea of expanding to ~200 TB however seems... intimidating. You're looking at roughly twelve 16TB disks (not counting any for redundancy). Going the refurbished enterprise SATA drive route, that is still going to run you about $180/drive = $2200 in drives.
I'm not quite there as far as disposable income to throw, but, I know many people out there who are; doubling that cost for redundancy and throw in a bit for the server hardware - $5k, to keep a current cache of all our written scientific knowledge - seems reasonable.
The interesting thing is these storage sizes aren't really growing. Scihub stopped updating the papers in 2022? And honestly, with the advent of slop publications since then, what is in that 170TB is likely to remain the most important portion of the collection for a long time.
"Scihub stopped updating the papers in 2022"
True but it matters a lot less in many fields because things have been moving to arXiv and other open access options, anyway. The main time I need sci-hub is for older articles. And that's a huge advantage of sci-hub--they have things like old foreign journal articles even the best academic libraries don't have.
As for mirroring it all, $2200 is beyond my budget too, but it would be nothing for a lot of academic departments, if the line item could be "characterized" the right way. To me it has been a bit of a nuisance working with libgen down the last couple months, like the post mentioned, and I would have loved for a local copy. I don't see it happening, but if libgen/sci-hub/annas archive goes the way of napster/scour, many academics would be in a serious fix.
It's 167.5, not ~200, and you can get disks much larger than 16 TB these days - a quick check shows 30 TB being sold in normal consumer stores although ~20 TB disks may still be more affordable per byte.
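A quick sketch of the drive math, using only numbers from this thread (167.5 TB total, 16 TB drives, about $180 per refurbished drive). The function names and the `redundancy_factor` knob are made up for illustration, and real prices will vary.

```python
import math


def drives_needed(total_tb: float, drive_tb: float) -> int:
    """Whole drives required to hold total_tb, with no redundancy."""
    return math.ceil(total_tb / drive_tb)


def drive_cost(total_tb: float, drive_tb: float, price_per_drive: float,
               redundancy_factor: float = 1.0) -> tuple:
    """Return (drive count, total price); redundancy_factor=2.0 doubles the drives."""
    count = math.ceil(drives_needed(total_tb, drive_tb) * redundancy_factor)
    return count, count * price_per_drive


print(drive_cost(167.5, 16, 180))       # (11, 1980)
print(drive_cost(167.5, 16, 180, 2.0))  # (22, 3960)
```

So at the corrected size it is 11 drives for about $2,000, or 22 drives for about $4,000 if everything is doubled for redundancy, before counting the server itself.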
A lot of these are (relatively large) pdfs, right?
I wonder how much space it is as highly compressed, deduplicated, plain text files.
Does the sum of human scientific knowledge fit on a large hard drive?
In text form only (no charts, plots, etc)- yes, pretty much all published 'science' (by that I mean something that appeared in a mass publication - paper, book, etc, not simply notes in people's notebooks) in the last 400 years likely fits into 20TB or so if converted completely to ASCII text and everything else is left out. Text is tiny.
The problem is it's not all text, you need the images, the plots, etc, and smartly, interstitially compressing the old stuff is still a very difficult problem even in this age of AI.
I have an archive of about 8TB of mechanical and aerospace papers dating back to the 1930s, and the biggest of them are usually scanned in documents, especially stuff from the 1960s and 70s, that have lots of charts and tables that take up a considerable amount of space, even in black and white only, due to how badly old scans compress (noise on paper prints, scanned in, just doesn't compress). Also many of those journals have the text compressed well, but they have a single, color, HUGE cover image as the first page of the PDF, that turns the PDF from 2MB into 20MB. Things like that could, maybe, be omitted to save space...
But as time goes on I start to become more against space-saving via truncation of those kinds of scanned documents. My reasoning is that storage is getting cheaper and cheaper, and at some point the cost to store and retrieve those 80-90MB PDFs that are essentially total page by page image scans is going to be completely negligible. And I think you lose something by taking those papers and taking the covers out, or OCR'ing the typed pages and re-typesetting them to unicode (de-rasterize the scan), even when done perfectly (and when not done perfectly, you get horrible mistakes in things like equations, especially). I think we need to preserve everything to a quality level that is nearly as high as can be.
> In text form only (no charts, plots, etc)- yes, pretty much all published 'science' (by that I mean something that appeared in a mass publication - paper, book, etc, not simply notes in people's notebooks) in the last 400 years likely fits into 20TB or so if converted completely to ASCII text and everything else is left out. Text is tiny.
20 TB uncompressed text is roughly 6TB compressed.
I just find it crazy that for about $100 I can buy an external hard drive that would fit in my pocket that can in theory carry around the bulk of humanity's collected knowledge.
What a time to be alive. Imagine telling someone this 100 years ago. Hell, imagine telling someone this 20 years ago.
I was reading a book series from my local library and for reasons I don’t understand they were missing the third or fourth book in the series. Probably damaged or lost. I even thought I could check the local (especially used) bookstores, buy a copy and then gift it to the library, but there’s a new edition that has a completely different vibe and size, with 2024 prices so I thought better of it. So I’d heard of Anna’s Archive and I got it there. Then it turned out one of the last books was unavailable too, can’t recall if it was missing or someone else had it out and wasn’t going to return it any time soon.
I was just trying to finish this writer’s corpus on a reread of their later material. It’s not that I’m cheap. I own a paper and audiobook copy of several of my favorite books. Including this author, so I’ve paid her twice. I just avoided the trap some of my friends long ago were falling into of hoarding books, by only keeping books I intend to read again. So any completionist tendencies have always been resolved via library or electronic editions.
I’m getting older now, and my first real confrontation with my own mortality came up with books. I have several years worth of books even if I were retired and reading three or four a week. New things come out all the time, and new voices. I haven’t read some of these books in ten years or more. Am I really going to read them again before… So a couple years ago I reread Dune for what will likely be the last time and sold my ratty old yellow copies to a used bookstore. If I do it again it will likely be audiobook.
"Anna’s Archive itself has organized some of the largest scrapes: we acquired tens of millions of files from IA Controlled Digital Lending"
Not really helping in the big picture, here, guys.
People have likely already been mirroring it quietly for years.
IRL, "scanparties" used to be a thing if you were in the "bookz scene" around the turn of the century. (Where you and a small group of others go to a public library, hit the limits of your library cards and often clear out entire sections of shelves focused around a particular topic, meet someplace to scan/"cam" everything you borrowed as quickly as you can for processing and uploading in the near future, then return them all within a few days, and repeat this until you get bored or have other things to do.)
yeah, that's a really unfortunate shoutout that's going to be brought up in court.
Why? They acquired books, that’s what they do
The OP is referring to the ongoing legal struggles the IA is facing with respect to their version of an online library (with digital book lending).
Precisely. To be clear, I don't agree with a comment upthread saying the "shoutout" is what might potentially do harm to the IA in court. I think the actual act of having scraped all those books from the IA's lending system could potentially do harm to the IA in court. The publishers can now point to all the copies of the books in the wild that IA had in their lending system and argue that IA's system is not legally acceptable. It was on shaky enough ground already.
I believe this was already brought up in the court proceedings, and Brewster Kahle already addressed it in April 2024: «Trying to blow protections we have put on files, for instance, does not help us– and usually hurts».
https://old.reddit.com/r/DataHoarder/comments/1bswhdj/commen...
IA lending books with "weak" DRM also hurts efforts in reducing DRM and reforming copyright though and that is much more important in the long term. It was always a deal with the devil that IA should have never made and them now being at odds with others that preserve those books and actually make them available only makes that more clear.
It's like a food kitchen under a tyrannical regime complaining that people passing their food to rebels might get them shut down.
The shout-out is evidence of the act.
Oh, ok. Thanks, I agree
Super selfish of Anna's Archive to mention this. "Look what we did!" with zero thought to the consequences for others.
> the consequences for others.
The only people facing consequences are the license-holders. Online lending libraries aren't missing a copy now that AA archived it, and there's not really a substantial cost to the hosters in network bandwidth.
Am I missing something here? As a user I don't empathize with anyone but the archivers.
IA can be painted in court as an “unwilling enabler” of something like Anna’s Archive, instead of a regular library
If I go to the public library, check out a movie on disc, back it up, and share the backup file online, is my public library legally liable?
Probably depends a bit on whether your local library is facing major lawsuits for operating on a very sketchy side of the legal gray area.
Maybe your library shouldn't have made choices that put it at odds with the data preservation community then.
fuck those guys, annas archive is one of the last good things about the internet.
I am curious how they’re funded. How they are able to stay online. Surely there must be people, governments etc with deep pockets that would want to take them down?
Allegedly, some companies with deep pockets have paid them for access to their collection. The collection turns out to be useful for training LLMs.
Can confirm this is happening. But the money paid is tiny. Think thousands of dollars, not millions. Not enough to keep the lights on. I would assume they do pretty well from donations.
Never assume anyone does well from donations. That's rarely the case.
Source on this claim? All their torrents are released publicly. Why would "companies with deep pockets" need to pay them?
For fast access
Because they have an interest in the ongoing work of archiving of new things I guess
Paying someone, even a pittance, gives you deniability and a chain of ownership.
Using a torrent of the exact same thing does not.
You can donate to get access to faster download mirrors. I'd guess this is the main source of their revenue.
https://annas-archive.org/donate
Can you donate to them without someone claiming you're donating money to a criminal enterprise and getting you in trouble? I mean, without using bitcoins
You can buy Amazon gift cards using Bitcoin Lightning to add another layer (actually 2) of paranoia :)
I suppose it could also be their enterprise users, though there’s not a lot of info on this aspect of their activity.
[flagged]
If #1 is a reference to a famous quote from Stewart Brand, founder of the Whole Earth Catalog, it's only part of the quote. The rest is relevant:
> On the one hand you have—the point you’re making Woz—is that information sort of wants to be expensive because it is so valuable—the right information in the right place just changes your life. On the other hand, information almost wants to be free because the costs of getting it out is getting lower and lower all of the time. So you have these two things fighting against each other
He stated later more succinctly:
> Information Wants To Be Free. Information also wants to be expensive. ...That tension will not go away
It's not a quote, but a statement. And even if it were a quote, random other quotes from the same person are not relevant. "This is just a part of the quote" people are so annoying. Like guess why it is "only a part of a quote"? Because some parts are neat, insightful and true, and some other parts are irrelevant and garbage.
Sorry, this was a more general rant, because it is so annoying every single time.
In this case: Who the hell cares about that random guy's random views? How is it relevant in this conversation?
For me, it was useful to clarify that "information wants to be free" was "information wants to be gratis", not "information wants to be libre". I didn't realize it referred to cost.
You can read more about it at https://en.wikipedia.org/wiki/Information_wants_to_be_free
If you are similarly annoyed with the random "it's only part of the quote" spam, you can use my text above and link it as https://news.ycombinator.com/item?id=44944055
>Who the hell cares about that random guy's random views?
Not you.
>How is it relevant in this conversation?
If you cared it'd be obvious to you.
"Not caring" was cool last century.
That's not a real tension. There is no case where the inherent value of some commodity keeps its price high despite easy availability. That's the point of the "diamonds in the desert" thought experiment.
Inherent value provides a ceiling on the price of whatever it is.
Availability also provides a ceiling on the price.
If I give you two theorems that say C < 300 and also C < 10, why would you describe those as being "in tension" with each other?
The tension arises because in some cases, at least for a while, the availability can be suppressed. Like when some expert releases an expensive ebook or video course "Secrets of X". Ofc many such books are scams, but assume for sake of argument the information is actually valuable. The initial buyers are motivated not to share it. It remains a scarce commodity for a while. But all it takes is one person to make a torrent, and the game is over. So there are two incentives -- one trying to keep it scarce, and the other trying to make it free.
Copyright was created because we realized that it takes effort to put works together (in the original case it was educational information) but that distribution can be done without rewarding that initial effort. Which then results in the initial effort not happening. Which then ends up in a dumber, less intelligent, idea poor world without those works.
Society agreed to copyright because of the social benefit of having people willing to put effort/expense into creating works. We're not talking zero value internet BS, but real works. People who create the works don't make them scarce, their distribution is infinitely scalable. They just make it so that they are compensated.
Most information is not easily available, it is purposefully hidden because knowledge is power and money. And that's through all fields and not only Coca-Cola recipes.
The argument is that authors will stop making information publicly available because piracy takes away the value. So instead information will be hidden in vaults and do good only for a few people. Like how maps used to be top state secrets.
The obvious fix for this is to either eliminate trade secret protections in favor of patents, or make them conditioned upon escrow with the government to be released to the public domain after some time (perhaps half the time of a patent).
Don't want to release your recipe ever? Tough cookies when your lead scientists bring it to a competitor.
Trade secrets are counter to the purpose of "IP" law. The public has no interest in protecting them and every interest in... not doing that.
Until every new born child is forcefully implanted with a microchip in their brain at birth, you will never be able to stop people from thinking and having secrets.
If people are not fairly compensated for sharing their secrets and discoveries with the public, they won't do it. They'll take it to the grave if need be. And we lose out on information which can benefit an enormous amount of people.
So the quoted person is absolutely right that there is a great tension between these two factors. How should great ideas be greatly compensated while giving the widest access possible? Neither piracy nor expensive access to information is the right solution.
Trade secrets never expire and sharing them is a crime, so currently people can take them to their grave and the government will have their backs in doing so. A single person's secret is also unlikely to matter much next to the potential of global corporations' secrets, and the nature of corporations is that they are made of people who have little reason not to take an offer with a competitor after they've learned the necessary secrets to do their job. Hence, don't protect those corporations unless they offer something in return (explicitly divulging them/contributing to the common knowledgebase). Without that protection, knowledge can more naturally spread.
The fair compensation they should be offered is time limited protection. Otherwise it should simply be legal for any of their employees to spread that knowledge. Giving unlimited protection to not divulge knowledge is counter to the entire point of "IP" law.
"The" Coca-Cola formula would have lost its patent restrictions a century ago. It's still unshared. Why exactly should we continue to grant any legal protection from an employee sharing it?
What are social security numbers if not just another bit of information that wants to be free?
Or perhaps you are saying that people that have an interest in the availability of particular information should have some control on that information's freedom...
The idea that any widely transmitted identifiers' confidentiality should be its primary method of security is asinine.
The failure of any exploit regarding SSNs or the like is not on the offending party, but on each using party's failure to implement even a modicum of actual security.
FYI calling something "asinine" is not an argument.
A widely transmitted identifier that tons of organizations need to ask you for, for tax purposes, is not secret. It's used to precisely identify who you claim to be. It's your username. There's not much to say about also treating it as your password except that it's asinine. It's like treating your first name/last name as a secret password.
> They are literally burning down giant commercial buildings in Europe.
Who is, and which buildings?
https://investigations.news-exchange.ebu.ch/playing-with-fir...
People can do good things and bad things simultaneously. Unless me supporting the good things directly enables also the bad things, I don't see a reason to throw out the good thing.
was the alternative for the pirate bay people jailtime?
Can you expand more on any links between Russia and Anna’s? I’m not joining the downvote brigade on this one without asking.
He said he personally suspects, I don't think that was more than a throwaway comment. Besides, if my enemy is dismantling an institute in my society that I want dismantled, I'm not going to complain.
> Information should be free
I'm sick and tired of this misquote; as it was merely an observation of trends, and was never meant to be a moral maxim or mandate. If you truly believe information needs to be free as a moral mandate, share your company's source code first.
I see it as “everyone deserves respect”. No need to overanalyse it. It’s one of those few things in life that are simply true, no proof needed.
I see it as "Carthage must be destroyed". No need to nitpick it. We must destroy Carthage.
> the last good things
Last but not least?
Meta illegally scraped 80TB of data from Anna's archive, Libgen, Zlib etc. I'm sure other tech giants did too. Without paying them a cent, costing these projects $$$ in bandwidth/hosting etc.
when I hear people complain about these projects it just sounds like hypocrisy.
Kudos to the team behind this project! It looks like they have improved the UI in the last year. The crucial problem right now is to remain accessible, or simply to survive. I have no idea how much effort is being put into it. I wonder whether it is possible to remain afloat despite all the efforts to take them down?
There was a pretty major UI update in the past 2-5 days-ish.
Apologies for the minor grumble, but on mobile I used to be able to browse search results much more effectively; the new design only fits ~4-5 results on a screen.
BTW, this is very useful:
https://open-slum.org/
This site is down or inaccessible to me. What is in it and why is it useful?
That site has a list of shadow libraries, whether they are still operating, and where to find them.
It seems to be an instance of Uptime Kuma, which is a pretty great OSS for uptime monitoring and Dashboarding.
https://github.com/louislam/uptime-kuma
If there's a book that only has e-book versions on amazon, what is the best way to ensure the author gets money? I'd rather not fill my little apartment with paperbacks, and ordering a paperback and then returning it sounds kind of wasteful. Although I guess I could buy a paperback downtown and drop it off at the used book shop .. What do other people do, when they want to pay authors and read e-books without aiding and abetting Bezos?
maybe $author has contact information or even a donate button up somewhere
Isn't it humorous how citizens are pro Anna's archive, but governments are against it? Bit of additional evidence for elitism and such.
It is neither humorous nor strange because that formulation omits authors.
How many authors who write the books in Anna's archive are happy about it?
I personally am pro Anna's archive (and sci-hub, etc) because I believe it benefits society to have better-read citizens. That said, I have some misgivings, because under our current system, there are issues with law and remuneration.
IMO, Scihub and the ebook parts of AA should be considered differently and not conflated.
In particular, Scihub is in opposition to the parasitic international publishers who dominate and control scientific publishing for profit, mostly on the backs of science generated by academia and other not-in-it-for-the-profit folks.
In contrast, downloading ebooks may, in some cases, lead to individual authors being hit in the pocket, in a profession it’s already hard to make a living from.
(I wish we’d figured out a better way to organise book publishing without publishing companies getting in the way and taking their large slice, allowing authors to profit more directly.)
That's an excellent point. The problem cases with AA are edge cases on sci-hub.
The law only benefits the most popular authors, otherwise it protects publishers primarily.
I made the assumption everything relates to scientific papers that have been made public or were taxpayer funded.
What about writers?
IIRC it was shown that piracy increases sales for books.
For example, if you pirated an ebook and liked it, you'd likely buy a physical copy.
Even if that might be the case now, I doubt that holds if piracy becomes truly widespread.
I would suspect A pirates book B and tells C about it, C buys book B is a lot more common than A pirates book B and likes it enough to buy it
I have no data to support this, and while I have paid for things I could access for free, I'm sufficiently pessimistic about human nature to doubt that's the norm.
Piracy has been "truly widespread" for decades now.
Most people who are able to, still pay for things, especially if they're convenient. Even when those services actually add additional restrictions to their access to the media they think they're paying for.
That's absurd. I could potentially believe the conclusion that piracy doesn't take away from sales (that is, most people who pirate would otherwise do without, and not buy a copy). But the idea that many/most (or even some significantly-small percentage) of people who pirate will buy copies of the things they like? No, that doesn't pass the sniff test.
I do. When I was poor – I couldn't do it. Now that I'm wealthy and can afford any book, I prefer to take a quick look at online version and then buy a physical copy.
I actually have bought many books that I started reading online. The book format is useful.
Kids who don’t have pocket money won’t, but they aren’t lost sales anyway.
https://news.ycombinator.com/item?id=15305476
EU paid for report that concluded piracy isn’t harmful, tried to hide findings (thenextweb.com)
280 points by tchalla on Sept 21, 2017 | 59 comments
If you and I would support the works we think are good, why wouldn't others? I keep noticing that people constantly expect worse morals from others than how they claim they are themselves
It's easy to add a "me too" onto the existing list but that's not my point. I think we generally can expect better from the average person than we instinctively do. If 50% of people are just as honest as we are (if we're average persons which, on average, we are), that would be easily worth it if free distribution of a book gets you a 3x bigger reach as compared to when people have to pay up front. I'm not aware of research confirming or refuting this (of course I'd like to believe that information can be free), but it doesn't seem so outlandish to me that we can ignore the option altogether by doing a sniff test
This is true for me! For authors I like, I might read a few epubs, then buy their entire series in hardcover (or paperback if no hardcover is available) to have in my bookshelves for rainy days.
Depends. I've seen some in favor and some against. Academics who have their papers paywalled by publishing entities against their own wills are generally for it.
Academics get their income from their university positions, and don't get any royalties from sales of their articles. Instead, the benefit they get from publishing is to their reputation, and for that it's better for their work to be as available as possible.
It's completely different for a writer who gets their income from sales of their work, obviously
Yep. And not that you asked, but my own opinion (not theirs) is that even writers who get income from sales will be fine either way. Reading a book for free and then buying it to support the author if you want to has been a practice for longer than the internet has existed. It's exactly how libraries have always worked!
My comment made the assumption that everything in Anna's archive is the result of taxpayer-funded or public research.
Their volunteering system seems pretty well organized. Also might explain why I've seen so many comments over the years sharing about anna's archive.
https://annas-archive.org/volunteering
“This website is blocked
European sanctions
The Council of Europe has decided that the websites of RT (formerly Russia Today) and Sputnik News may no longer be transmitted. The website you are trying to visit falls under this European sanction.
VodafoneZiggo is obligated to enforce the sanction and has blocked the website.”
Does Anna's Archive or a similar site host, say, the complete New York Times (pre-1930) as a full PDF download set? And every other newspaper too?
Tons of public domain sources are locked into websites like Newspapers.com or the nearly-dead and now completely unsearchable old Google News / Newspaper.
It would be nice if the massive pursuit of AI training data resulted in some fully-legal open source alternatives to these proprietary, outdated, or abandoned sites. I know some of it is available via the Internet Archive, etc., but something new with an AI-powered search and finding aid sounds so useful.
> complete New York Times (pre-1930)
https://archive.org/search?query=title%3ANew+York+Times&sort...
> as a full PDF download set
I imagine it's possible to achieve this through torrents from Anna's, but you'd have to search and compile the list of all individual PDFs.
> something new with an AI-powered search
With enough time and willingness, someone could put all the old NYT issues through optical character recognition and convert them to text; then make it available to large language models for semantic search of some kind. Ideally public cultural funds could support the effort as academic research.
This is surprising. I thought last I heard they'd arrested the guy who was suspected of running the site, about a year or so ago. Guess I'm misremembering.
Also I'm surprised Cloudflare hasn't shut them down like they do for other dodgy sites.
When accessing from Belgium the link is blocked by Cloudflare:
Error HTTP 451 Unavailable For Legal Reasons
In response to a legal order, Cloudflare has taken steps to limit access to this website through Cloudflare's pass-through security and CDN services within Belgium
Man, I thought cloudflare stood in front of individual sites. When did they start becoming a filter on an individual’s web connections?
CF is in a position such that if they aren't cooperating with national laws, then they are actively hindering them. National governments don't like that, and will have ISPs block CF wholesale if that's what accomplishes their goals.
Eh, they can't block half the Internet.
Apparently in Spain they can: https://www.reddit.com/r/CloudFlare/comments/1j7yx5y/i_cant_...
To operate in Belgium, they have to follow local laws and comply with legal orders. They either make the site unavailable to local IPs or leave that market.
Interesting. Seems to be only certain jurisdictions. I can access it no problem from the UK Vodafone network.
I'm unable to resolve the domain on EE UK - looks like it's DNS blocked.
By comparison, on my work network (TalkTalk) I can resolve the domain but I get a connection reset from the site.
I think this might be the first time I've hit a DNS block. It feels rather eerie seeing people talking about a site that, from my point of view, doesn't even exist...
There's inconsistent censoring of numerous websites across the UK. In short, the biggest ISPs (a list which changes over time) will block various sites (TPB, libgen, AA, and others), based on court orders taken out at different times. In general, it's a good idea to use Private Relay if you're using Apple devices and have access to it, no matter what network you're on, and if you're doing anything you don't want your ISP to capture, you should be using a VPN and/or Tor.
There are a lot of legitimate reasons to want to use scraping sites that UK copyright law is not nuanced enough to protect, and so blanket bans just end up emerging at the demands of copyright owners (which more often than not, means Disney or Springer).
Yes, Ofcom really needs to sort this out properly. I shouldn't be able to access this site from a UK ISP. Makes no sense that it's blocked on some and not others.
It starts with one
Set proton VPN to Albania and enjoy the full internet is my experience.
What's up with Albania?
Idk, I went there a couple of times, I just love the people, the country. It’s a trip back in time. So it was my “random pick” for an exit node. And now I can read rt.com, sail the high seas, open any libgen or Anna's Archive. They're not part of the EU, seem far away from it (no euro, guarded borders, ditched their communist dictator who completely isolated the country ~40 years ago). Perhaps they are less easily coerced into censoring as practiced by countries primarily governed based on GDP and what the big corps want (although everybody seems to smoke everywhere so they could use some of that EU influence).
Hmm. Even the title link above doesn't work for me on Virgin's cable, in the UK
Do you see an error page / blocked page?
I used to get archive.org blocked and had to contact my provider to have the filters taken off.
Nope, it just takes forever, then eventually shows a blank screen...
Yep blocked by Ziggo in NL as well
Whenever I'm in the Netherlands I need to set my DNS to 1.1.1.1 or similar, lots of blocks.
Except that that’s CloudFlare, which is also blocking Anna’s Archive.
Luckily it isn't the only public DNS.
8.8.8.8, 9.9.9.9, and many others exist.
Here it's Cloudflare the CDN, sitting in front of Anna's Archive, that's doing the blocking. The DNS resolver used doesn't come into play.
(Case in point, I am using Google's DNS, yet still encounter the block when accessing from a Belgian IP.)
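The different failure modes reported in this thread (no DNS answer on one ISP, a connection reset on another, HTTP 451 from the CDN in Belgium) can be told apart from the client side. A rough sketch, assuming only the Python standard library; the `classify` and `probe` names are made up for illustration, and an ISP that answers DNS with a bogus address instead of refusing would be misread by this as a network-level problem:

```python
import socket
import urllib.error
import urllib.request
from typing import Optional


def classify(dns_resolved: bool, connection_reset: bool, http_status: Optional[int]) -> str:
    """Map what the client observed to the layer that is most likely blocking."""
    if not dns_resolved:
        return "DNS block (resolver gave no answer)"
    if connection_reset:
        return "network block (connection reset after DNS succeeded)"
    if http_status == 451:
        return "CDN/legal block (HTTP 451 from the edge)"
    if http_status is None:
        return "unknown (timeout or other failure)"
    return f"reachable (HTTP {http_status})"


def probe(host: str, timeout: float = 10.0) -> str:
    """Try DNS first, then HTTPS, and classify the first failure seen."""
    try:
        socket.getaddrinfo(host, 443)
    except socket.gaierror:
        return classify(False, False, None)
    try:
        with urllib.request.urlopen(f"https://{host}/", timeout=timeout) as resp:
            return classify(True, False, resp.status)
    except urllib.error.HTTPError as err:  # 4xx/5xx responses, including 451
        return classify(True, False, err.code)
    except urllib.error.URLError as err:  # connect-time failures are wrapped here
        return classify(True, isinstance(err.reason, ConnectionResetError), None)
    except ConnectionResetError:  # reset while reading the response
        return classify(True, True, None)
    except OSError:  # timeouts and other socket-level errors
        return classify(True, False, None)
```

Changing the resolver only changes the first branch. A 451 served by the CDN shows up no matter which DNS server answered, which is the point made above.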
We should stop using public DNS and start using our own DNS.
I actually didn't know there were more error codes beyond error code 429
There's "431 Request Header Fields Too Large" which you will see occasionally. But after that 451 is the only other 400-level error code above 429. It was chosen as a reference to the book Fahrenheit 451.
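If you want to check that claim yourself, Python ships a table of registered status codes in `http.HTTPStatus`. A quick sketch (the table reflects whatever your Python version knows about, so treat it as a convenience rather than the registry itself):

```python
from http import HTTPStatus

# Every 4xx status code the standard library knows about that sits above 429.
above_429 = sorted(
    (status.value, status.phrase)
    for status in HTTPStatus
    if 429 < status.value < 500
)

for code, phrase in above_429:
    print(code, phrase)
```

On recent Python versions this should list just 431 and 451.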
451 is kind of a novelty code, its meaning being related to Bradbury's "Fahrenheit 451" SciFi novel.
Oh! You'll love this: 418 I'm a teapot
https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...
visit http.cat
The two behind Z-Library were arrested in late 2022.
Thank you, I think I must have got the details of that confused with the OCLC lawsuit.
Is there a tool like the one from Archive Team that one could run that helps with their effort?
https://annas-archive.org/torrents
https://kmr.annas-archive.org/blog/an-update-from-the-team.h...?
"The site can't be reached" ERR_QUIC_PROTOCOL_ERROR
"attacks on our mission" ("our mission" being stealing).
The mission is to give people the tools that enable the mental proficiency that allows one to make clear-minded statements and avoid skull-concussed ones.
Know am going to be downvoted into oblivion, but as a composer, can see it from the side of creators. Yeah, making their products free is starving these industries. For instance, there is already very little money in music (think about how many musicians you personally know who can make a living off of music, besides being a music teacher). And the music industry is still not even the same size as it was in the 90's - global revenue in 2024 was $29 billion, while in 1994, it was $35 billion (and that's not even taking into account inflation).
Yes, there are many other reasons why the music industry fell, but when your main demographic can always go to bittorrent to get their music if prices are too high, then there is only so much you can do with the price of music.
Yeah, I remember the 90's, music was huge, and there were so many good bands (Smashing Pumpkins, Nirvana, REM, White Stripes... Or if you're more into popular music, Michael Jackson, Whitney Houston...). Now, music is de-valued and cheap and our music scene has been decimated. Personally, think we should try to find ways to support musicians, writers, thinkers, artists...
... but if you have a different opinion, no worries. But, if you can, give it thought.
The ideal situation would be building a society that believes everyone deserves to be fed, clothed, and housed regardless of their ability to make profitable things. Weird how politically unpopular that seems to be.
Both producers and consumers of media are in the same boat of barely surviving. Maybe we can work with each other instead of against each other? :)
That is a very nice thought, have been told the Europeans seem to do this to a much higher degree:)
Streaming has replaced piracy, and scammed artists in the process. You can complain to the labels for that.
As for why I download: I am legally forbidden from buying the music that I want. Either it's the selling label geoblocking, or they only sell versions in a shitty format like mp3. I'm not jumping through hoops to give you my money, either I can buy FLAC files, or I download.
I want convenience, the same way users want it. Artists discovered that they were scammed by the labels instead of the pirates.
The devaluation happened through streaming services. Instead of spending dozens of dollars on subscriptions, BitTorrent and last.fm enables me to find what I like and spend the money on Bandcamp instead where it actually reaches the artist I am buying from. I can just get a Spotify subscription instead though, if you insist.
I think a lot has happened since the 90's, and you rightfully point out that there was very little money in music to begin with. Labels generally always took a very large fraction of a physical CD sale, for example, so the model was rather rigged from the beginning (and recorded music doesn't have that long of a history, anyway).
In general, I'd argue that Spotify will be more toxic to the industry (or the artists' livelihood) than piracy. Streaming is even more predatory and centralized than labels in the 90's, but with an important caveat: it's legal. When people engage in piracy there is at least some awareness of, say, the pirate being at fault in the transaction — even though, as someone else already mentioned, people who pirate might contribute, or engage in other ways, with the creators. But with streaming, it got normalized to pay artists a fraction of a cent per stream (and the terms get progressively worse). I've countless times heard the argument "at least they get paid something!"
Bandcamp, for example, seems like a much fairer ideal for the industry. Luckily, the Epic buyout a few years ago did not immediately ruin the business.
As for the music in the 90's...music has changed. Naturally, one could argue that these are also exciting times: one can singlehandedly produce a record, distribute it independently, and be touring all over Europe without ever having to sign off to a major label. Is this not a good thing — or at least, a notable one? Of course, there's still great music around.
Yeah, usually, have also read that the only ones to make money on Spotify are major artists. They take a huge chunk of the money distributed to musicians. At least for me, have never heard of any musician making a living off of their Spotify sales, not even close.
And Bandcamp does seem nice, wish it took off more.
And yes, I do completely agree with you that there are some big positives with today's music landscape. The rise of Digital Audio Workstations (DAWs) to create your own music was a revolution, as is youtube for getting your music to the masses. Seems like a ton of musicians get their break this way these days... So as we talk, am thinking, maybe piracy has become an unimportant aspect of the music industry?? Hmm... Well, one aspect is missing: the seasoned engineers, producers, marketers and managers who can get your music created, promoted and performed, all without the musician needing to learn all this themselves. It really is a lot of work!
There's also the effect that new musicians are competing for attention with an ever growing catalog of top artists. I already have hundreds of CDs, so I'm not particularly inclined to go find whatever the 2025 version of the Smashing Pumpkins is because I already have the old one. Looking at this year's Billboard 200, I don't think I'd be interested in SZA or Lil Baby. Bowie died almost 10 years ago. I guess I'm good with what I've got.
Definitely... and think about your comment, it's probably what we've all heard, that the teens/twenties is the target demographic for the music industry, as they're the ones who go out and buy things. Yeah, I don't buy that much music these days, maybe a few songs and albums per year (and I'm in music!).
Music got commoditized.
In the 90s the good bands got lucky that their distributors picked them up and promoted them etc. You just don't remember the amount of crap that was on at any given point in time.
Today you have instant access to millions of songs around the world in every genre imaginable: https://everynoise.com/ And not just to the whatever few records your local store carried, or what the Big Four paid the radio stations to promote.
I do agree that youtube has made it much easier to self-promote, and that today's model has replaced the old one and is doing decently. Still, at least by the numbers, the music industry is smaller than it used to be. Unfortunately, money is a powerful resource, and it's not like the music industry took everything and completely screwed over the musicians. They helped struggling musicians survive, giving them a chance to make it, while taking care of a lot of the non-music-related tasks that are actually very time consuming - promotion, lining up performances, lining up interviews, learning the successful strategies for giving a band a chance to succeed, networking... It is really another job in itself and is very difficult.
Labels still do this today, but it's just the number of opportunities for musicians is smaller.
Although, again, do agree that youtube (and somewhat spotify from what I've heard) has made a huge difference. I've heard a few times that Youtube is probably one of the best resources for self promoting music, but being good at making videos on youtube is not easy to do well and is also another job in itself.
> I do agree that youtube has made it much easier to self-promote
And Spotify. And Apple Music, to an extent. And even SoundCloud.
> They helped struggling musicians survive, giving them a chance to make it,
Survivorship bias. You're completely ignoring the artists that never got the attention of distributors, or got immediately dropped, or dropped after the first disappointing (by studio standards) sales, or screwed out of revenue and royalties, or...
Or those who never got a chance at all because Sony or Warner paid radio stations to promote who they wanted to promote: https://www.npr.org/2005/11/23/5024411/warner-agrees-to-sett...
> Labels still do this today, but it's just the number of opportunities for musicians is smaller.
Labels still do this to the same extent as before. They spend about as much money and, percentage wise, keep as much money as before. It's even easier for them because a whole layer of physically printing and distributing media (tapes and CDs) is gone.
And the number of opportunities for artists increased, but became more complex.
In 2012 an artist otherwise unknown outside South Korea reached a billion views on Youtube, resulting in worldwide tours. Now there are millions of unknowns on the same platforms. It's never been easier to promote your art, and it's never been more complex because there are so many others.
Always been the case. I have a late boomer early Gen x friend, who will insist that music was better back in the day, and that everyone was listening to Zeppelin and such, and nothing else. You can pull up the billboard charts for any year he waxes about and read off the top n, and rarely if ever find a track from the bands he claimed "everyone listened to."
Survivorship bias is and always has been real. If you don't believe me, think about the last time you heard Tubthumping from Chumbawamba on the radio or in a commercial.
I'm not convinced that every pirate download equals a lost sale. Certainly sometimes it does, but I don't think it's the case that creators lose much revenue due to piracy. I think the big music labels and giant publishers might -- might. But that's not the same as creators losing money. And we're also unable to count how often piracy results in concert ticket sales that may have otherwise not happened.
> but when your main demographic can always go to bittorrent to get their music if prices are too high, then there is only so much you can do with the price of music.
And that's the thing: if the prices are too high, in the absence of piracy, most people are going to just do without. There's no lost sale when someone decides to do without rather than pay a price they think is unreasonable.
I think the shift in the music landscape you see is due to three things: 1) your tastes have changed, and everyone looks at the "good old days" with a fondness and appreciation that is often undeserved, 2) the music industry itself has changed, moving away from the album-sales model, and fully embracing streaming (I believe around 70% of revenue comes from streaming these days), and 3) it is easier and cheaper than ever to create high-quality music; sure you need some level of talent, but many of the financial barriers to recording your own music (like the need for an expensive recording studio) have lessened or evaporated entirely.
> And the music industry is still not even the same size as it was in the 90's - global revenue in 2024 was $29 billion, while in 1994, it was $35 billion
This seemed surprising to me, so I did a little bit of light research. This isn't true. Revenue was steadily rising until around 1999, started dropping during the main time of digital disruption, to a low in 2014. In 2024, revenues were 1.5x what they were in the ~1999 peak.
Now, if you do inflation-adjust those numbers, you get a picture more like what you're saying, with a peak around 1999, a sharp decline, and then only a partial recovery.
But total revenue is only one part of the picture, and we can't judge creator impact solely upon that. And at the end of the day, no one is entitled to revenue. Sell a compelling product at a price people are willing to pay, and you'll make money.
Outside of streaming, I personally don't see many compelling products out there when it comes to music. I bought CDs and cassettes as a kid, but I don't see physical media, or even digital album bundles, as purchases worthy of my time. I have a YouTube Music subscription, and that fulfills the entirety of my at-home or on-the-go music needs. On top of that, I go to concerts and festivals when my favorite music is in town, and I'll sometimes buy some merch (like a festival t-shirt). Beyond that, I just don't see a need to spend money on music. (When I think about it, though, I probably do spend more money on music today than I did when I was buying physical media! Some of that is due to my better financial situation now, to be sure, but not all.)
> Personally, think we should try to find ways to support musicians, writers, thinkers, artists...
I absolutely agree, but I don't think piracy has the big negative effect on creators that you think it does.
Appreciate your view, and am no expert at this, but as you mentioned, the numbers do speak for themselves. Yeah, it isn't just "the good old days," all of us who followed the music industry saw a huge decline in revenue in the 2000's (it was catastrophic and was a punch to the gut). It just kept going down year after year. And as you mentioned, if you adjust for inflation, the size of the industry is still smaller than it used to be...
...Also, it seems like it depends on where you look for yearly revenue. At least this research article is more like what I saw (although, not sure what numbers are correct): https://www.researchgate.net/figure/Global-Recorded-Music-In...
Regardless, yeah, the music industry took a huge hit, and is looking better these days with streaming (which saved it), but it's still not great.
>And that's the thing: if the prices are too high, in the absence of piracy, most people are going to just do without. There's no lost sale when someone decides to do without rather than pay a price they think is unreasonable.
Agreed, if prices are too high, yes, they'll do without. But in the past, on average, it seems like most people did actually purchase CD's and DVD's, me included. Most of us had quite a sizable collection, and would routinely visit music stores to pay $20 to buy a CD, just because they liked one or two songs (and that's in 90's money). Yes, the music industry took a lot of the share of revenue, but that industry still is what promoted and supported the musicians.
https://news.ycombinator.com/item?id=15305476
EU paid for report that concluded piracy isn’t harmful, tried to hide findings (thenextweb.com)
280 points by tchalla on Sept 21, 2017 | 59 comments
I agree with you. There's a huge sense of entitlement from people who pirate, and the most absurd set of excuses. I bet most of them would shoplift if it was consequence free. And then complain that shops were going out of business.
Except the chief argument remains the distinction between goods of difficult replication vs goods of cheap replication.
And except all the rest in that illogic.
Interesting question:)
I'd like them to enable torrents for single files, like the Internet Archive does. Waiting so long to be able to download a file is kind of annoying.
They won’t do that. How else are they supposed to sell these premium subscriptions?
annas-archive.li/blog, 2025-08-17
About recent events.
We are still alive and kicking. In recent weeks we’ve seen increased attacks on our mission. We are taking steps to harden our infrastructure and operational security. The work of securing humanity’s legacy is worth fighting for.
Since we started in 2022, we have liberated tens of millions of books, scientific articles, magazines, newspapers, and more. These are now forever protected from destruction by natural disasters, wars, budget cuts, and other catastrophes, thanks to everyone who helps with torrenting.
Anna’s Archive itself has organized some of the largest scrapes: we acquired tens of millions of files from IA Controlled Digital Lending, HathiTrust, DuXiu, and many more.
We have also scraped and published the largest book metadata collections in history: WorldCat, Google Books, and others. With this we’ll be able to identify which books are still missing from our collections, and prioritize saving the rarest ones.
Much thanks to all of our volunteers for making these projects happen.
We’ve forged some incredible partnerships. We’ve partnered with two LibGen forks, STC/Nexus, Z-Library. We’ve secured tens of millions additional files through these partnerships. And they are helping the mission by mirroring our files.
Unfortunately we have seen the disappearance of one of the LibGen forks. We don’t have further information about what happened there, but are saddened by this development.
There is a new entrant: WeLib. They appear to have mirrored most of our collection, and use a fork of our codebase. We have copied some of their user interface improvements, and are grateful for that push. Sadly, we are not seeing them share any new collections, nor share their codebase improvements. Since they haven’t shown commitment to contributing back to the ecosystem, we advise extreme caution. We recommend not using them.
In the meantime, we have some exciting projects in the works. We have hundreds of terabytes in new collections sitting on our servers, waiting to be processed. If you’re at all interested in helping out, feel free to check out our Volunteering and Donate pages. We run all of this on a minimal budget, so any help is greatly appreciated.
Keep fighting.
Please remain up. Libgen no longer works. I've used IRC for fiction and non-fiction, but tech books need Anna's Archive and Libgen. I buy the physical copy with company budget to pay the author, but I need DRM-free ebooks to read comfortably on my Tab S9 Ultra.
libgen is still there
Not accurate. You are probably looking at a site like https://libgen.ac/ which states clearly at the top: "Not a Part of Library Genesis. ex libgen.io, libgen.org"
The real one has been down for a long time.
All original mirrors currently seem down. But other mirrors are up. Check here: https://open-slum.org/
The pirate bay's been down for a long time too. And yet...
What’s the url?
Given that big tech has been scraping everything ever written to train LLMs, are there specialized prompts to trick models into spitting out copyrighted works?
Foundation LLMs are lossy compressed databases, you might never get the exact work back from it.
Yes, if you believe the New York Times: https://www.techdirt.com/2023/12/28/the-ny-times-lawsuit-aga...
SciDB DOI lookup has given me dud after dud recently with newer publications. Anyone else experiencing the same?
I think they've paused uploading since 2021 or so, due to the pending case in India.
That is Sci-Hub. "Sci-Hub has paused uploading of new papers. SciDB is a continuation of Sci-Hub" from the AA front page.
Anna's Archive is possibly the greatest site ever.
Infinite love to the team <3
Kind of... the fact that they have the actual data behind a "soft" paywall (waiting times and terribly slow transfers otherwise) makes me a bit skeptical of their "goodwill".
No such thing as free when bandwidth costs money. Any online service that hands things out for free without restriction is getting its return through unscrupulous means and shouldn't be trusted. Anna's Archive straddles the line enough to allow people to download books for free, but not at too great an expense to the volunteers who pay out of pocket to support the project.
So what about the authors and creators of the works? They did it for free?
Information and well-crafted sentences are available on the Language Tree, easily plucked by anyone at zero cost. It's greedy for those so-called novelists and subject matter experts to expect a living wage.
"Information wants to be free," which means that any cost of producing that information can be abstracted away due to ideological inconvenience.
Then show me the easily available "information on the language tree" to solve the unsolved problems in science. Btw, books are not mere information; they are also products of effort and sacrifice and intentions. They are also embedded in an economic system of paper, book, ink, transport and whatnot producers.
So you are either poor or too lazy to buy a book from the store. But this doesn't justify mind theft or its distribution.
they already work almost for free, since all the money goes to the publisher and retailer.
out of a $20 book, the authors earn about $1 - $1.5, for e-books it's about $1.7 - $2
The value from book sales goes to the retailer and publisher: two large corporations, and in the case of amazon - a single big corporation
so please cry me a river about amazon's lost profits earned on the backs of the book authors
Governments. You forgot governments. They take the bulk of the money, especially in Europe.
~25% VAT and then the publishers and retailers take their cut. The government takes another 40% in income and payroll taxes from that. The leftovers are what the author gets.
Buying from yourself is probably the biggest markup you can get.
yes, if you add VAT and remove taxes from authors' incomes, it becomes even more laughable.
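To put rough numbers on this subthread, here is a back-of-the-envelope sketch using only the figures quoted above: a $20 book, $1 to $1.50 to the author, ~25% VAT, ~40% income and payroll tax. Two assumptions of mine that the comments don't state: the $20 shelf price includes VAT, and the 40% applies to the author's royalty. `author_take` is just a name I picked.

```python
def author_take(retail_price, royalty, vat_rate=0.25, income_tax_rate=0.40):
    """Split a VAT-inclusive retail price and tax the author's royalty.

    Assumes retail_price already includes VAT and that income_tax_rate
    applies to the royalty; both are illustrative assumptions.
    """
    net_price = retail_price / (1 + vat_rate)  # price before VAT
    vat = retail_price - net_price             # government's cut at the till
    after_tax = royalty * (1 - income_tax_rate)
    return {
        "net_price": round(net_price, 2),
        "vat": round(vat, 2),
        "author_after_tax": round(after_tax, 2),
        "author_share_pct": round(100 * after_tax / retail_price, 1),
    }


low = author_take(20.0, 1.0)
high = author_take(20.0, 1.5)
print(low)
print(high)
```

Under those assumptions the author keeps somewhere around $0.60 to $0.90 of a $20 sale, while VAT alone is $4.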
it really might be better to publish for free and set up a buy me a coffee
This is a problem of publishers and retailers, and not a justification for distribution of mind theft.
Then what's the economic interest for writing a book
Very little. Aside from high-profile/best-selling authors who do make a decent amount of money, the vast majority of writers do it because they love doing it, not because they expect to become rich.
Their backdoor plan to get rich! Not going to fool me this time VCs!!
Everyone involved is taking on significant personal liability and hosting expenses. Not sure what more you expect.
Yes spot on, crazy that asking for an optional pittance for less bandwidth throttling on such a huge and risky project can be seen as exploitative.
you should ask for a refund!
Bandwidth isn’t free of charge
Especially the type of bandwidth where you can host felony contempts of business model without getting a life sentence.
and hosting
I believe you only hit the paywall when you try to use the search engine & download individual files. They still offer the underlying data for free archival/mirroring via torrents.
Keep fighting the good fight. Our cultural heritage must be preserved.
Also how can one totally anonymously pay them?
It doesn't look like they accept anything that strikes me as being remotely anonymous, which is surprising.
https://annas-archive.org/donate
I'll also say that when too much money starts becoming a part of this, trouble will increase dramatically. I realize this sort of endeavor costs a lot of time and money, but it's a line we should probably be aware of.
They accept Monero which would be my first thought.
Does anyone have discreet pointers for downloading all the data? What format is it usually?
https://annas-archive.org/datasets
remember guys, it's not pirating, it's gathering data for AI model training purposes. Perfectly legal.
I know you're joking, but what the AI training lawsuits have said so far is that training and digitizing used books that you bought is fair use, but piracy isn't.
For "you" as a natural person. Companies are legally free to download books for free to train AI.
That would be a valid argument if they weren’t redistributing the data verbatim.
The entire internet needs to be re-designed to stand up against attacks.
- DDOS attacks
- Spamming
- UK like surveillance laws
- LLM scraping
Why is it that there is almost no initiative for this?
The Internet has been redesigned. It's just not been redesigned with your interests in mind and at least some of the "attacks" are features to the right people.
The precursor to Bitcoin was this interesting project called Hashcash. It was built to combat email spam: the sender is forced to spend compute solving a moderately hard hash puzzle and to put the result in a header. The person who receives the email can easily verify that the sender "paid" the cost.
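A toy version of that idea, for anyone who hasn't seen it. This is a simplified sketch and not the real Hashcash format: actual Hashcash uses SHA-1 and a richer stamp with a date and random salt, while this uses SHA-256 and a made-up `resource:counter` stamp. The function names are mine.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, bits: int = 16) -> str:
    """Sender side: brute-force a counter until the hash has enough zero bits."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        digest = hashlib.sha256(stamp.encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int = 16) -> bool:
    """Receiver side: a single hash is enough to check the sender's work."""
    claimed, _, _counter = stamp.rpartition(":")
    if claimed != resource:
        return False  # stamp was minted for someone else
    digest = hashlib.sha256(stamp.encode()).digest()
    return leading_zero_bits(digest) >= bits


stamp = mint("alice@example.com", bits=12)
print(stamp, verify(stamp, "alice@example.com", bits=12))
```

The asymmetry is the whole trick: minting takes about 2^bits hashes on average, verifying takes one. A real scheme also has to reject reused stamps, which this sketch skips.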
There are, but they each have their tradeoffs.
Proof of work and micropayment schemes (e.g. Xanadu or Internet Mail 2000) solve spamming and LLM scraping, but are more expensive or more CPU-intensive.
P2P systems like FreeNet too, but they are harder to use and more storage intensive and make it easier to spy on individual users.
Tor solves UK-like surveillance laws but it's slower and makes it easier to spam.
RFC-3514 [1] proposed an effective solution against attacks.
So see, there are initiatives, but people treat it as a joke, maybe because of when it was released.
[1] https://www.ietf.org/rfc/rfc3514.txt
Decentralization and interoperability, including the TCP/IP routing protocols, give the network the ability to grow freely, but make those kinds of attacks easier.
The easiest way to mitigate those problems would be to decrease the openness and centralize more. It might lead to even worse things than DDoS.
Go right ahead
Out of curiosity, do you see the archive in question as being part of the problem or that it needs protection from the issues you raise?
You're probably looking for https://geti2p.net
because they will come after new design? how do you not see this?
I'll start the wiki
I'll design the logo!
I'll make a GUI in Visual Basic!
I'll bring my axe!
i'll make snacks
Redesigned like how?
I fully agree. It's difficult though because I genuinely believe that the solution space overlaps with cryptography, which is quickly discounted as a viable option because it is now laden with negative connotations.
Cryptography has negative connotations? Like what? Do you mean cryptocurrency by any chance? (If so, it's feasible to practice cryptography without touching cryptocurrency).
Not op, but in my bubble:
- DRM.
- Owner-unfriendly device locks (such as manufacturer-controlled secure boot or locked-down OSes).
- Inability to audit network traffic from one's own devices, i.e. an IoT device.
- Remote attestation, when in opposition to open computing.
I could also see folks seeing the use of cryptography as "having something to hide" - I don't personally agree.
nah. cryptography is not seriously held back by cryptocurrency
Because the vast majority of people don't want this, and not for some nefarious reason or because they're stupid, but because we don't want to enable blatant fraud and abuse, among other things.
(Not to mention the astronomical technical work it would be; you can't just replace "The Entire Internet")
the problem is that anybody who does that work will be targeted very quickly by the people in power.
even if it's decentralised, it'll be banned one way or another and you'll be hunted down.
"Be the change you want to see in the world"
> In recent weeks we’ve seen increased attacks on our mission.
A pretty rich thing to say when your mission is piracy.
I'm not against piracy at all, quite the contrary, but this is quite laughable.
Piracy is done with ships. Anna's is doing digital conservation, which is sorely needed in a rapidly changing world.
Right? I mean I love what they're doing. But at the same time please, stop claiming to be holy angels trying to build an archive for historical purposes. You're a terrific piracy site, period.
What is it then that they love doing? Is there a long-term thrill in being a piracy site? I don't think so. No truth in the angel story, but they do say it aims to "catalog all the books in existence" and "track humanity's progress toward making all these books easily available in digital form".
> What is it then that they love doing?
They're earning a fuck ton of money, that's what they're doing.
Just like megaupload back in the days, they sell premium accounts for fast download speeds and no queue.
Zoom out. Anna's Archive and every incarnation of the shadow library that exists is like the Library of Alexandria. In 150 years the copyright holders of the hour will be meaningless and nobody will care who got monetized or whatever. The point will be that a small number of vigilantes preserved human knowledge for posterity, and not even a half-second of thought will be given to the "crimes" that were involved in doing so.
I mean, you don't personally know any of them, do you? How could you possibly know what their motivations are?
And even if their motivations are less than pure, I will 100% get behind the mission of preserving humanity's literary output. If that's the outcome, I don't care about their motivations.
It's an interesting peek into their milieu. For those in the club, the statement might seem self-evident.
> We recommend not using them
I've been using WeLib since April and had a good experience so far
If efforts like this are to be sustainable in any lasting way, participants need to be cooperative, not parasitic. I agree with the Anna's Archive team, it serves no one to have one of these players in the space hoarding their own collections and not sharing them with other archiving projects. It makes the collection extremely vulnerable and at risk of becoming lost knowledge as time goes on.
I disagree with how this is framed. shadow libraries thrive on decentralization, any other servers mirroring a collection is better than no mirrors at all
I'm not sure how you disagree with this. Decentralization relies on multiple copies in multiple places. The fact is that WeLib is not allowing other libraries like Anna's Archive to mirror or copy their exclusive collection, hence the recommendation not to use them.
Otherwise, please explain how I am missing your point.
> If efforts like this are to be sustainable in any lasting way, participants need to be cooperative, not parasitic. I agree with the Anna's Archive team,
That's an odd combination.
>If efforts like this are to be sustainable in any lasting way, participants need to be cooperative, not parasitic
that is an odd demand for a site that thrives on piracy. Don't steal from the thieves? When you take from others it's liberation, when others take from you it's parasitic, that's certainly a convenient coincidence
They steal it, but give everyone free access. You can download it for free, but can also torrent everything. They don't hoard for themselves, but everyone gets access to what they have. That is the crucial difference.
Only giving access to your material over downloads means that people have to pay if they want to get more of it. If those people don't share it then the material is going to be lost again.
Torrenting all the material, slapping their frontend on top as a base, and just making money is different.
No honour among thieves.
Let’s have the person who does not use any LLMs throw the first stone.
They're not saying that the experience of using them will be bad; they're saying that participants in the ecosystem who are not cooperative are a net negative on the future of the movement. As a user, you may not see that directly, but only over time if resources are taken away from the cooperative parties.
Why use them over annas archive?
cleaner interface
I dread these. I still remember the rarbg announcement from a few years back I saw here. Do I even dare click the link?
Not that scary. Click it.
They just announced that they're still in the fight.
I think you'll be happy if you do
Fuck that site. Offers people links to free PDF downloads of my book that I worked on for 32 years and finally got published by Pantheon Books in 2017. I didn't work all that fucking time for criminals like these to just break copyright law and make the book available for free. Fuck Anna's Archive, and I hope they go down in legal flames ASAP.
I hope you wrote that book more for personal pleasure and fulfillment than monetary gain. Over 32 years, you would have to be a best seller, given the price of your book on Amazon (without counting the free audiobook you offer if someone starts a trial), to be making minimum wage.
If you did that for passion and the book is good, it will definitely have a bigger impact if people can read your stories without having to go through Jeff or a bookstore (many English books are very hard to acquire outside of the US).
So, rejoice in the fact that someone thought your book was worth making available for the few who even know how to use these kinds of online libraries (most people in the world don't). Bitterness over loss of revenue is definitely not worth it, especially after having put 32 years of life into it.
Unfortunately I don't really care about 60s US tech "scene" but the cover seems nice.
It may be a minority, but not all authors share your view. Paulo Coelho [1] says “a person who does not share is not only selfish, but bitter and alone”. Sorry gotta say it, your tone matches.
[1] https://en.wikipedia.org/wiki/Paulo_Coelho
I can promise you that the site isn't the reason your book flopped financially. That is just what the vast majority of books do, especially ones on such niche topics.
I'm sorry you feel that way and it's understandable to be frustrated by them allowing piracy of something you've worked so long on.
That being said, do you know if their offering of your material has had a significant impact on your revenue or is it more the principle of the matter?
This strikes me as a bit ironic, if you're serious, as you list your current work as covering the entirety of the Beatles discography. Are you paying them for the rights?
I don't think this is a useful path to go down; there's a legal precedent for cover songs, and perhaps he did pay the fee: https://www.nolo.com/legal-encyclopedia/question-when-mechan...
I actually think it's ironic for precisely that reason. Similar to covering music, there is a legal precedent for making books available in public libraries - though most cover artists don't pay the royalties, and in this case this online library is not paying the GP. In the case that GP did in fact pay the fee, I rescind my criticism.
My understanding is that libraries do pay fees to stock books, some of which goes back to the original author. Anna's Archive does not pay anything back to the authors.
I think GP's criticism is valid. The toplevel poster is creating work that leverages the creativity of others. Regardless of whether or not he's paid a fee to do so, it's still funny to see the indignation about sharing, when the person's current project involves using the work of others.
There is both a qualitative and quantitative difference between covering/remixing the art of others, vs. just putting the original up for ~~sale~~ free.
We could hold all the following thoughts simultaneously...
• Anna's Archive is a delightful resource for readers
• the more widely the public reads, the better for society
• copyright law should be changed
• it would be good if society made it easier for authors to make a living
• some authors will rightfully feel exploited to have free copies of their works distributed illegally and without their permission
...or we could collapse under the cognitive dissonance, and lash out at @brianstorms instead.
I wonder if the people who downloaded it for free (has anyone actually done so?) would have ever paid for it.
I think they shouldn’t publish books which are fairly new. Hurts the authors…
I never heard of the site. But looking at it now, I can't see how it's anything else other than piracy.
I looked up one of my favorite authors ( https://annas-archive.org/search?q=scott+sigler ) and you can download practically his lifetime's worth of work in 5 minutes. This is not some author who lived 200 years ago - he is living and writing books now and this is his livelihood.
It appears to be "The Friendly Orange Glow: The Untold Story of the PLATO System and the Dawn of Cyberculture".
Do you like to give away your work for free? Please do tell us what it is you work on and where we can get it for free.
Don't modern artists do this all the time? I mean, if you understand that you exist in a digital world where copying data is not only free and easy, but also the simple nature of computers, and that people do it all the time, can you really be surprised when your digital creation that's put into this world is treated like everything else?
Most Open Source maintainers give away their work for free.
Cultures are created to protect power structures. Culture is the enforcer of authority.
Culture distorts principles in order to defend the authority of evil. Culture must convince you that it is not wrong when law subjugates your worth and destroys your freedom. Culture convinces people of this by perverting the concept of morality. Morality is liberty. Immorality is evil. The exercise and defense of freedom are moral. The destruction of freedom is immoral. This is the pure truth of morality.
Prudence is the proper application of principle. Imprudence is foolishness. Prudence is not morality. It is not immoral to kick a heavy stone with your bare foot, but it would probably be foolish. Prudence is a question of applying the principles and wisdom you have gathered in your life to achieve the goals you have for yourself. This is made possible by liberty. Without liberty, prudence is meaningless. Morality must come before prudence.
The great lie of culture is that authority is not bound by morality, and that authority can enforce its own prudence upon you. The great lie of culture is that you are worth less than law. Cultures teach that intentions of prudence can be enforced by law. In this fashion they gain excuse to control the lives of people.
In order for people to learn, grow, and find happiness, people must be free to test their understanding of principles. With freedom, they can do this by a process of faith, trial and error. In this fashion children grow from immaturity to maturity. In this fashion human beings gain wisdom.
Cultures are agents of evil. The objective of evil is the damnation of your ability to grow strong in wisdom. The objective of evil is the destruction of your worth. In order to gain control over you, culture spreads the lie that authority is not bound by morality. It teaches that authority can destroy freedom at will, and claims prudence as the reason you should willingly submit. In the name of defending you, culture claims that the destruction of freedom is morality. Cultures pretend that evil is good and that good is evil.
Prudence can be found all around you. It is found in the choices you make every day. Even when a mistake is made, you learn prudence. Prudence cannot be enforced. To enforce prudence is law. Law is lie. Without the freedom to choose, you cannot learn prudence. You cannot be happy.
Morality can be found all around you. Wherever you find it, you will find joy. Wherever you find immorality, you will find misery. Culture enforces authority by destroying freedom with law. This is immorality. - The End of All Evil, Jeremy Locke
You have invested in an idea that has been created by power structures through culture, that you are getting harmed by someone else's freedom. The people that will/want to support your work will do so out of a desire to do so, not because law says it's right.
Many people are deceived into thinking that law breakers are immoral and harmful to society, but I don't think that's the case. Most laws are created to subjugate people (i.e., take away their agency). Laws created by power structures, which are ultimately designed to benefit their creators or supporters, have done a very good job of convincing the subjugated that their interests align. Those that have been deceived by a system of laws that benefit the powerful are too invested in demanding a return for their efforts. Whatever happened to the priority of making the world a better place first and foremost and having faith that you will be compensated in some fashion for your efforts?
I think you must be using an unusual definition of culture. As I understand it, culture is, broadly speaking, the shared values and practices of a group of people.
The only way to avoid having culture, in the usual sense, is to prevent groups of people from existing.
It is unusual. We have been conditioned to believe that culture is created by shared values. But actually it is guided and molded by authority to create the illusion that it's driven by society. Obviously this isn't true in all cases, but for most, it's my belief that it is.
People can exist outside of the constraints of a culture that is imposed on them by understanding their own human value and worth that they are born with instead of looking to institutions and governments to give it to them.
In a society that doesn't have a centralized governing factor where the powerful impose their will on the people, then yes, I agree that it's created by a shared understanding by its people. But that's not the case for 95% of the world's cultures.
Oh, gotcha - if you'll permit me to paraphrase: it's not culture itself that you find evil; but that the powerful tend to warp the culture to protect their own interests.
Right. IMHO culture, at least for a very long time now, is used as a vehicle to push agendas, and people should be very wary about what to believe from what society says about a great many things.
I would agree if those shared values and practices grew entirely organically. But unfortunately people in power have a lot of, well, power, to shape culture.
People like this, because people like free stuff, and like to rationalize getting free stuff. Occasionally, someone who likes free stuff styles themself a freedom fighter, though their values do not otherwise seem to extend beyond getting free stuff.
Some AI company techbros like this data trove even harder, and limit their pretending to publicly saying things like "we're changing the world" (and "AI could be bad if you don't give us money and lock out competitors") but really only care about wealth and power.
Certain sanctioned countries that culturally value literature and science might also appreciate this. (This last category, I'm much-much more sympathetic to, and wish them well in their intellectual pursuits and appreciation of the humanities, though we should really find a better way to share that doesn't undermine Western economies and many people's livelihoods.)
I share your concern for the livelihood of authors (and your skepticism regarding the naiveté that often surrounds pro-piracy rhetoric), but I don't think that's fair to the question here. Unlike in the case of music or film, most users are not just trying to get the latest NY Times best-selling novel. The percent of books made accessible through these services that are tied to an author's income through consumer sales is negligible. Most specialist literature, whether in the natural sciences or the humanities, is priced under the assumption that university libraries are the ones making the purchase, often more or less automatically. Yet even and perhaps especially in the US (I know nothing of the library culture in certain sanctioned countries), it's increasingly rare that university libraries have open stacks for non-students and there are incredibly few public libraries that actually provide access to scholarly works, past or present -- New York Public Library and the Library of Congress in DC are the ones I've used personally, but I'm sure there are a handful of others.
Moreover, the countless AI companies now buying and pulping copies of every book in existence seem to be really changing the used book market. Prices are going up dramatically, and before this year it was very rare to not find a single copy in the world of whatever old book one desired.
As someone who spends a disproportionate amount on books and shares your concern for not making life even more difficult for authors, these services going away would be a tremendous regression.
Don't forget the video piracy thread had a lot of justification to the effect of 'the people that work on these shows/movies don't get paid enough anyways, so it's ok for me to pirate'. Wait, so you think they should get paid more for their work, that what they do is worth being paid for, just not by you? Weirdest flex.
You've just made this person that you're arguing against up.
No, I've absolutely seen that argument made online as justification for music and movie piracy, many times, for many years.
People rationalizing aren't mental giants. Piracy is generally by people who want free stuff. Not by philosophers who arrived at piracy through some line of reasoning other than wanting free stuff.
The dialogue in the space is what you'd expect.
Link it.
https://news.ycombinator.com/item?id=44913003 https://news.ycombinator.com/item?id=44914737 https://news.ycombinator.com/item?id=44913698
OpenAI needs to train their models based on these books, not Stack Overflow or Reddit.
They do: https://xcancel.com/vxunderground/status/1888019174133276846, https://www.theverge.com/2023/7/9/23788741/sarah-silverman-o...
The tweet only names Meta, but it would be very surprising if OpenAI didn't do the same thing.
Anyone who doesn't train on all material available, legal or otherwise, will be outcompeted by teams that do, including those based in countries that don't respect Western copyright law. It's that simple.
Either this practice is judged (or legislated) to be fair use, or copyright is done. It's also that simple.
I'm not convinced that LLMs and other AI models need to train on all material available. A representative sample is better.
I'll ignore the legality aspects in my response. I think coming up with a representative sample of all relevant information would be better in the long term (teams will not be outcompeted on long time horizons). Why don't the companies do this? Because it is easier to just "carpet bomb the parameter space" and worry about the potential confounding [1] and sampling bias [2] later. Coming up with a representative sample requires domain expertise and that is expensive in terms of time and money. But it reduces the total amount of training data and should reduce the amount of time and resources it takes to build the models. That may matter now that models are quite large.
This is definitely a design decision with tradeoffs on both sides. I can entertain the notion that we don't have time to sample things, but I think we are all too often dismissing the long-term benefits of proper sampling.
(In terms of the legality aspects, judges are trying to "split the baby" [3] in my opinion by saying that training on stuff you got legally is OK but training on pirated material isn't. So nobody is going to recommend training on pirated material in the first place.)
[1] https://en.wikipedia.org/wiki/Confounding
[2] https://en.wikipedia.org/wiki/Sampling_bias
[3] https://www.404media.co/judge-rules-training-ai-on-authors-b...
Quality. The transformable value in all data is not equal.
Or neither happens and the corporations will just continue to evade laws and taxes to their benefit.
Outcompeted in the competition of what, exactly? How quickly they can produce inaccurate garbage?
So, what? Authors and rights holders are supposed to just take it?
Copyright law exists for a reason. Trying to improve an LLM doesn't give you the right to flout our legal system. Yes, other countries might have an advantage in LLM training as a result but so be it.
> Authors and rights holders are supposed to just take it?
If it's judged as fair use, then yes. And then it's not flouting anything.
Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.
For example, nonfiction authors already "just take it" when reviews describe the main points of their book without paying them a cent. The justification is that it's for the greater good, and rights are limited.
Judges have recently ruled [1] that training on legally obtained materials constitutes fair use, but we will have to see in the long term if that ruling holds up.
[1] https://www.404media.co/judge-rules-training-ai-on-authors-b...
> Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.
That's a rather bastardized and twisted representation of copyright and fair use.
The "whole point" of copyright was to promote the authorship of original creative works by legally protecting the financial income of those authors. The "whole point" of fair use was to make exceptions in cases where it's clear that the usage doesn't result in a market substitute and deprive original authors of their income.
The end-goal of LLMs is to ingest all of that original content and reproduce it with expert-level accuracy, promising to be the know-all, end-all product. If wildly optimistic predictions of LLM proponents turn out to be correct then they will never buy a book again, they will have no reason to. And this is precisely what the copyright was designed to protect authors against.
> If wildly optimistic predictions of LLM proponents turn out to be correct then they will never buy a book again, they will have no reason to. And this is precisely what the copyright was designed to protect authors against.
And under those circumstances, your opinion is that copyrighted books should continue to exist, with full legal protection?
How could anyone, including the authors, possibly benefit from an obsolete paradigm like that? At that hypothetical point, your attachment to legacy copyright law would arguably hold back human progress as a whole, not just impede a few greedy corporations from training models on illegally-downloaded books.
Sure, but copyright was designed to accomplish clearly defined goals and LLMs clearly undermine those goals. The motivation and spirit of the law are extremely plainly stated, you don't need to be a legal expert to understand it.
We should absolutely have a discussion about modernizing copyright (and patent!) protections. But it has to be done through a democratic process, companies shouldn't be allowed to just ignore laws that are inconvenient to their business model.
> At that hypothetical point, your attachment to legacy copyright law would arguably hold back human progress as a whole
There won't be any progress if nobody is getting paid for their work. Either copyright stands and LLMs aren't allowed to train without compensation, or they get an exemption and there will be nothing left to train on in a few years.
You say it's human progress. Many, many others would disagree.
If it happens, it won't matter what we think.
If it doesn't happen, it won't matter what we think.
(I think it's simply too early to tell, but it's fun to think about what will have to change if the AI cheerleaders turn out to be correct.)
>the whole point of fair use is to benefit society
I'll stop you right there - I really don't think that applies at all. Does 'society' really benefit when the whole thing is a funnel for enormous amounts of wealth to go to already-gigantic companies like Microsoft?
Does the goodness of a shadow library depend on who uses it?
Yes, if it helps me get my own job done more effectively, efficiently, and economically. That's how our society works. You and I benefit from this, too, not just Microsoft.
If you don't like it, there's a process for changing how it works, but don't expect an easy path to success. Various people will object, and will have to be won over to your way of thinking.
> If you don't like it, there's a process for changing how it works
Except the converse is true. Copyright law today governs how fair use works and even so, how material can be obtained, licensed, etc. To change it to explicitly allow what you're suggesting would require changing copyright law.
If you think copyright law as we know it will survive what's happening today, then... wow. No chance.
Copyright is not a natural right. We pulled it out of our asses, very recently at that, to meet socioeconomic goals that existed at the time. It can and will go back where it came from, if it turns out that AI is indeed a better way to organize, analyze, and distribute human knowledge.
Even if AI doesn't turn out to be anything all that revolutionary, we'll still need to update the law to address both training input and ownership of generated content. Congress and eventually the international community will have to resolve a large number of conflicting legal judgments, unless we want to leave it up to SCOTUS in the US and various unelected judges and bureaucrats elsewhere.
> Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.
How do you think masked language models work?
If I was a writer, I'd consider publishing my works under a license that explicitly bans AI training. What happens when those works inevitably get ingested by an LLM?
That clause of your license wouldn't be legally enforceable.
Your license can only operate with what copyright allows you to withhold initially.
A license that banned AI training cannot be enforced. It is meaningless. The same way you can't write a book with a license that readers are not allowed to write reviews of it.
Fair use cannot be restricted by license like that.
(You can enter into individual contracts with people, the way NDAs work, but those actually have to be signed and so on, and you can't do it with public information like published writing.)
It seems like it could conceivably be fair in some sense, as long as the models were actually released as open-weights (for the benefit of society).
Copyright law indeed exists for a reason. And that reason was that church and crown felt threatened by the power of printing presses to distribute ideas they couldn't control. 'To promote the useful arts' has always been a way to sell the idea to the masses.
"...but so be it."
That phrase is carrying a lot of water, isn't it? Trillions of dollars worth by some estimates.
They do, don't they? I think OpenAI uses libgen.
Meta managed to get into a private ebook torrent tracker called Bibliotik a few years ago to use for training Llama and the resulting publicity essentially killed the tracker.
Just curious - what is the future of services like these? More and more content will be AI generated, to some degree. And should that content therefore be aggregated?
In the future, the curation function of libraries will become even more important. Libraries, and even bookstores, both physical and online, will probably use their capacity to separate the wheat from the chaff as a competitive advantage. There's no value to a place where AI slop is prevalent.
Pretty sure no one wants AI slop stored away forever even though that's the unavoidable future
Not sure what the link is between books and AI
Can Anna's Archive claim to be a non-profit when it's effectively an illegal enterprise with unknown controllers?
They are even offering decent bounties: https://software.annas-archive.li/AnnaArchivist/annas-archiv...
Whoever is running it must be doing really well for themselves laundering all that crypto.
Also, interestingly, they don't offer a Tor onion service, while the admin is most certainly technically competent to administer one given that he no doubt uses Tor to insulate himself from his enterprise and launder crypto. What is the reasoning for that?
Your comment seems like a non sequitur to me. Whether something is a "non-profit" has nothing to do with whether it receives or spends money. (See, e.g. the American Red Cross's ~$4B/yr budget.) It's about what it does with the money it has.
Obviously, since Anna's Archive is breaking the law, it can't conform itself to the normal legal/regulatory system that governs non-profit organizations. It can certainly still claim to be acting in the spirit of a non-profit, and it's up to you to decide whether you trust that claim. Nobody's forcing you to give them money.
The connotation of a non-profit is that it's being audited. It would be extremely silly to suggest otherwise.
It may have that connotation to you, but in general (at least in the US) non-profit organizations are not required to have independent audits. Typically, that requirement only happens if they receive a certain amount of government funding. An organization may choose to undergo audits in order to make people feel better about donating to it.
I really, really don't think that anybody is being fooled or misled into thinking that Anna's Archive is a "legitimate" audited organization when they describe themselves as a non-profit.
> The connotation of a non-profit is that it's being audited.
This is very geography-specific. In the US, 501(c)(3)s (what most people think of when they say "non-profit" where I am) have no general requirement for audits. There's also plenty of non-profit-by-some-definition organizations that never file a Form 1023, giving up some benefits of the 501(c)(3) regulations but in exchange being even less regulated.
The entities are regulated at the state level in the USA, with the responsibility to comply with both state and federal tax authorities.
Audits have nothing to do with it; all entities are subject to audit.
The primary difference between a non-profit and a for-profit is that a non-profit does not distribute profit to shareholders, including the founders.
Audit or threat of audit is the mechanism of enforcement and that is all that ever matters.
At least in the US, claiming that you are a nonprofit implies that contributions are tax deductible. Claiming that you are a nonprofit when contributions are not tax deductible might be considered fraudulent.
Not true. There are different classes of nonprofit and they are not all tax deductible. Some nonprofits opt to forgo pursuing that status because it involves a lot of extra administration/filing requirements.
You're responding to a different point than the one I made. It's true that being a "nonprofit" doesn't logically entail that donations will be tax deductible. But it still implies it to potential donors. The former is a matter of logic, the latter is a matter of psychology. Both are relevant.
Yes, there are multiple classes of nonprofit, not all of which are tax deductible. But it is also true that holding yourself out to the public as a "nonprofit" has the potential to mislead because it may imply to potential donors that contributions would be tax deductible. That is why responsible (or at least well advised) nonprofits disclose which they are, because claiming you're a "nonprofit" in marketing materials, without further explanation, can mislead potential donors.
They are already very much in breach of US law, which they have always been clear about. That aside, they don’t claim that contributions to them are tax deductible.
I would love to see someone try to explain to the IRS why all those purchases of Amazon gift cards and Monero for the transparently illegal organization should be deductible though
Is Cosa Nostra a non-profit? The question doesn't make sense. It's a category error.
A non-profit is a corporate legal structure. An unregistered organization could be a cabal, a gang, a syndicate, a fellowship, a religion, a movement, a private club, or something else.
The intent is still important. While from a legal point of view a terrorist cell cannot be registered as a non-profit, it typically spends whatever funds it can secure to further its political goals, not on increasing the wealth of its owners or participants. A typical criminal band though is a for-profit entity.
Given the amount of hosting and storage needed to sustain this project, nobody is getting rich off of donations. Not to mention the lifestyle tradeoffs that inevitably come with international fugitive status do not lend themselves to a very comfortable life.
The usage of crypto is entirely one of necessity, as controlling information and knowledge is something powerful people have clear stakes in. Many countries wield their financial systems to hold or acquire power. Information and knowledge is one form of such power.
Everything points to the Anna's Archive team being passionate ideologues as opposed to some criminal enterprise focused on profit motives.
> Not to mention the lifestyle tradeoffs that inevitably come with international fugitive status do not lend themselves to a very comfortable life.
Anonymous international fugitive?
> Nobody is getting rich off of donations.
How can anyone aside from the beneficiary know that?
The extent to which the controller can get rich off this enterprise depends entirely on the unknown quantity of donated funds (and deals with AI companies) and his skill at laundering crypto (which darknet marketplace controllers doing far more illegal stuff can do).
> Nobody is getting rich off of donations.
I'll believe that when they publish financial statements.
> Everything points to the Anna's Archive team being passionate ideologues as opposed to some criminal enterprise focused on profit motives.
"Passionate ideologues" who make you pay if you want to download anything at speeds greater than 10KB/s, how nice of them. I would rather just support the author, thank you.
I generally support piracy, but these piracy-as-a-business vultures who've been showing up in the shadow library scene need to go.
> Given the amount of hosting and storage needed to sustain this project, nobody is getting rich off of donations.
They're getting donations as much as Megaupload was getting donations for premium accounts...
People pay for higher bandwidth and no wait time, not to support the "cause". It's a farce to call these donations.
And obviously people do get rich off of it, as you can see from the slew of file hosting services.
> he
Is there any particular reason you suspect Anna's Archive to be run by a man?
Illegal doesn't at all have to mean immoral or particularly wrong either. Laws are complex constructions, often created for decidedly hypocritical reasons of benefitting some at the expense of others.
Thus, who gives a shit if they're taking money from those who voluntarily subscribe? They still offer an absolutely incredible free service to who knows how many people who otherwise wouldn't be able to afford access to so much information.
Given the behavior of the pro-copyright business interests and legal bodies of the world, and the outright hypocrisy of openly creating one set of rules on content piracy for certain corporations while applying another, harsher rule system for those who aren't so nicely connected, smug moralizing about something like Anna's Archive has little grounding.
And aside from picking random crap out of your ass for smearing arbitrarily, what shred of evidence do you have of anyone there laundering crypto, and how?
> what shred of evidence do you have of anyone there laundering crypto, and how
The controller's freedom. If they didn't launder it they wouldn't be free.
> They still offer an absolutely incredible free service
Actually their free downloads aren't particularly good when compared to some of the other online services that 'leech' from them.
And their torrent strategy could be altruistic, but it could also be self-interested: it spreads storage costs around, attracts more contributions, and provides insurance against hard drive seizures.
What mainly interests me is how much money they are actually making, I suspect it's very profitable.
> What mainly interests me is how much money they are actually making, I suspect it's very profitable.
Well, it's about calculating their site support, storage, server and bandwidth costs. What might those be? Aside from these, I've seen them claim they use volunteers for much of their site support, and they certainly don't pay, or need to pay, anything for marketing, since the word of mouth (partly through notoriety and partly through uniqueness coupled with extreme usefulness) is more than enough to keep them famous.