Should we regulate large-scale AI?

12.11.2023

There have been several breakthroughs in AI technology in the past few years. Algorithms running on consumer hardware can generate realistic images and fake videos. Large Language Models (LLMs) are becoming so good that their output is often indistinguishable from text written by humans. The progress has been so fast, and so far above the expectations of many experts in the field, that there are strong calls to regulate AI research or to stop it altogether. There are also calls to prohibit open-source projects related to Large Language Models. The proponents of regulation are concerned that someone will create a “super dangerous AI” that will kill or seriously harm humanity, while the opponents are worried that a ban will completely stop progress in AI. I’d like to weigh in on the potential dangers of AI for humanity and, on the other side, on the dangers of regulation for the progress of our civilization.

Many experts in the field of neural networks, such as Geoffrey Hinton, call for AI regulation. Sam Altman, the CEO of OpenAI, also wants to impose safety rules on the AI community. In March 2023 the Future of Life Institute published an open letter calling for “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4”. The letter had collected more than 30,000 signatures at the time of writing this text, including such prominent scientists and business leaders as Yoshua Bengio, Stuart Russell, Elon Musk and many others.

However, the AI community is not uniform on this question. One of the active critics of potential AI regulation is Yann LeCun, who calls it “a new wave of obscurantism”. Indeed, one can draw parallels with the calls to ban the printing press (concerns about an uncontrollable spread of ideas), cars (safety concerns), the telegraph (“too fast for truth”), the Large Hadron Collider (concerns about the destruction of Earth) and many, many other breakthrough inventions that ended up impacting humanity in a very positive way. Opponents of a ban are rightfully worried that AI regulation will stop progress in AI altogether. As a developer of OpenCV and a part of the computer vision community for the past 25 years, I have had the privilege of observing first-hand how openness of data and code streamlined the development of technology from which humanity now benefits.

And you don’t have to take my word for it. Look at what happened in other areas that are regulated. Nuclear fission is regulated because a power plant can blow up, but an unintended consequence is that we are still using power plant designs from the 60s. The world is starved for power, and the Earth is heating up because of fossil fuels, yet there is hardly any innovation in the fission industry, because the regulatory threshold is so absurdly high (see, for example, this publication). Banks are heavily regulated, and progress in fintech is painfully slow. Sending a wire transfer still costs a lot and takes anywhere from a few hours to a few days. And why would it be any different? The trend is clear: capital is increasingly concentrated in a few large banks, and that doesn’t push them to compete through innovation.

So we have to admit that regulation will probably hurt progress, not just scientifically, but also in terms of real macroeconomic consequences. Which, ironically, carries its own extinction risk for humanity: the less efficient our scientific progress is as a species, the higher the probability that we will be wiped out by an extinction event (global warming, a pandemic, a cosmic event, an alien invasion – pick your favorite). Of course, that risk seems pretty small. But how large is the risk of developing an AI that threatens the existence of humanity? Let us take a closer look.

Obviously, if AI really has a substantial chance of killing humanity, it is better to err on the side of caution and impose regulation. But do we really understand how AI would kill us? You would think that, given the intensity of the discussion, there is a consensus on what specific risks the development of Large Language Models (LLMs) carries. Surprisingly, this is very far from the truth. Hinton talks about various risks, from bias and unemployment to existential risks for humanity. Bengio (here is a very comprehensive yet succinct summary of his position) thinks that, given the continuing rapid progress in AI, someone will create a “superdangerous AI” that, either through misalignment (see The AI Alignment Problem) or through intentional misuse, becomes an actor that poses an existential threat to humanity.

Yet there are very few specifics on how exactly a “superdangerous AI” would destroy humanity. A much-cited writer on the topic, Eliezer Yudkowsky, gives an example of an AI sending DNA in electronic form to a biolab that produces proteins on demand, resulting in artificial life or a deadly pandemic. If we want to eliminate this risk, regulating biolabs is a much more robust option than banning AI. When someone sells heroin on the Internet, we do not ban the Internet; we have the police arrest the seller. On the same note, anyone worried about the bio risk should first advocate for banning gain-of-function research, which, unlike AI, is a probable cause of the COVID-19 pandemic that has already killed millions of people. A lot of GoF research is still done in BSL-2 labs, like the Wuhan lab.

This example shows that analyzing specific risks is very important for developing AI regulation. When I asked Roman Yampolskiy, a prominent AI safety researcher, about the specific scenarios in which AI poses an existential threat to humanity, he refrained from spelling them out, implying that a “superdangerous AI” will be way smarter than humans, so we won’t know what kills us. An analogy often mentioned by the proponents of this idea is a 10-year-old kid playing chess against a grandmaster: the kid will not understand why and how the grandmaster wins. Similarly, a “superdangerous AI” could kill us in so many different ways that we won’t know until it is too late. The problem is, any regulation has to assume specific, even if implausible, risks in order to address them. Nuclear states haven’t banned all research related to nuclear reactions; they created the Treaty on the Non-Proliferation of Nuclear Weapons. Had they banned all research that could potentially lead to a nuclear bomb, we wouldn’t have seen nuclear power plants and, possibly, much of nuclear medicine.

So, let’s work with what we have and assess the specific risks we can think of in order to better understand the effects of the regulation that could eliminate them.

AI in social networks. Here the risk is that a “superdangerous AI” will be employed by malicious actors to create farms of bots that are hard to distinguish from real people. A group of people could use AI to create a large set of bots that attack a network in a coordinated fashion and, for instance, persuade a group of people to commit an act of genocide. This is, in my opinion, the strongest argument for banning large-scale AI. There is no doubt that with the rise of GPT-like tools, the struggle between social networks and bots will rise to the next level. However, this risk can be addressed without banning either AI or social networks. There are lots of ways social networks can detect a coordinated campaign and ban or deamplify the accounts involved (one such signal is sketched below). Also, a serious bot operation (including getting past CAPTCHAs, making friends and generating lots of human-like posts) will still require costly development as well as a substantial amount of computing power. This means that, with some additional effort from the social networks, running an AI bot farm can be kept more expensive than hiring humans to do the same job.
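
To make the coordination signal concrete, here is a minimal, hypothetical sketch (in Python) of one such detection heuristic: flag groups of accounts that post near-identical text within the same short time window. The data, account names and thresholds are made up for illustration; real platforms combine many signals like this with friend-graph and device analysis.

```python
# A toy sketch of coordination detection: flag accounts that post near-identical
# text within the same short time window. All data and thresholds are made up.
from collections import defaultdict

posts = [
    # (account, unix_timestamp, text) -- hypothetical data
    ("bot_001", 1700000010, "Candidate X is secretly a lizard, share before it gets deleted!"),
    ("bot_002", 1700000012, "Candidate X is secretly a lizard, share before it gets deleted!"),
    ("bot_003", 1700000015, "Candidate X is secretly a lizard, share before it gets deleted!"),
    ("alice",   1700000300, "Anyone has a good recipe for borscht?"),
]

WINDOW = 60        # seconds; posts in the same window are grouped together
MIN_CLUSTER = 3    # flag clusters with at least this many distinct accounts

clusters = defaultdict(set)
for account, ts, text in posts:
    key = (ts // WINDOW, text.strip().lower())  # crude "same message, same minute" key
    clusters[key].add(account)

flagged = {acc for accounts in clusters.values() if len(accounts) >= MIN_CLUSTER
           for acc in accounts}
print(sorted(flagged))  # ['bot_001', 'bot_002', 'bot_003']
```
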
Cybersecurity threats from AI. LLMs are capable of writing software code. What if hackers use them to hack into government facilities and blow things up? First of all, if something can be blown up through hacking alone, that is a problem in itself, because there is always a chance that someone will break through the cyber defenses. But also, will AI really be that helpful in hacking? While it is still not clear how well LLMs can teach themselves to program with reinforcement learning, general-purpose LLMs are notoriously bad at generating code by themselves. A study on the robustness of LLMs used for programming in JavaScript found that “62% of the generated code contains API misuses, which would cause unexpected consequences if the code is introduced into real-world software.” Why is that? Common sense and evidence suggest that LLMs are good at programming when they are trained on a large collection of correct code solving real problems. This implies that LLMs are good at languages and problems with a lot of open-source code, and bad otherwise. Is there a lot of open-source code used for hacking? Not really. One can argue that this may change in the future and the cost of cyber break-ins will drop dramatically in the next few years. I think this is definitely possible, but banning AI in specific countries wouldn’t solve it, much like banning guns in high-crime areas without enforcing the ban: there will always be countries that host bad actors. So a more reasonable approach would be to develop countermeasures against AI bots that increase their operating costs for hacking. Also, if a “destroy humanity” switch is connected to the Internet and can be hacked, that is a very bad idea regardless of the existence of a “superdangerous AI”, and we need to take it off the Internet ASAP, before working on AI regulation!

AI will be used to create chemical or biological weapons. The concern here is that AI is capable of designing bioweapons much better than people and will be used by malicious actors. People use papers like this one as an argument. It’s behind a paywall, but one of its authors gave a lengthy interview; you can get the gist from it: https://www.theverge.com/2022/3/17/22983197/ai-new-possible-chemical-weapons-generative-models-vx. Contrary to what many think, this paper doesn’t state that AI has created toxins more dangerous than existing ones. It states that AI suggested toxins estimated to be very dangerous by a commercially available predictive model: “we even found some that were generated from the model that were actual chemical warfare agents.” If, however, you read the paper, you find out that the authors: (a) developed a predictive model that estimates whether a molecule is toxic, (b) fed this model as a cost function to a generative neural network, and (c) as the paper states, “we chose to drive the generative model towards compounds such as the nerve agent VX”. And when the AI found a good solution to the cost function, they were “concerned”. The authors have not actually synthesized any of the agents or tested their toxicity. If their predictive model doesn’t work, then the AI just generated rubbish. If the predictive model is really good, then the AI generated candidate toxins, but that is not a problem with AI; it is a problem with the predictive model, which – maybe – shouldn’t have been created in the first place. And given that it is a relatively small network – nothing to do with LLMs – you can’t even ban AI here; you would have to ban the datasets themselves. And this might be a good idea: datasets of toxic molecules shouldn’t be out there in the open.

The same principle applies to other bioweapons. If there is enough data to train an AI to predict lethal chemical or biological agents, such a model will be trained, and it won’t be an LLM; it will be a small, simple cost-function optimization that no one can ban – that would be like banning all of math. What we can – and should – ban is dangerous data, just as we ban nuclear bomb recipes.

AI Alignment. The alignment problem is the problem of setting goals for an AI that are aligned with the goals of the humans who use it. Oftentimes it is very hard to formalize what we want, so the AI ends up working on a different problem. The concern of AI safety researchers is that an AI will decide it has to kill people in order to achieve a goal that has nothing to do with human extinction. One of the examples is the paperclip maximizer, a thought experiment by Nick Bostrom: an AI whose goal is to create as many paperclips as possible decides it has to kill all humans, as they might switch the AI off and thus prevent it from making as many paperclips as possible. Eliezer Yudkowsky suggests that a runaway AI may decide to kill all humans, design a deadly virus, order its manufacture from an online service and start a deadly pandemic. He goes as far as suggesting airstrikes on data centers that train LLMs, even at the risk of starting a nuclear war.

Science fiction aside, the AI alignment problem is obviously real and has implications for applying reinforcement learning systems to practical problems. However, we are very far from a runaway AI system with the capability to kill humanity. Anyone dealing with reinforcement learning knows that, in addition to defining the goal function, one has to define the space of actions and policies over which the optimization runs. It means that a runaway AI can’t accidentally decide to kill humanity and send a virus design to a lab; that capability has to be programmed in. AlphaGo is programmed to make Go moves; it can’t play chess, let alone kill an opponent. And if an AI system does have the capability of killing people, the problem lies in the specific channel through which a computer can inflict harm, not in the algorithm itself.
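
To illustrate the point about programmed capabilities, here is a toy sketch (hypothetical, not AlphaGo’s or any real system’s code): the environment explicitly defines the set of legal actions, and the policy, however clever, can only pick from that set; anything outside it is simply not representable.

```python
# A toy sketch: an RL agent can only act within the action space its designers defined.
import random

class TicTacToeEnv:
    def __init__(self):
        self.board = [" "] * 9

    def legal_actions(self):
        # The complete, explicitly programmed set of things the agent can do.
        return [i for i, cell in enumerate(self.board) if cell == " "]

    def step(self, action):
        if action not in self.legal_actions():
            raise ValueError("Action outside the programmed action space")
        self.board[action] = "X"
        reward = 0.0  # a real environment would score the resulting position here
        return self.board, reward

env = TicTacToeEnv()
policy = lambda e: random.choice(e.legal_actions())  # even a "smart" policy picks from this set
board, reward = env.step(policy(env))
print(board)
# env.step("order_a_virus_online")  # -> ValueError: no such capability exists in the environment
```
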

OK, we’ve covered some specific scenarios (it is almost impossible to cover all of them) and found that the threat specifically from LLMs is not as high as portrayed by the signatories of the Future of Life Institute letter. Now let’s look at the other side of the scales: what we would be giving up if AI were regulated. The details of the regulation are still vague, so for the sake of simplicity let’s assume that the development and use of machine learning models with more than 1 trillion parameters will require approval from a federal agency (for reference, GPT-4 is reported to have about 1.76 trillion parameters). First, large companies that already have large language models will be the winners; this regulation will be their moat. This means startups will stop innovating in this area, and then, gradually, large companies will stop too, given the absence of competition. Second, and more generally, progress in this area is so fast because a lot of the research is open and a lot of models are open source. Stability AI released its Stable Diffusion models as open source, and Meta has opened up its Llama 2 model. We have seen many times that scientific breakthroughs don’t come from large groups of people who are constantly in meetings; they come from small, focused research groups that have enough tools to do something new and cool. Regulation will delay innovation in this field or stop it altogether.

Some experts advocate for a 6-month moratorium imposed by the government. Six months doesn’t look like a big deal, and it would give us all some time to think about AI safety. I think this is quite naive, as there are very few precedents of a government relaxing a regulation once it is in place. Make no mistake: a 6-month moratorium will turn into an indefinite ban.

It looks like we do not gain much from banning LLMs: the risks are usually rooted in other security weaknesses that should be addressed separately (instead of banning LLMs that could design a deadly virus, regulate the labs that can synthesize such a virus from an online order). Meanwhile, LLMs are expected to make the economy far more efficient, and a ban would hurt a lot of people.

Are we ready to turn this area of rapid progress into something like nuclear fission, where only a few large entities move at a glacial pace? Decide for yourself.

[UPD] This text was written before October 30, 2023, when the Executive Order on AI was published. Steven Sinofsky, who ran the Windows division at Microsoft, has written a blog post on this order that I encourage you to read; in it he draws parallels between AI and the early days of personal computers, when similar regulations were contemplated but not implemented. While we still don’t know the impact of the EO on the progress of large-scale AI, one thing is clear: it has dug a huge moat for the large companies already working on LLMs, making it way more expensive for startups to enter the field.

The Avatar Dilemma

A digital world connects us all. Every day, people talk to each other via email, chat and video instead of meetings and phone calls. Kids are on social platforms like Roblox and in multiplayer games. The pandemic accelerated a trend that was already obvious. More and more of these digital interactions happen in 3D. People are used to existing in three-dimensional space, so we are getting tired of staring at flat 2D video feeds when meeting over Zoom. Once there are better VR headsets – lighter, with higher-resolution screens – more and more meetings will take place in VR. This will allow users to see each other as in real life, use whiteboards, and share screens and documents.

Meeting people in a 3D experience (or the “metaverse”, to use a term that is becoming very trendy these days) requires a digital representation of a person. You can’t get away with flat video (a few companies have tried 3D models with superimposed video streams for personalization), so a lot of companies are looking for a solution that involves personal avatars.

Playing a game of Fortnite, I may not feel like being recognized. If I am in VRChat or a similar experience, I may choose to be anonymous on that platform, and VRChat avatars offer the flexibility to be whatever character I want. Perhaps I am a lawyer who wants to be a cat in VRChat – I can be. However, if I am in a business meeting with customers, I need to show my face in order to be trusted. This means the closer to reality my avatar is, the better. Ten years from now, hardware may be powerful enough to render hyper-realistic avatars in real time, but today that is not available to consumers. So what we need are avatars realistic enough that our friends or acquaintances can recognize us. This establishes credibility in this new and uncertain reality.

So why, instead of realistic avatars, do we see so many cartoonish ones? There are a few challenges in creating realistic avatars. It is hard to create a likeness of a person from one or a few photos taken with a mobile phone (Hollywood uses expensive rigs worth a few hundred thousand dollars). It is not easy to animate such a model so that it resembles the person. But mostly, creators are cautious about enabling realistic avatars because of the uncanny valley effect: the mental uneasiness that occurs when an artificial figure tries but fails to mimic a human.

The uncanny valley is a well-known effect in which an imperfect 3D model of a person creates an eerie feeling in viewers. Anyone interested in the details can start with the wiki article https://en.wikipedia.org/wiki/Uncanny_valley. The movie and game industries have exploited the uncanny valley for ages to create scary characters. Humans can have a strong, visceral aversion to such characters. This is why The Polar Express was so controversial when it was released.

There is an ongoing discussion about whether the uncanny valley even applies to many use cases of realistic avatars. Eduard Zell et al. [1] show that the uncanny valley effect may be triggered not by making an avatar more realistic, but by introducing inconsistencies in the stylization or level of detail. Henriette C. Van Vugt et al. [2] demonstrated that a recognizable (and not hyper-realistic) avatar does not necessarily fall into the uncanny valley. Katja Zibrek et al. [3] discovered that life experience plays a role in how people experience the uncanny valley: participants with computer gaming experience rated avatars as less eerie than the average person did. Perhaps we will find that the overall uncanny valley effect gets weaker over time as more people play video games. On top of all that, the uncanny valley effect is often confused with the fact that a lot of people simply don’t like how they look in photos. This doesn’t necessarily mean they don’t want a recognizable avatar: they want to look different in the digital metaverse than in their real, analog life. This is one of the really attractive aspects of virtual reality: the ability to look different, or better, for a few hours.

So each creator of a metaverse has a choice: use realistic or cartoonish avatars. Realistic avatars are recognizable, which creates an emotional connection with the avatar, both for the person using it and for other people interacting with it. However, these emotions can also be negative because of the uncanny valley. Cartoonish avatars are much easier to design and develop. However, it is quite hard to make universally recognizable cartoonish avatars. And once an avatar is unrecognizable, so that there has to be a name tag over the 3D model’s head, the emotional connection is gone: “Is this my avatar? Meh, whatever.”

My point – based on many interactions with customers – is that not once have I seen the “whatever” attitude towards recognizable avatars. People either love them (“yeah, that’s me, this is crazy!”) or hate them (“horrible”, “cringy”, “embarrassing”, etc.). The closer you get to a recognizable avatar, the more emotion it evokes. Positive emotions improve engagement, while negative ones drive people away from the platform. Show someone a realistic avatar of another person, though, and we are back to the “meh, whatever” attitude.

So the real dilemma for every metaverse or avatar app creator is not realistic vs. cartoonish avatars; it is recognizable vs. unrecognizable avatars. We see two types of 3D experiences that use “whatever” avatars: those where players want to stay unrecognized (for instance, some computer games) and those where developers have a “just don’t fuck up” mindset (avoid negative emotions; the absence of positive ones is fine). And there are other types of experiences that absolutely need a recognizable avatar maker: all kinds of work-related meetings, parties and certain types of computer games (for instance, sports games).

Building a “whatever” cartoonish avatar sidesteps many of these problems: many computer games have created beautiful avatars with very detailed editors that can adjust every tiny detail of the body and face. Those unfamiliar with the subject can take a look at The Sims or Cyberpunk 2077. But how can one build a realistic avatar?

VFX companies use a photogrammetry rig to capture an actor; the result is then cleaned up by a staff of 3D artists and added to a film. This is in no way scalable to the billion or so consumers who will be coming into the metaverse space in the next decade. Avatar SDK https://avatarsdk.com uses neural networks to create an avatar from a photo of a person and then allows editing of the result, as most people love customizing their avatars. This produces lower-fidelity models, as there is a limited amount of information a character creator can get from a single selfie.

But this is not that big of a problem, as the GPUs available in consumer hardware can’t render a high-fidelity model in real time anyway. Given that GPU power efficiency (FLOPS per watt) doubles roughly every 3-4 years, it will be some time before mobile devices can render MetaHuman-level characters. What we need at this point is an avatar creator that produces recognizable, not hyper-realistic, models.

So how do we move forward from where we are? We can start adding more and more realistic avatars to 3D experiences without falling into the uncanny valley. It is as if we are navigating a multidimensional space instead of moving along a single axis: there are paths around the uncanny valley that we can take. Finding and navigating these paths is what we at Avatar SDK are excited about. Once we saw our customers falling in love with their realistic avatars, we never looked back.

References

  1. Eduard Zell, Carlos Aliaga, Adrian Jarabo, Katja Zibrek, Diego Gutierrez, Rachel McDonnell, and Mario Botsch. 2015. To stylize or not to stylize? the effect of shape and material stylization on the perception of computer-generated faces. ACM Trans. Graph. 34, 6, Article 184 (November 2015). 
  2. Henriette C. Van Vugt, Jeremy N. Bailenson, Johan F. Hoorn, and Elly A. Konijn. 2010. Effects of facial similarity on user responses to embodied agents. ACM Trans. Comput.-Hum. Interact. 17, 2, Article 7 (May 2010). 
  3. K. Zibrek, E. Kokkinara and R. Mcdonnell, “The Effect of Realistic Appearance of Virtual Characters in Immersive Environments – Does the Character’s Personality Play a Role?,” in IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 4, pp. 1681-1690, April 2018.

A case for a decentralized social network

A wake-up call

I was on the first day of my vacation when I received a notification from Facebook that my account did not meet community guidelines and had been canceled. At first, I wasn’t too worried. I have to admit that my relationship with Facebook is something of a love-hate one. I spend a lot of time there catching up with my friends, and I also use it as a news source for things related to my work.

However, Facebook is dangerously addictive. It shows you a video, and if you spend just a tiny bit of time on it instead of swiping down your timeline (in my case it would be funny videos of cats and dogs and great tennis shots), the next thing you know, your timeline is filled with these videos. Seriously, I use Facebook to get news about my areas of professional interest (such as computer vision), but at times all I could see was kittens and dogs, puppies and cats, Federer, Federer, Federer… And if you click on a video you like, behold! You fall into a time hole; the next time you take your eyes off the screen, it is 30 minutes later and you are wondering what happened.

Once, after my iPhone’s Screen Time app started screaming at me, I deleted the Facebook app from my phone. Unfortunately, somehow I reinstalled it a few months later. I put it in a folder on the last screen, making it harder for myself to launch, which somewhat helped, but I still spend way more time on Facebook than I want to. So my initial reaction to the account ban was: OK, fine, I’m going to have a better vacation without the Facebook addiction. But then the last phrase of the notification (“we may not be able to review your appeal”) caught my attention. So the ban may be permanent. I will never see those kittens again!

We live in a dystopian reality run by robots

This got me thinking: what other things in my life depend on my Facebook account (pitiful as it sounds)? My company has a couple of product pages where my customers ask questions; I won’t be able to access those, but my colleagues can. I can no longer log in to several websites with my Facebook account (why didn’t I use email for those accounts? Stupid!).

But, most importantly, I have to find other ways of talking to the people I like and enjoy interacting with outside of work. Kids’ pictures, solving math problems with a college friend, politics, tech discussions… There are more than a few friends I’d have trouble reaching, as my FB account is my default connection to them. FB is not just a social network anymore; it has become a utility company, one on which your social life somewhat depends. What’s also important, it is an unregulated utility that can cancel your account without any explanation. You surely don’t expect this from other utilities, like gas, water and electricity suppliers. But Facebook not only can cancel your account in theory – it often does.

A lot of account cancellation is done by algorithms, so any – I cannot stress this enough – any account can get killed without human supervision, explanation or accountability. For instance, CNN reported that between January and March 2020 FB killed 2.2 billion accounts! To put things in perspective, that is about 7 times Twitter’s monthly active users and roughly the same as the number of FB monthly active users. Assuming a false positive rate of 0.1% (a generous assumption – many machine learning algorithms have higher error rates on similar problems), about 2.2 million accounts belonging to real people were canceled. If you think CNN reporters just mixed up billions and millions – no, they didn’t: YouTube regularly terminates around 2 million channels a quarter, and nobody gets excited about it.
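
A back-of-the-envelope version of that estimate (the 0.1% false positive rate is my assumption, as stated above, not a published figure):

```python
# Back-of-the-envelope estimate of real people caught by automated moderation.
removed_accounts = 2_200_000_000      # accounts FB reportedly removed in Q1 2020
assumed_false_positive_rate = 0.001   # 0.1%; real-world error rates are often higher
wrongly_removed = removed_accounts * assumed_false_positive_rate
print(f"{wrongly_removed:,.0f}")      # 2,200,000
```
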

Other social networks can also kill accounts, with a significant impact on your life and work. If you are a media company, you depend on your Twitter account. And if you are a tech professional, you don’t exist without your LinkedIn profile. These are private businesses and can kill accounts as it suits them. If you are not scared, you should be.

When I had almost finished this essay, YouTube terminated my teenage daughter’s channel. She wants to be a professional piano player, and we upload videos of her recitals to a YouTube channel. She had just been selected to participate in a prestigious contest that each year accepts only 16 pianists from the whole world – a pretty important step in her future career. The selection was done by video (everything is remote in this age of pandemic), so if the ban had happened a month earlier, chances are she wouldn’t have been selected at all: the selection committee would have seen broken links and moved on to the next candidate.

There were no warnings from YouTube, no “strikes”, just channel termination out of the blue. We sent an appeal and 24 hours later got a negative response. No specific reasons for the termination were given in either communication. An escalation by a Google employee brought the channel back within an hour (I have a strong feeling that this was the first time a living human being looked at it and made the obvious decision that it doesn’t violate any policies).

As I found out, YouTube has killed many accounts, including pretty big ones such as those of Blender and MIT. Here is a good summary https://www.maxlaumeister.com/articles/youtube-is-deleting-your-favorite-videos/ showing that YouTube’s censoring decisions are almost never reversed unless channel owners manage to gain massive community support, usually on another platform like Twitter.

It feels – especially in the times of the pandemic, when a lot of human interaction happens online – that we are subjects of autocracies run by hordes of robots. We know many such dystopian worlds from science fiction, for example The Matrix. Now imagine living in the Matrix where Agent Smith, instead of acting solely as an antivirus, is motivated to collect and then sell as much of your data as possible.

So far there is little hope that regulation, competition or internal processes will push these digital countries towards democracy with transparent legal systems. Like Neo, we urgently need to take the red pill (in the old, 1999 meaning of the term) and become independent.

Breaking free from ad-driven overlords

Here are a few things you can do to become less dependent on ad-driven social networks (admittedly, following them requires more technical skill than the average consumer possesses). The infrastructure for decentralized social networks that are decoupled from ad dollars is slowly growing. There are three aspects of social presence that we need to decentralize:

  • Self-hosting: a place where we can host our data. It can’t be servers that belong to ad-driven social networks. Fortunately, there are tons of companies that can help you create your own website and host your data. Some of them are integrated with WordPress https://wordpress.org/hosting/; others, like wix.com, provide a WYSIWYG editor. Hosting videos is harder, but instead of sending people directly to YouTube, consider linking to the video from your own website. Then, if your YouTube channel gets killed, you can re-upload the video to another service and update the link, so visitors to your website won’t notice a difference. You can also use tools like JW Player (there is a WordPress plugin) that let you host videos on your website with your own customized player instead of the standard one provided by YouTube. Hosting videos on a platform like Vimeo, which sells ads but also earns revenue directly from hosting, is probably safer than hosting on YouTube, which makes money only from ads. Importantly, backing up your website and moving it to a different hosting provider is immeasurably easier than moving your data to a new Facebook or YouTube account.
  • Authentication: if you want to share data that is only for your friends’ eyes, your website has to know who is looking at it. Open standards like OpenID https://en.wikipedia.org/wiki/OpenID provide a decentralized authentication protocol that allows you, as a webmaster, to skip implementing your own authentication and saves your friends the trouble of logging in to your website every time. The list of companies certified as OpenID providers includes big names like Microsoft, IBM, Oracle, Samsung and many more.
  • Social network protocol: in order to have truly social interactions on the web, you want some way of notifying others that you have new content, and a search that allows others to find you. You can’t send your friends an email every time you publish a new post on your website. There are a few standards in this area; ActivityPub, developed by the W3C, is an example. It basically allows one to build a timeline out of accounts hosted in a decentralized fashion, given a social graph and authentication. There are even platforms like PeerTube, built on top of ActivityPub and WebTorrent, that provide an equivalent of the YouTube service, allowing users to find and view videos from different hosts that implement the protocol. This means that you can host videos on your website, and somebody using PeerTube will be able to find and view them without even knowing where they are hosted! (A sketch of how an ActivityPub account is discovered follows this list.)
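
To give a feel for how this decentralized discovery works in practice, here is a minimal sketch in Python, assuming a Mastodon-compatible ActivityPub server and the third-party requests library; the handle used is hypothetical. It resolves a fediverse handle to its actor document via WebFinger, which is enough to find the account’s public outbox no matter where it is hosted.

```python
# A minimal sketch of ActivityPub-style account discovery (assumes a Mastodon-compatible
# server and the `requests` library; the handle below is hypothetical).
import requests

def fetch_actor(handle: str) -> dict:
    """Resolve a fediverse handle like 'alice@example.social' to its actor document."""
    user, host = handle.lstrip("@").split("@")

    # Step 1: WebFinger maps the human-readable handle to the actor's URL.
    webfinger = requests.get(
        f"https://{host}/.well-known/webfinger",
        params={"resource": f"acct:{user}@{host}"},
        timeout=10,
    ).json()
    actor_url = next(
        link["href"]
        for link in webfinger["links"]
        if link.get("rel") == "self" and "activity+json" in link.get("type", "")
    )

    # Step 2: fetch the actor document, which lists the inbox, outbox, followers, etc.
    return requests.get(
        actor_url,
        headers={"Accept": "application/activity+json"},
        timeout=10,
    ).json()

if __name__ == "__main__":
    actor = fetch_actor("alice@example.social")  # hypothetical handle
    print(actor.get("outbox"))  # the outbox URL holds the account's public posts
```
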

Also, here is a list of various decentralized social network projects, features and protocols https://en.wikipedia.org/wiki/Comparison_of_software_and_protocols_for_distributed_social_networking

Some final thoughts

I am far from the idea that ad-driven social networks are the source of all evil and should be destroyed. They have obviously brought a lot of good to the world. Successful business models allowed them to hire great engineers and build solutions that carried our communication through the times of pandemic. However, this model doesn’t work for an increasing number of use cases, because of addictiveness and censorship. Decentralized social media may solve some of these issues.

It is not clear how decentralized solutions can become compelling enough for consumers. Facebook, Twitter, Google and the others have occupied most of the market, and their moats seem very hard to penetrate, if at all possible. Here is a good discussion on HN about why PeerTube won’t be mainstream anytime soon: https://news.ycombinator.com/item?id=21513310. I also realize that going decentralized now is somewhat like using Linux in the 90s: it requires technical knowledge and skills, and it takes time.

However, I believe we have to start relying less on the big social networks. This is why I am publishing this text not on my social media accounts and not on Substack, but on my own website, hosted by a service that I pay for. I will still be present on social media, but I will try to host more and more of my data on my own website. It will take time and money, and it won’t even put a dent in the ad-driven companies. But the journey of a thousand miles begins with a single step. I am going to take it, and I urge you to do the same.