My Concerns About AI
TLDR: I want counterpoints to the arguments below, for which I have either found no solid rebuttals or found rebuttals that seem to miss part of the point. I plan to update this post with any rebuttals I receive.
Lack of ability to predict future capabilities of LLMs (before 2100)
High probability of immediate extreme misalignment (before 2100)
Argument from repeating patterns, intelligence
Argument from repeating patterns, previous fooms
Background
I am not particularly well read on the subjects of AI or cognitive science.
I keep as up to date on chips as I can (mostly consumer electronics and networking, but recently AI accelerators have popped into the mix).
About a quarter of my working hours over the past 6 months have been spent dealing with security flaws in our networking gear, which starts me off with a pessimistic view of anything related to computer security.
I only started listening to AI safety podcasts about 2 months ago (July 2023), and have really only dealt with arguments from Robin Hanson, Rob Miles, Yudkowsky, Tegmark, Connor Leahy, Sam Altman, and LeCun.
Next, the definitions I will use on this page, since semantics are complicated:
Human: Me, and those that I would recognize as human living before the year 2100.
(Definitions excluding me could include sharing over 95% genomic similarity with Genghis Khan, but that is not what I mean on this page)
Grok: to have an unbroken causal link that can be explained by human(s) all the way back to either basic mathematics or our current frontier physics (e.g. a transistor's ability to perform logical operations is grokked). This could be something simple enough for a single human to understand within their mind, or complex enough that following the logic would require information from many people and/or writing down the chains of logic.
Emergent capability: a capability of a system that was not predicted before the system was built (building satellites is an emergent capability of humans generally, but not of NASA)
Crimes against Humanity: actions that reduce the number of humans below 1, or that would receive any penalty as defined by the Rome Statute of the ICC (1998) with all waivers of criminal responsibility struck (including articles 31 and 32, and the exceptions laid out in article 33), if tried under that statute as a human would have been in 2002
VIs: systems able to transform inputs to outputs, given inputs within a distribution
AIs: VIs with Emergent capabilities
AGIs: AIs able to outperform humans at enough tasks to have the capability to eradicate humanity
Misaligned AGI: AGIs that perform Crimes against Humanity or leave no humans alive (before the year 2100)
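To make the nesting of these three definitions explicit, here is a minimal typing sketch (my own framing, not something from the arguments themselves): every AGI is an AI, and every AI is a VI.

```python
from typing import Protocol

class VI(Protocol):
    def transform(self, x: object) -> object:
        """Map an in-distribution input to an output."""
        ...

class AI(VI, Protocol):
    # Capabilities that were not predicted before the system was built.
    emergent_capabilities: list[str]

class AGI(AI, Protocol):
    def outperforms_humans(self, task: str) -> bool:
        """True for enough tasks to give the capability to eradicate humanity."""
        ...
```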
Types of systems in terms of safety:
Type 1 cannot be used safely, no matter how careful you are (the powder keg from Schrödinger's experiment)
Type 2 can be used safely if you are extremely careful (Nitroglycerine)
Type 3 is unsafe only if used in an unsafe manner (swords)
Type 4 is always safe without extreme intervention (air)
Premises that I assume are correct; dismantling these would obviously dismantle the arguments
Humans can build systems that they don't Grok
Humans can build systems that have unpredicted capabilities
The capability to play chess of systems like Deep Blue and Stockfish is grokked by humans
Systems like LLMs and the human brain have capabilities that are not grokked yet.
Importantly: there is a high likelihood (1:1 odds or worse) that some or all of these will remain not grokked until at least the year 2080
Premises that are common to the arguments, with some evidence to support them; dismantling these makes foom less likely!
Humans building AGIs is possible (before 2100)
Misalignment: "Concrete Problems in AI Safety", https://arxiv.org/abs/1606.06565
The degree of misalignment is proportional to capability: the more a system can do, the crazier its misalignment would appear.
High probability of immediate extreme misalignment (before 2100)
AGI alignment is not solved or automatic
AGI development is faster than alignment development
Therefore, we will have AGI before we can align it.
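A toy numerical sketch of this syllogism; the growth rates and thresholds are made-up illustration values, not estimates. The point is only structural: if both fields start at the same level but capability compounds faster, the capability threshold is crossed first.

```python
import math

def years_to_threshold(start: float, annual_growth: float, threshold: float) -> float:
    """Years until start * (1 + annual_growth)**t reaches threshold."""
    return math.log(threshold / start) / math.log(1.0 + annual_growth)

# Illustration values only: capability compounds at 50%/yr, alignment at 20%/yr.
agi_eta = years_to_threshold(start=1.0, annual_growth=0.50, threshold=1000.0)
align_eta = years_to_threshold(start=1.0, annual_growth=0.20, threshold=1000.0)
print(f"capability threshold crossed in ~{agi_eta:.0f} years")   # ~17 years
print(f"alignment threshold crossed in ~{align_eta:.0f} years")  # ~38 years
```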
Argument from repeating patterns, intelligence
The last common ancestor of chimps and humans lived roughly 5 million years ago; the nuke was developed less than 100 years ago.
Humans have killed roughly as many humans with nukes as there are chimps in existence (both numbers are in the low hundreds of thousands).
Chimps don't have any defence against nukes, and we developed nukes.
AGI is to humans what humans are to chimps, and AGI misalignment could do to us what nukes could do to chimps if we didn't want chimps around.
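A back-of-envelope check on the numbers in this analogy, using ranges I believe are roughly right (they are assumptions, and estimates vary widely): combined Hiroshima and Nagasaki deaths of ~110k-210k, and a wild chimpanzee population of ~170k-300k. The point is only that the two quantities are the same order of magnitude.

```python
# Assumed ranges (estimates vary): combined Hiroshima + Nagasaki deaths,
# and the wild chimpanzee population.
nuke_deaths = (110_000, 210_000)
chimp_pop = (170_000, 300_000)

# The ranges overlap, so "humans have nuked about as many humans as there
# are chimps" holds at the order-of-magnitude level.
overlap = max(nuke_deaths[0], chimp_pop[0]) <= min(nuke_deaths[1], chimp_pop[1])
print(f"ranges overlap: {overlap}")  # ranges overlap: True
```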
Argument from repeating patterns, previous fooms
Humans
1) Evolution via natural selection took half a billion-ish years to go from nerves to Homo sapiens
2) Human cooperation took 15k-ish years to go from the first provable cooperation to now
3) The global economy is still undergoing exponential growth; the takeoff from industrialization, starting around 1800, is easy to see
Computers
1) From the first automated adding machine (Pascal's calculator, ~1642) to now is under 400 years
2) Integrated circuits were invented 60-ish years ago and have since gone from 64 transistors to 2.6-5.3 trillion transistors, depending on your definition (a rough doubling-time check follows this list)
2.6 Trillion : Cerebras WSE-2
5.3 Trillion : Micron 2TB NAND chip
3) The internet went from ~0.04% of humans in 1990 to 50% at the end of 2018
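As a sanity check on item 2, here is the implied doubling time, using the 64-transistor starting point and the WSE-2 count from the list above (both rough figures):

```python
import math

start, end, years = 64, 2.6e12, 60  # rough figures from the list above
doublings = math.log2(end / start)
print(f"{doublings:.1f} doublings in {years} years "
      f"-> one every ~{years / doublings:.1f} years")
# ~35.2 doublings -> one every ~1.7 years, consistent with Moore's law
```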
AGI
1) Artificial neural networks were first demonstrated around 1958 (Rosenblatt's perceptron); there are earlier examples (e.g. McCulloch-Pitts neurons, 1943)
2) "Attention Is All You Need", 2017
3) ?
Here is the analogy
Humans : nerves :: Computers : math machines :: AGI : neural networks
Humans : cooperation :: Computers : ICs :: AGI : transformers
Humans : global economy :: Computers : Internet :: AGI : foom
If foom is misaligned, it will affect 50% of the population before 2050
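Here is the rough arithmetic behind that date, as I read the analogy (my extrapolation, not a forecast): the internet took 28 years to go from ~0.04% of humans (1990) to 50% (end of 2018); if foom diffuses on a similar timescale starting from the 2017 transformer paper, it reaches 50% of the population around 2045, i.e. before 2050.

```python
# Internet: ~0.04% of humans in 1990 -> 50% by the end of 2018 (from the list above).
internet_span = 2018 - 1990   # 28 years from niche to half of humanity
transformer_year = 2017       # "Attention Is All You Need"
print(f"analogous 50% date for foom: {transformer_year + internet_span}")  # 2045
```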
Finally, humans are to computers as AGI is to something even bigger?