This legendary page from an internal IBM training in 1979 could not be more appropriate for our new age of AI.

[Image: training slide reading “A COMPUTER CAN NEVER BE HELD ACCOUNTABLE. THEREFORE A COMPUTER MUST NEVER MAKE A MANAGEMENT DECISION.”]
Some people think that a hypothetical AGI being intelligent is enough to put it in charge of decisions; it is not. It needs to be a moral agent, and you can’t really have a moral agent that is not punishable or rewardable.
Most HN comments ITT are trash, so addressing the one with some merit:
The implication here is that unlike a computer, a person or a corporation can be held accountable. I’m not sure that’s true.
Emphasis in the original. True, a corporation cannot be held accountable for its actions; thus it should not be in charge of management decisions.
Holding someone accountable doesn’t undo their mistakes, once a decision is made, there is often nothing you can do about it. Humans make bad decisions too, whether unknowingly or intentionally. It’s clear that accountability isn’t some magic catch-all.
I find the idea that punishment and reward are prerequisites of morality rather pessimistic, do you believe people are entirely incompetent of acting morally in the absence of external motivation?
Whichever way, AI does essentially function on the principle of punishment and reward, you could even say it has been pre-punished and rewarded in millions of iterations during its training.
AI simply has clear advantages in decision making. Without self-interest it can make truly selfless decisions, it is far less prone to biases and takes much more information into account in its decisions.
Try throwing some moral, even political questions at LLMs, you will find they do surprisingly well, and these are models that aren’t optimized for decision making.
Holding someone accountable doesn’t undo their mistakes, once a decision is made, there is often nothing you can do about it. […] It’s clear that accountability isn’t some magic catch-all.
It is not some magic catch-all on a pragmatic level, but it is nevertheless a necessary condition to avoid an otherwise rational agent fucking everything up.
Humans make bad decisions too, whether unknowingly or intentionally.
This “but what about humans…” does not contradict what I said. (Some humans cannot be held accountable for their decisions either.)
I find the idea that punishment and reward are prerequisites of morality rather pessimistic
I don’t care if this is “pessimistic” or “optimistic” or skibidi. I’m more concerned about it being contextually accurate or inaccurate, true or false, consistent or inconsistent, etc.
And at the end of the day, all morality boils down to:

- moral right = what gives me beneficial direct or indirect consequences, e.g. I pet the dog → people like me → people care about me [reward].
- moral wrong = what gives me harmful direct or indirect consequences, e.g. I kick the dog → people hate me → they give me a hard time [punishment]; or I kick the dog → the dog bites me.
If you have other grounds for morality feel free to share them.
do you believe people are entirely incompetent of acting morally in the absence of external motivation?
It is not a matter of incompetence. Incompetence, in this case, would be failure to generate moral rules based on the consequences of one’s actions.
Whichever way, AI does essentially function on the principle of punishment and reward, you could even say it has been pre-punished and rewarded in millions of iterations during its training.
This is a sleight of hand: the word “training” is being sloppily used to refer both to “tweaking an automated system” and to “human skill development”, and of the two, only human skill development typically involves rewards and punishments.
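To make the “tweaking an automated system” sense concrete, here is a minimal, hypothetical sketch of what “training” means for a machine-learning model: nudging numeric parameters until a loss value shrinks. The toy model and data below are made up for illustration; real LLM training is enormously bigger, but it rests on the same kind of parameter tweaking.

```python
import numpy as np

# Toy "training": fit y = w*x + b to a handful of noisy points by gradient descent.
# The "reward/punishment" here is nothing but a number (the loss) used to nudge
# two floats; nothing is experienced, remembered, or feared by the system.

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(0, 0.1, size=100)  # made-up ground truth: w = 3.0, b = 0.5

w, b = 0.0, 0.0   # parameters start arbitrary
lr = 0.1          # learning rate

for step in range(500):
    error = (w * x + b) - y
    loss = np.mean(error ** 2)        # the "punishment" signal: just a float
    grad_w = 2 * np.mean(error * x)   # direction to tweak w to lower the loss
    grad_b = 2 * np.mean(error)       # direction to tweak b to lower the loss
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}, final loss={loss:.4f}")
```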
AI simply has clear advantages in decision making. Without self-interest it can make truly selfless decisions, it is far less prone to biases and takes much more information into account in its decisions.
Emphasis mine. You’re being self-contradictory: first claiming AI can be rewarded/punished, then claiming that it lacks self-interest.
Try throwing some moral, even political questions at LLMs, you will find they do surprisingly well, and these are models that aren’t optimized for decision making.
I think that it’s rather clear that the ability to output info regarding a topic (in this case, morality) does not automatically grant an entity the attribute that said topic refers to.
While I’m focusing mostly on a hypothetical AGI, what I said is doubly true for those glorified text generators (LLMs). An AGI is amoral; LLMs are amoral and non-intelligent.
There’s evidence that even good quality LLMs are not able to handle moral matters in a remotely satisfactory way. Like this. (The video is called “gaslighting ChatGPT with ethical dilemmas”, but the core issue is moral, not ethical.)
Contrary to the blatant assumption in your comment, I already tested this shit with both ChatGPT and Gemini. And it only reinforced my position.
The main issue with this idea of punishment and reward, in the sense that you mean them, is that their results depend entirely on the criteria by which you are punished or rewarded. Say, the law says being gay is illegal and the punishment is execution, does that mean it’s immoral?
Being moral boils down to making certain decisions, the method by which they are achieved is irrelevant if the decisions are “correct”. Most moral philosophies agree that moral decisions can be made by applying rational reasoning to some basic principles (e.g. the categorical imperative). We reason through language, and these models capture and simulate that. The question is not whether AI can make moral decisions, it’s whether it can be better than humans at it, and I believe it can.
I watched the video, honestly I don’t find anything too surprising. ChatGPT acknowledges that there are multiple moral traditions (as it should) and that which decision is right for you depends on which tradition you subscribe to. It avoids making clear choices because it is designed that way for legal reasons. When there exists a consensus in moral philosophy about the morality of a decision, it doesn’t hesitate to express that. The conclusions it comes to aren’t inconsistent, because it always clearly expresses that they pertain to a certain path of moral reasoning. Morality isn’t objective, taking a conclusive stance on an issue based on one moral framework (which humans like to do) isn’t superior to taking an inconclusive one based on many. Really this is one of our greatest weaknesses, not being able to admit we aren’t always entirely sure about things. If ChatGPT was designed to make conclusive moral decisions, it would likely take the majority stance on any issue, which is basically as universally moral as you can get.
The idea that AI could be immoral because it holds the stances of its developers is invalid, because it doesn’t. It is trained on a vast corpus of text, which captures popular views and not the views of the developers.
On morality:

The main issue with this idea of punishment and reward, in the sense that you mean them, is that their results depend entirely on the criteria by which you are punished or rewarded.
Shifting the goalposts from “it’s pessimistic” to “it depends on criteria”.
I was being simplistic to keep it less verbose. I didn’t talk about consistency, for example, even if it also matters.
By “reward/punishment” I mean something closer to behaviourism than to law. I could’ve called them “positive stimulus” and “negative stimulus” instead; it comes to the same thing.
Say, the law says being gay is illegal and the punishment is execution, does that mean it’s immoral?
For the bigots implementing and supporting such idiotic laws? Yes, they consider being gay immoral. That’s the point for them.
For gay people? It throws them into a catch-22, because following such a law is also punishing: they’re forced into celibacy and prevented from expressing their sexuality and sexual identity.
So even considering your example, rooting morality in punishment and reward still works. And you can even derive a few conclusions from it:
- Morality is subjective. There’s no such thing as a universal set of moral values. (NB: the ability to set up those moral values still needs to be there, in order to get a moral agent.)
- What 2+ people consider morally good might come into conflict.
- Trying to reason with bigots and hoping to change their views solely through logical arguments is 100% futile.
- Since people promoting those laws are forcing gay people into a catch-22, they’re treating gay people as existentially immoral.
Back to AI. Without the ability to be rewarded/punished, not even a hypothetical AGI would be a moral agent. At most it would be able to talk about the topic, but not generate a consistent set of moral rules for itself. And no, model feeding, tweaking parameters, etc. are clearly not reward/punishment.
Being moral boils down to making certain decisions, the method by which they are achieved is irrelevant if the decisions are “correct”.
That’s circular reasoning, given that the “correct decisions” will be dictated by morality.
Most moral philosophies agree that moral decisions can be made by applying rational reasoning to some basic principles (e.g. the categorical imperative).
That does not address the request.
I’ll rephrase it: since you disagree that moral values ultimately come from reward and punishment, I asked where you think they come from. For example, plenty of the moral philosophies that you mentioned root their moral values in superstitions, like “God” or similar crap.
Language and reasoning:
We [humans] reason through language
That’s likely false, and also bullshit (i.e. a claim made up with no regard to its truth value).

While language and reasoning do interact with each other, “there is a substantial and growing body of evidence from across experimental fields indicating autonomy between language and reasoning” (paywall-free link).
That already dismantles your argument on its central point. But I’ll still dig further into it.
and these models capture and simulate that.
They don’t even capture language as a whole, let alone a different system like reasoning.
They output decent grammar and vocab, but they handle meaning (semantics) and utterance purpose (pragmatics) notoriously poorly. They show blatant signs of not knowing what they are outputting.
You can test this by yourself: ask any LLM-powered chatbot of your choice about some topic that you know by heart, then look at the incorrect answers and ask yourself why the bot is outputting that wrong piece of info (a “hallucination”).

Example here.
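If you’d rather script that experiment than poke at a chat UI, here is a minimal sketch. It assumes the `openai` Python package and an `OPENAI_API_KEY` environment variable; the model name and the questions are placeholders for a topic you actually know by heart.

```python
# Hypothetical sketch: query a chatbot about a topic you know well and inspect
# the answers yourself. Assumes `pip install openai` and OPENAI_API_KEY set.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholders: swap in questions whose answers you already know by heart.
questions = [
    "Who wrote <an obscure book you know well>?",
    "In what year did <a niche event you know well> happen?",
]

for q in questions:
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": q}],
    )
    print(q)
    print("->", reply.choices[0].message.content)
    # Compare each answer against what you know to be true; confidently stated
    # wrong answers are the "hallucinations" in question.
```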
I watched the video, honestly I don’t find anything too surprising.
Except the fact that it directly contradicts your claim.
The idea that AI could be immoral because it holds the stances of its developers is invalid, because it doesn’t. It is trained on a vast corpus of text, which captures popular views and not the views of the developers.
Emphasis mine.
This is the cherry on the cake, because it shows that you’re wasting my time with a subject that you’re completely clueless about. I’m saying that those systems are amoral, not immoral.
Sorry to be blunt but I’m not wasting my time further with you.
Just like a “corporate veil”, I’m sure this will get spun into a way to avoid responsibility. I mean, it shouldn’t, but that’s precisely why it will.
“Your honor, Alphabet Inc. deeply regrets the misidentification & destruction of a civilian area, but the AI made an honest mistake. It’s unreasonable to expect manual review of EVERY neighborhood we target.”
This is likely true for what we already see. Fuck, people even use dumb word filters to avoid responsibility! (Cough Reddit mods “I didn’t do it, AutoMod did it” cough cough)
That said, this specific problem could be solved by AGI or another truly intelligent system. My concern is more the AGI knowingly bombing a civilian neighbourhood because, lacking morality, it claims that doing so is better for everyone else. That would be way, waaaaay worse than false positives like the one in your example.

Ugh, yes, the machines “know what’s best”.

I was just assuming it would be used for blame management, regardless of whether it was an accident or not.

“See, it wasn’t me! It was ScapegoatGPT!”
This is 100% true and moral.