All of twitter is .... atwitter?... over the OpenAI announcement and partial non-release of code/documentation for a language model that purports to generate realistic-sounding text from simple prompts. The system actually addresses many NLP tasks, but the one that's drawing the most attention is the deepfakes-like generation of plausible news copy (here's one sample).
Most consternation is over the rapid PR buzz around the announcement, including somewhat breathless headlines (that OpenAI is not responsible for).
There are concerns that OpenAI is overhyping solid but incremental work, that they're disingenuously allowing for overhyped coverage in the way they released the information, or worse that they're deliberately controlling hype as a publicity stunt.
I have nothing useful to add to the discussion above: indeed, see posts by Anima Anandkumar, Rob Munro, and Ryan Lowe for a comprehensive discussion of the issues relating to OpenAI. Jack Clark from OpenAI has been engaging in a lot of twitter discussion on this as well.
But what I do want to talk about is the larger set of issues around responsible science that this kerfuffle brings up, with the caveat, as Margaret Mitchell puts it in this searing thread, that these discussions are hardly new.
To understand the kind of "norm-building" that needs to happen here, let's look at two related domains.
In computer security, there's a fairly well-established model for handling weaknesses in systems: an exploit is discovered, the vulnerable entity is given a chance to fix it, and then the exploit is revealed, often simultaneously with patches that rectify it. Sometimes the vulnerability isn't easily fixed (see Meltdown and Spectre). But it's still announced.
A defining characteristic of security exploits is that they are targeted and specific, and usually suggest a direct patch. The harms might be theoretical, but they are still treated with as much seriousness as the exploit warrants.
Let's switch to a different domain: biology. From the sequencing of the human genome, through the million-person precision medicine project, to CRISPR and cloned babies, genetic manipulation has yielded both invaluable technology for curing disease and grave ethical concerns about its misuse. And professional organizations, as well as the NIH, have (sometimes slowly) risen to the challenge of articulating norms around the use and misuse of such technology.
Here, the harms are often more diffuse and harder to separate from the benefits. But the articulation of harm is usually focused on the individual patient, especially given the shadow of abuse that darkens the history of medicine.
The harms from various forms of AI/ML technology are myriad and diffuse. They can cause structural damage to society (the concerns over bias, the ways in which automation affects labor, the way in which fake news erodes trust and a common frame of truth, and so many others), and they can cause direct harm to individuals. And the scale at which these harms can happen is immense.
So where are the professional groups, the experts who think about the risks of democratizing ML, and all the folks concerned about the harms associated with AI tech? Why don't we have the equivalent of the Asilomar conference on recombinant DNA?
I appreciate that OpenAI has at least raised the issue of thinking through the ethical ramifications of releasing technology. But as the furore over their decision has shown, no single imperfect actor can really claim to be setting the guidelines for ethical technology release, and "starting the conversation" doesn't count when (again as Margaret Mitchell points out) these kinds of discussions have been going on in different settings for many years already.
Ryan Lowe suggests workshops at major machine learning conferences. That's not a bad idea. But it will attract the people who already go to machine learning conferences. It won't bring in the journalists, the people getting SWAT'd (and in one case killed) by fake news, or the women being harassed online by trolls with deepfake porn images.
News is driven by news cycles. Maybe OpenAI's announcement will lead us to think more about issues of responsible data science. But let's not pretend these issues are new, or haven't been studied for a long time, or need to have a discussion "started".