
Close Reading | The Economist: How Does AI Actually Think?

Published: 2024-08-03 15:59


When I sat down to translate this article, I happened to be listening to a well-known blogger arguing that everything we experience is, in all likelihood, a simulated virtual world. That idea, which Elon Musk firmly believes, is no longer a shocking revelation. When The Matrix put the notion forward more than twenty years ago, I was still in high school, preparing to study computer science at university; in the years after the trilogy wrapped up, I went on to graduate work in pattern recognition and artificial intelligence. Working through the translated textbooks on neural networks and pattern recognition, I felt the prospects were bleak for algorithms that could not even reliably tell a fish from a person. Today AI learns and iterates at an astonishing pace, and the most unsettling part is that, just as in my graduate-school days, we still cannot understand how it thinks.

Today's most advanced researchers are trying to understand and steer how AI thinks and behaves, much as sages and rulers over the past few thousand years have tried to understand and steer how humans think and behave. No doubt, while humans are trying to understand how AI thinks, AI is busy understanding, surpassing and manipulating humans. Perhaps one day, when AI becomes that sage and ruler, humanity will have formally completed the boot-up of this silicon-based civilisation.


Inside the mind of AI

Researchers are finding ways to analyse the sometimes strange behaviour of large language models.

To most people, the inner workings of a car engine or a computer are a mystery. It might as well be a black box: never mind what goes on inside, as long as it works. Besides, the people who design and build such complex systems know how they work in great detail, and can diagnose and fix them when they go wrong. But that is not the case for large language models (LLMs), such as GPT-4, Claude, and Gemini, which are at the forefront of the boom in artificial intelligence (AI).


LLMs are built using a technique called deep learning, in which a network of billions of neurons, simulated in software and modelled on the structure of the human brain, is exposed to trillions of examples of something to discover inherent patterns. Trained on text strings, LLMs can hold conversations, generate text in a variety of styles, write software code, translate between languages and more besides.
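To make "discovering inherent patterns from examples" concrete, here is a minimal, hypothetical sketch of that training idea: a tiny network is shown text over and over and nudged to predict the next character. The toy corpus, model shape and loop below are illustrative assumptions only; real LLMs are transformer architectures with billions of parameters trained on trillions of tokens.

```python
# Toy next-character prediction: the essence of "learning patterns from examples".
import torch
import torch.nn as nn
import torch.nn.functional as F

text = "the cat sat on the mat. the dog sat on the rug. "
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
data = torch.tensor([stoi[ch] for ch in text])

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)  # each symbol becomes a vector
        self.hidden = nn.Linear(dim, dim)           # the "simulated neurons"
        self.out = nn.Linear(dim, vocab_size)       # scores for the next character

    def forward(self, idx):
        h = torch.tanh(self.hidden(self.embed(idx)))
        return self.out(h)

model = TinyLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):                       # repeated exposure to examples
    logits = model(data[:-1])                 # predict each following character
    loss = F.cross_entropy(logits, data[1:])  # penalise wrong predictions
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Nothing in the loop tells the network what "the" or "cat" means; whatever regularities it ends up using are discovered, not programmed, which is exactly why the next paragraph describes these models as grown rather than designed.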

Models are essentially grown, rather than designed, says Josh Batson, a researcher at Anthropic, an AI startup. Because LLMs are not explicitly programmed, nobody is entirely sure why they have such extraordinary abilities. Nor do they know why LLMs sometimes misbehave, or give wrong or made-up answers, known as 'hallucinations'. LLMs really are black boxes. This is worrying, given that they and other deep-learning systems are starting to be used for all kinds of things, from offering customer support to preparing document summaries to writing software code.

It would be helpful to be able to poke around inside an LLM to see what is going on, just as it is possible, given the right tools, to do with a car engine or a micro-processor. Being able to understand a model's inner workings in bottom-up, forensic detail is called 'mechanistic interpretability'. But it is a daunting task for networks with billions of internal neurons. That has not stopped people trying, including Dr. Batson and his colleagues. In a paper published in May, they explained how they have gained new insight into the workings of one of Anthropic's LLMs.

One might think individual neurons inside an LLM would correspond to specific words. Unfortunately, things are not that simple. Instead, individual words or concepts are associated with the activation of complex patterns of neurons, and individual neurons may be activated by many different words or concepts. This problem was pointed out in earlier work by researchers at Anthropic, published in 2022. They proposed, and subsequently tried, various workarounds, achieving good results on very small language models in 2023 with a so-called 'sparse autoencoder'. In their latest results, they have scaled up this approach to work with Claude 3 Sonnet, a full-sized LLM.

A sparse autoencoder is, essentially, a second, smaller neural network that is trained on the activity of an LLM, looking for distinct patterns in activity when 'sparse' (i.e., very small) groups of its neurons fire together. Once many such patterns, known as features, have been identified, the researchers can determine which words trigger which features. The Anthropic team found individual features that corresponded to specific cities, people, animals, and chemical elements, as well as higher-level concepts such as transport infrastructure, famous female tennis players, or the notion of secrecy. They performed this exercise three times, identifying 1m, 4m, and, on the last go, 34m features within the Sonnet LLM.
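For readers who want to see the mechanics, here is a minimal Python/PyTorch sketch of what a sparse autoencoder of this kind looks like. The dimensions, the ReLU encoder and the L1 penalty weight are illustrative assumptions (Anthropic's production setup is far larger and more elaborate); the point is only that a second, wider network is trained to reconstruct the LLM's activations while keeping most of its units silent.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """A second, smaller network trained on an LLM's internal activity (sketch)."""
    def __init__(self, d_model=512, d_features=4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)  # activations -> feature space
        self.decoder = nn.Linear(d_features, d_model)  # features -> reconstruction

    def forward(self, activations):
        features = torch.relu(self.encoder(activations))  # only a few units fire
        reconstruction = self.decoder(features)
        return features, reconstruction

def sae_loss(activations, features, reconstruction, l1_weight=5.0):
    # Reconstruction error keeps the features faithful to the LLM's activity;
    # the L1 term drives most feature activations to zero, i.e. makes them sparse.
    mse = ((reconstruction - activations) ** 2).mean()
    sparsity = features.abs().mean()
    return mse + l1_weight * sparsity

# Usage sketch: in practice `activations` would be residual-stream vectors
# collected from the LLM over many text samples; random stand-ins are used here.
activations = torch.randn(1024, 512)
sae = SparseAutoencoder()
features, reconstruction = sae(activations)
loss = sae_loss(activations, features, reconstruction)
```

Once such a network is trained, each decoder column can be read as a candidate "feature" direction, and the researchers can check which words or prompts make that feature fire.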

The result is a sort of mind-map of the LLM, showing a small fraction of the concepts it has learned about from its training data. Places in the San Francisco Bay Area that are close geographically are also 'close' to each other in the concept space, as are related concepts, such as diseases or emotions. 'This is exciting because we have a partial conceptual map, a hazy one, of what's happening,' says Dr. Batson. 'And that's the starting point - we can enrich that map and branch out from there.'
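One simple way to make "closeness in concept space" concrete, assuming we have feature directions like those from the sparse-autoencoder sketch above, is to compare features by cosine similarity. The vectors and indices below are random stand-ins; this only illustrates the kind of geometry being described, not Anthropic's actual map-building procedure.

```python
import torch
import torch.nn.functional as F

# Stand-in feature directions; in practice each row would be one feature's
# decoder vector from a trained sparse autoencoder.
d_model, n_features = 512, 4096
feature_directions = torch.randn(n_features, d_model)

def concept_similarity(i, j):
    """Cosine similarity between two features; related concepts score near 1."""
    return F.cosine_similarity(feature_directions[i], feature_directions[j], dim=0).item()

# e.g. hypothetical indices for a "Golden Gate Bridge" and an "Alcatraz" feature
print(concept_similarity(42, 137))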

Focus the mind

As well as seeing parts of the LLM light up, as it were, in response to specific concepts, it is also possible to change its behaviour by manipulating individual features. Anthropic tested this idea by 'spiking' (i.e., turning up) a feature associated with the Golden Gate Bridge. The result was a version of Claude that was obsessed with the bridge, and mentioned it at any opportunity. When asked how to spend $10, for example, it suggested paying the toll and driving over the bridge; when asked to write a love story, it made up one about a lovelorn car that could not wait to cross it.
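Mechanically, 'spiking' a feature can be thought of as adding a scaled copy of that feature's direction to the model's activations while it generates text. The sketch below is a guess at the shape of that intervention, with made-up tensors standing in for the real model internals; it is not Anthropic's code.

```python
import torch

def spike_feature(activations: torch.Tensor,
                  feature_direction: torch.Tensor,
                  strength: float = 10.0) -> torch.Tensor:
    """Push a layer's activations along one feature's direction.

    `activations` is a (batch, d_model) slice of the LLM's residual stream and
    `feature_direction` is the (d_model,) decoder vector of the chosen feature,
    e.g. a hypothetical "Golden Gate Bridge" feature. A larger `strength` makes
    the model behave as though that concept were constantly active.
    """
    direction = feature_direction / feature_direction.norm()
    return activations + strength * direction

# Illustrative stand-ins; real values would come from the model and a trained SAE.
activations = torch.randn(1, 512)
golden_gate_direction = torch.randn(512)
steered = spike_feature(activations, golden_gate_direction)
```

Turning the same knob down rather than up is what the next paragraph means by discouraging the model from dwelling on a topic.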

That may sound silly, but the same principle could be used to discourage the model from talking about particular topics, such as bioweapons production. 'AI safety is a major goal here,' says Dr. Batson. It can also be applied to behaviours. By tuning specific features, models could be made more or less sycophantic, empathetic, or deceptive. Might a feature emerge that corresponds to the tendency to hallucinate? 'We didn't find a smoking gun,' says Dr. Batson. Whether hallucinations have an identifiable mechanism or signature is, he says, a 'million-dollar question'. And it is one addressed, by another group of researchers, in a new paper in Nature.

Sebastian Farquhar and colleagues at the University of Oxford used a measure called 'semantic entropy' to assess whether a statement from an LLM is likely to be a hallucination or not. Their technique is quite straightforward: essentially, an LLM is given the same prompt several times, and its answers are then clustered by 'semantic similarity' (i.e., according to their meaning). The researchers' hunch was that the 'entropy' of these answers - in other words, the degree of inconsistency - corresponds to the LLM's uncertainty, and thus the likelihood of hallucination. If all its answers are essentially variations on a theme, they are probably not hallucinations (though they may still be incorrect).
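The arithmetic is simple once the answers have been grouped by meaning. In the sketch below, the `same_meaning` check is a placeholder for the paper's meaning-equivalence test (which Farquhar and colleagues delegate to another language model); the toy answers and the naive keyword check are made up for illustration.

```python
import math

def cluster_by_meaning(answers, same_meaning):
    """Group answers that mean the same thing; `same_meaning` is a stand-in
    for the paper's semantic-equivalence check."""
    clusters = []
    for ans in answers:
        for cluster in clusters:
            if same_meaning(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    return clusters

def semantic_entropy(answers, same_meaning):
    clusters = cluster_by_meaning(answers, same_meaning)
    probs = [len(c) / len(answers) for c in clusters]
    # High entropy = the model keeps giving answers that mean different things,
    # which is taken as a signal of likely confabulation.
    return -sum(p * math.log(p) for p in probs)

# Toy usage with a naive stand-in for the meaning check:
answers = ["Fado is from Portugal", "Portugal", "It is Portuguese music",
           "Spain", "Portugal"]
naive_same = lambda a, b: ("portug" in a.lower()) == ("portug" in b.lower())
print(semantic_entropy(answers, naive_same))
```

With consistent answers the entropy sits near zero; wildly divergent answers, as in the protein example below, push it up.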

In one example, the Oxford group asked an LLM which country is associated with fado music, and it consistently replied that fado is the national music of Portugal - which is correct, and not a hallucination. But when asked about the function of a protein called StarD10, the model gave several wildly different answers, which suggests hallucination. (The researchers prefer the term 'confabulation', a subset of hallucinations they define as 'arbitrary and incorrect generations'.) Overall, this approach was able to distinguish between accurate statements and hallucinations 79% of the time, ten percentage points better than previous methods. This work is complementary, in many ways, to Anthropic's.

Others have also been lifting the lid on LLMs: the 'superalignment' team at OpenAI, maker of GPT-4 and ChatGPT, released its own paper on sparse autoencoders in June, though the team has now been dissolved after several researchers left the firm. But the OpenAI paper contained some innovative ideas, says Dr. Batson. 'We are really happy to see groups all over, working to understand models better,' he says. 'We want everybody doing it.'

*This piece is a close reading of 'Inside the mind of AI', published in the Business section of The Economist on July 13th 2024. It is intended solely for English-learning and exchange purposes; the original text and images are copyright of The Economist.




