
Big Data Does Not Equal Scientific Law

2019-10-09 19:20:48  Editor in charge: 郑子龙

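The key lesson we draw from the history of astronomy is that big data cannot interpret itself. Building simplified mathematical models, connecting them to the real physical world, and then refining them is the reliable way to extract the rare gem of "meaning" from the raw ore of data.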

Source | WeChat public account "蔻享学术"

Author | Frank Wilczek (professor at MIT, 2004 Nobel laureate)

Chinese translation | 梁丁当, 胡风

The history of astronomy shows that observations can only explain so much without the interpretive frame of theories and models.

Big data and machine learning are powering new approaches to many scientific questions. But the history of astronomy offers an interesting perspective on how data informs science, and perhaps a cautionary tale.

Early Babylonian astronomers took what today we'd call a pure "big data" or "pattern recognition" approach. They accumulated observations of solar, lunar and planetary motion and eclipses for many centuries and identified various cycles that had repeated many times. Simply by assuming that those cycles would continue, they were able to give good advice for planting, irrigation and harvest times, to cast credible horoscopes and to predict in advance when lunar eclipses would occur.

The ancient Greek astronomers used two distinct methods to understand the same data set. The first was to make geometric models that treated the sun, moon, planets and stars as mathematical abstractions: shiny points carried upon uniformly rotating celestial spheres.

At first, the Greeks' predictions were no better than those of the Babylonians; in fact, they were significantly worse. But they patched things up by postulating additional movements of the spheres, called epicycles. These models, which were perfected by the 2nd-century astronomer Ptolemy, seem ugly in retrospect, but they did package the astronomical data in a relatively compact form, and they gave useful practical results.

The second method used by Greek astronomers was to consider astronomical bodies as real objects with physical properties. Perhaps the high point of this effort was the brilliant determination by Aristarchus, in the 3rd century B.C., of the ratio of the distances from the Earth to the sun and the moon. Assuming that the moon shines by reflected sunlight, and measuring the angle between the sun and the half-moon when both are visible in the sky, he calculated the ratio using simple trigonometry.
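The trigonometry behind Aristarchus's estimate can be sketched in a few lines. At half-moon, the Earth-moon-sun triangle has a right angle at the moon, so the cosine of the measured sun-moon angle equals the moon's distance divided by the sun's distance. The article gives no figures; the 87-degree measurement traditionally attributed to Aristarchus and the approximate modern angle of 89.85 degrees are taken from the historical record, and the function name is purely illustrative.

```python
import math


def sun_to_moon_distance_ratio(elongation_deg: float) -> float:
    """Return (Earth-sun distance) / (Earth-moon distance) at half-moon.

    At half-moon the angle at the moon in the Earth-moon-sun triangle is
    90 degrees, so cos(elongation) = d_moon / d_sun.
    """
    if not 0.0 < elongation_deg < 90.0:
        raise ValueError("elongation must lie strictly between 0 and 90 degrees")
    return 1.0 / math.cos(math.radians(elongation_deg))


# Assumption: 87 degrees is the angle traditionally attributed to Aristarchus.
aristarchus_ratio = sun_to_moon_distance_ratio(87.0)

# Assumption: about 89.85 degrees is an approximate modern value of the same angle.
modern_ratio = sun_to_moon_distance_ratio(89.85)

print(f"Aristarchus: sun is about {aristarchus_ratio:.0f} times farther than the moon")
print(f"Modern angle: sun is about {modern_ratio:.0f} times farther than the moon")
```

The method is sound, but because the angle is so close to 90 degrees, a measurement error of under three degrees changes the answer from about 19 to several hundred. The physical model supplied the right structure even where the data were too coarse to fix the number.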

Yet a proper synthesis of the mathematical and physical approaches to astronomy wasn't achieved for many centuries. That's because the available "big data" (the easily observable patterns of the sun, moon and stars) are cryptic, superficial signs of the deep structure beneath.

Copernicus, in the 16th century, discovered that he could get more beautiful versions of Ptolemy-style models if he put the sun, rather than the Earth, at the center of the celestial spheres. Ptolemy's work typically gets rough treatment in the history of science, but it was absolutely essential to Copernicus's breakthrough in offering a physical explanation of "coincidences" among the model's parameters.

Not long after, Galileo's homemade telescope revealed the phases of Venus, Jupiter's attendant satellites (a "solar system" in miniature) and the topography of the moon. The night sky came to life as a showcase of tangible, physical bodies rather than an exercise in idealized points and imaginary spheres. When Isaac Newton distilled the universal laws of motion and gravity, he reunited the "big data" approach of the Babylonians and Ptolemy with the physics of Aristarchus and Galileo, launching truly modern science.

The big lesson is that big data doesn't interpret itself. Making mathematical models, trying to keep them simple, connecting to the fullness of reality and aspiring to perfection: these are proven ways to refine the raw ore of data into precious jewels of meaning.

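About the author

Frank Wilczek is a professor of physics at the Massachusetts Institute of Technology and one of the founders of quantum chromodynamics. He received the 2004 Nobel Prize in Physics for his work on the theory of quarks (the strong interaction).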

