Friday, September 25, 2015

This AI algorithm can match the average American on real SAT questions

Yeah, yeah, of course a computer won at a math competition. That’s not the point. This story, which concerns a rather amazing program called GeoS from the Allen Institute for Artificial Intelligence (AI2), is about the ability of AI to usefully engage with the world. To a computer, with a brain literally structured for these sorts of operations, the math SAT is not a test of calculation but of reading comprehension. That’s why this story is so interesting: GeoS isn’t as good as the average American at geometry; it’s as good as the average American at the SAT itself.

Specifically, this AI program was able to score 49% accuracy on official SAT geometry questions, and 61% on practice questions. The 49% figure is basically identical to the average for real human test-takers. The program was not given digitized or specially labeled versions of the test, but looked at the exact same question layout as real students. It read the writing. It interpreted the diagrams. It figured out what the question was asking, and then it solved the problem. It only got the answer about half the time, which makes it roughly as fallible as a human being.
Of course, GeoS makes errors for different reasons than high-schoolers. A human being might correctly interpret the question, then apply the wrong formula, or muck up the calculation. GeoS, being a computer, will virtually always get the correct answer so long as it truly understands the question. It might not be able to read a word correctly, or the grammar of a question might be too alien for the computer to parse. Regardless, what we’re really measuring here is the computer’s ability to understand human communication in a form that’s deliberately (pardon the pun) obtuse.
To do this, the researchers had to smash together a whole array of different software technologies. GeoS uses optical character recognition (OCR) algorithms to read the text, and custom language processing to try to understand what it reads. Geometry questions are structured to be difficult to parse, hiding important information as inferences and implications.
The other side of the coin is that though geometry questions are dense and hard to tease apart, they’re also extremely uniform in structure and subject matter. The AI’s programmers can plan for the strict design principles that go into writing the questions. The same programming couldn’t be applied directly to calculus problems, for instance, because they use somewhat different language and mathematical symbols to describe the problem. But a good GeometryBot would also be relatively easy to adapt to those few distinguishing rules. Each successive new area of competence would make the next one easier to acquire.
One intriguing implication of this research is that someday, we might have algorithms quality-checking SAT questions. We could have different AI programs intended to achieve different levels of success on average questions, perhaps even for different reasons. Run proposed new questions through them, and their relative performance could not only weed out bad questions but also point to the source of the problem. If BadAtReadingAI and BadAtLogicAI did as expected on a question but BadAtDiagramsAI did terribly, maybe the drawing simply needs to be a little clearer.
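
As a rough illustration of that quality-checking idea, here is a minimal Python sketch. Everything in it is hypothetical: the `SolverProfile` type, the `flag_weak_spots` function, the `tolerance` threshold and all of the accuracy numbers are made up for the example, and none of it comes from GeoS or AI2. It simply compares how each imagined solver did on one proposed question against how it usually does, and reports the ones that fell well short.

```python
from dataclasses import dataclass


@dataclass
class SolverProfile:
    """A hypothetical solver with a known long-run accuracy on average questions."""
    name: str
    expected_accuracy: float  # fraction of average questions answered correctly


def flag_weak_spots(profiles, observed_accuracy, tolerance=0.15):
    """Return the names of solvers that fall short of their expected accuracy
    by more than `tolerance` on one proposed question."""
    flagged = []
    for profile in profiles:
        shortfall = profile.expected_accuracy - observed_accuracy[profile.name]
        if shortfall > tolerance:
            flagged.append(profile.name)
    return flagged


# Illustrative solvers, named after the ones imagined in the article.
profiles = [
    SolverProfile("BadAtReadingAI", 0.40),
    SolverProfile("BadAtLogicAI", 0.45),
    SolverProfile("BadAtDiagramsAI", 0.50),
]

# Made-up accuracies from repeated attempts at a single proposed question.
observed = {
    "BadAtReadingAI": 0.38,
    "BadAtLogicAI": 0.47,
    "BadAtDiagramsAI": 0.10,
}

print(flag_weak_spots(profiles, observed))  # ['BadAtDiagramsAI']
```

In this toy run only the diagram-focused solver is flagged, which would hint that the drawing, rather than the wording or the logic, is what needs another look.
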
This isn’t a sign of the coming AI-pocalypse, or at least not a particularly immediate sign; as dense as geometry questions might be, they’re homogeneous and nowhere near as complex as something like conversational speech. But this study shows how the individual tools available to AI researchers can be assembled to create rather full-featured artificial intelligences. Things will really take off when those same researchers start snapping together those amalgamations into something far more versatile, something not entirely unlike a real biological mind.
