Recognition was an artificial intelligence program that compared artworks with up-to-the-minute photojournalism.

It used powerful algorithms to search through Tate’s vast collection database, looking for visual and thematic similarities between artworks and the endless stream of online news images.

Winner of IK Prize 2016 for digital innovation, Recognition was active from 2 September - 27 November 2016 as a website and installation at Tate Britain.

Highlights

Over the course of its three-month lifespan, Recognition produced 7271 matches between British art in the Tate collection and up-to-the-minute news images provided by Reuters. Here is a selection of some of the most striking, humorous, controversial, thought-provoking or simply beautiful.

How it Works

Recognition used four different algorithms to analyse images. Artworks and news images with a high similarity in one (or more) of these categories were selected as a match.
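As a rough illustration of this selection rule, the sketch below treats each of the four analyses as producing a similarity score between 0 and 1 and accepts a pairing when at least one score clears a threshold. The field names, the 0.75 threshold and the helper functions are assumptions made for illustration; the text above does not say how Recognition actually scored or combined its categories.

```python
from dataclasses import dataclass, fields


@dataclass
class Similarity:
    """Hypothetical per-category similarity scores, each in the range 0..1."""
    objects: float
    faces: float
    composition: float
    context: float


THRESHOLD = 0.75  # assumed cut-off, not a documented value


def strong_categories(sim, threshold=THRESHOLD):
    """Return the names of the categories whose score reaches the threshold."""
    return [f.name for f in fields(sim) if getattr(sim, f.name) >= threshold]


def is_match(sim, threshold=THRESHOLD):
    """A pairing counts as a match if one (or more) category is strong."""
    return len(strong_categories(sim, threshold)) >= 1


example = Similarity(objects=0.2, faces=0.0, composition=0.8, context=0.4)
print(strong_categories(example))  # ['composition']
print(is_match(example))           # True
```

Under these assumptions a single strong category is enough, which is consistent with pairings that look alike in composition but differ completely in subject.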

Object recognition is a process for identifying specific objects. Its algorithms rely on matching, learning, or pattern recognition using appearance-based or feature-based analysis.

LEFT

26/11/2016

Kitchen utensils are pictured after a fire broke out in a slum area in Jammu

© Mukesh Gupta / REUTERS

1 A METAL POT


RIGHT

1991

Monument

© Ian Hamilton Finlay / TATE

6 A SILVER POT

Facial recognition is a process for identifying human faces. In addition to locating the human faces in an image, it determines the age, gender, and emotional state of each subject it finds.

LEFT

11/10/2016

A supporter of Lebanon's Hezbollah leader Sayyed Hassan Nasrallah has his picture on his head during a public appearance by Nasrallah at a religious procession

© Aziz Taher / REUTERS

1 MAN WITH A BEARD
2 MAN WITH A HAT ON FACE


RIGHT

1545

A Man in a Black Cap

© John Bettes / TATE

2 MAN WITH A HAT ON
3 MAN HAS A BEARD

Composition recognition is a process for identifying prominent shapes and structures, visual layout, and colours.

LEFT

18/10/2016

File photo of farmers collecting corn for a cargo at a farm in Gaocheng

© Kim Kyung Hoon / REUTERS

COMPOSITION 80%


RIGHT

1980

Portrait of V.I. Lenin with cap, in the style of Jackson Pollock III

© Art & Language (Michael Baldwin, Mel Ramsden) / TATE

COMPOSITION 80%

Context recognition is a process which analyses the titles, dates, tags, and descriptions associated with each image.

LEFT

13/09/2016

A boy immerses an idol of Hindu god Ganesh, the deity of prosperity, into the Sabarmati river during the 10-day-long Ganesh Chaturthi festival, in Ahmedabad

© Amit Dave / REUTERS

WATER, BOAT, OUTDOOR, MAN, RIDING, POND, BAYOU, RIVER, CREEK, RAFT


RIGHT

1893-4

August Blue

© Henry Scott Tuke / TATE

WATER, SPORT, OUTDOOR, MAN, RIDING, RIVER, POND, RAFT, OCEAN, DOCK
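One simple way to quantify how closely two tag lists like these agree is the Jaccard index: the number of shared tags divided by the number of distinct tags across both lists. This is only a sketch; the text above does not state how Recognition scored context, and the `jaccard` helper is a stand-in chosen for illustration.

```python
def jaccard(tags_a, tags_b):
    """Jaccard index of two tag lists: |A ∩ B| / |A ∪ B|."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # define two empty tag lists as having no similarity
    return len(a & b) / len(a | b)


# Tag lists copied from the example pairing above.
news_tags = "WATER, BOAT, OUTDOOR, MAN, RIDING, POND, BAYOU, RIVER, CREEK, RAFT".split(", ")
art_tags = "WATER, SPORT, OUTDOOR, MAN, RIDING, RIVER, POND, RAFT, OCEAN, DOCK".split(", ")

shared = sorted(set(news_tags) & set(art_tags))
score = jaccard(news_tags, art_tags)

print(shared)           # the 7 tags common to both images
print(round(score, 2))  # 0.54, i.e. 7 shared tags out of 13 distinct tags
```

Seven of the ten tags on each image are shared, so most of the agreement comes from generic scene words (WATER, OUTDOOR, MAN, RIVER) rather than from anything specific to either picture.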

Perspectives

Leading members of the art, technology, and journalism communities (all human intelligences) share their thoughts on their favourite match.

Emmanuel Benazera explains how intelligent machines use memory to ‘guess’

This is one of my favourites because it shows how neural networks (a sort of computerised pattern recogniser or ‘memory’) enable an artificial intelligence to ‘guess’, i.e. figure out what it ‘sees’ in relation to what it has already ‘seen’.

First, both images are tagged with the term 'office' in their respective databases. Then, Recognition detects male human figures in each image and matches them with each other, which adds to the match's score.

Using this information, the algorithm near-correctly recognises the computer in the first image as a ‘laptop’, as it also does the book stand in the second image. This is interesting: the Recognition neural nets do not ‘know’ about book stands or other such 17th/18th century equipment as they are trained mostly on modern imagery. Thus, we can easily conjecture that given the position of the man in front of the object, the net considers that the square shape in front of him is most likely a computer. From there, both objects are matched as ‘laptops’.

Well, the book and its stand is yesterday’s computer, no?

Emmanuel Benazera developed the software that powered Recognition. He is a former AI researcher with CNRS, NASA and other institutions. His areas of interest include machine learning, search engines and automated decision making. He is a keen Open Source advocate and participates in the development of a handful of open deep learning, optimization and search tools.

Miguel Carvalhais discusses the creativity to be found in error

Recognition leads us through a serendipitous guided tour of Tate Britain’s collection with the aid of A.I. technologies. I have selected four matches (which you can browse in the slider above), starting with obvious formal similarities [matches 1 & 2], to a comparison featuring strangely similar scenes separated by two centuries and wildly differing contexts [3], and finally two juxtapositions that are not so much dependent on formal factors as they are on interpretation and semantic coincidence [4 & 5]. Looking at the breadth of match types such as these, one comes to understand that Recognition is not about form, per se, but rather about meaning and subtext. Through the recombination of two images, Recognition endeavours at creating new meaning (accidentally), starting from denotation and moving towards complex connotation and to the development of rich subtexts, sometimes achieving something — a connection, an insight — that was not apparent before.

But Recognition is not simply about juxtaposition. It’s about the reasons for selection and comparison. About the machine’s ‘thought’ process. These are what turn the project into something enlightening, as Recognition elegantly exposes its ‘thinking’ to the viewer (this can be seen in the accompanying data for each match in the Archive section of this site), assuming both machine and human errors, and laying them bare to the audience. Because of this, the least significant and interesting matches are perhaps those where errors do not happen [e.g. 2], where identification of subject and composition are correct in the case of both the photography and the artwork, and where, consequently, two very similar images are paired, not adding much to either of them. Recognition shines when, through error (or misinterpretation), sequences are formed where the interstice between the two images is rich in substance. When that space is no longer a gap, but a rich semantic field that arises from the images, their interpretative descriptions, and the matches that are described.

Recognition thrives on error. But error is the starting point for creativity and creation. And error is responsible for the inception of meaning.

Miguel Carvalhais is a designer and musician. He is an Assistant Professor at the Faculty of Fine Arts of the University of Porto, and specialises in computational art and design practices.

The photojournalist’s perspective: Jonathan Ernst on his image of President Obama

The photograph of U.S. President Obama was taken during his official remarks during a visit to Laos, a very straightforward speech about the two countries moving ahead together in peace and prosperity.

Obama also announced substantial U.S. funding to continue the long-term effort to clear away unexploded bombs dropped by the U.S. during the war in Southeast Asia in the 1960s and 70s. In such situations, there are physical limitations on where photojournalists can stand in order to capture images. At this event, similar to many presidential speeches, we were relegated to a moat at the foot of the stage - "the buffer" in news parlance - and while this can be limiting, it also serves to challenge photographers to compose images that capture the significance of the event better than their competitors.

In this image, I was attempting to centre the American president at the exact midpoint between the two flags - themselves the most distilled visual representations of the two countries. These situations also require us to work quickly. We tend to compose, shoot, edit and upload all in one smooth workflow arc. So when someone asks me what's going through my mind when I'm working in a news situation, I'm always reminded of an old television ad with an Olympic hurdler. She's standing in the blocks waiting for the gun to sound on a training run, and the voiceover has her saying that people always want to know what's going through her head during a race. At that point the gun goes off and the voice in her head starts counting her steps, yelling, "one two three four five six seven eight KICK!" over and over between the hurdles. Sometimes being a visual news journalist can feel just like that - being propelled from one (hopefully) interesting composition to the next without so much as a breath in between. It's frequently the polar opposite of the time and intention that conventional artists are able to devote to their compositions.

My first response to the comparison was deep confusion, but later I began to see a figure at the centre of the Typographer image as a plausible abstraction of the figure of President Obama, strictly visually speaking. I think the comparison would be even more interesting in three dimensions. Perhaps there’s a line of interpretation one could take, if one wanted to, from the work of a typographer - someone whose work was to set words down in a permanent form - to that of the modern politician, whose words can sometimes endure but are also frequently impermanent.

Jonathan Ernst is currently a Washington-based staff photographer for Reuters News Pictures. His photograph of President Obama was selected and compared to an artwork from the Tate collection by Recognition.

The photojournalist’s perspective: Vasily Fedosenko on photographing a prison cell in Belarus

The photo was taken in a detention centre in Belarus. The men were in a cell and I tried to shoot the photo through the door, to show the metal, locks, the small window, the small room for these men, no sunlight – very artificial.

I thought about them - how many kilometres they had covered and how much time had passed since they left their country before they appeared in the cell - without understanding the local language or any other languages except their own.

Comparing this photo and the artwork - every subject sits in its own cell…

Vasily Fedosenko is currently a staff photographer for Reuters News Pictures based in Belarus. His photograph of men in a prison cell was selected and compared to an artwork from the Tate collection by Recognition.

Caitlin Hu on a moment in Plaza de la Revolución

It's the early morning and the solitary labourer is on his way to work. In a moment Plaza de la Revolución will fill up, and the night lamp on Surrey Canal will flicker off. Or maybe it's late afternoon: The sun has fallen low in Havana, the newspapers will hide their headlines until tomorrow, and the street lamps have just turned on, dotting all of Camberwell with light. I'm taken aback that Recognition found this very precise and ambiguous moment in frames 81 years apart: The cool still of an empty city at an odd hour, the feeling that something momentous has just occurred or perhaps is about to occur.

Caitlin Hu is a deputy news editor, directing coverage of visual culture and the arts at Quartz.

Natalie Kane discusses invisibility and colours that demand attention

In a 2012 essay on Albert Irvin, Sam Cornish writes that Irvin’s work so often signalled ‘an arrival or a bringing to attention.’ The boldness of colour and visible movement of brushstrokes in Irvin’s painting Flodden demand attention. So, when compared to Antonio Parrinello’s similarly red-filled portrait of a migrant landing ashore in Sicily, viewers may feel an intense sense of urgency.

The Recognition software states that ‘no faces are detected’, when there is quite clearly one there, looking at you from a place that the vast majority of us have little or no experience of. Revealing the un-intelligent, context-ignorant nature of many artificial intelligence systems, which cannot understand what we humans can see, the machine tells us that it recognises ‘a red blanket on that floor’ when in fact the image is of a man being rescued, succumbing to an involuntary calm. It is political to not be legible to a system, be it by facial recognition technology that has often been shown not to recognise darker skin tones or by being officially unrecognised by institutions or governments. The migrant crisis, although reported upon in the media, is invisible to those who choose not to see, or, to those who see it as an unwelcome invasion, simply a depersonalised and faceless mass.

The machine-unreadable man rescued by Médecins Sans Frontières shows not panic, or relief, but the relinquishing of control. You have come this far; now help is here to take you to a certain point. But after the moment of this picture, his journey and those of countless other potential subjects of photographs like this become uncertain, as yet more unfamiliar circumstances rear their heads. Those that make it (so many do not) must learn to become legible, or invisible, in new ways. Contrary to what we see, the A.I. reads the photograph as ‘a red suitcase’ - cargo - no way to see a person, in whatever state of motion.

Abstract expressionism has always shown a need to react: to be spontaneous, show action or visceral intensity. To depict the qualities of a body, for example, without a body being present. It is that strange red-orange colour taking-up space at the bottom of both images, below the swift strokes of crimson in the painting and below the crimson uniforms in the photograph, that grabbed my attention most of all. A colour of emergency and alarm. A warning sign.

Natalie D Kane is a curator, writer and researcher based in Manchester, UK. She is Curator & Editor at FutureEverything (UK), an innovation lab for digital culture and festival, and holds a research position at Changeist (NL), a research, consulting and creative group that helps organizations navigate complex futures.

Peter Kennard on stocks, the whole world and work

The words of Martin Creed's Work No 232: the whole world + the work = the whole world, take on a whole new meaning placed next to the Dow Jones Industrial Average. The bland numbers of the Dow control the workers of the world, industry creates an average poverty that fluctuates its profit according to invisible dealers making a killing on free market screens. A mouse-click creates virtual killing on the Dow that spills real blood somewhere on the world map. The whole world + the work becomes the Dow + its victims.

Peter Kennard is a London born and based photomontage artist and Senior Research Reader in Photography, Art and the Public Domain at the Royal College of Art.

Erik Kessels discusses how we see, remember and reproduce iconic imagery

I see a particularly strong link between contemporary photojournalism and classical history or genre painting. To illustrate this I picked three matches produced by Recognition which compare beautifully composed paintings from the 18th and 19th centuries with contemporary news images. If we question what makes a photo ‘iconic’, we must consider its composition - the use of line as well as colour and light, to depict a scene. These tools were of course well known to the painters of the past. They caught light and shadow with pigments, the same way photographers do today with their lenses.

Classical styles of painting follow strict rules of geometrical composition in order to lead our eyes through a static image, piece by piece like a story, allowing us to understand the scene as if the events were unfolding in front of us. The visual balance that results from this act of composition is what touches us, creating a sense of suspended animation emphasising the drama or importance of what is happening. A sense of importance is often what we mean by the word ‘iconic’, whether a depiction of anonymous people in an Afghan village or a famous historical figure from the past.

The need for iconic forms of composition in contemporary images results from our collective imagination. Our brains, and those of photographers documenting the world around us, are infused with visual references which structure how we see and represent the world, consciously or unconsciously. Paintings, photographs, films. Contemporary images, historical images, and in western culture, biblical images. Our memories are filled with a catalogue of carefully constructed representations, or archetypes, of the world. The structural and visual similarities that Recognition has found between artworks and press images should be understood in light of the fact that certain visual structures already exist in our minds. Consequently, such pre-existing images must play a significant role in the way photographers capture and construct the world – and (re)produce images of it.

Erik Kessels is a Dutch artist, designer and curator with a particular interest in photography, and creative director of KesselsKramer, an advertising agency in Amsterdam.

Matthew Plummer-Fernandez discusses car seats and algorithms as art (and context)

I never knew Henry Moore’s reclining figures had so much in common with the interior of a car. In the first image, the isolated seating resembles the human body it was designed to support – curved torsos with soft, oval-shaped headrest-heads. The comparison between the car and Henry Moore images accentuates the anthropomorphic character of both; the car seats echoing the gentle contours of Moore’s figures with their signature pinheads. The cavities of the vehicle as a space for bodies to inhabit – such as the hollow steering wheel for a driver’s hands, and the cavernous chambers for legs – mirror the negative spaces that Moore used to depict bodies flowing and emerging from naturalistic forms.

There are also similarities in the dark, satin-like finishes that make both car interiors and Moore’s figures so seductive and mystical, that when applied to abstract human forms, call upon primordial desires that can be harnessed by both consumerist and artistic enterprise. Both Moore and the car industry celebrate and visualise an optimism towards humanity and modernity as vital to our existence.

The two images have been paired together by an assemblage of algorithms; the software’s processes are made visible by the keywords and metrics that annotate the images (seen alongside each match in the Archive section of this site). “No faces found,” suggests that face-detection algorithms have analysed the images. A list of captions such as “1. The handle of a metal chair, 2. A black handle on a silver spoon” reveals that image classification algorithms have been used - a subset of ‘machine-learning’ algorithms. The relations between the words “sitting”, “chair” and “reclining”, suggests that some Natural Language analysis has also been carried out to identify connections between the captions and titles of images.

Although algorithms have been utilised by artists in the past, now that the significance of such technologies is more widely understood and experienced, it becomes increasingly possible, interesting (and necessary) for tech-related art practices to reflect upon this new reality as a subject and context for art. The algorithms exploited by the Recognition system are arguably the same, or very similar to, algorithmic processes that increasingly mediate everyday life and culture, from powering the processing of images shared on social media platforms such as Facebook, to analysing shopping behaviour on e-commerce sites such as Amazon.

Algorithmic art in the past had a ‘purist’ approach to using algorithms; it was generally powered by the “artist’s own algorithm” (bespoke software created by the artist), but now software practices are essentially entangled with open-source software coming from code repository sites, servers, Application Programming Interfaces (APIs), and pools of data such as large sets of tagged images, which are needed for image classification technology. A purist approach therefore makes for an out-dated software-art practice (i.e. one that fails to reflect the reality of how these new kinds of technologies are created and used in the wider world), and so I find it both commendable and of no surprise that Recognition is the product of technical and financial support from Microsoft, and with media content, mainly images, supplied by Reuters and Tate.

The project is interesting in the way it has reconfigured these disparate giants from the worlds of software, news and art into a new mode of co-operation, resulting in delightful and often astounding results. But, arguably, the project falls short by failing to critique a shared (and rising) faith in the unknown potential of new technologies. In my view, a work of art incorporating cutting-edge technologies like “artificial intelligence” should simultaneously operate critically: revealing and unpacking both its potential and its dangers. Perhaps this is why I have chosen to highlight the pairing of a car interior and a Henry Moore sculpture. This new wave of optimism towards emerging technology, and the use of art to promote it, has manifested in a project that has fortuitously identified similarities with the past: in the case of my selected example, modernism and the car age. It is a small reminder that today’s passion for algorithmic automation could evolve into tomorrow’s equivalent of traffic jams, air pollution and peak oil, and that art should tread carefully and critically when it attempts to champion these reoccurring technological and anthropocentric ideals. Artists now working with, and experts in, these emergent technologies must find a balance between close entanglement and critical distance, and bring to light, poke, or ridicule the challenges that will inevitably come with the great benefits.

Matthew Plummer-Fernandez is a British/Colombian artist who incorporates software and internet practices into sculpture.

Anne Racine on destruction and transformation

These two images evoke destruction in very different ways (their stories are very different); but placed side by side, it is as if the debris strewn over what appears to be an old commercial street on the left (following an unspeakable act of violence) could have formed the raw material used in making the sculpture on the right. In each image, and through their comparison, remnants of destruction become symbolic. They refer to the fragility of everyday life, but also the violence under the surface. In the 19th century, artists like Caspar David Friedrich, Hubert Robert and Victor Hugo were interested in the destructive forces of nature and time, famously depicting the beauty of ruins. Here, even without understanding the context of each image, we can suppose the acts of violence were intentional. In comparison, the horrific sense of emptiness in the news image becomes a sense of fullness in the sculpture, as if the latter accumulates the material elements of the former. Does this comparison constitute paradox, analogy or enigma? It allows us to re-read art history (and all of visual culture, including 'news' photography) through the ideas of Marcel Duchamp (the first artist to show that any everyday object could be an artwork, what he called 'readymades') and his adage that in addition to the artist (or photographer), the viewer completes the work (as they find it).

Anne Racine is currently Head of Communication and Sponsorship at Jeu de Paume, a cultural centre for photography, cinema and video in Paris.

Caroline Sinders discusses the intersection of design and violence

The computer virus Stuxnet is believed to have been developed by the American and Israeli governments as a cyberweapon to attack Iran’s nuclear facilities (those aspects that are connected to and controlled via the internet). Stuxnet’s ability to cause damage to such facilities or systems is multi-fold, but put simply, its capabilities arise both from the programmatic design of the virus itself, and the design of the systems it lives in: an open internet and an interconnected world.

The Recognition project at Tate Britain uses machine learning to find similarities between images from two ‘open’ sources – artworks from the Tate collection and current photojournalism from Reuters’ picture library – in an attempt to see if the technology can reveal commonalities. I have selected the pairing of a photograph taken on 15 November 2016 by Thomas Mukoya featuring over 520 illicit firearms collected in Kenya near the capital Nairobi, and a painting entitled Canterbury Cathedral from 1987 by Dennis Creffield. The images may seem incredibly different, particularly in terms of subject matter, but are noticeably similar in composition. The Recognition machine learning system has attempted to describe the two images as ‘a bunch of a bananas’ and ‘a church with a large tree’ respectively.

The different algorithms used by the Recognition program allow it to scan each image in an attempt to isolate objects and shapes it recognises. On the Recognition site the code highlights sections of paired images where it thinks it has found identical objects or visual similarities. Often, in reality, these are quite different things that simply look the same compositionally or structurally. For example, in this pairing, both feature strong upward leaning lines that echo one another. The pictures are compositionally ‘75% similar’.

The poetic part lies in what the comparison reveals about the two subjects: the ‘designed’ nature of violent objects or systems, apparent in the first and latent in the second image. Juxtaposing a cathedral with a pile of firearms reminds me of Paola Antonelli’s reference to the Stuxnet virus from a lecture series on ‘Design and Violence’ at MoMA: that seemingly benign systems or structures can have unintended consequences, consequences aided by their inherent design. The images have been brought together because the algorithms recognise compositional (‘structural’) similarities, but unknown to the machine, they also reference violence – albeit at completely different ends of the spectrum. Furthermore, and most interestingly, both images make reference to violence through their compositionally similar features, which are depictions of designed structures: 1. a purposefully structured, almost architectural, pile of guns, and 2. a cathedral. The structure in the first image consists of individual objects of violence (guns) recomposed into a mega-structure itself bound for destruction (to be burnt by the authorities). The structure in the second image, a cathedral, represents the Christian church, whose physical and organisational structures have also had unintended consequences, the systematic violence of which is often forgotten. Serendipitously, in these images, the two structures look similar.

The beauty of the Recognition project is that it can reveal literal visual similarities across a multitude of subject matter. The topic of violence may have surfaced randomly, at least in terms of the machine’s intentionality, but in this case at least, the ability to highlight common visual (compositional or ‘structural’) similarities illustrates what appearances may be able to tell us about the reality of things.

Caroline Sinders is a machine learning user, designer, researcher, artist and digital anthropologist obsessed with language, culture and images.

Matthew Smith on a match that defies language

Like many, I adore the work of JMW Turner, but I've always struggled to describe what it is about how he did light in his paintings that makes them so evocative. It's a case of our spoken and written language not being rich enough to explain how we feel. Recognition here explains it better than words by simply offering something from our more immediate experiences that captures it. Humorously, the textual accompaniment of neither image does it justice or explains what it is about them that makes them such a good match.

Matthew Smith works in the Computational Science Lab at Microsoft Research, and is committed to improving society’s (people, businesses, governments) abilities to predict geotemporal phenomena (properties and processes that can be associated with geographical space and time).

Jon Snow on order and disorder

However ordered the process of election, no one can predict the disorder that may flow from it. In 2017 more poignantly than ever: the consequence of Trump and this year’s unknown electoral outcomes in Germany, in France, and in the Netherlands. The pictorial evidence here may be of chaos being tipped from the ballot box. But the order enshrined in David Hall’s highly ordered nine may also hide a bitter truth.

Jon Snow is an English journalist and television presenter, currently employed by ITN. He has been the main presenter of Channel 4 News since 1989.

The photojournalist’s perspective: Brian Snyder on his image of Hillary Clinton

My photograph was taken on-board presidential candidate Hillary Clinton's campaign plane as she came to talk to reporters at the back. It happened at the airport in White Plains, NY, before the take-off of her official campaign plane’s first flight. The plane was crowded. To get a vantage point, I was standing on the armrests of the seats with the top of my head up against the ceiling of the plane. Aside from simply making sure I could see Hillary, I wanted to show a bit of the setting - what the plane looks like and what it is like for the candidate to be on board with so many people.

It seems presumptuous to reach for cultural or political connections between my photograph and Gavin Hamilton's painting, given the disparity in subject matter and the span of time separating the events. I'll confess I had to do some research into the specifics of "Agrippina Landing at Brindisium with the Ashes of Germanicus". That said, both are a record/document of a specific moment: Hamilton's from ancient Roman history and mine from the 2016 U.S. Presidential campaign. Both involve a political scene: Agrippina carrying the ashes of her assassinated husband General Germanicus back to Rome and Hillary Clinton campaigning for the Presidency. Both involve political families: Agrippina is the granddaughter of Augustus, Rome's first Emperor, and mother of the Emperor Caligula; Hillary Clinton is the wife of former U.S. President Bill Clinton, and a former U.S. Senator and Secretary of State. Most noteworthy is that a woman is at the centre of both images, which until Hillary Clinton, was unknown in U.S. presidential politics.

Formally there are clear similarities between the images. And while Hamilton created his scene, and placed the historical actors in it, I was working with a chaotic scene over which I did not have a lot of control. The perspective of both images and the layering of the people and things in the images have similarities.

The unanswerable question I'm left with is this: What would the painter Gavin Hamilton think of this comparison?

Brian Snyder is a Boston-based photojournalist, who covers local, national and international news stories and events. He is a Senior Staff Photographer with Thomson Reuters and a graduate of the School of the Museum of Fine Arts, Boston and Tufts University. Brian is a two-time Boston Press Photographers Association photographer of the year. His photograph of Hillary Clinton was paired with an artwork from the Tate collection by Recognition.

The photojournalist’s perspective: Darren Staples on his image of Diwali

Leicester’s Diwali celebrations are the biggest outside of India. The city’s Belgrave Road and a nearby park become a sea of colour and life where the generations meet. This photograph was taken at the switch-on of the city’s Diwali lights, a couple of weeks before the day itself. The streets are always rammed with people, and it can be hard to move, but the atmosphere is always great. The dancers with their lit-up umbrellas performed on a small stage, away from the hustle and bustle of the crowds and the muddy recreation field ground. They stood out against the night sky.

And that’s it really. Light against darkness. It’s a very human response and I suppose goes back to when we lived in caves and needed to survive attack from bears. I mainly photograph news and sport. When I take a picture, like all photographers, I look for light and colour. Spectacle. But I’m not an artist, I don’t set out to show anything except the moment. I’m usually thinking “what’s for tea” and “when does my parking run out”?

I can see similarities in the comparison, although I thought it might pick out the colours, rather than the shape. Do I think it has “aesthetic, theoretical, cultural, political or historic resonance?” I don’t know. I just took a photograph. It’s not for me to analyse, but if anyone else can see that, good on them. For me, it’s enough that people might see my picture and think “that’s nice”, before they turn the page of a newspaper.

Darren Staples is currently a UK-based staff photographer for Reuters News Pictures. His photograph of Diwali celebrations in Leicester was selected and compared to an artwork from the Tate collection by Recognition.

Installation

Running parallel to the Recognition website was an interactive installation at Tate Britain. Visitors were prompted to match an artwork with an image from the news—the same task as Recognition, albeit with a smaller data-set (the 50 most similar artworks rather than the entire Tate collection). The goal of this activity was to compare decisions made by Recognition to those made by human beings.

Visitor Matches

Over 4500 visitor matches were submitted throughout the project. This gallery shows a selection of visitor-generated matches in comparison to those of Recognition. Each set (match) features the same news image, matched with two unique Tate artworks: one selected by a visitor, and the other by Recognition.

Statistics

A comparison of the data from user-generated matches with those of Recognition reveals the following:

Composition

Visitors selected matches with a composition similarity about 6 percentage points higher than Recognition.

(66.8% vs 60.7%)

Objects

Both visitors and Recognition selected matches with a near-equal object similarity.

(14.5% vs 14.8%)

Faces

Visitors selected matches with a lower facial similarity than Recognition, but neither was high.

(0.2% vs 1.2%)

Context

Visitors selected matches less similar in context than Recognition.

(16.3% vs 23.2%)
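
The four pairs of figures above can be compared directly as differences in percentage points. The following is a minimal Python sketch of that arithmetic. The category keys and the function name are illustrative labels inferred from the descriptions, not names used by the project.

```python
# Average similarity scores (percent) as reported above: (visitors, Recognition).
# The keys are illustrative labels, not identifiers from the Recognition software.
scores = {
    "composition": (66.8, 60.7),
    "objects": (14.5, 14.8),
    "faces": (0.2, 1.2),
    "context": (16.3, 23.2),
}


def point_differences(scores):
    """Return visitor score minus Recognition score, in percentage points."""
    return {
        name: round(visitors - recognition, 1)
        for name, (visitors, recognition) in scores.items()
    }


differences = point_differences(scores)
for name, diff in differences.items():
    # A positive value means visitors scored higher than Recognition.
    print(f"{name}: {diff:+.1f} percentage points")
```

A positive difference means the visitors' matches scored higher in that category; a negative one means Recognition's did.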

Tate

In Partnership With

Microsoft

Created by

Fabrica

Content Provider

Reuters

Info

The software that powers Recognition incorporates a range of artificial intelligence technologies that simulate how humans see and understand visual images, including:

OBJECT RECOGNITION

Developed by JoliBrain using DeepDetect and Densecap. A deep neural network finds objects in the image, then labels each one with a short sentence. A similarity search engine then looks for the top object matches among Tate artworks.

FACIAL RECOGNITION

Provided by Microsoft Cognitive Services’ Computer Vision and Emotion APIs.

COMPOSITION ANALYSIS

Developed by JoliBrain using DeepDetect. A set of deep neural networks reads the image pixels and extracts a large number of salient features. These features are then fed into a search engine that looks for the nearest per-feature matches in the Tate archive.
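
The idea of searching an archive for the nearest feature matches can be illustrated with a small Python sketch. This is not the DeepDetect API or the project's code: it assumes each image has already been reduced to a single feature vector and ranks artworks by cosine similarity across the whole vector, which simplifies the per-feature search described above. The artwork names and three-dimensional vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_artworks(query, archive, k=2):
    """Return the titles of the k archive entries most similar to the query vector."""
    ranked = sorted(
        archive.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]


# Hypothetical feature vectors; a real network would produce far more dimensions.
archive = {
    "artwork_a": [0.9, 0.1, 0.0],
    "artwork_b": [0.1, 0.8, 0.3],
    "artwork_c": [0.7, 0.3, 0.1],
}
news_image = [1.0, 0.2, 0.0]

matches = nearest_artworks(news_image, archive, k=2)
print(matches)
```

Sorting the whole archive is fine for a toy example; a collection the size of Tate's would normally call for an approximate nearest-neighbour index.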

CONTEXT ANALYSIS

Developed by JoliBrain using DeepDetect and word2vec. A variety of deep neural networks process both the images and their captions and try to find inner relations, based either on location or on semantic matching among words and sentences.
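
Semantic matching between captions can be sketched by averaging the word vectors in each caption and comparing the averages. The Python example below covers only that part of the description, not location matching or image processing. It uses a tiny hand-written table of two-dimensional vectors as a stand-in for trained word2vec embeddings; the table, the function names and the sample captions are all assumptions made for illustration.

```python
import math

# Hand-written toy vectors standing in for trained word embeddings (assumption).
TOY_VECTORS = {
    "pot": [0.9, 0.1],
    "kitchen": [0.8, 0.2],
    "metal": [0.7, 0.3],
    "portrait": [0.1, 0.9],
    "man": [0.2, 0.8],
}


def caption_vector(caption):
    """Average the vectors of the known words in a caption, or None if none are known."""
    words = [w for w in caption.lower().split() if w in TOY_VECTORS]
    if not words:
        return None
    dims = len(next(iter(TOY_VECTORS.values())))
    return [sum(TOY_VECTORS[w][i] for w in words) / len(words) for i in range(dims)]


def caption_similarity(first, second):
    """Cosine similarity between the averaged vectors of two captions."""
    va, vb = caption_vector(first), caption_vector(second)
    if va is None or vb is None:
        return 0.0
    dot = sum(x * y for x, y in zip(va, vb))
    norms = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norms if norms else 0.0


related = caption_similarity("a metal pot", "kitchen pot")
unrelated = caption_similarity("a metal pot", "portrait of a man")
print(related > unrelated)
```

With these toy vectors, captions about similar things score closer to 1 than captions about unrelated things, which is the property a caption-based match relies on.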

Team

Coralie Gourguechon

Monica Lanaro

Angelo Semeraro

Isaac Vallentin

AI DEVELOPMENT

Emmanuel Benazera

WEBSITE DEVELOPMENT

Alexandre Girard

PRODUCER (TATE)

Tony Guillan

THANKS TO

Sam Baron

Carlo Tunioli

RECOGNITION WAS AN AUTONOMOUSLY OPERATING SOFTWARE PROGRAMME. ALL REASONABLE STEPS HAVE BEEN TAKEN TO PREVENT PUBLICATION OF CHALLENGING, OFFENSIVE OR INFRINGING CONTENT. COMPARISONS BETWEEN ARTISTIC WORKS AND OTHER MATERIAL ARE MADE BY THE SOFTWARE PROGRAMME AND ARE FOR THE PURPOSE OF STIMULATING DEBATE ABOUT ART, EXPRESSION AND REPRESENTATION.



TATE INVITES ONLINE DISCUSSION ABOUT THESE COMPARISONS AND ENCOURAGES USERS TO TREAT COPYRIGHT MATERIAL APPROPRIATELY ACCORDING TO THEIR LOCAL LAW.



IF YOU WOULD LIKE TO CONTACT TATE REGARDING CONTENT ON THIS SITE, EMAIL WEBSITE@TATE.ORG.UK
