2016 Final Projects

Memory Architect

Ricardo Jnani Gonzalez  &  Cagri Hakan Zaman

Abstract.

Memory Architect (MA) is a project about space and memory. We created a Virtual Reality (VR) framework that allows a user to compose past experiences, organize their memories, and create new connections between them in a spatially immersive virtual environment. In the virtual memory palace, through the use of our MA app, a person is able to embed images, videos, document files, sound recordings, and even real 3D spaces into digital models they create. They are then able to organize these spaces as a way to alter a memory’s context. In other words, they can place two very different memories together to gain a different perspective on the scenarios; they can increase or decrease the size of certain memories; and they can combine two separate events into one memory, or break one into its separate aspects. Through the use of Google’s Project Tango and Unity, these models are turned into a VR space that the user can walk around in, both physically and virtually. By moving, kneeling, walking, and so on, the user navigates the virtual architecture holding their composed memories.

Background & Vision

In this project we propose that remembering is a form of creation rather than information retrieval. In this active, embodied process, space plays a crucial role. We do not only reassemble past experiences in a given context; we do it within a particular spatial setting. In this section, we provide evidence for our embodied-memory approach through the lenses of neuroscience and the social sciences.

Neuroscientific Account

Our brain expends tremendous effort to resolve space: it integrates sensory information into practical information so that it can plan where to move or where to sit, and it remembers a particular place so that it can navigate back when necessary. Basic characteristics of spatial perception have been discovered through studies of the rat hippocampus (O’Keefe & Nadel, 1978; Ranck, 1973; Chen, Kloosterman, Brown & Wilson, 2012). The hippocampus is a main processing unit in the brain where short-term memories are encoded into long-term memories. However, a majority of the cell formations in the hippocampus, namely the “place cells,” also play a central role in spatial perception and navigation. Different firing patterns in the place-cell region allow the animal to represent its location relative to objects in the environment. While place cells respond to different visual cues in the environment, they also integrate proprioceptive information about direction and velocity, as well as non-visual sensory information such as sound and odor. They exhibit firing patterns regardless of the sensory medium: the same patterns have been discovered in deaf and blind rats (Hill & Best, 1981; Save, Cressant, Thinus-Blanc, & Poucet, 1998). The spike pattern of a particular place cell is correlated with changes in the size, orientation, and shape of the environment (O’Keefe, 1998). In addition to the evidently location-selective place cells, there are other neural populations in the brain, such as grid cells, head-direction cells, and boundary cells, which as a whole provide reliable information about the environment. The parahippocampal place area (PPA) in the human brain is selective to environmental structures, integrating high- and low-level features for the detection and recognition of visual scenes.

Space, Body and Action

In order to understand the dynamic interaction between action and space one should consider the human body as the natural locus of the experience. The geographer Yi-Fu Tuan states “Upright, man is ready to act. Space opens out before him and is immediately differentiable into front-back and right–left axes in conformity with the structure of his body” (2001). Spatial experience is imposed directly by the structure of the body as well as its ability to respond within physical constraints. Equipped with sensory-motor apparatus, the human body drives the experience by actively engaging with the environment. The French social scientist Michel de Certeau proposes that space can only be experienced through this active participation “when one takes into consideration vectors of direction, velocities, and time variables”.


Memory Architect: Remembering as Reconsideration

In this project we explore the possibilities of using the immediate interaction between memory and space, allowing users to creatively interact with their own memories. Memory Architect is a place for creative remembering: it allows users to augment their memories and trigger their creative thinking, as well as share their subjective points of view and memories with each other.
Our system consists of three stages: Build, Log and Explore. Through a mobile app, users can build memory cells –basic units of memory containers– and attach their memories inside. The mobile app allows users to create, edit and merge different memory cells. Users may upload any type of media –from texts and images to three-dimensional models– to their virtual memory palace. Once the media are uploaded, the system is ready to be explored. We use Google Tango to navigate freely in VR without being limited to a certain area. We believe it is particularly important to let users engage with the virtual content with their bodies, by moving and exploring the space naturally.
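
For illustration, here is a minimal sketch of how such memory cells and their media might be modeled, in Python; all names are hypothetical assumptions, not the actual Memory Architect implementation (which is built in Unity).

# Hypothetical sketch of the Build/Log data model (illustrative names only).
from dataclasses import dataclass, field
from typing import List

@dataclass
class MemoryItem:
    media_type: str  # "image", "video", "audio", "document", or "3d_scan"
    uri: str         # where the uploaded media lives

@dataclass
class MemoryCell:
    name: str
    geometry: str                           # e.g. "square_room", "dome", "vault"
    items: List[MemoryItem] = field(default_factory=list)

    def log(self, media_type: str, uri: str) -> None:
        """Attach a memory (any type of media) inside this cell."""
        self.items.append(MemoryItem(media_type, uri))

def merge(a: MemoryCell, b: MemoryCell, name: str) -> MemoryCell:
    """Combine two separate events into a single memory cell."""
    return MemoryCell(name, a.geometry, a.items + b.items)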

The Mobile Interface

[Figure: MA mobile app home screen]

Here we see the home screen of MA. First, the user can choose Create a New Memory Cell, in which to place some form of media that captures a memory. If they have already created memory cells, they can go back to view or edit what lives in each of them. Similarly, the app would allow them to organize previous cells as a way to compose, recreate, or shake up their stored memories. We envision that as users create virtual spaces filled with their experiences, they may want to share some of them with others. This feature would allow them to send a particular space they have created and filled with moments to others. Similarly, the user could choose to publish a particular memory for everyone in the network to access. Whether it is a personal memory space they wish everyone to see, or a space of learning that captures the many aspects of a topic, this Library of Shared Memories would allow selected access to a larger world of composed experiences and curated memories. These five functions of the system are discussed in more detail in the following sections.

[Figure: MA mobile app composing interface]

Build

We envision a very simple interface for creating and composing virtual spaces. Through a 3D modeling system embedded in the app, the person would have access to a series of basic 3D geometries and architectural typologies. In other words, they could choose from a series of simple square rooms, domes, vaulted hallways, tunnels, and so on. They could pick any one of them, or combine several to create their own virtual architecture. The user would also have basic tools such as move, scale, and rotate. These features would enrich their ability to organize their memories, tweak the hierarchy of their experiences, and curate lived scenarios.

[Figure: MA mobile app capturing interface]

Log

Once a person has created a cell, or a series of cells, they can insert memories in different ways. Whether by taking a snapshot of an event, recording a video or voice note, or simply uploading a document file, the app would allow the person to bring a range of media into their own virtual memory palace. The user could choose to create an entire space for a single event, or embed snippets into 3D objects inside a room. We also envision the possibility of 3D scanning real spaces into the virtual environment. Users could capture a pavilion they visited on vacation, place it in their virtual memory palace, and merge it with the unlike scenario of their workspace as a way to gain inspiration, prompt bisociation, and induce creativity.


Explore

MA also allows users to store memories in objects arranged throughout the spaces they create. As users walk through their memory palace, they are able to insert media of memories into a series of reflective spheres. This is yet another way of mixing different types of memories. A user can, for example, create a space for a class they are taking. Throughout the space, there could be images, videos of classes, reference reading files, and so on, all along the walls. They can then add spheres of memories that are not directly of the class, but rather relate to it. These 3D objects in the virtual palace add a further layer to organizing, composing, and re-creating memories.

[Figure: reflective memory sphere]
Like the virtual spaces, the object itself may have any type of media embedded in it. The spheres, however, instead of continuously displaying the memory media, could be triggered into revealing the memory only when the user approaches. In this manner, the spheres serve as a simple method of augmenting the virtual space. By reflecting scenarios that are not necessarily of the space they are placed in, they enhance the experience of one memory with the insertion of another. The reflectivity of the material creates a kind of memory montage, where multiple scenarios are overlaid on one another.

[Figure: organizational schemas, activity-based]

The organizational layouts of a memory palace are nearly infinite; it could be arranged almost any way the user wants. However, besides giving the user full control over the composition of their memories, the system would offer a series of elementary organizational strategies. Take location-based organization, for example: the system can arrange spaces so that every scenario that happens in a particular place stays within the confines of a given room. That way, the user can go to that space and speed backwards and forwards in time to view the many stored events of that one place. This is a very simple type of filtering: keeping work at work and home at home, for example. The user could then enhance this separation, so that while navigating a memory space of “home”, they would have no visual access to events from the workplace.

[Figure: location-based organization]
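
To make the location-based strategy concrete, here is a minimal sketch, assuming each memory item carries a place tag and a timestamp (the field names are our assumption):

# Sketch of location-based organization: every scenario that happened in a
# given place is grouped into one room and ordered in time, so the user can
# "speed backwards and forwards" through the events of that single place.
from collections import defaultdict

def organize_by_location(items):
    """items: list of dicts with 'place', 'timestamp' and 'media' keys."""
    rooms = defaultdict(list)
    for item in items:
        rooms[item["place"]].append(item)
    for place in rooms:
        rooms[place].sort(key=lambda it: it["timestamp"])
    return rooms

rooms = organize_by_location([
    {"place": "home", "timestamp": 3, "media": "dinner.jpg"},
    {"place": "work", "timestamp": 1, "media": "meeting.mp4"},
    {"place": "home", "timestamp": 2, "media": "birthday.mp4"},
])  # keeps work at work and home at home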

Another simple organizational method would be to arrange memories by activity. Here again, the user could begin to compose the different types of activities and play with their proximity. A space of reading and studying, for example, could be interestingly enriched by a portal to the memory of your first picnic in Paris. The user would be able to walk through a dome of all of their travels, a valley of all of their hikes, an oculus of their favorite skies.

Prototype

We have developed a prototype in the Unity game engine for Google Tango, in which the app downloads the contents of a memory palace from a server and places them inside a memory cell we designed. The prototype virtual palace can be explored using the Tango device and VR goggles.
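
The download step can be pictured as a simple fetch of a palace description from the server; the sketch below assumes a JSON endpoint and layout of our own invention (the actual prototype implements this inside Unity):

# Sketch of the server download step (endpoint and JSON layout are assumed;
# the real prototype does this in Unity for Google Tango).
import json
import urllib.request

def fetch_palace(server_url: str, palace_id: str) -> dict:
    with urllib.request.urlopen(f"{server_url}/palaces/{palace_id}") as resp:
        return json.load(resp)  # e.g. {"cells": [{"geometry": ..., "items": [...]}]}

def place_contents(palace: dict) -> None:
    """Walk the downloaded palace and place each item in its memory cell."""
    for cell in palace["cells"]:
        for item in cell["items"]:
            print(f"placing {item['media_type']} ({item['uri']}) in a {cell['geometry']}")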

Future Work

We envision utilizing space as an interface. By being able to physically walk in a virtual space, motion, pace, volume, material, mass, and flow become tools for engaging the digital world. Without the typical constraints of the physical world, the virtual palace could warp in ways that allow the user to see spaces beyond where they stand. This would give them access to see themselves, and the memory they are walking through, in relation to the overall palace. This method of warping effectively redefines the space of memory.


As the person develops their memory palace, they could begin to shape and model it after the real world by capturing 3D spatial scans. This build-up over time is accelerated and enhanced through the sharing feature. Eventually the user could have a virtual duplicate of their home, school, town, and so on, in which they compose and montage the events and moments they want to remember of that particular place.

[Figure: time shelves]

References

Place Cells and Memory

[1] Park, S., Brady, T.F., Greene, M.R., & Oliva, A. (2011). Disentangling scene content from its spatial boundary: Complementary roles for the parahippocampal place area and lateral occipital complex in representing real-world scenes. Journal of Neuroscience, 31(4), 1333–1340.
[2] Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision, 42, 145–175.
[3] Oliva, A., & Torralba, A. (2006). Building the gist of a scene: The role of global image features in recognition. Progress in Brain Research: Visual Perception, 155, 23–36.
[4] O’Keefe, J., & Nadel, L. (1978). The Hippocampus as a Cognitive Map. Oxford: Oxford University Press.
[5] O’Keefe, J., Burgess, N., Donnett, J.G., Jeffery, K.J., & Maguire, E.A. (1998). Place cells, navigational accuracy, and the human hippocampus. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 353(1373), 1333–1340.
[6] Ranck, J.B., Jr. (1973). Studies on single neurons in dorsal hippocampal formation and septum in unrestrained rats. I. Behavioral correlates and firing repertoires. Experimental Neurology, 41, 461–531.
[7] Chen, Z., Kloosterman, F., Brown, E.N., & Wilson, M.A. (2012). Uncovering spatial topology represented by rat hippocampal population neuronal codes. Journal of Computational Neuroscience, 33(2), 227–255.

Memory & Space

Foer, Joshua. 2012. Moonwalking with Einstein: The Art and Science of Remembering Everything. London: Penguin Books.
Yates, Frances Amelia. 2002. The Art of Memory. Reprint. Chicago: University of Chicago Press.
Bachelard, Gaston. 1969. The Poetics of Space. Boston: Beacon Press.
Pallasmaa, Juhani. 2005. The Eyes of the Skin: Architecture and the Senses. Chichester: Wiley-Academy.
Abbott, Edwin A. 1990. Flatland: A Romance of Many Dimensions. Champaign, Ill.: Project Gutenberg.
Bergson, Henri, Nancy M. Paul, and William S. Palmer. 1912. Matter and Memory. London: G. Allen & Co.
Ai, Weiwei, and Anthony Pins. 2014. Ai Weiwei: Spatial Matters: Art, Architecture and Activism.
Grynsztejn, Madeleine, Ólafur Elíasson, Daniel Birnbaum, and Michael Speaks. 2002. Olafur Eliasson. London: Phaidon Press.
Eliasson, Olafur, and Günther Vogt. 2001. Olafur Eliasson: The Mediated Motion. Köln: W. König.
Beccaria, Marcella, Ólafur Elíasson, and Simon Turner. 2013. Olafur Eliasson: OE.
Pallasmaa, Juhani, and Peter Zumthor. 2013. Sfeer Bouwen = Building Atmosphere. Rotterdam: nai010 Publishers.
Jones, Caroline A., and Bill Arning. 2006. Sensorium: Embodied Experience, Technology, and Contemporary Art. Cambridge, Mass.: MIT Press.
Rosenau, Helen, and Étienne-Louis Boullée. 1976. Boullée & Visionary Architecture: Including Boullée’s Architecture, Essay on Art. London: Academy Editions.
Lemagny, Jean-Claude. 1968. Visionary Architects: Boullée, Ledoux, Lequeu. Houston, Tex.: Gulf Printing Co.

 

 

Synesthetic Interfaces

Athina Papadopoulou & Harpreet Singh Sareen


Idea and Motivation: Synesthetic interfaces

We traditionally think of sensory augmentation as rendering one or more senses more acute, enhancing our perception beyond the human level, or, similarly, as adding an extra sense not possessed by humans, such as magnetoreception. Contrary to this approach, we suggest that sensory augmentation can instead be achieved by engaging the full spectrum of our existing human senses. The idea of increasing our sensory interaction with objects is not new; there is a substantial amount of research in the field of multimodal interaction on how to integrate not only vision but also sound and touch into digital interfaces. However, even if multimodality in this approach allows us to interact with virtual environments much as we do with physical ones, it does not really enrich or augment our perception. Synesthetic interfaces can go beyond mere simulation of the physical experience through enhanced modes of human-machine interaction that promote creativity.

Case Study: Pottery Machine with Sonic Feedback

Pottery machine with sonic feedback. Papadopoulou, A. & Singh Sareen, H. 2016

To explore the idea of synesthetic interfaces, we designed experiments and prototypes for a pottery machine with sonic feedback. Through this case study we aimed to gain a better understanding of shape-sound correlations, as well as to experiment with possible implementations of a multimodal creative system founded upon synesthetic associations. The basic idea of the synesthetic pottery machine is that the sonic feedback system reads the profile of the ceramic pot in the making and generates sounds that match its specific shape. One reason we chose pottery as a case study is that the symmetric nature of pots facilitates the translation of shape to sound. In fact, rule-based profile reading of pots is an established method in archeology, where it is used for pottery classification and 3D simulation based on found fragments [1]. We therefore believe that the method we propose could easily become intuitive to pottery makers and to the scientific community surrounding this craft.

Pottery machine with sonic feedback. Papadopoulou, A. & Singh Sareen, H. 2016

Background & Related work: Synesthesia and Creativity

Synesthesia is a condition in which a person has sensory impressions in an additional sensory modality beyond the one stimulated. Although traditionally perceived as a malfunction of the brain, synesthesia has recently been discussed as a creative state of mind that offers augmented cognitive and sensory abilities [2]. According to studies, people who acquire synesthesia after a sudden incident demonstrate increased creativity, augmented cognitive abilities in a certain activity, or even excellence in an artistic endeavor with no prior knowledge or experience. This happens because the brain becomes rewired, elevating some skills beyond usual levels and providing cognitive access to “hidden” parts of the brain [2]. Recent evidence from cognitive experiments demonstrates that synesthesia is not only an innate or naturally acquired condition but can also be learned through training [3]. In other words, through training we can increase our sensory interactions with the world and acquire augmented creative skills.

Synesthetes as “superhumans.” Brogaard, B. and Marlow, K., The Superhuman Mind (cover)

Synesthesia as a form of creative human-machine symbiosis was explored by the cybernetician Gordon Pask in the 1950s. In 1953, Pask developed the Musicolour System, a machine that took sound as input and projected colors as output. The Musicolour System was used in musical performances to promote a creative exchange between the conductor and the machine [4].

Gordon Pask, Musicolour machine, 1953

Artists have also been exploring synesthetic cross-modal correlations. In the Bauhaus school, Wassily Kandinsky experimented with the integration of different sensory modalities through “synesthetic” multimedia experiments [5]. Today, contemporary artists like Timothy Laiden explore visual-auditory correlations focusing on the visual manifestations of sounds [6]. Research in cognitive science has shown that certain sound-shape associations may have a common embodied and cognitive basis. For example, as the “Bouba/Kiki effect” demonstrates, people tend to associate rounded shapes with rounded vowels and unrounded vowels with angular shapes [7]. The physical responsiveness of materials and shapes to sound can also indicate ways to define sound-shape correlations. Research in human echolocation demonstrates methods for perceiving shapes and materials through sounds [8].


Design and Implementation: Method, Experiments and Developed Prototypes

Before implementing the synesthetic pottery machine we conducted a pilot experiment to test whether the chosen sound-shape associations would be perceptible to users. Because in human echolocation an increase in amplitude or frequency corresponds to different properties of spatial features, we decided that focusing on either invariant frequency with variant amplitude, or variant frequency with invariant amplitude, would be more intuitive for users.

Group 1. Basic Shapes

We finally chose to vary the frequency as it generated more pleasant sound mappings. For the experiment we designed four groups of shapes. The first group consisted of very basic 2D shapes, the second group of basic 2D shape combinations, the third group of complex 2D shape combinations and the fourth group of 2D pottery profiles. We used the first group – the group of basic shapes – to train the participants in the shape-sound associations by asking them to listen to the sounds while observing the corresponding shape on a computer screen. After this quick training session, we played the sounds of the second group – the basic shape combinations – and asked the participants to draw the shapes. This time the corresponding shapes were not shown to the participants.

Group 2. Shape Combinations
Group 3. Advanced Shape Combinations

The same procedure was repeated for the third group of shapes – complex shape combinations – and the fourth group – pottery profiles. The system read each profile from top to bottom, mapping each pixel to sound sequentially. The sound output was set to produce a higher frequency when the pixel was farther from a set boundary and a lower frequency when the pixel was closer to it.
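
A minimal sketch of this pixel-to-frequency mapping, assuming the profile is given as per-pixel distances from the boundary and mapped linearly onto an arbitrary frequency range (both are our assumptions):

# Sketch of the profile-to-sound mapping: pixels are read top to bottom;
# distance from the set boundary maps to frequency (farther = higher pitch).
# The frequency range and per-pixel duration are assumed values.
import numpy as np

def profile_to_frequencies(distances, f_min=220.0, f_max=880.0):
    """distances: 1D array of per-pixel distances from the boundary."""
    d = np.asarray(distances, dtype=float)
    norm = (d - d.min()) / (d.max() - d.min() + 1e-9)
    return f_min + norm * (f_max - f_min)

def synthesize(freqs, dur_per_pixel=0.02, sample_rate=44100):
    """Render one short tone per pixel, concatenated into a single waveform."""
    chunks = []
    for f in freqs:
        t = np.arange(int(sample_rate * dur_per_pixel)) / sample_rate
        chunks.append(np.sin(2 * np.pi * f * t))
    return np.concatenate(chunks)

As noted in the results below, smoothing these per-pixel frequency steps into a continuous glide made the shapes easier to perceive.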

Pottery Profiles

Seven students took part in this pilot experiment. After the short training, most participants were able to identify the first and second groups of shapes, although sharp angles and smooth curvatures were often difficult to distinguish. The third group of complex shapes was difficult to perceive through sonic information alone. Judging from the participants’ comments, a longer duration of the produced sound would have facilitated shape perception. The fourth group – pottery profiles – was not identifiable in detail, but, on average, the basic curvatures – positive/negative – and the sequence of shapes were reproduced in the participants’ drawings. Again, a longer duration of the produced sound would have helped. We also noticed that participants with musical knowledge performed better in the experiment. Another observation was that although a separate tone for each sampled point sounded more musical, smoothing the differences between frequencies into a continuous sound facilitated shape perception.
Two prototypes were developed to test sound mappings of pots. For the first prototype we used a Kinect sensor and programmed the system to read the profile of the pot from top to bottom, processing each detected point sequentially. We tested both smoother and more segmented mappings. Capturing the profile was a real-time process in which the user could also observe on screen a red line being drawn over the pot, matching the speed of the sound mapping. As in the shape-sound experiments described above, the profile shape was mapped to a sound output based on a distance-frequency correlation, where more distant points corresponded to a higher pitch than points read closer to the boundary.

The second prototype was made using an array of 16 pairs of infrared LED emitters and receivers controlled by a microcontroller. As in the first prototype, we programmed the system to map the distance of the object from the sensor to a range of frequencies (the closer the object, the lower the frequency). The second prototype, although not as precise as the first, allows for a more intuitive seeing-hearing-doing interaction, which we believe would prove useful in the actual process of making.
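
A sketch of what the microcontroller loop could look like in MicroPython; the pin assignments and frequency range are hypothetical, and for simplicity this reads three ADC channels rather than the sixteen emitter/receiver pairs of the actual prototype:

# Hypothetical MicroPython loop for the IR-array prototype (RP2040-style
# pins assumed; the real prototype used 16 infrared emitter/receiver pairs).
from machine import ADC, PWM, Pin
import time

sensors = [ADC(Pin(n)) for n in (26, 27, 28)]  # one ADC per IR receiver
speaker = PWM(Pin(15))
speaker.duty_u16(32768)  # 50% duty square wave

F_MIN, F_MAX = 220, 880  # assumed frequency range

while True:
    # a stronger reflected IR reading means the object is closer
    closeness = max(s.read_u16() for s in sensors) / 65535
    # the closer the object, the lower the frequency
    speaker.freq(int(F_MAX - closeness * (F_MAX - F_MIN)))
    time.sleep_ms(20)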


Usage Scenario: Synesthetic Pottery-making

Although limited at the moment to shape-sound correlations based on distance and ease of perception, the synesthetic pottery machine could potentially allow a creative feedback loop to take place between the user and the system. The craftsman could train the sonic system on their aesthetic preferences, and the sonic system could in turn help train the craftsman in synesthetic associations. If a sound-to-shape correlation could be established for a variety of combinations of amplitude and pitch, then any sequence of sounds, whether from an instrument or a song, could be translated into shape. Eventually, different pottery styles could emerge from different music genres and styles.

Conclusion and Future Work

Research in cognitive science has demonstrated the creative effects of cross-modal interaction and sensory integration. Moreover, synesthesia has lately been regarded as a condition that can foster creativity. Previous work in cybernetics and the arts has demonstrated ways to combine seemingly unrelated modalities to promote synesthetic associations. In this project we explored the idea of synesthetic interfaces as a means to induce cross-modal correlations in the pottery-making process. Through the case study of pottery profile sound mappings, we demonstrated that simple shapes can be easily perceived through sound. We believe that, after the necessary training, more complex shapes and combinations of shapes would also be easily perceived by the user/maker. Another aspect we aim to explore further is how shape-sound associations affect the maker’s aesthetic judgement and choices of form in the making process.


Citations

1. Kampel, M. & Sablatnig, R. (2007). Rule based system for archaeological pottery classification. Pattern Recognition Letters, 28(6), pp. 740–747.
2. Brogaard, B. & Marlow, K. (2015). The Superhuman Mind: Free the Genius in Your Brain. New York: Penguin.
3. Bor, D. et al. (2014). Adults can be trained to acquire synesthetic experiences. Scientific Reports 4.
4. Reichardt, J. (ed.) (1971). Cybernetics, Art and Ideas. New York Graphic Society.
5. Ione, A. & Tyler, C. (2003). Neurohistory and the arts: Was Kandinsky a synesthete? Journal of the History of the Neurosciences, 12(2), pp. 223–226.
6. http://theshapeofsounds.com/
7. Maurer, D., Pathman, T. & Mondloch, C.J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and adults. Developmental Science, 9(3), pp. 316–322.
8. Kolarik, A.J., Cirstea, S., Pardhan, S. & Moore, B.C. (2014). A summary of research investigating echolocation abilities of blind and sighted humans. Hearing Research, 310, pp. 60–68.

 

Contact :
Athina Papadopoulou (athpap@mit.edu)
Harpreet Singh Sareen (sareen@mit.edu)

Shoulder Angel

Proactive Machine Augmentation for Interaction with Information

Luke Guolong Wang

Idea & Motivation:

Apart from what we can see and hear first-hand, our perception of the world depends on information presented to us by the mass media. We therefore effectively live in a world of information that is largely “woven” by the mass media. However, the mass media tend to cover stories that are likely to go viral, or stories that resonate with what we may want to hear deep down, rather than pursuing accuracy of depiction. We cannot deny that such tendencies sway the way we perceive the world, and go on to subliminally affect our judgements and decisions to varying extents.

So what if, with the development of artificial intelligence technologies, a proactive machine agent could complement our current information intake with broader context and alternative views? What if a machine agent could “read” alongside us, scanning the sea of information available online, and proactively enlighten us about the biases and tendencies of the content we are consuming, in real time?

Such is the motivation of this exploration, but we must note that it would be unreasonable to expect a “perfectly accurate” perception of our surrounding world from a machine agent, as there will always be issues in our society that are controversial and highly dependent on point of view. Rather, the hope of this exploration is to think about machine agents that can proactively assist us in broadening our horizons and heightening our awareness.

 

Background Work – Biased information, motivations for bias, and implications of consuming biased information

There are two important psychological models of information processing and attitude change that are particularly relevant to our exploration. One is the heuristic-systematic model of information processing, developed by Shelly Chaiken in the early 1980s; the other is the elaboration likelihood model of persuasion, developed by Richard Petty and John Cacioppo in the mid 1980s.

Systematic processing involves comprehensive, analytic cognitive processing of information. Heuristic processing, on the other hand, makes use of preconceived knowledge structures as judgement criteria. For example, beliefs such as “experts can be trusted,” “what the majority believes should be more correct,” or “longer articles should be more comprehensive and accurate” can all serve as foundations for heuristic processing. Systematic processing is obviously better if accuracy of judgement and perception is the sole priority, but in reality we do not always have the mental energy to scrutinize every single issue, so heuristic processing still plays a big role in many situations. For example, on meeting a young gentleman at a cocktail party and hearing that he studied physics at MIT, most people will come away thinking that this person must be smart, even though, deep down, we all know that such a deduction from this single piece of information is highly irrational.

The elaboration likelihood model explains the motivations behind our choices for using a more analytical and deliberate approach to processing information versus a more effortless approach based on previous heuristics. We are more likely to pursue an analytical and deliberate approach when there is sufficient motivation and ability, and absence of either will lead us to pursue a less critical approach.

The reality is that most of the time, when we are casually consuming media content, there is no strong motivation tied to our interests to think deliberately, and it is also hard for us to assess information spanning a wide range of topics in which we possess no expertise. This means that we are particularly susceptible to being unconsciously affected by biased information.

Furthermore, from the media perspective, as illustrated by Bernhardt in the Journal of Public Economics [1], in the political sphere there is motivation for media outlets to maximize their profits (reader base) by suppressing information the audience may not want to hear and promoting information that is more welcome. We can imagine that such effects probably exist in sectors other than politics too. This suggests a possibly undesired “positive feedback effect,” in which biased information misinforms citizens, and the stereotypes and heuristics that arise from that biased information then motivate media outlets to cater to such archetypes, reinforcing the process further.

So what are the undesirable effects of media bias, and is there motivation for us to try to mitigate them? As portrayed, if somewhat extremely, in Entman’s article [2], media biases have very long-term implications for public framing, priming, and agenda-setting, greatly affecting the influence of different schools of opinion; in the political sector, this can have a great impact on power dynamics. Bernhardt’s article, mentioned above, holds a similar view: bias leads to polarization and, when severe, to electoral mistakes.

Additionally, as Chaiken notes in her work [3], systematic processing of information is related to heuristic processing: the notion of source credibility is important in our logical assessment of information, but such notions of credibility are, to varying degrees, based on heuristics. Therefore, biased information will also affect our deliberate inspection of information in the long term by shaping our heuristic processing.

Given these implications of biased information, we propose “Shoulder Angel,” a machine augmentation that assists your information processing, allowing you to exercise more “systematic processing” even when you are only browsing casually and are therefore more heuristic and casual in your judgement.

 

Related Work

Digital assistants have played a role in automating customer service queries for quite some time, sorting out calls for the likes of banks and airlines. More recently, personal assistants and digital chatbots have become a hot field, with Google, Microsoft, Apple, Amazon, and Facebook all releasing personalized digital assistants of their own.

While these new digital assistants are indeed designed for consumers, and meant to provide as natural a form of interaction as possible, they are still mostly passive agents that cater to your specific wishes such as scheduling a meeting or playing a song.

The exception is Google Now, already available on the leftmost screen of most Android phones, which proactively presents you with articles and information, such as scores for basketball matches, based on your past search history.

However, what we are hoping for is a proactive agent that complements your current information influx with different viewpoints and broader context, rather than just bolstering it with similar content.


Design and Usage Scenario

We’ve considered many different formats for presenting the output of our “Shoulder Angel,” from infographic charts to index-like numbers to plain text, but in the end we’ve come to believe that plain text may actually be the best choice.

While graphs and charts offer more information bandwidth, and are able to depict information accurately, it is also true that there is a certain learning curve involved in understanding infographics. Likewise, index-like numbers are favored by professionals in social science areas as a concise and relatively accurate way to grasp situations, but the general public will probably find this format quite unintuitive.

Thus, somewhat following the current trend of tech companies moving towards chatbots in their AI efforts, we chose a plain-text, conversational format for our “Shoulder Angel.” This choice was made largely in pursuit of intuitiveness and a close, “human” feel.

Usage Scenarios:

We highlight a couple of possible usage scenarios below. The first demonstrates a scenario in which our user has a newfound interest in the debates surrounding the national elections. Shoulder Angel will pick up this trend and provide feedback about penchants of which the user may not have been aware. Shoulder Angel also aims to deliver this feedback in a non-rigid, somewhat casual and witty fashion, which should make it more humane and easier to relate to.


Another usage scenario is when our user is writing an email to a friend or colleague and shows subliminal cues of strongly biased opinions. Shoulder Angel will whisper to you about your bias.


Shoulder Angel will also notify you of the biases and stances of media outlets, so that you can be consciously aware of how the emphasis of the content you read has been intentionally selected to portray a certain inclination. This will allow you to be more mentally aware and to consciously keep what you read in check, in real time.


Implementation

The full vision of our Shoulder Angel poses several very difficult technical challenges in natural language processing that are beyond the scope of this design project, and thus we have built only a very elementary mockup of our final vision.

We had two simple schemes for training our machine to “understand” media content. One was a document summarization model that relied on TextRank [4], a graph-based model for ranking the most important sentences in a document, thus achieving a “summarization” of the document. The second was Latent Dirichlet Allocation [5] over the content to model the topic distribution, so that our machine can “understand” the key topics in the content we are consuming.
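
As an illustration of the TextRank half of this pipeline, here is a toy summarizer that builds a sentence-similarity graph and ranks sentences with PageRank; the word-overlap similarity is a simplification of the normalized measure in the original paper, and the LDA half can likewise be prototyped with an off-the-shelf topic-modeling library such as gensim.

# Toy TextRank summarizer: sentences become graph nodes, word overlap
# becomes edge weight, and PageRank picks the most central sentences.
import re
import networkx as nx

def summarize(text: str, n_sentences: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [set(re.findall(r"\w+", s.lower())) for s in sentences]
    graph = nx.Graph()
    graph.add_nodes_from(range(len(sentences)))
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            overlap = len(words[i] & words[j])
            if overlap:
                graph.add_edge(i, j, weight=overlap)
    ranks = nx.pagerank(graph, weight="weight")
    top = sorted(ranks, key=ranks.get, reverse=True)[:n_sentences]
    return " ".join(sentences[i] for i in sorted(top))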

These primitive levels of “understanding,” particularly the latter, allow for a limited number of pre-engineered, logic-based responses, but more natural responses to summaries will need to rely on large corpora of training data.

 

Future Work

Despite the difficulties, the realization of our Shoulder Angel concept should be feasible with existing state-of-the-art deep learning techniques for natural language processing, as products such as Google Now and Siri are achieving astonishing levels of conversational capability. We would therefore like to dive deeper into more sophisticated implementations with larger datasets in the future, to realize a fuller version of our Shoulder Angel concept.

 

Citations

[1] Political polarization and the electoral effects of media bias, Bernhardt, 2008, http://www.sciencedirect.com/science/article/pii/S0047272708000236

[2] Framing Bias: Media in the Distribution of Power, Entman, 2007 http://onlinelibrary.wiley.com/doi/10.1111/j.1460-2466.2006.00336.x/full

[3] Heuristic processing can bias systematic processing: Effects of source credibility, argument ambiguity, and task importance on attitude judgment, Chaiken, Shelly, 1994 http://search.proquest.com/docview/614307529?accountid=12492

[4] TextRank: Bringing Order into Texts, Mihalcea,Tarau, 2004 https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf

[5] Latent Dirichlet Allocation, David Blei, Andrew Ng, Michael Jordan, 2003, http://dl.acm.org/citation.cfm?id=944937

Minimalision

Dishita Turakhia + Scott Penman

IDEA/VISION
In today’s age of ubiquitous advertising, we are constantly bombarded by bits of information vying for our attention. The “noise” of our environment has reached a fever pitch for almost all of our senses. Our sensory modalities are designed to efficiently filter out much of the sensory data that reaches us and focus only on the relevant information; we believe that combining this filtering process with technology, in a human-machine symbiotic intervention, can help augment our ability to focus – and, in turn, help us kick the bad habit of constantly diverting our attention to technology. Our intervention is eyewear designed to block the user’s view whenever he or she is distracted by a mobile phone. The eyewear recognizes when the user looks at a mobile phone screen and actively shuts its lenses.

MOTIVATION + BACKGROUND WORK
While technology and gadgets like mobile phones assist in efficient task management and ceaseless connectivity, the downside of this pervasive technology is evident in our daily lives. Our phones pose a constant distraction in contexts like driving, social gatherings, personal or romantic bonding, and even family get-togethers. This distraction can not only have adverse repercussions in our social lives but also result in a habitual lack of focus and a shortened attention span.


Our goal is to use technology to augment user focus by blocking distractions. Reality is already suffused with information, so our aim is to clarify it, rather than complicate it.

Another motivation driving this project is to assist children suffering from ADHD. According to CDC studies [1][2], two million more children in the United States were diagnosed with attention-deficit/hyperactivity disorder (ADHD), and one million more U.S. children were taking medication for ADHD, over an eight-year period (2003-2004 to 2011-2012). In 2011-2012, 11 percent of U.S. children 4-17 years of age had been diagnosed with ADHD. Nearly one in five high school boys and one in eleven high school girls in the United States were reported by their parents as having been diagnosed with ADHD by a healthcare provider. This device could be used to enable focused study hours, especially for children struggling to concentrate on a given task.

In addition to being a focus-enabling device, we envision this gear as an artifact that sends a social message to a technology-saturated society. The act of lens shutting was deliberately made performative by coupling it with red lights that flash to demand attention both from the user (for being distracted) and from the people around (to make them realize that the user is trying to focus).

RELATED WORK
One of the inspirations for our work was Nicolas Damiens’ Tokyo No Ads. In this work, Damiens photoshops all of the visible advertisements out of typical Tokyo street scenes and presents them as before-and-after animations. In doing so, he provokes the viewer to consider: What would life be like without ads? What if we could visibly “tune them out”?


The concept of a device that enforces concentration by isolating surrounding sensory noise was creatively applied in Hugo Gernsback’s The Isolator. This multimodal work from 1925 involved both hearing and vision: it rendered the user deaf and restricted vision to tiny apertures. Oxygen was piped in via a tube.


We were also inspired by the design of Cyrus Kabiru’s sculptures. Kabiru creates these works from trash that he finds in his hometown of Nairobi. His work is part art, part performance, part stress-relieving humor-therapy. On top of exemplifying human-machine symbiosis, we see our eye gear as an art piece that makes a socially relevant statement of awareness about pervasive technology.

DESIGN AND IMPLEMENTATION
The diagram of the system is simple. The decision of what to filter (in this case, cell phones) is offloaded to a CPU. A webcam captures what the user is seeing; that feed is passed to the CPU, which determines whether or not a cell phone is present; and this ON/OFF decision is sent via an Arduino to two servo motors that raise and lower lenses at the front of the armature. The device is thus a filtering interface that recognizes distractions and blocks the user from diverting their attention away from the object of focus.
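
A minimal sketch of this control loop in Python; the phone detector is stubbed out (any off-the-shelf object detector could fill it in), and the serial port name and baud rate are assumptions:

# Sketch of the Minimalision loop: webcam frame -> phone detector ->
# ON/OFF byte to the Arduino driving the lens servos and LEDs.
import cv2
import serial

arduino = serial.Serial("/dev/ttyACM0", 9600)  # port/baud are assumptions
camera = cv2.VideoCapture(0)

def detect_phone(frame) -> bool:
    """Stub: plug in any object detector trained to spot phone screens."""
    return False  # placeholder so the sketch runs end to end

while True:
    ok, frame = camera.read()
    if not ok:
        break
    # b"1" lowers the lenses (and lights the red LEDs); b"0" retracts them
    arduino.write(b"1" if detect_phone(frame) else b"0")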


Our early prototypes used a pair of sunglasses as the armature, with paper shapes for lenses.


The final version was made from lasercut plexiglass and designed such that the webcam and servo motors (and accompanying wires) were properly attached to the armature.


The lenses were cut from cardstock and folded like a fan. This way, they could be discreetly folded up into the armature when closed, and expanded out to fill the lens space when open.


In addition to this, we chose to make the design of the eyewear very visually “loud.” We did this to provoke discussion around the idea of public accountability – do we break habits faster when the world can monitor our progress? To heighten this, we included two red LEDs that glow upon activation of the mechanism, illuminating the armature.


USAGE SCENARIO
In our usage scenario, a student is trying to study for her final exam. She grows tired, however, and looks at her phone for entertainment. The eyewear notices the phone in her field of vision and promptly shuts the lenses. Only when she looks away (and back at her book) do the lenses retract.

CONCLUSION AND FUTURE WORK
We have provided a provocation-of-concept in the form of eyewear that transforms to publicly block the wearer’s vision when he or she looks at a phone. This project is part of a larger vision: by cognitively offloading our filtering ability to machines, we can actively tune out what we consider to be “noise” in our lives, and enjoy the augmented quiet that results.

This framework (offloading the decision to filter to a machine that has the ability to do so) can easily be extended within vision as well as to other modalities. In addition, with smarter filters (or even classifying neural networks or other advanced artificial intelligence mechanisms) added to the system, more robust filtering definitions could be described. We provide the following scenarios as imaginative extensions:

“I don’t want to see anything other than my book while I’m studying.”

“I want to focus on the road while I’m driving.”

“I only want to hear positive thoughts today.”

In addition to addressing the adverse effects of ceaselessly connective technology, we also aim to use this device to create interpersonal empathetic connection, by encouraging people to focus on each other and not be distracted. In today’s digital age, as we become more and more shielded from direct confrontation with alternative opinions, we become more and more critical of them, engaging in an egocentric cycle that results in a complete loss of empathy. But perhaps, instead of hindering our ability to be humane and sensitive, this gear can augment it.


CITATIONS
Collins, Franklin M. 2014. “The Relationship between Social Media and Empathy.” http://digitalcommons.georgiasouthern.edu/etd/1150/.

CDC: ADHD Estimates Rise

CDC: State-based Prevalence Data of Parent Reported ADHD Diagnosis by a Health Care Provider

TeenSafe: Do Today’s Tech-Obsessed Teens Have Less Empathy?

NYTimes: Found on Facebook: Empathy

Scientific American: What, Me Care? Young are Less Empathetic

Damiens, Nicolas – Tokyo No Ads

Gernsback, Hugo – The Isolator

Kabiru, Cyrus

Mastering Memory with the Memory Palace and Augmented Reality

Marc Exposito, Oscar Rosello

Vision

The memory palace is an ancient Greek technique that can be used to memorize (almost) anything. It is used for tasks such as remembering full decks of cards, historic dates, long lists of words, or foreign-language vocabulary. Many memory contest champions, including eight-time world memory champion Dominic O’Brien, claim to use this method (Jusczyk, 1980).

The memory palace is a mnemonic technique that involves populating an imaginary scene with mental images that help us remember the content we intend to memorize. However, getting started with the memory palace can be a demanding cognitive task: for subjects who are not trained as spatial thinkers, imagining a scene vividly can prove challenging. To address this problem, we propose to make the memory palace real. This means offloading part of the cognitive task of imagining scenes onto reality, using the tangible architectural spaces that surround us. We exploit the fact that we all know how to navigate space naturally and use that to our advantage. We propose to combine the memory palace technique with augmented reality technology to create a study tool that will help anyone memorize more effectively.

The memory palace technique

The memory palace technique works as follows. First, take a concept you want to memorize and create a visual mental symbol to help you remember it. For example, if you want to remember that the Dallas Cowboys won the Super Bowl in 1972, you could imagine a cowboy at a rodeo. Second, take the image you just created and link it to an architectural scene, either a place you know or an imaginary one. You might, for example, imagine the cowboy in front of your office. Finally, to recall the memory, imagine the scene you just created in your head: the Dallas Cowboys will naturally emerge.

To use the memory palace technique, imagine a scene and place a mental association to the content you want to remember.

In order to remember more than one concept, simply attach an adjacent mental scene and link a new mental image to that scene. When recalling the full sequence, you revisit those scenes mentally, in the order in which you dropped the concepts you intended to remember. This might seem a counter-intuitive strategy, since you need to remember both a place and the actual content, but it is in fact the opposite: current neurological research has shown that spatial navigation and memory both relate to the same part of the brain, the hippocampus (O’Keefe, 1978). Brain scans of “superior memorizers”, 90% of whom use the method of loci, have shown activation of regions of the brain involved in spatial awareness, such as the medial parietal cortex, retrosplenial cortex, and the right posterior hippocampus (Maguire, 2002). The technique takes advantage of this fact to facilitate the encoding, storage, and retrieval of information.

NeverMind Interface

We have taken a step towards understanding the role spatial navigation plays in memory augmentation by developing NEVERMIND, a learning interface that is in line with how memories are stored internally. This system is divided into two main parts. The first part is an iPhone application dedicated to user interaction. It has three different modes: Encode, Store, and Retrieve.

The three modes of the NeverMind interface: Encode, Store, and Retrieve.

In Encode mode, users create routes and pair them with images. In Store mode, once the content is set, users train their memory by physically visiting their memory palaces. In Retrieve mode, users can recover previous knowledge lists and link them to a specific route. Additionally, the system allows users to share knowledge-content playlists with other users, who can pair them with their own memory palaces.

The second part of the interface is dedicated to the display of images and runs on the Epson Moverio BT-300 augmented reality glasses. We developed a program in the Unity3D game engine that receives the images the iPhone sends and displays them on the glasses.

NeverMind running on the Moverio AR display.
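
The phone-to-glasses handoff can be thought of as a simple image stream. Below is a sketch of the glasses-side receiver assuming a length-prefixed TCP protocol of our own invention; the actual prototype implements this step in Unity3D on the Moverio.

# Sketch of the glasses-side image receiver (protocol and port are assumed;
# the real prototype receives and displays images in Unity3D).
import socket
import struct

def recv_exact(conn: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("stream closed")
        buf += chunk
    return buf

server = socket.socket()
server.bind(("0.0.0.0", 9000))
server.listen(1)
conn, _ = server.accept()
while True:
    (length,) = struct.unpack(">I", recv_exact(conn, 4))
    jpeg_bytes = recv_exact(conn, length)
    # hand jpeg_bytes to the display layer (a Unity texture in the prototype)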

Experiments

We have tested the NEVERMIND interface on five different subjects, with favorable results. The subjects were asked to memorize two similar lists of 10 items, one using NEVERMIND and the other using a printed list. The tasks were not time-constrained. All users were able to remember both lists immediately after the test. However, 24 hours after the experiment, all five subjects were still able to remember the content of the NEVERMIND list, while none was able to accurately recall the items on the printed list. When asked about the two study methods, users said that studying with the NEVERMIND interface was more enjoyable and nearly effortless compared to traditional study methods.

We tested the NEVERMIND interface on a list of 10 Super Bowl winners from 1967 to 1976. We verified that the subjects had no previous knowledge of the content to be remembered. For the experiments, we predefined the mental imagery for the users; for example, we used a picture of a man on a horse to represent the Dallas Cowboys, and a picture of an airplane to represent the New York Jets. In all cases, we used routes the users were familiar with.

Reusing the palace

We conducted preliminary studies on the reusability of the palace, using a location from the previous memory palace to remember 15 digits of Pi. Our intuition is that the most demanding part of the technique is loading the palace into your head for the first time. Once the palace is loaded, scenes can be reused to store other content effectively.

The memory palaces can be reused: once the user is familiarized with a specific memory palace, new content can be placed and easily recalled.

Usage scenarios

Our motivation is to change the way students memorize. We spend a long time memorizing through repetition; our experiments suggest that there are more effective methods, better aligned with the way our brain stores information. We propose an experiential way of learning, where the retrieval process is the act of mentally aligning ourselves to a location.

Towards an experiential way of learning: NeverMind allows a physically active way of studying. The user could be preparing for a presentation in the afternoon.

We see potential uses in education, as a technique to bootstrap knowledge before tracing the associations and inferences characteristic of higher levels of understanding. This interface could be used, for example, to help biology students study. Other uses include public speaking: speeches, toasts, presentations, and so on.

Future steps

Mixed reality: We are planning to implement a mixed reality version of NEVERMIND. At the moment, the graphical content supplied by the interface is not anchored to a specific spatial location: when the user approaches the target location, the image appears, which means that AR images move with the user’s head motion. We have the intuition that anchoring images accurately to reality will lead to more memorable results.

Chunking the palace: Controlling image placement accurately would also open up new features of the interface, for example, adding hierarchy to the palace. With this feature, the user could control the amount of detail they want to remember, leading to recall of a set of concepts at different levels of hierarchy. The essential content could be recalled just enough to make a 30-second elevator pitch to an investor; more details could be added for a 7-minute PechaKucha presentation; and recalling all of the content could support a 20-minute presentation on an idea.

Video review: We are planning to add a set of features that allow revising or studying the content of the palace without needing to be physically there. A first implementation will include a recording of the routes as the user sees them, with overlays of the images to remember. The result would be a video that could be played at 10x speed, slowing down when the content appears, resembling the memory consolidation that goes on through the REM phases of sleep. The video should be easy to play forward and backward to help the memorization process.

A video review feature could be used to facilitate memory consolidation, simulating the REM sleep cycle.

Knowledge playlists: Each user’s memory palace is personal, but the content to remember can be shared. We propose a social platform for sharing and downloading knowledge with friends or classmates. Each user can use their own palaces and populate them automatically with content downloaded from the web. Predefined graphic associations could be built in, and the user could alter the content. This would build a database of concept-image pairings sourced from the community of users. Potential applications of this feature include bootstrapping content into the student’s memory before a class.

Is memory obsolete?

Conducting this study also raised several questions about the relationship between memory and technology. Why memorize? What role does memory play in the learning process? Is memory still relevant in the age of Google? We are becoming symbiotic with our computer tools, growing into interconnected systems that remember less by knowing information than by knowing where the information can be found. However, we believe that memories are among our most precious possessions; they grow with our experience and vanish when we die. As Ebbinghaus’s forgetting curve suggests, mnemonic techniques such as the memory palace can increase our ability to retain information and help us preserve the personal experiences stored in memory.

Replication and Analysis of Ebbinghaus’ Forgetting Curve. Jaap M. J. Murre. July 2015.

Previous Work

Previous interfaces dedicated to augmenting memory include the Remembrance Agent (Rhodes, 1997), which proactively logged the information the user needed to remember. Other studies include the use of virtual reality for memory rehabilitation in patients suffering from Alzheimer’s (Brooks, 2003). There have been previous studies on recreating a virtual memory palace; most involve either a computer simulation model or virtual reality (Legge, 2012). We found another example that pairs an early head-mounted display with an interpretation of the memory palace technique (Ikei, 2008). However, the hardware, the software, and the interpretation of the memory palace technique are all substantially different from those described here.

Contributions

We designed a learning interface to make memorization more durable and enjoyable. We designed a model to help users master memory based on the coupling of space and memory, and we implemented an interface prototype, NEVERMIND, that facilitates memory encoding, retrieval, and storage. We have shown experimentally that memories formed with our interface are more durable, and subjects report that the process is more enjoyable and effortless. With this work, we hope to make the memory palace accessible to the general user. We have designed a new way to memorize, based on a symbiotic relationship with technology, that enables us to learn in line with how our brains store information.


References and additional readings

Artificial Intelligence
Brooks, Rodney. “Intelligence Without Reason.”
Brooks, Rodney. “Planning Is Just a Way of Avoiding Figuring out What to Do next.”
Brooks, Rodney Allen. 1999. Cambrian Intelligence: The Early History of the New AI. Mit Press.
Laird, John. 2012. The Soar Cognitive Architecture. Cambridge, Mass.; London, England: MIT Press.
Licklider, J. C. R. 1960. “Man-Computer Symbiosis.” IRE Transactions on Human Factors in Electronics HFE-1: 4–11.
Markoff, John. 2011. “A Fight to Win the Future: Computers vs. Humans.” New York Times.
Marr, David, and Tomaso Poggio. 1976. “From Understanding Computation to Understanding Neural Circuitry.”
Schank, Roger C. 1990. Tell Me a Story: A New Look at Real and Artificial Memory. New York: Scribner

Human Memory
Barbey, Aron K, Antonio Belli, Ann Logan, Rachael Rubin, Marta Zamroziewicz, and Joachim T Operskalski. 2015. “Network Topology and Dynamics in Traumatic Brain Injury.” Current Opinion in Behavioral Sciences 4. 2015.
Bird, Chris M., Dennis Chan, Tom Hartley, Yolande A. Pijnenburg, Martin N. Rossor, and Neil Burgess. 2009. “Topographical Short-Term Memory Differentiates Alzheimer’s Disease from Frontotemporal Lobar Degeneration.” Hippocampus 20 (10): 1154–69. doi:10.1002/hipo.20715.
Draaisma, Douwe. “Metaphors of Memory.”
Draschkow, D., J. M. Wolfe, and M. L.- H. Vo. 2014. “Seek and You Shall Remember: Scene Semantics Interact with Visual Search to Build Better Memories.” Journal of Vision 14 (8): 10–10. doi:10.1167/14.8.10.
Drew, Trafton, Sage E. P. Boettcher, and Jeremy M. Wolfe. 2016. “Searching While Loaded: Visual Working Memory Does Not Interfere with Hybrid Search Efficiency but Hybrid Search Uses Working Memory Capacity.” Psychonomic Bulletin & Review 23 (1): 201–12. doi:10.3758/s13423-015-0874-8.
Foster, David J., and Matthew A. Wilson. 2006. “Reverse Replay of Behavioural Sequences in Hippocampal Place Cells during the Awake State.” Nature 440 (7084): 680–83. doi:10.1038/nature04587.
Hassabis, Demis, Carlton Chu, Geraint Rees, Nikolaus Weiskopf, Peter D. Molyneux, and Eleanor A. Maguire. 2009. “Decoding Neuronal Ensembles in the Human Hippocampus.” Current Biology 19 (7): 546–54. doi:10.1016/j.cub.2009.02.033.
Ishai, A. 2002. “Visual Imagery of Famous Faces: Effects of Memory and Attention Revealed by fMRI.” NeuroImage 17 (4): 1729–41. doi:10.1006/nimg.2002.1330.
Klein, Stanley B., Leda Cosmides, John Tooby, and Sarah Chance. 2002. “Decisions and the Evolution of Memory: Multiple Systems, Multiple Functions.” Psychological Review 109 (2): 306–29. doi:10.1037//0033-295X.109.2.306.
Kondo, Yumiko, Maki Suzuki, Shunji Mugikura, Nobuhito Abe, Shoki Takahashi, Toshio Iijima, and Toshikatsu Fujii. 2005. “Changes in Brain Activation Associated with Use of a Memory Strategy: A Functional MRI Study.” NeuroImage 24 (4): 1154–63. doi:10.1016/j.neuroimage.2004.10.033.
Llewellyn, Sue. 2013. “Such Stuff as Dreams Are Made on? Elaborative Encoding, the Ancient Art of Memory, and the Hippocampus.” Behavioral and Brain Sciences 36 (06): 589–607. doi:10.1017/S0140525X12003135.
Madl, Tamas, Ke Chen, Daniela Montaldi, and Robert Trappl. 2015. “Computational Cognitive Models of Spatial Memory in Navigation Space: A Review.” Neural Networks 65 (May): 18–43. doi:10.1016/j.neunet.2015.01.002.
Magnussen, Svein. 2009. “Implicit Visual Working Memory.” Scandinavian Journal of Psychology 50 (6): 535–42. doi:10.1111/j.1467-9450.2009.00783.x.
Maguire, Eleanor A., Elizabeth R. Valentine, John M. Wilding, and Narinder Kapur. 2002. “Routes to Remembering: The Brains behind Superior Memory.” Nature Neuroscience 6 (1): 90–95. doi:10.1038/nn988.
Nairne, James S., and Josefa N. S. Pandeirada. 2016. “Adaptive Memory: The Evolutionary Significance of Survival Processing.” Accessed April 30. http://evo.psych.purdue.edu/downloads/2016_Nairne_Pandeirada.pdf.
Nairne, James S., Sarah R. Thompson, and Josefa N. S. Pandeirada. 2007. “Adaptive Memory: Survival Processing Enhances Retention.” Journal of Experimental Psychology: Learning, Memory, and Cognition 33 (2): 263–73. doi:10.1037/0278-7393.33.2.263.
Naya, Y., and W. A. Suzuki. 2011. “Integrating What and When Across the Primate Medial Temporal Lobe.” Science 333 (6043): 773–76. doi:10.1126/science.1206773.
O’Keefe, John, and Jonathan Dostrovsky. 1971. “The Hippocampus as a Spatial Map. Preliminary Evidence from Unit Activity in the Freely-Moving Rat.” Brain Research 34 (1): 171–75.
O’Keefe, John, and Lynn Nadel. 1978. The Hippocampus as a Cognitive Map. Oxford; New York: Clarendon Press; Oxford University Press.
Papassotiropoulos, Andreas, and Dominique J-F de Quervain. 2015. “Genetics of Human Memory Functions in Healthy Cohorts.” Current Opinion in Behavioral Sciences 4 (August): 73–80. doi:10.1016/j.cobeha.2015.04.004.
Peters, Marco, Mónica Muñoz-López, and Richard GM Morris. 2015. “Spatial Memory and Hippocampal Enhancement.” Current Opinion in Behavioral Sciences 4 (August): 81–91. doi:10.1016/j.cobeha.2015.03.005.
Schiller, Daniela. n.d. “Memory and Space: Towards an Understanding of the Cognitive Map.”
Sparrow, Betsy, Jenny Liu, and Daniel M. Wegner. 2011. “Google Effects on Memory: Cognitive Consequences of Having Information at Our Fingertips.” Science 333 (6043): 776–78.
Squire, Larry R., and Carolyn Backer Cave. 1991. “The Hippocampus, Memory, and Space.” Hippocampus 1 (3): 269–71.
Wixted, John T., and Ebbe B. Ebbesen. 1991. “On the Form of Forgetting.” Psychological Science 2 (6): 409–15.

Memory Palace
Foer, Joshua. 2012. Moonwalking with Einstein: The Art and Science of Remembering Everything. London: Penguin Books.
Qureshi, A., F. Rizvi, A. Syed, A. Shahid, and H. Manzoor. 2014. “The Method of Loci as a Mnemonic Device to Facilitate Learning in Endocrinology Leads to Improvement in Student Performance as Measured by Assessments.” AJP: Advances in Physiology Education 38 (2): 140–44. doi:10.1152/advan.00092.2013.
Yates, Frances Amelia. 2002. The Art of Memory. Reprint. Chicago, Ill.: University of Chicago Press.

Memory Augmentation Interfaces
Brooks, B. M., and F. D. Rose. 2003. “The Use of Virtual Reality in Memory Rehabilitation: Current Findings and Future Directions.” NeuroRehabilitation 18 (2): 147–57.
Colley, Ashley, Jonna Häkkilä, and Juho Rantakari. 2014. “Augmenting the Home to Remember: Initial User Perceptions.” 1369–72. ACM Press. doi:10.1145/2638728.2641717.
DeVaul, Richard W., Vicka R. Corey, and others. 2003. “The Memory Glasses: Subliminal vs. Overt Memory Support with Imperfect Information.” In Null, 146. IEEE. http://www.computer.org/csdl/proceedings/iswc/2003/2034/00/20340146.pdf.
Feiner, Steven, ACM Digital Library, ACM Special Interest Group on Computer-Human Interaction, and ACM Special Interest Group on Computer Graphics and Interactive Techniques. 2008. Virtual Reality as a Tool for Assessing Episodic Memory. New York, NY: ACM. http://dl.acm.org/citation.cfm?id=1450579.
Feiner, Steven, Blair MacIntyre, Tobias Höllerer, and Anthony Webster. 1997. “A Touring Machine: Prototyping 3D Mobile Augmented Reality Systems for Exploring the Urban Environment.” Personal Technologies 1 (4): 208–17.
Green, C. Shawn, and Daphne Bavelier. 2015. “Action Video Game Training for Cognitive Enhancement.” Current Opinion in Behavioral Sciences 4 (August): 103–8. doi:10.1016/j.cobeha.2015.04.012.
Harman, Joshua. 2001. “Creating a Memory Palace Using a Computer.” In CHI’01 Extended Abstracts on Human Factors in Computing Systems, 407–8. ACM. http://dl.acm.org/citation.cfm?id=634306.
Hou, Lei, Xiangyu Wang, Leonhard Bernold, and Peter E. D. Love. 2013. “Using Animated Augmented Reality to Cognitively Guide Assembly.” Journal of Computing in Civil Engineering 27 (5): 439–51. doi:10.1061/(ASCE)CP.1943-5487.0000184.
Ikei, Yasushi, and Hirofumi Ota. 2008. “Spatial Electronic Mnemonics for Augmentation of Human Memory.” In Virtual Reality Conference, 2008. VR’08. IEEE, 217–24. IEEE. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4480777.
Kawamura, Tatsuyuki, Tomohiro Fukuhara, Hideaki Takeda, Yasuyuki Kono, and Masatsugu Kidode. 2007. “Ubiquitous Memories: A Memory Externalization System Using Physical Objects.” Personal and Ubiquitous Computing 11 (4): 287–98. doi:10.1007/s00779-006-0085-4.
Legge, Eric L.G., Christopher R. Madan, Enoch T. Ng, and Jeremy B. Caplan. 2012. “Building a Memory Palace in Minutes: Equivalent Memory Performance Using Virtual versus Conventional Environments with the Method of Loci.” Acta Psychologica 141 (3): 380–90. doi:10.1016/j.actpsy.2012.09.002.
Quintana, Eduardo, and Jesus Favela. 2013. “Augmented Reality Annotations to Assist Persons with Alzheimers and Their Caregivers.” Personal and Ubiquitous Computing 17 (6): 1105–16. doi:10.1007/s00779-012-0558-6.
Ragan, Eric D., Doug A. Bowman, and Karl J. Huber. 2012. “Supporting Cognitive Processing with Spatial Information Presentations in Virtual Environments.” Virtual Reality 16 (4): 301–14. doi:10.1007/s10055-012-0211-8.
Rhodes, Bradley J. 1997. “The Wearable Remembrance Agent.” In Proceedings of 1st International Symposium on Wearable Computers, ISWC’97, 123–28.

Thinking FastAR

This report is available as a PDF here, and is replicated on my personal website here.

Abstract

When faced with a decision, it is often difficult for people to choose the best option for their long-term well-being. Augmented reality can enable users to see more than what is physically present. Here, I propose and demonstrate an augmented reality system that helps users make healthier choices. Using object recognition as a heuristic for decision recognition, the system guides the user toward objects that align with the user’s personalized health goals. The current implementation involves predefined objects with hard-coded health goals. For future work, recent advances in object recognition appear promising for a more universal version of the system, which would provide more flexibility and customization for users.

Introduction

How often have you reached for a soda, even though you know a bottle of water would have been better for you? How many times have you taken the elevator to the third floor instead of walking two flights of stairs? When making a decision, people are often nearsighted. It can be hard to make the best decision for one’s long-term well-being when it is so much easier to think about short-term gains.

Recent work in cognitive psychology and behavioral economics has catalogued the biases that affect people’s decision-making abilities. Knowledge of these biases is beneficial when reflecting on previous decisions, but does little to improve real-time decision making. Real-time decision making relies almost entirely on what can be “seen” in the moment. The question then arises: can we build technology that enables us to “see” more?

Augmented reality (AR) is well suited to this problem: AR can literally augment what the user can see. Here, I investigate whether AR can be used to augment decision making. I propose FastAR[1], a framework that can detect and provide feedback for certain types of decision opportunities in real time. I also present a successful, but early, implementation of the framework in the form of an iOS application.

[1]: pronounced “fast-ar” or “fast-a-r” interchangeably

Vision

In theory, people should make decisions that benefit them in the long term. However, in reality, short-term results are much easier to consider. This tendency stems from the multitude of cognitive biases outlined in Thinking, Fast and Slow by Daniel Kahneman. These biases can be summarized in Kahneman’s acronym WYSIATI: What You See Is All There Is. That is, human decision making is based on the information that is immediately available.

[Figure: Kahneman’s model of cognitive thinking, in which System 1 acts quickly and System 2 plans more carefully. FastAR attempts to modify System 1 decision making without decreasing System 1’s speed. Figure from D. Kahneman.]

I believe augmented reality can be used to augment this information and thereby improve decision making. FastAR is a framework and an implementation of this vision. The framework is a conceptualization of a computer system that can understand when a user is making a decision, and then present suggestions to the user to make a better long-term decision. The implementation is an iOS application meant to demonstrate the fundamental ideas of the framework. Because the iOS app is an incomplete execution of the vision, the framework and implementation are discussed independently below.

Background and Related Work

FastAR draws on background work from both psychology and augmented reality. In psychology, the system draws on Kahneman and Tversky’s Nobel Prize-winning work on judgment and decision making, much of which is summarized in Kahneman’s Thinking, Fast and Slow.

In a humorous account of a conversation with Kahneman, Richard Thaler[2] explains that Kahneman was working on a book in 1996 and claimed that it would be ready in six months. The book was published in 2000, four years later. Kahneman had fallen victim to the planning fallacy (a phrase he himself had coined). If Kahneman can’t avoid cognitive biases, no one can. It is clear from Kahneman’s work that knowledge alone isn’t enough to overcome cognitive biases.

In augmented reality, FastAR draws on a number of previously developed systems, most notably AfterMath, an AR system that enables users to see into the future. While AfterMath demonstrates the effects of a user’s action on the outside world, FastAR attempts to show the personal effects of a user’s decision.

[2]: a professor of economics who has often collaborated with Kahneman

FastAR Framework

There are three main components to the FastAR framework. First, there is a backend representation of the user, which is used to personalize the system. Second, FastAR has a universal object recognition system, which can detect an object in sight and check whether that object is relevant to the user. Third, the AR component presents the user with a suggestion to guide them toward a better choice. Each component is discussed in detail below.

[Figure: FastAR framework flow diagram.]

First, an internal representation of the user is a key component of the system. Individuals differ in their health goals and desires, and it would be problematic for a system to dictate which choices are best for the user. Therefore, upon acquiring the system, the user is prompted to select health goals, such as “exercise more” or “drink less soda.” It is also possible for users to create customized goals by informing the system of categories of objects that would be either beneficial or harmful to the goal. For example, a particular user might want to reduce the number of times they eat at fast-food restaurants. This would be a user-defined goal, so the user simply informs the system that “McDonalds” and “Burger King” are “negative objects” (explained below). With this simple rule-based method, FastAR can easily decide whether a particular object helps or hinders the goals of the user.
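
A minimal sketch of how this user representation and rule-based lookup could be structured; the types and names are hypothetical, not taken from the FastAR source:

```swift
enum Valence { case positive, negative }

struct HealthGoal {
    let name: String
    var objectValence: [String: Valence]   // object label -> effect on goal
}

struct UserProfile {
    var goals: [HealthGoal]

    /// Rule-based lookup: does this object help or hinder any active goal?
    func valence(of object: String) -> Valence? {
        for goal in goals {
            if let v = goal.objectValence[object] { return v }
        }
        return nil   // unknown object: no feedback
    }
}

// The user-defined fast-food goal from the example above:
let lessFastFood = HealthGoal(name: "Eat less fast food",
                              objectValence: ["McDonalds": .negative,
                                              "Burger King": .negative])
```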

The second component, universal object recognition, enables the system to understand the world around it. Recent breakthroughs in deep learning allow for real-time object recognition (such as UC Berkeley’s object recognition systems). Once an object is recognized, the system determines whether it is a “positive” or “negative” object based on whether it is beneficial for the given user’s health goals.

The third component provides the user with feedback. Positive objects result in a reward for the user, while negative objects prompt suggestions toward alternatives. The feedback starts at the subliminal level: for example, a negative object might trigger a quiet but irritating sound effect, or a blurring of the visual field over the object. The suggestions increase in force over time. For example, after many failed attempts at avoiding soda, a user might be shown a video describing the negative side effects of high-sugar drinks.
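
One way this escalation policy could be encoded, with assumed thresholds (the system’s actual schedule is not specified in this report):

```swift
// Repeated encounters with the same negative object step the intervention
// up from subliminal cues toward explicit content.
enum Feedback {
    case subtleSound      // quiet, mildly irritating audio cue
    case blurOverlay      // blur the visual field over the object
    case suggestionBanner // propose a concrete alternative
    case educationalVideo // e.g. effects of high-sugar drinks
}

func feedback(forFailureCount count: Int) -> Feedback {
    switch count {
    case 0...2:  return .subtleSound
    case 3...5:  return .blurOverlay
    case 6...10: return .suggestionBanner
    default:     return .educationalVideo
    }
}
```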

Together, these three components enable the system to understand the goals of the user, recognize objects, and suggest alternatives to a given object based on the user’s goal.

FastAR Implementation

I have implemented a successful but early version of FastAR as an iOS application. While AR goggles would improve the just-in-time nature of the system, the hand-held version demonstrates the system’s potential.

The internal representation in the current system exists but is limited to hard-coded goals. These goals can be turned on or off, but new goals cannot currently be added without modifying the code. As a demonstration, the system understands the goals “exercising more” (which treats stairs as a positive object and the elevator as a negative object) and “drinking less soda” (which recognizes Mtn Dew and Coke as negative objects and water as a positive object).
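
Continuing the hypothetical sketch above, the two demo goals could be expressed as:

```swift
let exerciseMore = HealthGoal(name: "Exercise more",
                              objectValence: ["stairs": .positive,
                                              "elevator": .negative])
let lessSoda = HealthGoal(name: "Drink less soda",
                          objectValence: ["Mtn Dew": .negative,
                                          "Coke": .negative,
                                          "water": .positive])
var demoUser = UserProfile(goals: [exerciseMore, lessSoda])
```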

The object recognition module is powered by Vuforia, which uses image targets to recognize objects. This works well for objects with brand logos, such as sodas; however, the method breaks down when detecting general objects. Therefore, the elevator demonstration only detects the specific elevator at the MIT Media Lab. Deep learning systems would make this far more generalizable, and some can already detect objects in real time on-device (such as Jetpac’s DeepBelief SDK).
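
The glue between recognition and feedback might look like the following sketch; the Vuforia-specific wiring is omitted, and this assumes only that the recognizer reports each detection as the matched target’s name:

```swift
func didRecognizeTarget(named name: String, user: UserProfile) {
    guard let valence = user.valence(of: name) else { return }  // unknown object
    switch valence {
    case .positive: print("reward user")        // placeholder for reward UI
    case .negative: print("show alternative")   // placeholder for AR overlay
    }
}
```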

For the AR content, the app displays a video based on the selected object. In the demonstration, FastAR first detects a soda and then plays a video explaining the negative effects of soda. This is likely an extreme augmentation, and it would rarely be used in a complete version of the framework.
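
A sketch of the video augmentation step using standard AVKit playback; the asset name is hypothetical:

```swift
import AVKit
import UIKit

// Present a full-screen deterrent video once a negative object is detected.
func playDeterrentVideo(from controller: UIViewController) {
    guard let url = Bundle.main.url(forResource: "soda-effects", withExtension: "mp4")
    else { return }
    let playerVC = AVPlayerViewController()
    playerVC.player = AVPlayer(url: url)
    controller.present(playerVC, animated: true) {
        playerVC.player?.play()
    }
}
```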

Usage Scenarios

A complete implementation of the framework would be broadly general and extremely flexible. There are only two requirements for a given use case. First, the decision must be based on a visible, physical object. For example, choosing a drink is based on a physical object: the drink itself. The system would not be able to help a user decide to stand up and take a break from work (which other systems can easily do). Second, the object must be widely understood to be either positive or negative for one’s health. Presented with an unknown object, the system would not be able to provide feedback.

The FastAR implementation is limited to hard-coded demos. Soda vs. Water and Elevator vs. Stairs are the two primary examples described throughout this report, and both are easily accessible in the iOS app.

Future Work

Obvious future improvements involve upgrading the iOS app to implement the complete framework. This includes transitioning from the Vuforia SDK to a deep-neural-network-based recognizer. It would also be necessary to make the back-end representation of the user more flexible.

More interesting is the question of how best to display the suggestions. Currently, the system presents obtrusive videos; more nuanced approaches might be much more successful. Subliminal interfaces, or barely noticeable field-of-view modifications, might change user behavior without the user’s conscious awareness of the change.

Finally, the framework could be extended to let users see further into the future. It would be interesting to experiment with simulating what happens to the user when a decision is made; for example, unhealthy decisions could result in a change to the user’s simulated physical appearance.

Contributions

I have presented FastAR, a framework for using augmented reality to enhance a user’s decision-making capabilities. I have drawn on research from psychology and prior AR systems to develop a system that helps users make healthier choices. I have also implemented a prototype of this framework on an iOS device. This handheld version provides a glimpse of what is possible with modern technology and demonstrates the advantages of the framework.

Citations

Psychology

Augmented Reality

This project report is replicated on my personal website, here: http://kennethfriedman.org/projects/fastar/